1
|
Sanches IH, Braga RC, Alves VM, Andrade CH. Enhancing hERG Risk Assessment with Interpretable Classificatory and Regression Models. Chem Res Toxicol 2024; 37:910-922. [PMID: 38781421 PMCID: PMC11187631 DOI: 10.1021/acs.chemrestox.3c00400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/22/2024] [Accepted: 05/14/2024] [Indexed: 05/25/2024]
Abstract
The human Ether-à-go-go-Related Gene (hERG) is a transmembrane protein that regulates cardiac action potential, and its inhibition can induce a potentially deadly cardiac syndrome. In vitro tests help identify hERG blockers at early stages; however, the high cost motivates searching for alternative, cost-effective methods. The primary goal of this study was to enhance the Pred-hERG tool for predicting hERG blockage. To achieve this, we developed new QSAR models that incorporated additional data, updated existing classificatory and multiclassificatory models, and introduced new regression models. Notably, we integrated SHAP (SHapley Additive exPlanations) values to offer a visual interpretation of these models. Utilizing the latest data from ChEMBL v30, encompassing over 14,364 compounds with hERG data, our binary and multiclassification models outperformed both the previous iteration of Pred-hERG and all publicly available models. Notably, the new version of our tool introduces a regression model for predicting hERG activity (pIC50). The optimal model demonstrated an R2 of 0.61 and an RMSE of 0.48, surpassing the only available regression model in the literature. Pred-hERG 5.0 now offers users a swift, reliable, and user-friendly platform for the early assessment of chemically induced cardiotoxicity through hERG blockage. The tool provides versatile outcomes, including (i) classificatory predictions of hERG blockage with prediction reliability, (ii) multiclassificatory predictions of hERG blockage with reliability, (iii) regression predictions with estimated pIC50 values, and (iv) probability maps illustrating the contribution of chemical fragments for each prediction. Furthermore, we implemented explainable AI analysis (XAI) to visualize SHAP values, providing insights into the contribution of each feature to binary classification predictions. 
A consensus prediction calculated based on the predictions of the three developed models is also present to assist the user's decision-making process. Pred-hERG 5.0 has been designed to be user-friendly, making it accessible to users without computational or programming expertise. The tool is freely available at http://predherg.labmol.com.br.
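The regression statistics quoted in this abstract (R2 and RMSE on pIC50) can be computed from paired observed and predicted values. The following is a minimal sketch using the standard textbook definitions; it is not the Pred-hERG code, and the function names and toy pIC50 values are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


if __name__ == "__main__":
    # Hypothetical pIC50 values, not taken from the paper.
    observed = [5.0, 6.0, 7.0]
    predicted = [5.5, 6.0, 6.5]
    print(f"R^2 = {r_squared(observed, predicted):.2f}")
    print(f"RMSE = {rmse(observed, predicted):.2f}")
```

Both statistics are reported on the pIC50 scale, so an RMSE of about 0.5 corresponds to roughly a threefold error in IC50.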
Affiliation(s)
- Igor H. Sanches
- Laboratory for Molecular Modeling and Drug Design (LabMol), Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, GO 74690-900, Brazil
- Center for Excellence in Artificial Intelligence (CEIA), Institute of Informatics, Universidade Federal de Goiás, Goiânia, GO 74690-900, Brazil
- Center for the Research and Advancement in Fragments and Molecular Targets (CRAFT), School of Pharmaceutical Sciences at Ribeirao Preto, University of São Paulo, Ribeirão Preto, SP 05508-220, Brazil
| | | | - Vinicius M. Alves
- University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Carolina Horta Andrade
- Laboratory for Molecular Modeling and Drug Design (LabMol), Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, GO 74690-900, Brazil
- Center for Excellence in Artificial Intelligence (CEIA), Institute of Informatics, Universidade Federal de Goiás, Goiânia, GO 74690-900, Brazil
- Center for the Research and Advancement in Fragments and Molecular Targets (CRAFT), School of Pharmaceutical Sciences at Ribeirao Preto, University of São Paulo, Ribeirão Preto, SP 05508-220, Brazil
|
2
|
Riaz Gondal MU, Atta Mehdi H, Khenhrani RR, Kumari N, Ali MF, Kumar S, Faraz M, Malik J. Role of Machine Learning and Artificial Intelligence in Arrhythmias and Electrophysiology. Cardiol Rev 2024:00045415-990000000-00270. [PMID: 38761137 DOI: 10.1097/crd.0000000000000715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/20/2024]
Abstract
Machine learning (ML), a subset of artificial intelligence (AI) centered on machines learning from extensive datasets, stands at the forefront of a technological revolution shaping various facets of society. Cardiovascular medicine has emerged as a key domain for ML applications, with considerable efforts to integrate these innovations into routine clinical practice. Within cardiac electrophysiology, ML applications, especially in the automated interpretation of electrocardiograms, have garnered substantial attention in existing literature. However, less recognized are the diverse applications of ML in cardiac electrophysiology and arrhythmias, spanning basic science research on arrhythmia mechanisms, both experimental and computational, as well as contributions to enhanced techniques for mapping cardiac electrical function and translational research related to arrhythmia management. This comprehensive review delves into various ML applications within the scope of this journal, organized into 3 parts. The first section provides a fundamental understanding of general ML principles and methodologies, serving as a foundational resource for readers interested in exploring ML applications in arrhythmia research. The second part offers an in-depth review of studies in arrhythmia and electrophysiology that leverage ML methodologies, showcasing the broad potential of ML approaches. Each subject is thoroughly outlined, accompanied by a review of notable ML research advancements. Finally, the review delves into the primary challenges and future perspectives surrounding ML-driven cardiac electrophysiology and arrhythmias research.
Affiliation(s)
| | - Hassan Atta Mehdi
- Department of Medicine, Jinnah Postgraduate Medical Centre, Karachi, Pakistan
| | - Raja Ram Khenhrani
- Department of Medicine, Internal Medicine Fellow, Shaheed Mohtarma Benazir Bhutto Medical College and Lyari General Hospital, Karachi, Pakistan
| | - Neha Kumari
- Department of Medicine, Jinnah Postgraduate Medical Centre, Karachi, Pakistan
| | - Muhammad Faizan Ali
- Department of Medicine, Jinnah Postgraduate Medical Centre, Karachi, Pakistan
| | - Sooraj Kumar
- Department of Medicine, Jinnah Sindh Medical University, Karachi, Pakistan
| | - Maria Faraz
- Department of Cardiovascular Medicine, Cardiovascular Analytics Group, Rawalpindi, Pakistan
| | - Jahanzeb Malik
- Department of Cardiovascular Medicine, Cardiovascular Analytics Group, Rawalpindi, Pakistan
|
3
|
Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023] Open
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the most frequently modeled toxicity endpoints in predictive toxicology. Datasets are known to impact model performance, and the quality of the datasets used to develop machine learning and deep learning toxicity prediction models is vital to the performance of those models. Different toxicity assignments for the same chemicals have been observed among datasets covering the same type of toxicity, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Fan Dong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Meng Song
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Zoe Li
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Md Kamrul Hasan Khan
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
|
4
|
Liu W, Hopkins AM, Yan P, Du S, Luyt LG, Li Y, Hou J. Can machine learning 'transform' peptides/peptidomimetics into small molecules? A case study with ghrelin receptor ligands. Mol Divers 2023; 27:2239-2255. [PMID: 36331785 DOI: 10.1007/s11030-022-10555-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]
Abstract
There has been considerable interest in transforming peptides into small molecules as peptide-based molecules often present poorer bioavailability and lower metabolic stability. Our studies looked into building machine learning (ML) models to investigate if ML is able to identify the 'bioactive' features of peptides and use the features to accurately discriminate between binding and non-binding small molecules. The ghrelin receptor (GR), a receptor that is implicated in various diseases, was used as an example to demonstrate whether ML models derived from a peptide library can be used to predict small molecule binders. ML models based on three different algorithms, namely random forest, support vector machine, and extreme gradient boosting, were built based on a carefully curated dataset of peptide/peptidomimetic and small molecule GR ligands. The results indicated that ML models trained with a dataset exclusively composed of peptides/peptidomimetics provide limited predictive power for small molecules, but that ML models trained with a diverse dataset composed of an array of both peptides/peptidomimetics and small molecules displayed exceptional results in terms of accuracy and false rates. The diversified models can accurately differentiate the binding small molecules from non-binding small molecules using an external validation set with new small molecules that we synthesized previously. Structural features that are the most critical contributors to binding activity were extracted and are remarkably consistent with the crystallography and mutagenesis studies.
Affiliation(s)
- Wenjie Liu
- Department of Chemistry, Lakehead University and Thunder Bay Regional Health Research Institute, 980 Oliver Road, Thunder Bay, ON, P7B 6V4, Canada
| | - Austin M Hopkins
- Department of Chemistry, Lakehead University and Thunder Bay Regional Health Research Institute, 980 Oliver Road, Thunder Bay, ON, P7B 6V4, Canada
| | - Peizhi Yan
- Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, Canada
| | - Shan Du
- Department of Computer Science, Mathematics, Physics and Statistics, The University of British Columbia, Okanagan, Kelowna, BC, Canada
| | - Leonard G Luyt
- Department of Chemistry, University of Western Ontario, London, ON, Canada
- London Regional Cancer Program, Lawson Health Research Institute, London, ON, Canada
| | - Yifeng Li
- Department of Computer Science, Brock University, Saint Catharines, ON, Canada
| | - Jinqiang Hou
- Department of Chemistry, Lakehead University and Thunder Bay Regional Health Research Institute, 980 Oliver Road, Thunder Bay, ON, P7B 6V4, Canada.
|
5
|
Chen Y, Yu X, Li W, Tang Y, Liu G. In silico prediction of hERG blockers using machine learning and deep learning approaches. J Appl Toxicol 2023; 43:1462-1475. [PMID: 37093028 DOI: 10.1002/jat.4477] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/09/2023] [Revised: 04/04/2023] [Accepted: 04/19/2023] [Indexed: 04/25/2023]
Abstract
The human ether-à-go-go-related gene (hERG) is associated with drug cardiotoxicity. If the hERG channel is blocked, it will lead to prolonged QT interval and cause sudden death in severe cases. Therefore, it is important to evaluate the hERG-blocking property of compounds in early drug discovery. In this study, a dataset containing 4556 compounds with IC50 values determined by patch clamp techniques on mammalian lineage cells was collected, and hERG blockers and non-blockers were distinguished according to three single thresholds and two binary thresholds. Four machine learning (ML) algorithms combining four molecular fingerprints and molecular descriptors as well as graph convolutional neural networks (GCNs) were used to construct a series of binary classification models. The results showed that the best models varied for different thresholds. The ML models implemented by support vector machine and random forest performed well based on Morgan fingerprints and molecular descriptors, with AUCs ranging from 0.884 to 0.950. GCN showed superior prediction performance with AUCs above 0.952, which might be related to its direct extraction of molecular features from the original input. Meanwhile, the classification of binary threshold was better than that of single threshold, which could provide us with a more accurate prediction of hERG blockers. At last, the applicability domain for the model was defined, and seven structural alerts that might generate hERG blockage were identified by information gain and substructure frequency analysis. Our work would be beneficial for identifying hERG blockers in chemicals.
Affiliation(s)
- Yuanting Chen
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Xinxin Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
|
6
|
El Harchi A, Hancox JC. hERG agonists pose challenges to web-based machine learning methods for prediction of drug-hERG channel interaction. J Pharmacol Toxicol Methods 2023; 123:107293. [PMID: 37468081 DOI: 10.1016/j.vascn.2023.107293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 05/23/2023] [Accepted: 07/12/2023] [Indexed: 07/21/2023]
Abstract
Pharmacological blockade of the IKr channel (hERG) by diverse drugs in clinical use is associated with Long QT Syndrome, which can lead to life-threatening arrhythmia. Various computational tools, including machine learning models (MLM) for the prediction of hERG inhibition, have been developed to facilitate throughput screening of drugs in development and thereby optimise the prediction of hERG liabilities. The use of MLM relies on large libraries of training compounds for the quantitative structure-activity relationship (QSAR) modelling of hERG inhibition. The focus on inhibition omits potential effects of hERG channel agonist molecules and their associated QT shortening risk. It is instructive, therefore, to consider how known hERG agonists are handled by MLM. Here, two highly developed online computational tools for the prediction of hERG liability, Pred-hERG and HergSPred, were probed for their ability to detect hERG activator drug molecules as hERG interactors. In total, 73 hERG blockers were tested with both computational tools, giving overall good predictions for hERG blockers with reported IC50s below the Pred-hERG and HergSPred cut-off threshold for hERG inhibition. However, for compounds with reported IC50s above this threshold, such as disopyramide or sotalol, discrepancies were observed. HergSPred identified all 20 hERG agonists selected as interacting with the hERG channel. Further studies are warranted to improve online MLM prediction of hERG-related cardiotoxicity by explicitly taking into account channel agonism as well as inhibition.
Affiliation(s)
- Aziza El Harchi
- School of Physiology and Pharmacology and Neuroscience, Biomedical Sciences Building, The University of Bristol, University Walk, Bristol BS8 1TD, UK.
| | - Jules C Hancox
- School of Physiology and Pharmacology and Neuroscience, Biomedical Sciences Building, The University of Bristol, University Walk, Bristol BS8 1TD, UK
|
7
|
Das NR, Sharma T, Toropov AA, Toropova AP, Tripathi MK, Achary PGR. Machine-learning technique, QSAR and molecular dynamics for hERG-drug interactions. J Biomol Struct Dyn 2023; 41:13766-13791. [PMID: 37021352 DOI: 10.1080/07391102.2023.2193641] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/22/2022] [Accepted: 02/06/2023] [Indexed: 04/07/2023]
Abstract
One of the most well-known anti-targets defining medication cardiotoxicity is the voltage-dependent hERG K+ channel, which is known for its crucial involvement in cardiac action potential repolarization. Torsades de Pointes, QT prolongation, and sudden death can all be caused by inhibition of hERG (the human Ether-à-go-go-Related Gene). There is great interest in creating predictive computational (in silico) tools to identify and weed out potential hERG blockers early in the drug discovery process because hERG liability testing and traditional experimental screening are complicated, expensive and time-consuming. This study used 2D descriptors of a large curated dataset of 6766 compounds and machine learning approaches to build robust descriptor-based QSAR and predictive classification models for KCNH2 liability. Decision Tree, Random Forest, Logistic Regression, Ada Boosting, kNN, SVM, Naïve Bayes, neural network and stochastic gradient classifier algorithms were used to build classification models. If a compound's IC50 value was 10 μM or less, it was classified as a blocker (hERG-positive); otherwise, it was classified as a non-blocker (hERG-negative). The Matthews correlation coefficient and the F1 score were applied to compare and track the developed models' performance. Molecular docking and dynamics studies were performed to understand the cardiotoxicity relating to the hERG gene. The hERG residues interacting after 100 ns are LEU:697, THR:708, PHE:656, HIS:674, HIS:703, TRP:705 and ASN:709, and the hERG-ligand-16 complex trajectory showed stable behaviour with lesser fluctuations over the entire 200 ns simulation. Communicated by Ramaswamy H. Sarma.
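The two classification metrics named in this abstract, the Matthews correlation coefficient (MCC) and the F1 score, can both be derived from a binary confusion matrix. The sketch below uses the standard definitions and is not the authors' code; the function names and the example counts are hypothetical. Returning 0.0 when a denominator is zero is a common convention adopted here as an assumption.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0.0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # Assumed convention: no positives predicted or present.
        return 0.0
    return 2 * tp / denominator


if __name__ == "__main__":
    # Hypothetical counts for a blocker (positive) vs non-blocker (negative) model.
    tp, tn, fp, fn = 50, 40, 5, 5
    print(f"MCC = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
    print(f"F1  = {f1_score(tp, fp, fn):.3f}")
```

MCC uses all four cells of the confusion matrix, which makes it less sensitive than F1 to class imbalance between blockers and non-blockers.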
Affiliation(s)
- Nilima Rani Das
- Department of CA, Siksha 'O' Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
| | - Tripti Sharma
- School of Pharmaceutical Sciences, Siksha 'O' Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
| | - Andrey A Toropov
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | - Alla P Toropova
- Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | | | - P Ganga Raju Achary
- Department of Chemistry, Siksha 'O' Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
|
8
|
Hanser T. Federated learning for molecular discovery. Curr Opin Struct Biol 2023; 79:102545. [PMID: 36804704 DOI: 10.1016/j.sbi.2023.102545] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/22/2022] [Revised: 01/06/2023] [Accepted: 01/13/2023] [Indexed: 02/18/2023]
Abstract
Federated Learning enables machine learning across multiple sources of data and alleviates the risk of leaking private information between partners, thereby encouraging knowledge sharing and collaborative modelling. Hence, Federated Learning opens the way to a new generation of improved models. Domains involving molecular informatics, like Drug Discovery, are progressively adopting Federated Learning; this review describes the main projects and applications of Federated Learning for molecular discovery, with a special focus on their benefits and the remaining challenges. All the studies demonstrate a real benefit of Federated Learning, namely the improvement of the performance of models as well as their applicability domain thanks to knowledge aggregation. The selected publications also reveal several remaining challenges to be addressed to fully exploit Federated Learning.
Affiliation(s)
- Thierry Hanser
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, United Kingdom
|
9
|
Melnikov F, Anger LT, Hasselgren C. Toward Quantitative Models in Safety Assessment: A Case Study to Show Impact of Dose-Response Inference on hERG Inhibition Models. Int J Mol Sci 2022; 24:ijms24010635. [PMID: 36614078 PMCID: PMC9820331 DOI: 10.3390/ijms24010635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 12/23/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022] Open
Abstract
Due to challenges with historical data and the diversity of assay formats, in silico models for safety-related endpoints are often based on discretized data instead of the data on a natural continuous scale. Models for discretized endpoints have limitations in usage and interpretation that can impact compound design. Here, we present a consistent data inference approach, exemplified on two data sets of Ether-à-go-go-Related Gene (hERG) K+ inhibition data, for dose-response and screening experiments that are generally applicable for in vitro assays. hERG inhibition has been associated with severe cardiac effects and is one of the more prominent safety targets assessed in drug development, using a wide array of in vitro and in silico screening methods. In this study, the IC50 for hERG inhibition is estimated from diverse historical proprietary data. The IC50 derived from a two-point proprietary screening data set demonstrated high correlation (R = 0.98, MAE = 0.08) with IC50s derived from six-point dose-response curves. Similar IC50 estimation accuracy was obtained on a public thallium flux assay data set (R = 0.90, MAE = 0.2). The IC50 data were used to develop a robust quantitative model. The model's MAE (0.47) and R2 (0.46) were on par with literature statistics and approached assay reproducibility. Using a continuous model has high value for pharmaceutical projects, as it enables rank ordering of compounds and evaluation of compounds against project-specific inhibition thresholds. This data inference approach can be widely applicable to assays with quantitative readouts and has the potential to impact experimental design and improve model performance, interpretation, and acceptance across many standard safety endpoints.
|
10
|
Physicochemical QSAR analysis of hERG inhibition revisited: towards a quantitative potency prediction. J Comput Aided Mol Des 2022; 36:837-849. [PMID: 36305984 DOI: 10.1007/s10822-022-00483-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 10/04/2022] [Indexed: 01/07/2023]
Abstract
In an earlier study (Didziapetris R & Lanevskij K (2016). J Comput Aided Mol Des. 30:1175-1188) we collected a database of publicly available hERG inhibition data for almost 6700 drug-like molecules and built a probabilistic Gradient Boosting classifier with a minimal set of physicochemical descriptors (log P, pKa, molecular size and topology parameters). This approach favored interpretability over statistical performance but still achieved an overall classification accuracy of 75%. In the current follow-up work we expanded the database (provided in Supplementary Information) to almost 9400 molecules and performed temporal validation of the model on a set of novel chemicals from recently published lead optimization projects. Validation results showed almost no performance degradation compared to the original study. Additionally, we rebuilt the model using AFT (Accelerated Failure Time) learning objective in XGBoost, which accepts both quantitative and censored data often reported in protein inhibition studies. The new model achieved a similar level of accuracy of discerning hERG blockers from non-blockers at 10 µM threshold, which can be conceived as close to the performance ceiling for methods aiming to describe only non-specific ligand interactions with hERG. Yet, this model outputs quantitative potency values (IC50) and is not tied to a particular classification cut-off. pIC50 from patch-clamp measurements can be predicted with R2 ≈ 0.4 and MAE < 0.5, which enables ligand ranking according to their expected potency levels. The employed approach can be valuable for quantitative modeling of various ADME and drug safety endpoints with a high prevalence of censored data.
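The 10 µM classification threshold and the pIC50 scale mentioned in this abstract are related by the standard definition pIC50 = -log10(IC50 in mol/L), so 10 µM corresponds to pIC50 = 5. The sketch below illustrates that conversion only; it is not the authors' model, the function names are hypothetical, and treating a compound exactly at the threshold as a blocker is an assumption.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert IC50 in micromolar to pIC50.

    pIC50 = -log10(IC50 [mol/L]) = 6 - log10(IC50 [uM]).
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return 6.0 - math.log10(ic50_um)


def is_blocker(ic50_um: float, threshold_um: float = 10.0) -> bool:
    """Label a compound a blocker if its IC50 is at or below the threshold.

    Assumption: the boundary is inclusive (IC50 == threshold counts as blocker).
    """
    if ic50_um <= 0 or threshold_um <= 0:
        raise ValueError("IC50 and threshold must be positive")
    return ic50_um <= threshold_um


if __name__ == "__main__":
    # Hypothetical IC50 values in micromolar.
    for ic50 in (0.1, 10.0, 30.0):
        label = "blocker" if is_blocker(ic50) else "non-blocker"
        print(f"IC50 = {ic50} uM, pIC50 = {pic50_from_ic50_um(ic50):.2f}, {label}")
```

Because a quantitative model outputs pIC50 directly, the same prediction can be re-thresholded at any project-specific cut-off without retraining.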
|
11
|
Ye L, Ngan DK, Xu T, Liu Z, Zhao J, Sakamuru S, Zhang L, Zhao T, Xia M, Simeonov A, Huang R. Prediction of drug-induced liver injury and cardiotoxicity using chemical structure and in vitro assay data. Toxicol Appl Pharmacol 2022; 454:116250. [PMID: 36150479 PMCID: PMC9561045 DOI: 10.1016/j.taap.2022.116250] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/26/2022] [Revised: 08/24/2022] [Accepted: 09/14/2022] [Indexed: 11/18/2022]
Abstract
Drug-induced liver injury (DILI) and cardiotoxicity (DICT) are major adverse effects triggered by many clinically important drugs. To provide an alternative to in vivo toxicity testing, the U.S. Tox21 consortium has screened a collection of ∼10K compounds, including drugs in clinical use, against >70 cell-based assays in a quantitative high-throughput screening (qHTS) format. In this study, we compiled reference compound lists for DILI and DICT and compared the potential of Tox21 assay data with chemical structure information in building prediction models for human in vivo hepatotoxicity and cardiotoxicity. Models were built with four different machine learning algorithms (e.g., Random Forest, Naïve Bayes, eXtreme Gradient Boosting, and Support Vector Machine) and model performance was evaluated by calculating the area under the receiver operating characteristic curve (AUC-ROC). Chemical structure-based models showed reasonable predictive power for DILI (best AUC-ROC = 0.75 ± 0.03) and DICT (best AUC-ROC = 0.83 ± 0.03), while Tox21 assay data alone only showed better than random performance. DILI and DICT prediction models built using a combination of assay data and chemical structure information did not have a positive impact on model performance. The suboptimal predictive performance of the assay data is likely due to insufficient coverage of an adequately predictive number of toxicity mechanisms. The Tox21 consortium is currently expanding coverage of biological response space with additional assays that probe toxicologically important targets and under-represented pathways that may improve the prediction of in vivo toxicity such as DILI and DICT.
Affiliation(s)
- Lin Ye
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Deborah K Ngan
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Tuan Xu
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Zhichao Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration (FDA), Jefferson, AR 72079, USA
| | - Jinghua Zhao
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Srilatha Sakamuru
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Li Zhang
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Tongan Zhao
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Menghang Xia
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Anton Simeonov
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
| | - Ruili Huang
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA.
|
12
|
Delre P, Lavado GJ, Lamanna G, Saviano M, Roncaglioni A, Benfenati E, Mangiatordi GF, Gadaleta D. Ligand-based prediction of hERG-mediated cardiotoxicity based on the integration of different machine learning techniques. Front Pharmacol 2022; 13:951083. [PMID: 36133824 PMCID: PMC9483173 DOI: 10.3389/fphar.2022.951083] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/23/2022] [Accepted: 07/20/2022] [Indexed: 11/13/2022] Open
Abstract
Drug-induced cardiotoxicity is a common side effect of drugs in clinical use or under postmarket surveillance and is commonly due to off-target interactions with the cardiac human ether-a-go-go-related gene (hERG) potassium channel. Therefore, prioritizing drug candidates based on their hERG blocking potential is a mandatory step in the early preclinical stage of a drug discovery program. Herein, we trained and properly validated 30 ligand-based classifiers of hERG-related cardiotoxicity based on 7,963 curated compounds extracted from the freely accessible repository ChEMBL (version 25). Different machine learning algorithms were tested, namely, random forest, K-nearest neighbors, gradient boosting, extreme gradient boosting, multilayer perceptron, and support vector machine. The application of 1) the best practices for data curation, 2) the feature selection method VSURF, and 3) the synthetic minority oversampling technique (SMOTE) to properly handle the unbalanced data allowed for the development of highly predictive models (BA_max = 0.91, AUC_max = 0.95). Remarkably, the temporal validation approach not only supported the predictivity of the classifiers presented here but also suggested their ability to outperform models commonly used in the literature. From a more methodological point of view, the study put forward a new computational workflow, freely available in the GitHub repository (https://github.com/PDelre93/hERG-QSAR), as valuable for building highly predictive models of hERG-mediated cardiotoxicity.
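To illustrate the oversampling step named above, here is a minimal, standard-library sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name, parameters, and toy descriptors are assumptions for illustration only; this is not the workflow from the cited repository, and a maintained implementation would normally be preferred in practice.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (hypothetical descriptors).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class with two descriptors per compound (hypothetical values).
blockers = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(blockers, n_new=10)
```

Oversampling of this kind is usually applied to the training portion only, so that validation and test sets keep the original class balance.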
Affiliation(s)
- Pietro Delre
- CNR—Institute of Crystallography, Bari, Italy
- Chemistry Department, University of Bari “Aldo Moro”, Bari, Italy
| | - Giovanna J. Lavado
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | - Giuseppe Lamanna
- CNR—Institute of Crystallography, Bari, Italy
- Chemistry Department, University of Bari “Aldo Moro”, Bari, Italy
| | | | - Alessandra Roncaglioni
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | - Emilio Benfenati
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
| | - Giuseppe Felice Mangiatordi
- CNR—Institute of Crystallography, Bari, Italy
- *Correspondence: Giuseppe Felice Mangiatordi; Domenico Gadaleta
| | - Domenico Gadaleta
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- *Correspondence: Giuseppe Felice Mangiatordi; Domenico Gadaleta
|
13
|
Goel H, Yu W, MacKerell AD. hERG Blockade Prediction by Combining Site Identification by Ligand Competitive Saturation and Physicochemical Properties. Chemistry (Basel) 2022; 4:630-646. [PMID: 36712295 PMCID: PMC9881610 DOI: 10.3390/chemistry4030045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/02/2023]
Abstract
The human ether-a-go-go-related gene (hERG) potassium channel is a well-known contributor to drug-induced cardiotoxicity and therefore an extremely important target when performing safety assessments of drug candidates. Ligand-based approaches in connection with quantitative structure-activity relationship (QSAR) analyses have been developed to predict hERG toxicity. The availability of the recently published cryogenic electron microscopy (cryo-EM) structure of the hERG channel opened the prospect of using structure-based simulation and docking approaches for hERG drug liability predictions. Recently, the idea of combining structure- and ligand-based approaches for modeling hERG drug liability has gained momentum, offering improvements in predictability compared to ligand-based QSAR practices alone. The present article demonstrates combining the structure-based SILCS (site identification by ligand competitive saturation) approach with physicochemical properties to develop predictive models for hERG blockade. This combination leads to improved model predictability, based on Pearson's R and the percent correct metric (which represents rank-ordering of ligands), for different validation sets of hERG blockers involving diverse chemical scaffolds and a wide range of pIC50 values. The inclusion of the SILCS structure-based approach allows determination of the hERG region to which compounds bind and of the contribution of different chemical moieties in the compounds to blockade, thereby facilitating rational ligand design to minimize hERG liability.
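The two metrics named above can be sketched in a few lines of Python. Pearson's R follows its standard definition. For "percent correct", the abstract only says that it represents rank-ordering of ligands, so the pairwise definition below (the share of ligand pairs whose predicted order matches the experimental order) is an assumption, not necessarily the authors' exact formula. All names and values are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def percent_correct(experimental, predicted):
    """Percentage of ligand pairs ranked in the same order as in experiment.

    Assumed pairwise definition; pairs with identical experimental values are skipped.
    """
    correct = total = 0
    for (e1, p1), (e2, p2) in combinations(zip(experimental, predicted), 2):
        if e1 == e2:
            continue
        total += 1
        if (e1 - e2) * (p1 - p2) > 0:
            correct += 1
    if total == 0:
        raise ValueError("no ligand pairs with distinct experimental values")
    return 100.0 * correct / total


# Hypothetical experimental and predicted pIC50 values for three ligands.
exp_pic50 = [5.0, 6.0, 7.0]
pred_pic50 = [5.2, 6.5, 6.1]
pc = percent_correct(exp_pic50, pred_pic50)  # two of three pairs ordered correctly
```

A rank-based metric of this kind complements Pearson's R because it rewards correct ordering even when predicted values are offset or scaled.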
Affiliation(s)
- Himanshu Goel
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, 20 Penn St. Baltimore, MD 21201, United States
| | - Wenbo Yu
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, 20 Penn St. Baltimore, MD 21201, United States
| | - Alexander D. MacKerell
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, 20 Penn St. Baltimore, MD 21201, United States
|
14
|
Shan M, Jiang C, Qin L, Cheng G. A Review of Computational Methods in Predicting hERG Channel Blockers. ChemistrySelect 2022. [DOI: 10.1002/slct.202201221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Mengyi Shan
- School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
| | - Chen Jiang
- QuanMin RenZheng (HangZhou) Technology Co. Ltd. China
| | - Lu‐Ping Qin
- School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
| | - Gang Cheng
- School of Pharmaceutical Sciences Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
|
15
|
Zhu Z, Deng Z, Wang Q, Wang Y, Zhang D, Xu R, Guo L, Wen H. Simulation and Machine Learning Methods for Ion-Channel Structure Determination, Mechanistic Studies and Drug Design. Front Pharmacol 2022; 13:939555. [PMID: 35837274 PMCID: PMC9275593 DOI: 10.3389/fphar.2022.939555] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/09/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Ion channels are expressed in almost all living cells, where they control communication across the membrane, making them ideal drug targets, especially for central nervous system diseases. However, owing to their dynamic nature and the presence of a membrane environment, ion channels have remained difficult targets over the past decades. Recent advances in cryo-electron microscopy and computational methods have shed light on this issue. An explosion in high-resolution ion channel structures paved the way for structure-based rational drug design, and state-of-the-art simulation and machine learning techniques have dramatically improved the efficiency and effectiveness of computer-aided drug design. Here we present an overview of how simulation and machine learning-based methods have fundamentally changed ion channel-related drug design at different levels, as well as the emerging trends in the field.
Affiliation(s)
- Zhengdan Zhu
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Beijing Institute of Big Data Research, Beijing, China
| | - Zhenfeng Deng
- DP Technology, Beijing, China
- School of Pharmaceutical Sciences, Peking University, Beijing, China
| | | | | | - Duo Zhang
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- DP Technology, Beijing, China
| | - Ruihan Xu
- DP Technology, Beijing, China
- National Engineering Research Center of Visual Technology, Peking University, Beijing, China
| | | | - Han Wen
- DP Technology, Beijing, China
|
16
|
Krishna S, Borrel A, Huang R, Zhao J, Xia M, Kleinstreuer N. High-Throughput Chemical Screening and Structure-Based Models to Predict hERG Inhibition. Biology 2022; 11:209. [PMID: 35205076 PMCID: PMC8869358 DOI: 10.3390/biology11020209] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/04/2021] [Revised: 01/18/2022] [Accepted: 01/21/2022] [Indexed: 12/23/2022]
Abstract
Simple Summary: Cardiovascular disease is the leading cause of death for people of most ethnicities in the United States. The human ether-a-go-go-related gene (hERG) potassium channel plays a pivotal role in cardiac rhythm regulation, and cardiotoxicity associated with hERG inhibition by drug molecules and environmental chemicals is a major public health concern. An evaluation of the effect of environmental chemicals on hERG channel function can help inform the potential public health risks of these compounds. To assess the cardiotoxic effect of diverse drugs and environmental compounds, the Tox21 federal research program has screened a collection of 9667 chemicals for inhibitory activity against the hERG channel. A set of molecular descriptors covering physicochemical and structural properties of chemicals, self-organizing maps, and hierarchical clustering were applied to characterize the chemicals inhibiting hERG. Machine learning approaches were applied to build robust statistical models that can predict the probability of any new chemical to cause cardiotoxicity via this mechanism.
Abstract: Chemical inhibition of the human ether-a-go-go-related gene (hERG) potassium channel leads to a prolonged QT interval that can contribute to severe cardiotoxicity. The adverse effects of hERG inhibition are one of the principal causes of drug attrition in clinical and pre-clinical development. Preliminary studies have demonstrated that a wide range of environmental chemicals and toxicants may also inhibit the hERG channel and contribute to the pathophysiology of cardiovascular (CV) diseases. As part of the US federal Tox21 program, the National Center for Advancing Translational Sciences (NCATS) applied a quantitative high throughput screening (qHTS) approach to screen the Tox21 library of 10,000 compounds (~7871 unique chemicals) at 14 concentrations in triplicate to identify chemicals perturbing hERG activity in the U2OS cell line thallium flux assay platform. The qHTS cell-based thallium influx assay provided a robust and reliable dataset to evaluate the ability of thousands of drugs and environmental chemicals to inhibit hERG channel protein, and the use of chemical structure-based clustering and chemotype enrichment analysis facilitated the identification of molecular features that are likely responsible for the observed hERG activity. We employed several machine-learning approaches to develop QSAR prediction models for the assessment of hERG liabilities for drug-like and environmental chemicals. The training set was compiled by integrating hERG bioactivity data from the ChEMBL database with the Tox21 qHTS thallium flux assay data. The best results were obtained with the random forest method (~92.6% balanced accuracy). The data and scripts used to generate hERG prediction models are provided in an open-access format as key in vitro and in silico tools that can be applied in a translational toxicology pipeline for drug development and environmental chemical screening.
Affiliation(s)
- Shagun Krishna
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences (NIEHS), Research Triangle, NC 27560, USA;
| | | | - Ruili Huang
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), Bethesda, MD 20892-4874, USA; (R.H.); (J.Z.); (M.X.)
| | - Jinghua Zhao
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), Bethesda, MD 20892-4874, USA; (R.H.); (J.Z.); (M.X.)
| | - Menghang Xia
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), Bethesda, MD 20892-4874, USA; (R.H.); (J.Z.); (M.X.)
| | - Nicole Kleinstreuer
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences (NIEHS), Research Triangle, NC 27560, USA;
- Correspondence: Tel.: +1-984-287-3150
|
17
|
Shan M, Jiang C, Chen J, Qin LP, Qin JJ, Cheng G. Predicting hERG channel blockers with directed message passing neural networks. RSC Adv 2022; 12:3423-3430. [PMID: 35425351 PMCID: PMC8979305 DOI: 10.1039/d1ra07956e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/29/2021] [Accepted: 12/13/2021] [Indexed: 11/30/2022] Open
Abstract
Compounds with human ether-à-go-go related gene (hERG) blockade activity may cause severe cardiotoxicity. Assessing hERG liability in the early stages of the drug discovery process is important, and in silico methods for predicting hERG channel blockers are actively pursued. In the present study, the directed message passing neural network (D-MPNN) was applied to construct classification models for identifying hERG blockers based on diverse datasets. Several descriptors and fingerprints were tested along with the D-MPNN model. Among all these combinations, D-MPNN with the moe206 descriptors generated from MOE (D-MPNN + moe206) showed significantly improved performance. On Cai's hERG dataset, the AUC-ROC values of the D-MPNN + moe206 model reached 0.956 ± 0.005 under random split and 0.922 ± 0.015 under scaffold split. Moreover, our models were compared with several recently reported machine learning models on various datasets. Our results indicated that the D-MPNN + moe206 model is among the best classification models. Overall, the excellent performance of the D-MPNN + moe206 model achieved in this study highlights its potential application in the discovery of novel and effective hERG blockers.
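The gap between the random-split and scaffold-split results above comes from how compounds are assigned to the test set. As a minimal sketch of a scaffold split, the Python below groups compounds by a precomputed scaffold key and sends whole groups to either the training or the test set, so that no scaffold appears in both. The scaffold keys (for example, scaffold SMILES computed with a cheminformatics toolkit), the function name, and the 80/20 target are assumptions for illustration; the abstract does not describe the exact procedure used.

```python
from collections import defaultdict


def scaffold_split(scaffolds, test_fraction=0.2):
    """Split compound indices so that each scaffold lands entirely in train or test.

    scaffolds: one precomputed scaffold key per compound (hypothetical input).
    Larger scaffold groups are placed first; a group goes to the test set
    when adding it would push the training set past its target size.
    """
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_target = len(scaffolds) * (1 - test_fraction)
    train, test = [], []
    for members in ordered:
        if len(train) + len(members) <= train_target:
            train.extend(members)
        else:
            test.extend(members)
    return sorted(train), sorted(test)


# Ten hypothetical compounds spread over four scaffolds.
keys = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"]
train_idx, test_idx = scaffold_split(keys)
```

Because test compounds share no scaffold with the training compounds, a scaffold split is usually a harder and more realistic estimate of performance on new chemotypes than a random split.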
Affiliation(s)
- Mengyi Shan
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
| | - Chen Jiang
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China; Hangzhou Jingchun Trading Co., Ltd. China
| | - Jing Chen
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China; College of Pharmaceutical Sciences, Zhejiang University Hangzhou Zhejiang 310058 PR China
| | - Lu-Ping Qin
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
| | - Jiang-Jiang Qin
- The Cancer Hospital of the University of Chinese Academy of Sciences, Zhejiang Cancer Hospital, Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences Hangzhou 310022 China
| | - Gang Cheng
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University Hangzhou 310053 People's Republic of China
|
18
|
Creanza TM, Delre P, Ancona N, Lentini G, Saviano M, Mangiatordi GF. Structure-Based Prediction of hERG-Related Cardiotoxicity: A Benchmark Study. J Chem Inf Model 2021; 61:4758-4770. [PMID: 34506150 PMCID: PMC9282647 DOI: 10.1021/acs.jcim.1c00744] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/29/2022]
Abstract
Drug-induced blockade of the human ether-à-go-go-related gene (hERG) channel is today considered the main cause of cardiotoxicity in postmarketing surveillance. Hence, several ligand-based approaches were developed in the last years and are currently employed in the early stages of a drug discovery process for in silico cardiac safety assessment of drug candidates. Herein, we present the first structure-based classifiers able to discern hERG binders from nonbinders. LASSO regularized support vector machines were applied to integrate docking scores and protein–ligand interaction fingerprints. A total of 396 models were trained and validated based on: (i) high-quality experimental bioactivity information returned by 8337 curated compounds extracted from ChEMBL (version 25) and (ii) structural predictor data. Molecular docking simulations were performed using GLIDE and GOLD software programs and four different hERG structural models, namely, the recently published structures obtained by cryoelectron microscopy (PDB codes: 5VA1 and 7CN1) and two published homology models selected for comparison. Interestingly, some classifiers return performances comparable to ligand-based models in terms of area under the ROC curve (AUC_max = 0.86 ± 0.01) and negative predictive values (NPV_max = 0.81 ± 0.01), thus putting forward the herein proposed computational workflow as a valuable tool for predicting hERG-related cardiotoxicity without the limitations of ligand-based models, typically affected by low interpretability and a limited applicability domain. From a methodological point of view, our study represents the first example of a successful integration of docking scores and protein–ligand interaction fingerprints (IFs) through a support vector machine (SVM) LASSO regularized strategy. Finally, the study highlights the importance of using hERG structural models accounting for ligand-induced fit effects and allowed us to select the best-performing protein conformation (made available in the Supporting Information, SI) to be employed for a reliable structure-based prediction of hERG-related cardiotoxicity.
Affiliation(s)
- Teresa Maria Creanza
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
| | - Pietro Delre
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy.,CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Nicola Ancona
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
| | - Giovanni Lentini
- Department of Pharmacy-Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy
| | - Michele Saviano
- CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
|
19
|
Lee KH, Fant AD, Guo J, Guan A, Jung J, Kudaibergenova M, Miranda WE, Ku T, Cao J, Wacker S, Duff HJ, Newman AH, Noskov SY, Shi L. Toward Reducing hERG Affinities for DAT Inhibitors with a Combined Machine Learning and Molecular Modeling Approach. J Chem Inf Model 2021; 61:4266-4279. [PMID: 34420294 DOI: 10.1021/acs.jcim.1c00856] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/08/2023]
Abstract
Psychostimulant drugs, such as cocaine, inhibit dopamine reuptake by blocking the dopamine transporter (DAT), which is the primary mechanism underpinning their abuse. Atypical DAT inhibitors are dissimilar to cocaine and can block cocaine- or methamphetamine-induced behaviors, supporting their development as part of a treatment regimen for psychostimulant use disorders. When developing these atypical DAT inhibitors as medications, it is necessary to avoid off-target binding that can produce unwanted side effects or toxicities. In particular, blockade of a potassium channel, the human ether-a-go-go-related gene (hERG) channel, can lead to potentially lethal ventricular tachycardia. In this study, we established a counter screening platform for DAT and against hERG binding by combining machine learning-based quantitative structure-activity relationship (QSAR) modeling, experimental validation, and molecular modeling and simulations. Our results show that the available data are adequate to establish robust QSAR models, as validated by chemical synthesis and pharmacological evaluation of a validation set of DAT inhibitors. Furthermore, QSAR models built on subsets of the data, grouped by the experimental approach used, also have predictive power, which opens the door to targeting specific functional states of a protein. Complementarily, our molecular modeling and simulations identified the structural elements responsible for a pair of DAT inhibitors having opposite binding affinity trends at DAT and hERG, which can be leveraged for rational optimization of lead atypical DAT inhibitors with desired pharmacological properties.
Affiliation(s)
- Kuo Hao Lee
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Andrew D Fant
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Jiqing Guo
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
| | - Andy Guan
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Joslyn Jung
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Mary Kudaibergenova
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
| | - Williams E Miranda
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
| | - Therese Ku
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Jianjing Cao
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Soren Wacker
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada.,Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada.,Achlys Inc., 7-126 Li Ka Shing Center for Health and Innovation, Edmonton, Alberta T6G 2E1, Canada
| | - Henry J Duff
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
| | - Amy Hauck Newman
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
| | - Sergei Y Noskov
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
| | - Lei Shi
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
|
20
|
Karim A, Lee M, Balle T, Sattar A. CardioTox net: a robust predictor for hERG channel blockade based on deep learning meta-feature ensembles. J Cheminform 2021; 13:60. [PMID: 34399849 PMCID: PMC8365955 DOI: 10.1186/s13321-021-00541-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/04/2021] [Accepted: 08/05/2021] [Indexed: 11/10/2022] Open
Abstract
MOTIVATION: Human ether-a-go-go-related gene (hERG) channel blockade by small molecules is a major concern during drug development in the pharmaceutical industry. Blockade of hERG channels may cause prolonged QT intervals that could potentially lead to cardiotoxicity. Various in silico techniques, including deep learning models, are widely used to screen out small molecules with potential hERG-related toxicity. Most published deep learning methods utilize a single type of feature, which might restrict their performance. Methods based on more than one type of feature, such as DeepHIT, struggle with the aggregation of extracted information. DeepHIT shows better performance when evaluated against one or two metrics, such as negative predictive value (NPV) and sensitivity (SEN), but struggles when evaluated against others, such as Matthews correlation coefficient (MCC), accuracy (ACC), positive predictive value (PPV) and specificity (SPE). Therefore, there is a need for a method that can efficiently aggregate information gathered from models based on different chemical representations and boost hERG toxicity prediction over a range of performance metrics.
RESULTS: In this paper, we propose a deep learning framework based on step-wise training to predict the hERG channel blocking activity of small molecules. Our approach utilizes five individual deep learning base models with their respective base features and a separate neural network to combine the outputs of the five base models. On three external independent test sets, with blockers defined by an IC50 threshold of 10 μM, our method achieves better performance for a combination of classification metrics. We also investigate the effective aggregation of chemical information extracted for robust hERG activity prediction. In summary, CardioTox net can serve as a robust tool for screening small molecules for hERG channel blockade in drug discovery pipelines and performs better than previously reported methods on a range of classification metrics.
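The classification metrics listed in this abstract all derive from the same confusion matrix. As a minimal sketch, the Python below labels compounds as blockers using a 10 μM IC50 threshold and computes SEN, SPE, PPV, NPV, ACC and MCC from true and predicted labels. Treating IC50 values at or below the threshold as blockers, along with the function names and toy labels, is an assumption for illustration; the abstract does not state which side of the threshold is inclusive.

```python
from math import sqrt


def label_blocker(ic50_um, threshold_um=10.0):
    """Return 1 (blocker) if IC50 is at or below the threshold, else 0 (assumed convention)."""
    return 1 if ic50_um <= threshold_um else 0


def classification_metrics(y_true, y_pred):
    """Compute SEN, SPE, PPV, NPV, ACC and MCC from binary labels (1 = blocker)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SEN": ratio(tp, tp + fn),
        "SPE": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "ACC": ratio(tp + tn, tp + tn + fp + fn),
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical labels for six compounds.
metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Reporting several of these together matters because a model can score well on NPV and SEN while doing poorly on PPV, SPE or MCC, which is the imbalance the abstract attributes to earlier methods.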
Affiliation(s)
- Abdul Karim
- School of Information Communication Technology, Griffith University, 4111 Nathan, Brisbane, Australia
| | - Matthew Lee
- School of Information Communication Technology, Griffith University, 4111 Nathan, Brisbane, Australia
| | - Thomas Balle
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, 2006 Sydney, Australia
- Brain and Mind Centre, The University of Sydney, 2050 Sydney, Australia
| | - Abdul Sattar
- Institute of Integrated and Intelligent Systems, Griffith University, 4111 Nathan, Brisbane, Australia
|
21
|
Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K. Machine learning models for classification tasks related to drug safety. Mol Divers 2021; 25:1409-1424. [PMID: 34110577 PMCID: PMC8342376 DOI: 10.1007/s11030-021-10239-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The best classification models were compared to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with strong performance for hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary.
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
| | | | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary.
|
22
|
Abstract
Machine learning (ML), a branch of artificial intelligence in which machines learn from big data, is at the crest of a technological wave of change sweeping society. Cardiovascular medicine is at the forefront of many ML applications, and there is a significant effort to bring them into mainstream clinical practice. In the field of cardiac electrophysiology, ML applications have also seen rapid growth and popularity, particularly the use of ML in the automatic interpretation of ECGs, which has been extensively covered in the literature. Much less well known are the other aspects of ML application in cardiac electrophysiology and arrhythmias, such as those in basic science research on arrhythmia mechanisms, both experimental and computational; in the development of better techniques for mapping of cardiac electrical function; and in translational research related to arrhythmia management. In the current review, we comprehensively examine such ML applications as they match the scope of this journal. The review is organized in 3 parts. The first provides an overview of general ML principles and methodologies that gives readers the necessary information on the subject, serving as the foundation for inviting further ML applications in arrhythmia research. The basic information we provide can serve as a guide on how one might design and conduct an ML study. The second part is a review of arrhythmia and electrophysiology studies in which ML has been utilized, highlighting the broad potential of ML approaches. For each subject, we comprehensively outline the general topics while reviewing some of the research advances utilizing ML under the subject. Finally, we discuss the main challenges and the perspectives for ML-driven cardiac electrophysiology and arrhythmia research.
Collapse
Affiliation(s)
- Natalia A. Trayanova
- Department of Biomedical Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
- Alliance for Cardiovascular Diagnosis and Treatment Innovation, Whiting School of Engineering and School of Medicine, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
- Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, 733 North Broadway, Baltimore, MD, USA 21205
- Dan M. Popescu
- Alliance for Cardiovascular Diagnosis and Treatment Innovation, Whiting School of Engineering and School of Medicine, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
- Department of Applied Mathematics and Statistics, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
- Julie K. Shade
- Department of Biomedical Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
- Alliance for Cardiovascular Diagnosis and Treatment Innovation, Whiting School of Engineering and School of Medicine, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD, USA 21218
23
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. BIOTECHNOL BIOPROC E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
24
|
Wang MWH, Goodman JM, Allen TEH. Machine Learning in Predictive Toxicology: Recent Applications and Future Directions for Classification Models. Chem Res Toxicol 2020; 34:217-239. [PMID: 33356168 DOI: 10.1021/acs.chemrestox.0c00316] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 02/07/2023]
Abstract
In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning. The recent successes of these machine learning methods in predictive toxicology are summarized, and a comparison of some models used in predictive toxicology is presented. In predictive toxicology, SVMs, RF, and DTs are the dominant machine learning methods due to the characteristics of the data available. Lastly, this review describes the current challenges facing the use of machine learning in predictive toxicology and offers insights into the possible areas of improvement in the field.
Affiliation(s)
- Marcus W H Wang
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Jonathan M Goodman
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Timothy E H Allen
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- MRC Toxicology Unit, University of Cambridge, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, United Kingdom
25
|
Siramshetty VB, Nguyen DT, Martinez NJ, Southall NT, Simeonov A, Zakharov AV. Critical Assessment of Artificial Intelligence Methods for Prediction of hERG Channel Inhibition in the "Big Data" Era. J Chem Inf Model 2020; 60:6007-6019. [PMID: 33259212 DOI: 10.1021/acs.jcim.0c00884] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/18/2022]
Abstract
The rise of novel artificial intelligence (AI) methods necessitates their benchmarking against classical machine learning for a typical drug-discovery project. Inhibition of the potassium ion channel, whose alpha subunit is encoded by the human ether-à-go-go-related gene (hERG), leads to a prolonged QT interval of the cardiac action potential and is a significant safety pharmacology target for the development of new medicines. Several computational approaches have been employed to develop prediction models for the assessment of hERG liabilities of small molecules including recent work using deep learning methods. Here, we perform a comprehensive comparison of hERG effect prediction models based on classical approaches (random forests and gradient boosting) and modern AI methods [deep neural networks (DNNs) and recurrent neural networks (RNNs)]. The training set (∼9000 compounds) was compiled by integrating the hERG bioactivity data from the ChEMBL database with experimental data generated from an in-house, high-throughput thallium flux assay. We utilized different molecular descriptors including the latent descriptors, which are real-value continuous vectors derived from chemical autoencoders trained on a large chemical space (>1.5 million compounds). The models were prospectively validated on ∼840 in-house compounds screened in the same thallium flux assay. The best results were obtained with the XGBoost method and RDKit descriptors. The comparison of models based only on latent descriptors revealed that the DNNs performed significantly better than the classical methods. The RNNs that operate on SMILES provided the highest model sensitivity. The best models were merged into a consensus model that offered superior performance compared to reference models from academic and commercial domains. 
Furthermore, we shed light on the potential of AI methods to exploit the big data in chemistry and generate novel chemical representations useful in predictive modeling and tailoring a new chemical space.
Affiliation(s)
- Vishal B Siramshetty
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Dac-Trung Nguyen
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Natalia J Martinez
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Noel T Southall
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Anton Simeonov
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Alexey V Zakharov
- National Center for Advancing Translational Sciences (NCATS), 9800 Medical Center Drive, Rockville, Maryland 20850, United States
26
|
Kim H, Nam H. hERG-Att: Self-attention-based deep neural network for predicting hERG blockers. Comput Biol Chem 2020; 87:107286. [PMID: 32531518 DOI: 10.1016/j.compbiolchem.2020.107286] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/13/2020] [Accepted: 05/09/2020] [Indexed: 02/05/2023]
Abstract
A voltage-gated potassium channel encoded by the human ether-à-go-go-related gene (hERG) regulates cardiac action potential, and it is involved in cardiotoxicity with compounds that inhibit its activity. Therefore, the screening of hERG channel blockers is a mandatory step in the drug discovery process. The screening of hERG blockers by using conventional methods is inefficient in terms of cost and effort. This has led to the development of many in silico hERG blocker prediction models. However, constructing a high-performance predictive model with interpretability on hERG blockage by certain compounds is a major obstacle. In this study, we developed the first attention-based, interpretable model that predicts hERG blockers and captures important hERG-related compound substructures. To do that, we first collected various datasets, ranging from public databases to publicly available private datasets, to train and test the model. Then, we developed a precise and interpretable hERG blocker prediction model by using deep learning with a self-attention approach and an appropriate molecular descriptor, the Morgan fingerprint. The proposed prediction model was validated, and the validation result showed that the model was well-optimized and had high performance. The test set performance of the proposed model was significantly higher than that of previous fingerprint-based conventional machine learning models. In particular, the proposed model generally had high accuracy and F1 score, thereby representing the model's predictive reliability. Furthermore, we interpreted the calculated attention score vectors obtained from the proposed prediction model and demonstrated the important structural patterns that are represented in hERG blockers. In summary, we have proposed a powerful and interpretable hERG blocker prediction model that can reduce the overall cost of drug discovery by accurately screening for hERG blockers and suggesting hERG-related substructures.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju 61005, Republic of Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-gu, Gwangju 61005, Republic of Korea
|
27
|
Wang Y, Huang L, Jiang S, Wang Y, Zou J, Fu H, Yang S. Capsule Networks Showed Excellent Performance in the Classification of hERG Blockers/Nonblockers. Front Pharmacol 2020; 10:1631. [PMID: 32063849 PMCID: PMC6997788 DOI: 10.3389/fphar.2019.01631] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/27/2019] [Accepted: 12/13/2019] [Indexed: 02/05/2023]
Abstract
Capsule networks (CapsNets), a new class of deep neural network architectures proposed recently by Hinton et al., have shown great performance in many fields, particularly in image recognition and natural language processing. However, CapsNets have not yet been applied to drug discovery-related studies. As a first attempt, in this investigation we adopted CapsNets to develop classification models of hERG blockers/nonblockers; drugs with hERG blockade activity are thought to have a potential risk of cardiotoxicity. Two capsule network architectures were established: convolution-capsule network (Conv-CapsNet) and restricted Boltzmann machine-capsule networks (RBM-CapsNet), in which convolution and a restricted Boltzmann machine (RBM) were used as feature extractors, respectively. Two prediction models of hERG blockers/nonblockers were then developed by Conv-CapsNet and RBM-CapsNet with Doddareddy's training set composed of 2,389 compounds. The established models showed excellent performance in an independent test set comprising 255 compounds, with prediction accuracies of 91.8 and 92.2% for Conv-CapsNet and RBM-CapsNet models, respectively. Various comparisons were also made between our models and those developed by other machine learning methods including deep belief network (DBN), convolutional neural network (CNN), multilayer perceptron (MLP), support vector machine (SVM), k-nearest neighbors (kNN), logistic regression (LR), and LightGBM, and with different training sets. All the results showed that the models by Conv-CapsNet and RBM-CapsNet are among the best classification models. Overall, the excellent performance of capsule networks achieved in this investigation highlights their potential in drug discovery-related studies.
Affiliation(s)
- Yiwei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- College of Preclinical Medicine, Southwest Medical University, Luzhou, China
- Lei Huang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Basic Teaching Department, Sichuan College of Architectural Technology, Deyang, China
- Siwen Jiang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Hongguang Fu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
28
|
|
29
|
Zhang Y, Zhao J, Wang Y, Fan Y, Zhu L, Yang Y, Chen X, Lu T, Chen Y, Liu H. Prediction of hERG K+ channel blockage using deep neural networks. Chem Biol Drug Des 2019; 94:1973-1985. [PMID: 31394026 DOI: 10.1111/cbdd.13600] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/24/2019] [Revised: 07/23/2019] [Accepted: 07/30/2019] [Indexed: 01/08/2023]
Abstract
Human ether-a-go-go-related gene (hERG) K+ channel blockage may cause severe cardiac side-effects and has become a serious issue in safety evaluation of drug candidates. Therefore, improving the ability to avoid undesirable hERG activity in the early stage of drug discovery is of significant importance. The purpose of this study was to build predictive models of hERG activity by deep neural networks. For each combination of sampling methods and descriptors, deep neural networks with different architectures were implemented to build classification models. The optimal model M15, with three hidden layers, an undersampling method, and 2D descriptors, yielded a prediction accuracy of 0.78 and an F1 score of 0.75 on the test set, as well as an accuracy of 0.77 and an F1 score of 0.34 on the external validation set, outperforming the other 35 models including 9 random forest models. In particular, the optimal model M15 achieved the highest F1 score and the second highest accuracy when compared with five other methods from four groups using different machine learning algorithms with the same external validation set. This model appears to have a strong capability for predicting hERG toxicity, which is of great benefit for developing novel drug candidates.
Affiliation(s)
- Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Junnan Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Lu Zhu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yan Yang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Xingye Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing, China
- Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
30
|
Zhang Y, Wang Y, Zhou W, Fan Y, Zhao J, Zhu L, Lu S, Lu T, Chen Y, Liu H. A combined drug discovery strategy based on machine learning and molecular docking. Chem Biol Drug Des 2019; 93:685-699. [PMID: 30688405 DOI: 10.1111/cbdd.13494] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/01/2018] [Revised: 01/04/2019] [Accepted: 01/19/2019] [Indexed: 12/14/2022]
Abstract
Data mining methods based on machine learning play an increasingly important role in drug design and discovery. In the current work, eight machine learning methods including decision trees, k-nearest neighbors, support vector machines, random forests, extremely randomized trees, AdaBoost, gradient boosting trees, and XGBoost were evaluated comprehensively through a case study of ACC inhibitor data sets. Internal and external data sets were employed for cross-validation of the eight machine learning methods. Results showed that the extremely randomized trees model performed best and was adopted as the first step of virtual screening. Together with structure-based virtual screening in the second step, this combined strategy obtained desirable results. This work indicates that the combination of machine learning methods with traditional structure-based virtual screening can effectively strengthen the ability to find potential hits in large compound databases for a given target.
Affiliation(s)
- Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Weineng Zhou
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Junnan Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Lu Zhu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Shuai Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing, China
- Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
- Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, Nanjing, China
31
|
Mayr F, Vieider C, Temml V, Stuppner H, Schuster D. Open-Access Activity Prediction Tools for Natural Products. Case Study: hERG Blockers. PROGRESS IN THE CHEMISTRY OF ORGANIC NATURAL PRODUCTS 2019; 110:177-238. [PMID: 31621014 DOI: 10.1007/978-3-030-14632-0_6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022]
Abstract
Interference with the hERG potassium ion channel may cause cardiac arrhythmia and can even lead to death. Over the last few decades, several drugs, already on the market, and many more investigational drugs in various development stages, have had to be discontinued because of their hERG-associated toxicity. To recognize potential hERG activity in the early stages of drug development, a wide array of computational tools, based on different principles, such as 3D QSAR, 2D and 3D similarity, and machine learning, have been developed and are reviewed in this chapter. The various available prediction tools Similarity Ensemble Approach, SuperPred, SwissTargetPrediction, HitPick, admetSAR, PASSonline, Pred-hERG, and VirtualToxLab™ were used to screen a dataset of known hERG synthetic and natural product actives and inactives to quantify and compare their predictive power. This contribution will allow the reader to evaluate the suitability of these computational methods for their own related projects. There is an unmet need for natural product-specific prediction tools in this field.
Affiliation(s)
- Fabian Mayr
- Institute of Pharmacy/Pharmacognosy, University of Innsbruck, Innsbruck, Austria
- Institute of Pharmacy/Pharmaceutical Chemistry, University of Innsbruck, Innsbruck, Austria
- Christian Vieider
- Institute of Pharmacy/Pharmaceutical Chemistry, University of Innsbruck, Innsbruck, Austria
- Veronika Temml
- Institute of Pharmacy/Pharmacognosy, University of Innsbruck, Innsbruck, Austria
- Hermann Stuppner
- Institute of Pharmacy/Pharmacognosy, University of Innsbruck, Innsbruck, Austria
- Daniela Schuster
- Institute of Pharmacy/Pharmaceutical Chemistry, University of Innsbruck, Innsbruck, Austria
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University Salzburg, Salzburg, Austria
32
|
Munawar S, Windley MJ, Tse EG, Todd MH, Hill AP, Vandenberg JI, Jabeen I. Experimentally Validated Pharmacoinformatics Approach to Predict hERG Inhibition Potential of New Chemical Entities. Front Pharmacol 2018; 9:1035. [PMID: 30333745 PMCID: PMC6176658 DOI: 10.3389/fphar.2018.01035] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 06/05/2018] [Accepted: 08/27/2018] [Indexed: 12/17/2022]
Abstract
The hERG (human ether-a-go-go-related gene) encoded potassium ion (K+) channel plays a major role in cardiac repolarization. Drug-induced blockade of hERG has been a major cause of potentially lethal ventricular tachycardia termed Torsades de Pointes (TdPs). Therefore, we presented a pharmacoinformatics strategy using combined ligand and structure based models for the prediction of hERG inhibition potential (IC50) of new chemical entities (NCEs) during early stages of drug design and development. Integrated GRid-INdependent Descriptor (GRIND) models, and lipophilic efficiency (LipE), ligand efficiency (LE) guided template selection for the structure based pharmacophore models have been used for virtual screening and subsequent hERG activity (pIC50) prediction of identified hits. Finally, two selected hits were experimentally evaluated for hERG inhibition potential (pIC50) using whole cell patch clamp assay. Overall, our results demonstrate a difference of less than ±1.6 log units between experimentally determined and predicted hERG inhibition potential (IC50) of the selected hits. This revealed the predictive ability and robustness of our models, which could help to correctly rank the potency order (lower μM to higher nM range) against hERG.
Affiliation(s)
- Saba Munawar
- Research Center for Modeling and Simulation, National University of Science and Technology, Islamabad, Pakistan
- Victor Chang Cardiac Research Institute, Sydney, NSW, Australia
- Edwin G Tse
- School of Chemistry, The University of Sydney, Sydney, NSW, Australia
- Matthew H Todd
- School of Chemistry, The University of Sydney, Sydney, NSW, Australia
- Adam P Hill
- Victor Chang Cardiac Research Institute, Sydney, NSW, Australia
- Ishrat Jabeen
- Research Center for Modeling and Simulation, National University of Science and Technology, Islamabad, Pakistan