1
Li X, Wang C, Chai X, Liu X, Qiao K, Fu Y, Jin Y, Jia Q, Zhu F, Zhang Y. Discovery of Potent Selective HDAC6 Inhibitors with 5-Phenyl-1H-indole Fragment: Virtual Screening, Rational Design, and Biological Evaluation. J Chem Inf Model 2024. [PMID: 39042494 DOI: 10.1021/acs.jcim.4c01052] [Indexed: 07/25/2024]
Abstract
Within the HDAC family, histone deacetylase 6 (HDAC6) has attracted extensive attention because of its unique structure and biological functions. Numerous studies have shown that, compared with broad-spectrum HDAC inhibitors, selective HDAC6 inhibitors are effective in tumor treatment with few toxic and side effects, suggesting promising clinical prospects. Herein, we carried out rational drug design by integrating a deep learning model, molecular docking, and molecular dynamics simulation into a virtual screening workflow. The designed derivatives, with a 5-phenyl-1H-indole fragment as the cap group, showed desirable cytotoxicity against various tumor cell lines, with all IC50 values below 15 μM (ranging from 0.35 to 14.87 μM). Among them, compound 5i had the best antiproliferative activity against HL-60 (IC50 = 0.35 ± 0.07 μM) and arrested HL-60 cells in the G0/G1 phase. In addition, 5i exhibited better isoform selectivity owing to its high potency against HDAC6 (IC50 = 5.16 ± 0.25 nM) and reduced inhibitory activity against HDAC1 (selectivity index ≈ 124), which was further verified by immunoblotting. Moreover, the representative binding conformation of 5i on HDAC6 was revealed, and the key residues contributing to the binding of 5i were identified via free-energy decomposition analysis. The discovery of lead compound 5i also indicates that virtual screening remains a useful tool in drug discovery, can provide more molecular scaffolds with research potential for drug design, and merits wider application.
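As a rough illustration of the selectivity figure quoted in this abstract, the sketch below assumes the common definition of a selectivity index as the ratio of the off-target IC50 to the on-target IC50. The abstract does not report the HDAC1 IC50, so the value computed here is only what the stated index and HDAC6 IC50 would imply under that assumption. The function names are hypothetical.

```python
def selectivity_index(ic50_off_target_nm: float, ic50_target_nm: float) -> float:
    """Ratio of off-target to on-target IC50 (assumed definition, not from the paper)."""
    if ic50_off_target_nm <= 0 or ic50_target_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target_nm / ic50_target_nm


def implied_off_target_ic50(ic50_target_nm: float, index: float) -> float:
    """Off-target IC50 implied by an on-target IC50 and a selectivity index."""
    if ic50_target_nm <= 0 or index <= 0:
        raise ValueError("inputs must be positive")
    return ic50_target_nm * index


hdac6_ic50_nm = 5.16   # HDAC6 IC50 stated in the abstract
reported_index = 124   # approximate selectivity index stated in the abstract

# Back-calculated HDAC1 IC50 under the assumed ratio definition.
implied_hdac1_ic50_nm = implied_off_target_ic50(hdac6_ic50_nm, reported_index)
print(f"Implied HDAC1 IC50: about {implied_hdac1_ic50_nm:.0f} nM")
```

Under that assumption the implied HDAC1 IC50 is on the order of 640 nM, but the measured value should be taken from the paper itself.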
Affiliation(s)
- Xuedong Li
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Chengzhao Wang
  - College of Basic Medicine, Hebei Medical University, Shijiazhuang 050017, PR China
- Xu Chai
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Xingang Liu
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Kening Qiao
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Yan Fu
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Yanzhao Jin
  - Shijiazhuang Xianyu Digital Biotechnology Co., Ltd, Shijiazhuang 050024, PR China
- Qingzhong Jia
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
- Feng Zhu
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
  - College of Pharmaceutical Sciences, National Key Laboratory of Advanced Drug Delivery and Release Systems, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, PR China
- Yang Zhang
  - School of Pharmacy, Hebei Medical University, Shijiazhuang 050017, PR China
2
Brocidiacono M, Francoeur P, Aggarwal R, Popov KI, Koes DR, Tropsha A. BigBind: Learning from Nonstructural Data for Structure-Based Virtual Screening. J Chem Inf Model 2024; 64:2488-2495. [PMID: 38113513 DOI: 10.1021/acs.jcim.3c01211] [Indexed: 12/21/2023]
Abstract
Deep learning methods that predict protein-ligand binding have recently been used for structure-based virtual screening. Many such models have been trained using protein-ligand complexes with known crystal structures and activities from the PDBbind data set. However, because PDBbind only includes 20K complexes, models typically fail to generalize to new targets, and model performance is on par with models trained with only ligand information. Conversely, the ChEMBL database contains a wealth of chemical activity information but includes no information about binding poses. We introduce BigBind, a data set that maps ChEMBL activity data to proteins from the CrossDocked data set. BigBind comprises 583K ligand activities and includes 3D structures of the protein binding pockets. Additionally, we augmented the data by adding an equal number of putative inactives for each target. Using these data, we developed Banana (basic neural network for binding affinity), a neural network-based model to classify active from inactive compounds, defined by a 10 μM cutoff. Our model achieved an AUC of 0.72 on BigBind's test set, while a ligand-only model achieved an AUC of 0.59. Furthermore, Banana achieved competitive performance on the LIT-PCBA benchmark (median EF1% 1.81) while running 16,000 times faster than molecular docking with Gnina. We suggest that Banana, as well as other models trained on this data set, will significantly improve the outcomes of prospective virtual screening tasks.
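The EF1% figure quoted in this abstract is an enrichment factor. A minimal sketch of how such a metric is commonly computed is given below, assuming the usual definition (hit rate among the top-scored fraction of a library divided by the hit rate of the whole library). The function name, the rounding of the top-fraction size, and the toy data are illustrative assumptions, not details taken from the paper or from the LIT-PCBA protocol.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction.

    scores: predicted scores, higher means more likely active.
    labels: 1 for active, 0 for inactive, aligned with scores.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    n_total = len(scores)
    actives_total = sum(labels)
    if actives_total == 0:
        raise ValueError("at least one active compound is required")

    # Size of the top-scored slice; at least one compound (assumed convention).
    n_top = max(1, int(round(n_total * fraction)))

    # Rank compounds from highest to lowest score and count actives at the top.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_top = sum(label for _, label in ranked[:n_top])

    return (actives_top / n_top) / (actives_total / n_total)


# Toy library: 200 compounds scored 0..199, with 4 actives.
toy_scores = list(range(200))
toy_labels = [0] * 200
for active_index in (10, 20, 50, 199):
    toy_labels[active_index] = 1

print(enrichment_factor(toy_scores, toy_labels, fraction=0.01))
```

An enrichment factor of 1 corresponds to random ranking, so values above 1 indicate that actives are concentrated near the top of the ranked list.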
Affiliation(s)
- Michael Brocidiacono
  - Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Paul Francoeur
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Rishal Aggarwal
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Konstantin I Popov
  - Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- David Ryan Koes
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Alexander Tropsha
  - Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
3
Zeng X, Li SJ, Lv SQ, Wen ML, Li Y. A comprehensive review of the recent advances on predicting drug-target affinity based on deep learning. Front Pharmacol 2024; 15:1375522. [PMID: 38628639 PMCID: PMC11019008 DOI: 10.3389/fphar.2024.1375522] [Received: 01/24/2024] [Accepted: 03/21/2024] [Indexed: 04/19/2024]
Abstract
Accurate calculation of drug-target affinity (DTA) is crucial for various applications in the pharmaceutical industry, including drug screening, design, and repurposing. However, traditional machine learning methods for calculating DTA often lack accuracy, which poses a significant challenge. Fortunately, deep learning has emerged as a promising approach in computational biology, leading to the development of various deep learning-based methods for DTA prediction. To support researchers in developing novel, high-precision methods, we provide a comprehensive review of recent advances in predicting DTA using deep learning. We first conducted a statistical analysis of commonly used public datasets, providing essential information and describing the fields in which these datasets are used. We further explored the common representations of sequences and structures of drugs and targets. These analyses served as the foundation for constructing DTA prediction methods based on deep learning. Next, we focused on explaining how deep learning models, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Transformers, and Graph Neural Networks (GNNs), were employed in specific DTA prediction methods. We highlighted the unique advantages and applications of these models in the context of DTA prediction. Finally, we conducted a performance analysis of multiple state-of-the-art methods for predicting DTA based on deep learning. This review aims to help researchers understand the shortcomings and advantages of existing methods and to further develop high-precision DTA prediction tools that promote drug discovery.
Affiliation(s)
- Xin Zeng
  - College of Mathematics and Computer Science, Dali University, Dali, China
- Shu-Juan Li
  - Yunnan Institute of Endemic Diseases Control and Prevention, Dali, China
- Shuang-Qing Lv
  - Institute of Surveying and Information Engineering, West Yunnan University of Applied Science, Dali, China
- Meng-Liang Wen
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China
- Yi Li
  - College of Mathematics and Computer Science, Dali University, Dali, China
4
Shiota K, Akutsu T. Multi-shelled ECIF: improved extended connectivity interaction features for accurate binding affinity prediction. Bioinform Adv 2023; 3:vbad155. [PMID: 37928345 PMCID: PMC10625475 DOI: 10.1093/bioadv/vbad155] [Received: 08/01/2023] [Revised: 09/20/2023] [Accepted: 10/19/2023] [Indexed: 11/07/2023]
Abstract
Motivation: Extended connectivity interaction features (ECIF) is a method developed to predict protein-ligand binding affinity, allowing for detailed atomic representation. It performed very well in terms of Comparative Assessment of Scoring Functions 2016 (CASF-2016) scoring power. However, ECIF has the limitation of not being able to adequately account for interatomic distances. Results: To investigate what kind of distance representation is effective for protein-ligand binding affinity prediction, we have developed two algorithms that improve ECIF's feature extraction method to take distance into account. One is multi-shelled ECIF, which accounts for interatomic distance by dividing it into multiple layers. The other is weighted ECIF, which weights the importance of interactions according to the distance between atoms. A comparison of these two methods shows that multi-shelled ECIF outperforms weighted ECIF and the original ECIF, achieving a CASF-2016 scoring power Pearson correlation coefficient of 0.877. Availability and implementation: All code and data are available on GitHub (https://github.com/koji11235/MSECIFv2).
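The idea of dividing interatomic distance into layers can be sketched as follows. This is a simplified illustration, not the authors' implementation: real ECIF atom types encode much more than the element symbol, and the shell boundaries used here (4, 6, and 8 Å) are assumed values, since the abstract does not state them. The function name and input layout are hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from math import dist


def shelled_pair_counts(protein_atoms, ligand_atoms, shell_edges=(4.0, 6.0, 8.0)):
    """Count protein-ligand atom-type pairs separately for each distance shell.

    protein_atoms, ligand_atoms: iterables of (atom_type, (x, y, z)) in angstroms.
    shell_edges: increasing upper bounds of the shells (assumed values).
    Returns a Counter keyed by (protein_type, ligand_type, shell_index).
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            d = dist(p_xyz, l_xyz)
            if d > shell_edges[-1]:
                continue  # beyond the outermost shell: pair is ignored
            # Index of the first shell whose upper bound is >= d.
            shell = bisect_left(shell_edges, d)
            counts[(p_type, l_type, shell)] += 1
    return counts


# Toy complex with element symbols standing in for richer atom types.
toy_protein = [("C", (0.0, 0.0, 0.0)), ("N", (0.0, 0.0, 10.0))]
toy_ligand = [("O", (3.0, 0.0, 0.0)), ("C", (0.0, 5.0, 0.0))]
toy_counts = shelled_pair_counts(toy_protein, toy_ligand)
print(dict(toy_counts))
```

With a single shell this reduces to a plain cutoff-based pair count, which is why adding shells gives a downstream regressor some distance information that a single count cannot carry.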
Affiliation(s)
- Koji Shiota
  - Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto, Kyoto 606-8501, Japan
- Tatsuya Akutsu
  - Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto, Kyoto 606-8501, Japan
5
Ong WJG, Kirubakaran P, Karanicolas J. Poor Generalization by Current Deep Learning Models for Predicting Binding Affinities of Kinase Inhibitors. bioRxiv 2023:2023.09.04.556234. [PMID: 37732243 PMCID: PMC10508770 DOI: 10.1101/2023.09.04.556234] [Indexed: 09/22/2023]
Abstract
The extreme surge of interest over the past decade surrounding the use of neural networks has inspired many groups to deploy them for predicting binding affinities of drug-like molecules to their receptors. A model that can accurately make such predictions has the potential to screen large chemical libraries and help streamline the drug discovery process. However, despite reports of models that accurately predict quantitative inhibition using protein kinase sequences and inhibitors' SMILES strings, it is still unclear whether these models can generalize to previously unseen data. Here, we build a Convolutional Neural Network (CNN) analogous to those previously reported and evaluate the model over four datasets commonly used for inhibitor/kinase predictions. We find that the model performs comparably to those previously reported, provided that the individual data points are randomly split between the training set and the test set. However, model performance deteriorates dramatically when all data for a given inhibitor are placed together in the same training/testing fold, implying that information leakage underlies the models' performance. Through comparison to simple models in which the SMILES strings are tokenized, or in which test set predictions are simply copied from the closest training set data points, we demonstrate that there is essentially no generalization whatsoever in this model. In other words, the model has not learned anything about molecular interactions, and does not provide any benefit over much simpler and more transparent models. These observations strongly point to the need for richer structure-based encodings, to obtain useful prospective predictions of not-yet-synthesized candidate inhibitors.
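The difference between a random split and a split that keeps all data for one inhibitor in the same fold can be sketched as follows. This is an illustrative sketch only: the record layout (a dict with `inhibitor`, `kinase`, and `affinity` keys), the function name, and the 20% test fraction are assumptions and do not come from the preprint.

```python
import random


def split_by_inhibitor(records, test_fraction=0.2, seed=0):
    """Split records so that each inhibitor appears in only one of train/test.

    records: list of dicts with an "inhibitor" key (hypothetical layout).
    """
    # Shuffle the unique inhibitors, not the individual measurements.
    inhibitors = sorted({record["inhibitor"] for record in records})
    rng = random.Random(seed)
    rng.shuffle(inhibitors)

    n_test = max(1, round(len(inhibitors) * test_fraction))
    test_inhibitors = set(inhibitors[:n_test])

    train = [r for r in records if r["inhibitor"] not in test_inhibitors]
    test = [r for r in records if r["inhibitor"] in test_inhibitors]
    return train, test


# Toy data: 5 inhibitors, each measured against 3 kinases (placeholder names).
toy_records = [
    {"inhibitor": f"cmpd_{i}", "kinase": f"kinase_{k}", "affinity": float(i + k)}
    for i in range(5)
    for k in range(3)
]
toy_train, toy_test = split_by_inhibitor(toy_records)

train_ids = {r["inhibitor"] for r in toy_train}
test_ids = {r["inhibitor"] for r in toy_test}
print(len(toy_train), len(toy_test), train_ids & test_ids)
```

Because no inhibitor is shared between the two sets, a model evaluated this way cannot score well merely by recalling other measurements of the same compound. scikit-learn's `GroupShuffleSplit` offers comparable grouping behavior for array-based pipelines.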
Affiliation(s)
- Wern Juin Gabriel Ong
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
  - Bowdoin College, Brunswick, ME 04011
- Palani Kirubakaran
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
- John Karanicolas
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111