1
Wang L, Wang S, Yang H, Li S, Wang X, Zhou Y, Tian S, Liu L, Bai F. Conformational Space Profiling Enhances Generic Molecular Representation for AI-Powered Ligand-Based Drug Discovery. Advanced Science (Weinheim, Baden-Württemberg, Germany) 2024:e2403998. [PMID: 39206753 DOI: 10.1002/advs.202403998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 06/25/2024] [Indexed: 09/04/2024]
Abstract
The molecular representation model is a neural network that converts molecular representations (SMILES, Graph) into feature vectors, and is an essential module applied across a wide range of artificial intelligence-driven drug discovery scenarios. However, current molecular representation models rarely consider the three-dimensional conformational space of molecules, losing sight of the dynamic nature of small molecules as well as the essence of molecular conformational space that covers the heterogeneity of molecule properties, such as the multi-target mechanism of action, recognition of different biomolecules, dynamics in cytoplasm and membrane. In this study, a new model named GeminiMol is proposed to incorporate conformational space profiles into molecular representation learning, which extracts the feature of capturing the complicated interplay between the molecular structure and the conformational space. Although GeminiMol is pre-trained on a relatively small-scale molecular dataset (39290 molecules), it shows balanced and superior performance not only on 67 molecular properties predictions but also on 73 cellular activity predictions and 171 zero-shot tasks (including virtual screening and target identification). By capturing the molecular conformational space profile, the strategy paves the way for rapid exploration of chemical space and facilitates changing paradigms for drug design.
Affiliation(s)
- Lin Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Shihang Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Hao Yang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Shiwei Li
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Xinyu Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Yongqi Zhou
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Siyuan Tian
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Lu Liu
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, Shanghai Tech University, Shanghai, 201210, China
- Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, Information Science and Technology, Shanghai Tech University, Shanghai Clinical Research and Trial Center, Shanghai, 201210, China
2
Zeng X, Zhong KY, Meng PY, Li SJ, Lv SQ, Wen ML, Li Y. MvGraphDTA: multi-view-based graph deep model for drug-target affinity prediction by introducing the graphs and line graphs. BMC Biol 2024; 22:182. [PMID: 39183297 PMCID: PMC11346193 DOI: 10.1186/s12915-024-01981-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Accepted: 08/13/2024] [Indexed: 08/27/2024] Open
Abstract
BACKGROUND: Accurately identifying drug-target affinity (DTA) plays a pivotal role in drug screening, design, and repurposing in the pharmaceutical industry. It not only reduces the time, labor, and economic costs associated with biological experiments but also expedites the drug development process. However, achieving the desired level of computational accuracy for DTA identification methods remains a significant challenge. RESULTS: We proposed a novel multi-view-based graph deep model known as MvGraphDTA for DTA prediction. MvGraphDTA employed a graph convolutional network (GCN) to extract the structural features from original graphs of drugs and targets, respectively. It went a step further by constructing line graphs with edges as vertices based on original graphs of drugs and targets. GCN was also used to extract the relationship features within their line graphs. To enhance the complementarity between the extracted features from original graphs and line graphs, MvGraphDTA fused the extracted multi-view features of drugs and targets, respectively. Finally, these fused features were concatenated and passed through a fully connected (FC) network to predict DTA. CONCLUSIONS: During the experiments, we performed data augmentation on all the training sets used. Experimental results showed that MvGraphDTA outperformed the competitive state-of-the-art methods on benchmark datasets for DTA prediction. Additionally, we evaluated the universality and generalization performance of MvGraphDTA on additional datasets. Experimental outcomes revealed that MvGraphDTA exhibited good universality and generalization capability, making it a reliable tool for drug-target interaction prediction.
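The line-graph construction mentioned in this abstract (edges of the original graph become vertices) can be illustrated with a minimal sketch. The function name `line_graph` and the toy four-atom chain are assumptions made for illustration and are not taken from the MvGraphDTA code; here two line-graph vertices are treated as adjacent when the underlying bonds share an atom.

```python
from itertools import combinations


def line_graph(edges):
    """Build the line graph of an undirected graph.

    Each original edge becomes a vertex; two such vertices are adjacent
    when the original edges share an endpoint.
    """
    # Normalise each undirected edge so (a, b) and (b, a) map to one vertex.
    nodes = sorted({tuple(sorted(e)) for e in edges})
    adjacency = {n: set() for n in nodes}
    for e1, e2 in combinations(nodes, 2):
        if set(e1) & set(e2):  # shared atom -> adjacent in the line graph
            adjacency[e1].add(e2)
            adjacency[e2].add(e1)
    return adjacency


# Hypothetical toy input: a chain of four atoms 0-1-2-3 with three bonds.
chain = [(0, 1), (1, 2), (2, 3)]
lg = line_graph(chain)
for bond, neighbours in lg.items():
    print(bond, "->", sorted(neighbours))
```

In a model of this kind, a graph network could then be run on both the original adjacency and the line-graph adjacency; how the two feature sets are fused is specific to the paper and is not reproduced here.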
Affiliation(s)
- Xin Zeng
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Kai-Yang Zhong
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Pei-Yan Meng
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
- Shu-Juan Li
- Yunnan Institute of Endemic Diseases Control & Prevention, Dali, 671000, China
- Shuang-Qing Lv
- Institute of Surveying and Information Engineering, West Yunnan University of Applied Science, Dali, 671000, China
- Meng-Liang Wen
- State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, 650000, China
- Yi Li
- College of Mathematics and Computer Science, Dali University, Dali, 671003, China
3
Chen X, Huang J, Shen T, Zhang H, Xu L, Yang M, Xie X, Yan Y, Yan J. DEAttentionDTA: protein-ligand binding affinity prediction based on dynamic embedding and self-attention. Bioinformatics 2024; 40:btae319. [PMID: 38897656 PMCID: PMC11193059 DOI: 10.1093/bioinformatics/btae319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 03/23/2024] [Accepted: 06/17/2024] [Indexed: 06/21/2024] Open
Abstract
MOTIVATION: Predicting protein-ligand binding affinity is crucial in new drug discovery and development. However, most existing models rely on acquiring 3D structures of elusive proteins. Combining amino acid sequences with ligand sequences and better highlighting active sites are also significant challenges. RESULTS: We propose an innovative neural network model called DEAttentionDTA, based on dynamic word embeddings and a self-attention mechanism, for predicting protein-ligand binding affinity. DEAttentionDTA takes the 1D sequence information of proteins as input, including the global sequence features of amino acids, local features of the active pocket site, and linear representation information of the ligand molecule in the SMILES format. These three linear sequences are fed into a dynamic word-embedding layer based on a 1D convolutional neural network for embedding encoding and are correlated through a self-attention mechanism. The output affinity prediction values are generated using a linear layer. We compared DEAttentionDTA with various mainstream tools and achieved significantly superior results on the same dataset. We then assessed the performance of this model in the p38 protein family. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/whatamazing1/DEAttentionDTA.
Affiliation(s)
- Xiying Chen
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Jinsha Huang
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Tianqiao Shen
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Houjin Zhang
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Li Xu
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Min Yang
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Xiaoman Xie
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Yunjun Yan
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Jinyong Yan
- Key Lab of Molecular Biophysics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
4
Zhou Y, Chen SJ. Advances in machine-learning approaches to RNA-targeted drug design. Artificial Intelligence Chemistry 2024; 2:100053. [PMID: 38434217 PMCID: PMC10904028 DOI: 10.1016/j.aichem.2024.100053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2024]
Abstract
RNA molecules play multifaceted functional and regulatory roles within cells and have garnered significant attention in recent years as promising therapeutic targets. With remarkable successes achieved by artificial intelligence (AI) in different fields such as computer vision and natural language processing, there is a growing imperative to harness AI's potential in computer-aided drug design (CADD) to discover novel drug compounds that target RNA. Although machine-learning (ML) approaches have been widely adopted in the discovery of small molecules targeting proteins, the application of ML approaches to model interactions between RNA and small molecule is still in its infancy. Compared to protein-targeted drug discovery, the major challenges in ML-based RNA-targeted drug discovery stem from the scarcity of available data resources. With the growing interest and the development of curated databases focusing on interactions between RNA and small molecule, the field anticipates a rapid growth and the opening of a new avenue for disease treatment. In this review, we aim to provide an overview of recent advancements in computationally modeling RNA-small molecule interactions within the context of RNA-targeted drug discovery, with a particular emphasis on methodologies employing ML techniques.
Affiliation(s)
- Yuanzhe Zhou
- Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
5
Zhou G, Qin Y, Hong Q, Li H, Chen H, Shen J. GEMF: a novel geometry-enhanced mid-fusion network for PLA prediction. Brief Bioinform 2024; 25:bbae333. [PMID: 38980371 PMCID: PMC11232467 DOI: 10.1093/bib/bbae333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 06/04/2024] [Accepted: 06/26/2024] [Indexed: 07/10/2024] Open
Abstract
Accurate prediction of protein-ligand binding affinity (PLA) is important for drug discovery. Recent advances in applying graph neural networks have shown great potential for PLA prediction. However, existing methods usually neglect the geometric information (i.e. bond angles), leading to difficulties in accurately distinguishing different molecular structures. In addition, these methods also pose limitations in representing the binding process of protein-ligand complexes. To address these issues, we propose a novel geometry-enhanced mid-fusion network, named GEMF, to learn comprehensive molecular geometry and interaction patterns. Specifically, the GEMF consists of a graph embedding layer, a message passing phase, and a multi-scale fusion module. GEMF can effectively represent protein-ligand complexes as graphs, with graph embeddings based on physicochemical and geometric properties. Moreover, our dual-stream message passing framework models both covalent and non-covalent interactions. In particular, the edge-update mechanism, which is based on line graphs, can fuse both distance and angle information in the covalent branch. In addition, the communication branch consisting of multiple heterogeneous interaction modules is developed to learn intricate interaction patterns. Finally, we fuse the multi-scale features from the covalent, non-covalent, and heterogeneous interaction branches. The extensive experimental results on several benchmarks demonstrate the superiority of GEMF compared with other state-of-the-art methods.
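The distance and angle information that this abstract refers to can be made concrete with a small sketch: a bond length is the distance between two atoms, and a bond angle is defined for two bonds that share a central atom (the pair that a line-graph edge would connect). The helper names `bond_length` and `bond_angle` and the toy coordinates are assumptions for illustration only and are not taken from the GEMF implementation.

```python
import math


def bond_length(p, q):
    """Euclidean distance between two atom positions (3D tuples)."""
    return math.dist(p, q)


def bond_angle(a, center, b):
    """Angle in degrees between bonds center-a and center-b."""
    v1 = [x - c for x, c in zip(a, center)]
    v2 = [x - c for x, c in zip(b, center)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates: two unit-length bonds at a right angle.
center = (0.0, 0.0, 0.0)
atom_a = (1.0, 0.0, 0.0)
atom_b = (0.0, 1.0, 0.0)

length_a = bond_length(center, atom_a)
angle_ab = bond_angle(atom_a, center, atom_b)
print(length_a, angle_ab)
```

Features like these could be attached to graph edges (lengths) and line-graph edges (angles); the actual featurisation used by the authors is not specified in the abstract.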
Affiliation(s)
- Guoqiang Zhou
- School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China
- Yuke Qin
- School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China
- Qiansen Hong
- School of Computer Science, Nanjing University of Posts and Telecommunications, No.9 Wenyuan Road, Jiangsu 210023, China
- Haoran Li
- School of Computing and Information Technology, University of Wollongong, Northfields Avenue, NSW 2522, Australia
- Huaming Chen
- School of Electrical and Computer Engineering, University of Sydney, Camperdown, NSW 2050, Australia
- Jun Shen
- School of Computing and Information Technology, University of Wollongong, Northfields Avenue, NSW 2522, Australia
6
Qu X, Dong L, Luo D, Si Y, Wang B. Water Network-Augmented Two-State Model for Protein-Ligand Binding Affinity Prediction. J Chem Inf Model 2024; 64:2263-2274. [PMID: 37433009 DOI: 10.1021/acs.jcim.3c00567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2023]
Abstract
Water network rearrangement from the ligand-unbound state to the ligand-bound state is known to have significant effects on the protein-ligand binding interactions, but most of the current machine learning-based scoring functions overlook these effects. In this study, we endeavor to construct a comprehensive and realistic deep learning model by incorporating water network information into both ligand-unbound and -bound states. In particular, extended connectivity interaction features were integrated into graph representation, and graph transformer operator was employed to extract features of the ligand-unbound and -bound states. Through these efforts, we developed a water network-augmented two-state model called ECIFGraph::HM-Holo-Apo. Our new model exhibits satisfactory performance in terms of scoring, ranking, docking, screening, and reverse screening power tests on the CASF-2016 benchmark. In addition, it can achieve superior performance in large-scale docking-based virtual screening tests on the DEKOIS2.0 data set. Our study highlights that the use of a water network-augmented two-state model can be an effective strategy to bolster the robustness and applicability of machine learning-based scoring functions, particularly for targets with hydrophilic or solvent-exposed binding pockets.
Affiliation(s)
- Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yubing Si
- College of Chemistry, Zhengzhou University, Zhengzhou 450001, P. R. China
- Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
7
Zha J, Su J, Li T, Cao C, Ma Y, Wei H, Huang Z, Qian L, Wen K, Zhang J. Encoding Molecular Docking for Quantum Computers. J Chem Theory Comput 2023; 19:9018-9024. [PMID: 38090816 DOI: 10.1021/acs.jctc.3c00943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2023]
Abstract
Molecular docking is important in drug discovery but is burdensome for classical computers. Here, we introduce Grid Point Matching (GPM) and Feature Atom Matching (FAM) to accelerate pose sampling in molecular docking by encoding the problem into quadratic unconstrained binary optimization (QUBO) models so that it could be solved by quantum computers like the coherent Ising machine (CIM). As a result, GPM shows a sampling power close to that of Glide SP, a method performing an extensive search. Moreover, it is estimated to be 1000 times faster on the CIM than on classical computers. Our methods could boost virtual drug screening of small molecules and peptides in future.
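To show what encoding a matching problem as a QUBO can look like, the sketch below assigns ligand atoms to grid points with one binary variable per (atom, grid point) pair. It is not the authors' GPM or FAM formulation: the weight matrix, the penalty value, and the names `build_qubo`, `energy`, and `brute_force` are assumptions for illustration, and an exhaustive classical search stands in for a quantum or Ising-machine solver.

```python
from itertools import product


def build_qubo(weights, penalty):
    """Return QUBO coefficients {(i, j): value} with i <= j.

    weights[a][g] is a (hypothetical) reward for placing atom a on grid
    point g. Constraints are added as quadratic penalties:
      * each atom sits on exactly one grid point: P * (sum_g x_ag - 1)^2
      * no grid point hosts two atoms:            P * x_ag * x_bg
    The constant offset P * n_atoms is dropped; it does not change the argmin.
    """
    n_atoms, n_points = len(weights), len(weights[0])
    Q = {}

    def var(a, g):
        return a * n_points + g

    def add(i, j, value):
        key = (min(i, j), max(i, j))
        Q[key] = Q.get(key, 0.0) + value

    for a in range(n_atoms):
        for g in range(n_points):
            # Reward term plus the linear part of the one-hot penalty.
            add(var(a, g), var(a, g), -weights[a][g] - penalty)
        for g1 in range(n_points):
            for g2 in range(g1 + 1, n_points):
                # Cross term of the one-hot penalty for the same atom.
                add(var(a, g1), var(a, g2), 2.0 * penalty)
    for g in range(n_points):
        for a1 in range(n_atoms):
            for a2 in range(a1 + 1, n_atoms):
                # Two different atoms must not share a grid point.
                add(var(a1, g), var(a2, g), penalty)
    return Q


def energy(Q, x):
    """Evaluate x^T Q x for a binary vector x (note x_i * x_i == x_i)."""
    return sum(value * x[i] * x[j] for (i, j), value in Q.items())


def brute_force(Q, n_vars):
    """Exhaustive minimiser; feasible only for tiny toy problems."""
    return min(product((0, 1), repeat=n_vars), key=lambda x: energy(Q, x))


# Hypothetical toy instance: 2 ligand atoms, 2 grid points.
weights = [[3.0, 1.0], [2.0, 2.5]]
Q = build_qubo(weights, penalty=10.0)
best = brute_force(Q, n_vars=4)
print(best, energy(Q, best))
```

The penalty has to be large relative to the rewards so that constraint-violating assignments never score better than valid ones; the value used here is an arbitrary choice for the toy instance.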
Affiliation(s)
- Jinyin Zha
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
- Jiaqi Su
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Tiange Li
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Chongyu Cao
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Yin Ma
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Hai Wei
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Zhiguo Huang
- China Mobile (Suzhou) Software Technology Company Limited, Suzhou 215163, China
- Ling Qian
- China Mobile (Suzhou) Software Technology Company Limited, Suzhou 215163, China
- Kai Wen
- Beijing QBoson Quantum Technology Co., Ltd., Beijing 100015, China
- Jian Zhang
- Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
8
Cai L, Han F, Ji B, He X, Wang L, Niu T, Zhai J, Wang J. In Silico Screening of Natural Flavonoids against 3-Chymotrypsin-like Protease of SARS-CoV-2 Using Machine Learning and Molecular Modeling. Molecules 2023; 28:8034. [PMID: 38138524 PMCID: PMC10745665 DOI: 10.3390/molecules28248034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 11/30/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023] Open
Abstract
The "Long-COVID syndrome" has posed significant challenges due to a lack of validated therapeutic options. We developed a novel multi-step virtual screening strategy to reliably identify inhibitors against 3-chymotrypsin-like protease of SARS-CoV-2 from abundant flavonoids, which represents a promising source of antiviral and immune-boosting nutrients. We identified 57 interacting residues as contributors to the protein-ligand binding pocket. Their energy interaction profiles constituted the input features for Machine Learning (ML) models. The consensus of 25 classifiers trained using various ML algorithms attained 93.9% accuracy and a 6.4% false-positive-rate. The consensus of 10 regression models for binding energy prediction also achieved a low root-mean-square error of 1.18 kcal/mol. We screened out 120 flavonoid hits first and retained 50 drug-like hits after predefined ADMET filtering to ensure bioavailability and safety profiles. Furthermore, molecular dynamics simulations prioritized nine bioactive flavonoids as promising anti-SARS-CoV-2 agents exhibiting both high structural stability (root-mean-square deviation < 5 Å for 218 ns) and low MM/PBSA binding free energy (<-6 kcal/mol). Among them, KB-2 (PubChem-CID, 14630497) and 9-O-Methylglyceofuran (PubChem-CID, 44257401) displayed excellent binding affinity and desirable pharmacokinetic capabilities. These compounds have great potential to serve as oral nutraceuticals with therapeutic and prophylactic properties as care strategies for patients with long-COVID syndrome.
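The consensus metrics quoted in this abstract (accuracy, false-positive rate, and root-mean-square error) can be computed as in the sketch below. The majority-vote rule for classifiers and the mean rule for regressors are assumptions, since the abstract does not say how the consensus was formed, and all labels and energies are made-up toy values rather than data from the study.

```python
import math


def majority_vote(votes_per_model):
    """Combine binary predictions (one list per model) by strict majority."""
    n_models = len(votes_per_model)
    return [1 if 2 * sum(col) > n_models else 0 for col in zip(*votes_per_model)]


def mean_prediction(preds_per_model):
    """Combine regression outputs (one list per model) by averaging."""
    return [sum(col) / len(col) for col in zip(*preds_per_model)]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN), computed over the true negatives only."""
    negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(negatives) / len(negatives)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical classification toy data: 3 models, 5 compounds.
labels = [1, 0, 1, 0, 0]
votes = [
    [1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0],
]
consensus = majority_vote(votes)
acc = accuracy(labels, consensus)
fpr = false_positive_rate(labels, consensus)

# Hypothetical regression toy data: binding energies in kcal/mol.
true_energy = [-7.0, -6.0]
model_energy = [[-6.0, -6.5], [-7.0, -5.5]]
err = rmse(true_energy, mean_prediction(model_energy))
print(consensus, acc, fpr, err)
```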
Affiliation(s)
- Junmei Wang
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA; (L.C.); (F.H.); (B.J.); (X.H.); (L.W.); (T.N.); (J.Z.)
9
Libouban PY, Aci-Sèche S, Gómez-Tamayo JC, Tresadern G, Bonnet P. The Impact of Data on Structure-Based Binding Affinity Predictions Using Deep Neural Networks. Int J Mol Sci 2023; 24:16120. [PMID: 38003312 PMCID: PMC10671244 DOI: 10.3390/ijms242216120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023] Open
Abstract
Artificial intelligence (AI) has gained significant traction in the field of drug discovery, with deep learning (DL) algorithms playing a crucial role in predicting protein-ligand binding affinities. Despite advancements in neural network architectures, system representation, and training techniques, the performance of DL affinity prediction has reached a plateau, prompting the question of whether it is truly solved or if the current performance is overly optimistic and reliant on biased, easily predictable data. Like other DL-related problems, this issue seems to stem from the training and test sets used when building the models. In this work, we investigate the impact of several parameters related to the input data on the performance of neural network affinity prediction models. Notably, we identify the size of the binding pocket as a critical factor influencing the performance of our statistical models; furthermore, it is more important to train a model with as much data as possible than to restrict the training to only high-quality datasets. Finally, we also confirm the bias in the typically used current test sets. Therefore, several types of evaluation and benchmarking are required to understand models' decision-making processes and accurately compare the performance of models.
Affiliation(s)
- Pierre-Yves Libouban
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
- Samia Aci-Sèche
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
- Jose Carlos Gómez-Tamayo
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium; (J.C.G.-T.); (G.T.)
- Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium; (J.C.G.-T.); (G.T.)
- Pascal Bonnet
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
10
Poonia P, Sharma M, Jha P, Chopra M. Pharmacophore-based virtual screening of ZINC database, molecular modeling and designing new derivatives as potential HDAC6 inhibitors. Mol Divers 2023; 27:2053-2071. [PMID: 36214962 DOI: 10.1007/s11030-022-10540-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Accepted: 09/30/2022] [Indexed: 11/25/2022]
Abstract
To date, many HDAC6 inhibitors have been identified and developed but none is clinically approved as of now. Through this study, we aim to obtain novel HDAC6 selective inhibitors and provide new insights into the detailed structural design of potential HDAC6 inhibitors. A HypoGen-based 3D QSAR HDAC6 pharmacophore was built and used as a query model to screen approximately 8 million ZINC database compounds. First, the ZINC Database was filtered using ADMET, followed by pharmacophore-based library screening. Using fit value and estimated activity cutoffs, a final set of 54 ZINC hits was obtained that were further investigated using molecular docking with the crystal structure of human histone deacetylase 6 catalytic domain 2 in complex with Trichostatin A (PDB ID: 5EDU). Through detailed in silico screening of the ZINC database, we shortlisted three hits as the lead molecules for designing novel HDAC6 inhibitors with better efficacy. Docking with 5EDU, followed by ADMET and TOPKAT analysis of modified ZINC hits provided 9 novel potential HDAC6 inhibitors that possess better docking scores and 2D interactions as compared to the control ZINC hit molecules. Finally, a 50 ns MD analysis run followed by Protein-Ligand Interaction Energy (PLIE) analysis of the top scored hits provided a novel molecule N1 that showed promisingly similar results to that of Ricolinostat (a known HDAC6 inhibitor). The comparable result of the designed hits to established HDAC6 inhibitors suggests that these compounds might prove to be successful HDAC6 inhibitors in future. Designed novel hits that might act as good HDAC6 inhibitors derived from ZINC database using combined molecular docking and modeling approaches.
Affiliation(s)
- Priya Poonia
- Dr. B.R. Ambedkar Center for Biomedical Research, University of Delhi, Delhi, 110036, India
- Monika Sharma
- Dr. B.R. Ambedkar Center for Biomedical Research, University of Delhi, Delhi, 110036, India
- Prakash Jha
- Dr. B.R. Ambedkar Center for Biomedical Research, University of Delhi, Delhi, 110036, India
- Madhu Chopra
- Dr. B.R. Ambedkar Center for Biomedical Research, University of Delhi, Delhi, 110036, India
11
Hu G, Fang Y, Xu H, Wang G, Yang R, Gao F, Wei Q, Gu Y, Zhang C, Qiu J, Gao N, Wen Q, Qiao H. Identification of Cytochrome P450 2E1 as a Novel Target in Glioma and Development of Its Inhibitor as an Anti-Tumor Agent. Advanced Science (Weinheim, Baden-Württemberg, Germany) 2023; 10:e2301096. [PMID: 37283464 PMCID: PMC10427391 DOI: 10.1002/advs.202301096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 04/24/2023] [Indexed: 06/08/2023]
Abstract
Glioblastoma (GBM) is a devastating inflammation-related cancer for which novel therapeutic targets are urgently required. Previous studies of the authors indicate Cytochrome P450 2E1 (CYP2E1) as a novel inflammatory target and develop a specific inhibitor Q11. Here it is demonstrated that CYP2E1 overexpression is closely related to higher malignancy in GBM patients. CYP2E1 activity is positively correlated with tumor weight in GBM rats. Significantly higher CYP2E1 expression accompanied by increased inflammation is detected in a mouse GBM model. Q11, 1-(4-methyl-5-thiazolyl) ethanone, a newly developed specific inhibitor of CYP2E1, here remarkably attenuates tumor growth and prolongs survival in vivo. Q11 does not directly affect tumor cells but blocks the tumor-promoting effect of microglia/macrophage (M/Mφ) in the tumor microenvironment through PPARγ-mediated activation of the STAT-1 and NF-κB pathways and inhibition of the STAT-3 and STAT-6 pathways. The effectiveness and safety of targeting CYP2E1 in GBM are further supported by studies with Cyp2e1 knockout rodents. In conclusion, a pro-GBM mechanism in which CYP2E1-PPARγ-STAT-1/NF-κB/STAT-3/STAT-6 axis fueled tumorigenesis by reprogramming M/Mφ and Q11 as a promising anti-inflammatory agent for GBM treatment is uncovered.
Affiliation(s)
- Guiming Hu
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Department of Pathology, The Second Affiliated Hospital of Zhengzhou University, Jingba Road, Zhengzhou, 450014, China
- Yan Fang
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Department of Pathology, The Second Affiliated Hospital of Zhengzhou University, Jingba Road, Zhengzhou, 450014, China
- Haiwei Xu
- School of Pharmaceutical Sciences, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Guanzhe Wang
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Rui Yang
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Fei Gao
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Qingda Wei
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Yuhan Gu
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Cunzhen Zhang
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Jinhuan Qiu
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Na Gao
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Qiang Wen
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
- Hailing Qiao
- Institute of Clinical Pharmacology, Zhengzhou University, Kexue Road, Zhengzhou, 450001, China
12
Mohanty M, Mohanty PS. Molecular docking in organic, inorganic, and hybrid systems: a tutorial review. Monatshefte für Chemie 2023; 154:1-25. [PMID: 37361694 PMCID: PMC10243279 DOI: 10.1007/s00706-023-03076-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/27/2022] [Accepted: 05/08/2023] [Indexed: 06/28/2023]
Abstract
Molecular docking simulation is a popular and well-established computational approach that has been used extensively to understand molecular interactions between a natural organic molecule (ideally taken as a receptor), such as an enzyme, protein, DNA, or RNA, and a natural or synthetic organic/inorganic molecule (considered as a ligand). However, the application of docking to synthetic organic, inorganic, or hybrid systems as receptors remains very limited, despite the popularity of these systems in experimental work. In this context, molecular docking can be an efficient computational tool for understanding the role of intermolecular interactions in hybrid systems, which can help in designing materials on the mesoscale for different applications. The current review focuses on the implementation of the docking method in organic, inorganic, and hybrid systems, along with examples from different case studies. We describe the resources, including databases and tools, required for docking studies and applications. The concept of docking techniques, the types of docking models, and the role of the intermolecular interactions involved in the docking process in understanding binding mechanisms are explained. Finally, the challenges and limitations of docking are discussed.
Affiliation(s)
- Madhuchhanda Mohanty
- School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
| | - Priti S. Mohanty
- School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
- School of Chemical Technology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
|
13
|
Rayka M, Firouzi R. GB-score: Minimally designed machine learning scoring function based on distance-weighted interatomic contact features. Mol Inform 2023; 42:e2200135. [PMID: 36722733 DOI: 10.1002/minf.202200135] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/09/2022] [Revised: 11/24/2022] [Accepted: 11/28/2022] [Indexed: 02/02/2023]
Abstract
In recent years, thanks to advances in computer hardware and dataset availability, data-driven approaches (such as machine learning) have become an essential part of the drug design framework for accelerating drug discovery. Constructing new scoring functions based on machine and deep learning, that is, functions that predict the binding score of a protein-ligand pose generated during docking or of a crystal complex, has become an active research area in computer-aided drug design. GB-Score is a state-of-the-art machine learning-based scoring function that uses distance-weighted interatomic contact features, the PDBbind-v2019 general set, and the Gradient Boosting Trees algorithm for binding affinity prediction. The distance-weighted interatomic contact featurization method uses the distances between different ligand and protein atom types as a numerical representation of the protein-ligand complex. GB-Score attains a Pearson's correlation of 0.862 and an RMSE of 1.190 on the CASF-2016 benchmark in the scoring power metric. GB-Score's code is freely available at https://github.com/miladrayka/GB_Score.
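The two scoring-power figures quoted in this abstract, Pearson's correlation and RMSE, compare predicted binding affinities with experimental ones. The following is a minimal sketch of both metrics; the function names and the affinity values are illustrative assumptions and are not taken from GB-Score or CASF-2016.

```python
import math

def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)

def rmse(predicted, experimental):
    """Root-mean-square error between predictions and reference values."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)

# Made-up affinities for illustration only; not benchmark data.
experimental = [4.2, 6.1, 7.8, 5.5, 9.0]
predicted = [4.9, 5.8, 7.1, 6.0, 8.2]
print(f"Pearson R = {pearson_r(predicted, experimental):.3f}")
print(f"RMSE      = {rmse(predicted, experimental):.3f}")
```

A high correlation with a large RMSE is possible when predictions are systematically shifted, which is one reason benchmarks tend to report both numbers.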
Affiliation(s)
- Milad Rayka
- Department of Physical Chemistry, Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
| | - Rohoullah Firouzi
- Department of Physical Chemistry, Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
|
14
|
Wu Q, Huang SY. HCovDock: an efficient docking method for modeling covalent protein-ligand interactions. Brief Bioinform 2023; 24:6961470. [PMID: 36573474 DOI: 10.1093/bib/bbac559] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/15/2022] [Revised: 11/02/2022] [Accepted: 11/17/2022] [Indexed: 12/28/2022] Open
Abstract
Covalent inhibitors have received extensive attention in the past few decades because of their long residence time, high binding efficiency and strong selectivity. It is therefore valuable to develop computational tools such as molecular docking for modeling covalent protein-ligand interactions or screening potential covalent drugs. To meet this need, we have proposed HCovDock, an efficient docking algorithm for covalent protein-ligand interactions that integrates an incremental-construction ligand sampling method and a scoring function with covalent bond-based energy. Tested on a benchmark containing 207 diverse protein-ligand complexes, HCovDock performs significantly better than seven other state-of-the-art covalent docking programs (AutoDock, Cov_DOX, CovDock, FITTED, GOLD, ICM-Pro and MOE). With the criterion of ligand root-mean-squared distance < 2.0 Å, HCovDock obtains high success rates of 70.5% and 93.2% in reproducing experimentally observed structures for the top 1 and top 10 predictions, respectively. In addition, HCovDock is validated in virtual screening against 10 receptors of three proteins. HCovDock is computationally efficient: the average running time for docking a ligand is only 5 min, ranging from as little as 1 s for ligands with one rotatable bond to about 18 min for ligands with 23 rotatable bonds. HCovDock can be freely accessed at http://huanglab.phys.hust.edu.cn/hcovdock/.
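The top 1 and top 10 success rates quoted in this abstract follow a common docking convention: a complex counts as a success if any of the N best-ranked poses lies within the RMSD cutoff of the experimental structure. A minimal sketch of that bookkeeping is shown here; the function name and the RMSD values are hypothetical, and nothing in it reproduces HCovDock itself.

```python
def success_rate(ranked_rmsds, top_n, cutoff=2.0):
    """Fraction of complexes with at least one of the top_n ranked poses
    below the RMSD cutoff (in Å). Each inner list is ordered by docking rank."""
    hits = sum(
        1 for rmsds in ranked_rmsds
        if any(r < cutoff for r in rmsds[:top_n])
    )
    return hits / len(ranked_rmsds)

# Hypothetical per-complex ligand RMSDs (Å), best-scored pose first.
ranked_rmsds = [
    [1.2, 3.4, 5.0],
    [4.1, 1.8, 2.5],
    [6.3, 5.9, 7.2],
    [0.9, 1.1, 3.3],
]
print(success_rate(ranked_rmsds, top_n=1))  # 0.5
print(success_rate(ranked_rmsds, top_n=3))  # 0.75
```

The gap between the two numbers reflects scoring rather than sampling: a near-native pose was generated for the second complex but was not ranked first.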
Affiliation(s)
- Qilong Wu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
|
15
|
Blanes-Mira C, Fernández-Aguado P, de Andrés-López J, Fernández-Carvajal A, Ferrer-Montiel A, Fernández-Ballester G. Comprehensive Survey of Consensus Docking for High-Throughput Virtual Screening. Molecules 2022; 28:molecules28010175. [PMID: 36615367 PMCID: PMC9821981 DOI: 10.3390/molecules28010175] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/29/2022] [Revised: 12/19/2022] [Accepted: 12/21/2022] [Indexed: 12/28/2022] Open
Abstract
The rapid advances of 3D techniques for the structural determination of proteins and the development of numerous computational methods and strategies have led to identifying highly active compounds in computer drug design. Molecular docking is a method widely used in high-throughput virtual screening campaigns to filter potential ligands targeted to proteins. A great variety of docking programs are currently available, which differ in the algorithms and approaches used to predict the binding mode and the affinity of the ligand. All programs heavily rely on scoring functions to accurately predict ligand binding affinity, and despite differences in performance, none of these docking programs is preferable to the others. To overcome this problem, consensus scoring methods improve the outcome of virtual screening by averaging the rank or score of individual molecules obtained from different docking programs. The successful application of consensus docking in high-throughput virtual screening highlights the need to optimize the predictive power of molecular docking methods.
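Consensus scoring by rank averaging, as described in this abstract, can be sketched in a few lines. The program names, molecule identifiers and scores here are hypothetical, and the sketch assumes that a lower (more negative) score is better for every program and that there are no ties.

```python
def ranks(scores):
    """Map each molecule to its rank (1 = best), assuming lower score is better."""
    ordered = sorted(scores, key=scores.get)
    return {mol: i + 1 for i, mol in enumerate(ordered)}

def consensus_by_rank(per_program_scores):
    """Average each molecule's rank over all programs; return (molecule, mean rank)
    pairs sorted from best to worst."""
    per_program_ranks = [ranks(s) for s in per_program_scores.values()]
    molecules = list(per_program_ranks[0])
    mean_rank = {
        m: sum(r[m] for r in per_program_ranks) / len(per_program_ranks)
        for m in molecules
    }
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Hypothetical docking scores from three programs with different score scales.
per_program_scores = {
    "program_a": {"mol1": -9.1, "mol2": -7.4, "mol3": -8.0},
    "program_b": {"mol1": -6.5, "mol2": -7.9, "mol3": -7.0},
    "program_c": {"mol1": -10.2, "mol2": -8.8, "mol3": -9.5},
}
for mol, mean in consensus_by_rank(per_program_scores):
    print(f"{mol}: mean rank {mean:.2f}")
```

Averaging ranks rather than raw scores sidesteps the fact that different programs report scores on different scales; averaging scores directly would first require some normalization.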
|
16
|
Zhu H, Yang J, Huang N. Assessment of the Generalization Abilities of Machine-Learning Scoring Functions for Structure-Based Virtual Screening. J Chem Inf Model 2022; 62:5485-5502. [PMID: 36268980 DOI: 10.1021/acs.jcim.2c01149] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Abstract
In structure-based virtual screening (SBVS), it is critical that scoring functions capture protein-ligand atomic interactions. By focusing on the local domains of ligand binding pockets, a standardized pocket Pfam-based clustering (Pfam-cluster) approach was developed to assess the cross-target generalization ability of machine-learning scoring functions (MLSFs). Subsequently, 12 typical MLSFs were evaluated using random cross-validation (Random-CV), protein sequence similarity-based cross-validation (Seq-CV), and pocket Pfam-based cross-validation (Pfam-CV) methods. Surprisingly, all of the tested models showed decreased performances from Random-CV to Seq-CV to Pfam-CV experiments, not showing satisfactory generalization capacity. Our interpretable analysis suggested that the predictions on novel targets by MLSFs were dependent on buried solvent-accessible surface area (SASA)-related features of complex structures, with greater predicted binding affinities on complexes owning larger protein-ligand interfaces. By combining buried SASA-related features with target-specific patterns that were only shared among structurally similar compounds in the same cluster, the random forest (RF)-Score attained a good performance in the Random-CV test. Based on these findings, we strongly advise assessing the generalization ability of MLSFs with the Pfam-cluster approach and being cautious with the features learned by MLSFs.
Affiliation(s)
- Hui Zhu
- Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing 102206, China; National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
| | - Jincai Yang
- National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
| | - Niu Huang
- Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing 102206, China; National Institute of Biological Sciences, 7 Science Park Road, Zhongguancun Life Science Park, Beijing 102206, China
|
17
|
Yang C, Chen EA, Zhang Y. Protein-Ligand Docking in the Machine-Learning Era. Molecules 2022; 27:4568. [PMID: 35889440 PMCID: PMC9323102 DOI: 10.3390/molecules27144568] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 07/03/2022] [Accepted: 07/14/2022] [Indexed: 11/16/2022] Open
Abstract
Molecular docking plays a significant role in early-stage drug discovery, from structure-based virtual screening (VS) to hit-to-lead optimization, and its capability and predictive power are critically dependent on the protein-ligand scoring function. In this review, we give a broad overview of recent scoring function development, as well as docking-based applications in drug discovery. We outline the strategies and resources available for structure-based VS and discuss the assessment and development of classical and machine learning protein-ligand scoring functions. In particular, we highlight the recent progress of machine learning scoring functions, ranging from descriptor-based models to deep learning approaches. We also discuss the general workflow and docking protocols of structure-based VS, such as structure preparation, binding site detection, docking strategies, and post-docking filtering/re-scoring, as well as a case study on a large-scale docking-based VS test on the LIT-PCBA data set.
Affiliation(s)
- Chao Yang
- Department of Chemistry, New York University, New York, NY 10003, USA
| | - Eric Anthony Chen
- Department of Chemistry, New York University, New York, NY 10003, USA
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, NY 10003, USA
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
18
|
Jiang H, Wang J, Cong W, Huang Y, Ramezani M, Sarma A, Dokholyan NV, Mahdavi M, Kandemir MT. Predicting Protein-Ligand Docking Structure with Graph Neural Network. J Chem Inf Model 2022; 62:2923-2932. [PMID: 35699430 PMCID: PMC10279412 DOI: 10.1021/acs.jcim.2c00127] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 01/02/2023]
Abstract
Modern day drug discovery is extremely expensive and time consuming. Although computational approaches help accelerate and decrease the cost of drug discovery, existing computational software packages for docking-based drug discovery suffer from both low accuracy and high latency. A few recent machine learning-based approaches have been proposed for virtual screening by improving the ability to evaluate protein-ligand binding affinity, but such methods rely heavily on conventional docking software to sample docking poses, which results in excessive execution latencies. Here, we propose and evaluate a novel graph neural network (GNN)-based framework, MedusaGraph, which includes both pose-prediction (sampling) and pose-selection (scoring) models. Unlike the previous machine learning-centric studies, MedusaGraph generates the docking poses directly and achieves from 10 to 100 times speedup compared to state-of-the-art approaches, while having a slightly better docking accuracy.
Affiliation(s)
- Huaipan Jiang
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Jian Wang
- Departments of Pharmacology and Biochemistry and Molecular Biology, Pennsylvania State College of Medicine, Hershey, Pennsylvania 17033, United States
| | - Weilin Cong
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Yihe Huang
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Morteza Ramezani
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Anup Sarma
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Nikolay V Dokholyan
- Departments of Pharmacology and Biochemistry and Molecular Biology, Pennsylvania State College of Medicine, Hershey, Pennsylvania 17033, United States
- Departments of Chemistry and Biomedical Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Mehrdad Mahdavi
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
| | - Mahmut T Kandemir
- Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania 16802, United States
|
19
|
Saeed A, Ejaz SA, Sarfraz M, Tamam N, Siddique F, Riaz N, Qais FA, Chtita S, Iqbal J. Discovery of Phenylcarbamoylazinane-1,2,4-Triazole Amides Derivatives as the Potential Inhibitors of Aldo-Keto Reductases (AKR1B1 & AKRB10): Potential Lead Molecules for Treatment of Colon Cancer. Molecules 2022; 27:molecules27133981. [PMID: 35807227 PMCID: PMC9268700 DOI: 10.3390/molecules27133981] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/16/2022] [Revised: 05/19/2022] [Accepted: 05/23/2022] [Indexed: 12/12/2022] Open
Abstract
Both members of the aldo-keto reductase (AKR) family, AKR1B1 and AKR1B10, are over-expressed in various types of cancer, making them potential targets for inflammation-mediated cancers such as colon, lung, breast, and prostate cancers. This is the first comprehensive study focused on the identification of phenylcarbamoylazinane-1,2,4-triazole amides (7a−o) as inhibitors of aldo-keto reductases (AKR1B1, AKR1B10) via detailed computational analysis. First, the stability and reactivity of the compounds were determined using the Gaussian09 program, with density functional theory (DFT) calculations performed at the B3LYP/SVP level. Among all the derivatives, 7d, 7e, 7f, 7h, 7j, 7k, and 7m were found to be chemically reactive. The binding interactions of the optimized compounds within the active pocket of the selected targets were then studied using molecular docking software, AutoDock Tools and Molecular Operating Environment (MOE); during analysis, the AutoDock (academic software) results were found to be reproducible, suggesting that this software is preferable to MOE (commercial software). The results correlated with the DFT results, suggesting 7d as the best inhibitor of AKR1B1 with an energy value of −49.40 kJ/mol and 7f as the best inhibitor of AKR1B10 with an energy value of −52.84 kJ/mol. The other potent compounds also showed comparable binding energies. The best inhibitors of both targets were validated by molecular dynamics simulation studies, in which a root mean square value of <2 was observed along with the other physicochemical properties, hydrogen bond interactions, and binding energies. Furthermore, the anticancer potential of the potent compounds was confirmed by cell viability (MTT) assay. The studied compounds fall into the category of drug-like properties and are also supported by physicochemical and pharmacological ADMET properties.
It can be suggested that further synthesis of derivatives of 7d and 7f may lead to potential drug-like molecules for the treatment of colon cancer associated with the aberrant expression of either AKR1B1 or AKR1B10 and other associated malignancies.
Affiliation(s)
- Amna Saeed
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
| | - Syeda Abida Ejaz
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Correspondence: (S.A.E.); (J.I.)
| | - Muhammad Sarfraz
- College of Pharmacy, Al Ain Campus, Al Ain University, Al Ain P.O. Box 64141, United Arab Emirates
| | - Nissren Tamam
- Department of Physics, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
| | - Farhan Siddique
- Laboratory of Organic Electronics, Department of Science and Technology, Linköping University, SE-60174 Norrköping, Sweden
- Department of Pharmacy, Royal Institute of Medical Sciences (RIMS), Multan 60000, Pakistan
| | - Naheed Riaz
- Department of Chemistry, Baghdad-ul-Jadeed Campus, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
| | - Faizan Abul Qais
- Department of Agricultural Microbiology, Faculty of Agricultural Sciences, Aligarh Muslim University, Aligarh 202002, UP, India
| | - Samir Chtita
- Laboratory of Analytical and Molecular Chemistry, Faculty of Sciences Ben M’Sik, Hassan II University of Casablanca, Sidi Othmane, Casablanca BP7955, Morocco
| | - Jamshed Iqbal
- Centre for Advanced Drug Research, Abbottabad Campus, COMSATS University Islamabad, Abbottabad 22060, Pakistan
- Correspondence: (S.A.E.); (J.I.)
|
20
|
Fujimoto KJ, Minami S, Yanai T. Machine-Learning- and Knowledge-Based Scoring Functions Incorporating Ligand and Protein Fingerprints. ACS OMEGA 2022; 7:19030-19039. [PMID: 35694525 PMCID: PMC9178954 DOI: 10.1021/acsomega.2c02822] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/06/2022] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
We propose a novel machine-learning-based scoring function for drug discovery that incorporates ligand and protein structural information into a knowledge-based PMF score. Molecular docking, a simulation method for structure-based drug design (SBDD), is expected to reduce the enormous costs associated with conventional experimental methods in terms of rational drug discovery. Molecular docking has two main purposes: to predict ligand-binding structures for target proteins and to predict protein-ligand binding affinity. Currently available programs of molecular docking offer an accurate prediction of ligand binding structures for many systems. However, the accurate prediction of binding affinity remains challenging. In this study, we developed a new scoring function that incorporates fingerprints representing ligand and protein structures as descriptors in the PMF score. Here, regression analysis of the scoring function was performed using the following machine learning techniques: least absolute shrinkage and selection operator (LASSO) and light gradient boosting machine (LightGBM). The results on a test data set showed that the binding affinity delivered by the newly developed scoring function has a Pearson correlation coefficient of 0.79 with the experimental value, which surpasses that of the conventional scoring functions. Further analysis provided a chemical understanding of the descriptors that contributed significantly to the improvement in prediction accuracy. Our approach and findings are useful for rational drug discovery.
Affiliation(s)
- Kazuhiro J. Fujimoto
- Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan
- Department of Chemistry, Graduate School of Science, Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan
| | - Shota Minami
- Department of Chemistry, Graduate School of Science, Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan
| | - Takeshi Yanai
- Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan
- Department of Chemistry, Graduate School of Science, Nagoya University, Furocho, Chikusa, Nagoya 464-8601, Japan
|
21
|
Jha P, Saluja D, Chopra M. Structure-guided pharmacophore based virtual screening, docking, and molecular dynamics to discover repurposed drugs as novel inhibitors against endoribonuclease Nsp15 of SARS-CoV-2. J Biomol Struct Dyn 2022:1-11. [PMID: 35652904 DOI: 10.1080/07391102.2022.2079561] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Abstract
COVID-19 (Corona Virus Disease of 2019) caused by the novel 'Severe Acute Respiratory Syndrome Coronavirus-2' (SARS-CoV-2) has wreaked havoc on human health and the global economy. As a result, for new medication development, it's critical to investigate possible therapeutic targets against the novel virus. 'Non-structural protein 15' (Nsp15) endonuclease is one of the crucial targets which helps in the replication of virus and virulence in the host immune system. Here, in the current study, we developed the structure-based pharmacophore model based on Nsp15-UMP interactions and virtually screened several databases against the selected model. To validate the screening process, we docked the top hits obtained after secondary filtering (Lipinski's rule of five, ADMET & Topkat) followed by 100 ns molecular dynamics (MD) simulations. Next, to revalidate the MD simulation studies, we have calculated the binding free energy of each complex using the MM-PBSA procedure. The discovered repurposed drugs can aid the rational design of novel inhibitors for Nsp15 of the SARS-CoV-2 enzyme and may be considered for immediate drug development.
Affiliation(s)
- Prakash Jha
- Laboratory of Molecular Modeling and Anticancer Drug Development, Dr. B. R. Ambedkar Center for Biomedical Research (ACBR), University of Delhi, Delhi, India
| | - Daman Saluja
- Medical Biotechnology Laboratory, Dr. B. R. Ambedkar Center for Biomedical Research (ACBR), University of Delhi, Delhi, India
| | - Madhu Chopra
- Laboratory of Molecular Modeling and Anticancer Drug Development, Dr. B. R. Ambedkar Center for Biomedical Research (ACBR), University of Delhi, Delhi, India
|
22
|
Yang C, Zhang Y. Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions. J Chem Inf Model 2022; 62:2696-2712. [PMID: 35579568 DOI: 10.1021/acs.jcim.2c00485] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Indexed: 01/01/2023]
Abstract
Protein-ligand scoring functions are widely used in structure-based drug design for fast evaluation of protein-ligand interactions, and it is of strong interest to develop scoring functions with machine-learning approaches. In this work, by expanding the training set, developing physically meaningful features, employing our recently developed linear empirical scoring function Lin_F9 (Yang, C. J. Chem. Inf. Model. 2021, 61, 4630-4644) as the baseline, and applying extreme gradient boosting (XGBoost) with Δ-machine learning, we have further improved the robustness and applicability of machine-learning scoring functions. Besides the top performances for scoring-ranking-screening power tests of the CASF-2016 benchmark, the new scoring function ΔLin_F9XGB also achieves superior scoring and ranking performances in different structure types that mimic real docking applications. The scoring powers of ΔLin_F9XGB for locally optimized poses, flexible redocked poses, and ensemble docked poses of the CASF-2016 core set achieve Pearson's correlation coefficient (R) values of 0.853, 0.839, and 0.813, respectively. In addition, the large-scale docking-based virtual screening test on the LIT-PCBA data set demonstrates the reliability and robustness of ΔLin_F9XGB in virtual screening application. The ΔLin_F9XGB scoring function and its code are freely available on the web at (https://yzhang.hpc.nyu.edu/Delta_LinF9_XGB).
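The general idea behind Δ-machine learning is that the model is trained on the residual between the experimental value and a baseline score, and the final prediction is the baseline plus the learned correction. The sketch below illustrates only that idea under loud assumptions: a one-feature least-squares fit stands in for XGBoost, the data are made up, and none of the names correspond to the actual Lin_F9 or ΔLin_F9XGB code.

```python
def fit_line(x, y):
    """Ordinary least squares for y ≈ a * x + b with a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return a, my - a * mx

# Hypothetical training data: baseline scores, one descriptor, measured affinities.
baseline = [5.0, 6.0, 7.0, 8.0]
feature = [0.0, 1.0, 2.0, 3.0]
experimental = [5.5, 7.0, 8.5, 10.0]

# Delta learning: fit the correction to the residual, not to the affinity itself.
residual = [e - b for e, b in zip(experimental, baseline)]
a, b = fit_line(feature, residual)

def delta_score(baseline_score, feature_value):
    """Final prediction = baseline score + learned correction."""
    return baseline_score + (a * feature_value + b)

print(delta_score(6.5, 1.5))  # 7.75
```

One appeal of this arrangement is that the physically motivated baseline carries most of the signal, so the learned part only has to model what the baseline gets wrong.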
Affiliation(s)
- Chao Yang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
23
|
Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL MOLECULAR SCIENCE 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
| | - Yangwei Jiang
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
| | - Shi-Jie Chen
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
|
24
|
Zheng L, Meng J, Jiang K, Lan H, Wang Z, Lin M, Li W, Guo H, Wei Y, Mu Y. Improving protein-ligand docking and screening accuracies by incorporating a scoring function correction term. Brief Bioinform 2022; 23:6548372. [PMID: 35289359 PMCID: PMC9116214 DOI: 10.1093/bib/bbac051] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 10/26/2021] [Revised: 01/30/2022] [Accepted: 01/31/2022] [Indexed: 12/13/2022] Open
Abstract
Scoring functions are important components in molecular docking for structure-based drug discovery. Traditional scoring functions, generally empirical- or force field-based, are robust and have proven to be useful for identifying hits and lead optimizations. Although multiple highly accurate deep learning- or machine learning-based scoring functions have been developed, their direct applications for docking and screening are limited. We describe a novel strategy to develop a reliable protein–ligand scoring function by augmenting the traditional scoring function Vina score using a correction term (OnionNet-SFCT). The correction term is developed based on an AdaBoost random forest model, utilizing multiple layers of contacts formed between protein residues and ligand atoms. In addition to the Vina score, the model considerably enhances the AutoDock Vina prediction abilities for docking and screening tasks based on different benchmarks (such as cross-docking dataset, CASF-2016, DUD-E and DUD-AD). Furthermore, our model could be combined with multiple docking applications to increase pose selection accuracies and screening abilities, indicating its wide usage for structure-based drug discoveries. Furthermore, in a reverse practice, the combined scoring strategy successfully identified multiple known receptors of a plant hormone. To summarize, the results show that the combination of data-driven model (OnionNet-SFCT) and empirical scoring function (Vina score) is a good scoring strategy that could be useful for structure-based drug discoveries and potentially target fishing in future.
Affiliation(s)
- Liangzhen Zheng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China; Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
| | - Jintao Meng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China; National Supercomputer Center in Shenzhen, Shenzhen, 518000, China
| | - Kai Jiang
- Institute of Plant and Food Science, Department of Biology, School of Life Sciences, Southern University of Science and Technology (SUSTech), Shenzhen, Guangdong 518055, China
| | - Haidong Lan
- Tencent AI Lab, Shenzhen, Guangdong 518000, China
| | - Zechen Wang
- School of Physics, Shandong University, Jinan, Shandong 250101, China
| | - Mingzhi Lin
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
| | - Weifeng Li
- School of Physics, Shandong University, Jinan, Shandong 250101, China
| | - Hongwei Guo
- Institute of Plant and Food Science, Department of Biology, School of Life Sciences, Southern University of Science and Technology (SUSTech), Shenzhen, Guangdong 518055, China
| | - Yanjie Wei
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
|
25
|
Nikolaienko T, Gurbych O, Druchok M. Complex machine learning model needs complex testing: Examining predictability of molecular binding affinity by a graph neural network. J Comput Chem 2022; 43:728-739. [PMID: 35201629 DOI: 10.1002/jcc.26831] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/22/2021] [Revised: 01/04/2022] [Accepted: 02/09/2022] [Indexed: 12/12/2022]
Abstract
Drug discovery pipelines typically involve high-throughput screening of large numbers of compounds in search of potential drug candidates. Because the chemical space of small organic molecules is huge, navigating it calls for fast and lightweight computational methods, which motivates machine-learning approaches for processing huge pools of candidates. In this contribution, we present a graph-based deep neural network for prediction of protein-drug binding affinity and assess its predictive power under thorough testing conditions. Within the suggested approach, both protein and drug molecules are represented as graphs and passed to separate graph sub-networks, whose outputs are then concatenated and regressed towards a binding affinity. The neural network is trained on two binding affinity datasets: PDBbind and data imported from the RCSB Protein Data Bank. To explore the generalization capabilities of the model, we go beyond traditional random or leave-cluster-out techniques and demonstrate the need for more elaborate model performance assessment: six different strategies for test/train data partitioning (random, time- and property-arranged, protein- and ligand-clustered) with k-fold cross-validation are employed. Finally, we discuss the model performance in terms of a set of metrics for different split strategies and fold arrangements. Our code is available at https://github.com/SoftServeInc/affinity-by-GNN.
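The protein- and ligand-clustered partitioning mentioned in this abstract can be sketched as a grouped k-fold split, in which every member of a cluster lands in the same fold so that no cluster is shared between training and test data. This is a minimal sketch, not the authors' implementation: the function name, the greedy balancing rule and the example cluster labels are assumptions.

```python
from collections import defaultdict


def group_k_fold(groups, k):
    """Assign sample indices to k folds without splitting any group.

    `groups[i]` is the cluster label of sample i (for example a protein
    or ligand cluster id). Groups are placed largest first into the
    currently smallest fold, a simple greedy balancing heuristic.
    """
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)

    folds = [[] for _ in range(k)]
    # Largest groups first; ties broken by label for determinism.
    for label in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])
    return folds


# Hypothetical cluster labels for six protein-ligand complexes.
groups = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = group_k_fold(groups, k=2)
```

Each fold then serves once as the test set while the remaining folds form the training set, as in ordinary k-fold cross-validation.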
Affiliation(s)
- Tymofii Nikolaienko
- SoftServe, Inc., Lviv, Ukraine; Faculty of Physics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
| | - Oleksandr Gurbych
- Blackthorn AI Ltd., London, UK; Department of Artificial Intelligence Systems, Lviv Polytechnic National University, Lviv, Ukraine
| | - Maksym Druchok
- SoftServe, Inc., Lviv, Ukraine; Institute for Condensed Matter Physics, NAS of Ukraine, Lviv, Ukraine
|
26
|
Zhu YX, Sheng YJ, Ma YQ, Ding HM. Assessing the Performance of Screening MM/PBSA in Protein-Ligand Interactions. J Phys Chem B 2022; 126:1700-1708. [PMID: 35188781 DOI: 10.1021/acs.jpcb.1c09424] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 01/05/2023]
Abstract
Accurate calculation of the binding free energies between a protein and a ligand is the primary objective of structure-based drug design, but it still remains a challenging problem. In this work, we apply the screening molecular mechanics/Poisson Boltzmann surface area (MM/PBSA) method to calculate the binding affinity of protein-ligand interactions. Our results show that the performance of the screening MM/PBSA is better than that of the standard MM/PBSA, especially in a charged-ligand system. In addition, we also investigate the effect of the solute dielectric constant on the results, and find that the optimal solute dielectric constants are different between the neutral-ligand system and the charged-ligand system. Moreover, we also evaluate the effect of the atomic-charge methods on the performance of the screening MM/PBSA. The present study demonstrates that the screening MM/PBSA should be a reliable method for calculating binding energy of biosystems.
Affiliation(s)
- Yu-Xin Zhu
- Center for Soft Condensed Matter Physics and Interdisciplinary Research, School of Physical Science and Technology, Soochow University, Suzhou 215006, China
| | - Yan-Jing Sheng
- Center for Soft Condensed Matter Physics and Interdisciplinary Research, School of Physical Science and Technology, Soochow University, Suzhou 215006, China
| | - Yu-Qiang Ma
- National Laboratory of Solid State Microstructures and Department of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
| | - Hong-Ming Ding
- Center for Soft Condensed Matter Physics and Interdisciplinary Research, School of Physical Science and Technology, Soochow University, Suzhou 215006, China
|
27
|
Mohammadi S, Narimani Z, Ashouri M, Firouzi R, Karimi-Jafari MH. Ensemble learning from ensemble docking: revisiting the optimum ensemble size problem. Sci Rep 2022; 12:410. [PMID: 35013496 PMCID: PMC8748946 DOI: 10.1038/s41598-021-04448-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/10/2021] [Accepted: 12/21/2021] [Indexed: 11/09/2022] Open
Abstract
Despite considerable advances obtained by applying machine learning approaches in protein–ligand affinity predictions, the incorporation of receptor flexibility has remained an important bottleneck. While ensemble docking has been used widely as a solution to this problem, the optimum choice of receptor conformations is still an open question considering the issues related to the computational cost and false positive pose predictions. Here, a combination of ensemble learning and ensemble docking is suggested to rank different conformations of the target protein in light of their importance for the final accuracy of the model. Available X-ray structures of cyclin-dependent kinase 2 (CDK2) in complex with different ligands are used as an initial receptor ensemble, and its redundancy is removed through a graph-based redundancy removal, which is shown to be more efficient and less subjective than clustering-based representative selection methods. A set of ligands with available experimental affinity are docked to this nonredundant receptor ensemble, and the energetic features of the best scored poses are used in an ensemble learning procedure based on the random forest method. The importance of receptors is obtained through feature selection measures, and it is shown that a few of the most important conformations are sufficient to reach 1 kcal/mol accuracy in affinity prediction with considerable improvement of the early enrichment power of the models compared to the different ensemble docking without learning strategies. A clear strategy has been provided in which machine learning selects the most important experimental conformers of the receptor among a large set of protein–ligand complexes while simultaneously maintaining the final accuracy of affinity predictions at the highest level possible for available data. Our results could be informative for future attempts to design receptor-specific docking-rescoring strategies.
Affiliation(s)
- Sara Mohammadi
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Zahra Narimani
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), 45137-66731, Zanjan, Iran
| | - Mitra Ashouri
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Rohoullah Firouzi
- Department of Physical Chemistry, Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
|
28
|
Chen YQ, Sheng YJ, Ding HM, Ma YQ. Efficient calculation of protein-ligand binding free energy with GFN methods: the power of cluster model. Phys Chem Chem Phys 2022; 24:14339-14347. [DOI: 10.1039/d2cp00161f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Protein-ligand interactions are crucial in many biochemical processes and biomedical applications, yet it remains challenging to accurately calculate the binding free energy of their interactions. In this work,...
|
29
|
Basciu A, Callea L, Motta S, Bonvin AM, Bonati L, Vargiu AV. No dance, no partner! A tale of receptor flexibility in docking and virtual screening. VIRTUAL SCREENING AND DRUG DOCKING 2022. [DOI: 10.1016/bs.armc.2022.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
30
|
Can docking scoring functions guarantee success in virtual screening? VIRTUAL SCREENING AND DRUG DOCKING 2022. [DOI: 10.1016/bs.armc.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
31
|
Wang DD, Chan MT, Yan H. Structure-based protein-ligand interaction fingerprints for binding affinity prediction. Comput Struct Biotechnol J 2021; 19:6291-6300. [PMID: 34900139 PMCID: PMC8637032 DOI: 10.1016/j.csbj.2021.11.018] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/30/2021] [Revised: 11/09/2021] [Accepted: 11/13/2021] [Indexed: 11/17/2022] Open
Abstract
Binding affinity prediction (BAP) using protein–ligand complex structures is crucial to computer-aided drug design, but remains a challenging problem. To achieve efficient and accurate BAP, machine-learning scoring functions (SFs) based on a wide range of descriptors have been developed. Among those descriptors, protein–ligand interaction fingerprints (IFPs) are competitive due to their simple representations, elaborate profiles of key interactions and easy collaborations with machine-learning algorithms. In this paper, we have adopted a building-block-based taxonomy to review a broad range of IFP models, and compared representative IFP-based SFs in target-specific and generic scoring tasks. Atom-pair-counts-based and substructure-based IFPs show great potential in these tasks.
Affiliation(s)
- Debby D Wang
- School of Health Science and Engineering, University of Shanghai for Science and Technology, 516 Jungong Rd, Shanghai 200093, China
| | - Moon-Tong Chan
- School of Science and Technology, Hong Kong Metropolitan University, 30 Good Shepherd St, Ho Man Tin, Hong Kong
| | - Hong Yan
- Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
|
32
|
A geometric deep learning approach to predict binding conformations of bioactive molecules. NAT MACH INTELL 2021. [DOI: 10.1038/s42256-021-00409-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/08/2022]
|
33
|
Wilt S, Kodani S, Valencia L, Hudson PK, Sanchez S, Quintana T, Morisseau C, Hammock BD, Kandasamy R, Pecic S. Further exploration of the structure-activity relationship of dual soluble epoxide hydrolase/fatty acid amide hydrolase inhibitors. Bioorg Med Chem 2021; 51:116507. [PMID: 34794001 DOI: 10.1016/j.bmc.2021.116507] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/17/2021] [Revised: 10/25/2021] [Accepted: 10/28/2021] [Indexed: 11/30/2022]
Abstract
Fatty acid amide hydrolase (FAAH) is a membrane protein that hydrolyzes endocannabinoids, and its inhibition produces analgesic and anti-inflammatory effects. The soluble epoxide hydrolase (sEH) hydrolyzes epoxyeicosatrienoic acids (EETs) to dihydroxyeicosatetraenoic acids. EETs have anti-inflammatory and inflammation resolving properties, thus inhibition of sEH consequently reduces inflammation. Concurrent inhibition of both enzymes may represent a novel approach in the treatment of chronic pain. Drugs with multiple targets can provide a superior therapeutic effect and a decrease in side effects compared to ligands with single targets. Previously, microwave-assisted methodologies were employed to synthesize libraries of benzothiazole analogs from which high affinity dual inhibitors (e.g. 3, sEH IC50 = 9.6 nM; FAAH IC50 = 7 nM) were identified. Here, our structure-activity relationship studies revealed that the 4-phenylthiazole moiety is well tolerated by both enzymes, producing excellent inhibition potencies in the low nanomolar range (e.g. 6o, sEH IC50 = 2.5 nM; FAAH IC50 = 9.8 nM). Docking experiments show that the new class of dual inhibitors bind within the catalytic sites of both enzymes. Prediction of several pharmacokinetic/pharmacodynamic properties suggest that these new dual inhibitors are good candidates for further in vivo evaluation. Finally, dual inhibitor 3 was tested in the Formalin Test, a rat model of acute inflammatory pain. The data indicate that 3 produces antinociception against the inflammatory phase of the Formalin Test in vivo and is metabolically stable following intraperitoneal administration in male rats. Further, antinociception produced by 3 is comparable to that of ketoprofen, a traditional nonsteroidal anti-inflammatory drug. The results presented here will help toward the long-term goal of developing novel non-opioid therapeutics for pain management.
Affiliation(s)
- Stephanie Wilt
- Department of Chemistry & Biochemistry, California State University, Fullerton, 800 N. State College, Fullerton, CA 92834, United States
| | - Sean Kodani
- Department of Entomology and Nematology, and UCD Comprehensive Cancer Center, University of California Davis, Davis, CA 95616, United States
| | - Leah Valencia
- Department of Chemistry & Biochemistry, California State University, Fullerton, 800 N. State College, Fullerton, CA 92834, United States
| | - Paula K Hudson
- Department of Chemistry & Biochemistry, California State University, Fullerton, 800 N. State College, Fullerton, CA 92834, United States
| | - Stephanie Sanchez
- Department of Psychology, California State University, East Bay, 25800 Carlos Bee Blvd. Science S229, Hayward, CA 94542, United States
| | - Taylor Quintana
- Department of Psychology, California State University, East Bay, 25800 Carlos Bee Blvd. Science S229, Hayward, CA 94542, United States
| | - Christophe Morisseau
- Department of Entomology and Nematology, and UCD Comprehensive Cancer Center, University of California Davis, Davis, CA 95616, United States
| | - Bruce D Hammock
- Department of Entomology and Nematology, and UCD Comprehensive Cancer Center, University of California Davis, Davis, CA 95616, United States
| | - Ram Kandasamy
- Department of Psychology, California State University, East Bay, 25800 Carlos Bee Blvd. Science S229, Hayward, CA 94542, United States.
| | - Stevan Pecic
- Department of Chemistry & Biochemistry, California State University, Fullerton, 800 N. State College, Fullerton, CA 92834, United States.
|
34
|
Lu H, Wei Z, Wang C, Guo J, Zhou Y, Wang Z, Liu H. Redesigning Vina@QNLM for Ultra-Large-Scale Molecular Docking and Screening on a Sunway Supercomputer. Front Chem 2021; 9:750325. [PMID: 34778205 PMCID: PMC8581564 DOI: 10.3389/fchem.2021.750325] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/30/2021] [Accepted: 09/14/2021] [Indexed: 11/28/2022] Open
Abstract
Ultra-large-scale molecular docking can improve the accuracy of lead compounds in drug discovery. In this study, we developed a molecular docking program, Vina@QNLM, which can use more than 480,000 parallel processes to search for potential lead compounds among hundreds of millions of compounds. We proposed a task scheduling mechanism for large-scale parallelism based on Vinardo and the Sunway supercomputer architecture. Then, we re-adapted the core docking algorithm to take full advantage of the heterogeneous multicore processor architecture in intensive computing. We successfully expanded it to 10,465,065 cores (161,001 management process elements and 10,304,064 computing process elements), with a strong scalability of 55.92%. To the best of our knowledge, this is the first time that 10 million cores have been used for molecular docking on Sunway. The introduction of the heterogeneous multicore processor architecture achieved the best speedup, which is 11x more than that of the management process element of Sunway. The performance of Vina@QNLM was comprehensively evaluated using the CASF-2013 and CASF-2016 protein-ligand benchmarks, and its screening power was the highest of the 27 programs tested in the CASF-2013 benchmark. In some existing applications, we used Vina@QNLM to dock more than 10 million molecules to nine rigid proteins related to SARS-CoV-2 within 8.5 h on 10 million cores. We also developed a platform for the general public to use the software.
Affiliation(s)
- Hao Lu
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
- Pilot National Laboratory for Marine Science and Technology, Qingdao, China
| | - Zhiqiang Wei
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
- Pilot National Laboratory for Marine Science and Technology, Qingdao, China
| | - Cunji Wang
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
| | - Jingjing Guo
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
| | - Yuandong Zhou
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
| | - Zhuoya Wang
- Pilot National Laboratory for Marine Science and Technology, Qingdao, China
| | - Hao Liu
- College of Computer Science and Technology, Ocean University of China, Qingdao, China
- Pilot National Laboratory for Marine Science and Technology, Qingdao, China
|
35
|
Wang Z, Zheng L, Liu Y, Qu Y, Li YQ, Zhao M, Mu Y, Li W. OnionNet-2: A Convolutional Neural Network Model for Predicting Protein-Ligand Binding Affinity Based on Residue-Atom Contacting Shells. Front Chem 2021; 9:753002. [PMID: 34778208 PMCID: PMC8579074 DOI: 10.3389/fchem.2021.753002] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/04/2021] [Accepted: 10/06/2021] [Indexed: 01/31/2023] Open
Abstract
One key task in virtual screening is to accurately predict the binding affinity (ΔG) of protein-ligand complexes. Recently, deep learning (DL) has significantly increased the predictive accuracy of scoring functions owing to its extraordinary ability to extract useful features from raw data. Nevertheless, more effort is still needed in many respects to increase prediction accuracy and decrease computational cost. In this study, we proposed a simple scoring function (called OnionNet-2) based on a convolutional neural network to predict ΔG. The protein-ligand interactions are characterized by the number of contacts between protein residues and ligand atoms in multiple distance shells. Compared to published models, OnionNet-2 is demonstrated to perform best on two widely used benchmarks, CASF-2016 and CASF-2013. The OnionNet-2 model was further verified on non-experimental decoy structures from a docking program and on the CSAR NRC-HiQ data set (a high-quality data set provided by CSAR), with great success. Thus, our study provides a simple but efficient scoring function for predicting protein-ligand binding free energy.
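The idea of counting residue-atom contacts in multiple distance shells can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the OnionNet-2 featurization itself: the shell boundaries, the type labels, the precomputed distance matrix and the function name are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def shell_contact_counts(distances, residue_types, atom_types, edges):
    """Count contacts per (shell, residue type, ligand atom type).

    distances[i][j] is a precomputed distance between residue i and
    ligand atom j. `edges` is an increasing list of shell boundaries;
    shell k covers edges[k] <= d < edges[k + 1]. Pairs that fall
    outside all shells are skipped.
    """
    counts = Counter()
    n_shells = len(edges) - 1
    for i, row in enumerate(distances):
        for j, d in enumerate(row):
            k = bisect_right(edges, d) - 1  # index of the shell containing d
            if 0 <= k < n_shells:
                counts[(k, residue_types[i], atom_types[j])] += 1
    return counts


# Two residues, two ligand atoms, three hypothetical 1-unit-wide shells.
distances = [[0.5, 1.5], [2.5, 10.0]]
counts = shell_contact_counts(
    distances, ["ALA", "GLY"], ["C", "N"], [0.0, 1.0, 2.0, 3.0]
)
```

Flattening the counts over a fixed ordering of shells and type pairs would give a fixed-length feature vector that a downstream model could consume.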
Affiliation(s)
- Zechen Wang
- School of Physics, Shandong University, Jinan, China
| | | | - Yang Liu
- School of Physics, Shandong University, Jinan, China
| | - Yuanyuan Qu
- School of Physics, Shandong University, Jinan, China
| | - Yong-Qiang Li
- School of Physics, Shandong University, Jinan, China
| | - Mingwen Zhao
- School of Physics, Shandong University, Jinan, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Weifeng Li
- School of Physics, Shandong University, Jinan, China
|
36
|
Yuan H, Huang J, Li J. Protein-ligand binding affinity prediction model based on graph attention network. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2021; 18:9148-9162. [PMID: 34814340 DOI: 10.3934/mbe.2021451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Estimating the binding affinity between proteins and drugs is very important in the application of structure-based drug design. Currently, applying machine learning to build the protein-ligand binding affinity prediction model, which is helpful to improve the performance of classical scoring functions, has attracted many scientists' attention. In this paper, we have developed an affinity prediction model called GAT-Score based on graph attention network (GAT). The protein-ligand complex is represented by a graph structure, and the atoms of protein and ligand are treated in the same manner. Two improvements are made to the original graph attention network. Firstly, a dynamic feature mechanism is designed to enable the model to deal with bond features. Secondly, a virtual super node is introduced to aggregate node-level features into graph-level features, so that the model can be used in the graph-level regression problems. PDBbind database v.2018 is used to train the model. Finally, the performance of GAT-Score was tested by the scheme $C_s$ (Core set as the test set) and CV (Cross-Validation). It has been found that our results are better than most methods from machine learning models with traditional molecular descriptors.
Affiliation(s)
- Hong Yuan
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, China
- Medicine & Engineering & Informatics Fusion and Transformation Key Laboratory of Luzhou City, Luzhou, China
| | - Jing Huang
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, China
- Medicine & Engineering & Informatics Fusion and Transformation Key Laboratory of Luzhou City, Luzhou, China
| | - Jin Li
- School of Medical Information and Engineering, Southwest Medical University, Luzhou, China
- Medicine & Engineering & Informatics Fusion and Transformation Key Laboratory of Luzhou City, Luzhou, China
|
37
|
Abstract
Molecular docking is one of the most widely used computational tools in structure-based drug design and is critically dependent on accuracy and robustness of the scoring function. In this work, we introduce a new scoring function Lin_F9, which is a linear combination of nine empirical terms, including a unified metal bond term to specifically describe metal-ligand interactions. Parameters in Lin_F9 are obtained with a multistage fitting protocol using explicit water-included structures. For the CASF-2016 benchmark test set, Lin_F9 achieves the top scoring power among all 34 classical scoring functions for both original crystal poses and locally optimized poses with Pearson correlation coefficients (R) of 0.680 and 0.687, respectively. Meanwhile, in comparison with Vina, Lin_F9 achieves consistently better scoring power and ranking power with various types of protein-ligand complex structures that mimic real docking applications, including end-to-end flexible docking for the CASF-2016 benchmark test set using a single or an ensemble of protein receptor structures, as well as for D3R Grand Challenge (GC4) test sets. Lin_F9 has been implemented in a fork of Smina as an optional built-in scoring function that can be used for docking applications as well as for further improvement of scoring functions and docking protocols. Lin_F9 is accessible through https://yzhang.hpc.nyu.edu/Lin_F9/.
Affiliation(s)
- Chao Yang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
38
|
Llanos MA, Gantner ME, Rodriguez S, Alberca LN, Bellera CL, Talevi A, Gavernet L. Strengths and Weaknesses of Docking Simulations in the SARS-CoV-2 Era: the Main Protease (Mpro) Case Study. J Chem Inf Model 2021; 61:3758-3770. [PMID: 34313128 DOI: 10.1021/acs.jcim.1c00404] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/24/2022]
Abstract
The scientific community is working against the clock to arrive at therapeutic interventions to treat patients with COVID-19. Among the strategies for drug discovery, virtual screening approaches have the capacity to search for potential hits within millions of chemical structures in days, given the appropriate computing infrastructure. In this article, we first analyzed the published research targeting the inhibition of the main protease (Mpro), one of the most studied targets of SARS-CoV-2, by docking-based methods. An alarming finding was the lack of adequate validation of the docking protocols (i.e., pose prediction and virtual screening accuracy) before applying them in virtual screening campaigns. The performance of the docking protocols was tested at some level in 57.7% of the 168 investigations analyzed. However, we found only three examples of a complete retrospective analysis of the scoring functions to quantify the virtual screening accuracy of the methods. Moreover, only two publications had reported some experimental evaluation of the proposed hits by the time this manuscript was prepared. All of these findings led us to carry out a retrospective performance validation of three different docking protocols, through the analysis of their pose prediction and screening accuracy. Surprisingly, we found that even though all tested docking protocols show good pose prediction, their screening accuracy is quite limited, as they fail to correctly rank a test set of compounds. These results highlight the importance of conducting an adequate validation of the docking protocols before carrying out virtual screening campaigns, and of experimentally confirming the predictions made by the models before drawing bold conclusions. Finally, successful structure-based drug discovery investigations published during the writing of this manuscript allow us to propose the inclusion of target flexibility and consensus scoring as alternatives to improve the accuracy of the methods.
Affiliation(s)
- Manuel A Llanos
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Melisa E Gantner
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Santiago Rodriguez
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Lucas N Alberca
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Carolina L Bellera
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Alan Talevi
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
| | - Luciana Gavernet
- Laboratory of Bioactive Research and Development (LIDeB), Department of Biological Sciences, Faculty of Exact Sciences, National University of La Plata (UNLP), 47&115, La Plata (B1900ADU), Buenos Aires, Argentina
|
39
|
Xie L, Xu L, Chang S, Xu X, Meng L. Multitask deep networks with grid featurization achieve improved scoring performance for protein-ligand binding. Chem Biol Drug Des 2021; 96:973-983. [PMID: 33058459 DOI: 10.1111/cbdd.13648] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/12/2019] [Revised: 10/11/2019] [Accepted: 10/27/2019] [Indexed: 01/10/2023]
Abstract
Deep learning-based methods have been extensively developed to improve scoring performance in structure-based drug discovery. Extending multitask deep networks in addressing pharmaceutical problems shows remarkable improvements over single task network. Recently, grid featurization has been introduced to convert protein-ligand complex co-ordinates into fingerprints with the advantage of incorporating inter- and intra-molecular information. The combination of grid featurization with multitask deep networks would hold great potential to boost the scoring performance. We examined the performance of three novel multitask deep networks (standard multitask, bypass, and progressive network) in reproducing the binding affinities of protein-ligand complexes in comparison with AutoDock Vina docking and MM/GBSA method. Among five evaluated methods, progressive network combined with grid featurization provided the best Pearson correlation coefficient (0.74) and least mean absolute average error (0.98) for the overall scoring performance. Moreover, all networks increased screening ability for the re-docking pose and progressive network even achieved AUC of 0.87 over 0.52 of AutoDock Vina. Our results demonstrated that progressive network combined with grid featurization would be one powerful rescoring approach to strengthen screening results after obtaining protein-ligand complex in the conventional docking software.
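The two scoring metrics quoted in this abstract, the Pearson correlation coefficient and the mean absolute error, can be computed as in the minimal sketch below. The function names and the toy affinity values are illustrative assumptions; they are not data from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_absolute_error(x, y):
    """Average absolute difference between predictions and references."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Toy values only: hypothetical experimental and predicted affinities.
experimental = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 5.0]
r = pearson_r(experimental, predicted)
mae = mean_absolute_error(experimental, predicted)
```

A high correlation does not imply a small error: predictions can track the ordering of the experimental values well while still being offset from them, which is why both numbers are usually reported together.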
Affiliation(s)
- Liangxu Xie
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xiaojun Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Li Meng
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
|
40
|
Sánchez-Cruz N, Medina-Franco JL, Mestres J, Barril X. Extended connectivity interaction features: improving binding affinity prediction through chemical description. Bioinformatics 2021; 37:1376-1382. [PMID: 33226061 DOI: 10.1093/bioinformatics/btaa982] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 09/18/2020] [Revised: 10/27/2020] [Accepted: 11/10/2020] [Indexed: 12/22/2022]
Abstract
MOTIVATION Machine-learning scoring functions (SFs) have been found to outperform standard SFs for binding affinity prediction of protein-ligand complexes. A plethora of reports focus on the implementation of increasingly complex algorithms, while the chemical description of the system has not been fully exploited. RESULTS Herein, we introduce Extended Connectivity Interaction Features (ECIF) to describe protein-ligand complexes and build machine-learning SFs with improved predictions of binding affinity. ECIF are a set of protein-ligand atom-type pair counts that take into account each atom's connectivity to describe it and thus define the pair types. ECIF were used to build different machine-learning models to predict protein-ligand affinities (pKd/pKi). The models were evaluated in terms of 'scoring power' on the Comparative Assessment of Scoring Functions 2016. The best models built on ECIF achieved Pearson correlation coefficients of 0.857 when used on its own, and 0.866 when used in combination with ligand descriptors, demonstrating ECIF descriptive power. AVAILABILITY AND IMPLEMENTATION Data and code to reproduce all the results are freely available at https://github.com/DIFACQUIM/ECIF. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
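The pair-count idea described in this abstract can be illustrated with a minimal sketch. This is not the ECIF implementation (the authors' repository linked above is the reference for that): the atom-type labels, the coordinates, and the 6.0 Å distance cutoff below are assumptions made for illustration, and the abstract itself does not state how interacting pairs are selected.

```python
from collections import Counter
from math import dist


def pair_type_counts(protein_atoms, ligand_atoms, cutoff=6.0):
    """Count protein-ligand atom-type pairs whose distance is within `cutoff`.

    Each atom is a (type_label, (x, y, z)) tuple. The type label is assumed to
    already encode the atom's connectivity, as the abstract describes; how the
    label is derived is outside the scope of this sketch.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            if dist(p_xyz, l_xyz) <= cutoff:
                counts[(p_type, l_type)] += 1
    return counts


# Hypothetical atoms with made-up type labels and coordinates (angstroms).
protein = [
    ("C;4;1", (0.0, 0.0, 0.0)),
    ("N;3;1", (3.0, 0.0, 0.0)),
    ("O;2;0", (20.0, 0.0, 0.0)),  # too far from the ligand to contribute
]
ligand = [
    ("C;4;3", (1.0, 0.0, 0.0)),
    ("O;2;1", (0.0, 4.0, 0.0)),
]
features = pair_type_counts(protein, ligand)
```

The resulting counter maps each (protein type, ligand type) pair to its count, and such counts form a fixed-length feature vector once a vocabulary of pair types is fixed.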
Affiliation(s)
- Norberto Sánchez-Cruz
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | - José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | - Jordi Mestres
- Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomedica (PRBB), 08003 Barcelona, Catalonia, Spain
- Chemotargets SL, Parc Cientific de Barcelona (PCB), 08028 Barcelona, Catalonia, Spain
| | - Xavier Barril
- Institut de Biomedicina de la Universitat de Barcelona (IBUB) and Facultat de Farmacia, Universitat de Barcelona, 08028 Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
|
41
|
Jiang Z, Xu J, Yan A, Wang L. A comprehensive comparative assessment of 3D molecular similarity tools in ligand-based virtual screening. Brief Bioinform 2021; 22:6304389. [PMID: 34151363 DOI: 10.1093/bib/bbab231] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/18/2020] [Revised: 05/10/2021] [Accepted: 05/27/2021] [Indexed: 12/19/2022]
Abstract
Three-dimensional (3D) molecular similarity, one major ligand-based virtual screening (VS) method, has been widely used in the drug discovery process. A variety of 3D molecular similarity tools have been developed in recent decades. In this study, we assessed a panel of 15 3D molecular similarity programs against the DUD-E and LIT-PCBA datasets, including commercial ROCS and Phase, in terms of screening power and scaffold-hopping power. The results revealed that (1) SHAFTS, LS-align, Phase Shape_Pharm and LIGSIFT showed the best VS capability in terms of screening power. Some 3D similarity tools available to academia can yield relatively better VS performance than commercial ROCS and Phase software. (2) Current 3D similarity VS tools exhibit a considerable ability to capture actives with new chemotypes in terms of scaffold hopping. (3) Using multiple conformers rather than single conformations generally improves VS performance for most 3D similarity tools: the improvement in area under the receiver operating characteristic curve values was marginal, whereas the enrichment factor and hit rate in the top 1% showed larger improvement. Moreover, redundancy and complementarity analyses of hit lists from different query seeds and different 3D similarity VS tools showed that the combination of different query seeds and/or different 3D similarity tools in VS campaigns retrieved more (and more diverse) active molecules. These findings provide useful information for guiding the choice of optimal 3D molecular similarity tools for VS practice and for designing possible combination strategies to discover more diverse active compounds.
Affiliation(s)
- Zhenla Jiang
- South China University of Technology, Guangzhou 510006, China
| | - Jianrong Xu
- Shanghai Jiao Tong University School of Medicine and Shanghai University of Traditional Chinese Medicine, Guangzhou 510006, China
| | - Aixia Yan
- Beijing University of Chemical Technology, Guangzhou 510006, China
| | - Ling Wang
- South China University of Technology, Guangzhou 510006, China
|
42
|
Bai Q, Ma J, Liu S, Xu T, Banegas-Luna AJ, Pérez-Sánchez H, Tian Y, Huang J, Liu H, Yao X. WADDAICA: A webserver for aiding protein drug design by artificial intelligence and classical algorithm. Comput Struct Biotechnol J 2021; 19:3573-3579. [PMID: 34194678 PMCID: PMC8234348 DOI: 10.1016/j.csbj.2021.06.017] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/31/2021] [Revised: 06/05/2021] [Accepted: 06/12/2021] [Indexed: 10/25/2022]
Abstract
Artificial intelligence methods can train deep learning models on known drug data for drug design, while classical algorithms design drugs through established, predefined procedures. Both deep learning and classical algorithms have their merits for drug design. Here, the webserver WADDAICA is built to combine the advantages of deep learning models and classical algorithms for drug design. WADDAICA mainly contains two modules. In the first module, WADDAICA provides deep learning models for scaffold hopping of compounds to modify or design novel drugs. The deep learning model used in WADDAICA shows good scoring power on the PDBbind database. In the second module, WADDAICA supplies functions for modifying or designing novel drugs by classical algorithms. WADDAICA shows better Pearson and Spearman correlations of binding affinity than Autodock Vina, which is considered to have the best scoring power. In addition, WADDAICA supplies a friendly and convenient web interface for users to submit drug design jobs. We believe that WADDAICA is a useful and effective tool to help researchers modify or design novel drugs with deep learning models and classical algorithms. WADDAICA is free and accessible at https://bqflab.github.io or https://heisenberg.ucam.edu:5000.
Affiliation(s)
- Qifeng Bai
- Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
| | - Jian Ma
- School of Pharmacy, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
| | - Shuo Liu
- School of Pharmacy, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
| | | | - Antonio Jesús Banegas-Luna
- Structural Bioinformatics and High Performance Computing Research Group (BIO-HPC), Computer Engineering Department, UCAM Universidad Católica de Murcia, Murcia, Spain
| | - Horacio Pérez-Sánchez
- Structural Bioinformatics and High Performance Computing Research Group (BIO-HPC), Computer Engineering Department, UCAM Universidad Católica de Murcia, Murcia, Spain
| | - Yanan Tian
- School of Pharmacy, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
| | | | - Huanxiang Liu
- School of Pharmacy, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
| | - Xiaojun Yao
- Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, P. R. China
|
43
|
Kadukova M, Machado KDS, Chacón P, Grudinin S. KORP-PL: a coarse-grained knowledge-based scoring function for protein-ligand interactions. Bioinformatics 2021; 37:943-950. [PMID: 32840574 DOI: 10.1093/bioinformatics/btaa748] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/23/2020] [Revised: 07/27/2020] [Accepted: 08/18/2020] [Indexed: 11/12/2022]
Abstract
MOTIVATION Despite the progress made in studying protein-ligand interactions and the widespread application of docking and affinity prediction tools, improving their precision and efficiency remains a challenge. Computational approaches based on the scoring of docking conformations with statistical potentials constitute a popular alternative to more accurate but costly physics-based thermodynamic sampling methods. In this context, a minimalist and fast sidechain-free knowledge-based potential with high docking and screening power can be very useful when screening a large number of putative docking conformations. RESULTS Here, we present a novel coarse-grained potential defined by a 3D joint probability distribution function that only depends on the pairwise orientation and position between protein backbone and ligand atoms. Despite its extreme simplicity, our approach yields results that are very competitive with state-of-the-art scoring functions, especially in docking and screening tasks. For example, we observed a twofold improvement in the median 5% enrichment factor on the DUD-E benchmark compared to Autodock Vina results. Moreover, our results prove that a coarse sidechain-free potential is sufficient for a very successful docking pose prediction. AVAILABILITY AND IMPLEMENTATION The standalone version of KORP-PL and the corresponding tests and benchmarks are available at https://team.inria.fr/nano-d/korp-pl/ and https://chaconlab.org/modeling/korp-pl. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
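The 5% enrichment factor mentioned in this abstract is a standard virtual-screening metric. The sketch below uses one common definition (the fraction of actives among the top-ranked x% of compounds divided by the fraction of actives in the whole library) and assumes that a higher score means a better-ranked compound. The function name and the toy library are illustrative assumptions; the abstract does not give the exact evaluation protocol used on DUD-E.

```python
def enrichment_factor(scores, is_active, fraction=0.05):
    """Enrichment factor for the top `fraction` of a score-ranked library.

    Assumes higher scores are better. `is_active` holds one boolean per
    compound, aligned with `scores`.
    """
    n_total = len(scores)
    n_actives = sum(is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    # Size of the top slice; at least one compound is always inspected.
    n_top = max(1, round(n_total * fraction))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(active for _, active in ranked[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical library: 40 compounds, 4 actives, 2 of them ranked first.
scores = [float(40 - i) for i in range(40)]
is_active = [True, True] + [False] * 36 + [True, True]
ef_5 = enrichment_factor(scores, is_active, fraction=0.05)
```

With these toy values the top 5% contains two compounds, both active, against a background active rate of 10%, so the top slice is enriched tenfold over random selection.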
Affiliation(s)
- Maria Kadukova
- Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, 38000 Grenoble, France.,Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141701 Dolgoprudniy, Russia
| | - Karina Dos Santos Machado
- Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, 38000 Grenoble, France.,Computational Biology Laboratory, Centro de Ciências Computacionais, Universidade Federal do Rio Grande - FURG, Rio Grande, RS 96201-090, Brazil
| | - Pablo Chacón
- Department of Biological Physical Chemistry, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid 28006, Spain
| | - Sergei Grudinin
- Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, 38000 Grenoble, France
|
44
|
Rayka M, Karimi-Jafari MH, Firouzi R. ET-score: Improving Protein-ligand Binding Affinity Prediction Based on Distance-weighted Interatomic Contact Features Using Extremely Randomized Trees Algorithm. Mol Inform 2021; 40:e2060084. [PMID: 34021703 DOI: 10.1002/minf.202060084] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/25/2021] [Accepted: 04/20/2021] [Indexed: 12/15/2022]
Abstract
Molecular docking simulation is a key computational tool in modern drug discovery research, and its predictive performance strongly depends on the scoring functions employed. Many recent studies have shown that the application of machine learning algorithms in the development of scoring functions has led to a significant improvement in docking performance. In this work, we introduce a new machine learning (ML)-based scoring function called ET-Score, which employs the distance-weighted interatomic contacts between atom type pairs of the ligand and the protein for featurizing protein-ligand complexes and the Extremely Randomized Trees algorithm for the training process. The performance of ET-Score is compared with some successful ML-based scoring functions and several popular classical scoring functions on the PDBbind 2016v core set. It is shown that our ET-Score model (with a Pearson's correlation of 0.827 and an RMSE of 1.332) achieves very good performance in comparison with most of the ML-based scoring functions and all classical scoring functions despite its extremely low computational cost. The ET-Score code is freely available at https://github.com/miladrayka/ET_Score.
Affiliation(s)
- Milad Rayka
- Department of Physical Chemistry, Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
| | | | - Rohoullah Firouzi
- Department of Physical Chemistry, Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
|
45
|
Ji B, He X, Zhai J, Zhang Y, Man VH, Wang J. Machine learning on ligand-residue interaction profiles to significantly improve binding affinity prediction. Brief Bioinform 2021; 22:6184410. [PMID: 33758923 DOI: 10.1093/bib/bbab054] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/08/2020] [Revised: 01/06/2021] [Accepted: 02/02/2021] [Indexed: 01/01/2023]
Abstract
Structure-based virtual screenings (SBVSs) play an important role in drug discovery projects. However, it is still a challenge to accurately predict the binding affinity of an arbitrary molecule to a drug target and to prioritize top ligands from an SBVS. In this study, we developed a novel method that uses ligand-residue interaction profiles (IPs) to construct machine learning (ML)-based prediction models and significantly improves the screening performance in SBVSs. Such a prediction model is called an IP scoring function (IP-SF). We systematically investigated how to improve the performance of IP-SFs from many perspectives, including the sampling methods before interaction energy calculation and different ML algorithms. Using six drug targets, each with hundreds of known ligands, we conducted a critical evaluation of the developed IP-SFs. The IP-SFs employing a gradient boosting decision tree (GBDT) algorithm in conjunction with the MIN + GB simulation protocol achieved the best overall performance. Its scoring power, ranking power and screening power significantly outperformed those of the Glide SF. First, compared with Glide, the average values of mean absolute error and root mean square error of GBDT/MIN + GB decreased by about 38 and 36%, respectively. Second, the mean values of the squared correlation coefficient and predictive index increased by about 225 and 73%, respectively. Third, more encouragingly, the average area under the receiver operating characteristic curve for the six targets obtained by GBDT, 0.87, is significantly better than that obtained by Glide, which is only 0.71. Thus, we expect IP-SFs to have broad and promising applications in SBVSs.
Affiliation(s)
- Beihong Ji
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Xibing He
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Jingchen Zhai
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Yuzhao Zhang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Viet Hoang Man
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
|
46
|
Lim S, Lu Y, Cho CY, Sung I, Kim J, Kim Y, Park S, Kim S. A review on compound-protein interaction prediction methods: Data, format, representation and model. Comput Struct Biotechnol J 2021; 19:1541-1556. [PMID: 33841755 PMCID: PMC8008185 DOI: 10.1016/j.csbj.2021.03.004] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 08/31/2020] [Revised: 02/28/2021] [Accepted: 03/01/2021] [Indexed: 01/27/2023]
Abstract
There has recently been rapid progress in computational methods for determining the protein targets of small-molecule drugs, which will be termed compound-protein interaction (CPI). In this review, we comprehensively cover topics related to the computational prediction of CPI. Data for CPI have been accumulated and curated significantly in both quantity and quality. Computational methods have become more powerful than ever for analyzing such complex data. Thus, recent successes in the improved quality of CPI prediction are due to the use of both sophisticated computational techniques and higher-quality information in the databases. The goal of this article is to review topics related to CPI, from data, format and representation to computational models, so that researchers can take full advantage of these resources to develop novel prediction methods. Chemical compound and protein data from various resources are discussed in terms of data formats and encoding schemes. For the CPI methods, we group prediction methods into five categories, from traditional machine learning techniques to state-of-the-art deep learning techniques. In closing, we discuss emerging machine learning topics to help both experimental and computational scientists leverage current knowledge and strategies to develop more powerful and accurate CPI prediction methods.
Affiliation(s)
- Sangsoo Lim
- Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea
| | - Yijingxiu Lu
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Chang Yun Cho
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
| | - Inyoung Sung
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
| | - Jungwoo Kim
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Youngkuk Kim
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Sungjoon Park
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Sun Kim
- Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, College of Natural Sciences, Seoul National University, Seoul, Republic of Korea
|
47
|
Guedes IA, Barreto AMS, Marinho D, Krempser E, Kuenemann MA, Sperandio O, Dardenne LE, Miteva MA. New machine learning and physics-based scoring functions for drug discovery. Sci Rep 2021; 11:3198. [PMID: 33542326 PMCID: PMC7862620 DOI: 10.1038/s41598-021-82410-1] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 11/02/2020] [Accepted: 01/20/2021] [Indexed: 12/11/2022]
Abstract
Scoring functions are essential for modern in silico drug discovery. However, the accurate prediction of binding affinity by scoring functions remains a challenging task. The performance of scoring functions is very heterogeneous across different target classes. Scoring functions based on precise physics-based descriptors that better represent the protein–ligand recognition process are strongly needed. We developed a set of new empirical scoring functions, named DockTScore, by explicitly accounting for physics-based terms combined with machine learning. Target-specific scoring functions were developed for two important drug targets, proteases and protein–protein interactions, representing an original class of molecules for drug discovery. Multiple linear regression (MLR), support vector machine and random forest algorithms were employed to derive general and target-specific scoring functions involving optimized MMFF94S force-field terms, solvation and lipophilic interaction terms, and an improved term accounting for the ligand torsional entropy contribution to ligand binding. The DockTScore scoring functions proved competitive with the current best-evaluated scoring functions in terms of binding energy prediction and ranking on four DUD-E datasets and will be useful for in silico drug design for diverse proteins as well as for specific targets such as proteases and protein–protein interactions. Currently, the MLR DockTScore is available at www.dockthor.lncc.br.
Affiliation(s)
- Isabella A Guedes
- Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil.,Inserm U973, Université Paris Diderot, Paris, France
| | - André M S Barreto
- Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
| | - Diogo Marinho
- Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
| | | | | | - Olivier Sperandio
- Inserm U973, Université Paris Diderot, Paris, France.,Structural Bioinformatics Unit, CNRS UMR3528, Institut Pasteur, 75015, Paris, France
| | - Laurent E Dardenne
- Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil.
| | - Maria A Miteva
- Inserm U973, Université Paris Diderot, Paris, France. .,Inserm U1268 "Medicinal Chemistry and Translational Research", CiTCoM, UMR 8038, CNRS, Université de Paris, 75006, Paris, France.
|
48
|
Chen H, Wang Z, Fan F, Shi P, Xu X, Du M, Wang C. Analysis Method of Lactoferrin Based on Uncoated Capillary Electrophoresis. EFOOD 2021. [DOI: 10.2991/efood.k.210720.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/01/2022]
|
49
|
Piao C, Zhang Q, Jin D, Wang L, Tang C, Zhang N, Lian F, Tong X. A Study on the Mechanism of Milkvetch Root in the Treatment of Diabetic Nephropathy Based on Network Pharmacology. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE : ECAM 2020; 2020:6754761. [PMID: 33178322 PMCID: PMC7648691 DOI: 10.1155/2020/6754761] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/18/2020] [Revised: 08/24/2020] [Accepted: 09/18/2020] [Indexed: 02/06/2023]
Abstract
Diabetic nephropathy (DN) is one of the most common complications of diabetes mellitus. Owing to its complicated pathogenesis, no satisfactory treatment strategies for DN are available. Milkvetch Root is a common traditional Chinese medicine (TCM) and has been extensively used to treat DN in clinical practice in China for many years. However, due to the complexity of botanical ingredients, the exact pharmacological mechanism of Milkvetch Root in treating DN has not been completely elucidated. The aim of this study was to explore the active components and potential mechanism of Milkvetch Root by using a systems pharmacology approach. First, the components and targets of Milkvetch Root were analyzed by using the Traditional Chinese Medicine Systems Pharmacology database. We identified the common targets of Milkvetch Root and DN, constructed a protein-protein interaction (PPI) network using STRING, and screened the key targets via topological analysis. Enrichment of Gene Ontology (GO) pathways and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was analyzed. Subsequently, major hubs were identified and imported to the Database for Annotation, Visualization and Integrated Discovery for pathway enrichment analysis. The binding activity and targets of the active components of Milkvetch Root were verified by using the molecular docking software SYBYL. Finally, we found 20 active components in Milkvetch Root. Moreover, the enrichment analysis of GO and KEGG pathways suggested that the AGE-RAGE signaling pathway, HIF-1 signaling pathway, PI3K-Akt signaling pathway, and TNF signaling pathway might be the key pathways for the treatment of DN; more importantly, 10 putative targets of Milkvetch Root (AKT1, VEGFA, IL-6, PPARG, CCL2, NOS3, SERPINE1, CRP, ICAM1, and SLC2A) were identified to be of great significance in regulating these biological processes and pathways. This study provides an important scientific basis for further elucidating the mechanism of Milkvetch Root in treating DN.
Affiliation(s)
- Chunli Piao
- Shenzhen Hospital, Guangzhou University of Chinese Medicine (Futian), Shenzhen 518000, Guangdong, China
| | - Qi Zhang
- Changchun University of Chinese Medicine, Changchun 130000, Jilin, China
| | - De Jin
- Guang'anmen Hospital, China Academy of Chinese Medical Science, Beijing 100000, China
| | - Li Wang
- Shenzhen Hospital, Guangzhou University of Chinese Medicine (Futian), Shenzhen 518000, Guangdong, China
| | - Cheng Tang
- Shenzhen Hospital, Guangzhou University of Chinese Medicine (Futian), Shenzhen 518000, Guangdong, China
| | - Naiwen Zhang
- Shenzhen Hospital, Guangzhou University of Chinese Medicine (Futian), Shenzhen 518000, Guangdong, China
| | - Fengmei Lian
- Guang'anmen Hospital, China Academy of Chinese Medical Science, Beijing 100000, China
| | - Xiaolin Tong
- Guang'anmen Hospital, China Academy of Chinese Medical Science, Beijing 100000, China
|
50
|
Smith ST, Meiler J. Assessing multiple score functions in Rosetta for drug discovery. PLoS One 2020; 15:e0240450. [PMID: 33044994 PMCID: PMC7549810 DOI: 10.1371/journal.pone.0240450] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/09/2020] [Accepted: 09/27/2020] [Indexed: 12/25/2022]
Abstract
Rosetta is a computational software suite containing algorithms for a wide variety of macromolecular structure prediction and design tasks, including small molecule protocols commonly used in drug discovery or enzyme design. Here, we benchmark RosettaLigand score functions and protocols in comparison to results of other software recently published in the Comparative Assessment of Score Functions (CASF-2016). The CASF-2016 benchmark covers a wide variety of tests including scoring and ranking multiple compounds against a target, ligand docking of a small molecule to a target, and virtual screening to extract binders from a compound library. Direct comparison to the score function results provided by CASF-2016 shows that the original RosettaLigand score function ranks among the top software for scoring, ranking, docking and screening tests. Most notably, the RosettaLigand score function ranked 2/34 among the score functions reported in CASF-2016. We additionally perform a ligand docking test with full sampling to mimic typical use cases. Despite improved performance of newer score functions in canonical protein structure prediction and design, we demonstrate here that more recent Rosetta score functions have reduced performance across all small molecule benchmarks. The tests described here have also been uploaded to the Rosetta scientific benchmarking server and will be run weekly to track performance as the code is continually being developed.
Affiliation(s)
- Shannon T. Smith
- Chemical and Physical Biology Program, Vanderbilt University, Nashville, Tennessee, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Jens Meiler
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Departments of Chemistry, Pharmacology, and Biomedical Informatics, Center for Structural Biology and Institute of Chemical Biology, Nashville, Tennessee, United States of America
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Städelschule, Germany
- * E-mail:
|