1
|
Zhang Y, Zhang Z, Ke D, Pan X, Wang X, Xiao X, Ji C. FragGrow: A Web Server for Structure-Based Drug Design by Fragment Growing within Constraints. J Chem Inf Model 2024; 64:3970-3976. [PMID: 38725251 DOI: 10.1021/acs.jcim.4c00154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/28/2024]
Abstract
Fragment growing is an important ligand design strategy in drug discovery. In this study, we present FragGrow, a web server that facilitates structure-based drug design by fragment growing. FragGrow offers two working modes: one for growing molecules through the direct replacement of hydrogen atoms or substructures and the other for growing via virtual synthesis. FragGrow works by searching for suitable fragments that meet a set of constraints from an indexed 3D fragment database and using them to create new compounds in 3D space. The users can set a range of constraints when searching for their desired fragment, including the fragment's ability to interact with specific protein sites; its size, topology, and physicochemical properties; and the presence of particular heteroatoms and functional groups within the fragment. We hope that FragGrow will serve as a useful tool for medicinal chemists in ligand design. The FragGrow server is freely available to researchers and can be accessed at https://fraggrow.xundrug.cn.
Collapse
Affiliation(s)
- Yueqing Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - Zhihan Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - Dongliang Ke
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - Xiaolin Pan
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - Xingyu Wang
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - Xudong Xiao
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
| | - Changge Ji
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| |
Collapse
|
2
|
Rayka M, Mirzaei M, Mohammad Latifi A. An ensemble-based approach to estimate confidence of predicted protein-ligand binding affinity values. Mol Inform 2024; 43:e202300292. [PMID: 38358080 DOI: 10.1002/minf.202300292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Revised: 01/22/2024] [Accepted: 02/02/2024] [Indexed: 02/16/2024]
Abstract
When designing a machine learning-based scoring function, we access a limited number of protein-ligand complexes with experimentally determined binding affinity values, representing only a fraction of all possible protein-ligand complexes. Consequently, it is crucial to report a measure of confidence and quantify the uncertainty in the model's predictions during test time. Here, we adopt the conformal prediction technique to evaluate the confidence of a prediction for each member of the core set of the CASF 2016 benchmark. The conformal prediction technique requires a diverse ensemble of predictors for uncertainty estimation. To this end, we introduce ENS-Score as an ensemble predictor, which includes 30 models with different protein-ligand representation approaches and achieves Pearson's correlation of 0.842 on the core set of the CASF 2016 benchmark. Also, we comprehensively investigate the residual error of each data point to assess the normality behavior of the distribution of the residual errors and their correlation to the structural features of the ligands, such as hydrophobic interactions and halogen bonding. In the end, we provide a local host web application to facilitate the usage of ENS-Score. All codes to repeat results are provided at https://github.com/miladrayka/ENS_Score.
Collapse
Affiliation(s)
- Milad Rayka
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| | - Morteza Mirzaei
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| | - Ali Mohammad Latifi
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| |
Collapse
|
3
|
Luo D, Liu D, Qu X, Dong L, Wang B. Enhancing Generalizability in Protein-Ligand Binding Affinity Prediction with Multimodal Contrastive Learning. J Chem Inf Model 2024; 64:1892-1906. [PMID: 38441880 DOI: 10.1021/acs.jcim.3c01961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/26/2024]
Abstract
Improving the generalization ability of scoring functions remains a major challenge in protein-ligand binding affinity prediction. Many machine learning methods are limited by their reliance on single-modal representations, hindering a comprehensive understanding of protein-ligand interactions. We introduce a graph-neural-network-based scoring function that utilizes a triplet contrastive learning loss to improve protein-ligand representations. In this model, three-dimensional complex representations and the fusion of two-dimensional ligand and coarse-grained pocket representations converge while distancing from decoy representations in latent space. After rigorous validation on multiple external data sets, our model exhibits commendable generalization capabilities compared to those of other deep learning-based scoring functions, marking it as a promising tool in the realm of drug discovery. In the future, our training framework can be extended to other biophysical- and biochemical-related problems such as protein-protein interaction and protein mutation prediction.
Collapse
Affiliation(s)
- Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Dandan Liu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Xiaoyang Qu
- School of Pharmacy and Medical Technology, Putian University, Putian 351100, P. R. China
- Key Laboratory of Pharmaceutical Analysis and Laboratory Medicine (Putian University), Fujian Province University, Putian 351100, P. R. China
| | - Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
| |
Collapse
|
4
|
Wang R, Feng H, Wei GW. ChatGPT in Drug Discovery: A Case Study on Anticocaine Addiction Drug Development with Chatbots. J Chem Inf Model 2023; 63:7189-7209. [PMID: 37956228 PMCID: PMC11021135 DOI: 10.1021/acs.jcim.3c01429] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
The birth of ChatGPT, a cutting-edge language model-based chatbot developed by OpenAI, ushered in a new era in AI. However, due to potential pitfalls, its role in rigorous scientific research is not clear yet. This paper vividly showcases its innovative application within the field of drug discovery. Focused specifically on developing anticocaine addiction drugs, the study employs GPT-4 as a virtual guide, offering strategic and methodological insights to researchers working on generative models for drug candidates. The primary objective is to generate optimal drug-like molecules with desired properties. By leveraging the capabilities of ChatGPT, the study introduces a novel approach to the drug discovery process. This symbiotic partnership between AI and researchers transforms how drug development is approached. Chatbots become facilitators, steering researchers toward innovative methodologies and productive paths for creating effective drug candidates. This research sheds light on the collaborative synergy between human expertise and AI assistance, wherein ChatGPT's cognitive abilities enhance the design and development of pharmaceutical solutions. This paper not only explores the integration of advanced AI in drug discovery but also reimagines the landscape by advocating for AI-powered chatbots as trailblazers in revolutionizing therapeutic innovation.
Collapse
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
| |
Collapse
|
5
|
Dong L, Shi S, Qu X, Luo D, Wang B. Ligand binding affinity prediction with fusion of graph neural networks and 3D structure-based complex graph. Phys Chem Chem Phys 2023; 25:24110-24120. [PMID: 37655493 DOI: 10.1039/d3cp03651k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/02/2023]
Abstract
Accurate prediction of protein-ligand binding affinity is pivotal for drug design and discovery. Here, we proposed a novel deep fusion graph neural networks framework named FGNN to learn the protein-ligand interactions from the 3D structures of protein-ligand complexes. Unlike 1D sequences for proteins or 2D graphs for ligands, the 3D graph of protein-ligand complex enables the more accurate representations of the protein-ligand interactions. Benchmark studies have shown that our fusion models FGNN can achieve more accurate prediction of binding affinity than any individual algorithm. The advantages of fusion strategies have been demonstrated in terms of expressive power of data, learning efficiency and model interpretability. Our fusion models show satisfactory performances on diverse data sets, demonstrating their generalization ability. Given the good performances in both binding affinity prediction and virtual screening, our fusion models are expected to be practically applied for drug screening and design. Our work highlights the potential of the fusion graph neural network algorithm in solving complex prediction problems in computational biology and chemistry. The fusion graph neural networks (FGNN) model is freely available in https://github.com/LinaDongXMU/FGNN.
Collapse
Affiliation(s)
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Shuai Shi
- Department of Algorithm, TuringQ Co., Ltd., Shanghai, 200240, China
| | - Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen, 361005, China
| |
Collapse
|
6
|
Shen L, Feng H, Qiu Y, Wei GW. SVSBI: sequence-based virtual screening of biomolecular interactions. Commun Biol 2023; 6:536. [PMID: 37202415 PMCID: PMC10195826 DOI: 10.1038/s42003-023-04866-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2023] [Accepted: 04/24/2023] [Indexed: 05/20/2023] Open
Abstract
Virtual screening (VS) is a critical technique in understanding biomolecular interactions, particularly in drug design and discovery. However, the accuracy of current VS models heavily relies on three-dimensional (3D) structures obtained through molecular docking, which is often unreliable due to the low accuracy. To address this issue, we introduce a sequence-based virtual screening (SVS) as another generation of VS models that utilize advanced natural language processing (NLP) algorithms and optimized deep K-embedding strategies to encode biomolecular interactions without relying on 3D structure-based docking. We demonstrate that SVS outperforms state-of-the-art performance for four regression datasets involving protein-ligand binding, protein-protein, protein-nucleic acid binding, and ligand inhibition of protein-protein interactions and five classification datasets for protein-protein interactions in five biological species. SVS has the potential to transform current practices in drug discovery and protein engineering.
Collapse
Affiliation(s)
- Li Shen
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.
| |
Collapse
|
7
|
Cui Z, Zhang N, Zhou T, Zhou X, Meng H, Yu Y, Zhang Z, Zhang Y, Wang W, Liu Y. Conserved Sites and Recognition Mechanisms of T1R1 and T2R14 Receptors Revealed by Ensemble Docking and Molecular Descriptors and Fingerprints Combined with Machine Learning. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2023; 71:5630-5645. [PMID: 37005743 DOI: 10.1021/acs.jafc.3c00591] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Taste peptides, as an important component of protein-rich foodstuffs, potentiate the nutrition and taste of food. Thereinto, umami- and bitter-taste peptides have been ex tensively reported, while their taste mechanisms remain unclear. Meanwhile, the identification of taste peptides is still a time-consuming and costly task. In this study, 489 peptides with umami/bitter taste from TPDB (http://tastepeptides-meta.com/) were collected and used to train the classification models based on docking analysis, molecular descriptors (MDs), and molecular fingerprints (FPs). A consensus model, taste peptide docking machine (TPDM), was generated based on five learning algorithms (linear regression, random forest, gaussian naive bayes, gradient boosting tree, and stochastic gradient descent) and four molecular representation schemes. Model interpretive analysis showed that MDs (VSA_EState, MinEstateIndex, MolLogP) and FPs (598, 322, 952) had the greatest impact on the umami/bitter prediction of peptides. Based on the consensus docking results, we obtained the key recognition modes of umami/bitter receptors (T1Rs/T2Rs): (1) residues 107S-109S, 148S-154T, 247F-249A mainly form hydrogen bonding contacts and (2) residues 153A-158L, 163L, 181Q, 218D, 247F-249A in T1R1 and 56D, 106P, 107V, 152V-156F, 173K-180F in T2R14 constituted their hydrogen bond pockets. The model is available at http://www.tastepeptides-meta.com/yyds.
Collapse
Affiliation(s)
- Zhiyong Cui
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ninglong Zhang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Tianxing Zhou
- Department of Bioinformatics, Faculty of Science, The University of Melbourne, Parkville 3010, Victoria, Australia
| | - Xueke Zhou
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Hengli Meng
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanyang Yu
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Zhiwei Zhang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yin Zhang
- Key Laboratory of Meat Processing of Sichuan, Chengdu University, Chengdu 610106, China
| | - Wenli Wang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yuan Liu
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
Collapse
|
8
|
Wang Z, Zheng L, Wang S, Lin M, Wang Z, Kong AWK, Mu Y, Wei Y, Li W. A fully differentiable ligand pose optimization framework guided by deep learning and a traditional scoring function. Brief Bioinform 2023; 24:6887112. [PMID: 36502369 DOI: 10.1093/bib/bbac520] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Revised: 10/17/2022] [Accepted: 10/31/2022] [Indexed: 12/14/2022] Open
Abstract
The recently reported machine learning- or deep learning-based scoring functions (SFs) have shown exciting performance in predicting protein-ligand binding affinities with fruitful application prospects. However, the differentiation between highly similar ligand conformations, including the native binding pose (the global energy minimum state), remains challenging that could greatly enhance the docking. In this work, we propose a fully differentiable, end-to-end framework for ligand pose optimization based on a hybrid SF called DeepRMSD+Vina combined with a multi-layer perceptron (DeepRMSD) and the traditional AutoDock Vina SF. The DeepRMSD+Vina, which combines (1) the root mean square deviation (RMSD) of the docking pose with respect to the native pose and (2) the AutoDock Vina score, is fully differentiable; thus is capable of optimizing the ligand binding pose to the energy-lowest conformation. Evaluated by the CASF-2016 docking power dataset, the DeepRMSD+Vina reaches a success rate of 94.4%, which outperforms most reported SFs to date. We evaluated the ligand conformation optimization framework in practical molecular docking scenarios (redocking and cross-docking tasks), revealing the high potentialities of this framework in drug design and discovery. Structural analysis shows that this framework has the ability to identify key physical interactions in protein-ligand binding, such as hydrogen-bonding. Our work provides a paradigm for optimizing ligand conformations based on deep learning algorithms. The DeepRMSD+Vina model and the optimization framework are available at GitHub repository https://github.com/zchwang/DeepRMSD-Vina_Optimization.
Collapse
Affiliation(s)
- Zechen Wang
- School of Physics, Shandong University, Jinan, Shandong 250100, China
| | - Liangzhen Zheng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China.,Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
| | - Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
| | - Mingzhi Lin
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
| | - Zhihao Wang
- School of Physics, Shandong University, Jinan, Shandong 250100, China
| | - Adams Wai-Kin Kong
- Rolls-Royce Corporate Lab, Nanyang Technological University, Singapore 637551, Singapore
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| | - Yanjie Wei
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
| | - Weifeng Li
- School of Physics, Shandong University, Jinan, Shandong 250100, China
| |
Collapse
|
9
|
Meli R, Morris GM, Biggin PC. Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review. FRONTIERS IN BIOINFORMATICS 2022; 2:885983. [PMID: 36187180 PMCID: PMC7613667 DOI: 10.3389/fbinf.2022.885983] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Accepted: 05/11/2022] [Indexed: 01/01/2023] Open
Abstract
The rapid and accurate in silico prediction of protein-ligand binding free energies or binding affinities has the potential to transform drug discovery. In recent years, there has been a rapid growth of interest in deep learning methods for the prediction of protein-ligand binding affinities based on the structural information of protein-ligand complexes. These structure-based scoring functions often obtain better results than classical scoring functions when applied within their applicability domain. Here we review structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.
Collapse
Affiliation(s)
- Rocco Meli
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
| | - Garrett M. Morris
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Philip C. Biggin
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
| |
Collapse
|