1
Dai J, Zhou Z, Zhao Y, Kong F, Zhai Z, Zhu Z, Cai J, Huang S, Xu Y, Sun T. Combined usage of ligand- and structure-based virtual screening in the artificial intelligence era. Eur J Med Chem 2025; 283:117162. [PMID: 39673863 DOI: 10.1016/j.ejmech.2024.117162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 11/27/2024] [Accepted: 12/09/2024] [Indexed: 12/16/2024]
Abstract
Drug design has always been pursuing techniques with time- and cost-benefits. Virtual screening, generally classified as ligand-based (LBVS) and structure-based (SBVS) approaches, could identify active compounds in the large chemical library to reduce time and cost. Owing to the intrinsic flaws and complementary nature of both approaches, continued efforts have been made to combine them to mitigate limitations. Meanwhile, the emergence of machine learning (ML) endows them with opportunities to leverage vast amounts of data to improve their defects. However, few discussions on how to merge ML-improved LBVS and SBVS have been conducted. Therefore, this review provides insights into combined usage of ML-improved LBVS and SBVS to enlighten medicinal chemists to utilize these joint strategies to lift the screening efficiency as well as AI professionals to design novel techniques.
Affiliation(s)
- Jingyi Dai
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Ziyi Zhou
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Yanru Zhao
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Fanjing Kong
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Zhenwei Zhai
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Zhishan Zhu
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Jie Cai
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Sha Huang
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
- Ying Xu
  - Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 610072, Sichuan, China.
- Tao Sun
  - School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China; State Key Laboratory of Southwestern Chinese Medicine Resources, School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, Sichuan, China.
2
Jiang YY, Yan ST, Zhang SZ, Wang M, Diao WM, Li J, Fang XM, Yin H. Discovery of pyrazolo[1,5-a]pyrimidine derivatives targeting TLR4-TLR4∗ homodimerization via AI-powered next-generation screening. Eur J Med Chem 2024; 280:116945. [PMID: 39388907 DOI: 10.1016/j.ejmech.2024.116945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 09/28/2024] [Accepted: 10/04/2024] [Indexed: 10/12/2024]
Abstract
TLR4 signaling is instrumental in orchestrating multiple aspects of innate immunity. Developing small molecule inhibitors targeting the TLR4 pathway holds potential therapeutic promise for TLR4-related disorders. Herein, an artificial intelligence (AI)-powered next-generation screening approach, employing HelixVS and HelixDock, was utilized to focus on the TLR4-TLR4∗ (a second copy of TLR4) homodimerization surface, leading to the identification of a potent pyrazolo[1,5-a]pyrimidine derivative, designated as compound 1. An extensive structure-activity relationship (SAR) exploration culminated in the discovery of the lead compound TH023, which effectively blocked the LPS-stimulated NF-κB activation and nitric oxide overproduction in HEK-Blue hTLR4 and RAW264.7 cells, with IC50 values of 0.354 and 1.61 μM, respectively. Molecular dynamic (MD) simulations indicated that TH023 stabilized TLR4-MD-2 and disrupted its association with TLR4∗. Moreover, TH023 alleviated the lung injury and decreased pro-inflammatory cytokine levels in LPS-induced septic mice. These findings not only illuminated the strategic advantage of HelixDock in advancing the frontiers of AI-driven drug discovery, but also provided valuable structural insights for the rational design of TLR4-TLR4∗ protein-protein interaction (PPI) inhibitors based on the pyrazolo[1,5-a]pyrimidine scaffold. Overall, this study validated a new strategy for TLR4 signaling regulation by targeting its dimerization, thereby underscoring the therapeutic promise of TH023 in treating TLR4-mediated inflammatory diseases.
Affiliation(s)
- Yao-Yao Jiang
  - State Key Laboratory of Membrane Biology, School of Pharmaceutical Sciences, Tsinghua-Peking Center for Life Sciences, Key Laboratory of Bioorganic Phosphorous Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, 100084, China
- Shuai-Ting Yan
  - State Key Laboratory of Membrane Biology, School of Pharmaceutical Sciences, Tsinghua-Peking Center for Life Sciences, Key Laboratory of Bioorganic Phosphorous Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, 100084, China
- Meng Wang
  - Toll Biotech Co., Ltd. (Beijing), Beijing, 102209, China
- Wei-Ming Diao
  - Toll Biotech Co., Ltd. (Beijing), Beijing, 102209, China
- Jun Li
  - PaddleHelix Team, Baidu Inc., Shenzhen, 518000, China
- Xiao-Min Fang
  - PaddleHelix Team, Baidu Inc., Shenzhen, 518000, China
- Hang Yin
  - State Key Laboratory of Membrane Biology, School of Pharmaceutical Sciences, Tsinghua-Peking Center for Life Sciences, Key Laboratory of Bioorganic Phosphorous Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, 100084, China.
3
Carpenter KA, Altman RB. Databases of ligand-binding pockets and protein-ligand interactions. Comput Struct Biotechnol J 2024; 23:1320-1338. [PMID: 38585646 PMCID: PMC10997877 DOI: 10.1016/j.csbj.2024.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 03/16/2024] [Accepted: 03/17/2024] [Indexed: 04/09/2024] Open
Abstract
Many research groups and institutions have created a variety of databases curating experimental and predicted data related to protein-ligand binding. The landscape of available databases is dynamic, with new databases emerging and established databases becoming defunct. Here, we review the current state of databases that contain binding pockets and protein-ligand binding interactions. We have compiled a list of such databases, fifty-three of which are currently available for use. We discuss variation in how binding pockets are defined and summarize pocket-finding methods. We organize the fifty-three databases into subgroups based on goals and contents, and describe standard use cases. We also illustrate that pockets within the same protein are characterized differently across different databases. Finally, we assess critical issues of sustainability, accessibility and redundancy.
Affiliation(s)
- Kristy A. Carpenter
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Medicine, Stanford University, Stanford, CA 94305, USA
4
Vittorio S, Lunghini F, Morerio P, Gadioli D, Orlandini S, Silva P, Martinovic J, Pedretti A, Bonanni D, Del Bue A, Palermo G, Vistoli G, Beccari AR. Addressing docking pose selection with structure-based deep learning: Recent advances, challenges and opportunities. Comput Struct Biotechnol J 2024; 23:2141-2151. [PMID: 38827235 PMCID: PMC11141151 DOI: 10.1016/j.csbj.2024.05.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/15/2024] [Accepted: 05/15/2024] [Indexed: 06/04/2024] Open
Abstract
Molecular docking is a widely used technique in drug discovery to predict the binding mode of a given ligand to its target. However, the identification of the near-native binding pose in docking experiments still represents a challenging task as the scoring functions currently employed by docking programs are parametrized to predict the binding affinity, and, therefore, they often fail to correctly identify the ligand native binding conformation. Selecting the correct binding mode is crucial to obtaining meaningful results and to conveniently optimizing new hit compounds. Deep learning (DL) algorithms have been an area of a growing interest in this sense for their capability to extract the relevant information directly from the protein-ligand structure. Our review aims to present the recent advances regarding the development of DL-based pose selection approaches, discussing limitations and possible future directions. Moreover, a comparison between the performances of some classical scoring functions and DL-based methods concerning their ability to select the correct binding mode is reported. In this regard, two novel DL-based pose selectors developed by us are presented.
Affiliation(s)
- Serena Vittorio
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Filippo Lunghini
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123 Naples, Italy
- Pietro Morerio
  - Pattern Analysis and Computer Vision, Fondazione Istituto Italiano di Tecnologia, Via Morego, 30, 16163 Genova, Italy
- Davide Gadioli
  - Dipartimento di Elettronica Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, I-20133 Milano, Italy
- Sergio Orlandini
  - SCAI, SuperComputing Applications and Innovation Department, CINECA, Via dei Tizii 6, Rome 00185, Italy
- Paulo Silva
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic
- Jan Martinovic
  - IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic
- Alessandro Pedretti
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Domenico Bonanni
  - Department of Physical and Chemical Sciences, University of L'Aquila, via Vetoio, L'Aquila 67010, Italy
- Alessio Del Bue
  - Pattern Analysis and Computer Vision, Fondazione Istituto Italiano di Tecnologia, Via Morego, 30, 16163 Genova, Italy
- Gianluca Palermo
  - Dipartimento di Elettronica Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, I-20133 Milano, Italy
- Giulio Vistoli
  - Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy
- Andrea R. Beccari
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123 Naples, Italy
5
Cao D, Chen M, Zhang R, Wang Z, Huang M, Yu J, Jiang X, Fan Z, Zhang W, Zhou H, Li X, Fu Z, Zhang S, Zheng M. SurfDock is a surface-informed diffusion generative model for reliable and accurate protein-ligand complex prediction. Nat Methods 2024:10.1038/s41592-024-02516-y. [PMID: 39604569 DOI: 10.1038/s41592-024-02516-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 10/16/2024] [Indexed: 11/29/2024]
Abstract
Accurately predicting protein-ligand interactions is crucial for understanding cellular processes. We introduce SurfDock, a deep-learning method that addresses this challenge by integrating protein sequence, three-dimensional structural graphs and surface-level features into an equivariant architecture. SurfDock employs a generative diffusion model on a non-Euclidean manifold, optimizing molecular translations, rotations and torsions to generate reliable binding poses. Our extensive evaluations across various benchmarks demonstrate SurfDock's superiority over existing methods in docking success rates and adherence to physical constraints. It also exhibits remarkable generalizability to unseen proteins and predicted apo structures, while achieving state-of-the-art performance in virtual screening tasks. In a real-world application, SurfDock identified seven novel hit molecules in a virtual screening project targeting aldehyde dehydrogenase 1B1, a key enzyme in cellular metabolism. This showcases SurfDock's ability to elucidate molecular mechanisms underlying cellular processes. These results highlight SurfDock's potential as a transformative tool in structural biology, offering enhanced accuracy, physical plausibility and practical applicability in understanding protein-ligand interactions.
Affiliation(s)
- Duanhua Cao
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Mingan Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
  - Lingang Laboratory, Shanghai, China
- Runze Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhaokun Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Manlin Huang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - Nanchang University, Nanchang, China
- Jie Yu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - Lingang Laboratory, Shanghai, China
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Xinyu Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhehuan Fan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Wei Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Hao Zhou
  - Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Zunyun Fu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Sulin Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
  - University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China.
  - University of Chinese Academy of Sciences, Beijing, China.
6
Seo S, Kim WY. PharmacoNet: deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening. Chem Sci 2024; 15:19473-19487. [PMID: 39568882 PMCID: PMC11575537 DOI: 10.1039/d4sc04854g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Accepted: 11/03/2024] [Indexed: 11/22/2024] Open
Abstract
As ultra-large-scale virtual screening becomes critical for early-stage drug discovery, highly efficient screening methods are gaining prominence. Deep-learning-based approaches which directly estimate binding affinities without binding conformation have attracted great attention as an alternative solution to molecular docking, but the generalization capability of existing methods in vast chemical space remains uncertain due to restricted training data. Here, we introduce PharmacoNet, the first deep-learning framework for pharmacophore modeling toward ultra-fast virtual screening. PharmacoNet offers fully automated protein-based pharmacophore modeling and evaluates the potency of ligands with a parameterized analytical scoring function, ensuring high generalization ability across unseen targets and ligands. Our benchmark study shows that PharmacoNet is extremely fast yet reasonably accurate compared to traditional docking methods and existing deep learning-based scoring models. We successfully identified selective inhibitors from 187 million compounds against cannabinoid receptors within 21 hours on a single CPU. This study uncovers the hitherto untapped potential of deep learning in pharmacophore modeling.
Affiliation(s)
- Seonghwan Seo
  - Department of Chemistry, KAIST 291 Daehak-ro, Yuseong-gu Daejeon 34141 Republic of Korea
- Woo Youn Kim
  - Department of Chemistry, KAIST 291 Daehak-ro, Yuseong-gu Daejeon 34141 Republic of Korea
  - Graduate School of Data Science, KAIST 291 Daehak-ro, Yuseong-gu Daejeon 34141 Republic of Korea
  - HITS Inc. 28 Teheran-ro 4-gil, Gangnam-gu Seoul 06234 Republic of Korea
7
Dong C, Huang YP, Lin X, Zhang H, Gao YQ. DSDPFlex: Flexible-Receptor Docking with GPU Acceleration. J Chem Inf Model 2024; 64:8537-8548. [PMID: 39514506 DOI: 10.1021/acs.jcim.4c01715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Molecular docking is an essential tool in structure-based drug discovery, widely utilized to model ligand-protein interactions and enrich potential hits. Among the different docking strategies, semiflexible docking (rigid-receptor and flexible-ligand model) is the most popular, benefiting from its balance of docking accuracy and speed. However, this approach ignores the conformational changes of proteins and hence demands suitable protein conformations as input. When the binding interaction adheres to an induced-fit model, flexible methods such as molecular dynamics simulation can be utilized, but they are computationally demanding. To balance between speed and accuracy, the flexible docking approach is an effective choice, as exemplified by AutoDock Vina and AutoDockFR, which treat selected protein side chains as flexible parts. However, the efficiency of flexible docking methods is yet to be improved for virtual screening usage. In this article, we introduce DSDPFlex, an improved flexible-receptor docking method accelerated by GPU parallelization. Beyond acceleration, optimizations with respect to sampling, scoring, and search space are implemented in DSDPFlex to further improve its capability in flexible tasks. In cross-docking evaluation, DSDPFlex demonstrates superior accuracy compared to AutoDock Vina and is 100 times faster than Vina in flexible-receptor tasks. We also show the advantage of flexible-receptor methods on suboptimal pockets and validate the advantage of DSDPFlex in screening on apo and AlphaFold2-predicted structures. With improvements in both efficiency and accuracy, DSDPFlex is expected to hold potential in future docking-based studies.
Affiliation(s)
- Chengwei Dong
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yu-Peng Huang
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Xiaohan Lin
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Hong Zhang
  - Changping Laboratory, Yard 28, Science Park Road, Changping District, Beijing 102200, China
- Yi Qin Gao
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Changping Laboratory, Yard 28, Science Park Road, Changping District, Beijing 102200, China
  - Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
8
Andrianov GV, Haroldsen E, Karanicolas J. vScreenML v2.0: Improved Machine Learning Classification for Reducing False Positives in Structure-Based Virtual Screening. Int J Mol Sci 2024; 25:12350. [PMID: 39596415 PMCID: PMC11595162 DOI: 10.3390/ijms252212350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Revised: 11/08/2024] [Accepted: 11/11/2024] [Indexed: 11/28/2024] Open
Abstract
The enthusiastic adoption of make-on-demand chemical libraries for virtual screening has highlighted the need for methods that deliver improved hit-finding discovery rates. Traditional virtual screening methods are often inaccurate, with most compounds nominated in a virtual screen not engaging the intended target protein to any detectable extent. Emerging machine learning approaches have made significant progress in this regard, including our previously described tool vScreenML. The broad adoption of vScreenML was hindered by its challenging usability and dependencies on certain obsolete or proprietary software packages. Here, we introduce vScreenML 2.0 to address each of these limitations with a streamlined Python implementation. Through careful benchmarks, we show that vScreenML 2.0 outperforms other widely used tools for virtual screening hit discovery.
Affiliation(s)
- Grigorii V. Andrianov
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111, USA
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420008, Russia
- Emeline Haroldsen
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111, USA
- John Karanicolas
  - Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111, USA
  - Moulder Center for Drug Discovery Research, Temple University School of Pharmacy, Philadelphia, PA 19140, USA
9
Li G, Yuan Y, Zhang R. Predicting Protein-Ligand Binding Affinity Using Fusion Model of Spatial-Temporal Graph Neural Network and 3D Structure-Based Complex Graph. Interdiscip Sci 2024:10.1007/s12539-024-00644-9. [PMID: 39541085 DOI: 10.1007/s12539-024-00644-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 07/09/2024] [Accepted: 07/16/2024] [Indexed: 11/16/2024]
Abstract
The investigation of molecular interactions between ligands and their target molecules is becoming more significant as protein structure data continues to develop. In this study, we introduce PLA-STGCNnet, a deep fusion spatial-temporal graph neural network designed to study protein-ligand interactions based on the 3D structural data of protein-ligand complexes. Unlike 1D protein sequences or 2D ligand graphs, the 3D graph representation offers a more precise portrayal of the complex interactions between proteins and ligands. Research studies have shown that our fusion model, PLA-STGCNnet, outperforms individual algorithms in accurately predicting binding affinity. The advantage of a fusion model is the ability to fully combine the advantages of multiple different models and improve overall performance by combining their features and outputs. Our fusion model shows satisfactory performance on different data sets, which proves its generalization ability and stability. The fusion-based model showed good performance in protein-ligand affinity prediction, and we successfully applied the model to drug screening. Our research underscores the promise of fusion spatial-temporal graph neural networks in addressing complex challenges in protein-ligand affinity prediction. The Python scripts for implementing various model components are accessible at https://github.com/ligaili01/PLA-STGCN.
Affiliation(s)
- Gaili Li
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China
- Yongna Yuan
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China.
- Ruisheng Zhang
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, 730000, China.
10
Hong Y, Ha J, Sim J, Lim CJ, Oh KS, Chandrasekaran R, Kim B, Choi J, Ko J, Shin WH, Lee J. Accurate prediction of protein-ligand interactions by combining physical energy functions and graph-neural networks. J Cheminform 2024; 16:121. [PMID: 39497201 PMCID: PMC11536843 DOI: 10.1186/s13321-024-00912-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 10/07/2024] [Indexed: 11/07/2024] Open
Abstract
We introduce an advanced model for predicting protein-ligand interactions. Our approach combines the strengths of graph neural networks with physics-based scoring methods. Existing structure-based machine-learning models for protein-ligand binding prediction often fall short in practical virtual screening scenarios, hindered by the intricacies of binding poses, the chemical diversity of drug-like molecules, and the scarcity of crystallographic data for protein-ligand complexes. To overcome the limitations of existing machine learning-based prediction models, we propose a novel approach that fuses three independent neural network models. One classification model is designed to perform binary prediction of a given protein-ligand complex pose. The other two regression models are trained to predict the binding affinity and root-mean-square deviation of a ligand conformation from an input complex structure. We trained the model to account for both deviations in experimental and predicted binding affinities and pose prediction uncertainties. By effectively integrating the outputs of the triplet neural networks with a physics-based scoring function, our model showed a significantly improved performance in hit identification. The benchmark results with three independent decoy sets demonstrate that our model outperformed existing models in forward screening. Our model achieved top 1% enrichment factors of 32.7 and 23.1 with the CASF2016 and DUD-E benchmark sets, respectively. The benchmark results using the LIT-PCBA set further confirmed its higher average enrichment factors, emphasizing the model's efficiency and generalizability. 
The model's efficiency was further validated by identifying 23 active compounds from 63 candidates in experimental screening for autotaxin inhibitors, demonstrating its practical applicability in hit discovery.
Scientific contribution: Our work introduces a novel training strategy for a protein-ligand binding affinity prediction model by integrating the outputs of three independent sub-models and utilizing expertly crafted decoy sets. The model showcases exceptional performance across multiple benchmarks. The high enrichment factors in the LIT-PCBA benchmark demonstrate its potential to accelerate hit discovery.
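The abstract reports top 1% enrichment factors but does not say how they were computed. As a rough illustration only, the commonly used definition divides the hit rate among the top-ranked fraction of a library by the hit rate of the whole library. The sketch below assumes that definition, assumes a higher score means a compound is more likely active, and uses a hypothetical function name (`enrichment_factor`); it is not the authors' evaluation code.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a top fraction (assumed common definition).

    scores: predicted scores, higher = more likely active (assumption).
    labels: 1 for a known active, 0 for a decoy/inactive.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    n_total = len(scores)
    # Size of the top-ranked subset; at least one compound.
    n_top = max(1, int(round(n_total * fraction)))
    # Rank compounds from best to worst score (ties keep input order).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    hits_total = sum(labels)
    if hits_total == 0:
        raise ValueError("at least one active is required")
    # Hit rate in the top subset relative to the library-wide hit rate.
    return (hits_top / n_top) / (hits_total / n_total)


if __name__ == "__main__":
    # Toy library: 1000 compounds, 10 actives, 5 of them ranked in the top 10.
    toy_scores = list(range(1000, 0, -1))
    toy_labels = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 985
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.01))
```

Under this definition, a value of 1 corresponds to random ranking, and the maximum attainable value depends on the ratio of actives to decoys in the benchmark set, so values from different benchmarks are not directly comparable.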
Affiliation(s)
- Yiyu Hong
  - Arontier Co., 241, Gangnam-daero, Seocho-gu, Seoul, 06735, Republic of Korea
- Junsu Ha
  - Arontier Co., 241, Gangnam-daero, Seocho-gu, Seoul, 06735, Republic of Korea
- Jaemin Sim
  - Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, 08826, Republic of Korea
- Chae Jo Lim
  - Data Convergence Drug Research Center, Korea Research Institute of Chemical Technology, Daejeon, 34114, Republic of Korea
- Kwang-Seok Oh
  - Data Convergence Drug Research Center, Korea Research Institute of Chemical Technology, Daejeon, 34114, Republic of Korea
- Bomin Kim
  - College of Pharmacy, Seoul National University, Seoul, 08826, Republic of Korea
- Jieun Choi
  - College of Pharmacy, Seoul National University, Seoul, 08826, Republic of Korea
- Junsu Ko
  - Arontier Co., 241, Gangnam-daero, Seocho-gu, Seoul, 06735, Republic of Korea.
- Woong-Hee Shin
  - Arontier Co., 241, Gangnam-daero, Seocho-gu, Seoul, 06735, Republic of Korea.
  - Department of Medicine, Korea University College of Medicine, Seoul, 02841, Republic of Korea.
- Juyong Lee
  - Arontier Co., 241, Gangnam-daero, Seocho-gu, Seoul, 06735, Republic of Korea.
  - Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, 08826, Republic of Korea.
  - Research Institute of Pharmaceutical Science, College of Pharmacy, Seoul National University, Seoul, 08826, Republic of Korea.
  - College of Pharmacy, Seoul National University, Seoul, 08826, Republic of Korea.
11
Hu Q, Wang Z, Meng J, Li W, Guo J, Mu Y, Wang S, Zheng L, Wei Y. OpenDock: a pytorch-based open-source framework for protein-ligand docking and modelling. Bioinformatics 2024; 40:btae628. [PMID: 39432683 PMCID: PMC11552628 DOI: 10.1093/bioinformatics/btae628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 09/19/2024] [Accepted: 10/19/2024] [Indexed: 10/23/2024] Open
Abstract
MOTIVATION: Molecular docking is an invaluable computational tool with broad applications in computer-aided drug design and enzyme engineering. However, current molecular docking tools are typically implemented in languages such as C++ for calculation speed, which lack flexibility and user-friendliness for further development. Moreover, validating the effectiveness of external scoring functions for molecular docking and screening within these frameworks is challenging, and implementing more efficient sampling strategies is not straightforward. RESULTS: To address these limitations, we have developed an open-source molecular docking framework, OpenDock, based on Python and PyTorch. This framework supports the integration of multiple scoring functions; some can be utilized during molecular docking and pose optimization, while others can be used for post-processing scoring. In terms of sampling, the current version of this framework supports simulated annealing and Monte Carlo optimization. Additionally, it can be extended to include methods such as genetic algorithms and particle swarm optimization for sampling docking poses and protein side chain orientations. Distance constraints are also implemented to enable covalent docking, restricted docking or distance map constraints guided pose sampling. Overall, this framework serves as a valuable tool in drug design and enzyme engineering, offering significant flexibility for most protein-ligand modelling tasks. AVAILABILITY AND IMPLEMENTATION: OpenDock is publicly available at: https://github.com/guyuehuo/opendock.
Affiliation(s)
- Qiuyue Hu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zechen Wang
- School of Physics, Shandong University, Jinan, 250100, China
| | - Jintao Meng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
| | - Weifeng Li
- School of Physics, Shandong University, Jinan, 250100, China
| | - Jingjing Guo
- Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, 999078, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| | - Sheng Wang
- Shanghai Zelixir Biotech Co. Ltd, Shanghai, 201203, China
| | | | - Yanjie Wei
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518000, China
|
12
|
Luo Q, Wang S, Li HY, Zheng L, Mu Y, Guo J. Benchmarking reverse docking through AlphaFold2 human proteome. Protein Sci 2024; 33:e5167. [PMID: 39276010 PMCID: PMC11400627 DOI: 10.1002/pro.5167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 08/21/2024] [Accepted: 08/24/2024] [Indexed: 09/16/2024]
Abstract
Predicting the binding of ligands to the human proteome via reverse-docking methods enables the understanding of a ligand's interactions with potential protein targets in the human body, thereby facilitating drug repositioning and the evaluation of potential off-target effects or toxic side effects of drugs. In this study, we constructed 11 reverse docking pipelines by integrating site prediction tools (PointSite and SiteMap), docking programs (Glide and AutoDock Vina), and scoring functions (Glide, AutoDock Vina, RTMScore, DeepRMSD, and OnionNet-SFCT), and then thoroughly benchmarked their predictive capabilities. The results show that the Glide_SFCT (PS) pipeline exhibited the best target prediction performance based on the atomic structure models in the AlphaFold2 human proteome. It achieved a success rate of 27.8% when considering the top 100 ranked predictions. This pipeline effectively narrows the range of potential targets within the human proteome, laying a foundation for drug target prediction, off-target assessment, and toxicity prediction, ultimately boosting drug development. By facilitating these critical aspects of drug discovery and development, our work has the potential to ultimately accelerate the identification of new therapeutic agents and improve drug safety.
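The reported figure is a top-100 success rate. One common way to compute such a metric is sketched below, under the assumption that a ligand counts as a success when its known target appears among the first k ranked proteins. The abstract does not spell out the exact definition used in the paper, and the function name and toy data here are hypothetical.

```python
def top_k_success_rate(ranked_targets_by_ligand, true_target_by_ligand, k=100):
    """Fraction of ligands whose known target is within the top-k ranked proteins.

    ranked_targets_by_ligand: dict mapping ligand id -> list of protein ids,
        best-scoring first.
    true_target_by_ligand: dict mapping ligand id -> its known protein id.
    """
    hits = 0
    for ligand, ranked in ranked_targets_by_ligand.items():
        if true_target_by_ligand[ligand] in ranked[:k]:
            hits += 1
    return hits / len(ranked_targets_by_ligand)


# Toy data (invented for illustration; not from the benchmark).
toy_rankings = {
    "lig_a": ["P1", "P2", "P3"],
    "lig_b": ["P3", "P1", "P2"],
    "lig_c": ["P2", "P3", "P1"],
    "lig_d": ["P1", "P3", "P2"],
}
toy_truth = {"lig_a": "P2", "lig_b": "P2", "lig_c": "P2", "lig_d": "P3"}

print(top_k_success_rate(toy_rankings, toy_truth, k=2))  # 0.75
```

With a proteome-scale ranking, k = 100 means the true target only has to land in a short list that is small relative to the number of candidate proteins, which is why the metric is described as narrowing the range of potential targets.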
Affiliation(s)
- Qing Luo
- Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
| | - Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., China
| | - Hoi Yeung Li
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Liangzhen Zheng
- Shenzhen Zelixir Biotech Company Ltd., China
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Jingjing Guo
- Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
|
13
|
Lam HYI, Guan JS, Ong XE, Pincket R, Mu Y. Protein language models are performant in structure-free virtual screening. Brief Bioinform 2024; 25:bbae480. [PMID: 39327890 PMCID: PMC11427677 DOI: 10.1093/bib/bbae480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 08/17/2024] [Accepted: 09/12/2024] [Indexed: 09/28/2024] Open
Abstract
Hitherto, virtual screening (VS) has typically been performed using a structure-based drug design paradigm. Such methods typically require the use of molecular docking on high-resolution three-dimensional structures of a target protein, a computationally intensive and time-consuming exercise. This work demonstrates that by employing protein language models and molecular graphs as inputs to a novel graph-to-transformer cross-attention mechanism, a screening power comparable to state-of-the-art structure-based models can be achieved. The implications thereof include highly expedited VS due to the greatly reduced compute required to run this model, and the ability to perform early stages of computer-aided drug design in the complete absence of 3D protein structures.
Affiliation(s)
- Hilbert Yuen In Lam
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Dr, Singapore 637551, Singapore, Republic of Singapore
- MagMol Pte. Ltd., 68 Circular Road, #02-01, Singapore 049422, Singapore, Republic of Singapore
| | - Jia Sheng Guan
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Dr, Singapore 637551, Singapore, Republic of Singapore
| | - Xing Er Ong
- MagMol Pte. Ltd., 68 Circular Road, #02-01, Singapore 049422, Singapore, Republic of Singapore
| | - Robbe Pincket
- Heliovision, Asstraat 5, 3000 Leuven, Leuven, Kingdom of Belgium
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Dr, Singapore 637551, Singapore, Republic of Singapore
- MagMol Pte. Ltd., 68 Circular Road, #02-01, Singapore 049422, Singapore, Republic of Singapore
|
14
|
Zhou G, Rusnac DV, Park H, Canzani D, Nguyen HM, Stewart L, Bush MF, Nguyen PT, Wulff H, Yarov-Yarovoy V, Zheng N, DiMaio F. An artificial intelligence accelerated virtual screening platform for drug discovery. Nat Commun 2024; 15:7761. [PMID: 39237523 PMCID: PMC11377542 DOI: 10.1038/s41467-024-52061-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 08/23/2024] [Indexed: 09/07/2024] Open
Abstract
Structure-based virtual screening is a key tool in early drug discovery, with growing interest in the screening of multi-billion chemical compound libraries. However, the success of virtual screening crucially depends on the accuracy of the binding pose and binding affinity predicted by computational docking. Here we develop a highly accurate structure-based virtual screen method, RosettaVS, for predicting docking poses and binding affinities. Our approach outperforms other state-of-the-art methods on a wide range of benchmarks, partially due to our ability to model receptor flexibility. We incorporate this into a new open-source artificial intelligence accelerated virtual screening platform for drug discovery. Using this platform, we screen multi-billion compound libraries against two unrelated targets, a ubiquitin ligase target KLHDC2 and the human voltage-gated sodium channel NaV1.7. For both targets, we discover hit compounds, including seven hits (14% hit rate) to KLHDC2 and four hits (44% hit rate) to NaV1.7, all with single digit micromolar binding affinities. Screening in both cases is completed in less than seven days. Finally, a high resolution X-ray crystallographic structure validates the predicted docking pose for the KLHDC2 ligand complex, demonstrating the effectiveness of our method in lead discovery.
Affiliation(s)
- Guangfeng Zhou
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Domnita-Valeria Rusnac
- Howard Hughes Medical Institute, Department of Pharmacology, University of Washington, Seattle, WA, USA
| | - Hahnbeom Park
- Brain Science Institute, Korea Institute of Science and Technology, Seoul, Republic of Korea
- KIST-SKKU Brain Research Center, SKKU Institute for Convergence, Sungkyunkwan University, Suwon, Republic of Korea
| | - Daniele Canzani
- Department of Chemistry, University of Washington, Seattle, WA, USA
| | - Hai Minh Nguyen
- Department of Pharmacology, University of California Davis, Davis, CA, USA
| | - Lance Stewart
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Matthew F Bush
- Department of Chemistry, University of Washington, Seattle, WA, USA
| | - Phuong Tran Nguyen
- Department of Physiology and Membrane Biology, University of California Davis, Davis, CA, USA
| | - Heike Wulff
- Department of Pharmacology, University of California Davis, Davis, CA, USA
| | - Vladimir Yarov-Yarovoy
- Department of Physiology and Membrane Biology, University of California Davis, Davis, CA, USA
- Department of Anesthesiology and Pain Medicine, University of California Davis, Sacramento, CA, USA
| | - Ning Zheng
- Howard Hughes Medical Institute, Department of Pharmacology, University of Washington, Seattle, WA, USA.
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
|
15
|
Chen J, Yang X, Wu H. A Multibranch Neural Network for Drug-Target Affinity Prediction Using Similarity Information. ACS Omega 2024; 9:35978-35989. [PMID: 39184467 PMCID: PMC11339836 DOI: 10.1021/acsomega.4c05607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2024] [Revised: 08/03/2024] [Accepted: 08/06/2024] [Indexed: 08/27/2024]
Abstract
Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. In recent years, graph structure-based deep learning models have garnered significant attention in this field. However, these models typically handle drug or target protein in isolation and only extract the molecular structure information on the drug or protein itself. To address this limitation, existing network-based models represent drug-target interactions or affinities as a knowledge graph to capture the interaction information. In this study, we propose a novel solution. Specifically, we introduce drug similarity information and protein similarity information into the field of DTA prediction. Moreover, we propose a network framework that autonomously extracts similarity information, avoiding reliance on knowledge graphs. Based on this framework, we design a multibranch neural network called GASI-DTA. This network integrates similarity information, sequence information, and molecular structure information. Comprehensive experimental results conducted on two benchmark data sets and three cold-start scenarios demonstrate that our model outperforms state-of-the-art graph structure-based methods in nearly all metrics. Furthermore, it exhibits significant advantages over existing network-based models, outperforming the best of them in the majority of metrics. Our study's code and data are openly accessible at http://github.com/XiaoLin-Yang-S/GASI-DTA.
Affiliation(s)
- Jing Chen
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computing Intelligence, Jiangnan University, Wuxi 214122, China
| | - Xiaolin Yang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
| | - Haoyu Wu
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
|
16
|
Zhang WY, Zheng XL, Coghi PS, Chen JH, Dong BJ, Fan XX. Revolutionizing adjuvant development: harnessing AI for next-generation cancer vaccines. Front Immunol 2024; 15:1438030. [PMID: 39206192 PMCID: PMC11349682 DOI: 10.3389/fimmu.2024.1438030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024] Open
Abstract
With the COVID-19 pandemic, the importance of vaccines has been widely recognized and has led to increased research and development efforts. Vaccines also play a crucial role in cancer treatment by activating the immune system to target and destroy cancer cells. However, enhancing the efficacy of cancer vaccines remains a challenge. Adjuvants, which enhance the immune response to antigens and improve vaccine effectiveness, have faced limitations in recent years, resulting in few novel adjuvants being identified. The advancement of artificial intelligence (AI) technology in drug development has provided a foundation for adjuvant screening and application, leading to a diversification of adjuvants. This article reviews the significant role of tumor vaccines in basic research and clinical treatment and explores the use of AI technology to screen novel adjuvants from databases. The findings of this review offer valuable insights for the development of new adjuvants for next-generation vaccines.
Affiliation(s)
- Wan-Ying Zhang
- Dr. Neher’s Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, Macao SAR, China
| | - Xiao-Li Zheng
- Dr. Neher’s Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, Macao SAR, China
| | - Paolo Saul Coghi
- Dr. Neher’s Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, Macao SAR, China
| | - Jun-Hui Chen
- Intervention and Cell Therapy Center, Peking University Shenzhen Hospital, Shenzhen, China
| | - Bing-Jun Dong
- Gynecology Department, Zhuhai Hospital of Integrated Traditional Chinese and Western Medicine, Zhuhai, China
| | - Xing-Xing Fan
- Dr. Neher’s Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, Macao SAR, China
|
17
|
Chen G, Qin Y, Sheng R. Integrating Prior Chemical Knowledge into the Graph Transformer Network to Predict the Stability Constants of Chelating Agents and Metal Ions. J Chem Inf Model 2024; 64:5867-5877. [PMID: 39075943 DOI: 10.1021/acs.jcim.4c00614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024]
Abstract
The latest advancements in nuclear medicine indicate that radioactive isotopes and associated metal chelators play crucial roles in the diagnosis and treatment of diseases. The development of metal chelators mainly relies on traditional trial-and-error methods, lacking rational guidance and design. In this study, we propose the structure-aware transformer (SAT) combined with molecular fingerprint (SATCMF), a novel graph transformer network framework that incorporates prior chemical knowledge to construct coordination edges and learns the interactions between chelating agents and metal ions. SATCMF is trained on stability data collected from metal ion-ligand complexes, leveraging the SAT network to extract structural features relevant to the binding of ligands with metal ions. It further integrates molecular fingerprint features to refine the prediction of the stability constants of the chelating agents and metal ions. The experimental results on a benchmark data set demonstrate that SATCMF achieves state-of-the-art performance based on four different graph neural network architectures. Additionally, visualizing the learned molecular attention distribution provides interpretable insights from the prediction results, offering valuable guidance for the development of novel metal chelators.
Affiliation(s)
- Geng Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
| | - Yiyang Qin
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
| | - Rong Sheng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Jinhua Institute of Zhejiang University, Zhejiang University, Jinhua 321036, P. R. China
|
18
|
Peng S, Rajjou L. Advancing plant biology through deep learning-powered natural language processing. Plant Cell Rep 2024; 43:208. [PMID: 39102077 DOI: 10.1007/s00299-024-03294-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 07/19/2024] [Indexed: 08/06/2024]
Abstract
The application of deep learning methods, specifically the utilization of Large Language Models (LLMs), in the field of plant biology holds significant promise for generating novel knowledge on plant cell systems. The LLM framework exhibits exceptional potential, particularly with the development of Protein Language Models (PLMs), allowing for in-depth analyses of nucleic acid and protein sequences. This analytical capacity facilitates the discernment of intricate patterns and relationships within biological data, encompassing multi-scale information within DNA or protein sequences. The contribution of PLMs extends beyond mere sequence patterns and structure-function recognition; it also supports advancements in genetic improvements for agriculture. The integration of deep learning approaches into the domain of plant sciences offers opportunities for major breakthroughs in basic research across multi-scale plant traits. Consequently, the strategic application of deep learning methodologies, particularly leveraging the potential of LLMs, will undoubtedly play a pivotal role in advancing plant sciences, plant production, plant uses and propelling the trajectory toward sustainable agroecological and agro-food transitions.
Affiliation(s)
- Shuang Peng
- Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin for Plant Sciences (IJPB), 78000, Versailles, France
| | - Loïc Rajjou
- Université Paris-Saclay, INRAE, AgroParisTech, Institut Jean-Pierre Bourgin for Plant Sciences (IJPB), 78000, Versailles, France.
|
19
|
Wang T, Zhang X, Zhang O, Chen G, Pan P, Wang E, Wang J, Wu J, Zhou D, Wang L, Jin R, Chen S, Shen C, Kang Y, Hsieh CY, Hou T. Highly Accurate and Efficient Deep Learning Paradigm for Full-Atom Protein Loop Modeling with KarmaLoop. Research (Wash D C) 2024; 7:0408. [PMID: 39055686 PMCID: PMC11268956 DOI: 10.34133/research.0408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Accepted: 05/22/2024] [Indexed: 07/27/2024]
Abstract
Protein loop modeling is a challenging yet highly nontrivial task in protein structure prediction. Despite recent progress, existing methods including knowledge-based, ab initio, hybrid, and deep learning (DL) methods fall substantially short of either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling. Our results demonstrate that KarmaLoop considerably outperforms conventional and DL-based methods of loop modeling in terms of both accuracy and efficiency, with the average RMSDs of 1.77 and 1.95 Å for the CASP13+14 and CASP15 benchmark datasets, respectively, and manifests at least 2 orders of magnitude speedup in general compared with other methods. Consequently, our comprehensive evaluations indicate that KarmaLoop provides a state-of-the-art DL solution for protein loop modeling, with the potential to hasten the advancement of protein engineering, antibody-antigen recognition, and drug design.
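The accuracy figures above are average RMSDs in Å. For readers unfamiliar with the metric, a minimal sketch of root-mean-square deviation between two matched sets of atomic coordinates follows. It assumes the two structures are already in the same reference frame and that atoms are listed in the same order; whether and how KarmaLoop superimposes structures before measuring is not stated in the abstract, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    No superposition is performed here: both coordinate sets are assumed to
    share a reference frame and an identical atom ordering.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom is displaced by (3, 4, 0), i.e. by 5 units.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
print(rmsd(reference, shifted))  # 5.0
```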
Affiliation(s)
- Tianyue Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | | | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Ercheng Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jialu Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Donghao Zhou
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
| | - Langcheng Wang
- Department of Pathology, New York University Medical Center, New York, NY 10016, USA
| | - Ruofan Jin
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- College of Life Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Shicheng Chen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Chang-Yu Hsieh
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
20
|
Lu D, Luo D, Zhang Y, Wang B. A Robust Induced Fit Docking Approach with the Combination of the Hybrid All-Atom/United-Atom/Coarse-Grained Model and Simulated Annealing. J Chem Theory Comput 2024; 20:6414-6423. [PMID: 38966989 DOI: 10.1021/acs.jctc.4c00653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/06/2024]
Abstract
Molecular docking remains an indispensable tool in computational biology and structure-based drug discovery. However, the correct prediction of binding poses remains a major challenge for molecular docking, especially for target proteins where a substrate binding induces significant reorganization of the active site. Here, we introduce an Induced Fit Docking (IFD) approach named AA/UA/CG-SA-IFD, which combines a hybrid All-Atom/United-Atom/Coarse-Grained model with Simulated Annealing. In this approach, the core region is represented by the All-Atom (AA) model, while the protein environment beyond the core region and the solvent are treated with either the United-Atom (UA) or the Coarse-Grained (CG) model. By combining the Elastic Network Model (ENM) for the CG region, the hybrid model ensures a reasonable description of ligand binding and the environmental effects of the protein, facilitating highly efficient and reliable sampling of ligand binding through Simulated Annealing (SA) at a high temperature. Upon validation with two testing sets, the AA/UA/CG-SA-IFD approach demonstrates remarkable accuracy and efficiency in induced fit docking, even for challenging cases where the docked poses significantly deviate from crystal structures.
Affiliation(s)
- Dexin Lu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
| | - Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
| | - Yuwei Zhang
- Jiangsu Key Laboratory of New Power Batteries, Jiangsu Collaborative Innovation Centre of Biomedical Functional Materials, School of Chemistry and Materials Science, Nanjing Normal University, Nanjing 210023, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
|
21
|
Shen C, Song J, Hsieh CY, Cao D, Kang Y, Ye W, Wu Z, Wang J, Zhang O, Zhang X, Zeng H, Cai H, Chen Y, Chen L, Luo H, Zhao X, Jian T, Chen T, Jiang D, Wang M, Ye Q, Wu J, Du H, Shi H, Deng Y, Hou T. DrugFlow: An AI-Driven One-Stop Platform for Innovative Drug Discovery. J Chem Inf Model 2024; 64:5381-5391. [PMID: 38920405 DOI: 10.1021/acs.jcim.4c00621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2024]
Abstract
Artificial intelligence (AI)-aided drug design has demonstrated unprecedented effects on modern drug discovery, but there is still an urgent need for user-friendly interfaces that bridge the gap between these sophisticated tools and scientists, particularly those who are less computer savvy. Herein, we present DrugFlow, an AI-driven one-stop platform that offers a clean, convenient, and cloud-based interface to streamline early drug discovery workflows. By seamlessly integrating a range of innovative AI algorithms, covering molecular docking, quantitative structure-activity relationship modeling, molecular generation, ADMET (absorption, distribution, metabolism, excretion and toxicity) prediction, and virtual screening, DrugFlow can offer effective AI solutions for almost all crucial stages in early drug discovery, including hit identification and hit/lead optimization. We hope that the platform can provide sufficiently valuable guidance to aid real-world drug design and discovery. The platform is available at https://drugflow.com.
Affiliation(s)
- Chao Shen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jianfei Song
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Chang-Yu Hsieh
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Wenling Ye
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Zhenxing Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Hao Zeng
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Heng Cai
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Yu Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Linkang Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Hao Luo
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Xinda Zhao
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Tianye Jian
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Tong Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Qing Ye
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jialu Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Hongyan Du
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Hui Shi
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Yafeng Deng
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Department of Automation, Tsinghua University, Beijing 100084, China
| | - Tingjun Hou
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
22
|
Liu T, Simine L. DeltaGzip: Computing Biopolymer-Ligand Binding Affinity via Kolmogorov Complexity and Lossless Compression. J Chem Inf Model 2024; 64:5617-5623. [PMID: 38980667 DOI: 10.1021/acs.jcim.4c00461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
The design of biosequences for biosensing and therapeutics is a challenging multistep search and optimization task. In principle, computational modeling may speed up the design process by virtual screening of sequences based on their binding affinities to target molecules. However, in practice, existing machine-learned models trained to predict binding affinities lack the flexibility with respect to reaction conditions, and molecular dynamics simulations that can incorporate reaction conditions suffer from high computational costs. Here, we describe a computational approach called DeltaGzip that evaluates the free energy of binding in biopolymer-ligand complexes from ultrashort equilibrium molecular dynamics simulations. The entropy of binding is evaluated using the Kolmogorov complexity definition of entropy and approximated using a lossless compression algorithm, Gzip. We benchmark the method on a well-studied data set of protein-ligand complexes comparing the predictions of DeltaGzip to the free energies of binding obtained using Jarzynski equality and experimental measurements.
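The idea of using a lossless compressor as a stand-in for Kolmogorov complexity can be illustrated with a minimal sketch. This is not the DeltaGzip implementation: the `discretize` binning, the bin count, and the toy signals below are assumptions chosen only to show that a more disordered signal compresses less well.

```python
import gzip
import random


def compression_ratio(symbols: bytes) -> float:
    """Compressed size divided by raw size; lower means more regular data."""
    if not symbols:
        raise ValueError("cannot compress an empty sequence")
    return len(gzip.compress(symbols, compresslevel=9)) / len(symbols)


def discretize(values, lo, hi, n_bins=64):
    """Map floats in [lo, hi] to one byte per value (hypothetical binning)."""
    width = (hi - lo) / n_bins
    out = bytearray()
    for v in values:
        idx = int((v - lo) / width)
        out.append(min(max(idx, 0), n_bins - 1))  # clamp to a valid bin
    return bytes(out)


# Toy stand-ins for a trajectory coordinate: one frozen, one fully random.
rng = random.Random(0)
ordered = discretize([0.5] * 5000, 0.0, 1.0)
disordered = discretize([rng.random() for _ in range(5000)], 0.0, 1.0)

ratio_ordered = compression_ratio(ordered)
ratio_disordered = compression_ratio(disordered)
print(f"ordered: {ratio_ordered:.3f}  disordered: {ratio_disordered:.3f}")
```

The compression ratio here is only a qualitative proxy for disorder. Turning it into an entropy of binding would require the calibration and simulation protocol described in the paper itself.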
Affiliation(s)
- Tao Liu
- Department of Chemistry, McGill University, Montreal, Quebec H3A 0B8, Canada
| | - Lena Simine
- Department of Chemistry, McGill University, Montreal, Quebec H3A 0B8, Canada
| |
|
23
|
Wu H, Liu J, Zhang R, Lu Y, Cui G, Cui Z, Ding Y. A review of deep learning methods for ligand based drug virtual screening. FUNDAMENTAL RESEARCH 2024; 4:715-737. [PMID: 39156568 PMCID: PMC11330120 DOI: 10.1016/j.fmre.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/10/2024] [Accepted: 02/18/2024] [Indexed: 08/20/2024] Open
Abstract
Drug discovery is costly and time consuming, and modern drug discovery endeavors are progressively reliant on computational methodologies, aiming to mitigate temporal and financial expenditures associated with the process. In particular, the time required for vaccine and drug discovery is prolonged during emergency situations such as the coronavirus 2019 pandemic. Recently, the performance of deep learning methods in drug virtual screening has been particularly prominent. It has become a concern for researchers how to summarize the existing deep learning in drug virtual screening, select different models for different drug screening problems, exploit the advantages of deep learning models, and further improve the capability of deep learning in drug virtual screening. This review first introduces the basic concepts of drug virtual screening, common datasets, and data representation methods. Then, large numbers of common deep learning methods for drug virtual screening are compared and analyzed. In addition, a dataset of different sizes is constructed independently to evaluate the performance of each deep learning model for the difficult problem of large-scale ligand virtual screening. Finally, the existing challenges and future directions in the field of virtual screening are presented.
Affiliation(s)
- Hongjie Wu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Junkai Liu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Runhua Zhang
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Yaoyao Lu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Guozeng Cui
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Zhiming Cui
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| |
|
24
|
Kairys V, Baranauskiene L, Kazlauskiene M, Zubrienė A, Petrauskas V, Matulis D, Kazlauskas E. Recent advances in computational and experimental protein-ligand affinity determination techniques. Expert Opin Drug Discov 2024; 19:649-670. [PMID: 38715415 DOI: 10.1080/17460441.2024.2349169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
INTRODUCTION: Modern drug discovery revolves around designing ligands that target the chosen biomolecule, typically a protein. For this, the evaluation of affinities of putative ligands is crucial. This has given rise to a multitude of dedicated computational and experimental methods that are constantly being developed and improved. AREAS COVERED: In this review, the authors reassess both the industry mainstays and the newest trends among the methods for protein-small-molecule affinity determination. They discuss both computational affinity predictions and experimental techniques, describing their basic principles, main limitations, and advantages. Together, this serves as an initial guide to the currently most popular and cutting-edge ligand-binding assays employed in rational drug design. EXPERT OPINION: The affinity determination methods continue to develop toward miniaturization, high throughput, and in-cell application. Moreover, the availability of data analysis tools has been constantly increasing. Nevertheless, cross-verification of data using at least two different techniques and careful result interpretation remain of utmost importance.
Affiliation(s)
- Visvaldas Kairys
- Department of Bioinformatics, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Lina Baranauskiene
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | | | - Asta Zubrienė
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Vytautas Petrauskas
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Daumantas Matulis
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Egidijus Kazlauskas
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| |
|
25
|
Zhang X, Shen C, Zhang H, Kang Y, Hsieh CY, Hou T. Advancing Ligand Docking through Deep Learning: Challenges and Prospects in Virtual Screening. Acc Chem Res 2024; 57:1500-1509. [PMID: 38577892 DOI: 10.1021/acs.accounts.4c00093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/06/2024]
Abstract
Molecular docking, also termed ligand docking (LD), is a pivotal element of structure-based virtual screening (SBVS) used to predict the binding conformations and affinities of protein-ligand complexes. Traditional LD methodologies rely on a search and scoring framework, utilizing heuristic algorithms to explore binding conformations and scoring functions to evaluate binding strengths. However, to meet the efficiency demands of SBVS, these algorithms and functions are often simplified, prioritizing speed over accuracy. The emergence of deep learning (DL) has exerted a profound impact on diverse fields, ranging from natural language processing to computer vision and drug discovery. DeepMind's AlphaFold2 has impressively exhibited its ability to accurately predict protein structures solely from amino acid sequences, highlighting the remarkable potential of DL in conformation prediction. This groundbreaking advancement circumvents the traditional search-scoring frameworks in LD, enhancing both accuracy and processing speed and thereby catalyzing a broader adoption of DL algorithms in binding pose prediction. Nevertheless, a consensus on certain aspects remains elusive. In this Account, we delineate the current status of employing DL to augment LD within the VS paradigm, highlighting our contributions to this domain. Furthermore, we discuss the challenges and future prospects, drawing insights from our scholarly investigations. Initially, we present an overview of VS and LD, followed by an introduction to DL paradigms, which deviate significantly from traditional search-scoring frameworks. Subsequently, we delve into the challenges associated with the development of DL-based LD (DLLD), encompassing evaluation metrics, application scenarios, and physical plausibility of the predicted conformations. In the evaluation of LD algorithms, it is essential to recognize the multifaceted nature of the metrics. While the accuracy of binding pose prediction, often measured by the success rate, is a pivotal aspect, the scoring/screening power and computational speed of these algorithms are equally important given the pivotal role of LD tools in VS. Regarding application scenarios, early methods focused on blind docking, where the binding site is unknown. However, recent studies suggest a shift toward identifying binding sites rather than solely predicting binding poses within these models. In contrast, LD with a known pocket in VS has been shown to be more practical. Physical plausibility poses another significant challenge. Although DLLD models often achieve higher success rates compared to traditional methods, they may generate poses with implausible local structures, such as incorrect bond angles or lengths, which are disadvantageous for postprocessing tasks like visualization. Finally, we discuss the future perspectives for DLLD, emphasizing the need to improve generalization ability, strike a balance between speed and accuracy, account for protein conformation flexibility, and enhance physical plausibility. Additionally, we delve into the comparison between generative and regression algorithms in this context, exploring their respective strengths and potential.
Affiliation(s)
- Xujun Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
| | - Chao Shen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
| | - Haotian Zhang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
26
|
Gu S, Yang Y, Zhao Y, Qiu J, Wang X, Tong HHY, Liu L, Wan X, Liu H, Hou T, Kang Y. Evaluation of AlphaFold2 Structures for Hit Identification across Multiple Scenarios. J Chem Inf Model 2024; 64:3630-3639. [PMID: 38630855 DOI: 10.1021/acs.jcim.3c01976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
The introduction of AlphaFold2 (AF2) has sparked significant enthusiasm and generated extensive discussion within the scientific community, particularly among drug discovery researchers. Although previous studies have addressed the performance of AF2 structures in virtual screening (VS), a more comprehensive investigation is still necessary considering the paramount importance of structural accuracy in drug design. In this study, we evaluate the performance of AF2 structures in VS across three common drug discovery scenarios: targets with holo, apo, and AF2 structures; targets with only apo and AF2 structures; and targets exclusively with AF2 structures. We utilized both the traditional physics-based Glide and the deep-learning-based scoring function RTMscore to rank the compounds in the DUD-E, DEKOIS 2.0, and DECOY data sets. The results demonstrate that, overall, the performance of VS on AF2 structures is comparable to that on apo structures but notably inferior to that on holo structures across diverse scenarios. Moreover, when a target has solely AF2 structure, selecting the holo structure of the target from different subtypes within the same protein family produces comparable results with the AF2 structure for VS on the data set of the AF2 structures, and significantly better results than the AF2 structures on its own data set. This indicates that utilizing AF2 structures for docking-based VS may not yield most satisfactory outcomes, even when solely AF2 structures are available. Moreover, we rule out the possibility that the variations in VS performance between the binding pockets of AF2 and holo structures arise from the differences in their biological assembly composition.
Affiliation(s)
- Shukai Gu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yuwei Yang
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
| | - Yihao Zhao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jiayue Qiu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
| | - Xiaorui Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao 999078, China
| | - Henry Hoi Yee Tong
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
| | - Liwei Liu
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Nanjing 210000, Jiangsu, China
| | - Xiaozhe Wan
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Nanjing 210000, Jiangsu, China
| | - Huanxiang Liu
- Faculty of Applied Science, Macao Polytechnic University, Macao 999078, SAR, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
27
|
Wang T, Tong J, Zhang X, Wang Z, Xu L, Pan P, Hou T. Structure-based virtual screening of novel USP5 inhibitors targeting the zinc finger ubiquitin-binding domain. Comput Biol Med 2024; 174:108397. [PMID: 38603896 DOI: 10.1016/j.compbiomed.2024.108397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/21/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
The equilibrium of cellular protein levels is pivotal for maintaining normal physiological functions. USP5 belongs to the deubiquitination enzyme (DUBs) family, controlling protein degradation and preserving cellular protein homeostasis. Aberrant expression of USP5 is implicated in a variety of diseases, including cancer, neurodegenerative diseases, and inflammatory diseases. In this paper, a multi-level virtual screening (VS) approach was employed to target the zinc finger ubiquitin-binding domain (ZnF-UBD) of USP5, leading to the identification of a highly promising candidate compound 0456-0049. Molecular dynamics (MD) simulations were then employed to assess the stability of complex binding and predict hotspot residues in interactions. The results indicated that the candidate stably binds to the ZnF-UBD of USP5 through crucial interactions with residues ARG221, TRP209, GLY220, ASN207, TYR261, TYR259, and MET266. Binding free energy calculations, along with umbrella sampling (US) simulations, underscored a superior binding affinity of the candidate relative to known inhibitors. Moreover, US simulations revealed conformational changes of USP5 during ligand dissociation. These insights provide a valuable foundation for the development of novel inhibitors targeting USP5.
Affiliation(s)
- Tianhao Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, PR China; College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, PR China
| | - Jianbo Tong
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, PR China.
| | - Xing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, PR China; College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an, 710021, PR China
| | - Zhe Wang
- School of Pharmacy, Hangzhou Normal University, Hangzhou, 310058, Zhejiang, PR China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, 213001, PR China
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, PR China.
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, PR China.
| |
|
28
|
Kumar N, Acharya V. Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024; 44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]
Abstract
Virtual screening (VS) is an integral and ever-evolving part of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, is widely applied in drug discovery to reduce time and resource consumption. Combined with machine intelligence algorithms, VS has become a rapidly progressing technology that learns robust decision rules for data curation and screens hit molecules from large VS libraries in minutes or hours. The exponential growth of chemical and biological data into public-domain "big data" demands modern, advanced machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has evolved from individual approaches (LB or SB) to integrated LB and SB techniques that explore various aspects of the ligand and the target protein to improve the rate of correct hit prediction. Current trends demand advanced, intelligent solutions for handling enormous data in drug discovery, so that hits or leads can be screened and optimized with few or no false positives. This review follows the big-data trend and the tremendous growth in computational architecture. It categorizes the individual VS techniques, details the literature on machine learning implementations and modern machine intelligence approaches, and discusses limitations and future prospects.
Affiliation(s)
- Neeraj Kumar
- Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Academy of Scientific and Innovative Research, Ghaziabad, India
| | - Vishal Acharya
- Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Academy of Scientific and Innovative Research, Ghaziabad, India
| |
|
29
|
Qu X, Dong L, Luo D, Si Y, Wang B. Water Network-Augmented Two-State Model for Protein-Ligand Binding Affinity Prediction. J Chem Inf Model 2024; 64:2263-2274. [PMID: 37433009 DOI: 10.1021/acs.jcim.3c00567] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/13/2023]
Abstract
Water network rearrangement from the ligand-unbound state to the ligand-bound state is known to have significant effects on the protein-ligand binding interactions, but most of the current machine learning-based scoring functions overlook these effects. In this study, we endeavor to construct a comprehensive and realistic deep learning model by incorporating water network information into both ligand-unbound and -bound states. In particular, extended connectivity interaction features were integrated into graph representation, and graph transformer operator was employed to extract features of the ligand-unbound and -bound states. Through these efforts, we developed a water network-augmented two-state model called ECIFGraph::HM-Holo-Apo. Our new model exhibits satisfactory performance in terms of scoring, ranking, docking, screening, and reverse screening power tests on the CASF-2016 benchmark. In addition, it can achieve superior performance in large-scale docking-based virtual screening tests on the DEKOIS2.0 data set. Our study highlights that the use of a water network-augmented two-state model can be an effective strategy to bolster the robustness and applicability of machine learning-based scoring functions, particularly for targets with hydrophilic or solvent-exposed binding pockets.
Affiliation(s)
- Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Yubing Si
- College of Chemistry, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
| |
|
30
|
Zhang X, Gao H, Wang H, Chen Z, Zhang Z, Chen X, Li Y, Qi Y, Wang R. PLANET: A Multi-objective Graph Neural Network Model for Protein-Ligand Binding Affinity Prediction. J Chem Inf Model 2024; 64:2205-2220. [PMID: 37319418 DOI: 10.1021/acs.jcim.3c00253] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 06/17/2023]
Abstract
Predicting protein-ligand binding affinity is a central issue in drug design. Various deep learning models have been published in recent years, many of which rely on 3D protein-ligand complex structures as input and tend to focus on the single task of reproducing binding affinity. In this study, we have developed a graph neural network model called PLANET (Protein-Ligand Affinity prediction NETwork). This model takes the graph-represented 3D structure of the binding pocket on the target protein and the 2D chemical structure of the ligand molecule as input. It was trained through a multi-objective process with three related tasks: deriving the protein-ligand binding affinity, the protein-ligand contact map, and the ligand distance matrix. Besides the protein-ligand complexes with known binding affinity data retrieved from the PDBbind database, a large number of non-binder decoys were also added to the training data for deriving the final model of PLANET. When tested on the CASF-2016 benchmark, PLANET exhibited a scoring power comparable to the best result yielded by other deep learning models as well as a reasonable ranking power and docking power. In virtual screening trials conducted on the DUD-E benchmark, PLANET's performance was notably better than that of several deep learning and machine learning models. On the LIT-PCBA benchmark, PLANET achieved accuracy comparable to that of the conventional docking program Glide, but it needed less than 1% of Glide's computation time to finish the same job because PLANET did not need exhaustive conformational sampling. Considering the decent accuracy and efficiency of PLANET in binding affinity prediction, it may become a useful tool for conducting large-scale virtual screening.
Affiliation(s)
- Xiangying Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Haotian Gao
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Haojie Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Zhihang Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Zhe Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Xinchong Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Yan Li
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Yifei Qi
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Renxiao Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| |
|
31
|
Rayka M, Mirzaei M, Mohammad Latifi A. An ensemble-based approach to estimate confidence of predicted protein-ligand binding affinity values. Mol Inform 2024; 43:e202300292. [PMID: 38358080 DOI: 10.1002/minf.202300292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 01/22/2024] [Accepted: 02/02/2024] [Indexed: 02/16/2024]
Abstract
When designing a machine learning-based scoring function, we access a limited number of protein-ligand complexes with experimentally determined binding affinity values, representing only a fraction of all possible protein-ligand complexes. Consequently, it is crucial to report a measure of confidence and quantify the uncertainty in the model's predictions during test time. Here, we adopt the conformal prediction technique to evaluate the confidence of a prediction for each member of the core set of the CASF 2016 benchmark. The conformal prediction technique requires a diverse ensemble of predictors for uncertainty estimation. To this end, we introduce ENS-Score as an ensemble predictor, which includes 30 models with different protein-ligand representation approaches and achieves Pearson's correlation of 0.842 on the core set of the CASF 2016 benchmark. Also, we comprehensively investigate the residual error of each data point to assess the normality behavior of the distribution of the residual errors and their correlation to the structural features of the ligands, such as hydrophobic interactions and halogen bonding. In the end, we provide a local host web application to facilitate the usage of ENS-Score. All codes to repeat results are provided at https://github.com/miladrayka/ENS_Score.
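As a rough illustration of how conformal prediction turns ensemble outputs into a confidence statement, the sketch below uses the split (inductive) variant with absolute residuals of the ensemble mean as the nonconformity score. This is an assumption made for illustration and may differ from the procedure used for ENS-Score. The function names and the toy numbers are hypothetical.

```python
import math
import statistics


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def calibrate(cal_ensemble_preds, cal_targets, alpha):
    """Half-width of the interval from held-out calibration complexes."""
    scores = [
        abs(y - statistics.fmean(preds))  # residual of the ensemble mean
        for preds, y in zip(cal_ensemble_preds, cal_targets)
    ]
    return conformal_quantile(scores, alpha)


def predict_interval(ensemble_preds, half_width):
    """Interval centred on the ensemble mean for a new complex."""
    center = statistics.fmean(ensemble_preds)
    return center - half_width, center + half_width


# Toy calibration set: three ensemble members, nine complexes (made-up pKd values).
cal_preds = [[5.9, 6.0, 6.1] for _ in range(9)]
cal_targets = [6.0 + 0.1 * i for i in range(1, 10)]

half_width = calibrate(cal_preds, cal_targets, alpha=0.2)
low, high = predict_interval([7.0, 7.2, 7.4], half_width)
print(f"80% interval: [{low:.2f}, {high:.2f}]")
```

Under the usual exchangeability assumption, an interval built this way covers the true value with probability at least 1 - alpha. A normalized variant could scale the half-width by the ensemble spread, which this sketch deliberately leaves out.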
Affiliation(s)
- Milad Rayka
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| | - Morteza Mirzaei
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| | - Ali Mohammad Latifi
- Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
| |
|
32
|
Wang Z, Wang S, Li Y, Guo J, Wei Y, Mu Y, Zheng L, Li W. A new paradigm for applying deep learning to protein-ligand interaction prediction. Brief Bioinform 2024; 25:bbae145. [PMID: 38581420 PMCID: PMC10998640 DOI: 10.1093/bib/bbae145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 02/21/2024] [Accepted: 03/18/2024] [Indexed: 04/08/2024] Open
Abstract
Protein-ligand interaction prediction presents a significant challenge in drug design. Numerous machine learning and deep learning (DL) models have been developed to accurately identify docking poses of ligands and active compounds against specific targets. However, current models often suffer from inadequate accuracy or lack practical physical significance in their scoring systems. In this research paper, we introduce IGModel, a novel approach that utilizes the geometric information of protein-ligand complexes as input for predicting the root mean square deviation of docking poses and the binding strength (pKd, the negative value of the logarithm of binding affinity) within the same prediction framework. This ensures that the output scores carry intuitive meaning. We extensively evaluate the performance of IGModel on various docking power test sets, including the CASF-2016 benchmark, PDBbind-CrossDocked-Core and DISCO set, consistently achieving state-of-the-art accuracies. Furthermore, we assess IGModel's generalizability and robustness by evaluating it on unbiased test sets and sets containing target structures generated by AlphaFold2. The exceptional performance of IGModel on these sets demonstrates its efficacy. Additionally, we visualize the latent space of protein-ligand interactions encoded by IGModel and conduct interpretability analysis, providing valuable insights. This study presents a novel framework for DL-based prediction of protein-ligand interactions, contributing to the advancement of this field. The IGModel is available at GitHub repository https://github.com/zchwang/IGModel.
Affiliation(s)
- Zechen Wang
- School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
| | - Sheng Wang
- Shanghai Zelixir Biotech, Xiangke Road, 200030, Shanghai, China
| | - Yangyang Li
- School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
| | - Jingjing Guo
- Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Rua de Luís Gonzaga Gomes, Macao, China
| | - Yanjie Wei
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Xueyuan Road 1068, Shenzhen, 518055 Guang Dong, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore
| | - Liangzhen Zheng
- Shanghai Zelixir Biotech, Xiangke Road, 200030, Shanghai, China
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Xueyuan Road 1068, Shenzhen, 518055 Guang Dong, China
| | - Weifeng Li
- School of Physics, Shandong University, South Shanda Road, 250100 Shandong, China
| |
|
33
|
Guo L, Wang J. GSScore: a novel Graphormer-based shell-like scoring method for protein-ligand docking. Brief Bioinform 2024; 25:bbae201. [PMID: 38706316 PMCID: PMC11070652 DOI: 10.1093/bib/bbae201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Revised: 02/05/2024] [Accepted: 04/16/2024] [Indexed: 05/07/2024]
Abstract
Protein-ligand interactions (PLIs) are essential for cellular activities and drug discovery. But due to the complexity and high cost of experimental methods, there is a great demand for computational approaches to recognize PLI patterns, such as protein-ligand docking. In recent years, more and more models based on machine learning have been developed to directly predict the root mean square deviation (RMSD) of a ligand docking pose with reference to its native binding pose. However, new scoring methods are pressingly needed in methodology for more accurate RMSD prediction. We present a new deep learning-based scoring method for RMSD prediction of protein-ligand docking poses based on a Graphormer method and Shell-like graph architecture, named GSScore. To recognize near-native conformations from a set of poses, GSScore takes atoms as nodes and then establishes the docking interface of protein-ligand into multiple bipartite graphs within different shell ranges. Benefiting from the Graphormer and Shell-like graph architecture, GSScore can effectively capture the subtle differences between energetically favorable near-native conformations and unfavorable non-native poses without extra information. GSScore was extensively evaluated on diverse test sets including a subset of PDBBind version 2019, CASF2016 as well as DUD-E, and obtained significant improvements over existing methods in terms of RMSE, R (Pearson correlation coefficient), Spearman correlation coefficient and Docking power.
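One way to picture the "multiple bipartite graphs within different shell ranges" is the sketch below. It is an assumption-laden illustration, not GSScore's implementation: the function name `shell_bipartite_edges` and the shell boundaries (0, 4, 8, 12 angstroms) are hypothetical, and only edge lists are built, with no node features or Graphormer layers.

```python
import math

def shell_bipartite_edges(ligand, protein, shell_edges=(0.0, 4.0, 8.0, 12.0)):
    """Group ligand-protein atom pairs into distance shells.

    Returns one edge list per shell. Each edge (i, j) links ligand atom i
    to protein atom j, so every shell is a bipartite graph. The shell
    boundaries are illustrative values chosen for this sketch.
    """
    shells = [[] for _ in range(len(shell_edges) - 1)]
    for i, lig_atom in enumerate(ligand):
        for j, prot_atom in enumerate(protein):
            distance = math.dist(lig_atom, prot_atom)
            for k in range(len(shells)):
                if shell_edges[k] <= distance < shell_edges[k + 1]:
                    shells[k].append((i, j))
                    break  # a pair belongs to at most one shell
    return shells

# Hypothetical coordinates: one ligand atom, three protein atoms.
ligand = [(0.0, 0.0, 0.0)]
protein = [(3.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(shell_bipartite_edges(ligand, protein))
# [[(0, 0)], [(0, 1)], []]  (the atom 20 angstroms away falls outside all shells)
```

Separating contacts by shell lets a model treat close, direct contacts differently from longer-range ones, which is the intuition the abstract describes.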
Affiliation(s)
- Linyuan Guo
- School of Computer Science and Engineering, Central South University, Rd. Lu Shan Nan, 410083, Changsha, P.R. China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Rd. Lu Shan Nan, 410083, Changsha, P.R. China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Rd. Lu Shan Nan, 410083, Changsha, P.R. China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Rd. Lu Shan Nan, 410083, Changsha, P.R. China
| |
|
34
|
Luo D, Liu D, Qu X, Dong L, Wang B. Enhancing Generalizability in Protein-Ligand Binding Affinity Prediction with Multimodal Contrastive Learning. J Chem Inf Model 2024; 64:1892-1906. [PMID: 38441880 DOI: 10.1021/acs.jcim.3c01961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Improving the generalization ability of scoring functions remains a major challenge in protein-ligand binding affinity prediction. Many machine learning methods are limited by their reliance on single-modal representations, hindering a comprehensive understanding of protein-ligand interactions. We introduce a graph-neural-network-based scoring function that utilizes a triplet contrastive learning loss to improve protein-ligand representations. In this model, three-dimensional complex representations and the fusion of two-dimensional ligand and coarse-grained pocket representations converge while distancing from decoy representations in latent space. After rigorous validation on multiple external data sets, our model exhibits commendable generalization capabilities compared to those of other deep learning-based scoring functions, marking it as a promising tool in the realm of drug discovery. In the future, our training framework can be extended to other biophysical- and biochemical-related problems such as protein-protein interaction and protein mutation prediction.
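The triplet contrastive objective described here can be sketched with the standard triplet margin loss. This is a generic formulation under stated assumptions, not the authors' code: the anchor stands for the 3D complex embedding, the positive for the fused 2D-ligand and coarse-grained-pocket embedding, and the negative for a decoy embedding. The Euclidean distance, the margin of 1.0, and the name `triplet_loss` are all assumptions of this example.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on plain Python vectors.

    Pulls the anchor toward the positive and pushes it away from the
    negative until the gap exceeds `margin` (an assumed hyperparameter).
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings.
complex_3d = (0.0, 0.0)      # anchor: 3D complex representation
fused_2d = (0.0, 1.0)        # positive: fused ligand + pocket representation
far_decoy = (3.0, 4.0)       # negative already well separated
near_decoy = (0.0, 1.5)      # negative still too close to the anchor

print(triplet_loss(complex_3d, fused_2d, far_decoy))   # 0.0, constraint satisfied
print(triplet_loss(complex_3d, fused_2d, near_decoy))  # 0.5, decoy must move away
```

The loss is zero once the decoy sits at least one margin farther from the anchor than the positive does, so training effort concentrates on hard decoys.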
Affiliation(s)
- Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Dandan Liu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Xiaoyang Qu
- School of Pharmacy and Medical Technology, Putian University, Putian 351100, P. R. China
- Key Laboratory of Pharmaceutical Analysis and Laboratory Medicine (Putian University), Fujian Province University, Putian 351100, P. R. China
| | - Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
| |
|
35
|
Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024; 64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Developing new drugs is too expensive and time-consuming. Accurately predicting the interaction between drugs and targets will likely change how the drug is discovered. Machine learning-based protein-ligand interaction prediction has demonstrated significant potential. In this paper, computational methods, focusing on sequence and structure to study protein-ligand interactions, are examined. Therefore, this paper starts by presenting an overview of the data sets applied in this area, as well as the various approaches applied for representing proteins and ligands. Then, sequence-based and structure-based classification criteria are subsequently utilized to categorize and summarize both the classical machine learning models and deep learning models employed in protein-ligand interaction studies. Moreover, the evaluation methods and interpretability of these models are proposed. Furthermore, delving into the diverse applications of protein-ligand interaction models in drug research is presented. Lastly, the current challenges and future directions in this field are addressed.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Kong Meng
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| |
|
36
|
Cai H, Shen C, Jian T, Zhang X, Chen T, Han X, Yang Z, Dang W, Hsieh CY, Kang Y, Pan P, Ji X, Song J, Hou T, Deng Y. CarsiDock: a deep learning paradigm for accurate protein-ligand docking and screening based on large-scale pre-training. Chem Sci 2024; 15:1449-1471. [PMID: 38274053 PMCID: PMC10806797 DOI: 10.1039/d3sc05552c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 12/18/2023] [Indexed: 01/27/2024]
Abstract
The expertise accumulated in deep neural network-based structure prediction has been widely transferred to the field of protein-ligand binding pose prediction, thus leading to the emergence of a variety of deep learning-guided docking models for predicting protein-ligand binding poses without relying on heavy sampling. However, their prediction accuracy and applicability are still far from satisfactory, partially due to the lack of protein-ligand binding complex data. To this end, we create a large-scale complex dataset containing ∼9 M protein-ligand docking complexes for pre-training, and propose CarsiDock, the first deep learning-guided docking approach that leverages pre-training of millions of predicted protein-ligand complexes. CarsiDock contains two main stages, i.e., a deep learning model for the prediction of protein-ligand atomic distance matrices, and a translation, rotation and torsion-guided geometry optimization procedure to reconstruct the matrices into a credible binding pose. The pre-training and multiple innovative architectural designs facilitate the dramatically improved docking accuracy of our approach over the baselines in terms of multiple docking scenarios, thereby contributing to its outstanding early recognition performance in several retrospective virtual screening campaigns. Further explorations demonstrate that CarsiDock can not only guarantee the topological reliability of the binding poses but also successfully reproduce the crucial interactions in crystalized structures, highlighting its superior applicability.
Affiliation(s)
- Heng Cai
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Chao Shen
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Tianye Jian
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Tong Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Xiaoqi Han
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Zhuo Yang
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Wei Dang
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Chang-Yu Hsieh
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Xiangyang Ji
- Department of Automation, Tsinghua University Beijing 100084 China
| | - Jianfei Song
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| | - Tingjun Hou
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University Hangzhou 310058 Zhejiang China
| | - Yafeng Deng
- Hangzhou Carbonsilicon AI Technology Co., Ltd Hangzhou 310018 Zhejiang China
| |
|
37
|
Wu H, Liu J, Jiang T, Zou Q, Qi S, Cui Z, Tiwari P, Ding Y. AttentionMGT-DTA: A multi-modal drug-target affinity prediction using graph transformer and attention mechanism. Neural Netw 2024; 169:623-636. [PMID: 37976593 DOI: 10.1016/j.neunet.2023.11.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/29/2023] [Accepted: 11/07/2023] [Indexed: 11/19/2023]
Abstract
The accurate prediction of drug-target affinity (DTA) is a crucial step in drug discovery and design. Traditional experiments are very expensive and time-consuming. Recently, deep learning methods have achieved notable performance improvements in DTA prediction. However, one challenge for deep learning-based models is appropriate and accurate representations of drugs and targets, especially the lack of effective exploration of target representations. Another challenge is how to comprehensively capture the interaction information between different instances, which is also important for predicting DTA. In this study, we propose AttentionMGT-DTA, a multi-modal attention-based model for DTA prediction. AttentionMGT-DTA represents drugs and targets by a molecular graph and binding pocket graph, respectively. Two attention mechanisms are adopted to integrate and interact information between different protein modalities and drug-target pairs. The experimental results showed that our proposed model outperformed state-of-the-art baselines on two benchmark datasets. In addition, AttentionMGT-DTA also had high interpretability by modeling the interaction strength between drug atoms and protein residues. Our code is available at https://github.com/JK-Liu7/AttentionMGT-DTA.
Affiliation(s)
- Hongjie Wu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China.
| | - Junkai Liu
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China; Yangtze Delta Region Institute(Quzhou), University of Electronic Science and Technology of China, Quzhou, 324003, China.
| | - Tengsheng Jiang
- Gusu School, Nanjing Medical University, Suzhou, 215009, China.
| | - Quan Zou
- Yangtze Delta Region Institute(Quzhou), University of Electronic Science and Technology of China, Quzhou, 324003, China.
| | - Shujie Qi
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China.
| | - Zhiming Cui
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, China.
| | - Prayag Tiwari
- School of Information Technology, Halmstad University, Sweden.
| | - Yijie Ding
- Yangtze Delta Region Institute(Quzhou), University of Electronic Science and Technology of China, Quzhou, 324003, China.
| |
|
38
|
Shen T, Liu F, Wang Z, Sun J, Bu Y, Meng J, Chen W, Yao K, Mu Y, Li W, Zhao G, Wang S, Wei Y, Zheng L. zPoseScore model for accurate and robust protein-ligand docking pose scoring in CASP15. Proteins 2023; 91:1837-1849. [PMID: 37606194 DOI: 10.1002/prot.26573] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/14/2023] [Revised: 07/20/2023] [Accepted: 07/31/2023] [Indexed: 08/23/2023]
Abstract
We introduce a deep learning-based ligand pose scoring model called zPoseScore for predicting protein-ligand complexes in the 15th Critical Assessment of Protein Structure Prediction (CASP15). Our contributions are threefold: first, we generate six training and evaluation data sets by employing advanced data augmentation and sampling methods. Second, we redesign the "zFormer" module, inspired by AlphaFold2's Evoformer, to efficiently describe protein-ligand interactions. This module enables the extraction of protein-ligand paired features that lead to accurate predictions. Finally, we develop the zPoseScore framework with zFormer for scoring and ranking ligand poses, allowing for atomic-level protein-ligand feature encoding and fusion to output refined ligand poses and ligand per-atom deviations. Our results demonstrate excellent performance on various testing data sets, achieving Pearson's correlation R = 0.783 and 0.659 for ranking docking decoys generated based on experimental and predicted protein structures of CASF-2016 protein-ligand complexes. Additionally, we obtain an averaged local distance difference test (lDDT pli = 0.558) of AIchemy LIG2 in CASP15 for de novo protein-ligand complex structure predictions. Detailed analysis shows that accurate ligand binding site prediction and side-chain orientation are crucial for achieving better prediction performance. Our proposed model is one of the most accurate protein-ligand pose prediction models and could serve as a valuable tool in small molecule drug discovery.
Affiliation(s)
- Tao Shen
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Fuxu Liu
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Zechen Wang
- School of Physics, Shandong University, Jinan, Shandong, China
| | - Jinyuan Sun
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
| | - Yifan Bu
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Jintao Meng
- Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Weihua Chen
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Keyi Yao
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Yuguang Mu
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Weifeng Li
- School of Physics, Shandong University, Jinan, Shandong, China
| | - Guoping Zhao
- Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
| | - Yanjie Wei
- Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Liangzhen Zheng
- Shanghai Zelixir Biotech Company Ltd., Shanghai, China
- Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| |
|
39
|
Li Y, Fan Z, Rao J, Chen Z, Chu Q, Zheng M, Li X. An overview of recent advances and challenges in predicting compound-protein interaction (CPI). MEDICAL REVIEW (2021) 2023; 3:465-486. [PMID: 38282802 PMCID: PMC10808869 DOI: 10.1515/mr-2023-0030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 08/30/2023] [Indexed: 01/30/2024]
Abstract
Compound-protein interactions (CPIs) are critical in drug discovery for identifying therapeutic targets, drug side effects, and repurposing existing drugs. Machine learning (ML) algorithms have emerged as powerful tools for CPI prediction, offering notable advantages in cost-effectiveness and efficiency. This review provides an overview of recent advances in both structure-based and non-structure-based CPI prediction ML models, highlighting their performance and achievements. It also offers insights into CPI prediction-related datasets and evaluation benchmarks. Lastly, the article presents a comprehensive assessment of the current landscape of CPI prediction, elucidating the challenges faced and outlining emerging trends to advance the field.
Affiliation(s)
- Yanbei Li
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhiyi Chen
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Qinyu Chu
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Mingyue Zheng
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| |
|
40
|
Dong T, Yang Z, Zhou J, Chen CYC. Equivariant Flexible Modeling of the Protein-Ligand Binding Pose with Geometric Deep Learning. J Chem Theory Comput 2023; 19:8446-8459. [PMID: 37938978 DOI: 10.1021/acs.jctc.3c00273] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/10/2023]
Abstract
Flexible modeling of the protein-ligand complex structure is a fundamental challenge for in silico drug development. Recent studies have improved commonly used docking tools by incorporating extra deep learning-based steps. However, such strategies limit their accuracy and efficiency because they retain massive sampling pressure and lack consideration for flexible biomolecular changes. In this study, we propose FlexPose, a geometric graph network capable of direct flexible modeling of complex structures in Euclidean space without following the conventional sampling and scoring strategies. Our model adopts two key designs: scalar-vector dual feature representation and SE(3)-equivariant network, to manage dynamic structural changes, as well as two strategies: conformation-aware pretraining and weakly supervised learning, to boost model generalizability in unseen chemical space. Benefiting from these paradigms, our model dramatically outperforms all tested popular docking tools and recently advanced deep learning methods, especially in tasks involving protein conformation changes. We further investigate the impact of protein and ligand similarity on the model performance with two conformation-aware strategies. Moreover, FlexPose provides an affinity estimation and model confidence for postanalysis.
Affiliation(s)
- Tiejun Dong
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Ziduo Yang
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Jun Zhou
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
| | - Calvin Yu-Chian Chen
- Intelligent Medical Research Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong 510275, China
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
- School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
- Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
| |
|
41
|
Nguyen NQ, Park S, Gim M, Kang J. MulinforCPI: enhancing precision of compound-protein interaction prediction through novel perspectives on multi-level information integration. Brief Bioinform 2023; 25:bbad484. [PMID: 38180829 PMCID: PMC10768804 DOI: 10.1093/bib/bbad484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/15/2023] [Accepted: 12/05/2023] [Indexed: 01/07/2024]
Abstract
Forecasting the interaction between compounds and proteins is crucial for discovering new drugs. However, previous sequence-based studies have not utilized three-dimensional (3D) information on compounds and proteins, such as atom coordinates and distance matrices, to predict binding affinity. Furthermore, numerous widely adopted computational techniques have relied on sequences of amino acid characters for protein representations. This approach may constrain the model's ability to capture meaningful biochemical features, impeding a more comprehensive understanding of the underlying proteins. Here, we propose a two-step deep learning strategy named MulinforCPI that incorporates transfer learning techniques with multi-level resolution features to overcome these limitations. Our approach leverages 3D information from both proteins and compounds and acquires a profound understanding of the atomic-level features of proteins. Besides, our research highlights the divide between first-principle and data-driven methods, offering new research prospects for compound-protein interaction tasks. We applied the proposed method to six datasets: Davis, Metz, KIBA, CASF-2016, DUD-E and BindingDB, to evaluate the effectiveness of our approach.
Affiliation(s)
- Ngoc-Quang Nguyen
- Department of Computer Science and Engineering, Korea University, 02841, Seoul, Korea
| | - Sejeong Park
- Department of Computer Science and Engineering, Korea University, 02841, Seoul, Korea
- AIGEN Sciences, 04778, Seoul, Korea
| | - Mogan Gim
- Department of Computer Science and Engineering, Korea University, 02841, Seoul, Korea
| | - Jaewoo Kang
- Department of Computer Science and Engineering, Korea University, 02841, Seoul, Korea
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, 02841, Seoul, Korea
- AIGEN Sciences, 04778, Seoul, Korea
| |
|
42
|
Xia S, Chen E, Zhang Y. Integrated Molecular Modeling and Machine Learning for Drug Design. J Chem Theory Comput 2023; 19:7478-7495. [PMID: 37883810 PMCID: PMC10653122 DOI: 10.1021/acs.jctc.3c00814] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/26/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Modern therapeutic development often involves several stages that are interconnected, and multiple iterations are usually required to bring a new drug to the market. Computational approaches have increasingly become an indispensable part of helping reduce the time and cost of the research and development of new drugs. In this Perspective, we summarize our recent efforts on integrating molecular modeling and machine learning to develop computational tools for modulator design, including a pocket-guided rational design approach based on AlphaSpace to target protein-protein interactions, delta machine learning scoring functions for protein-ligand docking as well as virtual screening, and state-of-the-art deep learning models to predict calculated and experimental molecular properties based on molecular mechanics optimized geometries. Meanwhile, we discuss remaining challenges and promising directions for further development and use a retrospective example of FDA approved kinase inhibitor Erlotinib to demonstrate the use of these newly developed computational tools.
Affiliation(s)
- Song Xia
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Eric Chen
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
43
|
Yu L, He X, Fang X, Liu L, Liu J. Deep Learning with Geometry-Enhanced Molecular Representation for Augmentation of Large-Scale Docking-Based Virtual Screening. J Chem Inf Model 2023; 63:6501-6514. [PMID: 37882338 DOI: 10.1021/acs.jcim.3c01371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2023]
Abstract
Structure-based virtual screening has been a crucial tool in drug discovery for decades. However, as the chemical space expands, the existing structure-based virtual screening techniques based on molecular docking and scoring struggle to handle billion-entry ultralarge libraries due to the high computational cost. To address this challenge, people have resorted to machine learning techniques to enhance structure-based virtual screening for efficiently exploring the vast chemical space. In those cases, compounds are usually treated as sequential strings or two-dimensional topology graphs, limiting their ability to incorporate three-dimensional structural information for downstream tasks. We herein propose a novel deep learning protocol, GEM-Screen, which utilizes the geometry-enhanced molecular representation of the compounds docking to a specific target and is trained on docking scores of a small fraction of a library through an active learning strategy to approximate the docking outcome for yet nontraining entries. This protocol is applied to virtual screening campaigns against the AmpC and D4 targets, demonstrating that GEM-Screen enriches more than 90% of the hit scaffolds for AmpC in the top 4% of model predictions and more than 80% of the hit scaffolds for D4 in the same top-ranking size of library. GEM-Screen can be used in conjunction with traditional docking programs for docking of only the top-ranked compounds to avoid the exhaustive docking of the whole library, thus allowing for discovering top-scoring compounds from billion-entry libraries in a rapid yet accurate fashion.
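The enrichment figures quoted in this abstract (a share of hits recovered within a top-ranked slice of the library) correspond to a recall-in-top-fraction calculation, sketched below. This is a generic metric under assumptions, not GEM-Screen's evaluation script: the function name `recall_in_top_fraction` is hypothetical, lower predicted docking scores are assumed to be better, and each entry is counted as one hit rather than being grouped by scaffold.

```python
def recall_in_top_fraction(predicted_scores, is_hit, fraction):
    """Fraction of all hits that appear in the best-scored slice.

    `predicted_scores` are surrogate-model docking scores (lower is
    assumed better), `is_hit` flags the known hits, and `fraction` is
    the share of the library that would be passed on to real docking.
    """
    if len(predicted_scores) != len(is_hit):
        raise ValueError("scores and hit flags must have the same length")
    total_hits = sum(1 for flag in is_hit if flag)
    if total_hits == 0:
        raise ValueError("at least one hit is required")
    n_top = max(1, int(len(predicted_scores) * fraction))
    ranked = sorted(range(len(predicted_scores)), key=lambda i: predicted_scores[i])
    found = sum(1 for i in ranked[:n_top] if is_hit[i])
    return found / total_hits

# Hypothetical 10-compound library with three known hits.
scores = [-9.0, -3.0, -8.5, -2.0, -7.0, -1.0, -6.0, -4.0, -5.0, -0.5]
hits = [True, False, True, False, False, False, False, False, True, False]
print(recall_in_top_fraction(scores, hits, 0.2))  # 2 of 3 hits in the top 20%
```

In a real campaign only the compounds in that top slice would be docked explicitly, which is where the computational saving comes from.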
Affiliation(s)
- Lan Yu
- School of Science, China Pharmaceutical University, Nanjing 210009, China
| | - Xiao He
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- New York University-East China Normal University Center for Computational Chemistry, New York University Shanghai, Shanghai 200062, China
| | - Xiaomin Fang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen 518063, China
| | - Lihang Liu
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen 518063, China
| | - Jinfeng Liu
- School of Science, China Pharmaceutical University, Nanjing 210009, China
- School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing 210009, China
| |
|
44
|
Zhao X, Li H, Zhang K, Huang SY. Iterative Knowledge-Based Scoring Function for Protein-Ligand Interactions by Considering Binding Affinity Information. J Phys Chem B 2023; 127:9021-9034. [PMID: 37822259 DOI: 10.1021/acs.jpcb.3c04421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Scoring functions for protein-ligand interactions play a critical role in structure-based drug design. Owing to the good balance between general applicability and computational efficiency, knowledge-based scoring functions have obtained significant advancements and achieved many successes. Nevertheless, knowledge-based scoring functions face a challenge in utilizing the experimental affinity data and thus may not perform well in binding affinity prediction. Addressing the challenge, we have proposed an improved version of the iterative knowledge-based scoring function ITScore by considering binding affinity information, which is referred to as ITScoreAff, based on a large training set of 6216 protein-ligand complexes with both structures and affinity data. ITScoreAff was extensively evaluated and compared with ITScore, 33 traditional, and 6 machine learning scoring functions in terms of docking power, ranking power, and screening power on the independent CASF-2016 benchmark. It was shown that ITScoreAff obtained an overall better performance than the other 40 scoring functions and gave an average success rate of 85.3% in docking power, a correlation coefficient of 0.723 in scoring power, and an average rank correlation coefficient of 0.668 in ranking power. In addition, ITScoreAff also achieved the overall best screening power when the top 10% of the ranked database were considered. These results demonstrated the robustness of ITScoreAff and its improvement over existing scoring functions.
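The "correlation coefficient" and "rank correlation coefficient" quoted in this abstract can be made concrete with a short sketch. It assumes, as is conventional for this kind of benchmark but not stated in the abstract itself, that scoring power is a Pearson correlation between predicted and experimental affinities and that ranking power is a Spearman rank correlation; the function names are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

For a ranking-power style evaluation, `spearman` would typically be computed separately for the ligands of each target and the per-target values averaged, which is how an "average rank correlation coefficient" arises.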
Affiliation(s)
- Xuejun Zhao
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Keqiong Zhang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| | - Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
| |
|
45
|
Zhang O, Wang T, Weng G, Jiang D, Wang N, Wang X, Zhao H, Wu J, Wang E, Chen G, Deng Y, Pan P, Kang Y, Hsieh CY, Hou T. Learning on topological surface and geometric structure for 3D molecular generation. Nat Comput Sci 2023; 3:849-859. [PMID: 38177756 DOI: 10.1038/s43588-023-00530-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/28/2023] [Accepted: 09/06/2023] [Indexed: 01/06/2024]
Abstract
Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical structure-specific three-dimensional molecule generations have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. The omission of these fine details leads to many models having difficulty in outputting reasonable molecules for a variety of therapeutic targets. Here, to address this challenge, we formulate a model, called SurfGen, that designs molecules in a fashion closely resembling the figurative key-and-lock principle. SurfGen comprises two equivariant neural networks, Geodesic-GNN and Geoatom-GNN, which capture the topological interactions on the pocket surface and the spatial interaction between ligand atoms and surface nodes, respectively. SurfGen outperforms other methods in a number of benchmarks, and its high sensitivity on the pocket structures enables an effective generative-model-based solution to the thorny issue of mutation-induced drug resistance.
Affiliation(s)
- Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Tianyue Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Ning Wang
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou, China
| | - Xiaorui Wang
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou, China
| | - Huifeng Zhao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Jialu Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Ercheng Wang
- Zhejiang Lab, Zhejiang University, Hangzhou, China
| | | | - Yafeng Deng
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou, China
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
| | - Chang-Yu Hsieh
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
| |
|
46
|
Dong L, Shi S, Qu X, Luo D, Wang B. Ligand binding affinity prediction with fusion of graph neural networks and 3D structure-based complex graph. Phys Chem Chem Phys 2023; 25:24110-24120. [PMID: 37655493 DOI: 10.1039/d3cp03651k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Accurate prediction of protein-ligand binding affinity is pivotal for drug design and discovery. Here, we proposed a novel deep fusion graph neural networks framework named FGNN to learn the protein-ligand interactions from the 3D structures of protein-ligand complexes. Unlike 1D sequences for proteins or 2D graphs for ligands, the 3D graph of protein-ligand complex enables the more accurate representations of the protein-ligand interactions. Benchmark studies have shown that our fusion models FGNN can achieve more accurate prediction of binding affinity than any individual algorithm. The advantages of fusion strategies have been demonstrated in terms of expressive power of data, learning efficiency and model interpretability. Our fusion models show satisfactory performances on diverse data sets, demonstrating their generalization ability. Given the good performances in both binding affinity prediction and virtual screening, our fusion models are expected to be practically applied for drug screening and design. Our work highlights the potential of the fusion graph neural network algorithm in solving complex prediction problems in computational biology and chemistry. The fusion graph neural networks (FGNN) model is freely available in https://github.com/LinaDongXMU/FGNN.
Affiliation(s)
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Shuai Shi
- Department of Algorithm, TuringQ Co., Ltd., Shanghai, 200240, China
| | - Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen, 361005, China
| |
|
47
|
Kamerlin SCL. Progress in using deep learning to treat cancer. Nat Comput Sci 2023; 3:739-740. [PMID: 38177785 DOI: 10.1038/s43588-023-00514-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
|
48
|
Zhang X, Zhang O, Shen C, Qu W, Chen S, Cao H, Kang Y, Wang Z, Wang E, Zhang J, Deng Y, Liu F, Wang T, Du H, Wang L, Pan P, Chen G, Hsieh CY, Hou T. Efficient and accurate large library ligand docking with KarmaDock. Nat Comput Sci 2023; 3:789-804. [PMID: 38177786 DOI: 10.1038/s43588-023-00511-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/20/2023] [Accepted: 08/08/2023] [Indexed: 01/06/2024]
Abstract
Ligand docking is one of the core technologies in structure-based virtual screening for drug discovery. However, conventional docking tools and existing deep learning tools may suffer from limited performance in terms of speed, pose quality and binding affinity accuracy. Here we propose KarmaDock, a deep learning approach for ligand docking that integrates the functions of docking acceleration, binding pose generation and correction, and binding strength estimation. The three-stage model consists of the following components: (1) encoders for the protein and ligand to learn the representations of intramolecular interactions; (2) E(n) equivariant graph neural networks with self-attention to update the ligand pose based on both protein-ligand and intramolecular interactions, followed by post-processing to ensure chemically plausible structures; (3) a mixture density network for scoring the binding strength. KarmaDock was validated on four benchmark datasets and tested in a real-world virtual screening project that successfully identified experiment-validated active inhibitors of leukocyte tyrosine kinase (LTK).
Affiliation(s)
- Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Wanglin Qu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Shicheng Chen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Hanqun Cao
- Department of Mathematics, Chinese University of Hong Kong, Hong Kong, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | | | - Jintu Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Yafeng Deng
- Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou, Zhejiang, China
| | - Furui Liu
- Zhejiang Lab, Hangzhou, Zhejiang, China
| | - Tianyue Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Hongyan Du
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
| | - Langcheng Wang
- Department of Pathology, New York University Medical Center, New York, NY, USA
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China.
| | | | - Chang-Yu Hsieh
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China.
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China.
| |
|
49
|
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
| | - Karl N. Kirschner
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
| |
|
50
|
Wang T, Tong J, Zhang X, Luo H, Xu L, Wang Z. In silico screening and computational evaluation of novel promising USP14 inhibitors targeting the palm-thumb pocket. Phys Chem Chem Phys 2023; 25:20903-20916. [PMID: 37527190 DOI: 10.1039/d3cp02537c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Protein degradation and synthesis are essential for regulating various biological activities within the body. As a member of deubiquitinating enzymes (DUBs), ubiquitin-specific protease 14 (USP14) plays a critical role in regulating protein degradation and maintaining cellular protein homeostasis. However, abnormal expression of USP14 has been associated with a variety of malignant tumors and other diseases. In this study, we conducted hierarchical virtual screening against the palm-thumb pocket of USP14, which resulted in the identification of two promising hits with novel scaffolds. We systematically evaluated the potential of these two hits in terms of their binding affinity and selectivity at the computational level. The results indicated that they had stronger binding affinities than previously reported molecules, as evidenced by lower docking scores and binding free energies. The binding stability analysis and hotspot residue prediction based on the MD simulations further revealed that they were capable of stably binding to the palm-thumb pocket of USP14 via crucial interactions with the residues GLN197, TYR476, ASP199, PHE331, TYR436 and HIS426. More importantly, both candidates exhibit higher selectivity for USP14 over several other USP family members (USP5, USP7 and USP15). Our findings are hoped to be a good starting point for the development of selective USP14 inhibitors.
Affiliation(s)
- Tianhao Wang
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, P. R. China.
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China.
| | - Jianbo Tong
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, P. R. China.
| | - Xing Zhang
- College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xi'an 710021, P. R. China.
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China.
| | - Hao Luo
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China.
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China.
| |
|