1
Cao D, Chen M, Zhang R, Wang Z, Huang M, Yu J, Jiang X, Fan Z, Zhang W, Zhou H, Li X, Fu Z, Zhang S, Zheng M. SurfDock is a surface-informed diffusion generative model for reliable and accurate protein-ligand complex prediction. Nat Methods 2024:10.1038/s41592-024-02516-y. [PMID: 39604569 DOI: 10.1038/s41592-024-02516-y] [Received: 01/19/2024] [Accepted: 10/16/2024] [Indexed: 11/29/2024]
Abstract
Accurately predicting protein-ligand interactions is crucial for understanding cellular processes. We introduce SurfDock, a deep-learning method that addresses this challenge by integrating protein sequence, three-dimensional structural graphs and surface-level features into an equivariant architecture. SurfDock employs a generative diffusion model on a non-Euclidean manifold, optimizing molecular translations, rotations and torsions to generate reliable binding poses. Our extensive evaluations across various benchmarks demonstrate SurfDock's superiority over existing methods in docking success rates and adherence to physical constraints. It also exhibits remarkable generalizability to unseen proteins and predicted apo structures, while achieving state-of-the-art performance in virtual screening tasks. In a real-world application, SurfDock identified seven novel hit molecules in a virtual screening project targeting aldehyde dehydrogenase 1B1, a key enzyme in cellular metabolism. This showcases SurfDock's ability to elucidate molecular mechanisms underlying cellular processes. These results highlight SurfDock's potential as a transformative tool in structural biology, offering enhanced accuracy, physical plausibility and practical applicability in understanding protein-ligand interactions.
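The abstract reports "docking success rates" without defining the metric. A minimal sketch follows, assuming the common convention that a pose counts as a success when its RMSD to the reference pose is at most 2 Å. The function names `rmsd` and `success_rate`, the threshold, and the toy coordinates are illustrative assumptions and are not taken from SurfDock itself; symmetry correction and atom matching are deliberately omitted.

```python
import math

def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(squared / len(pred))

def success_rate(pairs, threshold=2.0):
    """Fraction of (predicted, reference) pose pairs with RMSD <= threshold.

    The 2.0 angstrom default is an assumed convention, not a value from the abstract.
    """
    if not pairs:
        raise ValueError("need at least one pose pair")
    hits = sum(1 for pred, ref in pairs if rmsd(pred, ref) <= threshold)
    return hits / len(pairs)

# Toy two-atom ligand: one pose shifted by 0.5 angstrom, one shifted by 3 angstroms.
ref_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
close_pose = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
far_pose = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]

rate = success_rate([(close_pose, ref_pose), (far_pose, ref_pose)])
print(f"success rate: {rate:.2f}")
```

In this toy case only the first pose falls under the threshold, so half of the poses count as successes.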
Affiliation(s)
- Duanhua Cao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Mingan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai, China
- Runze Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhaokun Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Manlin Huang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Nanchang University, Nanchang, China
- Jie Yu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Lingang Laboratory, Shanghai, China
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Xinyu Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Wei Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Hao Zhou
- Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
2
Nordquist EB, Zhao M, Kumar A, MacKerell AD. Combined Physics- and Machine-Learning-Based Method to Identify Druggable Binding Sites Using SILCS-Hotspots. J Chem Inf Model 2024; 64:7743-7757. [PMID: 39283165 PMCID: PMC11473228 DOI: 10.1021/acs.jcim.4c01189] [Indexed: 10/15/2024]
Abstract
Identifying druggable binding sites on proteins is an important and challenging problem, particularly for cryptic, allosteric binding sites that may not be obvious from X-ray, cryo-EM, or predicted structures. The Site-Identification by Ligand Competitive Saturation (SILCS) method accounts for the flexibility of the target protein using all-atom molecular simulations that include various small molecule solutes in aqueous solution. During the simulations, the combination of protein flexibility and comprehensive sampling of the water and solute spatial distributions can identify buried binding pockets absent in experimentally determined structures. Previously, we reported a method for leveraging the information in the SILCS sampling to identify binding sites (termed Hotspots) of small mono- or bicyclic compounds, a subset of which coincide with known binding sites of drug-like molecules. Here, we build on that physics-based approach and present an ML model for ranking the Hotspots according to the likelihood they can accommodate drug-like molecules (e.g., molecular weight >200 Da). In the independent validation set, which includes various enzymes and receptors, our model recalls 67% and 89% of experimentally validated ligand binding sites in the top 10 and 20 ranked Hotspots, respectively. Furthermore, we show that the model's output Decision Function is a useful metric to predict binding sites and their potential druggability in new targets. Given the utility of the SILCS method for ligand discovery and optimization, the tools presented represent an important advancement in the identification of orthosteric and allosteric binding sites and the discovery of drug-like molecules targeting those sites.
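The recall figures quoted for the top 10 and top 20 ranked Hotspots can be read as a top-k recall. The sketch below shows one way such a number could be computed, assuming each known binding site is described by the set of Hotspot ids that overlap it and that a site counts as recalled when any of those ids appears among the k highest-scoring Hotspots. The names `top_k` and `recall_at_k`, the data layout, and the toy scores are hypothetical; the authors' actual matching criterion is not given in the abstract.

```python
def top_k(scores, k):
    """Return the ids of the k highest-scoring Hotspots.

    scores maps a Hotspot id to a ranking score (for example, a decision-function value).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def recall_at_k(scores, sites, k):
    """Fraction of known sites matched by at least one top-k Hotspot.

    sites is a list of sets; each set holds the Hotspot ids overlapping one known site.
    """
    if not sites:
        raise ValueError("need at least one known binding site")
    selected = top_k(scores, k)
    found = sum(1 for site in sites if site & selected)
    return found / len(sites)

# Toy example: five Hotspots, two known binding sites.
toy_scores = {"h1": 2.1, "h2": 0.4, "h3": -0.3, "h4": 1.2, "h5": -1.0}
toy_sites = [{"h4"}, {"h3", "h5"}]

for k in (2, 4):
    print(k, recall_at_k(toy_scores, toy_sites, k))
```

Raising k can only keep or increase the recall, which matches the pattern in the abstract where the top-20 figure exceeds the top-10 figure.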
Affiliation(s)
- Erik B. Nordquist
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, Baltimore, Maryland 21201, United States
- Mingtian Zhao
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, Baltimore, Maryland 21201, United States
- Anmol Kumar
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, Baltimore, Maryland 21201, United States
- Alexander D. MacKerell
- Computer Aided Drug Design Center, Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, Baltimore, Maryland 21201, United States
3
Eissa I, Yousef RG, Elkaeed EB, Alsfouk AA, Husein DZ, Ibrahim IM, Ismail A, Elkady H, Metwaly AM. New Theobromine Apoptotic Analogue with Anticancer Potential Targeting the EGFR Protein: Computational and In Vitro Studies. ACS Omega 2024; 9:15861-15881. [PMID: 38617602 PMCID: PMC11007702 DOI: 10.1021/acsomega.3c08148] [Received: 10/17/2023] [Revised: 03/06/2024] [Accepted: 03/12/2024] [Indexed: 04/16/2024]
Abstract
AIM: The aim of this study was to design and examine a novel epidermal growth factor receptor (EGFR) inhibitor with apoptotic properties by utilizing the essential structural characteristics of existing EGFR inhibitors as a foundation. METHOD: The study began with the natural alkaloid theobromine and developed a new semisynthetic derivative (T-1-PMPA). Computational ADMET assessments were conducted first to evaluate its anticipated safety and general drug-likeness. Deep density functional theory (DFT) computations were performed to validate the three-dimensional (3D) structure and reactivity of T-1-PMPA. Molecular docking against the EGFR proteins was conducted to investigate T-1-PMPA's binding affinity and inhibitory potential. Additional molecular dynamics (MD) simulations over 200 ns, along with MM-GBSA, PLIP, and principal component analysis of trajectories (PCAT) analyses, were employed to verify the binding and inhibitory properties of T-1-PMPA. Afterward, T-1-PMPA was semisynthesized to validate the proposed design and in silico findings through several in vitro examinations. RESULTS: DFT studies indicated T-1-PMPA's reactivity using electrostatic potential, global reactive indices, and total density of states. Molecular docking, MD simulations, MM-GBSA, PLIP, and ED suggested the binding and inhibitory properties of T-1-PMPA against the EGFR protein. The in silico ADMET predicted T-1-PMPA's safety and general drug-likeness. In vitro experiments demonstrated that T-1-PMPA effectively inhibited EGFRWT and EGFR790m, with IC50 values of 86 and 561 nM, respectively, compared to Erlotinib (31 and 456 nM). T-1-PMPA also showed significant suppression of the proliferation of HepG2 and MCF7 malignant cell lines, with IC50 values of 3.51 and 4.13 μM, respectively. The selectivity indices against the two cancer cell lines indicated the overall safety of T-1-PMPA. Flow cytometry confirmed the apoptotic effects of T-1-PMPA by increasing the total percentage of apoptosis to 42%, compared with 31% and 3% in Erlotinib-treated and control cells, respectively. The qRT-PCR analysis further supported the apoptotic effects by revealing significant increases in the levels of Casp3 and Casp9. Additionally, T-1-PMPA controlled the levels of TNFα and IL2 by 74% and 50%, respectively, compared with Erlotinib's values (84% and 74%). CONCLUSION: Our study's findings suggest the potential of T-1-PMPA as a promising apoptotic anticancer lead compound targeting the EGFR.
Affiliation(s)
- Ibrahim H. Eissa
- Pharmaceutical Medicinal Chemistry & Drug Design Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo 11884, Egypt
- Reda G. Yousef
- Pharmaceutical Medicinal Chemistry & Drug Design Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo 11884, Egypt
- Eslam B. Elkaeed
- Department of Pharmaceutical Sciences, College of Pharmacy, AlMaarefa University, Riyadh 13713, Saudi Arabia
- Aisha A. Alsfouk
- Department of Pharmaceutical Sciences, College of Pharmacy, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Dalal Z. Husein
- Chemistry Department, Faculty of Science, New Valley University, El-Kharja 72511, Egypt
- Ibrahim M. Ibrahim
- Biophysics Department, Faculty of Science, Cairo University, Giza 12613, Egypt
- Ahmed Ismail
- Biochemistry and Molecular Biology Department, Faculty of Pharmacy, Al-Azhar University, Cairo 11884, Egypt
- Hazem Elkady
- Pharmaceutical Medicinal Chemistry & Drug Design Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo 11884, Egypt
- Ahmed M. Metwaly
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo 11884, Egypt
- Biopharmaceutical Products Research Department, Genetic Engineering and Biotechnology Research Institute, City of Scientific Research and Technological Applications (SRTA-City), Alexandria 21934, Egypt
4
Luo D, Liu D, Qu X, Dong L, Wang B. Enhancing Generalizability in Protein-Ligand Binding Affinity Prediction with Multimodal Contrastive Learning. J Chem Inf Model 2024; 64:1892-1906. [PMID: 38441880 DOI: 10.1021/acs.jcim.3c01961] [Indexed: 03/26/2024]
Abstract
Improving the generalization ability of scoring functions remains a major challenge in protein-ligand binding affinity prediction. Many machine learning methods are limited by their reliance on single-modal representations, hindering a comprehensive understanding of protein-ligand interactions. We introduce a graph-neural-network-based scoring function that utilizes a triplet contrastive learning loss to improve protein-ligand representations. In this model, three-dimensional complex representations and the fusion of two-dimensional ligand and coarse-grained pocket representations converge while distancing from decoy representations in latent space. After rigorous validation on multiple external data sets, our model exhibits commendable generalization capabilities compared to those of other deep learning-based scoring functions, marking it as a promising tool in the realm of drug discovery. In the future, our training framework can be extended to other biophysical- and biochemical-related problems such as protein-protein interaction and protein mutation prediction.
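The triplet contrastive objective described above pulls two views of the same complex together while pushing a decoy away. A minimal sketch of the standard triplet margin loss follows, assuming Euclidean distance and a margin of 1.0; the abstract does not state the distance function, the margin, or how the embeddings are produced, so those choices and the names `euclidean` and `triplet_loss` are illustrative assumptions. Here the anchor stands for the three-dimensional complex embedding, the positive for the fused two-dimensional ligand and pocket embedding, and the negative for a decoy embedding.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the decoy is at least `margin` farther from the anchor
    than the positive is. The margin value is an assumption for illustration.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy 2-D embeddings (hypothetical values).
complex_3d = [0.0, 0.0]      # anchor: 3D complex representation
fused_2d = [0.3, 0.4]        # positive: fused 2D ligand + pocket representation
easy_decoy = [3.0, 4.0]      # negative far from the anchor
hard_decoy = [0.6, 0.8]      # negative close to the anchor

print(triplet_loss(complex_3d, fused_2d, easy_decoy))
print(triplet_loss(complex_3d, fused_2d, hard_decoy))
```

The easy decoy already sits well outside the margin and contributes no loss, whereas the hard decoy yields a positive loss, which is the signal that would push its representation away during training.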
Affiliation(s)
- Ding Luo
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Dandan Liu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Xiaoyang Qu
- School of Pharmacy and Medical Technology, Putian University, Putian 351100, P. R. China
- Key Laboratory of Pharmaceutical Analysis and Laboratory Medicine (Putian University), Fujian Province University, Putian 351100, P. R. China
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
5
Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024; 64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Indexed: 02/23/2024]
Abstract
Developing new drugs is expensive and time-consuming. Accurately predicting the interaction between drugs and targets will likely change how drugs are discovered. Machine learning-based protein-ligand interaction prediction has demonstrated significant potential. This paper examines computational methods that use sequence and structure to study protein-ligand interactions. It starts with an overview of the data sets applied in this area, as well as the various approaches for representing proteins and ligands. Sequence-based and structure-based classification criteria are then used to categorize and summarize both the classical machine learning models and the deep learning models employed in protein-ligand interaction studies. The evaluation methods and interpretability of these models are also discussed, followed by the diverse applications of protein-ligand interaction models in drug research. Lastly, the current challenges and future directions in this field are addressed.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Kong Meng
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
- Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
6
Wang M, Wu Z, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Yao X, Bing Z, Hsieh CY, Hou T. Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space. J Chem Inf Model 2024; 64:1213-1228. [PMID: 38302422 DOI: 10.1021/acs.jcim.3c01964] [Indexed: 02/03/2024]
Abstract
Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.
Affiliation(s)
- Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Zhengjian Wu
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China
- Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Gaoqi Weng
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Peichen Pan
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Dan Li
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau Institute for Applied Research in Medicine and Health, State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa, Macau 999078, China
- Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
- Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
7
Lindley S, Lu Y, Shukla D. The Experimentalist's Guide to Machine Learning for Small Molecule Design. ACS Applied Bio Materials 2024; 7:657-684. [PMID: 37535819 PMCID: PMC10880109 DOI: 10.1021/acsabm.3c00054] [Received: 01/19/2023] [Accepted: 07/17/2023] [Indexed: 08/05/2023]
Abstract
Initially part of the field of artificial intelligence, machine learning (ML) has become a booming research area since branching out into its own field in the 1990s. After three decades of refinement, ML algorithms have accelerated scientific developments across a variety of research topics. The field of small molecule design is no exception, and an increasing number of researchers are applying ML techniques in their pursuit of discovering, generating, and optimizing small molecule compounds. The goal of this review is to provide simple, yet descriptive, explanations of some of the most commonly utilized ML algorithms in the field of small molecule design along with those that are highly applicable to an experimentally focused audience. The algorithms discussed here span across three ML paradigms: supervised learning, unsupervised learning, and ensemble methods. Examples from the published literature will be provided for each algorithm. Some common pitfalls of applying ML to biological and chemical data sets will also be explained, alongside a brief summary of a few more advanced paradigms, including reinforcement learning and semi-supervised learning.
Affiliation(s)
- Sarah E. Lindley
- Department of Bioengineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States
- Yiyang Lu
- Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States
- Diwakar Shukla
- Department of Bioengineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States
- Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States
- Center for Biophysics & Computational Biology, University of Illinois, Urbana-Champaign, Illinois 61801, United States
- Department of Plant Biology, University of Illinois, Urbana-Champaign, Illinois 61801, United States
8
Metwaly A, Saleh MM, Alsfouk A, Ibrahim IM, Abd-Elraouf M, Elkaeed E, Elkady H, Eissa I. In silico and in vitro evaluation of the anti-virulence potential of patuletin, a natural methoxy flavone, against Pseudomonas aeruginosa. PeerJ 2024; 12:e16826. [PMID: 38313021 PMCID: PMC10838535 DOI: 10.7717/peerj.16826] [Received: 09/13/2023] [Accepted: 01/02/2024] [Indexed: 02/06/2024] Open
Abstract
This study aimed to investigate the potential of patuletin, a rare natural flavonoid, as a virulence and LasR inhibitor against Pseudomonas aeruginosa. Various computational studies were utilized to explore the binding of patuletin and LasR at a molecular level. Molecular docking revealed that patuletin strongly interacted with the active pocket of LasR, with a high binding affinity value of -20.96 kcal/mol. Further molecular dynamics simulations, molecular mechanics generalized Born surface area (MM/GBSA), protein-ligand interaction profile (PLIP), and essential dynamics analyses confirmed the stability of the patuletin-LasR complex, and no significant structural changes were observed in the LasR protein upon binding. Key amino acids involved in binding were identified, along with a free energy value of -26.9 kcal/mol. In vitro assays were performed to assess patuletin's effects on P. aeruginosa. At a sub-inhibitory concentration (1/4 MIC), patuletin significantly reduced biofilm formation by 48% and 42%, decreased pyocyanin production by 24% and 14%, and decreased proteolytic activities by 42% and 20% in P. aeruginosa isolate ATCC 27853 (PA27853) and P. aeruginosa clinical isolate (PA1), respectively. In summary, this study demonstrated that patuletin effectively inhibited LasR activity in silico and attenuated virulence factors in vitro, including biofilm formation, pyocyanin production, and proteolytic activity. These findings suggest that patuletin holds promise as a potential therapeutic agent in combination with antibiotics to combat antibiotic-tolerant P. aeruginosa infections.
Affiliation(s)
- Ahmed Metwaly
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
- City of Scientific Research and Technological Applications (SRTA-City), Biopharmaceutical Products Research Department, Genetic Engineering and Biotechnology Research Institute, Alexandria, Egypt
- Moustafa M. Saleh
- Microbiology and Immunology Department, Faculty of Pharmacy, Port Said University, Port Said, Egypt
- Aisha Alsfouk
- Department of Pharmaceutical Sciences, College of Pharmacy, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Ibrahim M. Ibrahim
- Biophysics Department, Faculty of Science, Cairo University, Giza, Egypt
- Muhamad Abd-Elraouf
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
- Eslam Elkaeed
- Department of Pharmaceutical Sciences, College of Pharmacy, AlMaarefa University, Riyadh, Saudi Arabia
- Hazem Elkady
- Pharmacognosy and Medicinal Plants Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
- Ibrahim Eissa
- Pharmaceutical Medicinal Chemistry & Drug Design Department, Faculty of Pharmacy (Boys), Al-Azhar University, Cairo, Egypt
9
Li Y, Fan Z, Rao J, Chen Z, Chu Q, Zheng M, Li X. An overview of recent advances and challenges in predicting compound-protein interaction (CPI). Medical Review (2021) 2023; 3:465-486. [PMID: 38282802 PMCID: PMC10808869 DOI: 10.1515/mr-2023-0030] [Received: 07/18/2023] [Accepted: 08/30/2023] [Indexed: 01/30/2024]
Abstract
Compound-protein interactions (CPIs) are critical in drug discovery for identifying therapeutic targets, drug side effects, and repurposing existing drugs. Machine learning (ML) algorithms have emerged as powerful tools for CPI prediction, offering notable advantages in cost-effectiveness and efficiency. This review provides an overview of recent advances in both structure-based and non-structure-based CPI prediction ML models, highlighting their performance and achievements. It also offers insights into CPI prediction-related datasets and evaluation benchmarks. Lastly, the article presents a comprehensive assessment of the current landscape of CPI prediction, elucidating the challenges faced and outlining emerging trends to advance the field.
Affiliation(s)
- Yanbei Li
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhiyi Chen
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Qinyu Chu
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
10
Libouban PY, Aci-Sèche S, Gómez-Tamayo JC, Tresadern G, Bonnet P. The Impact of Data on Structure-Based Binding Affinity Predictions Using Deep Neural Networks. Int J Mol Sci 2023; 24:16120. [PMID: 38003312 PMCID: PMC10671244 DOI: 10.3390/ijms242216120] [Received: 09/14/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023] Open
Abstract
Artificial intelligence (AI) has gained significant traction in the field of drug discovery, with deep learning (DL) algorithms playing a crucial role in predicting protein-ligand binding affinities. Despite advancements in neural network architectures, system representation, and training techniques, the performance of DL affinity prediction has reached a plateau, prompting the question of whether it is truly solved or if the current performance is overly optimistic and reliant on biased, easily predictable data. Like other DL-related problems, this issue seems to stem from the training and test sets used when building the models. In this work, we investigate the impact of several parameters related to the input data on the performance of neural network affinity prediction models. Notably, we identify the size of the binding pocket as a critical factor influencing the performance of our statistical models; furthermore, it is more important to train a model with as much data as possible than to restrict the training to only high-quality datasets. Finally, we also confirm the bias in the typically used current test sets. Therefore, several types of evaluation and benchmarking are required to understand models' decision-making processes and accurately compare the performance of models.
Affiliation(s)
- Pierre-Yves Libouban
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
| | - Samia Aci-Sèche
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
| | - Jose Carlos Gómez-Tamayo
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium; (J.C.G.-T.); (G.T.)
| | - Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., B-2340 Beerse, Belgium; (J.C.G.-T.); (G.T.)
| | - Pascal Bonnet
- Institute of Organic and Analytical Chemistry (ICOA), UMR7311, Université d’Orléans, CNRS, Pôle de Chimie rue de Chartres, 45067 Orléans, CEDEX 2, France; (P.-Y.L.); (S.A.-S.)
| |
|
11
|
Tran-Nguyen VK, Junaid M, Simeon S, Ballester PJ. A practical guide to machine-learning scoring for structure-based virtual screening. Nat Protoc 2023; 18:3460-3511. [PMID: 37845361 DOI: 10.1038/s41596-023-00885-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/08/2022] [Accepted: 07/03/2023] [Indexed: 10/18/2023]
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol, can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
Affiliation(s)
| | - Muhammad Junaid
- Centre de Recherche en Cancérologie de Marseille, Marseille, France
| | - Saw Simeon
- Centre de Recherche en Cancérologie de Marseille, Marseille, France
| | | |
|
12
|
Szulc NA, Mackiewicz Z, Bujnicki JM, Stefaniak F. Structural interaction fingerprints and machine learning for predicting and explaining binding of small molecule ligands to RNA. Brief Bioinform 2023; 24:bbad187. [PMID: 37204195 DOI: 10.1093/bib/bbad187] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/27/2022] [Revised: 04/07/2023] [Accepted: 04/25/2023] [Indexed: 05/20/2023] Open
Abstract
Ribonucleic acids (RNAs) play crucial roles in living organisms and some of them, such as bacterial ribosomes and precursor messenger RNA, are targets of small molecule drugs, whereas others, e.g. bacterial riboswitches or viral RNA motifs, are considered as potential therapeutic targets. Thus, the continuous discovery of new functional RNA increases the demand for developing compounds targeting them and for methods for analyzing RNA-small molecule interactions. We recently developed fingeRNAt, a software for detecting non-covalent bonds formed within complexes of nucleic acids with different types of ligands. The program detects several non-covalent interactions and encodes them as structural interaction fingerprint (SIFt). Here, we present the application of SIFts accompanied by machine learning methods for binding prediction of small molecules to RNA. We show that SIFt-based models outperform the classic, general-purpose scoring functions in virtual screening. We also employed Explainable Artificial Intelligence (XAI) methods (the SHapley Additive exPlanations, Local Interpretable Model-agnostic Explanations and others) to help understand the decision-making process behind the predictive models. We conducted a case study in which we applied XAI on a predictive model of ligand binding to human immunodeficiency virus type 1 trans-activation response element RNA to distinguish between residues and interaction types important for binding. We also used XAI to indicate whether an interaction has a positive or negative effect on binding prediction and to quantify its impact. Our results obtained using all XAI methods were consistent with the literature data, demonstrating the utility and importance of XAI in medicinal chemistry and bioinformatics.
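The encoding step described in this abstract can be pictured with a small sketch. It is only an illustration of the general idea of a structural interaction fingerprint (one bit per residue and interaction type); the interaction types, residue labels, and the function name `encode_sift` are hypothetical and do not reflect the actual fingeRNAt output format.

```python
# Hypothetical interaction types, for illustration only;
# this is NOT the fingeRNAt output format.
INTERACTION_TYPES = ("hydrogen_bond", "pi_stacking", "ionic")


def encode_sift(residues, contacts):
    """Return a flat 0/1 list with one block of bits per residue.

    residues: ordered list of residue labels.
    contacts: iterable of (residue_label, interaction_type) pairs
              detected between the ligand and the nucleic acid.
    """
    observed = set(contacts)
    return [
        1 if (residue, itype) in observed else 0
        for residue in residues
        for itype in INTERACTION_TYPES
    ]


# Made-up residue labels and contacts for two ligands.
residues = ["G26", "A27", "U23"]
ligand_a = encode_sift(residues, [("G26", "hydrogen_bond"), ("A27", "pi_stacking")])
ligand_b = encode_sift(residues, [("G26", "hydrogen_bond"), ("U23", "ionic")])
```

Fixed-length bit vectors of this kind can be passed as feature rows to a standard classifier, and each bit stays attributable to a specific residue and interaction type, which is what makes per-feature explanation methods applicable.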
Affiliation(s)
- Natalia A Szulc
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
- Laboratory of Protein Metabolism, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
| | - Zuzanna Mackiewicz
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
- Laboratory of RNA Biology - ERA Chairs Group, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
| | - Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
| | - Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Str, 02-109 Warsaw, Poland
| |
|
13
|
Zhang X, Shen C, Jiang D, Zhang J, Ye Q, Xu L, Hou T, Pan P, Kang Y. TB-IECS: an accurate machine learning-based scoring function for virtual screening. J Cheminform 2023; 15:63. [PMID: 37403155 DOI: 10.1186/s13321-023-00731-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/07/2023] [Accepted: 06/18/2023] [Indexed: 07/06/2023] Open
Abstract
Machine learning-based scoring functions (MLSFs) have shown potential for improving virtual screening capabilities over classical scoring functions (SFs). Due to the high computational cost in the process of feature generation, the numbers of descriptors used in MLSFs and the characterization of protein-ligand interactions are always limited, which may affect the overall accuracy and efficiency. Here, we propose a new SF called TB-IECS (theory-based interaction energy component score), which combines energy terms from Smina and NNScore version 2, and utilizes the eXtreme Gradient Boosting (XGBoost) algorithm for model training. In this study, the energy terms decomposed from 15 traditional SFs were firstly categorized based on their formulas and physicochemical principles, and 324 feature combinations were generated accordingly. Five best feature combinations were selected for further evaluation of the model performance in regard to the selection of feature vectors with various length, interaction types and ML algorithms. The virtual screening power of TB-IECS was assessed on the datasets of DUD-E and LIT-PCBA, as well as seven target-specific datasets from the ChemDiv database. The results showed that TB-IECS outperformed classical SFs including Glide SP and Dock, and effectively balanced the efficiency and accuracy for practical virtual screening.
Affiliation(s)
- Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Jintu Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Qing Ye
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, 213001, China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| |
|
14
|
Cui Z, Zhang N, Zhou T, Zhou X, Meng H, Yu Y, Zhang Z, Zhang Y, Wang W, Liu Y. Conserved Sites and Recognition Mechanisms of T1R1 and T2R14 Receptors Revealed by Ensemble Docking and Molecular Descriptors and Fingerprints Combined with Machine Learning. J Agric Food Chem 2023; 71:5630-5645. [PMID: 37005743 DOI: 10.1021/acs.jafc.3c00591] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 06/19/2023]
Abstract
Taste peptides, as an important component of protein-rich foodstuffs, potentiate the nutrition and taste of food. Among them, umami- and bitter-taste peptides have been extensively reported, while their taste mechanisms remain unclear. Meanwhile, the identification of taste peptides is still a time-consuming and costly task. In this study, 489 peptides with umami/bitter taste from TPDB (http://tastepeptides-meta.com/) were collected and used to train the classification models based on docking analysis, molecular descriptors (MDs), and molecular fingerprints (FPs). A consensus model, taste peptide docking machine (TPDM), was generated based on five learning algorithms (linear regression, random forest, Gaussian naive Bayes, gradient boosting tree, and stochastic gradient descent) and four molecular representation schemes. Model interpretive analysis showed that MDs (VSA_EState, MinEstateIndex, MolLogP) and FPs (598, 322, 952) had the greatest impact on the umami/bitter prediction of peptides. Based on the consensus docking results, we obtained the key recognition modes of umami/bitter receptors (T1Rs/T2Rs): (1) residues 107S-109S, 148S-154T, 247F-249A mainly form hydrogen bonding contacts and (2) residues 153A-158L, 163L, 181Q, 218D, 247F-249A in T1R1 and 56D, 106P, 107V, 152V-156F, 173K-180F in T2R14 constituted their hydrogen bond pockets. The model is available at http://www.tastepeptides-meta.com/yyds.
Affiliation(s)
- Zhiyong Cui
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ninglong Zhang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Tianxing Zhou
- Department of Bioinformatics, Faculty of Science, The University of Melbourne, Parkville 3010, Victoria, Australia
| | - Xueke Zhou
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Hengli Meng
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanyang Yu
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Zhiwei Zhang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yin Zhang
- Key Laboratory of Meat Processing of Sichuan, Chengdu University, Chengdu 610106, China
| | - Wenli Wang
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yuan Liu
- Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
|
15
|
Gu S, Shen C, Yu J, Zhao H, Liu H, Liu L, Sheng R, Xu L, Wang Z, Hou T, Kang Y. Can molecular dynamics simulations improve predictions of protein-ligand binding affinity with machine learning? Brief Bioinform 2023; 24:6995375. [PMID: 36681903 DOI: 10.1093/bib/bbad008] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/08/2022] [Revised: 12/04/2022] [Accepted: 12/30/2022] [Indexed: 01/23/2023] Open
Abstract
Binding affinity prediction largely determines the discovery efficiency of lead compounds in drug discovery. Recently, machine learning (ML)-based approaches have attracted much attention in hopes of enhancing the predictive performance of traditional physics-based approaches. In this study, we evaluated the impact of structural dynamic information on the binding affinity prediction by comparing the models trained on different dimensional descriptors, using three targets (i.e. JAK1, TAF1-BD2 and DDR1) and their corresponding ligands as the examples. Here, 2D descriptors are traditional ECFP4 fingerprints, 3D descriptors are the energy terms of the Smina and NNscore scoring functions and 4D descriptors contain the structural dynamic information derived from the trajectories based on molecular dynamics (MD) simulations. We systematically investigate the MD-refined binding affinity prediction performance of three classical ML algorithms (i.e. RF, SVR and XGB) as well as two common virtual screening methods, namely Glide docking and MM/PBSA. The outcomes of the ML models built using various dimensional descriptors and their combinations reveal that the MD refinement with the optimized protocol can improve the predictive performance on the TAF1-BD2 target with considerable structural flexibility, but not for the less flexible JAK1 and DDR1 targets, when taking docking poses as the initial structure instead of the crystal structures. The results highlight the importance of the initial structures to the final performance of the model through conformational analysis on the three targets with different flexibility.
Affiliation(s)
- Shukai Gu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Jiahui Yu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Hong Zhao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Huanxiang Liu
- Faculty of Applied Science, Macao Polytechnic University, Macao, SAR, China
| | - Liwei Liu
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Shenzhen 518129, Guangdong, China
| | - Rong Sheng
- Health Technology Development Dept, Huawei Device Co., Ltd., Dongguan 523808, Guangdong, China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
16
|
Wang L, Shi SH, Li H, Zeng XX, Liu SY, Liu ZQ, Deng YF, Lu AP, Hou TJ, Cao DS. Reducing false positive rate of docking-based virtual screening by active learning. Brief Bioinform 2023; 24:6987822. [PMID: 36642412 DOI: 10.1093/bib/bbac626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 12/10/2022] [Accepted: 12/20/2022] [Indexed: 01/17/2023] Open
Abstract
Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.
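The comparison reported in this abstract (how many active molecules appear among the top-ranked compounds) can be computed with a few lines. The sketch below is a generic illustration with made-up data and assumes that a higher score means a better rank; the function name `top_n_summary` is hypothetical and this is not the AMLSF code. The second value it returns is the share of inactives among the top N, which is related to, but not the same as, the formal false positive rate FP / (FP + TN).

```python
def top_n_summary(scored, n):
    """Count actives among the n best-scored molecules.

    scored: list of (molecule_id, score, is_active) tuples.
    Assumes a higher score means a better rank.
    Returns (number_of_actives, fraction_of_inactives) for the top n.
    """
    ranked = sorted(scored, key=lambda row: row[1], reverse=True)
    top = ranked[:n]
    actives = sum(1 for _, _, is_active in top if is_active)
    inactive_fraction = (len(top) - actives) / len(top) if top else 0.0
    return actives, inactive_fraction


# Made-up screening results, for illustration only.
library = [
    ("m1", 9.1, True),
    ("m2", 8.7, False),
    ("m3", 8.2, True),
    ("m4", 5.0, False),
    ("m5", 4.4, True),
]
actives, inactive_fraction = top_n_summary(library, 3)
```

Running the same function on the ranked lists of two models with an identical N gives a like-for-like comparison of how many known actives each model places at the top.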
Affiliation(s)
- Lei Wang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Shao-Hua Shi
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Hui Li
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Xiang-Xiang Zeng
- Department of Computer Science, Hunan University, Changsha 410082, Hunan, China
| | - Su-You Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Zhao-Qian Liu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
| | - Ya-Feng Deng
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, Zhejiang 310018, China
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Ting-Jun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China; Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| |
|
17
|
Qu X, Dong L, Zhang J, Si Y, Wang B. Systematic Improvement of the Performance of Machine Learning Scoring Functions by Incorporating Features of Protein-Bound Water Molecules. J Chem Inf Model 2022; 62:4369-4379. [PMID: 36083808 DOI: 10.1021/acs.jcim.2c00916] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Water molecules at the ligand-protein interfaces play crucial roles in the binding of the ligands, but the behavior of protein-bound water is largely ignored in many currently used machine learning (ML)-based scoring functions (SFs). In an attempt to improve the prediction performance of existing ML-based SFs, we estimated the water distribution with a HydraMap (HM) method and then incorporated the features extracted from protein-bound waters obtained in this way into three ML-based SFs: RF-Score, ECIF, and PLEC. It was found that a combination of HM-based features can consistently improve the performance of all three SFs, including their scoring, ranking, and docking power. HydraMap-based features show consistently good performance with both crystal structures and docked structures, demonstrating their robustness for SFs. Overall, HM-based features, which are a statistical representation of hydration sites at protein-ligand interfaces, are expected to improve the prediction performance for diverse SFs.
Affiliation(s)
- Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen 361005 P. R. China
| | - Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen 361005 P. R. China
| | - Jinyan Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen 361005 P. R. China
| | - Yubing Si
- College of Chemistry, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen 361005 P. R. China
| |
|
18
|
Dong L, Qu X, Wang B. XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein-Ligand Scoring and Ranking. ACS Omega 2022; 7:21727-21735. [PMID: 35785279 PMCID: PMC9245135 DOI: 10.1021/acsomega.2c01723] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/22/2022] [Accepted: 05/30/2022] [Indexed: 06/15/2023]
Abstract
Prediction of protein-ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to the prediction of the binding affinity in protein-ligand complexes using machine learning (ML). Due to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to the classical SFs. However, the performance of ML-based SFs heavily relies on the overall similarity of the training set and the test set. To improve the performance and transferability of an SF, we have tried to combine various features including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extreme trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein-ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein-ligand complexes.
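The "scoring power" and "ranking power" mentioned in this abstract are conventionally reported in CASF-style benchmarks as a Pearson correlation and a rank (Spearman) correlation between predicted and experimental affinities. The sketch below shows both metrics on made-up numbers. It is a generic illustration, not the XLPFE evaluation code, and it computes the rank correlation over one flat list, whereas CASF computes ranking power per target cluster and averages the results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up affinities (e.g. pKd values), for illustration only.
experimental = [4.0, 5.5, 6.1, 7.3, 8.0]
predicted = [4.4, 5.0, 6.5, 6.9, 8.2]
scoring_power = pearson(predicted, experimental)
ranking_power = spearman(predicted, experimental)
```

Reporting both values is useful because a model can order ligands correctly (high rank correlation) while still being poorly calibrated in absolute terms (lower Pearson correlation).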
Affiliation(s)
- Lina Dong
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
| | - Xiaoyang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
| | - Binju Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 360015, P. R. China
| |
|
19
|
Wang M, Hsieh CY, Wang J, Wang D, Weng G, Shen C, Yao X, Bing Z, Li H, Cao D, Hou T. RELATION: A Deep Generative Model for Structure-Based De Novo Drug Design. J Med Chem 2022; 65:9478-9492. [PMID: 35713420 DOI: 10.1021/acs.jmedchem.2c00732] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/30/2023]
Abstract
Deep learning (DL)-based de novo molecular design has recently gained considerable traction. Many DL-based generative models have been successfully developed to design novel molecules, but most of them are ligand-centric and the role of the 3D geometries of target binding pockets in molecular generation has not been well-exploited. Here, we proposed a new 3D-based generative model called RELATION. In the RELATION model, the BiTL algorithm was specifically designed to extract and transfer the desired geometric features of the protein-ligand complexes to a latent space for generation. The pharmacophore conditioning and docking-based Bayesian sampling were applied to efficiently navigate the vast chemical space for the design of molecules with desired geometric properties and pharmacophore features. As a proof of concept, the RELATION model was used to design inhibitors for two targets, AKT1 and CDK2. The calculation results demonstrated that the RELATION model could efficiently generate novel molecules with favorable binding affinity and pharmacophore features.
Affiliation(s)
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Chang-Yu Hsieh
- Tencent, Tencent Quantum Lab, Shenzhen 518057, Guangdong, P. R. China
- Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Dong Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau Institute for Applied Research in Medicine and Health, State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa 999078, Macau, P. R. China
- Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, P. R. China
- Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, P. R. China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
|
20
|
Meli R, Morris GM, Biggin PC. Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review. Frontiers in Bioinformatics 2022; 2:885983. [PMID: 36187180 PMCID: PMC7613667 DOI: 10.3389/fbinf.2022.885983] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 02/28/2022] [Accepted: 05/11/2022] [Indexed: 01/01/2023]
Abstract
The rapid and accurate in silico prediction of protein-ligand binding free energies or binding affinities has the potential to transform drug discovery. In recent years, there has been a rapid growth of interest in deep learning methods for the prediction of protein-ligand binding affinities based on the structural information of protein-ligand complexes. These structure-based scoring functions often obtain better results than classical scoring functions when applied within their applicability domain. Here we review structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.
Affiliation(s)
- Rocco Meli
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Garrett M. Morris
- Department of Statistics, University of Oxford, Oxford, United Kingdom
- Philip C. Biggin
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
|
21
|
Chopra H, Bibi S, Kumar S, Khan MS, Kumar P, Singh I. Preparation and Evaluation of Chitosan/PVA Based Hydrogel Films Loaded with Honey for Wound Healing Application. Gels 2022; 8:gels8020111. [PMID: 35200493 PMCID: PMC8871709 DOI: 10.3390/gels8020111] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 12/15/2021] [Revised: 02/04/2022] [Accepted: 02/07/2022] [Indexed: 11/16/2022]
Abstract
In the present study, chitosan/polyvinyl alcohol (PVA)-based honey hydrogel films were developed for potential wound healing application. The hydrogel films were developed by a solvent-casting method and were evaluated in terms of thickness, weight variation, folding endurance, moisture content and moisture uptake. The water vapor transmission rate was found to range between 1650.50 ± 35.86 and 2698.65 ± 76.29 g/m²/day. The tensile strength and elongation at break were found to range between 4.74 ± 0.83 and 38.36 ± 5.39 N, and 30.58 ± 3.64 and 33.51 ± 2.47 mm, respectively, indicating significant mechanical properties of the films. SEM images indicated smooth surface morphology of the films. FTIR, DSC and in silico analysis were performed, which highlighted the docking energies of the protein–ligand complex and binding interactions such as hydrogen bonding, Pi–Pi bonding, and Pi–H bonding between the selected compounds and target proteins; hence, we concluded, with the three best molecules (lumichrome, galagin and chitosan), that there was wound healing potential. In vitro studies pointed toward a sustained release of honey from the films. The antimicrobial performance of the films was investigated against Staphylococcus aureus. Overall, the results signaled the potential application of chitosan/PVA based hydrogel films as wound dressings. Furthermore, in vivo experiments may be required to evaluate the clinical efficacy of honey-loaded chitosan/PVA hydrogel films in wound healing.
Affiliation(s)
- Hitesh Chopra
- Chitkara College of Pharmacy, Chitkara University, Rajpura 140401, Punjab, India
- Shabana Bibi
- Yunnan Herbal Laboratory, College of Ecology and Environmental Sciences, Yunnan University, Kunming 650091, China
- The International Joint Research Center for Sustainable Utilization of Cordyceps Bioresources in China and Southeast Asia, Yunnan University, Kunming 650091, China
- Sandeep Kumar
- College of Pharmacy, Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College, Ropar 140111, Punjab, India
- Muhammad Saad Khan
- Department of Biosciences, Faculty of Sciences, COMSATS University Islamabad, Sahiwal 57000, Pakistan
- Pradeep Kumar
- Department of Pharmacy and Pharmacology, School of Therapeutic Sciences, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg 2193, South Africa
- Correspondence: (P.K.); (I.S.)
- Inderbir Singh
- Chitkara College of Pharmacy, Chitkara University, Rajpura 140401, Punjab, India
- Correspondence: (P.K.); (I.S.)
|
22
|
Can docking scoring functions guarantee success in virtual screening? Virtual Screening and Drug Docking 2022. [DOI: 10.1016/bs.armc.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|