1
Ishitani R, Takemoto M, Tomii K. Protein ligand binding site prediction using graph transformer neural network. PLoS One 2024; 19:e0308425. [PMID: 39106255 DOI: 10.1371/journal.pone.0308425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 07/23/2024] [Indexed: 08/09/2024]
Abstract
Ligand binding site prediction is a crucial initial step in structure-based drug discovery. Although several methods have been proposed previously, including those using geometry-based and machine learning techniques, their accuracy is still considered insufficient. In this study, we introduce an approach that leverages a graph transformer neural network to rank the results of a geometry-based pocket detection method. We also created a training dataset larger than the conventionally used sc-PDB and investigated the correlation between dataset size and prediction performance. Our findings indicate that utilizing a graph transformer-based method alongside a larger training dataset could enhance the performance of ligand binding site prediction.
Affiliation(s)
- Ryuichiro Ishitani
  - Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Preferred Networks, Inc., Chiyoda-ku, Tokyo, Japan
- Mizuki Takemoto
  - Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo, Japan
2
Popov P, Kalinin R, Buslaev P, Kozlovskii I, Zaretckii M, Karlov D, Gabibov A, Stepanov A. Unraveling viral drug targets: a deep learning-based approach for the identification of potential binding sites. Brief Bioinform 2023; 25:bbad459. [PMID: 38113077 PMCID: PMC10783863 DOI: 10.1093/bib/bbad459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/10/2023] [Accepted: 11/22/2023] [Indexed: 12/21/2023]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic has spurred a wide range of approaches to control and combat the disease. However, selecting an effective antiviral drug target remains a time-consuming challenge. Computational methods offer a promising solution by efficiently reducing the number of candidates. In this study, we propose a structure- and deep learning-based approach that identifies vulnerable regions in viral proteins corresponding to drug binding sites. Our approach takes into account the protein dynamics, accessibility and mutability of the binding site and the putative mechanism of action of the drug. We applied this technique to validate drug targeting toward severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike glycoprotein S. Our findings reveal a conformation- and oligomer-specific glycan-free binding site proximal to the receptor binding domain. This site comprises topologically important amino acid residues. Molecular dynamics simulations of Spike in complex with candidate drug molecules bound to the potential binding sites indicate an equilibrium shifted toward the inactive conformation compared with drug-free simulations. Small molecules targeting this binding site have the potential to prevent the closed-to-open conformational transition of Spike, thereby allosterically inhibiting its interaction with human angiotensin-converting enzyme 2 receptor. Using a pseudotyped virus-based assay with a SARS-CoV-2 neutralizing antibody, we identified a set of hit compounds that exhibited inhibition at micromolar concentrations.
Affiliation(s)
- Petr Popov
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Roman Kalinin
  - M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, 117997, Russia
- Pavel Buslaev
  - Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014, Jyväskylä, Finland
- Igor Kozlovskii
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Mark Zaretckii
  - Tetra-d, Rheinweg 9, Schaffhausen, 8200, Switzerland
  - School of Science, Constructor University Bremen gGmbH, 28759, Bremen, Germany
- Dmitry Karlov
  - School of Pharmacy, Medical Biology Centre, Queen’s University Belfast, Street, Belfast, BT9 7BL Northern Ireland, U.K
- Alexander Gabibov
  - M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, 117997, Russia
- Alexey Stepanov
  - Department of Chemistry, The Scripps Research Institute, 10550 North Torrey Pines Road MB-10, La Jolla, 92037, CA, USA
3
Yu Y, Rué Casamajo A, Finnigan W, Schnepel C, Barker R, Morrill C, Heath RS, De Maria L, Turner NJ, Scrutton NS. Structure-Based Design of Small Imine Reductase Panels for Target Substrates. ACS Catal 2023; 13:12310-12321. [PMID: 37736118 PMCID: PMC10510103 DOI: 10.1021/acscatal.3c02278] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/19/2023] [Revised: 08/20/2023] [Indexed: 09/23/2023]
Abstract
Biocatalysis is important in the discovery, development, and manufacture of pharmaceuticals. However, the identification of enzymes for target transformations of interest requires major screening efforts. Here, we report a structure-based computational workflow to prioritize protein sequences by a score based on predicted activities on substrates, thereby reducing a resource-intensive laboratory-based biocatalyst screening. We selected imine reductases (IREDs) as a class of biocatalysts to illustrate the application of the computational workflow termed IREDFisher. Validation by using published data showed that IREDFisher can retrieve the best enzymes and increase the hit rate by identifying the top 20 ranked sequences. The power of IREDFisher is confirmed by computationally screening 1400 sequences for chosen reductive amination reactions with different levels of complexity. Highly active IREDs were identified by only testing 20 samples in vitro. Our speed test shows that it only takes 90 min to rank 85 sequences from user input and 30 min for the established IREDFisher database containing 591 IRED sequences. IREDFisher is available as a user-friendly web interface (https://enzymeevolver.com/IREDFisher). IREDFisher enables the rapid discovery of IREDs for applications in synthesis and directed evolution studies, with minimal time and resource expenditure. Future use of the workflow with other enzyme families could be implemented following the modification of the workflow scoring function.
Affiliation(s)
- Yuqi Yu
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
  - Augmented Biologics Discovery & Design, Department of Biologics Engineering, BioPharmaceuticals R&D, AstraZeneca, Cambridge CB21 6GH, U.K.
- Arnau Rué Casamajo
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- William Finnigan
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Christian Schnepel
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Rhys Barker
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Charlotte Morrill
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Rachel S. Heath
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Leonardo De Maria
  - Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (RI), BioPharmaceuticals R&D, AstraZeneca, Gothenburg 43150, Sweden
- Nicholas J. Turner
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
- Nigel S. Scrutton
  - Department of Chemistry, The University of Manchester, Manchester Institute of Biotechnology, 131 Princess Street, Manchester M1 7DN, U.K.
4
Liao J, Wang Q, Wu F, Huang Z. In Silico Methods for Identification of Potential Active Sites of Therapeutic Targets. Molecules 2022; 27:7103. [PMID: 36296697 PMCID: PMC9609013 DOI: 10.3390/molecules27207103] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/30/2022] [Revised: 08/12/2022] [Accepted: 08/25/2022] [Indexed: 07/30/2023]
Abstract
Target identification is an important step in drug discovery, and computer-aided drug target identification methods are attracting more attention than traditional methods, which are time-consuming and costly. Computer-aided methods can greatly reduce the search scope of experimental targets and the associated costs by identifying disease-related targets and their binding sites and by evaluating the druggability of the predicted active sites for clinical trials. In this review, we introduce the principles of computer-based active site identification methods, including the identification of binding sites and assessment of druggability. We provide some guidelines for selecting methods for the identification of binding sites and assessment of druggability. In addition, we list the databases and tools commonly used with these methods, present examples of individual and combined applications, and compare the methods and tools. Finally, we discuss the challenges and limitations of binding site identification and druggability assessment at the current stage and provide some recommendations and future perspectives.
Affiliation(s)
- Jianbo Liao
  - Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
  - The Second School of Clinical Medicine, Guangdong Medical University, Dongguan 523808, China
- Qinyu Wang
  - Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Fengxu Wu
  - Hubei Key Laboratory of Wudang Local Chinese Medicine Research, School of Pharmaceutical Sciences, Hubei University of Medicine, Shiyan 442000, China
- Zunnan Huang
  - Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Key Laboratory of Computer-Aided Drug Design of Dongguan City, Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
  - Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
5
Brackenridge DA, McGuffin LJ. Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods with a Focus on FunFOLD3. Methods Mol Biol 2021; 2365:43-58. [PMID: 34432238 DOI: 10.1007/978-1-0716-1665-9_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time-consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will describe the step-by-step instructions using the FunFOLD3 downloadable application along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.
6
Macari G, Toti D, Pasquadibisceglie A, Polticelli F. DockingApp RF: A State-of-the-Art Novel Scoring Function for Molecular Docking in a User-Friendly Interface to AutoDock Vina. Int J Mol Sci 2020; 21:ijms21249548. [PMID: 33333976 PMCID: PMC7765429 DOI: 10.3390/ijms21249548] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/26/2020] [Revised: 12/11/2020] [Accepted: 12/11/2020] [Indexed: 11/28/2022]
Abstract
Motivation: Bringing a new drug to the market is expensive and time-consuming. To cut the costs and time, computer-aided drug design (CADD) approaches have been increasingly included in the drug discovery pipeline. However, although traditional docking tools sample conformational space well, they are still unable to produce accurate binding affinity predictions. This work presents a novel scoring function for molecular docking seamlessly integrated into DockingApp, a user-friendly graphical interface for AutoDock Vina. The proposed function is based on a random forest model and a selection of specific features to overcome the existing limits of Vina’s original scoring mechanism. A novel version of DockingApp, named DockingApp RF, has been developed to host the proposed scoring function and to automate the rescoring of AutoDock Vina’s output, even for nonexpert users. Results: By coupling intermolecular interaction, solvent accessible surface area features and Vina’s energy terms, DockingApp RF’s new scoring function is able to improve the binding affinity prediction of AutoDock Vina. Furthermore, comparison tests carried out on the CASF-2013 and CASF-2016 datasets demonstrate that DockingApp RF’s performance is comparable to other state-of-the-art machine-learning- and deep-learning-based scoring functions. The new scoring function thus represents a significant advancement in terms of the reliability and effectiveness of docking compared to AutoDock Vina’s scoring function. At the same time, the characteristics that made DockingApp appealing to a wide range of users are retained in this new version and have been complemented with additional features.
Affiliation(s)
- Gabriele Macari
  - Department of Sciences, Roma Tre University, 00146 Rome, Italy
- Daniele Toti
  - Faculty of Mathematical, Physical and Natural Sciences, Catholic University of the Sacred Heart, 25121 Brescia, Italy
- Fabio Polticelli
  - Department of Sciences, Roma Tre University, 00146 Rome, Italy
  - National Institute of Nuclear Physics, Roma Tre Section, 00146 Rome, Italy
7
Computational methods and tools for binding site recognition between proteins and small molecules: from classical geometrical approaches to modern machine learning strategies. J Comput Aided Mol Des 2019; 33:887-903. [PMID: 31628659 DOI: 10.1007/s10822-019-00235-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/01/2019] [Accepted: 10/11/2019] [Indexed: 10/25/2022]
Abstract
In the current "genomic era" the number of identified genes is growing exponentially. However, the biological function of a large number of the corresponding proteins is still unknown. Recognition of small molecule ligands (e.g., substrates, inhibitors, allosteric regulators, etc.) is pivotal for protein functions in the vast majority of the cases and knowledge of the region where these processes take place is essential for protein function prediction and drug design. In this regard, computational methods represent essential tools to tackle this problem. A significant number of software tools have been developed in the last few years which exploit either protein sequence information, structure information or both. This review describes the most recent developments in protein function recognition and binding site prediction, in terms of both freely-available and commercial solutions and tools, detailing the main characteristics of the considered tools and providing a comparative analysis of their performance.
8
Fragment-Based Ligand-Protein Contact Statistics: Application to Docking Simulations. Int J Mol Sci 2019; 20:ijms20102499. [PMID: 31117183 PMCID: PMC6567162 DOI: 10.3390/ijms20102499] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/02/2019] [Revised: 05/16/2019] [Accepted: 05/17/2019] [Indexed: 01/26/2023]
Abstract
In this work, the information contained in the contacts between fragments of small-molecule ligands and protein residues has been collected and its exploitability has been verified by using the scoring of docking simulations as a test case for bringing about a proof of concept. Contact statistics between small-molecule fragments and binding site residues were collected and analyzed using a dataset composed of 200,000+ binding sites and associated ligands, derived from the database of the LIBRA ligand binding site recognition software, as a starting point. The fragments were generated by applying the decomposition algorithm implemented in BRICS. A simple "potential" based on the contact frequencies was tested against the CASF-2013 benchmark; its performance was then evaluated through the rescoring of docking poses generated for the DUD-E dataset. The results obtained indicate that this approach, its simplicity notwithstanding, yields promising results that are comparable, and in some cases, superior, to those obtained with other, more complex scoring functions.
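As a rough illustration of what a contact-frequency "potential" can look like, the sketch below turns counts of (fragment type, residue type) contacts into log-odds scores and sums them over a docking pose. The log-odds form, the pseudocount, and the names `contact_potential` and `score_pose` are assumptions made for this example; the abstract does not specify the formula used in the paper.

```python
import math
from collections import Counter

def contact_potential(contacts, pseudocount=1.0):
    """Build a log-odds score per (fragment, residue) pair from observed contacts.

    `contacts` is a list of (fragment_type, residue_type) tuples.
    Negative values mean the pair is seen more often than expected by chance.
    This functional form is an assumption, not the published potential.
    """
    pair_counts = Counter(contacts)
    frag_counts = Counter(f for f, _ in contacts)
    res_counts = Counter(r for _, r in contacts)
    total = len(contacts)
    n_pairs = len(frag_counts) * len(res_counts)

    potential = {}
    for f in frag_counts:
        for r in res_counts:
            # Smoothed observed frequency of the pair.
            observed = (pair_counts[(f, r)] + pseudocount) / (total + pseudocount * n_pairs)
            # Frequency expected if fragment and residue types were independent.
            expected = (frag_counts[f] / total) * (res_counts[r] / total)
            potential[(f, r)] = -math.log(observed / expected)
    return potential

def score_pose(pose_contacts, potential):
    """Sum pair scores over a pose; unseen pairs contribute zero."""
    return sum(potential.get(pair, 0.0) for pair in pose_contacts)

# Toy data (hypothetical): aromatic fragments favour PHE, amines favour ASP.
toy_contacts = (
    [("aromatic", "PHE")] * 8 + [("aromatic", "ASP")] * 2
    + [("amine", "ASP")] * 8 + [("amine", "PHE")] * 2
)
toy_potential = contact_potential(toy_contacts)
good_pose = score_pose([("aromatic", "PHE"), ("amine", "ASP")], toy_potential)
bad_pose = score_pose([("aromatic", "ASP"), ("amine", "PHE")], toy_potential)
```

With this sign convention a lower score indicates a pose whose contacts are statistically enriched, so rescoring amounts to sorting poses by ascending score.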
9
Toti D, Viet Hung L, Tortosa V, Brandi V, Polticelli F. LIBRA-WA: a web application for ligand binding site detection and protein function recognition. Bioinformatics 2018; 34:878-880. [PMID: 29126218 PMCID: PMC6192203 DOI: 10.1093/bioinformatics/btx715] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/24/2017] [Accepted: 11/04/2017] [Indexed: 02/04/2023]
Abstract
Summary: Recently, LIBRA, a tool for active/ligand binding site prediction, was described. LIBRA's effectiveness was comparable to similar state-of-the-art tools; however, its scoring scheme, output presentation, dependence on local resources and overall convenience were amenable to improvements. To solve these issues, LIBRA-WA, a web application based on an improved LIBRA engine, has been developed, featuring a novel scoring scheme consistently improving LIBRA's performance, and a refined algorithm that can identify binding sites hosted at the interface between different subunits. LIBRA-WA also sports additional functionalities like ligand clustering and a completely redesigned interface for an easier analysis of the output. Extensive tests on 373 apoprotein structures indicate that LIBRA-WA is able to identify the biologically relevant ligand/ligand binding site in 357 cases (∼96%), with the correct prediction ranking first in 349 cases (∼98% of the latter, ∼94% of the total). The earlier stand-alone tool has also been updated and dubbed LIBRA+, by integrating LIBRA-WA's improved engine for cross-compatibility purposes.
Availability and implementation: LIBRA-WA and LIBRA+ are available at: http://www.computationalbiology.it/software.html.
Contact: polticel@uniroma3.it.
Supplementary information: Supplementary data are available at Bioinformatics online.
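The percentages quoted in this abstract can be reproduced from the three reported counts. The short check below uses only the numbers stated above (373 structures, 357 identified, 349 ranked first); the variable names are chosen for this example.

```python
# Counts taken directly from the abstract.
total_structures = 373
identified = 357
ranked_first = 349

def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

identified_rate = percent(identified, total_structures)      # share of all structures
first_of_identified = percent(ranked_first, identified)      # share of identified cases
first_of_total = percent(ranked_first, total_structures)     # share of all structures

print(f"{identified_rate:.1f}% identified")
print(f"{first_of_identified:.1f}% of identified cases ranked first")
print(f"{first_of_total:.1f}% of all structures ranked first")
```

Rounded to whole numbers these give the approximate 96%, 98% and 94% reported in the text.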
Affiliation(s)
- Daniele Toti
  - Department of Sciences, University of Roma Tre, 00146 Rome, Italy
- Le Viet Hung
  - Department of Science and Technology, Nguyen Tat Thanh University, Ho chi Minh City, Vietnam
- Valentina Brandi
  - Department of Sciences, University of Roma Tre, 00146 Rome, Italy
- Fabio Polticelli
  - Department of Sciences, University of Roma Tre, 00146 Rome, Italy
  - National Institute of Nuclear Physics, Roma Tre Section, 00146 Rome, Italy
10
Krivák R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J Cheminform 2018; 10:39. [PMID: 30109435 PMCID: PMC6091426 DOI: 10.1186/s13321-018-0285-8] [Citation(s) in RCA: 181] [Impact Index Per Article: 30.2] [Received: 11/12/2017] [Accepted: 06/29/2018] [Indexed: 01/29/2023]
Abstract
Background: Ligand binding site prediction from protein structure has many applications related to elucidation of protein function and structure-based drug discovery. It often represents only one step of many in complex computational drug design efforts. Although many methods have been published to date, only a few of them are suitable for use in automated pipelines or for processing large datasets. These use cases require stability and speed, which disqualifies many of the recently introduced tools that are either template-based or available only as web servers.
Results: We present P2Rank, a stand-alone template-free tool for prediction of ligand binding sites based on machine learning. It is based on prediction of ligandability of local chemical neighbourhoods that are centered on points placed on the solvent accessible surface of a protein. We show that P2Rank outperforms several existing tools, which include two widely used stand-alone tools (Fpocket, SiteHound), a comprehensive consensus-based tool (MetaPocket 2.0), and a recent deep learning-based method (DeepSite). P2Rank is among the fastest available tools (it requires under 1 s for prediction on one protein), with the additional advantage of a multi-threaded implementation.
Conclusions: P2Rank is a new open source software package for ligand binding site prediction from protein structure. It is available as a user-friendly stand-alone command line program and a Java library. P2Rank has a lightweight installation and does not depend on other bioinformatics tools or large structural or sequence databases. Thanks to its speed and ability to make fully automated predictions, it is particularly well suited for processing large datasets or as a component of scalable structural bioinformatics pipelines.
Electronic supplementary material: The online version of this article (10.1186/s13321-018-0285-8) contains supplementary material, which is available to authorized users.
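To make the idea of scoring surface points and turning them into ranked pockets more concrete, here is a minimal sketch. It assumes per-point ligandability scores are already available (for example from some classifier), keeps points above a cutoff, groups them by single-linkage distance, and ranks the groups by the sum of squared scores. The cutoff, the linking distance, the ranking formula and the function name `rank_pockets` are illustrative assumptions, not details taken from the abstract or from the P2Rank code.

```python
import math

def rank_pockets(points, scores, score_cutoff=0.5, link_dist=3.0):
    """Cluster high-scoring surface points and rank the clusters.

    points: list of (x, y, z) coordinates on the protein surface.
    scores: predicted ligandability per point, in [0, 1].
    Returns clusters as sorted lists of point indices, best pocket first.
    All thresholds here are hypothetical defaults for illustration.
    """
    unvisited = {i for i, s in enumerate(scores) if s >= score_cutoff}
    pockets = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        # Grow the cluster by repeatedly absorbing nearby unvisited points.
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= link_dist]
            for j in near:
                unvisited.remove(j)
                cluster.append(j)
                frontier.append(j)
        pockets.append(sorted(cluster))
    # Rank pockets by the sum of squared point scores (assumed formula).
    pockets.sort(key=lambda c: sum(scores[i] ** 2 for i in c), reverse=True)
    return pockets

# Toy input (hypothetical): two groups of points plus one low-scoring outlier.
toy_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (20, 0, 0), (21, 0, 0), (40, 0, 0)]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.6, 0.1]
ranked = rank_pockets(toy_points, toy_scores)
```

In this toy case the three-point group outranks the two-point group, and the low-scoring outlier is discarded before clustering.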
Affiliation(s)
- Radoslav Krivák
  - Department of Software Engineering, Charles University, Prague, Czech Republic.
- David Hoksza
  - Department of Software Engineering, Charles University, Prague, Czech Republic.