1. Wei H, Wang W, Peng Z, Yang J. Q-BioLiP: A Comprehensive Resource for Quaternary Structure-based Protein-ligand Interactions. Genomics, Proteomics & Bioinformatics 2024; 22:qzae001. [PMID: 38862427 DOI: 10.1093/gpbjnl/qzae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 11/12/2023] [Accepted: 12/03/2023] [Indexed: 06/13/2024]
Abstract
Since its establishment in 2013, BioLiP has become one of the widely used resources for protein-ligand interactions. Nevertheless, several known issues occurred with it over the past decade. For example, the protein-ligand interactions are represented in the form of single chain-based tertiary structures, which may be inappropriate as many interactions involve multiple protein chains (known as quaternary structures). We sought to address these issues, resulting in Q-BioLiP, a comprehensive resource for quaternary structure-based protein-ligand interactions. The major features of Q-BioLiP include: (1) representing protein structures in the form of quaternary structures rather than single chain-based tertiary structures; (2) pairing DNA/RNA chains properly rather than separation; (3) providing both experimental and predicted binding affinities; (4) retaining both biologically relevant and irrelevant interactions to alleviate the wrong justification of ligands' biological relevance; and (5) developing a new quaternary structure-based algorithm for the modelling of protein-ligand complex structure. With these new features, Q-BioLiP is expected to be a valuable resource for studying biomolecule interactions, including protein-small molecule interaction, protein-metal ion interaction, protein-peptide interaction, protein-protein interaction, protein-DNA/RNA interaction, and RNA-small molecule interaction. Q-BioLiP is freely available at https://yanglab.qd.sdu.edu.cn/Q-BioLiP/.
Affiliation(s)
- Hong Wei
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Wenkai Wang
- School of Mathematical Sciences, Nankai University, Tianjin 300071, China
- Zhenling Peng
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
2. Ahmed F, Brooks CL. FASTDock: A Pipeline for Allosteric Drug Discovery. J Chem Inf Model 2023; 63:7219-7227. [PMID: 37939386 PMCID: PMC10773972 DOI: 10.1021/acs.jcim.3c00895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
Allostery is involved in innumerable biological processes and plays a fundamental role in human disease. Thus, the exploration of allosteric modulation is crucial for research on biological mechanisms and in the development of novel therapeutics. The development of small-molecule allosteric effectors can be used as tools to probe biological mechanisms of interest. One of the main limitations in targeting allosteric sites is the difficulty in uncovering them for specific receptors. Furthermore, upon discovery of novel allosteric modulation, early lead generation is made more difficult as compared to that at orthosteric sites because there is likely no information about the types of molecules that can bind at the site. In the work described here, we present a novel drug discovery pipeline, FASTDock, which allows one to uncover ligandable sites as well as small molecules that target the given site without requiring pre-existing knowledge of ligands that can bind in the targeted site. By using a hierarchical screening strategy, this method has the potential to enable high-throughput screens of an exceptionally large database of targeted ligand space.
Affiliation(s)
- Furyal Ahmed
- Biophysics Program, University of Michigan, Ann Arbor, MI 48103
- Charles L. Brooks
- Department of Chemistry and Biophysics Program, University of Michigan, Ann Arbor, MI 48103
3. Gagliardi L, Rocchia W. SiteFerret: Beyond Simple Pocket Identification in Proteins. J Chem Theory Comput 2023; 19:5242-5259. [PMID: 37470784 PMCID: PMC10413863 DOI: 10.1021/acs.jctc.2c01306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Indexed: 07/21/2023]
Abstract
We present a novel method for the automatic detection of pockets on protein molecular surfaces. The algorithm is based on an ad hoc hierarchical clustering of virtual probe spheres obtained from the geometrical primitives used by the NanoShaper software to build the solvent-excluded molecular surface. The final ranking of putative pockets is based on the Isolation Forest method, an unsupervised learning approach originally developed for anomaly detection. A detailed importance analysis of pocket features provides insight into which geometrical (clustering) and chemical (amino acidic composition) properties characterize a good binding site. The method also provides a segmentation of pockets into smaller subpockets. We prove that subpockets are a convenient representation to pinpoint the binding site with great precision. SiteFerret is outstanding in its versatility, accurately predicting a wide range of binding sites, from those binding small molecules to those binding peptides, including difficult shallow sites.
Affiliation(s)
- Walter Rocchia
- CONCEPT Lab, Istituto Italiano di Tecnologia, Via Melen - 83, B Block, 16152 Genova, Italy
4. Kwon Y, Park S, Lee J, Kang J, Lee HJ, Kim W. BEAR: A Novel Virtual Screening Method Based on Large-Scale Bioactivity Data. J Chem Inf Model 2023; 63:1429-1437. [PMID: 36821004 DOI: 10.1021/acs.jcim.2c01300] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 02/24/2023]
Abstract
Data-driven drug discovery exploits a comprehensive set of big data to provide an efficient path for the development of new drugs. Currently, publicly available bioassay data sets provide extensive information regarding the bioactivity profiles of millions of compounds. Using these large-scale drug screening data sets, we developed a novel in silico method to virtually screen hit compounds against protein targets, named BEAR (Bioactive compound Enrichment by Assay Repositioning). The underlying idea of BEAR is to reuse bioassay data for predicting hit compounds for targets other than their originally intended purposes, i.e., "assay repositioning". The BEAR approach differs from conventional virtual screening methods in that (1) it relies solely on bioactivity data and requires no physicochemical features of either the target or ligand. (2) Accordingly, structurally diverse candidates are predicted, allowing for scaffold hopping. (3) BEAR shows stable performance across diverse target classes, suggesting its general applicability. Large-scale cross-validation of more than a thousand targets showed that BEAR accurately predicted known ligands (median area under the curve = 0.87), proving that BEAR maintained a robust performance even in the validation set with additional constraints. In addition, a comparative analysis demonstrated that BEAR outperformed other machine learning models, including a recent deep learning model for ABC transporter family targets. We predicted P-gp and BCRP dual inhibitors using the BEAR approach and validated the predicted candidates using in vitro assays. The intracellular accumulation effects of mitoxantrone, a well-known P-gp/BCRP dual substrate for cancer treatment, confirmed nine out of 72 dual inhibitor candidates preselected by primary cytotoxicity screening. 
Consequently, these nine hits are novel and potent dual inhibitors for both P-gp and BCRP, solely predicted by bioactivity profiles without relying on any structural information of targets or ligands.
Affiliation(s)
- Sera Park
- KaiPharm, Seoul 03760, Republic of Korea
- Jaeok Lee
- College of Pharmacy, Research Institute of Pharmaceutical Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jiyeon Kang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Republic of Korea
- Hwa Jeong Lee
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Republic of Korea
- Wankyu Kim
- KaiPharm, Seoul 03760, Republic of Korea; Department of Life Sciences, College of Natural Science, Ewha Womans University, Seoul 03760, Republic of Korea
5. Meller A, Ward M, Borowsky J, Kshirsagar M, Lotthammer JM, Oviedo F, Ferres JL, Bowman GR. Predicting locations of cryptic pockets from single protein structures using the PocketMiner graph neural network. Nat Commun 2023; 14:1177. [PMID: 36859488 PMCID: PMC9977097 DOI: 10.1038/s41467-023-36699-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/21/2022] [Accepted: 02/09/2023] [Indexed: 03/03/2023]
Abstract
Cryptic pockets expand the scope of drug discovery by enabling targeting of proteins currently considered undruggable because they lack pockets in their ground state structures. However, identifying cryptic pockets is labor-intensive and slow. The ability to accurately and rapidly predict if and where cryptic pockets are likely to form from a structure would greatly accelerate the search for druggable pockets. Here, we present PocketMiner, a graph neural network trained to predict where pockets are likely to open in molecular dynamics simulations. Applying PocketMiner to single structures from a newly curated dataset of 39 experimentally confirmed cryptic pockets demonstrates that it accurately identifies cryptic pockets (ROC-AUC: 0.87) >1,000-fold faster than existing methods. We apply PocketMiner across the human proteome and show that predicted pockets open in simulations, suggesting that over half of proteins thought to lack pockets based on available structures likely contain cryptic pockets, vastly expanding the potentially druggable proteome.
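The ROC-AUC figure quoted in this abstract can be read as the probability that a randomly chosen true pocket site receives a higher score than a randomly chosen non-pocket site. The sketch below is not taken from the paper: the function name `roc_auc` and the labels and scores are hypothetical, and it only illustrates how such a value is computed from binary labels and predicted scores.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = site belongs to a pocket, 0 = it does not.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1, 0.8, 0.6]
auc = roc_auc(labels, scores)
print(f"ROC-AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of sites; it is chosen here for readability, and a rank-based implementation would be preferable for large inputs.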
Affiliation(s)
- Artur Meller
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, 660 S. Euclid Ave., Box 8231, St. Louis, MO, 63110, USA
- Medical Scientist Training Program, Washington University in St. Louis, 660 S. Euclid Ave., St. Louis, MO, 63110, USA
- Michael Ward
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, 660 S. Euclid Ave., Box 8231, St. Louis, MO, 63110, USA
- Jonathan Borowsky
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, 660 S. Euclid Ave., Box 8231, St. Louis, MO, 63110, USA
- Jeffrey M Lotthammer
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, 660 S. Euclid Ave., Box 8231, St. Louis, MO, 63110, USA
- Felipe Oviedo
- AI for Good Research Lab, Microsoft, Redmond, WA, USA
- Gregory R Bowman
- Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, 660 S. Euclid Ave., Box 8231, St. Louis, MO, 63110, USA.
- Department of Biochemistry and Molecular Biophysics, University of Pennsylvania, 3620 Hamilton Walk, Philadelphia, PA, 19104, USA.
6. Sunsetting Binding MOAD with its last data update and the addition of 3D-ligand polypharmacology tools. Sci Rep 2023; 13:3008. [PMID: 36810894 PMCID: PMC9944886 DOI: 10.1038/s41598-023-29996-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 02/14/2023] [Indexed: 02/24/2023]
Abstract
Binding MOAD is a database of protein-ligand complexes and their affinities with many structured relationships across the dataset. The project has been in development for over 20 years, but now, the time has come to bring it to a close. Currently, the database contains 41,409 structures with affinity coverage for 15,223 (37%) complexes. The website BindingMOAD.org provides numerous tools for polypharmacology exploration. Current relationships include links for structures with sequence similarity, 2D ligand similarity, and binding-site similarity. In this last update, we have added 3D ligand similarity using ROCS to identify ligands which may not necessarily be similar in two dimensions but can occupy the same three-dimensional space. For the 20,387 different ligands present in the database, a total of 1,320,511 3D-shape matches between the ligands were added. Examples of the utility of 3D-shape matching in polypharmacology are presented. Finally, plans for future access to the project data are outlined.
7. Trawally M, Demir-Yazıcı K, İpek Dingis-Birgül S, Kaya K, Akdemir A, Güzel-Akdemir Ö. Dithiocarbamates and dithiocarbonates containing 6-nitrosaccharin scaffold: Synthesis, antimycobacterial activity and in silico target prediction using ensemble docking-based reverse virtual screening. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.134818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
8. Meli R, Morris GM, Biggin PC. Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review. Frontiers in Bioinformatics 2022; 2:885983. [PMID: 36187180 PMCID: PMC7613667 DOI: 10.3389/fbinf.2022.885983] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/28/2022] [Accepted: 05/11/2022] [Indexed: 01/01/2023]
Abstract
The rapid and accurate in silico prediction of protein-ligand binding free energies or binding affinities has the potential to transform drug discovery. In recent years, there has been a rapid growth of interest in deep learning methods for the prediction of protein-ligand binding affinities based on the structural information of protein-ligand complexes. These structure-based scoring functions often obtain better results than classical scoring functions when applied within their applicability domain. Here we review structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.
Affiliation(s)
- Rocco Meli
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Garrett M. Morris
- Department of Statistics, University of Oxford, Oxford, United Kingdom
- Philip C. Biggin
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
9. Ru X, Ye X, Sakurai T, Zou Q. NerLTR-DTA: drug-target binding affinity prediction based on neighbor relationship and learning to rank. Bioinformatics 2022; 38:1964-1971. [PMID: 35134828 DOI: 10.1093/bioinformatics/btac048] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 10/29/2021] [Revised: 12/20/2021] [Accepted: 01/28/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION Drug-target interaction prediction plays an important role in new drug discovery and drug repurposing. Binding affinity indicates the strength of drug-target interactions. Predicting drug-target binding affinity is expected to provide promising candidates for biologists, which can effectively reduce the workload of wet laboratory experiments and speed up the entire process of drug research. Given that, numerous new proteins are sequenced and compounds are synthesized, several improved computational methods have been proposed for such predictions, but there are still some challenges. (i) Many methods only discuss and implement one application scenario, they focus on drug repurposing and ignore the discovery of new drugs and targets. (ii) Many methods do not consider the priority order of proteins (or drugs) related to each target drug (or protein). Therefore, it is necessary to develop a comprehensive method that can be used in multiple scenarios and focuses on candidate order. RESULTS In this study, we propose a method called NerLTR-DTA that uses the neighbor relationship of similarity and sharing to extract features, and applies a ranking framework with regression attributes to predict affinity values and priority order of query drug (or query target) and its related proteins (or compounds). It is worth noting that using the characteristics of learning to rank to set different queries can smartly realize the multi-scenario application of the method, including the discovery of new drugs and new targets. Experimental results on two commonly used datasets show that NerLTR-DTA outperforms some state-of-the-art competing methods. NerLTR-DTA achieves excellent performance in all application scenarios mentioned in this study, and the rm(test)2 values guarantee such excellent performance is not obtained by chance. 
Moreover, it can be concluded that NerLTR-DTA can provide accurate ranking lists for the relevant results of most queries through the statistics of the association relationship of each query drug (or query protein). In general, NerLTR-DTA is a powerful tool for predicting drug-target associations and can contribute to new drug discovery and drug repurposing. AVAILABILITY AND IMPLEMENTATION The proposed method is implemented in Python and Java. Source codes and datasets are available at https://github.com/RUXIAOQING964914140/NerLTR-DTA.
Affiliation(s)
- Xiaoqing Ru
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
- Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
10. Morningstar-Kywi N, Wang K, Asbell TR, Wang Z, Giles JB, Lai J, Brill D, Sutch BT, Haworth IS. Prediction of Water Distributions and Displacement at Protein-Ligand Interfaces. J Chem Inf Model 2022; 62:1489-1497. [PMID: 35261241 DOI: 10.1021/acs.jcim.1c01266] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/03/2023]
Abstract
The retention and displacement of water molecules during formation of ligand-protein interfaces play a major role in determining ligand binding. Understanding these effects requires a method for positioning of water molecules in the bound and unbound proteins and for defining water displacement upon ligand binding. We describe an algorithm for water placement and a calculation of ligand-driven water displacement in >9000 protein-ligand complexes. The algorithm predicts approximately 38% of experimental water positions within 1.0 Å and about 83% within 1.5 Å. We further show that the predicted water molecules can complete water networks not detected in crystallographic structures of the protein-ligand complexes. The algorithm was also applied to solvation of the corresponding unbound proteins, and this allowed calculation of water displacement upon ligand binding based on differences in the water network between the bound and unbound structures. We illustrate use of this approach through comparison of water displacement by structurally related ligands at the same binding site. This method for evaluation of water displacement upon ligand binding may be of value for prediction of the effects of ligand modification in drug design.
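The recovery figures in this abstract (the share of experimental water positions matched within 1.0 Å or 1.5 Å) can be illustrated with a short distance check. This is a minimal sketch and not the authors' placement algorithm: the matching rule (an experimental water counts as recovered if any predicted water lies within the cutoff), the function name `fraction_recovered`, and the coordinates are all assumptions made for the example.

```python
import math


def fraction_recovered(experimental, predicted, cutoff):
    """Fraction of experimental water positions that have at least one
    predicted water within `cutoff` (same length unit as the coordinates)."""
    if not experimental:
        raise ValueError("need at least one experimental position")
    hits = sum(
        1
        for e in experimental
        if any(math.dist(e, p) <= cutoff for p in predicted)
    )
    return hits / len(experimental)


# Hypothetical coordinates in angstroms (x, y, z); not data from the paper.
experimental = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (10.0, 10.0, 10.0)]
predicted = [(0.5, 0.0, 0.0), (5.0, 1.2, 0.0), (0.0, 5.0, 3.0)]

for cutoff in (1.0, 1.5):
    frac = fraction_recovered(experimental, predicted, cutoff)
    print(f"within {cutoff} A: {frac:.2f}")
```

Loosening the cutoff can only keep or raise the recovered fraction, which matches the direction of the two figures reported in the abstract.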
Affiliation(s)
- Noam Morningstar-Kywi
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Kaichen Wang
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Thomas R Asbell
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Zhaohui Wang
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Jason B Giles
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Jiawei Lai
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Dab Brill
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Brian T Sutch
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
- Ian S Haworth
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, California 90089, United States
11. Dhakal A, McKay C, Tanner JJ, Cheng J. Artificial intelligence in the prediction of protein-ligand interactions: recent advances and future directions. Brief Bioinform 2022; 23:bbab476. [PMID: 34849575 PMCID: PMC8690157 DOI: 10.1093/bib/bbab476] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 08/05/2021] [Revised: 09/28/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022]
Abstract
New drug production, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has unveiled the urgent need for more powerful computational methods for drug discovery. Here, we review the computational approaches to predicting protein-ligand interactions in the context of drug discovery, focusing on methods using artificial intelligence (AI). We begin with a brief introduction to proteins (targets), ligands (e.g. drugs) and their interactions for nonexperts. Next, we review databases that are commonly used in the domain of protein-ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches implemented to predict protein-ligand binding sites, ligand-binding affinity and binding pose (conformation) including both classical ML algorithms and recent deep learning methods. After exploring the correlation between these three aspects of protein-ligand interaction, it has been proposed that they should be studied in unison. We anticipate that our review will aid exploration and development of more accurate ML-based prediction strategies for studying protein-ligand interactions.
Affiliation(s)
- Ashwin Dhakal
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Cole McKay
- Department of Biochemistry, University of Missouri, Columbia, MO, 65211, USA
- John J Tanner
- Department of Biochemistry, University of Missouri, Columbia, MO, 65211, USA
- Department of Chemistry, University of Missouri, Columbia, MO, 65211, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
12. Cetin-Atalay R, Kahraman DC, Nalbat E, Rifaioglu AS, Atakan A, Donmez A, Atas H, Atalay MV, Acar AC, Doğan T. Data Centric Molecular Analysis and Evaluation of Hepatocellular Carcinoma Therapeutics Using Machine Intelligence-Based Tools. J Gastrointest Cancer 2021; 52:1266-1276. [PMID: 34910274 DOI: 10.1007/s12029-021-00768-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/13/2021] [Indexed: 10/19/2022]
Abstract
PURPOSE Computational approaches have been used at different stages of drug development with the purpose of decreasing the time and cost of conventional experimental procedures. Lately, techniques mainly developed and applied in the field of artificial intelligence (AI), have been transferred to different application domains such as biomedicine. METHODS In this study, we conducted an investigative analysis via data-driven evaluation of potential hepatocellular carcinoma (HCC) therapeutics in the context of AI-assisted drug discovery/repurposing. First, we discussed basic concepts, computational approaches, databases, modeling approaches, and featurization techniques in drug discovery/repurposing. In the analysis part, we automatically integrated HCC-related biological entities such as genes/proteins, pathways, phenotypes, drugs/compounds, and other diseases with similar implications, and represented these heterogeneous relationships via a knowledge graph using the CROssBAR system. RESULTS Following the system-level evaluation and selection of critical genes/proteins and pathways to target, our deep learning-based drug/compound-target protein interaction predictors DEEPScreen and MDeePred have been employed for predicting new bioactive drugs and compounds for these critical targets. Finally, we embedded ligands of selected HCC-associated proteins which had a significant enrichment with the CROssBAR system into a 2-D space to identify and repurpose small molecule inhibitors as potential drug candidates based on their molecular similarities to known HCC drugs. CONCLUSIONS We expect that these series of data-driven analyses can be used as a roadmap to propose early-stage potential inhibitors (from database-scale sets of compounds) to both HCC and other complex diseases, which may subsequently be analyzed with more targeted in silico and experimental approaches.
Affiliation(s)
- Rengul Cetin-Atalay
- Section of Pulmonary and Critical Care Medicine, University of Chicago, Chicago, IL, 60637, USA.
- Deniz Cansen Kahraman
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey.
- Esra Nalbat
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey
- Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, Iskenderun Technical University, Iskenderun, Hatay, 31200, Turkey; Department of Computer Engineering, METU, Ankara, 06800, Turkey
- Ahmet Atakan
- Department of Computer Engineering, METU, Ankara, 06800, Turkey; Department of Computer Engineering, EBYU, Ankara, 24002, Turkey
- Ataberk Donmez
- Department of Computer Engineering, METU, Ankara, 06800, Turkey; Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
- Heval Atas
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey
- M Volkan Atalay
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey; Department of Computer Engineering, METU, Ankara, 06800, Turkey
- Aybar C Acar
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey
- Tunca Doğan
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara, 06800, Turkey; Department of Computer Engineering, Hacettepe University, Ankara, 06800, Turkey
13. Vijayan RSK, Kihlberg J, Cross JB, Poongavanam V. Enhancing preclinical drug discovery with artificial intelligence. Drug Discov Today 2021; 27:967-984. [PMID: 34838731 DOI: 10.1016/j.drudis.2021.11.023] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 07/14/2021] [Revised: 10/15/2021] [Accepted: 11/19/2021] [Indexed: 12/14/2022]
Abstract
Artificial intelligence (AI) is becoming an integral part of drug discovery. It has the potential to deliver across the drug discovery and development value chain, starting from target identification and reaching through clinical development. In this review, we provide an overview of current AI technologies and a glimpse of how AI is reimagining preclinical drug discovery by highlighting examples where AI has made a real impact. Considering the excitement and hyperbole surrounding AI in drug discovery, we aim to present a realistic view by discussing both opportunities and challenges in adopting AI in drug discovery.
Affiliation(s)
- R S K Vijayan
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA
- Jan Kihlberg
- Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden
- Jason B Cross
- Institute for Applied Cancer Science, MD Anderson Cancer Center, Houston, TX, USA.
14. Tanramluk D, Pakotiprapha D, Phoochaijaroen S, Chantravisut P, Thampradid S, Vanichtanankul J, Narupiyakul L, Akavipat R, Yuvaniyama J. MANORAA: A machine learning platform to guide protein-ligand design by anchors and influential distances. Structure 2021; 30:181-189.e5. [PMID: 34614393 DOI: 10.1016/j.str.2021.09.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2021] [Revised: 06/25/2021] [Accepted: 09/08/2021] [Indexed: 10/20/2022]
Abstract
The MANORAA platform uses structure-based approaches to provide information on drug design originally derived from mapping tens of thousands of amino acids on a grid. In-depth analyses of the pockets, frequently occurring atoms, influential distances, and active-site boundaries are used for the analysis of active sites. The algorithms derived provide model equations that can predict whether changes in distances, such as contraction or expansion, will result in improved binding affinity. The algorithm is confirmed using kinetic studies of dihydrofolate reductase (DHFR), together with two DHFR-TS crystal structures. Empirical analyses of 881 crystal structures involving 180 ligands are used to interpret protein-ligand binding affinities. MANORAA links to major biological databases for web-based analysis of drug design. The frequency of atoms inside the main protease structures, including those from SARS-CoV-2, shows how the rigid part of the ligand can be used as a probe for molecular design (http://manoraa.org).
Affiliation(s)
- Duangrudee Tanramluk
- Institute of Molecular Biosciences, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand.
- Danaya Pakotiprapha
- Department of Biochemistry and Center for Excellence in Protein and Enzyme Technology, Faculty of Science, Mahidol University, Ratchathewi, Bangkok 10400, Thailand
- Sakao Phoochaijaroen
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
- Pattra Chantravisut
- Institute of Molecular Biosciences, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
- Sirikanya Thampradid
- Department of Biochemistry and Center for Excellence in Protein and Enzyme Technology, Faculty of Science, Mahidol University, Ratchathewi, Bangkok 10400, Thailand
- Jarunee Vanichtanankul
- National Center for Genetic Engineering and Biotechnology (BIOTEC), 113 Thailand Science Park, Khlong Nueng, Khlong Luang, Pathum Thani 12120, Thailand
- Lalita Narupiyakul
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Department of Computer Engineering, Faculty of Engineering, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
- Ruj Akavipat
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
- Jirundon Yuvaniyama
- Department of Biochemistry and Center for Excellence in Protein and Enzyme Technology, Faculty of Science, Mahidol University, Ratchathewi, Bangkok 10400, Thailand

15
Gusmão AS, Abreu LS, Tavares JF, de Freitas HF, Silva da Rocha Pita S, Dos Santos EG, Caldas IS, Vieira AA, Silva EO. Computer-Guided Trypanocidal Activity of Natural Lactones Produced by Endophytic Fungus of Euphorbia umbellata. Chem Biodivers 2021; 18:e2100493. [PMID: 34403573 DOI: 10.1002/cbdv.202100493] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Accepted: 08/17/2021] [Indexed: 11/11/2022]
Abstract
Hundreds of millions of people worldwide are affected by Chagas' disease caused by Trypanosoma cruzi. Since the current treatment lacks efficacy and specificity and causes several side effects, novel therapeutics are needed. Natural products from endophytic fungi have been useful sources of lead compounds. In this study, three lactones isolated from an endophytic strain culture were evaluated in silico to guide their bioassay screening. All lactones displayed in vitro activity against T. cruzi epimastigote and trypomastigote forms. Notably, the IC50 values of (+)-phomolactone were lower than those of benznidazole (0.86 vs. 30.78 μM against epimastigotes and 0.41 vs. 4.88 μM against trypomastigotes). Target-based studies suggested that the lactones exert their trypanocidal activity through inhibition of T. cruzi glyceraldehyde-3-phosphate dehydrogenase (TcGAPDH), and the binding free energies of the three TcGAPDH-lactone complexes suggested that (+)-phomolactone has a lower score (-3.38), consistent with the IC50 assays. These results highlight the potential of these lactones for further anti-T. cruzi drug development.
Affiliation(s)
- Amanda Santos Gusmão
- Organic Chemistry Department, Chemistry Institute, Federal University of Bahia, Barão de Jeremoabo 147, Salvador, 40170115, Bahia, Brazil
- Lucas Silva Abreu
- Institute for Research in Pharmaceuticals and Medications, Federal University of Paraíba, Campus I, João Pessoa, 58051900, Paraíba, Brazil
- Josean Fechine Tavares
- Institute for Research in Pharmaceuticals and Medications, Federal University of Paraíba, Campus I, João Pessoa, 58051900, Paraíba, Brazil
- Humberto Fonseca de Freitas
- Laboratory of Bioinformatics and Molecular Modeling (LaBiMM), Pharmacy College, Federal University of Bahia, Barão de Jeremoabo 147, Salvador, 40170115, Bahia, Brazil
- Samuel Silva da Rocha Pita
- Laboratory of Bioinformatics and Molecular Modeling (LaBiMM), Pharmacy College, Federal University of Bahia, Barão de Jeremoabo 147, Salvador, 40170115, Bahia, Brazil
- Elda Gonçalves Dos Santos
- Pathology and Parasitology Department, Institute of Biomedical Sciences, Federal University of Alfenas, Gabriel Monteiro da Silva 500, Alfenas, 37130001, Minas Gerais, Brazil
- Ivo Santana Caldas
- Pathology and Parasitology Department, Institute of Biomedical Sciences, Federal University of Alfenas, Gabriel Monteiro da Silva 500, Alfenas, 37130001, Minas Gerais, Brazil
- André Alexandre Vieira
- Organic Chemistry Department, Chemistry Institute, Federal University of Bahia, Barão de Jeremoabo 147, Salvador, 40170115, Bahia, Brazil
- Eliane Oliveira Silva
- Organic Chemistry Department, Chemistry Institute, Federal University of Bahia, Barão de Jeremoabo 147, Salvador, 40170115, Bahia, Brazil

16
Veit-Acosta M, de Azevedo Junior WF. Computational Prediction of Binding Affinity for CDK2-ligand Complexes. A Protein Target for Cancer Drug Discovery. Curr Med Chem 2021; 29:2438-2455. [PMID: 34365938 DOI: 10.2174/0929867328666210806105810] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/25/2021] [Revised: 06/15/2021] [Accepted: 06/22/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: CDK2 participates in the control of eukaryotic cell-cycle progression. Due to the great interest in CDK2 for drug development and the relative ease of crystallizing this enzyme, there are over 400 structural studies focused on this protein target. These structural data are the basis for the development of computational models to estimate CDK2-ligand binding affinity. OBJECTIVE: This work focuses on the recent developments in the application of supervised machine learning modeling to develop scoring functions to predict the binding affinity of CDK2. METHOD: We employed the structures available at the Protein Data Bank and the ligand information accessed from BindingDB, Binding MOAD, and PDBbind to evaluate the predictive performance of machine learning techniques combined with physical modeling used to calculate binding affinity. We compared this hybrid methodology with classical scoring functions available in docking programs. RESULTS: Our comparative analysis of previously published models indicated that a model created using a combination of a mass-spring system and cross-validated Elastic Net to predict the binding affinity of CDK2-inhibitor complexes outperformed classical scoring functions available in AutoDock4 and AutoDock Vina. CONCLUSION: All studies reviewed here suggest that targeted machine learning models are superior to classical scoring functions for calculating binding affinities. Specifically for CDK2, we see that the combination of physical modeling with supervised machine learning techniques exhibits improved predictive performance in calculating the protein-ligand binding affinity. These results find theoretical support in the application of the concept of scoring function space.
Affiliation(s)
- Martina Veit-Acosta
- Western Michigan University, 1903 W Michigan Ave, Kalamazoo, MI 49008, United States

17
Ahmed A, Mam B, Sowdhamini R. DEELIG: A Deep Learning Approach to Predict Protein-Ligand Binding Affinity. Bioinform Biol Insights 2021; 15:11779322211030364. [PMID: 34290496 PMCID: PMC8274096 DOI: 10.1177/11779322211030364] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/08/2020] [Accepted: 06/05/2021] [Indexed: 12/03/2022] Open
Abstract
Protein-ligand binding prediction has extensive biological significance. Binding affinity helps in understanding the degree of protein-ligand interactions and is a useful measure in drug design. Protein-ligand docking using virtual screening and molecular dynamics simulations are required to predict the binding affinity of a ligand to its cognate receptor. Performing such analyses to cover the entire chemical space of small molecules requires intense computational power. Recent developments using deep learning have enabled us to make sense of massive amounts of complex data sets where the ability of the model to “learn” intrinsic patterns in a complex plane of data is the strength of the approach. Here, we have incorporated convolutional neural networks to find spatial relationships among data to help us predict the binding affinity of proteins in whole superfamilies toward a diverse set of ligands without the need for a docked pose or complex as user input. The models were trained and validated using a stringent methodology for feature extraction. Our model performs better than some widely used existing methods and is suitable for predictions on high-resolution protein crystal structures (⩽2.5 Å) and nonpeptide ligands as individual inputs. Our approach to network construction and training on a protein-ligand data set prepared in-house has yielded significant insights. We have also tested DEELIG on a few COVID-19 main protease-inhibitor complexes relevant to the current public health scenario. DEELIG-based predictions can be incorporated in existing databases including RCSB PDB, PDBMoad, and PDBbind to fill in missing binding affinity data for protein-ligand complexes.
Affiliation(s)
- Asad Ahmed
- National Institute of Technology Warangal, Warangal, India
- Bhavika Mam
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- The University of Trans-Disciplinary Health Sciences and Technology (TDU), Bangalore, India
- Ramanathan Sowdhamini
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Ramanathan Sowdhamini, National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bangalore 560065, Karnataka, India.

18
Bitencourt-Ferreira G, Rizzotto C, de Azevedo Junior WF. Machine Learning-Based Scoring Functions, Development and Applications with SAnDReS. Curr Med Chem 2021; 28:1746-1756. [PMID: 32410551 DOI: 10.2174/0929867327666200515101820] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/17/2019] [Revised: 04/06/2020] [Accepted: 04/07/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. OBJECTIVE: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. METHODS: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding and thermodynamic data to create targeted scoring functions. RESULTS: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. CONCLUSION: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker and AutoDock Vina.
Affiliation(s)
- Camila Rizzotto
- Pontifical Catholic University of Rio Grande do Sul - PUCRS, Porto Alegre-RS, Brazil

19
Bitencourt-Ferreira G, Duarte da Silva A, Filgueira de Azevedo W. Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2. Curr Med Chem 2021; 28:253-265. [PMID: 31729287 DOI: 10.2174/2213275912666191102162959] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/17/2019] [Revised: 08/22/2019] [Accepted: 09/24/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. OBJECTIVE: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. METHODS: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. RESULTS: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. CONCLUSION: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.
Affiliation(s)
- Gabriela Bitencourt-Ferreira
- Laboratory of Computational Systems Biology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre/RS 90619-900, Brazil
- Amauri Duarte da Silva
- Specialization Program in Bioinformatics, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre/RS 90619-900, Brazil
- Walter Filgueira de Azevedo
- Laboratory of Computational Systems Biology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre/RS 90619-900, Brazil

20
Das T, Ranjan A, Sieroń L, Maniukiewicz W, Das S. Direct Synthesis, Characterization and Theoretical Studies of N‐(6‐Amino‐1,3‐dimethyl‐2,4‐dioxo‐1,2,3,4‐tetrahydropyrimidin‐5‐yl)benzamide Derivatives. ChemistrySelect 2021. [DOI: 10.1002/slct.202004745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Tushar Das
- Department of Chemistry National Institute of Technology Patna, Ashok Rajpath Patna 800005 India
- Department of Pharmacoinformatics National Institute of Pharmaceutical Education and Research Hajipur Vaishali Hajipur 844102 India
- Amit Ranjan
- Cancer & Translational Research Lab Dr. D.Y. Patil Biotechnology & Bioinformatics Institute Dr. D.Y. Patil Vidyapeeth Pune 411033 India
- Lesław Sieroń
- Institute of General and Ecological Chemistry Lodz University of Technology Żeromskiego 116 Łódź Poland
- Waldemar Maniukiewicz
- Institute of General and Ecological Chemistry Lodz University of Technology Żeromskiego 116 Łódź Poland
- Subrata Das
- Department of Chemistry National Institute of Technology Patna, Ashok Rajpath Patna 800005 India

21
Brackenridge DA, McGuffin LJ. Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods with a Focus on FunFOLD3. Methods Mol Biol 2021; 2365:43-58. [PMID: 34432238 DOI: 10.1007/978-1-0716-1665-9_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Proteins are essential molecules with a diverse range of functions; elucidating their biological and biochemical characteristics can be difficult and time consuming using in vitro and/or in vivo methods. Additionally, in vivo protein-ligand binding site elucidation is unable to keep pace with current growth in sequencing, leaving the majority of new protein sequences without known functions. Therefore, the development of new methods, which aim to predict the protein-ligand interactions and ligand-binding site residues directly from amino acid sequences, is becoming increasingly important. In silico prediction can utilise either sequence information, structural information or a combination of both. In this chapter, we will discuss the broad range of methods for ligand-binding site prediction from protein structure and we will describe our method, FunFOLD3, for the prediction of protein-ligand interactions and ligand-binding sites based on template-based modelling. Additionally, we will describe the step-by-step instructions for using the FunFOLD3 downloadable application along with examples from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) where FunFOLD3 has been used to aid ligand and ligand-binding site prediction. Finally, we will introduce our newer method, FunFOLD3-D, a version of FunFOLD3 which aims to improve template-based protein-ligand binding site prediction through the integration of docking, using AutoDock Vina.

22
Pavlovicz RE, Park H, DiMaio F. Efficient consideration of coordinated water molecules improves computational protein-protein and protein-ligand docking discrimination. PLoS Comput Biol 2020; 16:e1008103. [PMID: 32956350 PMCID: PMC7529342 DOI: 10.1371/journal.pcbi.1008103] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/07/2019] [Revised: 10/01/2020] [Accepted: 06/29/2020] [Indexed: 12/25/2022] Open
Abstract
Highly coordinated water molecules are frequently an integral part of protein-protein and protein-ligand interfaces. We introduce an updated energy model that efficiently captures the energetic effects of these ordered water molecules on the surfaces of proteins. A two-stage method is developed in which polar groups arranged in geometries suitable for water placement are first identified, then a modified Monte Carlo simulation allows highly coordinated waters to be placed on the surface of a protein while simultaneously sampling amino acid side chain orientations. This “semi-explicit” water model is implemented in Rosetta and is suitable for both structure prediction and protein design. We show that our new approach and energy model yield significant improvements in native structure recovery of protein-protein and protein-ligand docking discrimination tests. Well-coordinated water molecules—those forming multiple hydrogen bonds with nearby polar groups—play an important role in the structure of biomolecular systems, yet the effect of these waters is often not considered in molecular energy computations. In this paper, we describe a method to efficiently consider these water molecules both implicitly and explicitly at the interfaces formed by two polar molecules. In computations related to determining how a protein interacts with binding partners, we show that the use of this new method significantly improves results. Future application of this approach may improve the design of new protein and small molecule drugs.
Affiliation(s)
- Ryan E. Pavlovicz
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America

23
Weitzner BD, Kipnis Y, Daniel AG, Hilvert D, Baker D. A computational method for design of connected catalytic networks in proteins. Protein Sci 2020; 28:2036-2041. [PMID: 31642127 PMCID: PMC6863703 DOI: 10.1002/pro.3757] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/27/2019] [Revised: 10/21/2019] [Accepted: 10/21/2019] [Indexed: 02/05/2023]
Abstract
Computational design of new active sites has generally proceeded by geometrically defining interactions between the reaction transition state(s) and surrounding side‐chain functional groups which maximize transition‐state stabilization, and then searching for sites in protein scaffolds where the specified side‐chain–transition‐state interactions can be realized. A limitation of this approach is that the interactions between the side chains themselves are not constrained. An extensive connected hydrogen bond network involving the catalytic residues was observed in a designed retroaldolase following directed evolution. Such connected networks could increase catalytic activity by preorganizing active site residues in catalytically competent orientations, and enabling concerted interactions between side chains during catalysis, for example, proton shuffling. We developed a method for designing active sites in which the catalytic side chains, in addition to making interactions with the transition state, are also involved in extensive hydrogen bond networks. Because of the added constraint of hydrogen‐bond connectivity between the catalytic side chains, to find solutions, a wider range of interactions between these side chains and the transition state must be considered. Our new method starts from a ChemDraw‐like two‐dimensional representation of the transition state with hydrogen‐bond donors, acceptors, and covalent interaction sites indicated, and all placements of side‐chain functional groups that make the indicated interactions with the transition state, and are fully connected in a single hydrogen‐bond network are systematically enumerated. The RosettaMatch method can then be used to identify realizations of these fully‐connected active sites in protein scaffolds. The method generates many fully‐connected active site solutions for a set of model reactions that are promising starting points for the design of fully‐preorganized enzyme catalysts.
Affiliation(s)
- Brian D Weitzner
- Department of Biochemistry, University of Washington, Seattle, Washington; Institute for Protein Design, University of Washington, Seattle, Washington
- Yakov Kipnis
- Department of Biochemistry, University of Washington, Seattle, Washington; Institute for Protein Design, University of Washington, Seattle, Washington; Howard Hughes Medical Institute, University of Washington, Seattle, Washington
- A Gerard Daniel
- Department of Biochemistry, University of Washington, Seattle, Washington; Institute for Protein Design, University of Washington, Seattle, Washington
- Donald Hilvert
- Laboratory of Organic Chemistry, ETH Zurich, Zurich, Switzerland
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington; Institute for Protein Design, University of Washington, Seattle, Washington; Howard Hughes Medical Institute, University of Washington, Seattle, Washington

24
Accurate Representation of Protein-Ligand Structural Diversity in the Protein Data Bank (PDB). Int J Mol Sci 2020; 21:ijms21062243. [PMID: 32213914 PMCID: PMC7139665 DOI: 10.3390/ijms21062243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Revised: 03/06/2020] [Accepted: 03/20/2020] [Indexed: 11/16/2022] Open
Abstract
The number of available protein structures in the Protein Data Bank (PDB) has considerably increased in recent years. Thanks to the growth of structures and complexes, numerous large-scale studies have been done in various research areas, e.g., protein-protein and protein-DNA interactions or drug discovery. While protein redundancy has usually been managed with only a simple protein sequence identity threshold, the similarity of protein-ligand complexes should also be considered from a structural perspective. Protein-ligand duplicates in the PDB are widely known but were never quantitatively assessed, as they are quite complex to analyze and compare. Here, we present a specific clustering of protein-ligand structures to avoid bias found in different studies. The methodology is based on binding site superposition and a combination of weighted Root Mean Square Deviation (RMSD) assessment and hierarchical clustering. Repeated structures of proteins of interest are highlighted, and only representative conformations are retained for a non-biased view of protein distribution. Three types of cases are described based on the number of distinct conformations identified for each complex. Defining these categories decreases the number of complexes by 3.84-fold and offers more refined results compared to a protein sequence-based method. Widely distinct conformations were analyzed using normalized B-factors. Furthermore, a non-redundant dataset was generated for future molecular interaction analysis or virtual screening studies.

25
de Ávila MB, Bitencourt-Ferreira G, de Azevedo WF. Structural Basis for Inhibition of Enoyl-[Acyl Carrier Protein] Reductase (InhA) from Mycobacterium tuberculosis. Curr Med Chem 2020; 27:745-759. [DOI: 10.2174/0929867326666181203125229] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/06/2018] [Revised: 07/26/2018] [Accepted: 11/14/2018] [Indexed: 12/18/2022]
Abstract
Background: The enzyme trans-enoyl-[acyl carrier protein] reductase (InhA) is a central protein for the development of antitubercular drugs. This enzyme is the target of the pro-drug isoniazid, which is activated by the enzyme catalase-peroxidase (KatG).
Objective: Our goal here is to review the studies on InhA, starting with general aspects and focusing on the recent structural studies, with emphasis on the crystallographic structures of complexes involving InhA and inhibitors.
Method: We start with a literature review, and then we describe recent studies on InhA crystallographic structures. We use this structural information to depict protein-ligand interactions. We also analyze the structural basis for inhibition of InhA. Furthermore, we describe the application of computational methods to predict binding affinity based on the crystallographic position of the ligands.
Results: Analysis of the structures in complex with inhibitors revealed the critical residues responsible for the specificity against InhA. Most of the intermolecular interactions involve hydrophobic residues, with two exceptions, the residues Ser 94 and Tyr 158. Examination of the interactions has shown that many of the key residues for inhibitor binding are mutated in the InhA gene of isoniazid-resistant Mycobacterium tuberculosis. Computational prediction of the binding affinity for InhA has indicated a moderate positive correlation with experimental values.
Conclusion: Analysis of the structures involving InhA inhibitors shows that small modifications of these molecules could modulate their inhibition, which may be used to design novel antitubercular drugs specific for multidrug-resistant strains.
Affiliation(s)
- Maurício Boff de Ávila
- Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre-RS 90619-900, Brazil
- Gabriela Bitencourt-Ferreira
- Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre-RS 90619-900, Brazil
- Walter Filgueira de Azevedo
- Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre-RS 90619-900, Brazil

26
Su M, Feng G, Liu Z, Li Y, Wang R. Tapping on the Black Box: How Is the Scoring Power of a Machine-Learning Scoring Function Dependent on the Training Set? J Chem Inf Model 2020; 60:1122-1136. [DOI: 10.1021/acs.jcim.9b00714] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 11/30/2022]
Affiliation(s)
- Minyi Su
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
| | - Guoqin Feng
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
| | - Zhihai Liu
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
| | - Yan Li
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People’s Republic of China
| | - Renxiao Wang
- State Key Laboratory of Bioorganic and Natural Products Chemistry, Center for Excellence in Molecular Synthesis, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 345 Lingling Road, Shanghai 200032, People’s Republic of China
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People’s Republic of China
- Shanxi Key Laboratory of Innovative Drugs for the Treatment of Serious Diseases Basing on Chronic Inflammation, College of Traditional Chinese Medicines, Shanxi University of Chinese Medicine, Taiyuan, Shanxi 030619, People’s Republic of China
| |
|
27
|
Hu X, Maffucci I, Contini A. Advances in the Treatment of Explicit Water Molecules in Docking and Binding Free Energy Calculations. Curr Med Chem 2020; 26:7598-7622. [DOI: 10.2174/0929867325666180514110824] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/05/2018] [Revised: 02/26/2018] [Accepted: 04/18/2018] [Indexed: 12/30/2022]
Abstract
Background:
The inclusion of direct effects mediated by water during ligand-receptor
recognition is a hot topic of modern computational chemistry applied to drug discovery
and development. Docking or virtual screening with explicit hydration is still debatable,
despite the successful cases that have been presented in the last years. Indeed, how to select
the water molecules that will be included in the docking process or how the included waters
should be treated remain open questions.
Objective:
In this review, we will discuss some of the most recent methods that can be used in
computational drug discovery and drug development when the effect of a single water, or of a
small network of interacting waters, needs to be explicitly considered.
Results:
Here, we analyse the software to aid the selection, or to predict the position, of water
molecules that are going to be explicitly considered in later docking studies. We also present
software and protocols able to efficiently treat flexible water molecules during docking, including
examples of applications. Finally, we discuss methods based on molecular dynamics
simulations that can be used to integrate docking studies or to reliably and efficiently compute
binding energies of ligands in presence of interfacial or bridging water molecules.
Conclusions:
Software applications aiding the design of new drugs that exploit water molecules,
either as displaceable residues or as bridges to the receptor, are constantly being developed.
Although further validation is needed, workflows that explicitly consider water will
probably become a standard for computational drug discovery soon.
Affiliation(s)
- Xiao Hu
- Università degli Studi di Milano, Dipartimento di Scienze Farmaceutiche, Sezione di Chimica Generale e Organica “A. Marchesini”, Via Venezian, 21 20133 Milano, Italy
| | - Irene Maffucci
- Pasteur, Département de Chimie, École Normale Supérieure, PSL Research University, Sorbonne Universités, UPMC Univ. Paris 06, CNRS, 75005 Paris, France
| | - Alessandro Contini
- Università degli Studi di Milano, Dipartimento di Scienze Farmaceutiche, Sezione di Chimica Generale e Organica “A. Marchesini”, Via Venezian, 21 20133 Milano, Italy
| |
|
28
|
Thafar M, Raies AB, Albaradei S, Essack M, Bajic VB. Comparison Study of Computational Prediction Tools for Drug-Target Binding Affinities. Front Chem 2019; 7:782. [PMID: 31824921 PMCID: PMC6879652 DOI: 10.3389/fchem.2019.00782] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 08/09/2019] [Accepted: 10/30/2019] [Indexed: 12/30/2022] Open
Abstract
Drug development is generally arduous and costly, and success rates are low. Thus, the identification of drug-target interactions (DTIs) has become a crucial step in the early stages of drug discovery. Consequently, computational approaches capable of identifying potential DTIs with a minimum error rate are increasingly being pursued. These computational approaches aim to narrow down the search space for novel DTIs and shed light on drug functioning context. Most methods developed to date use binary classification to predict if the interaction between a drug and its target exists or not. However, it is more informative but also more challenging to predict the strength of the binding between a drug and its target. If that strength is not sufficiently strong, such a DTI may not be useful. Therefore, the methods developed to predict drug-target binding affinities (DTBA) are of great value. In this study, we provide a comprehensive overview of the existing methods that predict DTBA. We focus on the methods developed using artificial intelligence (AI), machine learning (ML), and deep learning (DL) approaches, as well as related benchmark datasets and databases. Furthermore, guidance and recommendations are provided that cover the gaps and directions of the upcoming work in this research area. To the best of our knowledge, this is the first comprehensive comparison analysis of tools focused on DTBA with reference to AI/ML/DL.
Affiliation(s)
- Maha Thafar
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
| | - Arwa Bin Raies
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Somayah Albaradei
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Magbubah Essack
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Vladimir B. Bajic
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| |
|
29
|
CavBench: A benchmark for protein cavity detection methods. PLoS One 2019; 14:e0223596. [PMID: 31609980 PMCID: PMC6791542 DOI: 10.1371/journal.pone.0223596] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/26/2019] [Accepted: 09/24/2019] [Indexed: 11/19/2022] Open
Abstract
Extensive research has been applied to discover new techniques and methods to model protein-ligand interactions. In particular, considerable efforts focused on identifying candidate binding sites, which quite often are active sites that correspond to protein pockets or cavities. Thus, these cavities play an important role in molecular docking. However, there is no established benchmark to assess the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground-truth. However, studies supported by large datasets of known cavities and/or binding sites and statistical classification (i.e., false positives, false negatives, true positives, and true negatives) would yield much stronger and reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark to compare different cavity detection methods relative to diverse ground truth datasets (e.g., PDBsum) using statistical classification methods.
|
30
|
Updates to Binding MOAD (Mother of All Databases): Polypharmacology Tools and Their Utility in Drug Repurposing. J Mol Biol 2019; 431:2423-2433. [PMID: 31125569 DOI: 10.1016/j.jmb.2019.05.024] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 02/04/2019] [Revised: 05/13/2019] [Accepted: 05/14/2019] [Indexed: 01/02/2023]
Abstract
The goal of Binding MOAD is to provide users with a data set focused on high-quality x-ray crystal structures that have been solved with biologically relevant ligands bound. Where available, experimental binding affinities (Ka, Kd, Ki, IC50) are provided from the primary literature of the crystal structure. The database has been updated regularly since 2005, and this most recent update has added nearly 7000 new structures (growth of 21%). MOAD currently contains 32,747 structures, composed of 9117 protein families and 16,044 unique ligands. The data are freely available on www.BindingMOAD.org. This paper outlines updates to the data in Binding MOAD as well as improvements made to both the website and its contents. The NGL viewer has been added to improve visualization of the ligands and protein structures. MarvinJS has been implemented, over the outdated MarvinView, to work with JChem for small molecule searching in the database. To add tools for predicting polypharmacology, we have added information about sequence, binding-site, and ligand similarity between entries in the database. A main premise behind polypharmacology is that similar binding sites will bind similar ligands. The large amount of protein-ligand information available in Binding MOAD allows us to compute pairwise ligand and binding-site similarities. Lists of similar ligands and similar binding sites have been added to allow users to identify potential polypharmacology pairs. To show the utility of the polypharmacology data, we detail a few examples from Binding MOAD of drug repurposing targets with their respective similarities.
|
31
|
Volkart PA, Bitencourt-Ferreira G, Souto AA, de Azevedo WF. Cyclin-Dependent Kinase 2 in Cellular Senescence and Cancer. A Structural and Functional Review. Curr Drug Targets 2019; 20:716-726. [DOI: 10.2174/1389450120666181204165344] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 06/27/2018] [Revised: 11/27/2018] [Accepted: 11/28/2018] [Indexed: 02/03/2023]
Abstract
Background: Cyclin-dependent kinase 2 (CDK2) has been studied due to its role in
cell-cycle progression. The elucidation of the CDK2 structure paved the way to investigate the molecular
basis for inhibition of this enzyme, with coordinated efforts combining crystallography with
functional studies.

Objective: Our goal here is to review recent functional and structural studies directed to understanding
the role of CDK2 in cancer and senescence.

Methods: There are over four hundred crystallographic structures available for CDK2, many of
them with binding affinity information. We use this abundance of data to analyze the essential features
responsible for the inhibition of CDK2 and its function in cancer and senescence.

Results: The structural and affinity data available for CDK2 make it possible to have a clear view of the
vital CDK2 residues involved in molecular recognition. A detailed description of the structural basis
for ligand binding is of pivotal importance in the design of CDK2 inhibitors. Our analysis shows the
relevance of the residues Leu 83 and Asp 86 for binding affinity. The recent findings revealing the
participation of CDK2 inhibition in senescence open the possibility to explore the richness of structural
and affinity data for a new era in the development of CDK2 inhibitors, targeting cellular senescence.

Conclusion: Here, we analyzed structural information for CDK2 in combination with inhibitors and
mapped the molecular aspects behind the strongest CDK2 inhibitors for which structures and ligand-binding
affinity data were available. From this analysis, we identified the significant intermolecular
interactions responsible for binding affinity. This knowledge may guide the future development of
CDK2 inhibitors targeting cancer and cellular senescence.
Affiliation(s)
- Priscylla Andrade Volkart
- School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
| | - Gabriela Bitencourt-Ferreira
- School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
| | - André Arigony Souto
- School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
| | - Walter Filgueira de Azevedo
- School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS). Av. Ipiranga, 6681 Porto Alegre/RS 90619-900, Brazil
| |
|
32
|
Bouadjenek MR, Zobel J, Verspoor K. Automated assessment of biological database assertions using the scientific literature. BMC Bioinformatics 2019; 20:216. [PMID: 31035936 PMCID: PMC6489365 DOI: 10.1186/s12859-019-2801-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/14/2018] [Accepted: 04/09/2019] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND The large biological databases such as GenBank contain vast numbers of records, the content of which is substantively based on external resources, including published literature. Manual curation is used to establish whether the literature and the records are indeed consistent. We explore in this paper an automated method for assessing the consistency of biological assertions, to assist biocurators, which we call BARC, Biocuration tool for Assessment of Relation Consistency. In this method a biological assertion is represented as a relation between two objects (for example, a gene and a disease); we then use our novel set-based relevance algorithm SaBRA to retrieve pertinent literature, and apply a classifier to estimate the likelihood that this relation (assertion) is correct. RESULTS Our experiments on assessing gene-disease relations and protein-protein interactions using the PubMed Central collection show that BARC can be effective at assisting curators to perform data cleansing. Specifically, the results obtained showed that BARC substantially outperforms the best baselines, with an improvement of F-measure of 3.5% and 13%, respectively, on gene-disease relations and protein-protein interactions. We have additionally carried out a feature analysis that showed that all feature types are informative, as are all fields of the documents. CONCLUSIONS BARC provides a clear benefit for the biocuration community, as there are no prior automated tools for identifying inconsistent assertions in large-scale biological databases.
Affiliation(s)
- Mohamed Reda Bouadjenek
- Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, M5S 3G8 Canada
| | - Justin Zobel
- School of Computing and Information Systems, University of Melbourne, Melbourne, 3010 Australia
| | - Karin Verspoor
- School of Computing and Information Systems, University of Melbourne, Melbourne, 3010 Australia
| |
|
33
|
Huang SY. Comprehensive assessment of flexible-ligand docking algorithms: current effectiveness and challenges. Brief Bioinform 2019; 19:982-994. [PMID: 28334282 DOI: 10.1093/bib/bbx030] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/13/2022] Open
Abstract
Protein-ligand docking has been playing an important role in modern drug discovery. To model drug-target binding in real systems, a number of flexible-ligand docking algorithms with different sampling strategies and scoring methods have been subsequently developed over the past three decades, while rigid-ligand docking is still being used because of its compelling computational efficiency. Here, a comprehensive assessment has been conducted to investigate the effectiveness of flexible-ligand docking versus rigid-ligand docking for three representative docking algorithms (global optimization, incremental construction and multi-conformer docking) in virtual screening and pose prediction on the Directory of Useful Decoys. It was found that overall flexible-ligand docking did not achieve a statistically significant improvement in enrichments over rigid-ligand docking in virtual screening, but all docking programs significantly improved the success rates when considering ligand flexibility in pose prediction. The worse effectiveness of flexible-ligand docking in virtual screening than in pose prediction suggests that the challenges of current docking algorithms exist in ranking more than docking, although the use of flexible-ligand docking in virtual screening was justified by its better effectiveness for more flexible ligand in virtual screening. Challenges for scoring, including internal energy, charge polarization, entropy and flexibility, were investigated and discussed. An empirical way was also proposed to consider loss of ligand conformational entropy for virtual screening.
Affiliation(s)
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, P. R. China
| |
|
34
|
Kores K, Lešnik S, Bren U, Janežič D, Konc J. Discovery of Novel Potential Human Targets of Resveratrol by Inverse Molecular Docking. J Chem Inf Model 2019; 59:2467-2478. [PMID: 30883115 DOI: 10.1021/acs.jcim.8b00981] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/14/2022]
Abstract
Resveratrol is a polyphenol known for its antioxidant and anti-inflammatory properties, which support its use as a treatment for a variety of diseases. There are already known connections of resveratrol to chemoprevention of cancer because of its ability to prevent tumor initiation and inhibit tumor promotion and progression. Resveratrol is also believed to be important in cardiovascular diseases and neurological disorders, such as Alzheimer's disease. Using an inverse molecular docking approach, we sought to find new potential targets of resveratrol. Docking of resveratrol into each ProBiS-predicted binding site of >38 000 protein structures from the Protein Data Bank was examined, and a number of novel potential targets into which resveratrol was docked successfully were found. These explain known actions or predict new effects of resveratrol. The results included three human proteins that are already known to bind resveratrol. A majority of the proteins discovered, however, have no previously described connections with resveratrol. We report new potential target human proteins and proteins connected with different organisms into which resveratrol can dock. Our results reveal previously unknown potential target human proteins, whose connection with cardiovascular and neurological disorders could lead to new potential treatments for a variety of diseases. We believe that our research could help in future experimental studies on resveratrol bioactivity in humans.
Affiliation(s)
- Katarina Kores
- University of Maribor, Faculty for Chemistry and Chemical Technology Maribor, Smetanova ulica 17, SI-2000 Maribor, Slovenia
| | - Samo Lešnik
- National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
| | - Urban Bren
- University of Maribor, Faculty for Chemistry and Chemical Technology Maribor, Smetanova ulica 17, SI-2000 Maribor, Slovenia
- National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
- University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technology, Glagoljaška 8, SI-6000 Koper, Slovenia
| | - Dušanka Janežič
- University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technology, Glagoljaška 8, SI-6000 Koper, Slovenia
| | - Janez Konc
- National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
- University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technology, Glagoljaška 8, SI-6000 Koper, Slovenia
| |
|
35
|
CSgator: an integrated web platform for compound set analysis. J Cheminform 2019; 11:17. [PMID: 30830479 PMCID: PMC6419788 DOI: 10.1186/s13321-019-0339-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/25/2018] [Accepted: 02/26/2019] [Indexed: 12/13/2022] Open
Abstract
Drug discovery typically involves investigation of a set of compounds (e.g. drug screening hits) in terms of target, disease, and bioactivity. CSgator is a comprehensive analytic tool for set-wise interpretation of compounds. It has two unique analytic features, Compound Set Enrichment Analysis (CSEA) and Compound Cluster Analysis (CCA), which allow batch analysis of a compound set in terms of (i) target, (ii) bioactivity, (iii) disease, and (iv) structure. CSEA and CCA present enriched profiles of targets and bioactivities in a compound set, which leads to novel insights on underlying drug mode-of-action and potential targets. Notably, we propose a novel concept of 'Hit Enriched Assays', i.e. bioassays of which hits are enriched among a given set of compounds. As an example, we show its utility in revealing drug mode-of-action or identifying hidden targets for anti-lymphangiogenesis screening hits. CSgator is available at http://csgator.ewha.ac.kr, and most analytic results are downloadable.
|
36
|
|
37
|
Jiménez J, Škalič M, Martínez-Rosell G, De Fabritiis G. KDEEP: Protein–Ligand Absolute Binding Affinity Prediction via 3D-Convolutional Neural Networks. J Chem Inf Model 2018; 58:287-296. [DOI: 10.1021/acs.jcim.7b00650] [Citation(s) in RCA: 389] [Impact Index Per Article: 64.8] [Indexed: 01/02/2023]
Affiliation(s)
- José Jiménez
- Computational Biophysics Laboratory, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Carrer del Dr. Aiguader 88, Barcelona 08003, Spain
| | - Miha Škalič
- Computational Biophysics Laboratory, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Carrer del Dr. Aiguader 88, Barcelona 08003, Spain
| | - Gerard Martínez-Rosell
- Computational Biophysics Laboratory, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Carrer del Dr. Aiguader 88, Barcelona 08003, Spain
| | - Gianni De Fabritiis
- Computational Biophysics Laboratory, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Carrer del Dr. Aiguader 88, Barcelona 08003, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
| |
|
38
|
Réau M, Langenfeld F, Zagury JF, Lagarde N, Montes M. Decoys Selection in Benchmarking Datasets: Overview and Perspectives. Front Pharmacol 2018; 9:11. [PMID: 29416509 PMCID: PMC5787549 DOI: 10.3389/fphar.2018.00011] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 11/10/2017] [Accepted: 01/05/2018] [Indexed: 11/24/2022] Open
Abstract
Virtual Screening (VS) is designed to prospectively help identify potential hits, i.e., compounds capable of interacting with a given target and potentially modulating its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is the most adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compounds subsets is critical to limit the biases in the evaluation of the VS methods. In this review, we focus on the selection of decoy compounds, which has considerably changed over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoys selection in benchmarking databases as well as current benchmarking databases that tend to minimize the introduction of biases, and secondly, we propose recommendations for the selection and the design of benchmarking datasets.
Affiliation(s)
- Manon Réau
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, Paris, France
| | - Florent Langenfeld
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, Paris, France
| | - Jean-François Zagury
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, Paris, France
| | - Nathalie Lagarde
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, Paris, France
| | - Matthieu Montes
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, Paris, France
| |
|
39
|
Ashtawy HM, Mahapatra NR. Descriptor Data Bank (DDB): A Cloud Platform for Multiperspective Modeling of Protein–Ligand Interactions. J Chem Inf Model 2017; 58:134-147. [DOI: 10.1021/acs.jcim.7b00310] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/17/2023]
Affiliation(s)
- Hossam M. Ashtawy
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824-1226, United States
| | - Nihar R. Mahapatra
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824-1226, United States
| |
|
40
|
All-Atom Four-Body Knowledge-Based Statistical Potentials to Distinguish Native Protein Structures from Nonnative Folds. BIOMED RESEARCH INTERNATIONAL 2017; 2017:5760612. [PMID: 29119109 PMCID: PMC5651141 DOI: 10.1155/2017/5760612] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/27/2017] [Revised: 08/13/2017] [Accepted: 08/23/2017] [Indexed: 02/05/2023]
Abstract
Recent advances in understanding protein folding have benefitted from coarse-grained representations of protein structures. Empirical energy functions derived from these techniques occasionally succeed in distinguishing native structures from their corresponding ensembles of nonnative folds or decoys which display varying degrees of structural dissimilarity to the native proteins. Here we utilized atomic coordinates of single protein chains, comprising a large diverse training set, to develop and evaluate twelve all-atom four-body statistical potentials obtained by exploring alternative values for a pair of inherent parameters. Delaunay tessellation was performed on the atomic coordinates of each protein to objectively identify all quadruplets of interacting atoms, and atomic potentials were generated via statistical analysis of the data and implementation of the inverted Boltzmann principle. Our potentials were evaluated using benchmarking datasets from Decoys-‘R'-Us, and comparisons were made with twelve other physics- and knowledge-based potentials. Ranking 3rd, our best potential tied CHARMM19 and surpassed AMBER force field potentials. We illustrate how a generalized version of our potential can be used to empirically calculate binding energies for target-ligand complexes, using HIV-1 protease-inhibitor complexes for a practical application. The combined results suggest an accurate and efficient atomic four-body statistical potential for protein structure prediction and assessment.
|
41
|
Kimura SR, Hu HP, Ruvinsky AM, Sherman W, Favia AD. Deciphering Cryptic Binding Sites on Proteins by Mixed-Solvent Molecular Dynamics. J Chem Inf Model 2017; 57:1388-1401. [PMID: 28537745 DOI: 10.1021/acs.jcim.6b00623] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Indexed: 01/05/2023]
Abstract
In recent years, molecular dynamics simulations of proteins in explicit mixed solvents have been applied to various problems in protein biophysics and drug discovery, including protein folding, protein surface characterization, fragment screening, allostery, and druggability assessment. In this study, we perform a systematic study on how mixtures of organic solvent probes in water can reveal cryptic ligand binding pockets that are not evident in crystal structures of apo proteins. We examine a diverse set of eight PDB proteins that show pocket opening induced by ligand binding and investigate whether solvent MD simulations on the apo structures can induce the binding site observed in the holo structures. The cosolvent simulations were found to induce conformational changes on the protein surface, which were characterized and compared with the holo structures. Analyses of the biological systems, choice of probes and concentrations, druggability of the resulting induced pockets, and application to drug discovery are discussed here.
Affiliation(s)
- S Roy Kimura
- Schrödinger KK, 17th Fl, Marunouchi Trust Tower North, 1-8-1 Marunouchi, Chiyoda-ku, Tokyo, Japan
| | - Hai Peng Hu
- Lilly China Research and Development Center (LCRDC), Eli Lilly and Company, Building 8, 338 Jia Li Lue Road, Shanghai 201203, PR China
| | - Anatoly M Ruvinsky
- Schrödinger LLC, 222 Third Street, Suite 2230, Cambridge, Massachusetts 02142, United States
| | - Woody Sherman
- Schrödinger LLC, 222 Third Street, Suite 2230, Cambridge, Massachusetts 02142, United States
| | - Angelo D Favia
- Lilly China Research and Development Center (LCRDC), Eli Lilly and Company, Building 8, 338 Jia Li Lue Road, Shanghai 201203, PR China
| |
|
42
|
Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks. Methods Mol Biol 2017; 1558:415-436. [PMID: 28150250 DOI: 10.1007/978-1-4939-6783-4_20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.
|
43
|
Cao C, Xu S. Improving the performance of the PLB index for ligand-binding site prediction using dihedral angles and the solvent-accessible surface area. Sci Rep 2016; 6:33232. [PMID: 27619067 PMCID: PMC5020399 DOI: 10.1038/srep33232] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/01/2016] [Accepted: 08/23/2016] [Indexed: 12/02/2022] Open
Abstract
Protein ligand-binding site prediction is highly important for protein function determination and structure-based drug design. Over the past twenty years, dozens of computational methods have been developed to address this problem. Soga et al. identified ligand cavities based on the preferences of amino acids for the ligand-binding site (RA) and proposed the propensity for ligand binding (PLB) index to rank the cavities on the protein surface. However, we found that residues exhibit different RAs in response to changes in solvent exposure. Furthermore, previous studies have suggested that some dihedral angles of amino acids in specific regions of the Ramachandran plot are preferred at the functional sites of proteins. Based on these discoveries, the amino acid solvent-accessible surface area and dihedral angles were combined with the RA and PLB to obtain two new indexes, multi-factor RA (MF-RA) and multi-factor PLB (MF-PLB). MF-PLB, PLB and other methods were tested using two benchmark databases and two particular ligand-binding sites. The results show that MF-PLB can improve the success rate of PLB for both ligand-bound and ligand-unbound structures, particularly for top choice prediction.
Affiliation(s)
- Chen Cao
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun, Jilin, China
| | - Shutan Xu
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| |
|
44
|
Tanramluk D, Narupiyakul L, Akavipat R, Gong S, Charoensawan V. MANORAA (Mapping Analogous Nuclei Onto Residue And Affinity) for identifying protein-ligand fragment interaction, pathways and SNPs. Nucleic Acids Res 2016; 44:W514-21. [PMID: 27131358 PMCID: PMC4987895 DOI: 10.1093/nar/gkw314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/08/2016] [Revised: 04/07/2016] [Accepted: 04/13/2016] [Indexed: 11/15/2022]
Abstract
Protein-ligand interaction analysis is an important step in drug design and protein engineering for predicting the binding affinity and selectivity of ligands for their target proteins. To date, there are more than 100 000 structures available in the Protein Data Bank (PDB), of which ∼30% are protein-ligand (MW below 1000 Da) complexes. We have developed the integrative web server MANORAA (Mapping Analogous Nuclei Onto Residue And Affinity) with the aim of providing a user-friendly web interface to assist structural study and design of protein-ligand interactions. In brief, the server allows users to input chemical fragments and presents all the unique molecular interactions with target proteins that have three-dimensional structures available in the PDB. Users can also link the ligands of interest to assess possible off-target proteins, human variants and pathway information using our all-in-one integrated tools. Taken together, we envisage that the server will facilitate and improve the study of protein-ligand interactions by allowing observation and comparison of ligand interactions with multiple proteins at the same time. The server is available at http://manoraa.org.
Affiliation(s)
- Duangrudee Tanramluk
- Institute of Molecular Biosciences, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
| | - Lalita Narupiyakul
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Department of Computer Engineering, Faculty of Engineering, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand
| | - Ruj Akavipat
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Department of Computer Science, Faculty of Science, Kasetsart University, Chatuchak, Bangkok 10900, Thailand
| | - Sungsam Gong
- Department of Obstetrics and Gynaecology, University of Cambridge, The Rosie Hospital, Cambridge CB2 0SW, UK
| | - Varodom Charoensawan
- Integrative Computational BioScience (ICBS) Center, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand; Department of Biochemistry, Faculty of Science, Mahidol University, Ratchathewi, Bangkok 10400, Thailand
| |
|
45
|
Chang CW, Chou CW, Chang DTH. CCProf: exploring conformational change profile of proteins. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw029. [PMID: 27016699 PMCID: PMC4808249 DOI: 10.1093/database/baw029] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/14/2015] [Accepted: 02/23/2016] [Indexed: 12/18/2022]
Abstract
In many biological processes, proteins have important interactions with various molecules such as proteins, ions or ligands. Many proteins undergo conformational changes upon these interactions, and the regions with large conformational changes are critical to the interactions. This work presents the CCProf platform, which provides the conformational changes of entire proteins, termed the conformational change profile (CCP). CCProf aims to be a platform where users can study potential causes of novel conformational changes. It provides 10 biological features, including conformational change, potential binding target site, secondary structure, conservation, disorder propensity, hydropathy propensity, sequence domain, structural domain, phosphorylation site and catalytic site. All of this information is integrated into a well-aligned view, so that researchers can visually capture important relationships between different biological features. CCProf contains 986 187 protein structure pairs for 3123 proteins. In addition, CCProf provides a 3D view in which users can see the protein structures before and after conformational changes as well as the binding targets that induce conformational changes. All information shown in CCProf (e.g. CCP, binding targets and protein structures), including intermediate data, is available for download to expedite further analyses. Database URL: http://zoro.ee.ncku.edu.tw/ccprof/
Affiliation(s)
- Che-Wei Chang
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
| | - Chai-Wei Chou
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
| | - Darby Tien-Hao Chang
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
| |
|
46
|
Remez N, Garcia-Serna R, Vidal D, Mestres J. The In Vitro Pharmacological Profile of Drugs as a Proxy Indicator of Potential In Vivo Organ Toxicities. Chem Res Toxicol 2016; 29:637-48. [PMID: 26952164 DOI: 10.1021/acs.chemrestox.5b00470] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]
Abstract
The potential of a drug to cause certain organ toxicities is somehow implicitly contained in its full pharmacological profile, provided the drug reaches and accumulates at the various organs where the different interacting proteins in its profile, both targets and off-targets, are expressed. Under this assumption, a computational approach was implemented to obtain a projected anatomical profile of a drug from its in vitro pharmacological profile linked to protein expression data across 47 organs. It was observed that the anatomical profiles obtained when using only the known primary targets of the drugs reflected roughly the intended organ targets. However, when both known and predicted secondary pharmacology was considered, the projected anatomical profiles of the drugs were able to clearly highlight potential organ off-targets. Accordingly, when applied to sets of drugs known to cause cardiotoxicity and hepatotoxicity, the approach is able to identify heart and liver, respectively, as the organs where the proteins in the pharmacological profile of the corresponding drugs are specifically expressed. When applied to a set of drugs linked to a risk of Torsades de Pointes, heart is again the organ clearly standing out from the rest and a potential protein profile hazard is proposed. The approach can be used as a proxy indicator of potential in vivo organ toxicities.
Affiliation(s)
- Nikita Remez
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Chemotargets SL, Parc Científic de Barcelona, Baldiri Reixac 4 (TI-05A7), 08028 Barcelona, Catalonia, Spain
| | - Ricard Garcia-Serna
- Chemotargets SL, Parc Científic de Barcelona, Baldiri Reixac 4 (TI-05A7), 08028 Barcelona, Catalonia, Spain
| | - David Vidal
- Chemotargets SL, Parc Científic de Barcelona, Baldiri Reixac 4 (TI-05A7), 08028 Barcelona, Catalonia, Spain
| | - Jordi Mestres
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain; Chemotargets SL, Parc Científic de Barcelona, Baldiri Reixac 4 (TI-05A7), 08028 Barcelona, Catalonia, Spain
| |
|
47
|
Glaab E. Building a virtual ligand screening pipeline using free software: a survey. Brief Bioinform 2016; 17:352-66. [PMID: 26094053 PMCID: PMC4793892 DOI: 10.1093/bib/bbv037] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 03/05/2015] [Revised: 05/20/2015] [Indexed: 12/17/2022]
Abstract
Virtual screening, the search for bioactive compounds via computational methods, provides a wide range of opportunities to speed up drug development and reduce the associated risks and costs. While virtual screening is already a standard practice in pharmaceutical companies, its applications in preclinical academic research still remain under-exploited, in spite of an increasing availability of dedicated free databases and software tools. In this survey, an overview of recent developments in this field is presented, focusing on free software and data repositories for screening as alternatives to their commercial counterparts, and outlining how available resources can be interlinked into a comprehensive virtual screening pipeline using typical academic computing facilities. Finally, to facilitate the set-up of corresponding pipelines, a downloadable software system is provided, using platform virtualization to integrate pre-installed screening tools and scripts for reproducible application across different operating systems.
|
48
|
Cimermancic P, Weinkam P, Rettenmaier TJ, Bichmann L, Keedy DA, Woldeyes RA, Schneidman-Duhovny D, Demerdash ON, Mitchell JC, Wells JA, Fraser JS, Sali A. CryptoSite: Expanding the Druggable Proteome by Characterization and Prediction of Cryptic Binding Sites. J Mol Biol 2016; 428:709-719. [PMID: 26854760 DOI: 10.1016/j.jmb.2016.01.029] [Citation(s) in RCA: 137] [Impact Index Per Article: 17.1] [Received: 06/24/2015] [Revised: 01/29/2016] [Accepted: 01/30/2016] [Indexed: 01/04/2023]
Abstract
Many proteins have small-molecule binding pockets that are not easily detectable in the ligand-free structures. These cryptic sites require a conformational change to become apparent; a cryptic site can therefore be defined as a site that forms a pocket in a holo structure, but not in the apo structure. Because many proteins appear to lack druggable pockets, understanding and accurately identifying cryptic sites could expand the set of drug targets. Previously, cryptic sites were identified experimentally by fragment-based ligand discovery and computationally by long molecular dynamics simulations and fragment docking. Here, we begin by constructing a set of structurally defined apo-holo pairs with cryptic sites. Next, we comprehensively characterize the cryptic sites in terms of their sequence, structure, and dynamics attributes. We find that cryptic sites tend to be as conserved in evolution as traditional binding pockets but are less hydrophobic and more flexible. Relying on this characterization, we use machine learning to predict cryptic sites with relatively high accuracy (for our benchmark, the true positive and false positive rates are 73% and 29%, respectively). We then predict cryptic sites in the entire structurally characterized human proteome (11,201 structures, covering 23% of all residues in the proteome). CryptoSite increases the size of the potentially "druggable" human proteome from ~40% to ~78% of disease-associated proteins. Finally, to demonstrate the utility of our approach in practice, we experimentally validate a cryptic site in protein tyrosine phosphatase 1B using a covalent ligand and NMR spectroscopy. The CryptoSite Web server is available at http://salilab.org/cryptosite.
Affiliation(s)
- Peter Cimermancic
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Graduate Group in Biological and Medical Informatics, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Patrick Weinkam
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - T Justin Rettenmaier
- Graduate Group in Chemistry and Chemical Biology, University of California, San Francisco, San Francisco, CA 94158, USA; Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Leon Bichmann
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Daniel A Keedy
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Rahel A Woldeyes
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Graduate Group in Chemistry and Chemical Biology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Dina Schneidman-Duhovny
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Omar N Demerdash
- Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Julie C Mitchell
- Departments of Biochemistry and Mathematics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - James A Wells
- Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Cellular and Molecular Pharmacology and California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Pharmaceutical Chemistry and California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, CA 94158, USA. http://salilab.org
| |
|
49
|
Zhu L, Yang Y, Lu X. The selectivity and promiscuity of brain-neuroregenerative inhibitors between ROCK1 and ROCK2 isoforms: An integration of SB-QSSR modelling, QM/MM analysis and in vitro kinase assay. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2016; 27:47-65. [PMID: 26854727 DOI: 10.1080/1062936x.2015.1132765] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
The Rho-associated kinases (ROCKs) have long been recognized as an attractive therapeutic target for various neurological diseases; selective inhibition of ROCK1 and ROCK2 isoforms would result in distinct biological effects on neurogenesis, neuroplasticity and neuroregeneration after brain surgery and traumatic brain injury. However, the discovery and design of isoform-selective inhibitors remain a great challenge due to the high conservation and similarity between the kinase domains of ROCK1 and ROCK2. Here, a structure-based quantitative structure-selectivity relationship (SB-QSSR) approach was used to correlate experimentally measured selectivity with the difference in inhibitor binding to the two kinase isoforms. The resulting regression models were examined rigorously through both internal cross-validation and external blind validation; a nonlinear predictor was found to have high fitting stability and strong generalization ability, which was then employed to perform virtual screening against a structurally diverse, drug-like compound library. Consequently, five and seven hits were identified as promising candidates of 1-o-2 and 2-o-1 selective inhibitors, respectively, from which seven purchasable compounds were tested in vitro using a standard kinase assay protocol to determine their inhibitory activity against and selectivity between ROCK1 and ROCK2. The structural basis, energetic property and biological implication underlying inhibitor selectivity and promiscuity were also investigated systematically using a hybrid quantum mechanics/molecular mechanics (QM/MM) scheme.
Affiliation(s)
- L Zhu
- Department of Neurosurgery, People's Hospital affiliated to Jiangsu University, Zhenjiang, China
| | - Y Yang
- Department of Neurosurgery, People's Hospital affiliated to Jiangsu University, Zhenjiang, China
| | - X Lu
- Department of Neurosurgery, People's Hospital affiliated to Jiangsu University, Zhenjiang, China
| |
|
50
|
Barelier S, Sterling T, O’Meara MJ, Shoichet BK. The Recognition of Identical Ligands by Unrelated Proteins. ACS Chem Biol 2015; 10:2772-84. [PMID: 26421501 DOI: 10.1021/acschembio.5b00683] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 11/29/2022]
Abstract
The binding of drugs and reagents to off-targets is well-known. Whereas many off-targets are related to the primary target by sequence and fold, many ligands bind to unrelated pairs of proteins, and these are harder to anticipate. If the binding site in the off-target can be related to that of the primary target, this challenge resolves into aligning the two pockets. However, other cases are possible: the ligand might interact with entirely different residues and environments in the off-target, or wholly different ligand atoms may be implicated in the two complexes. To investigate these scenarios at atomic resolution, the structures of 59 ligands in 116 complexes (62 pairs in total), where the protein pairs were unrelated by fold but bound an identical ligand, were examined. In almost half of the pairs, the ligand interacted with unrelated residues in the two proteins (29 pairs), and in 14 of the pairs wholly different ligand moieties were implicated in each complex. Even in those 19 pairs of complexes that presented similar environments to the ligand, ligand superposition rarely resulted in the overlap of related residues. There appears to be no single pattern-matching "code" for identifying binding sites in unrelated proteins that bind identical ligands, though modeling suggests that there might be a limited number of different patterns that suffice to recognize different ligand functional groups.
Affiliation(s)
- Sarah Barelier
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 Fourth Street, Byers Hall, San Francisco, California 94158, United States
| | - Teague Sterling
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 Fourth Street, Byers Hall, San Francisco, California 94158, United States
| | - Matthew J. O’Meara
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 Fourth Street, Byers Hall, San Francisco, California 94158, United States
| | - Brian K. Shoichet
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 Fourth Street, Byers Hall, San Francisco, California 94158, United States
| |
|