1
Pereira GP, Jiménez-García B, Pellarin R, Launay G, Wu S, Martin J, Souza PCT. Rational Prediction of PROTAC-Compatible Protein-Protein Interfaces by Molecular Docking. J Chem Inf Model 2023; 63:6823-6833. [PMID: 37877240 DOI: 10.1021/acs.jcim.3c01154] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/26/2023]
Abstract
Proteolysis targeting chimeras (PROTACs) are heterobifunctional ligands that mediate the interaction between a protein target and an E3 ligase, resulting in a ternary complex, whose interaction with the ubiquitination machinery leads to target degradation. This technology is emerging as an exciting new avenue for therapeutic development, with several PROTACs currently undergoing clinical trials targeting cancer. Here, we describe a general and computationally efficient methodology combining restraint-based docking, energy-based rescoring, and a filter based on the minimal solvent-accessible surface distance to produce PROTAC-compatible PPIs suitable for when there is no a priori known PROTAC ligand. In a benchmark employing a manually curated data set of 13 ternary complex crystals, we achieved an accuracy of 92% when starting from bound structures and 77% when starting from unbound structures, respectively. Our method only requires that the ligand-bound structures of the monomeric forms of the E3 ligase and target proteins be given to run, making it general, accurate, and highly efficient, with the ability to impact early-stage PROTAC-based drug design campaigns where no structural information about the ternary complex structure is available.
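As a quick consistency check on the benchmark figures above, the sketch below searches for whole-complex success counts out of 13 that round to the reported accuracies. It assumes success is scored once per complex and that percentages were rounded to the nearest integer; neither is stated in the abstract, and `counts_matching` is a name made up for illustration.

```python
# Hypothetical check: the success counts are inferred, not taken from the paper.
N_COMPLEXES = 13  # size of the curated ternary-complex benchmark (from the abstract)

def counts_matching(reported_percent, total=N_COMPLEXES):
    """Return every success count whose rounded percentage equals the reported one."""
    return [k for k in range(total + 1)
            if round(100 * k / total) == reported_percent]

bound = counts_matching(92)    # bound starting structures
unbound = counts_matching(77)  # unbound starting structures
print(bound, unbound)
```

Under these assumptions, 12 of 13 and 10 of 13 are the only counts that match 92% and 77%.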
Affiliation(s)
- Gilberto P Pereira
- Molecular Microbiology and Structural Biochemistry, CNRS UMR 5086 and Université Claude Bernard Lyon 1, 7 Passage du Vercors, 69007 Lyon, France
- Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69007 Lyon, France
- Riccardo Pellarin
- Molecular Microbiology and Structural Biochemistry, CNRS UMR 5086 and Université Claude Bernard Lyon 1, 7 Passage du Vercors, 69007 Lyon, France
- Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69007 Lyon, France
- Guillaume Launay
- Molecular Microbiology and Structural Biochemistry, CNRS UMR 5086 and Université Claude Bernard Lyon 1, 7 Passage du Vercors, 69007 Lyon, France
- Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69007 Lyon, France
- Sangwook Wu
- PharmCADD, Busan 48792, Republic of Korea
- Department of Physics, Pukyong National University, Busan 48513, Republic of Korea
- Juliette Martin
- Molecular Microbiology and Structural Biochemistry, CNRS UMR 5086 and Université Claude Bernard Lyon 1, 7 Passage du Vercors, 69007 Lyon, France
- Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69007 Lyon, France
- Paulo C T Souza
- Molecular Microbiology and Structural Biochemistry, CNRS UMR 5086 and Université Claude Bernard Lyon 1, 7 Passage du Vercors, 69007 Lyon, France
- Laboratory of Biology and Modeling of the Cell, École Normale Supérieure de Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5239 and Inserm U1293, 46 Allée d'Italie, 69007 Lyon, France
2
Guo L, He J, Lin P, Huang SY, Wang J. TRScore: a three-dimensional RepVGG-based scoring method for ranking protein docking models. Bioinformatics 2022; 38:2444-2451. [PMID: 35199137 DOI: 10.1093/bioinformatics/btac120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2021] [Revised: 01/19/2022] [Accepted: 02/21/2022] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Protein-protein interactions (PPIs) play important roles in cellular activities. Because experimental methods are technically difficult and costly, there is considerable interest in developing computational approaches, such as protein docking, to decipher PPI patterns. One of the important and difficult aspects of protein docking is recognizing near-native conformations within a set of decoys, but traditional scoring functions still suffer from limited accuracy. New scoring methods are therefore urgently needed for both methodological and practical reasons. RESULTS: We present TRScore, a new deep learning-based scoring method for ranking protein-protein docking models based on a three-dimensional (3D) RepVGG network. To recognize near-native conformations within a set of decoys, TRScore voxelizes the protein-protein interface into a 3D grid labeled by the number of atoms in different physicochemical classes. Benefiting from the deep convolutional RepVGG architecture, TRScore can effectively capture the subtle differences between energetically favorable near-native models and unfavorable non-native decoys without needing extra information. TRScore was extensively evaluated on diverse test sets, including the protein-protein docking benchmark 5.0 update set, the DockGround decoy set, and the realistic CAPRI decoy set, and overall obtained a significant improvement over existing methods in cross-validation and independent evaluations. AVAILABILITY: Code is available at: https://github.com/BioinformaticsCSU/TRScore.
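The voxelization step described above can be illustrated with a minimal NumPy sketch. This is not the TRScore implementation: the function name `voxelize`, the box size, the voxel spacing, and the integer class labels are all assumptions chosen for illustration. It only shows the general idea of counting atoms per physicochemical class in each grid cell.

```python
import numpy as np

def voxelize(coords, classes, n_classes, center, box=32.0, voxel=1.0):
    """Count atoms per class in each voxel of a cubic grid (illustrative only)."""
    n = int(round(box / voxel))
    grid = np.zeros((n_classes, n, n, n), dtype=np.int32)
    # Shift so the lower corner of the box is the origin, then bin by voxel size.
    origin = np.asarray(center, dtype=float) - box / 2
    idx = np.floor((np.asarray(coords, dtype=float) - origin) / voxel).astype(int)
    # Atoms outside the box are ignored.
    inside = np.all((idx >= 0) & (idx < n), axis=1)
    for (i, j, k), c in zip(idx[inside], np.asarray(classes)[inside]):
        grid[c, i, j, k] += 1
    return grid

# Toy interface atoms (synthetic coordinates in angstroms, two made-up classes).
coords = [[0.2, 0.2, 0.2], [0.4, 0.1, 0.3], [5.0, 5.0, 5.0], [100.0, 0.0, 0.0]]
classes = [0, 0, 1, 1]
grid = voxelize(coords, classes, n_classes=2, center=(0.0, 0.0, 0.0))
```

The resulting array has one channel per class, which is the kind of input a 3D convolutional network consumes.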
Affiliation(s)
- Linyuan Guo
- School of Computer Science, Central South University, Changsha, Hunan 410083, China
- Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Jianxin Wang
- School of Computer Science, Central South University, Changsha, Hunan 410083, China
3
Hamzeh-Mivehroud M, Sokouti B, Dastmalchi S. Molecular Docking at a Glance. Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
This chapter introduces different aspects of the molecular docking technique to give readers an overview of the topics dealt with throughout this volume. Like many other fields of science, molecular docking went through a lag period of slow and steady growth, both in the attention it received from the scientific community and in its frequency of application, followed by a pronounced era of exponential expansion in theory, methodology, areas of application, and performance, driven by developments in related technologies such as computational resources and theoretical as well as experimental biophysical methods. The following sections review the evolution of molecular docking and briefly touch on its components, including methods, search algorithms, scoring functions, validation of the methods, and areas of application, along with a few case studies.
Affiliation(s)
- Siavoush Dastmalchi
- Biotechnology Research Center, Tabriz University of Medical Sciences, Iran & School of Pharmacy, Tabriz University of Medical Sciences, Iran
4
Ashkani J, Rees DJG. A simple, high-throughput modeling approach reveals insights into the mechanism of gametophytic self-incompatibility. Sci Rep 2016; 6:34732. [PMID: 27721467 PMCID: PMC5056379 DOI: 10.1038/srep34732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2016] [Accepted: 09/15/2016] [Indexed: 11/29/2022] Open
Abstract
Specificity in the GSI response results from the S-haplotype-specific molecular interaction of S-locus F-box (SLF/SFB) and SRNase proteins in the self-incompatibility locus (S-locus). The answer to the question of how these two components of the S-locus (SRNase and SLF/SFB) interact has been gathered from several models. Since there is not enough evidence as to which one is the definitive model, none of them can be ruled out. Despite the identification of interacting protein elements, the mechanism by which SLF/SFB and SRNase interact to differently trigger the self-incompatibility among families and subfamilies remain uncertain. The high-throughput modeling approach demonstrates structural visions into the possible existence of a Collaborative Non-Self Recognition model in apple. These findings postulate several prospects for future investigation providing useful information to guide the implementation of breeding strategies.
Affiliation(s)
- Jahanshah Ashkani
- Biotechnology Department, University of the Western Cape, Robert Sobokwe Road, Bellville, 7535, South Africa
- Agricultural Research Council, Biotechnology Platform, Private Bag X5, Onderstepoort, 0110, South Africa
- D. J. G. Rees
- Biotechnology Department, University of the Western Cape, Robert Sobokwe Road, Bellville, 7535, South Africa
- Agricultural Research Council, Biotechnology Platform, Private Bag X5, Onderstepoort, 0110, South Africa
5
Rana J, Rajasekharan S, Gulati S, Dudha N, Gupta A, Chaudhary VK, Gupta S. Network mapping among the functional domains of Chikungunya virus nonstructural proteins. Proteins 2014; 82:2403-11. [PMID: 24825751 DOI: 10.1002/prot.24602] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 12/04/2013] [Revised: 04/27/2014] [Accepted: 04/29/2014] [Indexed: 11/11/2022]
Abstract
Formation of virus specific replicase complex is among the most important steps that determines the fate of viral transcription and replication during Chikungunya virus (CHIKV) infection. In the present study, the authors have computationally generated a 3D structure of CHIKV late replicase complex on the basis of the interactions identified among the domains of CHIKV nonstructural proteins (nsPs) which make up the late replicase complex. The interactions among the domains of CHIKV nsPs were identified using systems such as pull down, protein interaction ELISA, and yeast two-hybrid. The structures of nsPs were generated using I-TASSER and the biological assembly of the replicase complex was determined using ZRANK and RDOCK. A total of 36 interactions among the domains and full length proteins were tested and 12 novel interactions have been identified. These interactions included the homodimerization of nsP1 and nsP4 through their respective C-ter domains; the associations of nsP2 helicase domain and C-ter domain of nsP4 with methyltransferase and membrane binding domains of nsP1; the interaction of nsP2 protease domain with C-ter domain of nsP4; and the interaction of nsP3 macro and alphavirus unique domains with the C-ter domain of nsP1. The novel interactions identified in the current study form a network of organized associations that suggest the spatial arrangement of nsPs in the late replicase complex of CHIKV.
Affiliation(s)
- Jyoti Rana
- Center for Emerging Diseases, Department of Biotechnology, Jaypee Institute of Information Technology, A-10, Noida, 201307, Uttar Pradesh, India
6
Li L, Huang Y, Xiao Y. How to use not-always-reliable binding site information in protein-protein docking prediction. PLoS One 2013; 8:e75936. [PMID: 24124522 PMCID: PMC3790831 DOI: 10.1371/journal.pone.0075936] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/28/2013] [Accepted: 08/22/2013] [Indexed: 11/19/2022] Open
Abstract
In many protein-protein docking algorithms, binding site information is used to help predict protein complex structures. Correct and accurate binding site information can increase the docking success rate significantly, whereas wrong binding site information leads to a failed prediction or, at least, a lower success rate. Recently, various successful theoretical methods have been proposed to predict the binding sites of proteins. However, predicted binding site information is not always reliable and is sometimes wrong, so using it in current docking algorithms carries a high risk. In this paper, a softly restricting method (SRM) is developed to solve this problem. By utilizing predicted binding site information in a proper way, the SRM algorithm is sensitive to correct binding site information but insensitive to wrong information, which decreases the risk of using predicted binding site information. SRM is tested on benchmark 3.0 using purely predicted binding site information. The results show that when the predicted information is correct, SRM increases the success rate significantly; even if the predicted information is completely wrong, SRM decreases the success rate only slightly, which indicates that SRM is suitable for utilizing predicted binding site information.
Affiliation(s)
- Lin Li
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, South Carolina, United States of America
- Yanzhao Huang
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Yi Xiao
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
7
Vajda S, Hall DR, Kozakov D. Sampling and scoring: a marriage made in heaven. Proteins 2013; 81:1874-84. [PMID: 23775627 DOI: 10.1002/prot.24343] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 01/08/2013] [Revised: 05/14/2013] [Accepted: 05/31/2013] [Indexed: 12/11/2022]
Abstract
Most structure prediction algorithms consist of initial sampling of the conformational space, followed by rescoring and possibly refinement of a number of selected structures. Here we focus on protein docking, and show that while decoupling sampling and scoring facilitates method development, integration of the two steps can lead to substantial improvements in docking results. Since decoupling is usually achieved by generating a decoy set containing both non-native and near-native docked structures, which can be then used for scoring function construction, we first review the roles and potential pitfalls of decoys in protein-protein docking, and show that some type of decoys are better than others for method development. We then describe three case studies showing that complete decoupling of scoring from sampling is not the best choice for solving realistic docking problems. Although some of the examples are based on our own experience, the results of the CAPRI docking and scoring experiments also show that performing both sampling and scoring generally yields better results than scoring the structures generated by all predictors. Next we investigate how the selection of training and decoy sets affects the performance of the scoring functions obtained. Finally, we discuss pathways to better alignment of the two steps, and show some algorithms that achieve a certain level of integration. Although we focus on protein-protein docking, our observations most likely also apply to other conformational search problems, including protein structure prediction and the docking of small molecules to proteins.
Affiliation(s)
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215
8
Schneider S, Zacharias M. Scoring optimisation of unbound protein-protein docking including protein binding site predictions. J Mol Recognit 2011; 25:15-23. [DOI: 10.1002/jmr.1165] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]
Affiliation(s)
- Sebastian Schneider
- Physik-Department T38; Technische Universität München; James Franck Str. 1; 85748; Garching; Germany
- Martin Zacharias
- Physik-Department T38; Technische Universität München; James Franck Str. 1; 85748; Garching; Germany
9
Huang W, Liu H. Optimized grid-based protein-protein docking as a global search tool followed by incorporating experimentally derivable restraints. Proteins 2011; 80:691-702. [PMID: 22190391 DOI: 10.1002/prot.23223] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/20/2011] [Revised: 10/10/2011] [Accepted: 10/12/2011] [Indexed: 12/16/2022]
Abstract
Unbound protein docking, or the computational prediction of the structure of a protein complex from the structures of its separated components, is of importance but still challenging. A practical approach toward reliable results for unbound docking is to incorporate experimentally derived information with computation. To this end, truly systematic search of the global docking space is desirable. The fast Fourier transform (FFT) docking is a systematic search method with high computational efficiency. However, by using FFT to perform unbound docking, possible conformational changes upon binding must be treated implicitly. To better accommodate the implicit treatment of conformational flexibility, we develop a rational approach to optimize "softened" parameters for FFT docking. In connection with the increased "softness" of the parameters in this global search step, we use a revised rule to select candidate models from the search results. For complexes designated as of low and medium difficulty for unbound docking, these adaptations of the original FTDOCK program lead to substantial improvements of the global search results. Finally, we show that models resulted from FFT-based global search can be further filtered with restraints derivable from nuclear magnetic resonance (NMR) chemical shift perturbation or mutagenesis experiments, leading to a small set of models that can be feasibly refined and evaluated using computationally more expensive methods and that still include high-ranking near-native conformations.
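The efficiency of FFT docking comes from scoring all translations of one grid against another with a single correlation. The sketch below shows only that translational step on binary occupancy grids; it is not FTDOCK, which uses weighted surface and core grids and also scans rotations. The function name and the toy grids are assumptions made for illustration.

```python
import numpy as np

def fft_translation_scores(receptor, ligand):
    """Score every cyclic translation t of `ligand` against `receptor` at once.

    score[t] = sum over x of receptor[x] * ligand[x - t], with wrap-around.
    """
    R = np.fft.fftn(receptor)
    L = np.fft.fftn(ligand)
    # Correlation theorem: the correlation is the inverse FFT of R * conj(L).
    return np.fft.ifftn(R * np.conj(L)).real

# Toy 8x8x8 occupancy grids (synthetic, for illustration only).
receptor = np.zeros((8, 8, 8))
receptor[2:5, 2:5, 2:5] = 1.0
ligand = np.zeros((8, 8, 8))
ligand[0:2, 0:2, 0:2] = 1.0

scores = fft_translation_scores(receptor, ligand)
best_shift = np.unravel_index(np.argmax(scores), scores.shape)
```

Each entry of `scores` equals the overlap that a direct sum over grid cells would give for that shift, but the FFT evaluates all shifts in O(N log N) rather than O(N²) for N grid cells.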
Affiliation(s)
- Wei Huang
- School of Life Sciences and Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China (USTC), Hefei, Anhui 230027, People's Republic of China
10
Tsukamoto K, Yoshikawa T, Hourai Y, Fukui K, Akiyama Y. Development of an affinity evaluation and prediction system by using the shape complementarity characteristic between proteins. J Bioinform Comput Biol 2011; 6:1133-56. [DOI: 10.1142/s0219720008003904] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/28/2007] [Revised: 03/25/2008] [Accepted: 04/28/2008] [Indexed: 11/18/2022]
Abstract
A system was developed to evaluate and predict the interaction between protein pairs by using the widely used shape complementarity search method as the algorithm for docking simulations between the proteins. This system, which we call the affinity evaluation and prediction (AEP) system, was used to evaluate the interaction between 20 protein pairs. The system first executes a "round robin" shape complementarity search of the target protein group, and evaluates the interaction of the complex structures obtained by shape complementarity search. These complex structures are selected by using a statistical procedure that we developed called "grouping". At a low prevalence of 5.0%, our AEP system predicted protein–protein interaction with 65.0% recall, 15.1% precision, 80.0% accuracy, and had an area under the curve (AUC) of 0.74. By optimizing the grouping process, our AEP system successfully predicted 13 protein pairs (among 20 pairs) that were biologically significant combinations. Our ultimate goal is to construct an affinity database that will provide crucial information obtained using our AEP system to cell biologists and drug designers.
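The reported figures can be cross-checked by rebuilding a confusion matrix. The sketch below assumes the round-robin search over 20 proteins scores 20 × 20 = 400 ordered pairs, 20 of which (5.0%) are true interactions; that pair count is an inference from the abstract rather than a stated value, and `confusion_from_rates` is an illustrative helper.

```python
def confusion_from_rates(n_pairs, prevalence, recall, precision):
    """Recover integer confusion-matrix counts from rounded summary rates."""
    positives = round(n_pairs * prevalence)   # true interacting pairs
    tp = round(positives * recall)            # correctly predicted pairs
    predicted_pos = round(tp / precision)     # all pairs predicted to interact
    fp = predicted_pos - tp
    fn = positives - tp
    tn = n_pairs - positives - fp
    return tp, fp, fn, tn

N_PAIRS = 400  # assumption: 20 x 20 round-robin pairs
tp, fp, fn, tn = confusion_from_rates(N_PAIRS, 0.05, 0.65, 0.151)
accuracy = (tp + tn) / N_PAIRS
print(tp, fp, fn, tn, accuracy)
```

Under this assumption the counts come out as 13 true positives, 73 false positives, 7 false negatives, and 307 true negatives, which reproduces the stated 80.0% accuracy and the 13 correctly predicted pairs.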
Affiliation(s)
- Koki Tsukamoto
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Tatsuya Yoshikawa
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
- Yuichiro Hourai
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kazuhiko Fukui
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Yutaka Akiyama
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
11
Launay G, Simonson T. A large decoy set of protein-protein complexes produced by flexible docking. J Comput Chem 2010; 32:106-20. [DOI: 10.1002/jcc.21604] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
12
Using correlated parameters for improved ranking of protein-protein docking decoys. J Comput Chem 2010; 32:787-96. [DOI: 10.1002/jcc.21657] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 04/22/2010] [Revised: 07/06/2010] [Accepted: 08/06/2010] [Indexed: 11/07/2022]
13
Liang S, Wang G, Zhou Y. Refining near-native protein-protein docking decoys by local resampling and energy minimization. Proteins 2010; 76:309-16. [PMID: 19156819 DOI: 10.1002/prot.22343] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023]
Abstract
How to refine a near-native structure to make it closer to its native conformation is an unsolved problem in protein-structure and protein-protein complex-structure prediction. In this article, we first test several scoring functions for selecting locally resampled near-native protein-protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance-scaled Ideal-gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein-InteRaction Energy) for optimization and re-ranking. Significant improvement of final top-1 ranked structures over initial near-native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (74% whose global RMSD reduced by 0.5 Å or more and only 7% increased by 0.5 Å or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed.
Affiliation(s)
- Shide Liang
- Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, 46202, USA
14
Liang S, Liu S, Zhang C, Zhou Y. A simple reference state makes a significant improvement in near-native selections from structurally refined docking decoys. Proteins 2009; 69:244-53. [PMID: 17623864 PMCID: PMC2673351 DOI: 10.1002/prot.21498] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/07/2022]
Abstract
Near-native selections from docking decoys have proved challenging especially when unbound proteins are used in the molecular docking. One reason is that significant atomic clashes in docking decoys lead to poor predictions of binding affinities of near native decoys. Atomic clashes can be removed by structural refinement through energy minimization. Such an energy minimization, however, will lead to an unrealistic bias toward docked structures with large interfaces. Here, we extend an empirical energy function developed for protein design to protein-protein docking selection by introducing a simple reference state that removes the unrealistic dependence of binding affinity of docking decoys on the buried solvent accessible surface area of interface. The energy function called EMPIRE (EMpirical Protein-InteRaction Energy), when coupled with a refinement strategy, is found to provide a significantly improved success rate in near native selections when applied to RosettaDock and refined ZDOCK docking decoys. Our work underlines the importance of removing nonspecific interactions from specific ones in near native selections from docking decoys.
Affiliation(s)
- Shide Liang
- Howard Hughes Medical Institute Center for Single Molecule Biophysics, Department of Physiology and Biophysics, State University of New York at Buffalo, Buffalo, NY 14214, USA
15
Liang S, Meroueh SO, Wang G, Qiu C, Zhou Y. Consensus scoring for enriching near-native structures from protein-protein docking decoys. Proteins 2009; 75:397-403. [PMID: 18831053 DOI: 10.1002/prot.22252] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022]
Abstract
The identification of near native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near native detection is to enrich near native docking decoys in a small number of top ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation, and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near native docking decoys. We found that all individual scores result in enrichment for the majority of 28 targets in ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 dock decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures in docking decoys increases by a factor of about six in average. ENDES was implemented into a computer program that is available for download at http://sparks.informatics.iupui.edu.
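The enrichment quoted above can be made concrete with a small calculation. The sketch below is not ENDES; it only computes an enrichment factor, defined here as the near-native fraction among the kept decoys divided by the near-native fraction in the full set. The data are synthetic, the convention that lower scores are better is an assumption, and `enrichment_factor` is an illustrative name.

```python
def enrichment_factor(is_near_native, scores, keep):
    """Near-native fraction in the `keep` best-scored decoys relative to the full set."""
    # Assumption: lower score means better rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    kept = order[:keep]
    frac_kept = sum(is_near_native[i] for i in kept) / keep
    frac_all = sum(is_near_native) / len(is_near_native)
    return frac_kept / frac_all

# Synthetic example: 2000 decoys, 20 near-native, 12 of them ranked in the top 200.
scores = list(range(2000))
near_native_ranks = set(range(12)) | set(range(1000, 1008))
is_near_native = [i in near_native_ranks for i in range(2000)]
ef = enrichment_factor(is_near_native, scores, keep=200)
```

In this toy case the near-native fraction rises from 1% to 6%, an enrichment factor of six, which matches the order of magnitude reported for reducing 2000 decoys to 200.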
Affiliation(s)
- Shide Liang
- Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
16
Tsukamoto K, Yoshikawa T, Yokota K, Hourai Y, Fukui K. The development of an affinity evaluation and prediction system by using protein-protein docking simulations and parameter tuning. Adv Appl Bioinform Chem 2009; 2:1-15. [PMID: 21918611 PMCID: PMC3169950 DOI: 10.2147/aabc.s3646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
A system was developed to evaluate and predict the interaction between protein pairs by using the widely used shape complementarity search method as the algorithm for docking simulations between the proteins. We used this system, which we call the affinity evaluation and prediction (AEP) system, to evaluate the interaction between 20 protein pairs. The system first executes a “round robin” shape complementarity search of the target protein group, and evaluates the interaction between the complex structures obtained by the search. These complex structures are selected by using a statistical procedure that we developed called ‘grouping’. At a prevalence of 5.0%, our AEP system predicted protein–protein interactions with a 50.0% recall, 55.6% precision, 95.5% accuracy, and an F-measure of 0.526. By optimizing the grouping process, our AEP system successfully predicted 10 protein pairs (among 20 pairs) that were biologically relevant combinations. Our ultimate goal is to construct an affinity database that will provide cell biologists and drug designers with crucial information obtained using our AEP system.
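The summary statistics above follow from standard definitions, shown here starting from confusion-matrix counts. The counts are inferred, not reported: they assume 400 scored pairs (20 × 20 round robin) with 20 true interactions, which is consistent with the stated 5.0% prevalence and the 10 correctly predicted pairs. `classification_metrics` is an illustrative helper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard precision, recall, accuracy, and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_measure": f_measure}

# Inferred counts (assumption): 10 TP, 8 FP, 10 FN, 372 TN over 400 pairs.
m = classification_metrics(tp=10, fp=8, fn=10, tn=372)
print({k: round(v, 3) for k, v in m.items()})
```

These counts reproduce the reported 50.0% recall, 55.6% precision, 95.5% accuracy, and F-measure of 0.526. Computing the F-measure from the rounded percentages instead gives about 0.527, so the published value was presumably derived from unrounded counts.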
Affiliation(s)
- Koki Tsukamoto
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo, Japan
17
Launay G, Simonson T. Homology modelling of protein-protein complexes: a simple method and its possibilities and limitations. BMC Bioinformatics 2008; 9:427. [PMID: 18844985 PMCID: PMC2586029 DOI: 10.1186/1471-2105-9-427] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 04/14/2008] [Accepted: 10/09/2008] [Indexed: 11/21/2022] Open
Abstract
Background: Structure-based computational methods are needed to help identify and characterize protein-protein complexes and their function. For individual proteins, the most successful technique is homology modelling. We investigate a simple extension of this technique to protein-protein complexes. We consider a large set of complexes of known structures, involving pairs of single-domain proteins. The complexes are compared with each other to establish their sequence and structural similarities and the relation between the two. Compared to earlier studies, a simpler dataset, a simpler structural alignment procedure, and an additional energy criterion are used. Next, we compare the X-ray structures to models obtained by threading the native sequence onto other, homologous complexes. An elementary requirement for a successful energy function is to rank the native structure above any threaded structure. We use the DFIREβ energy function, whose quality and complexity are typical of the models used today. Finally, we compare near-native models to distinctly non-native models.
Results: If weakly stable complexes are excluded (defined by a binding energy cutoff), as well as a few unusual complexes, a simple homology principle holds: complexes that share more than 35% sequence identity share similar structures and interaction modes; this principle was less clear-cut in earlier studies. The energy function was then tested for its ability to identify experimental structures among sets of decoys, produced by a simple threading procedure. On average, the experimental structure is ranked above 92% of the alternate structures. Thus, discrimination of the native structure is good but not perfect. The discrimination of near-native structures is fair. Typically, a single, alternate, non-native binding mode exists that has a native-like energy. Some of the associated failures may correspond to genuine, alternate binding modes and/or native complexes that are artefacts of the crystal environment. In other cases, additional model filtering with more sophisticated tools is needed.
Conclusion: The results suggest that the simple modelling procedure applied here could help identify and characterize protein-protein complexes. The next step is to apply it on a genomic scale.
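The ranking statistic quoted in the abstract above (the experimental structure ranked above 92% of alternate structures) can be illustrated with a minimal sketch. The function name and the toy energies are hypothetical, and the sketch assumes that a lower energy means a better rank; the abstract does not specify how ties or the averaging over complexes were handled.

```python
def fraction_of_decoys_outranked(native_energy, decoy_energies):
    """Fraction of decoys whose energy is strictly higher (worse) than the native's.

    Assumes lower energy = better rank; ties count as not outranked.
    """
    if not decoy_energies:
        raise ValueError("at least one decoy energy is required")
    worse = sum(1 for e in decoy_energies if e > native_energy)
    return worse / len(decoy_energies)


# Toy, made-up energies (arbitrary units), not data from the paper.
native = -10.0
decoys = [-12.0, -8.0, -5.0, -9.5, -3.0]
frac = fraction_of_decoys_outranked(native, decoys)
print(f"native ranked above {frac:.0%} of decoys")
```

Averaging this fraction over a set of complexes would give a single summary number of the kind reported, under the assumptions stated above.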
Affiliation(s)
- Guillaume Launay
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France.
|
18
|
Abstract
Using an efficient iterative method, we have developed a distance-dependent knowledge-based scoring function to predict protein-protein interactions. The function, referred to as ITScore-PP, was derived using the crystal structures of a training set of 851 protein-protein dimeric complexes containing true biological interfaces. The key idea of the iterative method for deriving ITScore-PP is to improve the interatomic pair potentials by iteration, until the pair potentials can distinguish true binding modes from decoy modes for the protein-protein complexes in the training set. The iterative method circumvents the challenging reference state problem in deriving knowledge-based potentials. The derived scoring function was used to evaluate the ligand orientations generated by ZDOCK 2.1 and the native ligand structures on a diverse set of 91 protein-protein complexes. For the bound test cases, ITScore-PP yielded a success rate of 98.9% if the top 10 ranked orientations were considered. For the more realistic unbound test cases, the corresponding success rate was 40.7%. Furthermore, for faster orientational sampling, several residue-level knowledge-based scoring functions were also derived following a similar iterative procedure. Among them, the scoring function that uses the side-chain center of mass (SCM) to represent a residue, referred to as ITScore-PP(SCM), showed the best performance and yielded success rates of 71.4% and 30.8% for the bound and unbound cases, respectively, when the top 10 orientations were considered. ITScore-PP was further tested using two other published protein-protein docking decoy sets, the ZDOCK decoy set and the RosettaDock decoy set. In addition to binding mode prediction, the binding scores predicted by ITScore-PP also correlated well with the experimentally determined binding affinities, yielding a correlation coefficient of R = 0.71 on a test set of 74 protein-protein complexes with known affinities.
ITScore-PP is computationally efficient. The average run time for ITScore-PP was about 0.03 second per orientation (including optimization) on a personal computer with 3.2 GHz Pentium IV CPU and 3.0 GB RAM. The computational speed of ITScore-PP(SCM) is about an order of magnitude faster than that of ITScore-PP. ITScore-PP and/or ITScore-PP(SCM) can be combined with efficient protein docking software to study protein-protein recognition.
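The "top 10" success rates quoted in the abstract above can be read as: a test case counts as a success if at least one acceptable orientation appears among the N best-scored ones. A minimal sketch of that bookkeeping follows. The function name, the toy data, and the convention that lower scores are better are assumptions for illustration; the abstract does not state the criterion used to call an orientation acceptable.

```python
def top_n_success_rate(cases, n=10):
    """Fraction of cases with at least one acceptable pose in the top n by score.

    cases: list of test cases; each case is a list of (score, is_acceptable) pairs.
    Assumes lower score = better rank (an assumption, not taken from the paper).
    """
    if not cases:
        raise ValueError("at least one test case is required")
    hits = 0
    for poses in cases:
        ranked = sorted(poses, key=lambda p: p[0])  # best (lowest) score first
        if any(ok for _, ok in ranked[:n]):
            hits += 1
    return hits / len(cases)


# Toy, made-up decoy sets for three hypothetical complexes.
cases = [
    [(-5.0, True), (-3.0, False)],
    [(-4.0, False), (-1.0, True)],
    [(-2.0, False), (-1.5, False)],
]
rate_top1 = top_n_success_rate(cases, n=1)
rate_top2 = top_n_success_rate(cases, n=2)
print(f"top-1: {rate_top1:.1%}, top-2: {rate_top2:.1%}")
```

With 91 test complexes, a reported rate of 98.9% is arithmetically consistent with 90 successful cases, and 40.7% with 37.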
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri 65211, USA
|
19
|
de Sancho D, Rey A. Energy minimizations with a combination of two knowledge-based potentials for protein folding. J Comput Chem 2008; 29:1684-92. [DOI: 10.1002/jcc.20924] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
|
20
|
Chaudhury S, Sircar A, Sivasubramanian A, Berrondo M, Gray JJ. Incorporating biochemical information and backbone flexibility in RosettaDock for CAPRI rounds 6-12. Proteins 2008; 69:793-800. [PMID: 17894347 DOI: 10.1002/prot.21731] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Indexed: 11/05/2022]
Abstract
In CAPRI rounds 6-12, RosettaDock successfully predicted 2 of 5 unbound-unbound targets to medium accuracy. Improvement over the previous method was achieved with computational mutagenesis to select decoys that match the energetics of experimentally determined hot spots. In the case of Target 21, Orc1/Sir1, this resulted in a successful docking prediction where RosettaDock alone or with simple site constraints failed. Experimental information also helped limit the interacting region of TolB/Pal, producing a successful prediction of Target 26. In addition, we docked multiple loop conformations for Target 20, and we developed a novel flexible docking algorithm to simultaneously optimize backbone conformation and rigid-body orientation to generate a wide diversity of conformations for Target 24. Continued challenges included docking of homology targets that differ substantially from their template (sequence identity <50%) and accounting for large conformational changes upon binding. Despite a larger number of unbound-unbound and homology model binding targets, Rounds 6-12 reinforced that RosettaDock is a powerful algorithm for predicting bound complex structures, especially when combined with experimental data.
Affiliation(s)
- Sidhartha Chaudhury
- Program in Molecular and Computational Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
21
|
Launay G, Mendez R, Wodak S, Simonson T. Recognizing protein-protein interfaces with empirical potentials and reduced amino acid alphabets. BMC Bioinformatics 2007; 8:270. [PMID: 17662112 PMCID: PMC2034607 DOI: 10.1186/1471-2105-8-270] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 01/26/2007] [Accepted: 07/27/2007] [Indexed: 11/25/2022]
Abstract
Background: In structural genomics, an important goal is the detection and classification of protein–protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein–protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue–residue interactions and a stepwise distance-dependence. We used increased computational resources, however, constructing 290,000 decoys for 219 protein–protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64,000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment.
Results: Performance is similar to several other statistical potentials of the same complexity. For example, the CAPRI target structure is correctly ranked ahead of 90% of its decoys in 6 cases out of 13. The hierarchy of amino acid alphabets leads to a coherent hierarchy of energy functions, with qualitatively similar parameters for similar amino acid types at all levels. Most remarkably, the performance with six amino acid classes is equivalent to that of the most detailed, 20-class energy function.
Conclusion: This suggests that six carefully chosen amino acid classes are sufficient to encode specificity in protein–protein interactions, and provide a starting point to develop more complicated energy functions.
Affiliation(s)
- Guillaume Launay
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France
- Raul Mendez
- Service de Conformation de Macromolécules Biologiques et Bioinformatique, Centre de Biologie Structurale et Bioinformatique, Université Libre de Bruxelles, Belgium
- Shoshana Wodak
- Structural Biology Program, Hospital for Sick Children, Toronto, Canada
- Thomas Simonson
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France
|
22
|
Champ PC, Camacho CJ. FastContact: a free energy scoring tool for protein-protein complex structures. Nucleic Acids Res 2007; 35:W556-60. [PMID: 17537824 PMCID: PMC1933237 DOI: 10.1093/nar/gkm326] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
Abstract
‘FastContact’ is a server that estimates the direct electrostatic and desolvation interaction free energy between two proteins in units of kcal/mol. Users submit two proteins in PDB format, and the output is emailed back to the user in three files: one output file, and the two processed proteins. Besides the electrostatic and desolvation free energy, the server reports residue contact free energies that rapidly highlight the hotspots of the interaction and evaluates the van der Waals interaction using CHARMm. Response time is ∼1 min. The server has been tested and validated by scoring refined complex structures and blind sets of docking decoys, and has proven useful for predicting protein interactions. ‘FastContact’ offers unique capabilities, from biophysical insights to scoring and identifying important contacts.
|
23
|
Cheng J, Pei J, Lai L. A free-rotating and self-avoiding chain model for deriving statistical potentials based on protein structures. Biophys J 2007; 92:3868-77. [PMID: 17351015 PMCID: PMC1868969 DOI: 10.1529/biophysj.106.102152] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Statistical potentials have been widely used in protein studies despite their much-debated theoretical basis. In this work, we have applied two physical reference states for deriving the statistical potentials based on protein structure features to achieve zero interaction and orthogonalization. The free-rotating chain-based potential applies a local free-rotating chain reference state, which could theoretically be described by the Gaussian distribution. The self-avoiding chain-based potential applies a reference state derived from a database of artificial self-avoiding backbones generated by Monte Carlo simulation. These physical reference states are independent of known protein structures and are based solely on the analytical formulation or simulation method. The new potentials performed better and yielded higher Z-scores and success rates compared to other statistical potentials. The end-to-end distance distribution produced by the self-avoiding chain model was similar to the distance distribution of protein atoms in the structure database. This fact may partly explain the basis of the reference states that depend on the atom pair frequency observed in the protein database. The current study showed that a more physical reference model improved the performance of statistical potentials in protein fold recognition, which could also be extended to other types of applications.
Affiliation(s)
- Ji Cheng
- State Key Laboratory for Structural Chemistry of Stable and Unstable Species, College of Chemistry and Molecular Engineering, and Center for Theoretical Biology, Peking University, Beijing, China
|
24
|
Müller W, Sticht H. A protein-specifically adapted scoring function for the reranking of docking solutions. Proteins 2007; 67:98-111. [PMID: 17243180 DOI: 10.1002/prot.21310] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 12/26/2022]
Abstract
In this work, we developed a protein-specifically adapted scoring function and applied it to the reranking of protein-protein docking solutions generated with a conventional docking program. The approach was validated using experimentally determined structures of the bacterial HPr-protein in complex with four structurally nonhomologous binding partners as an example. A sufficiently large data set for the generation of protein-specifically adapted pair potentials was obtained by modeling all orthologous complexes for each type of interaction, resulting in a total of 224 complexes. The parameters for potential generation were systematically varied, resulting in a total of 66,132 different scoring functions that were tested for their ability to successfully rerank 1000 docking solutions generated from modeled structures of the unbound binding partners. Parameters that proved critical for the generation of good scoring functions were the distance cutoff used for the generation of the pair potential, and an additional cutoff that allows a proper weighting of conserved and nonconserved contacts in the interface. Compared to the original scoring function, application of this novel type of scoring function resulted in a significant accumulation of acceptable docking solutions within the first 10 ranks. Depending on the type of complex investigated, one to five acceptable complex geometries are found among the 10 highest-ranked solutions, and for three of the four systems tested, an acceptable solution was placed on the first rank.
Affiliation(s)
- Wolfgang Müller
- Institut für Biochemie, Abteilung Bioinformatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
|
25
|
Bernauer J, Azé J, Janin J, Poupon A. A new protein-protein docking scoring function based on interface residue properties. Bioinformatics 2007; 23:555-62. [PMID: 17237048 DOI: 10.1093/bioinformatics/btl654] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Protein-protein complexes are known to play key roles in many cellular processes. However, they are often not accessible to experimental study because of their low stability and the difficulty of producing the proteins and assembling them in their native conformation. Thus, docking algorithms have been developed to provide an in silico approach to the problem. A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and then a scoring function is used to rank them.
RESULTS: To address the second step, we developed a scoring function based on a Voronoï tessellation of the protein three-dimensional structure. We showed that the Voronoï representation may be used to describe, in a simplified but useful manner, the geometric and physico-chemical complementarities of two molecular surfaces. We measured a set of parameters on native protein-protein complexes and on decoys, and used them as attributes in several statistical learning procedures: a logistic function, Support Vector Machines (SVM), and a genetic algorithm. For the latter, we used ROGER, a genetic algorithm designed to optimize the area under the receiver operating characteristics curve. To further test the scores derived with ROGER, we ranked models generated by two different docking algorithms on targets of a blind prediction experiment, improving in almost all cases the rank of native-like solutions.
AVAILABILITY: http://genomics.eu.org/spip/-Bioinformatics-tools-
Affiliation(s)
- J Bernauer
- Yeast Structural Genomics, IBBMC UMR CNRS 8619, Bâtiment 430, Université Paris-Sud, 91405 Orsay, France
|
26
|
Liang S, Zhang C, Liu S, Zhou Y. Protein binding site prediction using an empirical scoring function. Nucleic Acids Res 2006; 34:3698-707. [PMID: 16893954 PMCID: PMC1540721 DOI: 10.1093/nar/gkl454] [Citation(s) in RCA: 194] [Impact Index Per Article: 10.8] [Indexed: 11/12/2022]
Abstract
Most biological processes are mediated by interactions between proteins and their interacting partners including proteins, nucleic acids and small molecules. This work establishes a method called PINUP for binding site prediction of monomeric proteins. With only two weight parameters to optimize, PINUP produces not only 42.2% coverage of actual interfaces (percentage of correctly predicted interface residues in actual interface residues) but also 44.5% accuracy in predicted interfaces (percentage of correctly predicted interface residues in the predicted interface residues) in a cross validation using a 57-protein dataset. By comparison, the expected accuracy via random prediction (percentage of actual interface residues in surface residues) is only 15%. The binding sites of the 57-protein set are found to be easier to predict than those of an independent test set of 68 proteins. The average coverage and accuracy for this independent test set are 30.5% and 29.4%, respectively. The significant gain of PINUP over expected random prediction is attributed to (i) an effective residue-energy score and accessible-surface-area-dependent interface propensity, (ii) isolation of functional constraints contained in the conservation score from the structural constraints through the combination of residue-energy score (for structural constraints) and conservation score and (iii) a consensus region built on top-ranked initial patches.
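The two metrics defined in the abstract above (coverage of actual interfaces and accuracy of predicted interfaces) can be written out as a minimal sketch. The function name and the residue numbers are hypothetical and serve only to make the two denominators explicit; this is not PINUP itself.

```python
def coverage_and_accuracy(predicted, actual):
    """Return (coverage, accuracy) for a predicted set of interface residues.

    coverage = correctly predicted residues / actual interface residues
    accuracy = correctly predicted residues / predicted interface residues
    """
    predicted, actual = set(predicted), set(actual)
    correct = predicted & actual
    coverage = len(correct) / len(actual) if actual else 0.0
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    return coverage, accuracy


# Toy, made-up residue numbers for a single hypothetical protein.
predicted = {10, 11, 12, 40, 41}
actual = {11, 12, 13, 14}
cov, acc = coverage_and_accuracy(predicted, actual)
print(f"coverage = {cov:.1%}, accuracy = {acc:.1%}")
```

In this vocabulary, the random baseline mentioned in the abstract corresponds to the accuracy obtained when every surface residue is equally likely to be predicted, that is, the fraction of surface residues that belong to the actual interface.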
Affiliation(s)
- Yaoqi Zhou
- To whom correspondence should be addressed. Tel: +1 716 829 2985; Fax: +1 716 829 2344;
|
27
|
Camacho CJ, Ma H, Champ PC. Scoring a diverse set of high-quality docked conformations: A metascore based on electrostatic and desolvation interactions. Proteins 2006; 63:868-77. [PMID: 16506242 DOI: 10.1002/prot.20932] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022]
Abstract
Predicting protein-protein interactions involves sampling and scoring docked conformations. Barring some large structural rearrangement, rapidly sampling the space of docked conformations is now a real possibility, and the limiting step for the successful prediction of protein interactions is the scoring function used to reduce the space of conformations from billions to a few, and eventually one high affinity complex. An atomic level free-energy scoring function that estimates in units of kcal/mol both electrostatic and desolvation interactions (plus van der Waals if appropriate) of protein-protein docked conformations is used to rerank the blind predictions (860 in total) submitted for six targets to the community-wide Critical Assessment of PRediction of Interactions (CAPRI; http://capri.ebi.ac.uk). We found that native-like models often have varying intermolecular contacts and atom clashes, making it unlikely that one can construct a universal function that would rank all these models as native-like. Nevertheless, our scoring function is able to consistently identify the native-like complexes as those with the lowest free energy for the individual models of 16 (out of 17) human predictors for five of the targets, while at the same time the modelers failed to do so in more than half of the cases. The scoring of high-quality models developed by a wide variety of methods and force fields confirms that electrostatic and desolvation forces are the dominant interactions determining the bound structure. The CAPRI experiment has shown that modelers can predict valuable models of protein-protein complexes, and improvements in scoring functions should soon solve the docking problem for complexes whose backbones do not change much upon binding. A scoring server and programs are available at http://structure.pitt.edu.
Affiliation(s)
- Carlos J Camacho
- Department of Computational Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA.
|