1. Sun X, Wu Z, Su J, Li C. A deep attention model for wide-genome protein-peptide binding affinity prediction at a sequence level. Int J Biol Macromol 2024; 276:133811. [PMID: 38996881 DOI: 10.1016/j.ijbiomac.2024.133811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 07/09/2024] [Accepted: 07/09/2024] [Indexed: 07/14/2024]
Abstract
Peptides are pivotal in numerous biological activities, engaging in up to 40% of protein-protein interactions in many cellular processes. Due to their exceptional specificity and effectiveness, peptides have emerged as promising candidates for drug design. However, accurately predicting protein-peptide binding affinity remains challenging. To address this problem, we developed PepPAP, a prediction model based on a convolutional neural network and multi-head attention that relies solely on sequence features. These features include physicochemical properties, intrinsic disorder, sequence encoding, and especially interface propensity, which is extracted from 16,689 non-redundant protein-peptide complexes. Notably, the regression stratification cross-validation scheme proposed in our previous work helps improve prediction for cases with extreme binding affinity values. On three benchmark test datasets (T100, a series of peptides targeting the PDZ domain, and CXCR4), PepPAP shows excellent performance, outperforming existing methods and demonstrating good generalization ability. Furthermore, PepPAP performs well in binary interaction prediction, and visualization of the feature space distribution highlights its effectiveness. To the best of our knowledge, PepPAP is the first sequence-based deep attention model for wide-genome protein-peptide binding affinity prediction, and it holds the potential to offer valuable insights for peptide-based drug design.
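The abstract does not spell out how the regression stratification cross-validation works, so the following is only a minimal sketch of one common way to stratify folds on a continuous target: rank the samples by affinity and deal them out to folds in round-robin order, so that extreme values are spread across folds instead of clustering in one. The function names and the affinity values are hypothetical and are not taken from the paper.

```python
def stratified_regression_folds(targets, n_folds=5):
    """Assign sample indices to folds so each fold samples the target range evenly.

    Assumption: samples are ranked by target value and dealt out round-robin,
    so neighbouring ranks always land in different folds.
    """
    if n_folds < 2 or n_folds > len(targets):
        raise ValueError("n_folds must be between 2 and the number of samples")
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    folds = [[] for _ in range(n_folds)]
    for rank, idx in enumerate(order):
        folds[rank % n_folds].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(train), sorted(test)


# Hypothetical binding affinities, used only to illustrate the split.
affinities = [-12.1, -4.0, -7.5, -9.8, -5.2, -11.0, -6.3, -8.4, -3.1, -10.2]
folds = stratified_regression_folds(affinities, n_folds=5)
splits = list(train_test_splits(folds))
```

With five folds over ten samples, each test fold here holds one sample from the stronger-binding half and one from the weaker-binding half, which is the property a stratified scheme is meant to preserve.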
Affiliation(s)
- Xiaohan Sun
  - College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Zhixiang Wu
  - College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Jingjie Su
  - College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
2. Launay R, Teppa E, Esque J, André I. Modeling Protein Complexes and Molecular Assemblies Using Computational Methods. Methods Mol Biol 2023; 2553:57-77. [PMID: 36227539 DOI: 10.1007/978-1-0716-2617-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Many biological molecules are assembled into supramolecular complexes that are necessary to perform functions in the cell. Better understanding and characterization of these molecular assemblies are thus essential to further elucidate molecular mechanisms and key protein-protein interactions that could be targeted to modulate the protein binding affinity or develop new binders. Experimental access to structural information on these supramolecular assemblies is often hampered by the size of these systems, which makes their recombinant production and characterization rather difficult. Computational methods combining structural data, molecular modeling techniques, and sequence coevolution information can thus offer a good alternative to gain access to the structural organization of protein complexes and assemblies. Herein, we present some computational methods to predict structural models of the protein partners, to search for interacting regions using coevolution information, and to build molecular assemblies. The approach is exemplified using a case study to model the succinate-quinone oxidoreductase heterocomplex.
Affiliation(s)
- Romain Launay
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Elin Teppa
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Jérémy Esque
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Isabelle André
  - Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
3. Romero-Molina S, Ruiz-Blanco YB, Mieres-Perez J, Harms M, Münch J, Ehrmann M, Sanchez-Garcia E. PPI-Affinity: A Web Tool for the Prediction and Optimization of Protein-Peptide and Protein-Protein Binding Affinity. J Proteome Res 2022; 21:1829-1841. [PMID: 35654412 PMCID: PMC9361347 DOI: 10.1021/acs.jproteome.2c00020] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/30/2022]
Abstract
Virtual screening of protein–protein and protein–peptide interactions is a challenging task that directly impacts the processes of hit identification and hit-to-lead optimization in drug design projects involving peptide-based pharmaceuticals. Although several screening tools designed to predict the binding affinity of protein–protein complexes have been proposed, methods specifically developed to predict protein–peptide binding affinity are comparatively scarce. Frequently, predictors trained to score the affinity of small molecules are used for peptides indistinctively, despite the larger complexity and heterogeneity of interactions rendered by peptide binders. To address this issue, we introduce PPI-Affinity, a tool that leverages support vector machine (SVM) predictors of binding affinity to screen datasets of protein–protein and protein–peptide complexes, as well as to generate and rank mutants of a given structure. The performance of the SVM models was assessed on four benchmark datasets, which include protein–protein and protein–peptide binding affinity data. In addition, we evaluated our model on a set of mutants of EPI-X4, an endogenous peptide inhibitor of the chemokine receptor CXCR4, and on complexes of the serine proteases HTRA1 and HTRA3 with peptides. PPI-Affinity is freely accessible at https://protdcal.zmb.uni-due.de/PPIAffinity.
Affiliation(s)
- Sandra Romero-Molina
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Yasser B Ruiz-Blanco
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Joel Mieres-Perez
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Mirja Harms
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
- Jan Münch
  - Institute of Molecular Virology, Ulm University Medical Center, Ulm 89081, Germany
  - Core Facility Functional Peptidomics, Ulm University Medical Center, Ulm 89081, Germany
- Michael Ehrmann
  - Faculty of Biology, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
- Elsa Sanchez-Garcia
  - Computational Biochemistry, Center of Medical Biotechnology, University of Duisburg-Essen, Essen 45141, Germany
4. Jiang Y, Liu HF, Liu R. Systematic comparison and prediction of the effects of missense mutations on protein-DNA and protein-RNA interactions. PLoS Comput Biol 2021; 17:e1008951. [PMID: 33872313 PMCID: PMC8084330 DOI: 10.1371/journal.pcbi.1008951] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Revised: 04/29/2021] [Accepted: 04/08/2021] [Indexed: 12/30/2022] Open
Abstract
The binding affinities of protein-nucleic acid interactions could be altered due to missense mutations occurring in DNA- or RNA-binding proteins, therefore resulting in various diseases. Unfortunately, a systematic comparison and prediction of the effects of mutations on protein-DNA and protein-RNA interactions (these two mutation classes are termed MPDs and MPRs, respectively) is still lacking. Here, we demonstrated that these two classes of mutations could generate similar or different tendencies for binding free energy changes in terms of the properties of mutated residues. We then developed regression algorithms separately for MPDs and MPRs by introducing novel geometric partition-based energy features and interface-based structural features. Through feature selection and ensemble learning, similar computational frameworks that integrated energy- and nonenergy-based models were established to estimate the binding affinity changes resulting from MPDs and MPRs, but the selected features for the final models were different and therefore reflected the specificity of these two mutation classes. Furthermore, the proposed methodology was extended to the identification of mutations that significantly decreased the binding affinities. Extensive validations indicated that our algorithm generally performed better than the state-of-the-art methods on both the regression and classification tasks. The webserver and software are freely available at http://liulab.hzau.edu.cn/PEMPNI and https://github.com/hzau-liulab/PEMPNI.
Protein-nucleic acid interactions play important roles in various cellular processes. Missense mutations occurring in DNA- or RNA-binding proteins (termed MPDs and MPRs, respectively) could change the binding affinities of these interactions. Previous studies have compared protein-DNA and protein-RNA interactions from multifaceted viewpoints, but less attention has been given to the similarities and specific differences between the effects of MPDs and MPRs and between the methodologies for predicting the affinity changes induced by the two mutation classes. Therefore, we systematically compared their impacts and demonstrated that MPDs and MPRs could have specific preferences for binding affinity changes. These observations motivated us to construct regression models separately for MPDs and MPRs by introducing novel energy and nonenergy descriptors. Although similar frameworks were developed to estimate these two categories of mutation effects, different descriptors were selected in the regression models and further revealed the specificity of mutation classes. The interplay between the energy and nonenergy modules effectively improved prediction performance. Our algorithm can also be adopted to disentangle mutations significantly decreasing binding affinities from other mutations.
Affiliation(s)
- Yao Jiang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Hui-Fang Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Rong Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
5. Launay G, Ohue M, Prieto Santero J, Matsuzaki Y, Hilpert C, Uchikoga N, Hayashi T, Martin J. Evaluation of CONSRANK-Like Scoring Functions for Rescoring Ensembles of Protein–Protein Docking Poses. Front Mol Biosci 2020; 7:559005. [PMID: 33195406 PMCID: PMC7641601 DOI: 10.3389/fmolb.2020.559005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2020] [Accepted: 09/28/2020] [Indexed: 11/13/2022] Open
Abstract
Scoring is a challenging step in protein–protein docking, where typically thousands of solutions are generated. In this study, we sought to investigate the contribution of consensus rescoring, as introduced by Oliva et al. (2013) with the CONSRANK method, where the set of solutions is used to build statistics in order to identify recurrent solutions. We explore several ways to perform consensus-based rescoring on the ZDOCK decoy set for Benchmark 4. We show that the information of the interface size is critical for successful rescoring in this context, but that consensus rescoring in itself performs less well than traditional physics-based evaluation. The results of physics-based and consensus-based rescoring are partially overlapping, supporting the use of a combination of these approaches.
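To make the idea of consensus rescoring concrete, here is a minimal sketch in the spirit of CONSRANK: count how often each inter-chain residue contact recurs across the whole decoy set, then score each pose by the frequencies of its own contacts. The data layout, the function names and the toy poses are assumptions for illustration; the scoring variants actually evaluated in the study are not reproduced here.

```python
from collections import Counter


def contact_frequencies(decoys):
    """Fraction of decoys in which each inter-chain contact appears."""
    counts = Counter(c for contacts in decoys.values() for c in contacts)
    n_decoys = len(decoys)
    return {contact: count / n_decoys for contact, count in counts.items()}


def consensus_scores(decoys, normalize=True):
    """Score each decoy by the summed frequency of its contacts.

    With normalize=True the sum is divided by the number of contacts, giving
    an average contact frequency; with normalize=False larger interfaces can
    accumulate higher scores.
    """
    freqs = contact_frequencies(decoys)
    scores = {}
    for name, contacts in decoys.items():
        total = sum(freqs[c] for c in contacts)
        scores[name] = total / len(contacts) if normalize and contacts else total
    return scores


# Hypothetical decoy set: pose name -> set of (receptor residue, ligand residue).
decoys = {
    "pose1": {("A10", "B5"), ("A11", "B5"), ("A12", "B7")},
    "pose2": {("A10", "B5"), ("A11", "B5")},
    "pose3": {("A10", "B5"), ("A40", "B90")},
    "pose4": {("A41", "B91")},
}
normalized = consensus_scores(decoys, normalize=True)
raw = consensus_scores(decoys, normalize=False)
best_normalized = max(normalized, key=normalized.get)
best_raw = max(raw, key=raw.get)
```

In this toy set the two variants disagree on the top pose: the normalized score favors the small, highly recurrent interface of `pose2`, while the raw sum favors the larger interface of `pose1`. That is one simple way in which interface size can change a consensus ranking.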
Affiliation(s)
- Guillaume Launay
  - CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
- Masahito Ohue
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
  - *Correspondence: Masahito Ohue
- Julia Prieto Santero
  - CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
- Yuri Matsuzaki
  - Tokyo Tech Academy for Leadership, Tokyo Institute of Technology, Tokyo, Japan
- Cécile Hilpert
  - CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
- Nobuyuki Uchikoga
  - Department of Network Design, School of Interdisciplinary Mathematical Sciences, Meiji University, Tokyo, Japan
- Takanori Hayashi
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
- Juliette Martin
  - CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
6. Pan Y, Zhou S, Guan J. Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach. BMC Bioinformatics 2020; 21:384. [PMID: 32938375 PMCID: PMC7495898 DOI: 10.1186/s12859-020-03675-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Protein-DNA interaction governs a large number of cellular processes, and it can be altered by a small fraction of interface residues, i.e., the so-called hot spots, which account for most of the interface binding free energy. Accurate prediction of hot spots is critical to understand the principle of protein-DNA interactions. There are already some computational methods that can accurately and efficiently predict a large number of hot residues. However, the insufficiency of experimentally validated hot-spot residues in protein-DNA complexes and the low diversity of the employed features limit the performance of existing methods.
RESULTS: Here, we report a new computational method for effectively predicting hot spots in protein-DNA binding interfaces. This method, called PreHots (the abbreviation of Predicting Hotspots), adopts an ensemble stacking classifier that integrates different machine learning classifiers to generate a robust model with 19 features selected by a sequential backward feature selection algorithm. To this end, we constructed two new and reliable datasets (one benchmark for model training and one independent dataset for validation), which together comprise 123 hot spots and 137 non-hot spots from 89 protein-DNA complexes. The data were manually collected from the literature and existing databases with a strict process of redundancy removal. Our method achieves a sensitivity of 0.813 and an AUC score of 0.868 in 10-fold cross-validation on the benchmark dataset, and a sensitivity of 0.818 and an AUC score of 0.820 on the independent test dataset. The results show that our approach outperforms the existing ones.
CONCLUSIONS: PreHots, which is based on a stacked ensemble of boosting algorithms, can reliably predict hot spots at the protein-DNA binding interface on a large scale. Compared with the existing methods, PreHots can achieve better prediction performance. Both the webserver of PreHots and the datasets are freely available at: http://dmb.tongji.edu.cn/tools/PreHots/.
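For readers less familiar with the two metrics reported above, the following sketch shows how sensitivity and AUC can be computed for a binary hot-spot classifier. The labels, scores and the 0.5 decision threshold are hypothetical illustration values, not data from PreHots, and the helper names are assumptions.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = hot spot) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens = sensitivity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity depends on the chosen threshold, whereas AUC summarizes the ranking quality over all thresholds, which is why abstracts such as this one typically report both.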
Affiliation(s)
- Yuliang Pan
  - Department of Computer Science and Technology, Tongji University, No. 4800 Caoan Road, Shanghai, 201804, China
- Shuigeng Zhou
  - Shanghai Key Laboratory of Intelligent Information Processing, and School of Computer Science, Fudan University, No. 220 Handan Road, Shanghai, 200433, China
- Jihong Guan
  - Department of Computer Science and Technology, Tongji University, No. 4800 Caoan Road, Shanghai, 201804, China
7. Park T, Woo H, Baek M, Yang J, Seok C. Structure prediction of biological assemblies using GALAXY in CAPRI rounds 38-45. Proteins 2019; 88:1009-1017. [PMID: 31774573 DOI: 10.1002/prot.25859] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/19/2019] [Revised: 11/11/2019] [Accepted: 11/23/2019] [Indexed: 12/12/2022]
Abstract
We participated in CAPRI rounds 38-45 both as a server predictor and a human predictor. These CAPRI rounds provided excellent opportunities for testing prediction methods for three classes of protein interactions, that is, protein-protein, protein-peptide, and protein-oligosaccharide interactions. Both template-based methods (GalaxyTBM for monomer protein, GalaxyHomomer for homo-oligomer protein, GalaxyPepDock for protein-peptide complex) and ab initio docking methods (GalaxyTongDock and GalaxyPPDock for protein oligomer, GalaxyPepDock-ab-initio for protein-peptide complex, GalaxyDock2 and Galaxy7TM for protein-oligosaccharide complex) have been tested. Template-based methods depend heavily on the availability of proper templates and template-target similarity, and template-target difference is responsible for inaccuracy of template-based models. Inaccurate template-based models could be improved by our structure refinement and loop modeling methods based on physics-based energy optimization (GalaxyRefineComplex and GalaxyLoop) for several CAPRI targets. Current ab initio docking methods require accurate protein structures as input. Small conformational changes from input structure could be accounted for by our docking methods, producing one of the best models for several CAPRI targets. However, predicting large conformational changes involving protein backbone is still challenging, and full exploration of physics-based methods for such problems is still to come.
Affiliation(s)
- Taeyong Park
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Hyeonuk Woo
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Minkyung Baek
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Jinsol Yang
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
8. Baek M, Park T, Woo H, Seok C. Prediction of protein oligomer structures using GALAXY in CASP13. Proteins 2019; 87:1233-1240. [PMID: 31509276 DOI: 10.1002/prot.25814] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/01/2019] [Revised: 08/30/2019] [Accepted: 09/07/2019] [Indexed: 01/24/2023]
Abstract
Many proteins need to form oligomers to be functional, so oligomer structures provide important clues to biological roles of proteins. Prediction of oligomer structures therefore can be a useful tool in the absence of experimentally resolved structures. In this article, we describe the server and human methods that we used to predict oligomer structures in the CASP13 experiment. Performances of the methods on the 42 CASP13 oligomer targets consisting of 30 homo-oligomers and 12 hetero-oligomers are discussed. Our server method, Seok-assembly, generated models with interface contact similarity measure greater than 0.2 as model 1 for 11 homo-oligomer targets when proper templates existed in the database. Model refinement methods such as loop modeling and molecular dynamics (MD)-based overall refinement failed to improve model qualities when target proteins have domains not covered by templates or when chains have very small interfaces. In human predictions, additional experimental data such as low-resolution electron microscopy (EM) map were utilized. EM data could assist oligomer structure prediction by providing a global shape of the complex structure.
Affiliation(s)
- Minkyung Baek
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Taeyong Park
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Hyeonuk Woo
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, Republic of Korea
9. Park T, Baek M, Lee H, Seok C. GalaxyTongDock: Symmetric and asymmetric ab initio protein-protein docking web server with improved energy parameters. J Comput Chem 2019; 40:2413-2417. [PMID: 31173387 DOI: 10.1002/jcc.25874] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 12/24/2018] [Revised: 04/27/2019] [Accepted: 05/20/2019] [Indexed: 12/21/2022]
Abstract
Protein-protein docking methods are spotlighted for their roles in providing insights into protein-protein interactions in the absence of full structural information by experiment. GalaxyTongDock is an ab initio protein-protein docking web server that performs rigid-body docking just like ZDOCK but with improved energy parameters. The energy parameters were trained by iterative docking and parameter search so that more native-like structures are selected as top rankers. GalaxyTongDock performs asymmetric docking of two different proteins (GalaxyTongDock_A) and symmetric docking of homo-oligomeric proteins with Cn and Dn symmetries (GalaxyTongDock_C and GalaxyTongDock_D). Performance tests on an unbound docking benchmark set for asymmetric docking and a model docking benchmark set for symmetric docking showed that GalaxyTongDock is better than or comparable to other state-of-the-art methods. Experimental and/or evolutionary information on binding interfaces can be easily incorporated by using block and interface options. GalaxyTongDock web server is freely available at http://galaxy.seoklab.org/tongdock. © 2019 Wiley Periodicals, Inc.
Affiliation(s)
- Taeyong Park
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Minkyung Baek
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Hasup Lee
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
10. Deng L, Sui Y, Zhang J. XGBPRH: Prediction of Binding Hot Spots at Protein-RNA Interfaces Utilizing Extreme Gradient Boosting. Genes (Basel) 2019; 10:genes10030242. [PMID: 30901953 PMCID: PMC6471955 DOI: 10.3390/genes10030242] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/16/2019] [Revised: 03/14/2019] [Accepted: 03/15/2019] [Indexed: 01/24/2023] Open
Abstract
Hot spot residues at protein-RNA complexes are vitally important for investigating the underlying molecular recognition mechanism. Accurately identifying protein-RNA binding hot spots is critical for drug designing and protein engineering. Although some progress has been made by utilizing various available features and a series of machine learning approaches, these methods are still in the infant stage. In this paper, we present a new computational method named XGBPRH, which is based on an eXtreme Gradient Boosting (XGBoost) algorithm and can effectively predict hot spot residues in protein-RNA interfaces utilizing an optimal set of properties. Firstly, we download 47 protein-RNA complexes and calculate a total of 156 sequence, structure, exposure, and network features. Next, we adopt a two-step feature selection algorithm to extract a combination of 6 optimal features from the combination of these 156 features. Compared with the state-of-the-art approaches, XGBPRH achieves better performances with an area under the ROC curve (AUC) score of 0.817 and an F1-score of 0.802 on the independent test set. Meanwhile, we also apply XGBPRH to two case studies. The results demonstrate that the method can effectively identify novel energy hotspots.
Affiliation(s)
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China
- Yuanchao Sui
  - School of Computer Science and Engineering, Central South University, Changsha 410075, China
- Jingpu Zhang
  - School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan 467000, China
11. Pan Y, Wang Z, Zhan W, Deng L. Computational identification of binding energy hot spots in protein-RNA complexes using an ensemble approach. Bioinformatics 2019; 34:1473-1480. [PMID: 29281004 DOI: 10.1093/bioinformatics/btx822] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 07/12/2017] [Accepted: 12/19/2017] [Indexed: 11/12/2022] Open
Abstract
Motivation: Identifying RNA-binding residues, especially energetically favored hot spots, can provide valuable clues for understanding the mechanisms and functional importance of protein-RNA interactions. Yet, limited availability of experimentally recognized energy hot spots in protein-RNA crystal structures leads to the difficulties in developing empirical identification approaches. Computational prediction of RNA-binding hot spot residues is still in its infant stage.
Results: Here, we describe a computational method, PrabHot (Prediction of protein-RNA binding hot spots), that can effectively detect hot spot residues on protein-RNA binding interfaces using an ensemble of conceptually different machine learning classifiers. Residue interaction network features and new solvent exposure characteristics are combined together and selected for classification with the Boruta algorithm. In particular, two new reference datasets (benchmark and independent) have been generated containing 107 hot spots from 47 known protein-RNA complex structures. In 10-fold cross-validation on the training dataset, PrabHot achieves promising performances with an AUC score of 0.86 and a sensitivity of 0.78, which are significantly better than that of the pioneer RNA-binding hot spot prediction method HotSPRing. We also demonstrate the capability of our proposed method on the independent test dataset and gain a competitive advantage as a result.
Availability and implementation: The PrabHot webserver is freely available at http://denglab.org/PrabHot/.
Contact: leideng@csu.edu.cn.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuliang Pan
  - School of Software, Central South University, Changsha 410075, China
- Zixiang Wang
  - School of Software, Central South University, Changsha 410075, China
- Weihua Zhan
  - School of Electronics and Computer Science, Zhejiang Wanli University, Ningbo 315100, China
- Lei Deng
  - School of Software, Central South University, Changsha 410075, China
  - Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China
12. Machine Learning Approaches for Protein-Protein Interaction Hot Spot Prediction: Progress and Comparative Assessment. Molecules 2018; 23:molecules23102535. [PMID: 30287797 PMCID: PMC6222875 DOI: 10.3390/molecules23102535] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 08/30/2018] [Revised: 09/27/2018] [Accepted: 10/02/2018] [Indexed: 12/27/2022] Open
Abstract
Hot spots are the subset of interface residues that account for most of the binding free energy, and they play essential roles in the stability of protein binding. Effectively identifying which specific interface residues of protein–protein complexes form the hot spots is critical for understanding the principles of protein interactions, and it has broad application prospects in protein design and drug development. Experimental methods like alanine scanning mutagenesis are labor-intensive and time-consuming. At present, the experimentally measured hot spots are very limited. Hence, the use of computational approaches to predicting hot spots is becoming increasingly important. Here, we describe the basic concepts and recent advances of machine learning applications in inferring the protein–protein interaction hot spots, and assess the performance of widely used features, machine learning algorithms, and existing state-of-the-art approaches. We also discuss the challenges and future directions in the prediction of hot spots.
13. Deng L, Xu X, Liu H. PredCSO: an ensemble method for the prediction of S-sulfenylation sites in proteins. Mol Omics 2018; 14:257-265. [DOI: 10.1039/c8mo00089a] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Predicting S-sulfenylation sites in proteins based on sequence and structural features by building an ensemble model by gradient tree boosting.
Affiliation(s)
- Lei Deng
  - School of Software, Central South University, Changsha, China
- Xiaojie Xu
  - School of Software, Central South University, Changsha, China
- Hui Liu
  - School of Software, Central South University, Changsha, China
  - Lab of Information Management, Changzhou University, Jiangsu
14. Fradera X, Babaoglu K. Overview of Methods and Strategies for Conducting Virtual Small Molecule Screening. Curr Protoc Chem Biol 2017; 9:196-212. [DOI: 10.1002/cpch.27] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 12/14/2022]
Affiliation(s)
- Xavier Fradera
  - Modeling and Informatics, MRL, Merck & Co., Inc., Boston, Massachusetts
- Kerim Babaoglu
  - Modeling and Informatics, MRL, Merck & Co., Inc., West Point, Pennsylvania
15. Pan Y, Liu D, Deng L. Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties. PLoS One 2017; 12:e0179314. [PMID: 28614374 PMCID: PMC5470696 DOI: 10.1371/journal.pone.0179314] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 04/01/2017] [Accepted: 05/27/2017] [Indexed: 12/20/2022] Open
Abstract
Single amino acid variations (SAVs) potentially alter biological functions, including causing diseases or natural differences between individuals. Identifying the relationship between a SAV and a certain disease provides the starting point for understanding the underlying mechanisms of specific associations, and can help further prevention and diagnosis of inherited disease. We propose PredSAV, a computational method that can effectively predict how likely SAVs are to be associated with disease by incorporating a gradient tree boosting (GTB) algorithm and optimally selected neighborhood features. A two-step feature selection approach is used to explore the most relevant and informative neighborhood properties that contribute to the prediction of disease association of SAVs across a wide range of sequence and structural features, especially some novel structural neighborhood features. In cross-validation experiments on the benchmark dataset, PredSAV achieves promising performances with an AUC score of 0.908 and a specificity of 0.838, which are significantly better than those of the other existing methods. Furthermore, we validate the capability of our proposed method by an independent test and gain a competitive advantage as a result. PredSAV, which combines gradient tree boosting with optimally selected neighborhood features, can return reliable predictions in distinguishing between disease-associated and neutral variants. Compared with existing methods, PredSAV shows improved specificity as well as increased overall performance.
Affiliation(s)
- Yuliang Pan
- School of Software, Central South University, Changsha, China
- Diwei Liu
- School of Software, Central South University, Changsha, China
- Lei Deng
- School of Software, Central South University, Changsha, China
- Shanghai Key Laboratory of Intelligent Information Processing, Shanghai, China
|
16
|
Malhotra S, Mathew OK, Sowdhamini R. DOCKSCORE: a webserver for ranking protein-protein docked poses. BMC Bioinformatics 2015; 16:127. [PMID: 25902779 PMCID: PMC4414291 DOI: 10.1186/s12859-015-0572-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/22/2014] [Accepted: 04/13/2015] [Indexed: 11/28/2022] Open
Abstract
Background: Proteins interact with a variety of other molecules such as nucleic acids, small molecules and other proteins inside the cell. Structure determination of protein-protein complexes is challenging for several reasons, such as the large molecular weights of these macromolecular complexes, their dynamic nature, and difficulty in purification and sample preparation. Computational docking permits an early understanding of the feasibility and mode of protein-protein interactions. However, docking algorithms propose a number of solutions and it is a challenging task to select the native or near-native pose(s) from this pool. DockScore is an objective scoring scheme that can be used to rank protein-protein docked poses. It considers several interface parameters, namely, surface area, evolutionary conservation, hydrophobicity, short contacts and spatial clustering at the interface for scoring. Results: We have implemented DockScore in the form of a webserver for use by the scientific community. The DockScore webserver can be employed, subsequent to docking, to perform scoring of the docked solutions, starting from multiple poses as inputs. The results, with scores and ranks for all the poses, can be downloaded as a csv file, and a graphical view of the interface of the best-ranking poses is available. Conclusions: The webserver for DockScore is made freely available for the scientific community at: http://caps.ncbs.res.in/dockscore/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0572-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sony Malhotra
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India.
- Oommen K Mathew
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India; SASTRA University, Tirumalaisamudram, Thanjavur, 613 401, Tamil Nadu, India.
- Ramanathan Sowdhamini
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India.
|
17
|
Deng L, Guan J, Wei X, Yi Y, Zhang QC, Zhou S. Boosting prediction performance of protein-protein interaction hot spots by using structural neighborhood properties. J Comput Biol 2013; 20:878-91. [PMID: 24134392 DOI: 10.1089/cmb.2013.0083] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022] Open
Abstract
Binding of one protein to another in a highly specific manner to form stable complexes is critical in most biological processes, yet the mechanisms involved in the interaction of proteins are not fully clear. The identification of hot spots, a small subset of binding interfaces that account for the majority of binding free energy, is becoming increasingly important in understanding the principles of protein interactions. Despite experiments like alanine scanning mutagenesis and a variety of computational methods that have been applied to this problem, comparative studies suggest that the development of accurate and reliable solutions is still in its infancy. We developed PredHS (Prediction of Hot Spots), a computational method that can effectively identify hot spots on protein-binding interfaces by using 38 optimally chosen properties. The optimal combination of features was selected from a set of 324 novel structural neighborhood properties by a two-step feature selection method consisting of a random forest algorithm and a sequential backward elimination method. We evaluated the performance of PredHS using a benchmark of 265 alanine-mutated interface residues (Dataset I) and a trimmed subset (Dataset II) with 10-fold cross-validation. Compared with the state-of-the-art approaches, PredHS achieves a significant improvement in prediction quality, which stems from the new structural neighborhood properties, the novel way of feature generation, as well as the selection power of the proposed two-step method. We further validated the capability of our method by an independent test and obtained promising results.
Affiliation(s)
- Lei Deng
- Department of Computer Science and Technology, Tongji University, Shanghai, China
|
18
|
Martins JM, Ramos RM, Pimenta AC, Moreira IS. Solvent-accessible surface area: How well can be applied to hot-spot detection? Proteins 2013; 82:479-90. [PMID: 24105801 DOI: 10.1002/prot.24413] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 04/29/2013] [Revised: 08/25/2013] [Accepted: 09/02/2013] [Indexed: 11/08/2022]
Abstract
A detailed comprehension of protein-based interfaces is essential for rational drug development. One of the key features of these interfaces is their solvent accessible surface area profile. With that in mind, we tested a group of 12 SASA-based features for their ability to correlate and differentiate hot- and null-spots. These were tested in three different data sets: explicit water MD, implicit water MD, and static PDB structure. We found no discernible improvement with the use of more comprehensive data sets obtained from molecular dynamics. The features tested were shown to be capable of discerning between hot- and null-spots, while presenting low correlations. Residue standardization, such as rel SASAi or rel/res SASAi, improved the features as a tool to predict ΔΔGbinding values. A new method using support machine learning algorithms was developed: SBHD (Sasa-Based Hot-spot Detection). This method presents a precision, recall, and F1 score of 0.72, 0.81, and 0.76 for the training set and 0.91, 0.73, and 0.81 for an independent test set.
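The F1 scores quoted in this abstract can be cross-checked from the reported precision and recall, since F1 is their harmonic mean. The snippet below is a small arithmetic check only; the function name `f1_score` is chosen here for illustration and is not taken from the SBHD paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values as reported in the abstract above.
train_f1 = f1_score(0.72, 0.81)  # training set
test_f1 = f1_score(0.91, 0.73)   # independent test set

print(round(train_f1, 2), round(test_f1, 2))
```

Both values round to the figures given in the abstract (0.76 and 0.81), so the three reported metrics are mutually consistent.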
Affiliation(s)
- João M Martins
- REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre s/n, 4169-007, Porto, Portugal
|
19
|
Vajda S, Hall DR, Kozakov D. Sampling and scoring: a marriage made in heaven. Proteins 2013; 81:1874-84. [PMID: 23775627 DOI: 10.1002/prot.24343] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 01/08/2013] [Revised: 05/14/2013] [Accepted: 05/31/2013] [Indexed: 12/11/2022]
Abstract
Most structure prediction algorithms consist of initial sampling of the conformational space, followed by rescoring and possibly refinement of a number of selected structures. Here we focus on protein docking, and show that while decoupling sampling and scoring facilitates method development, integration of the two steps can lead to substantial improvements in docking results. Since decoupling is usually achieved by generating a decoy set containing both non-native and near-native docked structures, which can then be used for scoring function construction, we first review the roles and potential pitfalls of decoys in protein-protein docking, and show that some types of decoys are better than others for method development. We then describe three case studies showing that complete decoupling of scoring from sampling is not the best choice for solving realistic docking problems. Although some of the examples are based on our own experience, the results of the CAPRI docking and scoring experiments also show that performing both sampling and scoring generally yields better results than scoring the structures generated by all predictors. Next we investigate how the selection of training and decoy sets affects the performance of the scoring functions obtained. Finally, we discuss pathways to better alignment of the two steps, and show some algorithms that achieve a certain level of integration. Although we focus on protein-protein docking, our observations most likely also apply to other conformational search problems, including protein structure prediction and the docking of small molecules to proteins.
Affiliation(s)
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215
|
20
|
Oliva R, Vangone A, Cavallo L. Ranking multiple docking solutions based on the conservation of inter-residue contacts. Proteins 2013; 81:1571-84. [PMID: 23609916 DOI: 10.1002/prot.24314] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 01/17/2013] [Revised: 03/16/2013] [Accepted: 04/08/2013] [Indexed: 01/11/2023]
Abstract
Molecular docking is the method of choice for investigating the molecular basis of recognition in a large number of functional protein complexes. However, correctly scoring the obtained docking solutions (decoys) to rank native-like (NL) conformations in the top positions is still an open problem. Herein we present CONSRANK, a simple and effective tool to rank multiple docking solutions, which relies on the conservation of inter-residue contacts in the analyzed decoys ensemble. First it calculates a conservation rate for each inter-residue contact, then it ranks decoys according to their ability to match the more frequently observed contacts. We applied CONSRANK to 102 targets from three different benchmarks, RosettaDock, DOCKGROUND, and Critical Assessment of PRedicted Interactions (CAPRI). The method performs consistently well, both in terms of NL solutions ranked in the top positions and of values of the area under the receiver operating characteristic curve. Its ideal application is to solutions coming from different docking programs and procedures, as in the case of CAPRI targets. For all the analyzed CAPRI targets where a comparison is feasible, CONSRANK outperforms the CAPRI scorers. The fraction of NL solutions in the top ten positions in the RosettaDock, DOCKGROUND, and CAPRI benchmarks is enriched on average by a factor of 3.0, 1.9, and 9.9, respectively. Interestingly, CONSRANK is also able to specifically single out the high/medium quality (HMQ) solutions from the docking decoys ensemble: it ranks 46.2 and 70.8% of the total HMQ solutions available for the RosettaDock and CAPRI targets, respectively, within the top 20 positions.
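The two steps described in this abstract (a conservation rate per inter-residue contact, then ranking decoys by how well they match the frequently observed contacts) can be sketched roughly as follows. This is a minimal illustration and not the CONSRANK implementation: the function names, the toy contact sets, and the choice to score a decoy by the mean conservation rate of its own contacts are all assumptions made here.

```python
from collections import Counter

def contact_conservation(decoys):
    """Map each contact to the fraction of decoys in which it appears."""
    counts = Counter(c for contacts in decoys.values() for c in set(contacts))
    n = len(decoys)
    return {contact: k / n for contact, k in counts.items()}

def rank_decoys(decoys):
    """Rank decoys by the mean conservation rate of their contacts (assumed scoring)."""
    rates = contact_conservation(decoys)
    scores = {}
    for name, contacts in decoys.items():
        unique = set(contacts)
        scores[name] = sum(rates[c] for c in unique) / len(unique) if unique else 0.0
    # Highest score first: decoys built from widely shared contacts rank on top.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy ensemble: each decoy is a set of (receptor residue, ligand residue) contacts.
decoys = {
    "d1": {("A10", "B5"), ("A11", "B6")},
    "d2": {("A10", "B5"), ("A40", "B70")},
    "d3": {("A10", "B5"), ("A11", "B6"), ("A12", "B7")},
}
rates = contact_conservation(decoys)
ranking = rank_decoys(decoys)
print(ranking)
```

In this toy ensemble the contact shared by all three decoys gets a conservation rate of 1.0, and the decoy whose contacts are most widely shared (`d1`) is ranked first. Because the score depends only on agreement within the ensemble, a sketch like this needs no energy function, which fits the abstract's remark that the approach suits solutions pooled from different docking programs.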
Affiliation(s)
- Romina Oliva
- Department of Applied Sciences, University "Parthenope" of Naples, Centro Direzionale Isola C4, 80143, Naples, Italy
|
21
|
Schneider S, Zacharias M. Scoring optimisation of unbound protein-protein docking including protein binding site predictions. J Mol Recognit 2011; 25:15-23. [DOI: 10.1002/jmr.1165] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]
Affiliation(s)
- Sebastian Schneider
- Physik-Department T38; Technische Universität München; James Franck Str. 1; 85748; Garching; Germany
- Martin Zacharias
- Physik-Department T38; Technische Universität München; James Franck Str. 1; 85748; Garching; Germany
|
22
|
Huang W, Liu H. Optimized grid-based protein-protein docking as a global search tool followed by incorporating experimentally derivable restraints. Proteins 2011; 80:691-702. [PMID: 22190391 DOI: 10.1002/prot.23223] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/20/2011] [Revised: 10/10/2011] [Accepted: 10/12/2011] [Indexed: 12/16/2022]
Abstract
Unbound protein docking, or the computational prediction of the structure of a protein complex from the structures of its separated components, is of importance but still challenging. A practical approach toward reliable results for unbound docking is to incorporate experimentally derived information with computation. To this end, a truly systematic search of the global docking space is desirable. Fast Fourier transform (FFT) docking is a systematic search method with high computational efficiency. However, when using FFT to perform unbound docking, possible conformational changes upon binding must be treated implicitly. To better accommodate the implicit treatment of conformational flexibility, we develop a rational approach to optimize "softened" parameters for FFT docking. In connection with the increased "softness" of the parameters in this global search step, we use a revised rule to select candidate models from the search results. For complexes designated as of low and medium difficulty for unbound docking, these adaptations of the original FTDOCK program lead to substantial improvements of the global search results. Finally, we show that models resulting from FFT-based global search can be further filtered with restraints derivable from nuclear magnetic resonance (NMR) chemical shift perturbation or mutagenesis experiments, leading to a small set of models that can be feasibly refined and evaluated using computationally more expensive methods and that still include high-ranking near-native conformations.
Affiliation(s)
- Wei Huang
- School of Life Sciences and Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China (USTC), Hefei, Anhui 230027, People's Republic of China
|
23
|
Marsh L. Prediction of ligand binding using an approach designed to accommodate diversity in protein-ligand interactions. PLoS One 2011; 6:e23215. [PMID: 21860668 PMCID: PMC3157911 DOI: 10.1371/journal.pone.0023215] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/19/2011] [Accepted: 07/12/2011] [Indexed: 02/07/2023] Open
Abstract
Computational determination of protein-ligand interaction potential is important for many biological applications including virtual screening for therapeutic drugs. The novel internal consensus scoring strategy is an empirical approach with an extended set of 9 binding terms combined with a neural network capable of analysis of diverse complexes. Like conventional consensus methods, internal consensus is capable of maintaining multiple distinct representations of protein-ligand interactions. In a typical use, the method was trained using ligand classification data (binding/no binding) for a single receptor. The internal consensus analyses successfully distinguished protein-ligand complexes from decoys (r², 0.895 for a series of typical proteins). Results are superior to other tested empirical methods. In virtual screening experiments, internal consensus analyses provide consistent enrichment as determined by ROC-AUC and pROC metrics.
Affiliation(s)
- Lorraine Marsh
- Department of Biology, Long Island University, Brooklyn, New York, United States of America.
|
24
|
Mitra P, Pal D. PRUNE and PROBE--two modular web services for protein-protein docking. Nucleic Acids Res 2011; 39:W229-34. [PMID: 21576226 PMCID: PMC3125751 DOI: 10.1093/nar/gkr317] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
Protein–protein docking programs typically perform four major tasks: (i) generation of docking poses, (ii) selecting a subset of poses, (iii) their structural refinement and (iv) scoring and ranking for the final assessment of the true quaternary structure. Although the tasks can be integrated or performed in a serial order, they are by nature modular, allowing an opportunity to substitute one algorithm with another. We have implemented two modular web services, (i) PRUNE: to select a subset of docking poses generated during sampling search (http://pallab.serc.iisc.ernet.in/prune) and (ii) PROBE: to refine, score and rank them (http://pallab.serc.iisc.ernet.in/probe). The former uses a new interface area based edge-scoring function to eliminate >95% of the poses generated during docking search. In contrast to other multi-parameter-based screening functions, this single parameter based elimination reduces the computational time significantly, in addition to increasing the chances of selecting native-like models in the top rank list. The PROBE server performs ranking of pruned poses, after structure refinement and scoring using a regression model for geometric compatibility, and normalized interaction energy. While web services similar to PROBE are infrequent, no web service akin to PRUNE has been described before. Both servers are publicly accessible and free to use.
Affiliation(s)
- Pralay Mitra
- Bioinformatics Centre, Indian Institute of Science, Bangalore 560 012, India
|
25
|
de Vries SJ, Bonvin AMJJ. CPORT: a consensus interface predictor and its performance in prediction-driven docking with HADDOCK. PLoS One 2011; 6:e17695. [PMID: 21464987 PMCID: PMC3064578 DOI: 10.1371/journal.pone.0017695] [Citation(s) in RCA: 233] [Impact Index Per Article: 17.9] [Received: 11/15/2010] [Accepted: 02/08/2011] [Indexed: 11/19/2022] Open
Abstract
Background: Macromolecular complexes are the molecular machines of the cell. Knowledge at the atomic level is essential to understand and influence their function. However, their number is huge and a significant fraction is extremely difficult to study using classical structural methods such as NMR and X-ray crystallography. Therefore, the importance of large-scale computational approaches in structural biology is evident. This study combines two of these computational approaches, interface prediction and docking, to obtain atomic-level structures of protein-protein complexes, starting from their unbound components. Methodology/Principal Findings: Here we combine six interface prediction web servers into a consensus method called CPORT (Consensus Prediction Of interface Residues in Transient complexes). We show that CPORT gives more stable and reliable predictions than each of the individual predictors on its own. A protocol was developed to integrate CPORT predictions into our data-driven docking program HADDOCK. For cases where experimental information is limited, this prediction-driven docking protocol presents an alternative to ab initio docking, the docking of complexes without the use of any information. Prediction-driven docking was performed on a large and diverse set of protein-protein complexes in a blind manner. Our results indicate that the performance of the HADDOCK-CPORT combination is competitive with ZDOCK-ZRANK, a state-of-the-art ab initio docking/scoring combination. Finally, the original interface predictions could be further improved by interface post-prediction (contact analysis of the docking solutions). Conclusions/Significance: The current study shows that blind, prediction-driven docking using CPORT and HADDOCK is competitive with ab initio docking methods. This is encouraging since prediction-driven docking represents the absolute bottom line for data-driven docking: any additional biological knowledge will greatly improve the results obtained by prediction-driven docking alone. Finally, the fact that original interface predictions could be further improved by interface post-prediction suggests that prediction-driven docking has not yet been pushed to the limit. A web server for CPORT is freely available at http://haddock.chem.uu.nl/services/CPORT.
Affiliation(s)
- Sjoerd J de Vries
- Faculty of Science, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands.
|
26
|
Pons C, Talavera D, de la Cruz X, Orozco M, Fernandez-Recio J. Scoring by intermolecular pairwise propensities of exposed residues (SIPPER): a new efficient potential for protein-protein docking. J Chem Inf Model 2011; 51:370-7. [PMID: 21214199 DOI: 10.1021/ci100353e] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.2] [Indexed: 11/28/2022]
Abstract
A detailed and complete structural knowledge of the interactome is one of the grand challenges in Biology, and a variety of computational docking approaches have been developed to complement experimental efforts and help in the characterization of protein-protein interactions. Among the different docking scoring methods, those based on physicochemical considerations can give the maximum accuracy at the atomic level, but they are usually computationally demanding and necessarily noisy when implemented in rigid-body approaches. Coarser-grained knowledge-based potentials are less sensitive to details of atomic arrangements, thus providing an efficient alternative for scoring of rigid-body docking poses. In this study, we have extracted new statistical potentials from intermolecular pairs of exposed residues in known complex structures, which were then used to score protein-protein docking poses. The new method, called SIPPER (scoring by intermolecular pairwise propensities of exposed residues), combines the value of residue desolvation based on solvent-exposed area with the propensity-based contribution of intermolecular residue pairs. This new scoring function found a near-native orientation within the top 10 predictions in nearly one-third of the cases of a standard docking benchmark and proved to be also useful as a filtering step, drastically reducing the number of docking candidates needed by energy-based methods like pyDock.
Affiliation(s)
- Carles Pons
- Life Sciences Department, Barcelona Supercomputing Center, National Institute of Bioinformatics, Barcelona, Spain
|
27
|
Ali HI, Fujita T, Akaho E, Nagamatsu T. A comparative study of AutoDock and PMF scoring performances, and SAR of 2-substituted pyrazolotriazolopyrimidines and 4-substituted pyrazolopyrimidines as potent xanthine oxidase inhibitors. J Comput Aided Mol Des 2009; 24:57-75. [DOI: 10.1007/s10822-009-9314-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 08/16/2009] [Accepted: 12/04/2009] [Indexed: 11/28/2022]
|