1
Nikam R, Yugandhar K, Gromiha MM. DeepBSRPred: deep learning-based binding site residue prediction for proteins. Amino Acids 2023; 55:1305-1316. [PMID: 36574037 DOI: 10.1007/s00726-022-03228-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 12/15/2022] [Indexed: 12/28/2022]
Abstract
MOTIVATION: Protein-protein interactions (PPIs) are important in governing several cellular activities. Amino acid residues located at the interface are known as binding sites, and information about binding sites helps in understanding the binding affinities and functions of protein-protein complexes. RESULTS: We have developed a deep neural network-based method, DeepBSRPred, for predicting binding sites using protein sequence information and predicted structures from AlphaFold2. Sequence- and structure-based features include the position-specific scoring matrix (PSSM), solvent accessible surface area, conservation score, amino acid properties, and residue depth. Our method predicted the binding sites with an average F1 score of 0.73 in a dataset of 1236 proteins. Further, we compared the performance with other existing methods in the literature using four benchmark datasets, and our method outperformed those methods. AVAILABILITY AND IMPLEMENTATION: The DeepBSRPred web server can be found at https://web.iitm.ac.in/bioinfo2/deepbsrpred/index.html, along with all datasets used in this study. The trained models, the DeepBSRPred standalone source code, and the feature computation pipeline are freely available at https://web.iitm.ac.in/bioinfo2/deepbsrpred/download.html.
Affiliation(s)
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Kumar Yugandhar
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Department of Computational Biology, Cornell University, New York, NY, USA
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India.
- Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan.
2
EPCES and EPSVR: Prediction of B-Cell Antigenic Epitopes on Protein Surfaces with Conformational Information. Methods Mol Biol 2020; 2131:289-297. [PMID: 32162262 DOI: 10.1007/978-1-0716-0389-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023]
Abstract
Accurate prediction of discontinuous antigenic epitopes is important for immunologic research and medical applications, but it is not an easy problem. Currently, there are only a few prediction servers available, though discontinuous epitopes constitute the majority of all B-cell antigenic epitopes. In this chapter, we describe two online servers, EPCES and EPSVR, for discontinuous epitope prediction. All methods were benchmarked by a curated independent test set, in which all antigens had no complex structures with the antibody, and their epitopes were identified by various biochemical experiments. The servers and all datasets are available at http://sysbio.unl.edu/EPCES/ and http://sysbio.unl.edu/EPSVR/.
3
Stojanoski V, Adamski CJ, Hu L, Mehta SC, Sankaran B, Zwart P, Prasad BVV, Palzkill T. Removal of the Side Chain at the Active-Site Serine by a Glycine Substitution Increases the Stability of a Wide Range of Serine β-Lactamases by Relieving Steric Strain. Biochemistry 2016; 55:2479-90. [PMID: 27073009 DOI: 10.1021/acs.biochem.6b00056] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Serine β-lactamases are bacterial enzymes that hydrolyze β-lactam antibiotics. They utilize an active-site serine residue as a nucleophile, forming an acyl-enzyme intermediate during hydrolysis. In this study, thermal denaturation experiments as well as X-ray crystallography were performed to test the effect of substitution of the catalytic serine with glycine on protein stability in serine β-lactamases. Six different enzymes comprising representatives from each of the three classes of serine β-lactamases were examined, including TEM-1, CTX-M-14, and KPC-2 of class A, P99 of class C, and OXA-48 and OXA-163 of class D. For each enzyme, the wild type and a serine-to-glycine mutant were evaluated for stability. The glycine mutants all exhibited enhanced thermostability compared to that of the wild type. In contrast, alanine substitutions of the catalytic serine in TEM-1, OXA-48, and OXA-163 did not alter stability, suggesting removal of the Cβ atom is key to the stability increase associated with the glycine mutants. The X-ray crystal structures of P99 S64G, OXA-48 S70G and S70A, and OXA-163 S70G suggest that removal of the side chain of the catalytic serine releases steric strain to improve enzyme stability. Additionally, analysis of the torsion angles at the nucleophile position indicates that the glycine mutants exhibit improved distance and angular parameters of the intrahelical hydrogen bond network compared to those of the wild-type enzymes, which is also consistent with increased stability. The increased stability of the mutants indicates that the enzyme pays a price in stability for the presence of a side chain at the catalytic serine position but that the cost is necessary in that removal of the serine drastically impairs function. 
These findings support the stability-function hypothesis, which states that active-site residues are optimized for substrate binding and catalysis but that the requirements for catalysis are often not consistent with the requirements for optimal stability.
Affiliation(s)
- Banumathi Sankaran
- Berkeley Center for Structural Biology, Molecular Biophysics and Integrated Bioimaging, Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Peter Zwart
- Berkeley Center for Structural Biology, Molecular Biophysics and Integrated Bioimaging, Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
4
Wang DD, Wang R, Yan H. Fast prediction of protein–protein interaction sites based on Extreme Learning Machines. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2012.12.062] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 01/28/2023]
5
Li L, Huang Y, Xiao Y. How to use not-always-reliable binding site information in protein-protein docking prediction. PLoS One 2013; 8:e75936. [PMID: 24124522 PMCID: PMC3790831 DOI: 10.1371/journal.pone.0075936] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/28/2013] [Accepted: 08/22/2013] [Indexed: 11/19/2022] Open
Abstract
In many protein-protein docking algorithms, binding site information is used to help predict protein complex structures. Using correct and accurate binding site information can increase the protein-protein docking success rate significantly. On the other hand, using wrong binding site information should lead to a failed prediction or, at least, decrease the success rate. Recently, various successful theoretical methods have been proposed to predict the binding sites of proteins. However, the predicted binding site information is not always reliable; sometimes wrong binding site information is given. Hence there is a high risk in using predicted binding site information in current docking algorithms. In this paper, a softly restricting method (SRM) is developed to solve this problem. By utilizing predicted binding site information in a proper way, the SRM algorithm is sensitive to correct binding site information but insensitive to wrong information, which decreases the risk of using predicted binding site information. The SRM is tested on benchmark 3.0 using purely predicted binding site information. The results show that when the predicted information is correct, SRM increases the success rate significantly; however, even if the predicted information is completely wrong, SRM decreases the success rate only slightly, which indicates that the SRM is suitable for utilizing predicted binding site information.
Affiliation(s)
- Lin Li
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, South Carolina, United States of America
- Yanzhao Huang
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- * E-mail: (YH); (YX)
- Yi Xiao
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- * E-mail: (YH); (YX)
6
Li B, Kihara D. Protein docking prediction using predicted protein-protein interface. BMC Bioinformatics 2012; 13:7. [PMID: 22233443 PMCID: PMC3287255 DOI: 10.1186/1471-2105-13-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 08/26/2011] [Accepted: 01/10/2012] [Indexed: 11/10/2022] Open
Abstract
Background: Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results: We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies from case to case, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by a second round of docking with updated docking interface information to further improve the docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion: We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Affiliation(s)
- Bin Li
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
7
Thomas VL, McReynolds AC, Shoichet BK. Structural bases for stability-function tradeoffs in antibiotic resistance. J Mol Biol 2009; 396:47-59. [PMID: 19913034 DOI: 10.1016/j.jmb.2009.11.005] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 09/09/2009] [Revised: 11/02/2009] [Accepted: 11/04/2009] [Indexed: 10/20/2022]
Abstract
Preorganization of enzyme active sites for substrate recognition typically comes at a cost to the stability of the folded form of the protein; consequently, enzymes can be dramatically stabilized by substitutions that attenuate the size and preorganization "strain" of the active site. How this stability-activity tradeoff constrains enzyme evolution has remained less certain, and it is unclear whether one should expect major stability insults as enzymes mutate towards new activities or how these new activities manifest structurally. These questions are both germane and easy to study in beta-lactamases, which are evolving on the timescale of years to confer resistance to an ever-broader spectrum of beta-lactam antibiotics. To explore whether stability is a substantial constraint on this antibiotic resistance evolution, we investigated extended-spectrum mutants of class C beta-lactamases, which had evolved new activity versus third-generation cephalosporins. Five mutant enzymes had between 100-fold and 200-fold increased activity against the antibiotic cefotaxime in enzyme assays, and the mutant enzymes all lost thermodynamic stability (from 1.7 kcal mol⁻¹ to 4.1 kcal mol⁻¹), consistent with the stability-function hypothesis. Intriguingly, several of the substitutions were 10-20 Å from the catalytic serine; the question of how they conferred extended-spectrum activity arose. Eight structures, including complexes with inhibitors and extended-spectrum antibiotics, were determined by X-ray crystallography. Distinct mechanisms of action, including changes in the flexibility and ground-state structures of the enzyme, are revealed for each mutant. These results explain the structural bases for the antibiotic resistance conferred by these substitutions and their corresponding decrease in protein stability, which will constrain the evolution of new antibiotic resistance.
Affiliation(s)
- Veena L Thomas
- Graduate Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, CA 94158-2518, USA
8
Liang S, Zheng D, Zhang C, Zacharias M. Prediction of antigenic epitopes on protein surfaces by consensus scoring. BMC Bioinformatics 2009; 10:302. [PMID: 19772615 PMCID: PMC2761409 DOI: 10.1186/1471-2105-10-302] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Received: 04/07/2009] [Accepted: 09/22/2009] [Indexed: 12/05/2022] Open
Abstract
Background: Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes linear in sequence. Only a few structure-based epitope prediction algorithms are available and they have not yet shown satisfying performance. Results: We present a new antigen Epitope Prediction method, which uses ConsEnsus Scoring (EPCES) from six different scoring functions: residue epitope propensity, conservation score, side-chain energy score, contact number, surface planarity score, and secondary structure composition. Applied to unbound antigen structures from an independent test set, EPCES was able to predict antigenic epitopes with 47.8% sensitivity, 69.5% specificity and an AUC value of 0.632. The performance of the method is statistically similar to other published methods. The AUC value of EPCES is higher than the best results of existing algorithms by about 0.034. Conclusion: Our work shows that consensus scoring of multiple features performs better than any single term. The successful prediction is also due to the new score of residue epitope propensity based on atomic solvent accessibility.
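The three figures quoted in this abstract (sensitivity, specificity, AUC) can be illustrated with a minimal Python sketch. It assumes per-residue binary labels and real-valued scores; the function names, the 0.5 threshold, and the toy data are hypothetical and are not taken from EPCES.

```python
def sensitivity_specificity(labels, calls):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for y, c in zip(labels, calls) if y and c)
    fn = sum(1 for y, c in zip(labels, calls) if y and not c)
    tn = sum(1 for y, c in zip(labels, calls) if not y and not c)
    fp = sum(1 for y, c in zip(labels, calls) if not y and c)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the probability
    that a random epitope residue outscores a random non-epitope residue
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, for illustration only: 1 = epitope residue, 0 = non-epitope.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
calls = [s >= 0.5 for s in scores]  # hypothetical decision threshold

sens, spec = sensitivity_specificity(labels, calls)
roc_auc = auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={roc_auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality over all thresholds, which is why the abstract reports both.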
Affiliation(s)
- Shide Liang
- School of Engineering and Science, Jacobs University Bremen, Campus Ring 1, D-28759 Bremen, Germany
9
Liang S, Meroueh SO, Wang G, Qiu C, Zhou Y. Consensus scoring for enriching near-native structures from protein-protein docking decoys. Proteins 2009; 75:397-403. [PMID: 18831053 DOI: 10.1002/prot.22252] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022]
Abstract
The identification of near-native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near-native detection is to enrich near-native docking decoys in a small number of top-ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation, and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near-native docking decoys. We found that all individual scores result in enrichment for the majority of the 28 targets in the ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 docking decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures in docking decoys increases by a factor of about six on average. ENDES was implemented in a computer program that is available for download at http://sparks.informatics.iupui.edu.
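The "factor of about six" in this abstract is an enrichment factor: the near-native fraction among the retained decoys divided by the near-native fraction in the full set. A minimal Python sketch follows; the function name `enrichment_factor`, the lower-is-better score convention, and the synthetic decoy set are assumptions made for illustration and do not reproduce the ENDES scoring function itself.

```python
def enrichment_factor(scores, is_near_native, keep):
    """Near-native fraction in the `keep` best-scored decoys divided by
    the near-native fraction in the whole decoy set.

    Assumes a lower score is better (an assumption of this sketch)."""
    if not 0 < keep <= len(scores):
        raise ValueError("keep must be between 1 and the number of decoys")
    total_hits = sum(is_near_native)
    if total_hits == 0:
        raise ValueError("no near-native decoys in the set")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    top_hits = sum(is_near_native[i] for i in ranked[:keep])
    return (top_hits / keep) / (total_hits / len(scores))


# Synthetic example: 2000 decoys, 20 near-native, 12 of them ranked in the top 200.
scores = list(range(2000))  # decoy i has score i
near = set(range(0, 120, 10)) | set(range(1000, 1800, 100))
is_near_native = [i in near for i in range(2000)]

factor = enrichment_factor(scores, is_near_native, keep=200)
print(f"enrichment factor: {factor:.1f}")
```

In this synthetic case the near-native fraction rises from 20/2000 to 12/200, which corresponds to the kind of six-fold enrichment the abstract describes.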
Affiliation(s)
- Shide Liang
- Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
10
Abstract
Protein–DNA/RNA/protein interactions play critical roles in many biological functions. Previous studies have focused on the different features characterizing the different macromolecule-binding sites and approaches to detect these sites. However, no common unique signature of these sites had been reported. Thus, this work aims to provide a ‘common’ principle dictating the location of the different macromolecule-binding sites founded upon fundamental principles of binding thermodynamics. To achieve this aim, a comprehensive set of structurally nonhomologous DNA-, RNA-, obligate protein- and nonobligate protein-binding proteins, both free and bound to their respective macromolecules, was created and a novel strategy for detecting clusters of residues with electrostatic or steric strain given the protein structure was developed. The results show that regardless of the macromolecule type, the binding strength and conformational changes upon binding, macromolecule-binding sites are energetically less stable than nonmacromolecule-binding sites. They also reveal new energetic features distinguishing DNA- from RNA-binding sites and obligate protein- from nonobligate protein-binding sites in both free/bound protein structures.
Affiliation(s)
- Yao Chi Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
11
Liang S, Zhang C, Liu S, Zhou Y. Protein binding site prediction using an empirical scoring function. Nucleic Acids Res 2006; 34:3698-707. [PMID: 16893954 PMCID: PMC1540721 DOI: 10.1093/nar/gkl454] [Citation(s) in RCA: 194] [Impact Index Per Article: 10.8] [Indexed: 11/12/2022] Open
Abstract
Most biological processes are mediated by interactions between proteins and their interacting partners including proteins, nucleic acids and small molecules. This work establishes a method called PINUP for binding site prediction of monomeric proteins. With only two weight parameters to optimize, PINUP produces not only 42.2% coverage of actual interfaces (percentage of correctly predicted interface residues in actual interface residues) but also 44.5% accuracy in predicted interfaces (percentage of correctly predicted interface residues in the predicted interface residues) in a cross validation using a 57-protein dataset. By comparison, the expected accuracy via random prediction (percentage of actual interface residues in surface residues) is only 15%. The binding sites of the 57-protein set are found to be easier to predict than those of an independent test set of 68 proteins. The average coverage and accuracy for this independent test set are 30.5% and 29.4%, respectively. The significant gain of PINUP over expected random prediction is attributed to (i) effective residue-energy score and accessible-surface-area-dependent interface-propensity, (ii) isolation of functional constraints contained in the conservation score from the structural constraints through the combination of residue-energy score (for structural constraints) and conservation score and (iii) a consensus region built on top-ranked initial patches.
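The coverage and accuracy measures defined in parentheses in this abstract can be written out directly. The Python sketch below assumes interface residues are represented as sets of residue identifiers; the function name `coverage_and_accuracy` and the residue numbers are hypothetical and not taken from PINUP.

```python
def coverage_and_accuracy(predicted, actual):
    """Return (coverage, accuracy) as fractions.

    coverage: correctly predicted interface residues / actual interface residues
    accuracy: correctly predicted interface residues / predicted interface residues
    """
    predicted, actual = set(predicted), set(actual)
    correct = predicted & actual
    coverage = len(correct) / len(actual) if actual else 0.0
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    return coverage, accuracy


# Hypothetical residue numbers, for illustration only.
actual_interface = {10, 11, 12, 40, 41, 42, 43, 90}
predicted_interface = {11, 12, 13, 41, 42, 77}

cov, acc = coverage_and_accuracy(predicted_interface, actual_interface)
print(f"coverage={cov:.3f} accuracy={acc:.3f}")
```

Under these definitions, coverage corresponds to what is elsewhere called recall or sensitivity, and accuracy to precision, so the random-prediction baseline the abstract quotes is the share of surface residues that are actual interface residues.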
Affiliation(s)
- Yaoqi Zhou
- To whom correspondence should be addressed. Tel: +1 716 829 2985; Fax: +1 716 829 2344;
12
Nayal M, Honig B. On the nature of cavities on protein surfaces: Application to the identification of drug-binding sites. Proteins 2006; 63:892-906. [PMID: 16477622 DOI: 10.1002/prot.20897] [Citation(s) in RCA: 195] [Impact Index Per Article: 10.8] [Indexed: 12/11/2022]
Abstract
In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen.
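The balanced error rate quoted in this abstract is commonly defined as the mean of the false-negative rate and the false-positive rate, which keeps the rare positive class (drug-binding cavities, roughly one among about 14 per protein) from being swamped by the negatives. The Python sketch below uses that standard definition; whether SCREEN computes it in exactly this way is an assumption, and the counts are invented for illustration.

```python
def balanced_error_rate(tp, fn, fp, tn):
    """Mean of the false-negative rate and the false-positive rate."""
    fnr = fn / (tp + fn)  # binding cavities that were missed
    fpr = fp / (fp + tn)  # non-binding cavities flagged as binding
    return 0.5 * (fnr + fpr)


# Invented confusion-matrix counts, for illustration only.
tp, fn, fp, tn = 80, 20, 50, 950

ber = balanced_error_rate(tp, fn, fp, tn)
plain_error = (fn + fp) / (tp + fn + fp + tn)
print(f"balanced error rate={ber:.4f} plain error rate={plain_error:.4f}")
```

With these counts the plain error rate looks low because negatives dominate, while the balanced error rate weights both classes equally.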
Affiliation(s)
- Murad Nayal
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, USA