1
Harihar B, Saravanan KM, Gromiha MM, Selvaraj S. Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol 2024:10.1007/s12033-024-01119-4. [PMID: 38498284 DOI: 10.1007/s12033-024-01119-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/10/2024] [Indexed: 03/20/2024]
Abstract
Inter-residue interactions in protein structures provide valuable insights into protein folding and stability. Understanding these interactions can be helpful in many crucial applications, including rational design of therapeutic small molecules and biologics, locating functional protein sites, and predicting protein-protein and protein-ligand interactions. The process of developing machine learning models incorporating inter-residue interactions has been improved recently. This review highlights the theoretical models incorporating inter-residue interactions in predicting folding and unfolding rates of proteins. Utilizing contact maps to depict inter-residue interactions aids researchers in developing computer models for detecting remote homologs and interface residues within protein-protein complexes which, in turn, enhances our knowledge of the relationship between sequence and structure of proteins. Further, the application of contact maps derived from inter-residue interactions is highlighted in the field of drug discovery. Overall, this review presents an extensive assessment of the significant models that use inter-residue interactions to investigate folding rates, unfolding rates, remote homology, and drug development, providing potential future advancements in constructing efficient computational models in structural biology.
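As a rough illustration of the contact maps mentioned in this abstract, the sketch below builds a binary map from residue coordinates. The 8 Å cutoff, the use of one representative point per residue, and the toy coordinates are assumptions chosen for the example; the review itself does not prescribe them.

```python
import math

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix marking residue pairs within `cutoff`.

    coords: one (x, y, z) point per residue (e.g. a C-alpha atom).
    cutoff and min_separation are illustrative defaults, not values from the review.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Hypothetical five-residue trace with 3.8 A spacing between neighbours.
toy_trace = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

for row in contact_map(toy_trace):
    print(row)
```

In this toy trace every pair is in contact except residues 0 and 3, which lie about 8.5 Å apart. Raising `min_separation` filters out trivially close sequence neighbours.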
Affiliation(s)
- Balasubramanian Harihar
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Konda Mani Saravanan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, Tamil Nadu, 600073, India
- Michael M Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Samuel Selvaraj
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
2
Huang A, Lu F, Liu F. Exploring the molecular mechanism of cold-adaption of an alkaline protease mutant by molecular dynamics simulations and residue interaction network. Protein Sci 2023; 32:e4837. [PMID: 37984374 PMCID: PMC10682693 DOI: 10.1002/pro.4837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 10/18/2023] [Accepted: 11/09/2023] [Indexed: 11/22/2023]
Abstract
Psychrophilic proteases have attracted enormous attention in past decades because of their high catalytic activity at low temperatures in a wide range of industrial processes, especially in the detergent and leather industries. Among them, H5 is an alkaline protease mutant that features psychrophilic-like behavior, but the reasons for its higher activity at low temperatures are still poorly understood. Herein, molecular dynamics (MD) simulations combined with residue interaction network (RIN) analysis were used to investigate the mechanisms of cold adaptation of mutant H5. The results demonstrated that two loops involved in substrate binding, G100-S104 and S125-S129, had higher mobility in H5, and that the enlarged distance between the two loops modulated the substrate's accessibility compared with the wild-type counterpart. In addition, H5 enhanced conformational flexibility by weakening salt bridges and increasing interaction with the solvent. In particular, the absence of the Lys251-Asp197-Arg247 salt bridge network may contribute to the structural mobility. Based on the free energy landscape and molecular mechanics Poisson-Boltzmann surface area calculations for the wild type and H5, it was shown that H5 possessed a large population of interconvertible conformations, resulting in a weaker substrate binding free energy. The calculated RIN topology parameters, such as the average degree, average clustering coefficient, and average path length, further verified that the mutant H5 attenuated residue-to-residue interactions. Investigating the mechanisms by which residue mutations affect the stability and activity of enzymes provides a theoretical basis for the development of cold-adapted proteases.
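The three RIN topology parameters named in this abstract can be computed on any undirected residue graph. The following is a minimal sketch using standard graph definitions; the four-residue toy networks and the node labels `r1` to `r4` are hypothetical and are not taken from the paper's simulations.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Build an undirected adjacency map from (residue, residue) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def average_degree(graph):
    return sum(len(nb) for nb in graph.values()) / len(graph)

def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)

def average_path_length(graph):
    """Mean shortest-path length over reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical networks: the second one lacks the r1-r2 interaction.
dense = build_graph([("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r3", "r4")])
sparse = build_graph([("r2", "r3"), ("r1", "r3"), ("r3", "r4")])

for name, g in (("dense", dense), ("sparse", sparse)):
    print(name, average_degree(g), average_clustering(g), average_path_length(g))
```

Removing one interaction lowers the average degree (2.0 to 1.5) and clustering coefficient (about 0.58 to 0.0) and lengthens the average path (about 1.33 to 1.5), which is the direction of change the abstract associates with attenuated residue-to-residue interactions.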
Affiliation(s)
- Ailan Huang
  - College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
- Fuping Lu
  - College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
  - Key Laboratory of Industrial Fermentation Microbiology, Ministry of Education, Tianjin Key Laboratory of Industrial Microbiology, Tianjin, China
- Fufeng Liu
  - College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
  - Key Laboratory of Industrial Fermentation Microbiology, Ministry of Education, Tianjin Key Laboratory of Industrial Microbiology, Tianjin, China
3
Nikam R, Yugandhar K, Gromiha MM. Deep learning-based method for predicting and classifying the binding affinity of protein-protein complexes. Biochimica et Biophysica Acta. Proteins and Proteomics 2023; 1871:140948. [PMID: 37567456 DOI: 10.1016/j.bbapap.2023.140948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 08/05/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in various biological processes. Accurately estimating the binding affinity of PPIs is essential for understanding the underlying molecular recognition mechanisms. In this study, we employed a deep learning approach to predict the binding affinity (ΔG) of protein-protein complexes. To this end, we compiled a dataset of 903 protein-protein complexes, each with its corresponding experimental binding affinity, which belong to six functional classes. We extracted 8 to 20 non-redundant features from the sequence information as well as the predicted three-dimensional structures using feature selection methods for each protein functional class. Our method showed an overall mean absolute error of 1.05 kcal/mol and a correlation of 0.79 between experimental and predicted ΔG values. Additionally, we evaluated our model for discriminating high and low affinity protein-protein complexes and it achieved an accuracy of 87% with an F1 score of 0.86 using 10-fold cross-validation on the selected features. Our approach presents an efficient tool for studying PPIs and provides crucial insights into the underlying mechanisms of the molecular recognition process. The web server can be freely accessed at https://web.iitm.ac.in/bioinfo2/DeepPPAPred/index.html.
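The two regression metrics quoted in this abstract, mean absolute error and correlation between experimental and predicted ΔG, can be computed as in the sketch below. The ΔG values are made up for illustration, and the Pearson coefficient is assumed here since the abstract does not say which correlation measure was used.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical binding free energies in kcal/mol (not data from the paper).
experimental = [-12.0, -10.0, -8.0, -6.0]
predicted = [-11.0, -10.5, -7.0, -6.5]

print("MAE:", mean_absolute_error(experimental, predicted))  # 0.75
print("r:", round(pearson_r(experimental, predicted), 3))    # 0.943
```

The two metrics answer different questions: MAE reports the typical error in kcal/mol, while the correlation only reports how well the ranking and trend are preserved, so a model can score well on one and poorly on the other.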
Affiliation(s)
- Rahul Nikam
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Kumar Yugandhar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computational Biology, Cornell University, New York, USA
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore
4
Scalvini B, Sheikhhassani V, Mashaghi A. Topological principles of protein folding. Phys Chem Chem Phys 2021; 23:21316-21328. [PMID: 34545868 DOI: 10.1039/d1cp03390e] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/21/2022]
Abstract
What is the topology of a protein and what governs protein folding to a specific topology? This is a fundamental question in biology. The protein folding reaction is a critically important cellular process, which is failing in many prevalent diseases. Understanding protein folding is also key to the design of new proteins for applications. However, our ability to predict the folding of a protein chain is quite limited and much is still unknown about the topological principles of folding. Current predictors of folding kinetics, including the contact order and size, present a limited predictive power, suggesting that these models are fundamentally incomplete. Here, we use a newly developed mathematical framework to define and extract the topology of a native protein conformation beyond knot theory, and investigate the relationship between native topology and folding kinetics in experimentally characterized proteins. We show that not only the folding rate, but also the mechanistic insight into folding mechanisms can be inferred from topological parameters. We identify basic topological features that speed up or slow down the folding process. The approach enabled the decomposition of protein 3D conformation into topologically independent elementary folding units, called circuits. The number of circuits correlates significantly with the folding rate, offering not only an efficient kinetic predictor, but also a tool for a deeper understanding of theoretical folding models. This study contributes to recent work that reveals the critical relevance of topology to protein folding with a new, contact-based, mathematically rigorous perspective. We show that topology can predict folding kinetics when geometry-based predictors like contact order and size fail.
Affiliation(s)
- Barbara Scalvini
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Vahid Sheikhhassani
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Alireza Mashaghi
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
5
Fang X, Huang J, Zhang R, Wang F, Zhang Q, Li G, Yan J, Zhang H, Yan Y, Xu L. Convolution Neural Network-Based Prediction of Protein Thermostability. J Chem Inf Model 2019; 59:4833-4843. [PMID: 31657922 DOI: 10.1021/acs.jcim.9b00220] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
Abstract
Most natural proteins exhibit poor thermostability, which limits their industrial application. Computer-aided rational design is an efficient purpose-oriented method that can improve protein thermostability. Numerous machine-learning-based methods have been designed to predict the changes in protein thermostability induced by mutations. However, all of these methods have certain limitations due to existing mutation coding methods that overlook protein sequence features. Here we propose a method to predict protein thermostability using convolutional neural networks based on an in-depth study of thermostability-related protein properties. This method comprises a three-dimensional coding algorithm, including protein mutation information and a strategy to extract neighboring features at protein mutation sites based on multiscale convolution. The accuracies on the S1615 and S388 data sets, which are widely used for protein thermostability predictions, reached 86.4% and 87%, respectively. The Matthews correlation coefficient was nearly double those produced using other methods. Furthermore, a model was constructed to predict the thermostability of Rhizomucor miehei lipase mutants based on the S3661 data set, a single amino acid mutation data set screened from the ProTherm protein thermodynamics database. Compared with the RIF strategy, which consists of three algorithms, i.e., Rosetta ddg_monomer, I-Mutant 3.0, and FoldX, the accuracy of the proposed method was higher (75.0 vs 66.7%), and the negative sample resolution was simultaneously enhanced. These results indicate that our prediction method more effectively assessed the protein thermostability and distinguished its features, making it a powerful tool to devise mutations that enhance the thermostability of proteins, particularly enzymes.
Affiliation(s)
- Xingrong Fang
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Jinsha Huang
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Rui Zhang
  - Editorial Board of the Journal of Wuhan Institute of Technology, Wuhan Institute of Technology, Wuhan 430074, P. R. China
- Fei Wang
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Qiuyu Zhang
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Guanlin Li
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Jinyong Yan
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Houjin Zhang
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Yunjun Yan
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
- Li Xu
  - Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
6
Bittrich S, Kaden M, Leberecht C, Kaiser F, Villmann T, Labudde D. Application of an interpretable classification model on Early Folding Residues during protein folding. BioData Min 2019; 12:1. [PMID: 30627219 PMCID: PMC6321665 DOI: 10.1186/s13040-018-0188-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/06/2018] [Accepted: 11/20/2018] [Indexed: 01/09/2023] [Open Access]
Abstract
Background: Machine learning strategies are prominent tools for data analysis. Especially in life sciences, they have become increasingly important to handle the growing datasets collected by the scientific community. Meanwhile, algorithms improve in performance, but also gain complexity, and tend to neglect interpretability and comprehensiveness of the resulting models. Results: Generalized Matrix Learning Vector Quantization (GMLVQ) is a supervised, prototype-based machine learning method and provides comprehensive visualization capabilities not present in other classifiers which allow for a fine-grained interpretation of the data. In contrast to commonly used machine learning strategies, GMLVQ is well-suited for imbalanced classification problems which are frequent in life sciences. We present a Weka plug-in implementing GMLVQ. The feasibility of GMLVQ is demonstrated on a dataset of Early Folding Residues (EFR) that have been shown to initiate and guide the protein folding process. Using 27 features, an area under the receiver operating characteristic of 76.6% was achieved which is comparable to other state-of-the-art classifiers. The obtained model is accessible at https://biosciences.hs-mittweida.de/efpred/. Conclusions: The application on EFR prediction demonstrates how an easy interpretation of classification models can promote the comprehension of biological mechanisms. The results shed light on the special features of EFR which were reported as most influential for the classification: EFR are embedded in ordered secondary structure elements and they participate in networks of hydrophobic residues. Visualization capabilities of GMLVQ are presented as we demonstrate how to interpret the results.
Affiliation(s)
- Sebastian Bittrich
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany; Biotechnology Center (BIOTEC) TU Dresden, Tatzberg 47/49, Dresden, 01307 Germany
- Marika Kaden
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany
- Christoph Leberecht
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany; Biotechnology Center (BIOTEC) TU Dresden, Tatzberg 47/49, Dresden, 01307 Germany
- Florian Kaiser
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany; Biotechnology Center (BIOTEC) TU Dresden, Tatzberg 47/49, Dresden, 01307 Germany
- Thomas Villmann
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany
- Dirk Labudde
  - University of Applied Sciences Mittweida, Technikumplatz 17, Mittweida, 09648 Germany
7
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
  - Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
8
Stahl K, Schneider M, Brock O. EPSILON-CP: using deep learning to combine information from multiple sources for protein contact prediction. BMC Bioinformatics 2017; 18:303. [PMID: 28623886 PMCID: PMC5474060 DOI: 10.1186/s12859-017-1713-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 09/27/2016] [Accepted: 05/30/2017] [Indexed: 01/12/2023] [Open Access]
Abstract
BACKGROUND: Accurately predicted contacts make it possible to compute the 3D structure of a protein. Since the solution space of native residue-residue contact pairs is very large, it is necessary to leverage information to identify relevant regions of the solution space, i.e. correct contacts. Every additional source of information can contribute to narrowing down candidate regions. Therefore, recent methods combined evolutionary and sequence-based information as well as evolutionary and physicochemical information. We develop a new contact predictor (EPSILON-CP) that goes beyond current methods by combining evolutionary, physicochemical, and sequence-based information. The problems resulting from the increased dimensionality and complexity of the learning problem are combated with a careful feature analysis, which results in a drastically reduced feature set. The different information sources are combined using deep neural networks. RESULTS: On 21 hard CASP11 FM targets, EPSILON-CP achieves a mean precision of 35.7% for top-L/10 predicted long-range contacts, which is 11% better than the CASP11 winning version of MetaPSICOV. The improvement on 1.5L is 17%. Furthermore, in this study we find that the amino acid composition, a commonly used feature, is rendered ineffective in the context of meta approaches. The size of the refined feature set decreased by 75%, enabling a significant increase in training data for machine learning, contributing significantly to the observed improvements. CONCLUSIONS: Exploiting as much and diverse information as possible is key to accurate contact prediction. Simply merging the information introduces new challenges. Our study suggests that critical feature analysis can improve the performance of contact prediction methods that combine multiple information sources. EPSILON-CP is available as a webservice: http://compbio.robotics.tu-berlin.de/epsilon/.
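The "top-L/10 long-range" precision reported in this abstract can be sketched as follows. The function name, the dictionary input format, and the sequence-separation threshold of 24 residues for "long-range" are assumptions (the threshold follows the usual CASP convention, which the abstract does not restate); the scores and native contacts are invented.

```python
def top_l_precision(scores, native_contacts, length, fraction=10, min_separation=24):
    """Precision of the top length/fraction predicted long-range contacts.

    scores: dict mapping (i, j) residue pairs with i < j to predicted scores.
    native_contacts: set of (i, j) pairs that are true contacts.
    min_separation=24 is an assumed definition of "long-range".
    """
    long_range = [
        (pair, score)
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) >= min_separation
    ]
    long_range.sort(key=lambda item: item[1], reverse=True)
    top = long_range[: max(1, length // fraction)]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in native_contacts)
    return hits / len(top)

# Hypothetical predictions for a 40-residue protein (L/10 = 4 pairs evaluated).
toy_scores = {
    (0, 30): 0.9,
    (2, 35): 0.8,
    (5, 39): 0.7,
    (1, 26): 0.6,
    (3, 28): 0.5,
    (10, 12): 0.99,  # short-range pair, ignored despite its high score
}
toy_native = {(0, 30), (5, 39), (3, 28)}

print(top_l_precision(toy_scores, toy_native, length=40))  # 0.5
```

Only the highest-ranked pairs are scored, so the metric rewards a predictor for being right about its most confident long-range calls rather than for overall coverage.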
Affiliation(s)
- Kolja Stahl
  - Robotics and Biology Laboratory, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Marchstraße 23, Berlin, 10587 Germany
- Michael Schneider
  - Robotics and Biology Laboratory, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Marchstraße 23, Berlin, 10587 Germany
- Oliver Brock
  - Robotics and Biology Laboratory, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Marchstraße 23, Berlin, 10587 Germany
9
Berezovsky IN, Guarnera E, Zheng Z. Basic units of protein structure, folding, and function. Progress in Biophysics and Molecular Biology 2016; 128:85-99. [PMID: 27697476 DOI: 10.1016/j.pbiomolbio.2016.09.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 07/29/2016] [Revised: 09/05/2016] [Accepted: 09/26/2016] [Indexed: 10/20/2022]
Abstract
Study of the hierarchy of domain structure with alternative sets of domains and analysis of discontinuous domains, consisting of remote segments of the polypeptide chain, raised a question about the minimal structural unit of the protein domain. The hypothesis on the decisive role of the polypeptide backbone in determining the elementary units of globular proteins has led to the discovery of closed loops. It is reviewed here how closed loops form the loop-n-lock structure of proteins, providing the foundation for stability and designability of protein folds/domains and underlying their co-translational folding. Simplified protein sequences are considered here with the aim of exploring the basic principles that presumably dominated the folding and stability of proteins in the early stages of structural evolution. Elementary functional loops (EFLs), closed loops with one or a few catalytic residues, are, in turn, units of the protein function. They are apparent descendants of the prebiotic ring-like peptides, which gave rise to the first functional folds/domains being fused in the beginning of the evolution of protein structure. It is also shown how evolutionary relations between protein functional superfamilies and folds delineated with the help of EFLs can contribute to establishing the rules for design of desired enzymatic functions. Generalized descriptors of the elementary functions are proposed to be used as basic units in the future computational design.
Affiliation(s)
- Igor N Berezovsky
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore; Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, 117579, Singapore
- Enrico Guarnera
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
- Zejun Zheng
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
10
Prediction of change in protein unfolding rates upon point mutations in two state proteins. Biochimica et Biophysica Acta-Proteins and Proteomics 2016; 1864:1104-1109. [DOI: 10.1016/j.bbapap.2016.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2016] [Revised: 05/05/2016] [Accepted: 06/01/2016] [Indexed: 11/23/2022]
11
Nagarajan R, Archana A, Thangakani AM, Jemimah S, Velmurugan D, Gromiha MM. PDBparam: Online Resource for Computing Structural Parameters of Proteins. Bioinform Biol Insights 2016; 10:73-80. [PMID: 27330281 PMCID: PMC4909059 DOI: 10.4137/bbi.s38423] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 01/19/2016] [Revised: 04/20/2016] [Accepted: 04/24/2016] [Indexed: 02/07/2023] [Open Access]
Abstract
Understanding the structure-function relationship in proteins is a longstanding goal in molecular and computational biology. The development of structure-based parameters has helped to relate the structure with the function of a protein. Although several structural features have been reported in the literature, no single server can calculate a wide-ranging set of structure-based features from protein three-dimensional structures. In this work, we have developed a web-based tool, PDBparam, for computing more than 50 structure-based features for any given protein structure. These features are classified into four major categories: (i) interresidue interactions, which include short-, medium-, and long-range interactions, contact order, long-range order, total contact distance, contact number, and multiple contact index, (ii) secondary structure propensities such as α-helical propensity, β-sheet propensity, and propensity of amino acids to exist at various positions of α-helix and amino acid compositions in high B-value regions, (iii) physicochemical properties containing ionic interactions, hydrogen bond interactions, hydrophobic interactions, disulfide interactions, aromatic interactions, surrounding hydrophobicity, and buriedness, and (iv) identification of binding site residues in protein-protein, protein-nucleic acid, and protein-ligand complexes. The server can be freely accessed at http://www.iitm.ac.in/bioinfo/pdbparam/. We suggest the use of PDBparam as an effective tool for analyzing protein structures.
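Two of the inter-residue parameters listed in this abstract, contact order and long-range order, reduce to simple sums over a contact list. The sketch below uses the commonly cited definitions (relative contact order as the mean sequence separation of contacts divided by chain length, and long-range order as the number of contacts separated by more than 12 residues divided by chain length). These definitions and thresholds are assumptions here and may differ in detail from what the PDBparam server implements; the contact list is hypothetical.

```python
def relative_contact_order(contacts, length):
    """Mean sequence separation of contacting pairs, normalised by chain length."""
    if not contacts:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * length)

def long_range_order(contacts, length, min_separation=12):
    """Number of contacts with |i - j| > min_separation, per residue.

    min_separation=12 is an assumed threshold for "long-range".
    """
    long_range = sum(1 for i, j in contacts if abs(i - j) > min_separation)
    return long_range / length

# Hypothetical contact list for a 20-residue chain.
toy_contacts = [(0, 4), (2, 18), (5, 10), (1, 15)]

print(relative_contact_order(toy_contacts, length=20))  # 0.4875
print(long_range_order(toy_contacts, length=20))        # 0.1
```

The contact list itself would normally come from a distance cutoff applied to a three-dimensional structure, so both values depend on that cutoff as well as on the separation threshold.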
Affiliation(s)
- R. Nagarajan
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- A. Archana
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- A. Mary Thangakani
  - CAS in Crystallography and Biophysics, University of Madras, Chennai, India
  - Bioinformatics Infrastructure Facility, University of Madras, Chennai, India
- S. Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- D. Velmurugan
  - CAS in Crystallography and Biophysics, University of Madras, Chennai, India
  - Bioinformatics Infrastructure Facility, University of Madras, Chennai, India
- M. Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
12
Contribution of main chain and side chain atoms and their locations to the stability of thermophilic proteins. J Mol Graph Model 2016; 64:85-93. [DOI: 10.1016/j.jmgm.2016.01.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 11/23/2015] [Accepted: 01/03/2016] [Indexed: 11/21/2022]
13
Corrales M, Cuscó P, Usmanova DR, Chen HC, Bogatyreva NS, Filion GJ, Ivankov DN. Machine Learning: How Much Does It Tell about Protein Folding Rates? PLoS One 2015; 10:e0143166. [PMID: 26606303 PMCID: PMC4659572 DOI: 10.1371/journal.pone.0143166] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 07/19/2015] [Accepted: 11/02/2015] [Indexed: 11/18/2022] [Open Access]
Abstract
The prediction of protein folding rates is a necessary step towards understanding the principles of protein folding. Due to the increasing amount of experimental data, numerous protein folding models and predictors of protein folding rates have been developed in the last decade. The problem has also attracted the attention of scientists from computational fields, which led to the publication of several machine learning-based models to predict the rate of protein folding. Some of them claim to predict the logarithm of protein folding rate with an accuracy greater than 90%. However, there are reasons to believe that such claims are exaggerated due to large fluctuations and overfitting of the estimates. When we confronted three selected published models with new data, we found a much lower predictive power than reported in the original publications. Overly optimistic predictive powers appear from violations of the basic principles of machine-learning. We highlight common misconceptions in the studies claiming excessive predictive power and propose to use learning curves as a safeguard against those mistakes. As an example, we show that the current amount of experimental data is insufficient to build a linear predictor of logarithms of folding rates based on protein amino acid composition.
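The learning curves proposed in this abstract as a safeguard compare training error and held-out error as the training set grows. The sketch below does this for a one-feature least-squares line on synthetic data; the data generator, the noise level, and the helper names are invented for illustration and are not the authors' protocol or data.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(xs, ys, slope, intercept):
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def learning_curve(train, test, sizes):
    """Return (n, training MSE, test MSE) for models fitted on the first n points."""
    test_x, test_y = zip(*test)
    curve = []
    for n in sizes:
        xs, ys = zip(*train[:n])
        slope, intercept = fit_line(xs, ys)
        curve.append((n, mse(xs, ys, slope, intercept),
                      mse(test_x, test_y, slope, intercept)))
    return curve

# Synthetic stand-in for (descriptor, log folding rate) pairs; purely illustrative.
rng = random.Random(0)
data = []
for _ in range(60):
    x = rng.uniform(0.0, 10.0)
    data.append((x, -0.5 * x + rng.gauss(0.0, 1.0)))
train, test = data[:40], data[40:]

for n, train_err, test_err in learning_curve(train, test, [2, 5, 10, 20, 40]):
    print(f"n={n:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With only two training points the line fits them exactly, so the training error is essentially zero regardless of how well the model generalizes. A persistent gap between the two curves at the largest available training size is the kind of warning sign the abstract refers to.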
Affiliation(s)
- Marc Corrales
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Pol Cuscó
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Dinara R. Usmanova
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Heng-Chang Chen
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Natalya S. Bogatyreva
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Guillaume J. Filion
- Genome Architecture, Gene Regulation, Stem Cells and Cancer Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Dmitry N. Ivankov
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
- Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Moscow Region, Russia
|
14
|
Chaudhary P, Naganathan AN, Gromiha MM. Folding RaCe: a robust method for predicting changes in protein folding rates upon point mutations. Bioinformatics 2015; 31:2091-7. [PMID: 25686635 DOI: 10.1093/bioinformatics/btv091] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/11/2014] [Accepted: 02/10/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Protein engineering methods are commonly employed to decipher the folding mechanism of proteins and enzymes. However, such experiments are exceedingly time- and resource-intensive. It would therefore be advantageous to develop a simple computational tool to predict changes in folding rates upon mutations. Such a method should be able to rapidly provide the sequence position and chemical nature to modulate through mutation, to effect a particular change in rate. This can be of importance in protein folding, function or mechanistic studies. RESULTS: We have developed a robust knowledge-based methodology to predict the changes in folding rates upon mutations, formulated from amino acid properties using a multiple linear regression approach. We benchmarked this method against an experimental database of 790 point mutations from 26 two-state proteins. Mutants were first classified according to secondary structure, accessible surface area and position along the primary sequence. Three prime amino acid features eliciting the best relationship with folding rate change were then shortlisted for each class along with an optimized window length. We obtained a self-consistent mean absolute error (MAE) of 0.36 s⁻¹ and a mean Pearson correlation coefficient (PCC) of 0.81. The jack-knife test resulted in an MAE of 0.42 s⁻¹ and a PCC of 0.73. Moreover, our method highlights the importance of outlier detection and of studying the implications of outliers for the folding mechanism. AVAILABILITY AND IMPLEMENTATION: A web server 'Folding RaCe' has been developed and is available at http://www.iitm.ac.in/bioinfo/proteinfolding/foldingrace.html. CONTACT: gromiha@iitm.ac.in SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Priyashree Chaudhary
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- Athi N Naganathan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
|
15
|
Ruiz-Blanco YB, Marrero-Ponce Y, Prieto PJ, Salgado J, García Y, Sotomayor-Torres CM. A Hooke's law-based approach to protein folding rate. J Theor Biol 2015; 364:407-17. [DOI: 10.1016/j.jtbi.2014.09.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/04/2014] [Revised: 08/28/2014] [Accepted: 09/02/2014] [Indexed: 10/24/2022]
|
16
|
Dehzangi A, Paliwal K, Lyons J, Sharma A, Sattar A. A Segmentation-Based Method to Extract Structural and Evolutionary Features for Protein Fold Recognition. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:510-519. [PMID: 26356019 DOI: 10.1109/tcbb.2013.2296317] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 06/05/2023]
Abstract
Protein fold recognition (PFR) is considered an important step towards solving the protein structure prediction problem. Despite all the efforts that have been made so far, finding an accurate and fast computational approach to solve PFR remains a challenging problem for bioinformatics and computational biology. In this study, we propose a segmentation-based feature extraction technique to provide local evolutionary information embedded in the position-specific scoring matrix (PSSM) and structural information embedded in the secondary structure of proteins predicted using SPINE-X. We also employ the concept of occurrence features to extract global discriminatory information from PSSM and SPINE-X. By applying a support vector machine (SVM) to our extracted features, we enhance the protein fold prediction accuracy by 7.4 percent over the best results reported in the literature. We also report 73.8 percent prediction accuracy for a data set consisting of proteins with less than 25 percent sequence similarity and 80.7 percent prediction accuracy for a data set with proteins belonging to 110 folds with less than 40 percent sequence similarity. We also investigate the relation between the number of folds and the number of features being used and show that the number of features should be increased to get better protein fold prediction results when the number of folds is relatively large.
|
17
|
Tasdighian S, Di Paola L, De Ruvo M, Paci P, Santoni D, Palumbo P, Mei G, Di Venere A, Giuliani A. Modules Identification in Protein Structures: The Topological and Geometrical Solutions. J Chem Inf Model 2013; 54:159-68. [DOI: 10.1021/ci400218v] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 11/29/2022]
Affiliation(s)
- Setareh Tasdighian
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, K.L. Ledeganckstraat 35, B-9000 Ghent, Belgium
- Luisa Di Paola
- Faculty of Engineering, Università CAMPUS BioMedico, Via A. del Portillo, 21, 00128 Roma, Italy
- Micol De Ruvo
- CNR-Institute of Systems Analysis and Computer Science (IASI), viale Manzoni 30, 00185 Roma, Italy
- Paola Paci
- CNR-Institute of Systems Analysis and Computer Science (IASI), viale Manzoni 30, 00185 Roma, Italy
- Daniele Santoni
- Department of Experimental Medicine and Surgery, University of Rome “Tor Vergata”, via Montpellier 1, 00133 Rome, Italy
- Pasquale Palumbo
- CNR-Institute of Systems Analysis and Computer Science (IASI), viale Manzoni 30, 00185 Roma, Italy
- Department of Experimental Medicine and Surgery, University of Rome “Tor Vergata”, via Montpellier 1, 00133 Rome, Italy
- Giampiero Mei
- Department of Experimental Medicine and Surgery, University of Rome “Tor Vergata”, via Montpellier 1, 00133 Rome, Italy
- Almerinda Di Venere
- Environment and Health Department, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Roma, Italy
- Alessandro Giuliani
- Environment and Health Department, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Roma, Italy
|
18
|
Compiani M, Capriotti E. Computational and theoretical methods for protein folding. Biochemistry 2013; 52:8601-24. [PMID: 24187909 DOI: 10.1021/bi4001529] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/12/2022]
Abstract
A computational approach is essential whenever the complexity of the process under study is such that direct theoretical or experimental approaches are not viable. This is the case for protein folding, for which a significant amount of data are being collected. This paper reports on the essential role of in silico methods and the unprecedented interplay of computational and theoretical approaches, which is a defining point of the interdisciplinary investigations of the protein folding process. Besides giving an overview of the available computational methods and tools, we argue that computation plays not merely an ancillary role but has a more constructive function in that computational work may precede theory and experiments. More precisely, computation can provide the primary conceptual clues to inspire subsequent theoretical and experimental work even in a case where no preexisting evidence or theoretical frameworks are available. This is cogently manifested in the application of machine learning methods to come to grips with the folding dynamics. These close relationships suggested complementing the review of computational methods within the appropriate theoretical context to provide a self-contained outlook of the basic concepts that have converged into a unified description of folding and have grown in a synergic relationship with their computational counterpart. Finally, the advantages and limitations of current computational methodologies are discussed to show how the smart analysis of large amounts of data and the development of more effective algorithms can improve our understanding of protein folding.
Affiliation(s)
- Mario Compiani
- School of Sciences and Technology, University of Camerino, Camerino, Macerata 62032, Italy
|
19
|
Dehzangi A, Paliwal K, Sharma A, Dehzangi O, Sattar A. A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:564-75. [PMID: 24091391 DOI: 10.1109/tcbb.2013.65] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 05/25/2023]
Abstract
Better understanding of structural class of a given protein reveals important information about its overall folding type and its domain. It can also be directly used to provide critical information on general tertiary structure of a protein which has a profound impact on protein function determination and drug design. Despite tremendous enhancements made by pattern recognition-based approaches to solve this problem, it still remains as an unsolved issue for bioinformatics that demands more attention and exploration. In this study, we propose a novel feature extraction model that incorporates physicochemical and evolutionary-based information simultaneously. We also propose overlapped segmented distribution and autocorrelation-based feature extraction methods to provide more local and global discriminatory information. The proposed feature extraction methods are explored for 15 most promising attributes that are selected from a wide range of physicochemical-based attributes. Finally, by applying an ensemble of different classifiers namely, Adaboost.M1, LogitBoost, naive Bayes, multilayer perceptron (MLP), and support vector machine (SVM) we show enhancement of the protein structural class prediction accuracy for four popular benchmarks.
|
20
|
Gromiha MM, Pathak MC, Saraboji K, Ortlund EA, Gaucher EA. Hydrophobic environment is a key factor for the stability of thermophilic proteins. Proteins 2013; 81:715-21. [PMID: 23319168 DOI: 10.1002/prot.24232] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Received: 09/05/2012] [Revised: 11/16/2012] [Accepted: 11/28/2012] [Indexed: 11/07/2022]
Affiliation(s)
- M Michael Gromiha
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India.
|
21
|
Dehzangi A, Paliwal K, Lyons J, Sharma A, Sattar A. Enhancing Protein Fold Prediction Accuracy Using Evolutionary and Structural Features. Pattern Recognition in Bioinformatics 2013. [DOI: 10.1007/978-3-642-39159-0_18] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 12/24/2022]
|
22
|
Di Paola L, De Ruvo M, Paci P, Santoni D, Giuliani A. Protein Contact Networks: An Emerging Paradigm in Chemistry. Chem Rev 2012. [DOI: 10.1021/cr3002356] [Citation(s) in RCA: 173] [Impact Index Per Article: 14.4] [Indexed: 12/23/2022]
Affiliation(s)
- L. Di Paola
- Faculty of Engineering, Università CAMPUS BioMedico, Via A. del Portillo, 21, 00128 Roma, Italy
- D. Santoni
- BioMathLab, CNR-Institute of Systems Analysis and Computer Science (IASI), viale Manzoni 30, 00185 Roma, Italy
- A. Giuliani
- Environment and Health Department, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161, Roma, Italy
|
23
|
Real value prediction of protein folding rate change upon point mutation. J Comput Aided Mol Des 2012; 26:339-47. [DOI: 10.1007/s10822-012-9560-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/21/2011] [Accepted: 03/02/2012] [Indexed: 10/28/2022]
|
24
|
Zou T, Ozkan SB. Local and non-local native topologies reveal the underlying folding landscape of proteins. Phys Biol 2011; 8:066011. [DOI: 10.1088/1478-3975/8/6/066011] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
|
25
|
Zhang H, Zhang T, Gao J, Ruan J, Shen S, Kurgan L. Determination of protein folding kinetic types using sequence and predicted secondary structure and solvent accessibility. Amino Acids 2010; 42:271-83. [DOI: 10.1007/s00726-010-0805-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 02/17/2010] [Accepted: 11/01/2010] [Indexed: 10/18/2022]
|
26
|
Gao J, Zhang T, Zhang H, Shen S, Ruan J, Kurgan L. Accurate prediction of protein folding rates from sequence and sequence-derived residue flexibility and solvent accessibility. Proteins 2010; 78:2114-30. [PMID: 20455267 DOI: 10.1002/prot.22727] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023]
Abstract
Protein folding rates vary by several orders of magnitude and they depend on the topology of the fold and the size and composition of the sequence. Although recent works show that the rates can be predicted from the sequence, allowing for high-throughput annotations, they consider only the sequence and its predicted secondary structure. We propose a novel sequence-based predictor, PFR-AF, which utilizes solvent accessibility and residue flexibility predicted from the sequence, to improve predictions and provide insights into the folding process. The predictor includes three linear regressions for proteins with two-state, multistate, and unknown (mixed-state) folding kinetics. PFR-AF on average outperforms current methods when tested on three datasets. The proposed approach provides high-quality predictions in the absence of similarity between the predicted and the training sequences. The PFR-AF's predictions are characterized by high (between 0.71 and 0.95, depending on the dataset) correlation and the lowest (between 0.75 and 0.9) mean absolute errors with respect to the experimental rates, as measured using out-of-sample tests. Our models reveal that for the two-state chains inclusion of solvent-exposed Ala may accelerate the folding, while increased content of Ile may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with larger content of solvent-exposed strands may fold at a slower pace. The increased flexibility of the solvent-exposed residues is shown to elongate folding, which also holds, with a lower correlation, for buried residues. Two case studies are included to support our findings.
Affiliation(s)
- Jianzhao Gao
- College of Mathematics and LPMC, Nankai University, Tianjin, People's Republic of China
|
27
|
Hamacher K. Efficient quantification of the importance of contacts for the dynamical stability of proteins. J Comput Chem 2010; 32:810-5. [PMID: 20957707 DOI: 10.1002/jcc.21659] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/06/2010] [Revised: 07/12/2010] [Accepted: 08/05/2010] [Indexed: 11/07/2022]
Abstract
Understanding the stability of the native state and the dynamics of a protein is of great importance for all areas of biomolecular design. The efficient estimation of the influence of individual contacts between amino acids in a protein structure is a first step in the reengineering of a particular protein for technological or pharmacological purposes. At the same time, the functional annotation of molecular evolution can be facilitated by such insight. Here, we use a recently suggested information-theoretical measure in biomolecular design, the Kullback-Leibler divergence, to quantify and therefore rank residue-residue contacts within proteins according to their overall contribution to the molecular mechanics. We implement this protocol on the basis of a reduced molecular model, which allows us to use a well-known lemma of linear algebra to speed up the computation. The increase in computational performance is around 10¹- to 10⁴-fold. We applied the method to two proteins to illustrate the protocol and its results. We found that our method can reliably identify key residues in the molecular mechanics and the protein fold in comparison to well-known properties in the serine protease inhibitor. We found significant correlations to experimental results, e.g., dissociation constants and Φ values.
|
28
|
Huang LT, Gromiha MM. First insight into the prediction of protein folding rate change upon point mutation. Bioinformatics 2010; 26:2121-7. [DOI: 10.1093/bioinformatics/btq350] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
|
29
|
Xi L, Li S, Liu H, Li J, Lei B, Yao X. Global and local prediction of protein folding rates based on sequence autocorrelation information. J Theor Biol 2010; 264:1159-68. [DOI: 10.1016/j.jtbi.2010.03.042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/04/2009] [Revised: 03/28/2010] [Accepted: 03/29/2010] [Indexed: 11/24/2022]
|
30
|
Sun W, He J. Understanding on the residue contact network using the log-normal cluster model and the multilevel wheel diagram. Biopolymers 2010; 93:904-16. [DOI: 10.1002/bip.21494] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]
|