1
Harihar B, Saravanan KM, Gromiha MM, Selvaraj S. Importance of Inter-residue Contacts for Understanding Protein Folding and Unfolding Rates, Remote Homology, and Drug Design. Mol Biotechnol 2024:10.1007/s12033-024-01119-4. [PMID: 38498284 DOI: 10.1007/s12033-024-01119-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/10/2024] [Indexed: 03/20/2024]
Abstract
Inter-residue interactions in protein structures provide valuable insights into protein folding and stability. Understanding these interactions can be helpful in many crucial applications, including rational design of therapeutic small molecules and biologics, locating functional protein sites, and predicting protein-protein and protein-ligand interactions. The process of developing machine learning models incorporating inter-residue interactions has been improved recently. This review highlights the theoretical models incorporating inter-residue interactions in predicting folding and unfolding rates of proteins. Utilizing contact maps to depict inter-residue interactions aids researchers in developing computer models for detecting remote homologs and interface residues within protein-protein complexes which, in turn, enhances our knowledge of the relationship between sequence and structure of proteins. Further, the application of contact maps derived from inter-residue interactions is highlighted in the field of drug discovery. Overall, this review presents an extensive assessment of the significant models that use inter-residue interactions to investigate folding rates, unfolding rates, remote homology, and drug development, providing potential future advancements in constructing efficient computational models in structural biology.
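As an illustration of how a contact map of the kind discussed in this abstract can be derived from coordinates, here is a minimal Python sketch. It is not taken from the review: the function name `contact_map`, the 8 Å cutoff, the minimum sequence separation, and the toy coordinates are all assumptions chosen for illustration.

```python
import numpy as np

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a boolean residue-residue contact map.

    coords: (N, 3) array with one representative atom per residue
            (for example C-alpha); which atom to use is an assumption.
    cutoff: distance threshold in angstroms (8.0 is an assumed default).
    min_separation: minimum sequence separation |i - j| for a pair to count.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sequence separation between residue indices.
    idx = np.arange(len(coords))
    sep = np.abs(idx[:, None] - idx[None, :])
    return (dist <= cutoff) & (sep >= min_separation)

# Hypothetical five-residue chain, spaced 3.8 angstroms apart.
toy_coords = np.array([[0.0, 0.0, 0.0],
                       [3.8, 0.0, 0.0],
                       [7.6, 0.0, 0.0],
                       [11.4, 0.0, 0.0],
                       [11.4, 3.8, 0.0]])
toy_map = contact_map(toy_coords, cutoff=8.0, min_separation=2)
```

The resulting matrix is symmetric, so each contact appears twice; taking the upper triangle gives the list of unique residue pairs.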
Affiliation(s)
- Balasubramanian Harihar
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Konda Mani Saravanan
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, Tamil Nadu, 600073, India
- Michael M Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Samuel Selvaraj
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, 620024, India
2
Casier R, Duhamel J. Appraisal of blob-Based Approaches in the Prediction of Protein Folding Times. J Phys Chem B 2023; 127:8852-8859. [PMID: 37793094 DOI: 10.1021/acs.jpcb.3c04958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/06/2023]
Abstract
A series of reports published in the last 3 years has illustrated that a blob-based model (BBM) can predict the folding time of proteins from their primary amino acid (aa) sequence based on three simple rules established to characterize the long-range backbone dynamics (LRBD) of racemic polypeptides. The sole use of LRBD to predict protein folding times with the BBM represents a radical departure from all other prediction methods currently applied to determine protein folding times, which rely instead on parameters such as the structure content, folding kinetics, chain length, amino acid properties, or contact topography of proteins. Furthermore, the built-in modularity of the BBM enables the parametrization and inclusion of new phenomena affecting the LRBD of polypeptides, while its conceptual simplicity makes it an interesting new mathematical tool for studying protein folding. However, its novelty implies that its relationship with many other methods used to predict protein folding times has not been well researched. Consequently, the purpose of this report is to uncover the physical phenomena encountered during protein folding that are best described by the BBM through the identification of parameters that have been recognized over the years as being strong predictors for protein folding, such as protein size, topology, structural class, and folding kinetics. This was accomplished by determining the parameters most strongly correlated with the folding times predicted by the BBM. While the BBM in its present form appears to be a good indicator of the folding times of the vast majority of the 195 proteins considered so far, this report finds that it excels for moderately large proteins that are primarily composed of locally formed structural motifs such as α-helices or for proteins that fold in multiple steps. 
Altogether, these observations based on the use of the BBM support the notion that proteins fold the way they do because the LRBD of polypeptides is mostly driven by the local interactions experienced between aa's within reach of one another.
Affiliation(s)
- Remi Casier
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Jean Duhamel
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
3
Casier R, Duhamel J. Synergetic Effects of Alanine and Glycine in Blob-Based Methods for Predicting Protein Folding Times. J Phys Chem B 2023; 127:1325-1337. [PMID: 36749707 DOI: 10.1021/acs.jpcb.2c08155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
The polypeptide PGlyAlaGlu was prepared with 20 mol % glycine (Gly), 36 mol % d,l-alanine (Ala), and 44 mol % d,l-glutamic acid (Glu) and labeled with the dye 1-pyrenemethylamine to yield a series of Py-PGlyAlaGlu samples. The fluorescence decays of the Py-PGlyAlaGlu samples were analyzed according to the fluorescence blob model (FBM) to obtain the number $N_{blob}^{exp}$ of amino acids (aa's) encompassed inside the subvolume $V_{blob}$ of the polypeptide probed by an excited pyrene. An $N_{blob}^{exp}$ value of 29 (±2) was retrieved for Py-PGlyAlaGlu, which was much larger than for either of the copolypeptides PGlyGlu or PAlaGlu, prepared with Gly and Glu or Ala and Glu, respectively. The continuous increase in $N_{blob}^{exp}$ with decreasing side chain size (SCS) from 10 aa's for PGlu to 16 aa's for PAlaGlu and 22 aa's for PGlyGlu was used earlier to define the reach of an aa and determine the groups of aa's that could interact with each other along a polypeptide backbone according to their SCS. These groups of aa's, referred to as blobs, led to the implementation of blob-based models (BBM) to predict the folding time $\tau_F^{theo,BBM}$ of 145 proteins, which was found to match their experimental folding time $\tau_F^{exp}$ with a relatively high 0.71 correlation coefficient. Nevertheless, the much higher $N_{blob}^{exp}$ value found for Py-PGlyAlaGlu compared to all other pyrene-labeled polypeptides studied to date indicates that the reach of aa's along a polypeptide sequence is affected not only by SCS but also by synergetic effects between different aa's. Following this new insight, a revised BBM was implemented to predict $\tau_F^{theo,BBM}$ for 195 proteins assuming the existence or absence of synergies to control the interactions between aa's along a polypeptide sequence. Similarly good correlation coefficients of 0.71 and 0.74 were obtained for a direct 1:1 comparison of $\tau_F^{exp}$ and $\tau_F^{theo,BBM}$ for the 195 proteins without and with synergies, respectively. This result suggests that synergetic effects between different aa's have little effect on the $\tau_F^{theo,BBM}$ predicted from the BBM, underscoring the robustness of this methodology.
Affiliation(s)
- Remi Casier
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Jean Duhamel
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
4
The protein folding rate and the geometry and topology of the native state. Sci Rep 2022; 12:6384. [PMID: 35430582 PMCID: PMC9013383 DOI: 10.1038/s41598-022-09924-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Accepted: 03/21/2022] [Indexed: 11/08/2022]
Abstract
Proteins fold in 3-dimensional conformations which are important for their function. Characterizing the global conformation of proteins rigorously and separating secondary structure effects from topological effects is a challenge. New developments in applied knot theory make it possible to characterize the topological characteristics of proteins (knotted or not). By analyzing a small set of two-state and multi-state proteins with no knots or slipknots, our results show that 95.4% of the analyzed proteins have non-trivial topological characteristics, as reflected by the second Vassiliev measure, and that the logarithm of the experimental protein folding rate depends on both the local geometry and the topology of the protein’s native state.
5
Signorini LF, Perego C, Potestio R. Protein self-entanglement modulates successful folding to the native state: A multi-scale modeling study. J Chem Phys 2021; 155:115101. [PMID: 34551527 DOI: 10.1063/5.0063254] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]
Abstract
The computer-aided investigation of protein folding has greatly benefited from coarse-grained models, that is, simplified representations at a resolution level lower than atomistic, providing access to qualitative and quantitative details of the folding process that would be hardly attainable, via all-atom descriptions, for medium to long molecules. Nonetheless, the effectiveness of low-resolution models is itself hampered by the presence, in a small but significant number of proteins, of nontrivial topological self-entanglements. Features such as native state knots or slipknots introduce conformational bottlenecks, affecting the probability to fold into the correct conformation; this limitation is particularly severe in the context of coarse-grained models. In this work, we tackle the relationship between folding probability, protein folding pathway, and protein topology in a set of proteins with a nontrivial degree of topological complexity. To avoid or mitigate the risk of incurring in kinetic traps, we make use of the elastic folder model, a coarse-grained model based on angular potentials optimized toward successful folding via a genetic procedure. This light-weight representation allows us to estimate in silico folding probabilities, which we find to anti-correlate with a measure of topological complexity as well as to correlate remarkably well with experimental measurements of the folding rate. These results strengthen the hypothesis that the topological complexity of the native state decreases the folding probability and that the force-field optimization mimics the evolutionary process these proteins have undergone to avoid kinetic traps.
Affiliation(s)
- Lorenzo Federico Signorini
  - The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
  - Department of Physics, University of Trento, Trento, Italy
- Claudio Perego
  - Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Manno, Switzerland
  - Polymer Theory Department, Max Planck Institute for Polymer Research, Mainz, Germany
- Raffaello Potestio
  - Department of Physics, University of Trento, Trento, Italy
  - INFN-TIFPA, Trento Institute for Fundamental Physics and Applications, Trento, Italy
6
Casier R, Duhamel J. Blob-Based Predictions of Protein Folding Times from the Amino Acid-Dependent Conformation of Polypeptides in Solution. Macromolecules 2021. [DOI: 10.1021/acs.macromol.0c02617] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Affiliation(s)
- Remi Casier
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Jean Duhamel
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
7
Casier R, Duhamel J. Blob-Based Approach to Estimate the Folding Time of Proteins Supported by Pyrene Excimer Fluorescence Experiments. Macromolecules 2020. [DOI: 10.1021/acs.macromol.0c02201] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Remi Casier
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Jean Duhamel
  - Institute for Polymer Research, Waterloo Institute for Nanotechnology, Department of Chemistry, University of Waterloo, Waterloo, ON N2L 3G1, Canada
8
Kuwajima K. The Molten Globule, and Two-State vs. Non-Two-State Folding of Globular Proteins. Biomolecules 2020; 10:407. [PMID: 32155758 PMCID: PMC7175247 DOI: 10.3390/biom10030407] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/19/2020] [Revised: 03/03/2020] [Accepted: 03/06/2020] [Indexed: 11/16/2022]
Abstract
From experimental studies of protein folding, it is now clear that there are two types of folding behavior, i.e., two-state folding and non-two-state folding, and understanding the relationships between these apparently different folding behaviors is essential for fully elucidating the molecular mechanisms of protein folding. This article describes how the presence of the two types of folding behavior has been confirmed experimentally, and discusses the relationships between the two-state and the non-two-state folding reactions, on the basis of available data on the correlations of the folding rate constant with various structure-based properties, which are determined primarily by the backbone topology of proteins. Finally, a two-stage hierarchical model is proposed as a general mechanism of protein folding. In this model, protein folding occurs in a hierarchical manner, reflecting the hierarchy of the native three-dimensional structure, as embodied in the case of non-two-state folding with an accumulation of the molten globule state as a folding intermediate. The two-state folding is thus merely a simplified version of the hierarchical folding caused either by an alteration in the rate-limiting step of folding or by destabilization of the intermediate.
Affiliation(s)
- Kunihiro Kuwajima
  - Department of Physics, School of Science, the University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan; Tel.: +81-90-5435-6540
  - School of Computational Sciences, Korea Institute for Advanced Study (KIAS), Seoul 02455, Korea
9
Khor S. Folding with a protein's native shortcut network. Proteins 2019; 86:924-934. [PMID: 29790602 DOI: 10.1002/prot.25524] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/30/2017] [Revised: 04/13/2018] [Accepted: 05/14/2018] [Indexed: 11/09/2022]
Abstract
A complex network approach to protein folding is proposed, wherein a protein's contact map is reconceptualized as a network of shortcut edges, and folding is steered by a structural characteristic of this network. Shortcut networks are generated by a known message passing algorithm operating on protein residue networks. It is found that the shortcut networks of native structures (SCN0s) are relevant graph objects with which to study protein folding at a formal level. The logarithm form of their contact order (SCN0_lnCO) correlates significantly with folding rate of two-state and non-two-state proteins. The clustering coefficient of SCN0s (C_SCN0) correlates significantly with folding rate, transition-state placement and stability of two-state folders. Reasonable folding pathways for several model proteins are produced when C_SCN0 is used to combine protein segments incrementally to form the native structure. The folding bias captured by C_SCN0 is detectable in non-native structures, as evidenced by Molecular Dynamics simulation generated configurations for the fast folding Villin-headpiece peptide. These results support the use of shortcut networks to investigate the role protein geometry plays in the folding of both small and large globular proteins, and have implications for the design of multibody interaction schemes in folding models. One facet of this geometry is the set of native shortcut triangles, whose attributes are found to be well-suited to identify dehydrated intraprotein areas in tight turns, or at the interface of different secondary structure elements.
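The clustering coefficient mentioned in this abstract is a standard graph quantity, and a small Python sketch may help make it concrete. The sketch below computes the average local clustering coefficient of any undirected edge list. It does not reproduce the message passing algorithm that generates shortcut networks; the function name `average_clustering` and the toy edge list are assumptions for illustration only.

```python
from itertools import combinations

def average_clustering(edges):
    """Average local clustering coefficient of an undirected graph.

    edges: iterable of (u, v) node pairs; self-loops are ignored.
    Nodes with fewer than two neighbours contribute 0 to the average.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among the neighbours of this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

# Hypothetical network: a triangle (1, 2, 3) with a pendant node 4.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
toy_clustering = average_clustering(toy_edges)
```

In the toy graph, nodes 1 and 2 have coefficient 1, node 3 has 1/3, and node 4 has 0, so the average is 7/12.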
Affiliation(s)
- Susan Khor
  - Department of Computer Science, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador, Canada
10
11
Prediction of change in protein unfolding rates upon point mutations in two state proteins. Biochim Biophys Acta Proteins Proteom 2016; 1864:1104-1109. [DOI: 10.1016/j.bbapap.2016.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2016] [Revised: 05/05/2016] [Accepted: 06/01/2016] [Indexed: 11/23/2022]
12
Network measures for protein folding state discrimination. Sci Rep 2016; 6:30367. [PMID: 27464796 PMCID: PMC4964642 DOI: 10.1038/srep30367] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/06/2016] [Accepted: 06/24/2016] [Indexed: 11/09/2022]
Abstract
Proteins fold using two-state or multi-state kinetic mechanisms, but up to now there has been no first-principles model to explain this difference in behavior. We exploit the network properties of protein structures by introducing novel observables to address the problem of classifying the different types of folding kinetics. These observables display a plain physical meaning, in terms of vibrational modes, possible configurations compatible with the native protein structure, and folding cooperativity. The relevance of these observables is supported by a classification performance up to 90%, even with simple classifiers such as discriminant analysis.
13
Ruiz-Blanco YB, Paz W, Green J, Marrero-Ponce Y. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins. BMC Bioinformatics 2015; 16:162. [PMID: 25982853 PMCID: PMC4432771 DOI: 10.1186/s12859-015-0586-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/15/2015] [Accepted: 04/22/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long-sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution. RESULTS We here introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principal component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information. CONCLUSIONS ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language and is freely available at: http://bioinf.sce.carleton.ca/ProtDCal/ . ProtDCal introduces local and group-based encoding which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the features-based characterization of polypeptide systems. This software is intended to provide a useful tool for general-purpose encoding of protein sequences and structures for applications in protein classification, similarity analyses and function prediction.
Affiliation(s)
- Yasser B Ruiz-Blanco
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
- Waldo Paz
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba
  - Centre of Informatics Studies (CEI), Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba
- James Green
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
- Yovani Marrero-Ponce
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba
  - Grupo de Investigación Microbiología y Ambiente (GIMA), Programa de Bacteriología, Facultad Ciencias de la Salud, Universidad de San Buenaventura, Calle Real de Ternera, Cartagena (Bolivar), Colombia
14
Chaudhary P, Naganathan AN, Gromiha MM. Folding RaCe: a robust method for predicting changes in protein folding rates upon point mutations. Bioinformatics 2015; 31:2091-7. [PMID: 25686635 DOI: 10.1093/bioinformatics/btv091] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/11/2014] [Accepted: 02/10/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION Protein engineering methods are commonly employed to decipher the folding mechanism of proteins and enzymes. However, such experiments are exceedingly time and resource intensive. It would therefore be advantageous to develop a simple computational tool to predict changes in folding rates upon mutations. Such a method should be able to rapidly provide the sequence position and chemical nature to modulate through mutation, to effect a particular change in rate. This can be of importance in protein folding, function or mechanistic studies. RESULTS We have developed a robust knowledge-based methodology to predict the changes in folding rates upon mutations, formulated from amino acid properties using a multiple linear regression approach. We benchmarked this method against an experimental database of 790 point mutations from 26 two-state proteins. Mutants were first classified according to secondary structure, accessible surface area and position along the primary sequence. Three prime amino acid features eliciting the best relationship with folding rate change were then shortlisted for each class along with an optimized window length. We obtained a self-consistent mean absolute error (MAE) of 0.36 s$^{-1}$ and a mean Pearson correlation coefficient (PCC) of 0.81. A jack-knife test resulted in an MAE of 0.42 s$^{-1}$ and a PCC of 0.73. Moreover, our method highlights the importance of outlier detection and of studying the implications of outliers for the folding mechanism. AVAILABILITY AND IMPLEMENTATION A web server 'Folding RaCe' has been developed and is available at http://www.iitm.ac.in/bioinfo/proteinfolding/foldingrace.html. CONTACT gromiha@iitm.ac.in SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
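The evaluation scheme described in this abstract (multiple linear regression scored by MAE and PCC, with a jack-knife test) can be sketched generically in Python. This is not the Folding RaCe implementation: the helper names, the two synthetic features, and the noise-free synthetic target are all assumptions, and the jack-knife is implemented here as plain leave-one-out refitting.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... (intercept included)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def jackknife_predictions(X, y):
    """Leave-one-out predictions: refit without sample k, then predict k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for k in range(len(y)):
        mask = np.arange(len(y)) != k
        coef = fit_linear(X[mask], y[mask])
        preds[k] = predict(coef, X[k:k + 1])[0]
    return preds

def mean_absolute_error(y, p):
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(p))))

def pearson(y, p):
    return float(np.corrcoef(y, p)[0, 1])

# Synthetic, noise-free demo data (hypothetical, not the 790-mutation set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]
loo = jackknife_predictions(X_demo, y_demo)
demo_mae = mean_absolute_error(y_demo, loo)
demo_pcc = pearson(y_demo, loo)
```

Because the demo target is an exact linear function of the features, the leave-one-out error is essentially zero; with real data the jack-knife MAE is expected to exceed the self-consistent MAE, as the abstract reports.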
Affiliation(s)
- Priyashree Chaudhary
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- Athi N Naganathan
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600 036, India
15
Ruiz-Blanco YB, Marrero-Ponce Y, Prieto PJ, Salgado J, García Y, Sotomayor-Torres CM. A Hooke's law-based approach to protein folding rate. J Theor Biol 2015; 364:407-17. [DOI: 10.1016/j.jtbi.2014.09.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/04/2014] [Revised: 08/28/2014] [Accepted: 09/02/2014] [Indexed: 10/24/2022]
16
Krobath H, Rey A, Faísca PFN. How determinant is N-terminal to C-terminal coupling for protein folding? Phys Chem Chem Phys 2015; 17:3512-24. [DOI: 10.1039/c4cp05178e] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/21/2022]
Abstract
The existence of native interactions between the protein termini is a major determinant of the free energy barrier in a two-state folding transition and is therefore a critical modulator of protein folding cooperativity.
Affiliation(s)
- Heinrich Krobath
  - Centro de Física da Matéria Condensada and Departamento de Física, Faculdade de Ciências da Universidade de Lisboa, Portugal
- Antonio Rey
  - Departamento de Química Física I, Facultad de Ciencias Químicas, Universidad Complutense, Madrid, Spain
- Patrícia F. N. Faísca
  - Centro de Física da Matéria Condensada and Departamento de Física, Faculdade de Ciências da Universidade de Lisboa, Portugal
17
Broom A, Gosavi S, Meiering EM. Protein unfolding rates correlate as strongly as folding rates with native structure. Protein Sci 2014; 24:580-7. [PMID: 25422093 DOI: 10.1002/pro.2606] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 09/10/2014] [Revised: 11/03/2014] [Accepted: 11/04/2014] [Indexed: 01/19/2023]
Abstract
Although the folding rates of proteins have been studied extensively, both experimentally and theoretically, and many native state topological parameters have been proposed to correlate with or predict these rates, unfolding rates have received much less attention. Moreover, unfolding rates have generally been thought either to not relate to native topology in the same manner as folding rates, perhaps depending on different topological parameters, or to be more difficult to predict. Using a dataset of 108 proteins including two-state and multistate folders, we find that both unfolding and folding rates correlate strongly, and comparably well, with well-established measures of native topology, the absolute contact order and the long range order, with correlation coefficient values of 0.75 or higher. In addition, compared to folding rates, the absolute values of unfolding rates vary more strongly with native topology, have a larger range of values, and correlate better with thermodynamic stability. Similar trends are observed for subsets of different protein structural classes. Taken together, these results suggest that choosing a scaffold for protein engineering may require a compromise between a simple topology that will fold sufficiently quickly but also unfold quickly, and a complex topology that will unfold slowly and hence have kinetic stability, but fold slowly. These observations, together with the established role of kinetic stability in determining resistance to thermal and chemical denaturation as well as proteases, have important implications for understanding fundamental aspects of protein unfolding and folding and for protein engineering and design.
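The two topology measures named in this abstract can be sketched in a few lines of Python, given a list of residue pairs in contact. The definitions used here are the commonly cited ones and are assumptions as far as this abstract is concerned: absolute contact order as the mean sequence separation over contacts, and long range order as the number of contacts separated by more than 12 residues divided by the chain length. The function names and the toy contact list are hypothetical.

```python
def absolute_contact_order(contacts):
    """Mean sequence separation |i - j| over all contacting residue pairs."""
    if not contacts:
        return 0.0
    return sum(abs(i - j) for i, j in contacts) / len(contacts)

def long_range_order(contacts, n_residues, min_separation=12):
    """Count of contacts with |i - j| > min_separation, per residue.

    min_separation=12 is an assumed default based on the commonly
    cited definition; adjust it if a different threshold is intended.
    """
    long_range = sum(1 for i, j in contacts if abs(i - j) > min_separation)
    return long_range / n_residues

# Hypothetical contact list for a 30-residue chain.
toy_contacts = [(1, 5), (2, 20), (3, 30), (10, 14)]
toy_aco = absolute_contact_order(toy_contacts)
toy_lro = long_range_order(toy_contacts, n_residues=30)
```

For the toy list the separations are 4, 18, 27 and 4, so the absolute contact order is 13.25, and two of the four contacts count as long range, giving a long range order of 2/30.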
Affiliation(s)
- Aron Broom
  - Department of Chemistry, Guelph-Waterloo Centre for Graduate Studies in Chemistry and Biochemistry, University of Waterloo, Waterloo, Ontario, Canada, N2L 1W2
18
Compiani M, Capriotti E. Computational and theoretical methods for protein folding. Biochemistry 2013; 52:8601-24. [PMID: 24187909 DOI: 10.1021/bi4001529] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/12/2022]
Abstract
A computational approach is essential whenever the complexity of the process under study is such that direct theoretical or experimental approaches are not viable. This is the case for protein folding, for which a significant amount of data are being collected. This paper reports on the essential role of in silico methods and the unprecedented interplay of computational and theoretical approaches, which is a defining point of the interdisciplinary investigations of the protein folding process. Besides giving an overview of the available computational methods and tools, we argue that computation plays not merely an ancillary role but has a more constructive function in that computational work may precede theory and experiments. More precisely, computation can provide the primary conceptual clues to inspire subsequent theoretical and experimental work even in a case where no preexisting evidence or theoretical frameworks are available. This is cogently manifested in the application of machine learning methods to come to grips with the folding dynamics. These close relationships suggested complementing the review of computational methods within the appropriate theoretical context to provide a self-contained outlook of the basic concepts that have converged into a unified description of folding and have grown in a synergic relationship with their computational counterpart. Finally, the advantages and limitations of current computational methodologies are discussed to show how the smart analysis of large amounts of data and the development of more effective algorithms can improve our understanding of protein folding.
Affiliation(s)
- Mario Compiani
- School of Sciences and Technology, University of Camerino, Camerino, Macerata 62032, Italy
|
19
|
Kaya H, Uzunoğlu Z, Chan HS. Spatial ranges of driving forces are a key determinant of protein folding cooperativity and rate diversity. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2013; 88:044701. [PMID: 24229309 DOI: 10.1103/physreve.88.044701] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/24/2013] [Revised: 08/21/2013] [Indexed: 06/02/2023]
Abstract
The physical basis of two-state-like folding transitions and the tremendous diversity in folding rates is elucidated by directly simulating the folding kinetics of 52 representative proteins. Relative to the results from a common modeling approach, the diversity of the simulated folding rates can be increased from ~10^2.1 to the experimental ~10^6.0 by a modest decrease in the spatial range of the attractive potential. The required theoretical range is consistent with desolvation physics and is notably much more permissive than that needed for two-state-like homopolymer collapse.
Affiliation(s)
- Hüseyin Kaya
- Department of Biophysics, Faculty of Medicine, University of Gaziantep, 27310 Gaziantep, Turkey
|
20
|
Das A, Sin BK, Mohazab AR, Plotkin SS. Unfolded protein ensembles, folding trajectories, and refolding rate prediction. J Chem Phys 2013; 139:121925. [DOI: 10.1063/1.4817215] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/27/2022] Open
|
21
|
Glyakina AV, Pereyaslavets LB, Galzitskaya OV. Right- and left-handed three-helix proteins. I. Experimental and simulation analysis of differences in folding and structure. Proteins 2013; 81:1527-41. [DOI: 10.1002/prot.24301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/13/2012] [Revised: 03/27/2013] [Accepted: 03/28/2013] [Indexed: 11/08/2022]
Affiliation(s)
- Anna V. Glyakina
- Institute of Protein Research; Russian Academy of Sciences; Pushchino, Moscow Region 142290 Russia
- Institute of Mathematical Problems of Biology; Russian Academy of Sciences; Pushchino, Moscow Region 142290 Russia
- Leonid B. Pereyaslavets
- Institute of Protein Research; Russian Academy of Sciences; Pushchino, Moscow Region 142290 Russia
- Oxana V. Galzitskaya
- Institute of Protein Research; Russian Academy of Sciences; Pushchino, Moscow Region 142290 Russia
|
22
|
Galzitskaya OV, Glyakina AV. Nucleation-based prediction of the protein folding rate and its correlation with the folding nucleus size. Proteins 2012; 80:2711-27. [DOI: 10.1002/prot.24156] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/13/2012] [Revised: 07/19/2012] [Accepted: 07/21/2012] [Indexed: 11/08/2022]
|
23
|
Real value prediction of protein folding rate change upon point mutation. J Comput Aided Mol Des 2012; 26:339-47. [DOI: 10.1007/s10822-012-9560-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/21/2011] [Accepted: 03/02/2012] [Indexed: 10/28/2022]
|
24
|
Zou T, Ozkan SB. Local and non-local native topologies reveal the underlying folding landscape of proteins. Phys Biol 2011; 8:066011. [DOI: 10.1088/1478-3975/8/6/066011] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
|
25
|
Harihar B, Selvaraj S. Application of long-range order to predict unfolding rates of two-state proteins. Proteins 2010; 79:880-7. [DOI: 10.1002/prot.22925] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/23/2010] [Revised: 10/07/2010] [Accepted: 10/24/2010] [Indexed: 01/09/2023]
|
26
|
Hamacher K. Efficient quantification of the importance of contacts for the dynamical stability of proteins. J Comput Chem 2010; 32:810-5. [PMID: 20957707 DOI: 10.1002/jcc.21659] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 01/06/2010] [Revised: 07/12/2010] [Accepted: 08/05/2010] [Indexed: 11/07/2022]
Abstract
Understanding the stability of the native state and the dynamics of a protein is of great importance for all areas of biomolecular design. The efficient estimation of the influence of individual contacts between amino acids in a protein structure is a first step in the reengineering of a particular protein for technological or pharmacological purposes. At the same time, the functional annotation of molecular evolution can be facilitated by such insight. Here, we use a recently suggested, information theoretical measure in biomolecular design - the Kullback-Leibler-divergence - to quantify and therefore rank residue-residue contacts within proteins according to their overall contribution to the molecular mechanics. We implement this protocol on the basis of a reduced molecular model, which allows us to use a well-known lemma of linear algebra to speed up the computation. The increase in computational performance is around 10^1- to 10^4-fold. We applied the method to two proteins to illustrate the protocol and its results. We found that our method can reliably identify key residues in the molecular mechanics and the protein fold in comparison to well-known properties in the serine protease inhibitor. We found significant correlations to experimental results, e.g., dissociation constants and Φ values.
|
27
|
Topological Quantities Determining the Folding/Unfolding Rate of Two-state Folding Proteins. J SOLUTION CHEM 2010. [DOI: 10.1007/s10953-010-9556-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
|
28
|
Huang LT, Gromiha MM. First insight into the prediction of protein folding rate change upon point mutation. Bioinformatics 2010; 26:2121-7. [DOI: 10.1093/bioinformatics/btq350] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
|
29
|
Harihar B, Selvaraj S. Refinement of the long-range order parameter in predicting folding rates of two-state proteins. Biopolymers 2009; 91:928-35. [DOI: 10.1002/bip.21281] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
|
30
|
Ferguson A, Liu Z, Chan HS. Desolvation Barrier Effects Are a Likely Contributor to the Remarkable Diversity in the Folding Rates of Small Proteins. J Mol Biol 2009; 389:619-36. [DOI: 10.1016/j.jmb.2009.04.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Received: 03/03/2009] [Revised: 04/01/2009] [Accepted: 04/06/2009] [Indexed: 11/25/2022]
|
31
|
Gromiha MM. Multiple Contact Network Is a Key Determinant to Protein Folding Rates. J Chem Inf Model 2009; 49:1130-5. [DOI: 10.1021/ci800440x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]
Affiliation(s)
- M. Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
|
32
|
Huang LT, Gromiha MM. Analysis and prediction of protein folding rates using quadratic response surface models. J Comput Chem 2008; 29:1675-83. [PMID: 18351617 DOI: 10.1002/jcc.20925] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
Understanding the relationship between amino acid sequences and folding rates of proteins is an important task in computational and molecular biology. In this work, we have systematically analyzed the composition of amino acid residues for proteins with different ranges of folding rates. We observed that the polar residues, Asn, Gln, Ser, and Lys, are dominant in fast folding proteins whereas the hydrophobic residues, Ala, Cys, Gly, and Leu, prefer to be in slow folding proteins. Further, we have developed a method based on quadratic response surface models for predicting the folding rates of 77 two- and three-state proteins. Our method showed a correlation of 0.90 between experimental and predicted protein folding rates using leave-one-out cross-validation method. The classification of proteins based on structural class improved the correlation to 0.98 and it is 0.99, 0.98, and 0.96, respectively, for all-alpha, all-beta, and mixed class proteins. In addition, we have utilized Bayesian classification theory for discriminating two- and three-state proteins, which showed an accuracy of 90%. We have developed a web server for predicting protein folding rates and it is available at http://bioinformatics.myweb.hinet.net/foldrate.htm.
Affiliation(s)
- Liang-Tsung Huang
- Department of Computer Science and Information Engineering, Ming-Dao University, Changhua 523, Taiwan
|
33
|
Weikl TR. Loop-closure principles in protein folding. Arch Biochem Biophys 2008; 469:67-75. [PMID: 17662688 DOI: 10.1016/j.abb.2007.06.018] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 03/28/2007] [Revised: 06/20/2007] [Accepted: 06/22/2007] [Indexed: 10/23/2022]
Abstract
Simple theoretical concepts and models have been helpful to understand the folding rates and routes of single-domain proteins. As reviewed in this article, a physical principle that appears to underlie these models is loop closure.
Affiliation(s)
- Thomas R Weikl
- Max Planck Institute of Colloids and Interfaces, Department of Theory and Bio-Systems, 14424 Potsdam, Germany.
|
34
|
Bruscolini P, Pelizzola A, Zamparo M. Rate determining factors in protein model structures. PHYSICAL REVIEW LETTERS 2007; 99:038103. [PMID: 17678333 DOI: 10.1103/physrevlett.99.038103] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 03/20/2007] [Indexed: 05/16/2023]
Abstract
Previous research has shown a strong correlation of protein folding rates to the native state geometry, yet a complete explanation for this dependence is still lacking. Here we study the rate-geometry relationship with a simple statistical physics model, and focus on two classes of model geometries, representing ideal parallel and antiparallel structures. We find that the logarithm of the rate shows an almost perfect linear correlation with the "absolute contact order", but the slope depends on the particular class considered. We discuss these findings in the light of experimental results.
Affiliation(s)
- Pierpaolo Bruscolini
- Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, c. Corona de Aragón 42, 50009 Zaragoza, Spain.
|
35
|
Ma BG, Chen LL, Zhang HY. What determines protein folding type? An investigation of intrinsic structural properties and its implications for understanding folding mechanisms. J Mol Biol 2007; 370:439-48. [PMID: 17524416 DOI: 10.1016/j.jmb.2007.04.051] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 02/23/2007] [Revised: 04/08/2007] [Accepted: 04/18/2007] [Indexed: 12/01/2022]
Abstract
Protein folding experiments demonstrate that the folding behaviors of many proteins can be roughly classified into two types: two-state kinetics and multi-state kinetics. Although the two types of protein folding kinetics have been observed for a long time, what determines the folding type of a protein is still largely unclear. The present work performed a comparative study based on a dataset of 43 two-state and 42 multi-state folders at different levels of proteins' intrinsic properties from the simplest sequence length to native structure topology. The results show that protein's amino acids composition and the long-range interaction-based topological complexity rather than secondary structure contents are the major determinants of protein folding type. Furthermore, a sequence-based folding type prediction achieved an accuracy of more than 80%. These findings implicate that there is no clear boundary between secondary and tertiary structure formation during the protein folding process and support the existence of a continuum of folding mechanism between the two ends of hierarchic and nucleation folding scenarios.
Affiliation(s)
- Bin-Guang Ma
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, Shandong University of Technology, Zibo 255049, PR China.
|
36
|
Gromiha MM, Thangakani AM, Selvaraj S. FOLD-RATE: prediction of protein folding rates from amino acid sequence. Nucleic Acids Res 2006; 34:W70-4. [PMID: 16845101 PMCID: PMC1538837 DOI: 10.1093/nar/gkl043] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.2] [Indexed: 11/13/2022] Open
Abstract
We have developed a web server, FOLD-RATE, for predicting the folding rates of proteins from their amino acid sequences. The relationship between amino acid properties and protein folding rates has been systematically analyzed and a statistical method based on linear regression technique has been proposed for predicting the folding rate of proteins. We found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two and three-state proteins. Consequently, different regression equations have been developed for proteins belonging to all-alpha, all-beta and mixed class. We observed an excellent agreement between predicted and experimentally observed folding rates of proteins; the correlation coefficients are, 0.99, 0.97 and 0.90, respectively, for all-alpha, all-beta and mixed class proteins. The prediction server is freely available at http://psfs.cbrc.jp/fold-rate/.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
37
|
Ma BG, Guo JX, Zhang HY. Direct correlation between proteins' folding rates and their amino acid compositions: An ab initio folding rate prediction. Proteins 2006; 65:362-72. [PMID: 16937389 DOI: 10.1002/prot.21140] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 01/02/2023]
Abstract
Discovering the mechanism of protein folding, in molecular biology, is a great challenge. A key step to this end is to find factors that correlate with protein folding rates. Over the past few years, many empirical parameters, such as contact order, long-range order, total contact distance, secondary structure contents, have been developed to reflect the correlation between folding rates and protein tertiary or secondary structures. However, the correlation between proteins' folding rates and their amino acid compositions has not been explored. In the present work, we examined systematically the correlation between proteins' folding rates and their amino acid compositions for two-state and multistate folders and found that different amino acids contributed differently to the folding progress. The relation between the amino acids' molecular weight and degeneracy and the folding rates was examined, and the role of hydrophobicity in the protein folding process was also inspected. As a consequence, a new indicator called composition index was derived, which takes no structure factors into account and is merely determined by the amino acid composition of a protein. Such an indicator is found to be highly correlated with the protein's folding rate (r > 0.7). From the results of this work, three points of concluding remarks are evident. (1) Two-state folders and multistate folders have different rate-determining amino acids. (2) The main determining information of a protein's folding rate is largely reflected in its amino acid composition. (3) Composition index may be the best predictor for an ab initio protein folding rate prediction directly from protein sequence from the standpoint of practical application.
Affiliation(s)
- Bin-Guang Ma
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, Shandong University of Technology, Zibo 255049, People's Republic of China.
|
38
|
Dixit PD, Weikl TR. A simple measure of native‐state topology and chain connectivity predicts the folding rates of two‐state proteins with and without crosslinks. Proteins 2006; 64:193-7. [PMID: 16596570 DOI: 10.1002/prot.20976] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Abstract
The folding rates of two-state proteins have been found to correlate with simple measures of native-state topology. The most prominent among these measures is the relative contact order (CO), which is the average CO, or localness, of all contacts in the native protein structure, divided by the chain length. Here, we test whether such measures can be generalized to capture the effect of chain crosslinks on the folding rate. Crosslinks change the chain connectivity and therefore also the localness of some of the native contacts. These changes in localness can be taken into account by the graph-theoretical concept of effective contact order (ECO). The relative ECO, however, the natural extension of the relative CO for proteins with crosslinks, overestimates the changes in the folding rates caused by crosslinks. We suggest here a novel measure of native-state topology, the relative logCO, and its natural extension, the relative logECO. The relative logCO is the average value for the logarithm of the CO of all contacts, divided by the logarithm of the chain length. The relative log(E)CO reproduces the folding rates of a set of 26 two-state proteins without crosslinks with essentially the same high correlation coefficient as the relative CO. In addition, it also captures the folding rates of eight two-state proteins with crosslinks.
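The two topology measures defined in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say how native contacts are identified, so the functions below take a precomputed list of residue-index pairs, the CO of a contact is assumed to be the sequence separation |i - j|, and the function names and the toy contact list are hypothetical.

```python
import math


def relative_co(contacts, chain_length):
    """Mean sequence separation |i - j| over all contacts, divided by chain length."""
    if not contacts:
        raise ValueError("at least one contact is required")
    separations = [abs(i - j) for i, j in contacts]
    return sum(separations) / len(separations) / chain_length


def relative_log_co(contacts, chain_length):
    """Mean log of the sequence separation, divided by the log of the chain length."""
    if not contacts:
        raise ValueError("at least one contact is required")
    separations = [abs(i - j) for i, j in contacts]
    mean_log = sum(math.log(s) for s in separations) / len(separations)
    return mean_log / math.log(chain_length)


# Hypothetical toy input: three contacts in a 10-residue chain (not real protein data).
toy_contacts = [(1, 5), (2, 10), (3, 7)]
toy_length = 10

toy_rel_co = relative_co(toy_contacts, toy_length)
toy_rel_log_co = relative_log_co(toy_contacts, toy_length)
print(toy_rel_co, toy_rel_log_co)
```

Because the logarithm compresses large separations, a single long-range contact shifts the relative logCO less than it shifts the relative CO. The sketch does not cover the effective contact order (ECO) for crosslinked chains, which needs a shortest-path computation on the contact graph.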
Affiliation(s)
- Purushottam D Dixit
- Max Planck Institute of Colloids and Interfaces, Theory Department, Potsdam, Germany
|
39
|
Gromiha MM, Selvaraj S, Thangakani AM. A Statistical Method for Predicting Protein Unfolding Rates from Amino Acid Sequence. J Chem Inf Model 2006; 46:1503-8. [PMID: 16711769 DOI: 10.1021/ci050417u] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The prediction of protein unfolding rates from amino acid sequences is one of the most important challenges in computational biology and chemistry. The analysis on the relationship between protein unfolding rates and physical-chemical, energetic, and conformational properties of amino acid residues provides valuable information to understand and predict the unfolding rates of two- and three-state proteins. We found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and unfolding rates of two- and three-state proteins, indicating the importance of native-state topology in determining the protein unfolding rates. We have formulated three independent linear regression equations to different structural classes of proteins for predicting their unfolding rates from amino acid sequences and obtained an excellent agreement between predicted and experimentally observed unfolding rates of proteins; the correlation coefficients are 0.999, 0.990, and 0.992, respectively, for all-alpha, all-beta, and mixed-class proteins. Further, we have derived a general equation applicable to all structural classes of proteins, which can be used for predicting the unfolding rates for proteins of an unknown structural class. We observed a correlation of 0.987 and 0.930, respectively, for back-check and jack-knife tests. These accuracy levels are better than those of other methods in the literature.
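The two validation modes named in this abstract can be illustrated with a short Python sketch. It assumes that "back-check" means scoring the regression on the same data used to fit it and that "jack-knife" means refitting with each protein left out in turn. It uses a single made-up descriptor rather than the amino acid properties used in the paper, and every name and number below is hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def back_check(xs, ys):
    """Fit on all data, predict the same data, correlate with observations."""
    slope, intercept = fit_line(xs, ys)
    predicted = [slope * x + intercept for x in xs]
    return pearson(ys, predicted)


def jack_knife(xs, ys):
    """Leave each point out, fit on the rest, predict the held-out point."""
    predicted = []
    for k in range(len(xs)):
        train_x = xs[:k] + xs[k + 1:]
        train_y = ys[:k] + ys[k + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predicted.append(slope * xs[k] + intercept)
    return pearson(ys, predicted)


# Synthetic descriptor values and rates, invented for illustration only.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

r_back = back_check(toy_x, toy_y)
r_jack = jack_knife(toy_x, toy_y)
print(r_back, r_jack)
```

The jack-knife figure is usually the more conservative of the two, since each prediction comes from a model that never saw the held-out point.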
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Tokyo 135-0064, Japan.
|
40
|
Weikl TR. Loop-closure events during protein folding: rationalizing the shape of Phi-value distributions. Proteins 2006; 60:701-11. [PMID: 16021610 DOI: 10.1002/prot.20504] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
In the past years, the folding kinetics of many small single-domain proteins has been characterized by mutational Phi-value analysis. In this article, a simple, essentially parameter-free model is introduced which derives folding routes from native structures by minimizing the entropic loop-closure cost during folding. The model predicts characteristic folding sequences of structural elements such as helices and beta-strand pairings. Based on few simple rules, the kinetic impact of these structural elements is estimated from the routes and compared to average experimental Phi-values for the helices and strands of 15 small, well-characterized proteins. The comparison leads on average to a correlation coefficient of 0.62 for all proteins with polarized Phi-value distributions, and 0.74 if distributions with negative average Phi-values are excluded. The diffuse Phi-value distributions of the remaining proteins are reproduced correctly. The model shows that Phi-value distributions, averaged over secondary structural elements, can often be traced back to entropic loop-closure events, but also indicates energetic preferences in the case of a few proteins governed by parallel folding processes.
Affiliation(s)
- Thomas R Weikl
- Max-Planck-Institut für Kolloid- und Grenzflächenforschung, Potsdam, Germany.
|
41
|
Das P, Wilson CJ, Fossati G, Wittung-Stafshede P, Matthews KS, Clementi C. Characterization of the folding landscape of monomeric lactose repressor: quantitative comparison of theory and experiment. Proc Natl Acad Sci U S A 2005; 102:14569-74. [PMID: 16203982 PMCID: PMC1253569 DOI: 10.1073/pnas.0505844102] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 05/18/2005] [Indexed: 12/22/2022] Open
Abstract
Recent theoretical/computational studies based on simplified protein models and experimental investigation have suggested that the native structure of a protein plays a primary role in determining the folding rate and mechanism of relatively small single-domain proteins. Here, we extend the study of the relationship between protein topology and folding mechanism to a larger protein with complex topology, by analyzing the folding process of monomeric lactose repressor (MLAc) computationally by using a Gō-like Cα model. Next, we combine simulation and experimental results (see companion article in this issue) to achieve a comprehensive assessment of the folding landscape of this protein. Remarkably, simulated kinetic and equilibrium analyses show an excellent quantitative agreement with the experimental folding data of this study. The results of this comparison show that a simplified, completely unfrustrated Cα model correctly reproduces the complex folding features of a large multidomain protein with complex topology. The success of this effort underlines the importance of synergistic experimental/theoretical approaches to achieve a broader understanding of the folding landscape.
Affiliation(s)
- Payel Das
- Department of Chemistry, Rice University, Houston, TX 77005, USA
|
42
|
Wallin S, Chan HS. A critical assessment of the topomer search model of protein folding using a continuum explicit-chain model with extensive conformational sampling. Protein Sci 2005; 14:1643-60. [PMID: 15930009 PMCID: PMC2253387 DOI: 10.1110/ps.041317705] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 10/25/2022]
Abstract
Recently, a series of closely related theoretical constructs termed the "topomer search model" (TSM) has been proposed for the folding mechanism of small, single-domain proteins. A basic assumption of the proposed scenarios is that the rate-limiting step in folding is an essentially unbiased, diffusive search for a conformational state called the native topomer defined by an overall native-like topological pattern. Successes in correlating TSM-predicted folding rates with that of real proteins have been interpreted as experimental support for the model. To better delineate the physics entailed, key TSM concepts are examined here using extensive Langevin dynamics simulations of continuum Cα chain models. The theoretical native topomers of four experimentally well-studied two-state proteins are characterized. Consistent with the TSM perspective, we found that the sizes of the native topomers increase with experimental folding rate. However, a careful determination of the corresponding probabilities that the native topomers are populated during a random search fails to reproduce the previously predicted folding rates. Instead, our results indicate that an unbiased TSM search for the native topomer amounts to a Levinthal-like process that would take an impossibly long average time to complete. Furthermore, intraprotein contacts in all four native topomers considered exhibit no apparent correlation with the experimental phi-values determined from the folding kinetics of these proteins. Thus, the present findings suggest that certain basic, generic yet essential energetic features in protein folding are not accounted for by TSM scenarios to date.
Affiliation(s)
- Stefan Wallin
- Department of Biochemistry, University of Toronto, 1 King's College Circle, Toronto, Ontario M5S 1A8, Canada
|
43
|
Gromiha MM. A Statistical Model for Predicting Protein Folding Rates from Amino Acid Sequence with Structural Class Information. J Chem Inf Model 2005; 45:494-501. [PMID: 15807515 DOI: 10.1021/ci049757q] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.6] [Indexed: 11/29/2022]
Abstract
Prediction of protein folding rates from amino acid sequences is one of the most important challenges in molecular biology. In this work, I have related the protein folding rates with physical-chemical, energetic and conformational properties of amino acid residues. I found that the classification of proteins into different structural classes shows an excellent correlation between amino acid properties and folding rates of two- and three-state proteins, indicating the importance of native state topology in determining the protein folding rates. I have formulated a simple linear regression model for predicting the protein folding rates from amino acid sequences along with structural class information and obtained an excellent agreement between predicted and experimentally observed folding rates of proteins; the correlation coefficients are 0.99, 0.96 and 0.95, respectively, for all-alpha, all-beta and mixed class proteins. This is the first available method, which is capable of predicting the protein folding rates just from the amino acid sequence with the aid of generic amino acid properties and structural class information.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
44
|
Oztop B, Ejtehadi MR, Plotkin SS. Protein folding rates correlate with heterogeneity of folding mechanism. PHYSICAL REVIEW LETTERS 2004; 93:208105. [PMID: 15600977 DOI: 10.1103/physrevlett.93.208105] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 06/14/2004] [Indexed: 05/24/2023]
Abstract
By observing trends in the folding kinetics of experimental 2-state proteins at their transition midpoints, and by observing trends in the barrier heights of numerous simulations of coarse-grained, Cα model Go proteins, we show that folding rates correlate with the degree of heterogeneity in the formation of native contacts. Statistically significant correlations are observed between folding rates and measures of heterogeneity inherent in the native topology, as well as between rates and the variance in the distribution of either experimentally measured or simulated phi values.
Affiliation(s)
- B Oztop
- Department of Physics and Astronomy, University of British Columbia, Vancouver, BC V6T-1Z1, Canada
|
45
|
Matysiak S, Clementi C. Optimal combination of theory and experiment for the characterization of the protein folding landscape of S6: how far can a minimalist model go? J Mol Biol 2004; 343:235-48. [PMID: 15381433 DOI: 10.1016/j.jmb.2004.08.006] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Received: 04/29/2004] [Revised: 07/30/2004] [Accepted: 08/03/2004] [Indexed: 11/28/2022]
Abstract
The detailed characterization of the overall free energy landscape associated with the folding process of a protein is the ultimate goal in protein folding studies. Modern experimental techniques provide accurate thermodynamic and kinetic measurements on restricted regions of a protein landscape. Although simplified protein models can access larger regions of the landscape, they are oftentimes built on assumptions and approximations that affect the accuracy of the results. We present a new methodology that allows to combine the complementary strengths of theory and experiment for a more complete characterization of a protein folding landscape. We prove that this new procedure allows a simplified protein model to reproduce remarkably well (correlation coefficient > 0.9) all experimental data available on free energies differences upon single mutations for S6 ribosomal protein and two circular permutants. Our results confirm and quantify the hypothesis, recently formulated on the basis of experimental data, that the folding landscape of protein S6 is strongly affected by an atypical distribution of contact energies.
Affiliation(s)
- Silvina Matysiak
- Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
46
|
Chavez LL, Onuchic JN, Clementi C. Quantifying the roughness on the free energy landscape: entropic bottlenecks and protein folding rates. J Am Chem Soc 2004; 126:8426-32. [PMID: 15237999 DOI: 10.1021/ja049510+] [Citation(s) in RCA: 190] [Impact Index Per Article: 9.5] [Indexed: 11/29/2022]
Abstract
The prediction of protein folding rates and mechanisms is currently of great interest in the protein folding community. A close comparison between theory and experiment in this area promises to advance our understanding of the physical-chemical principles governing the folding process. The delicate interplay of entropic and energetic/enthalpic factors in the protein free energy regulates the details of this complex reaction. In this article, we propose the use of topological descriptors to quantify the amount of heterogeneity in the configurational entropy contribution to the free energy. We apply the procedure to a set of 16 two-state folding proteins. The results offer a clean and simple theoretical explanation for the experimentally measured folding rates and mechanisms, in terms of the intrinsic entropic roughness along the populated folding routes on the protein free energy landscape.
Affiliation(s)
- Leslie L Chavez
- Center for Theoretical Biological Physics and Department of Physics, University of California at San Diego, La Jolla, California 92093, USA
|
47
|
Tiana G, Simona F, De Mori GMS, Broglia RA, Colombo G. Understanding the determinants of stability and folding of small globular proteins from their energetics. Protein Sci 2004; 13:113-24. [PMID: 14691227 PMCID: PMC2286534 DOI: 10.1110/ps.03223804] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Indexed: 10/26/2022]
Abstract
The results of minimal model calculations indicate that the stability and the kinetic accessibility of the native state of small globular proteins are controlled by a few "hot" sites. By means of molecular dynamics simulations around the native conformation, which describe the protein and the surrounding solvent at the all-atom level, an accurate and compact energetic map of the native state of the protein is generated. This map is further simplified by means of an eigenvalue decomposition. The components of the eigenvector associated with the lowest eigenvalue indicate which hot sites are likely to be responsible for the stability and for the rapid folding of the protein. The comparison of the results of the model with the findings of mutagenesis experiments performed for four small proteins shows that the eigenvalue decomposition method is able to identify between 60% and 80% of these (hot) sites.
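The eigenvector step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the energetic map is available as a symmetric residue-by-residue matrix of interaction energies, finds the eigenvector of the lowest eigenvalue by shifted power iteration (a technique chosen here only to avoid external libraries), and ranks residues by the magnitude of their eigenvector components. The function names and the 3 x 3 toy matrix are hypothetical.

```python
import math

def lowest_eigenpair(matrix, iterations=1000):
    """Return (eigenvalue, unit eigenvector) for the lowest eigenvalue of a
    symmetric matrix, using power iteration on (shift * I - matrix)."""
    n = len(matrix)
    # Gershgorin bound: shift is at least as large as every eigenvalue,
    # so the lowest eigenvalue of `matrix` becomes the dominant one.
    shift = max(sum(abs(x) for x in row) for row in matrix)
    vec = [1.0 + 0.01 * k for k in range(n)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [shift * vec[i] - sum(matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, vec

def rank_sites(vec):
    """Residue indices ordered by decreasing |eigenvector component|."""
    return sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)

# Hypothetical toy energy map (arbitrary units), not data from the paper.
toy_energy_map = [
    [-3.0, -1.0, 0.0],
    [-1.0, -2.0, 0.0],
    [0.0, 0.0, -0.5],
]
eigval, eigvec = lowest_eigenpair(toy_energy_map)
ranking = rank_sites(eigvec)
```

In this toy case the first two residues dominate the lowest-eigenvalue eigenvector, so they would be the candidate hot sites under the stated assumptions.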
Affiliation(s)
- Guido Tiana
- Department of Physics, University of Milano, 20133 Milano, Italy
|
48
|
Kamagata K, Arai M, Kuwajima K. Unification of the Folding Mechanisms of Non-two-state and Two-state Proteins. J Mol Biol 2004; 339:951-65. [PMID: 15165862 DOI: 10.1016/j.jmb.2004.04.015] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Received: 12/08/2003] [Revised: 04/01/2004] [Accepted: 04/01/2004] [Indexed: 11/15/2022]
Abstract
We have collected the kinetic folding data for non-two-state and two-state globular proteins reported in the literature, and investigated the relationships between the folding kinetics and the native three-dimensional structure of these proteins. The rate constants of formation of both the intermediate and the native state of non-two-state folders were found to be significantly correlated with protein chain length and native backbone topology, which is represented by the absolute contact order and sequence-distant native pairs. The folding rate of two-state folders, which is known to be correlated with the native backbone topology, apparently does not correlate significantly with protein chain length. On the basis of a comparison of the folding rates of the non-two-state and two-state folders, it was found that they are similarly dependent on the parameters that reflect the native backbone topology. This suggests that the mechanisms behind non-two-state and two-state folding are essentially identical. The present results lead us to propose a unified mechanism of protein folding, in which folding occurs in a hierarchical manner, reflecting the hierarchy of the native three-dimensional structure, as embodied in the case of non-two-state folding with an accumulation of the intermediate. Apparently, two-state folding is merely a simplified version of hierarchical folding caused either by an alteration in the rate-limiting step of folding or by destabilization of the intermediate.
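Absolute contact order, one of the topology parameters named in this abstract, can be computed from a list of native contacts. The sketch below is a minimal illustration, not the authors' procedure: it assumes the commonly used definition in which absolute contact order is the mean sequence separation of contacting residue pairs, and relative contact order is that value divided by chain length. The function name and the toy contact list are hypothetical.

```python
def contact_orders(contacts, chain_length):
    """Return (relative_co, absolute_co) for a list of (i, j) residue pairs.

    absolute_co = mean |i - j| over all contacts
    relative_co = absolute_co / chain_length
    """
    if not contacts:
        raise ValueError("at least one contact is required")
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    total_separation = sum(abs(i - j) for i, j in contacts)
    absolute_co = total_separation / len(contacts)
    relative_co = absolute_co / chain_length
    return relative_co, absolute_co

# Hypothetical contact list for a 10-residue chain (illustration only).
toy_contacts = [(1, 4), (2, 9), (3, 7), (5, 10)]
rel_co, abs_co = contact_orders(toy_contacts, chain_length=10)
```

How contacts are defined (distance cutoff, atom selection, minimum sequence separation) varies between studies, so the contact list is left as an input rather than derived from coordinates here.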
Affiliation(s)
- Kiyoto Kamagata
- Department of Physics, School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
|
49
|
Kaya H, Chan HS. Contact order dependent protein folding rates: kinetic consequences of a cooperative interplay between favorable nonlocal interactions and local conformational preferences. Proteins 2003; 52:524-33. [PMID: 12910452 DOI: 10.1002/prot.10478] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.2] [Indexed: 11/07/2022]
Abstract
Physical mechanisms underlying the empirical correlation between relative contact order (CO) and folding rate among naturally occurring small single-domain proteins are investigated by evaluating postulated interaction schemes for a set of three-dimensional 27mer lattice protein models with 97 different CO values. Many-body interactions are constructed such that contact energies become more favorable when short chain segments sequentially adjacent to the contacting residues adopt native-like conformations. At a given interaction strength, this scheme leads to folding rates that are logarithmically well correlated with CO (correlation coefficient r = 0.914) and span more than 2.5 orders of magnitude, whereas folding rates of the corresponding Gō models with additive contact energies have much less logarithmic correlation with CO and span only approximately one order of magnitude. The present protein chain models also exhibit calorimetric cooperativity and linear chevron plots similar to that observed experimentally for proteins with apparent simple two-state folding/unfolding kinetics. Thus, our findings suggest that CO-dependent folding rates of real proteins may arise partly from a significant positive coupling between nonlocal contact favorabilities and local conformational preferences.
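A "logarithmic correlation" between folding rate and CO, as reported in this abstract, is a Pearson correlation between the logarithm of the rate and the CO value. The sketch below shows that calculation under stated assumptions: base-10 logarithms are used (the abstract does not specify the base, and the correlation coefficient does not depend on it), the function names are hypothetical, and the three data points are invented solely to exercise the code. They are not results from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def log_rate_vs_co(rates, contact_orders):
    """Correlation between log10(folding rate) and contact order."""
    log_rates = [math.log10(k) for k in rates]
    return pearson_r(contact_orders, log_rates)

# Invented values for illustration only: rate falls tenfold per CO step.
toy_co = [0.10, 0.15, 0.20]
toy_rates = [1000.0, 100.0, 10.0]
r = log_rate_vs_co(toy_rates, toy_co)
```

Because the toy rates decrease exponentially with CO, the computed coefficient is negative and close to -1. Real data sets scatter around such a trend.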
Affiliation(s)
- Hüseyin Kaya
- Protein Engineering Network of Centres of Excellence, Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada
|