1
Banerjee A, Saha S, Tvedt NC, Yang LW, Bahar I. Mutually beneficial confluence of structure-based modeling of protein dynamics and machine learning methods. Curr Opin Struct Biol 2023; 78:102517. [PMID: 36587424 PMCID: PMC10038760 DOI: 10.1016/j.sbi.2022.102517] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/18/2022] [Revised: 11/19/2022] [Accepted: 11/22/2022] [Indexed: 12/31/2022]
Abstract
Proteins sample an ensemble of conformers under physiological conditions, having access to a spectrum of modes of motions, also called intrinsic dynamics. These motions ensure the adaptation to various interactions in the cell, and largely assist in, if not determine, viable mechanisms of biological function. In recent years, machine learning (ML) frameworks have proven uniquely useful in structural biology, and recent studies further provide evidence for the utility and/or necessity of considering intrinsic dynamics for increasing their predictive ability. Efficient quantification of dynamics-based attributes by recently developed physics-based theories and models such as elastic network models provides a unique opportunity to generate data on dynamics for training ML models towards inferring mechanisms of protein function, assessing pathogenicity, or estimating binding affinities.
Affiliation(s)
- Anupam Banerjee: Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Satyaki Saha: Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Nathan C Tvedt: Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA; Computational and Applied Mathematics and Statistics, The College of William and Mary, Williamsburg, VA 23185, USA
- Lee-Wei Yang: Institute of Bioinformatics and Structural Biology, and PhD Program in Biomedical Artificial Intelligence, National Tsing Hua University, Hsinchu 300044, Taiwan; Physics Division, National Center for Theoretical Sciences, Taipei 106319, Taiwan
- Ivet Bahar: Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
2
Adhikari U, Mostofian B, Copperman J, Subramanian SR, Petersen AA, Zuckerman DM. Computational Estimation of Microsecond to Second Atomistic Folding Times. J Am Chem Soc 2019; 141:6519-6526. [PMID: 30892023 PMCID: PMC6660137 DOI: 10.1021/jacs.8b10735] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Indexed: 11/28/2022]
Abstract
Despite the development of massively parallel computing hardware including inexpensive graphics processing units (GPUs), it has remained infeasible to simulate the folding of atomistic proteins at room temperature using conventional molecular dynamics (MD) beyond the microsecond scale. Here, we report the folding of atomistic, implicitly solvated protein systems with folding times τ ranging from ∼10 μs to ∼100 ms using the weighted ensemble (WE) strategy in combination with GPU computing. Starting from an initial structure or set of structures, WE organizes an ensemble of GPU-accelerated MD trajectory segments via intermittent pruning and replication events to generate statistically unbiased estimates of rate constants for rare events such as folding; no biasing forces are used. Although the variance among atomistic WE folding runs is significant, multiple independent runs are used to reduce and quantify statistical uncertainty. Folding times are estimated directly from WE probability flux and from history-augmented Markov analysis of the WE data. Three systems were examined: NTL9 at low solvent viscosity (yielding τf = 0.8-9 μs), NTL9 at water-like viscosity (τf = 0.2-2 ms), and Protein G at low viscosity (τf = 3-200 ms). In all cases, the folding time, uncertainty, and ensemble properties could be estimated from WE simulation; for Protein G, this characterization required significantly less overall computing than would be required to observe a single folding event with conventional MD simulations. Our results suggest that the use and calibration of force fields and solvent models for precise estimation of kinetic quantities is becoming feasible.
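The pruning and replication step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `resample_bin`, the walker representation as `(state, weight)` pairs, and the rule of splitting the heaviest walker and merging the two lightest are assumptions chosen for brevity. What it does show is the property the abstract relies on, namely that the total weight inside a bin is unchanged by resampling, so no bias is introduced.

```python
import random


def resample_bin(walkers, target, rng=random):
    """Return `target` walkers for one bin while conserving total weight.

    walkers: list of (state, weight) pairs with positive weights (hypothetical layout).
    target:  desired number of walkers in the bin, at least 1.
    """
    walkers = list(walkers)
    if not walkers:
        return walkers
    # Replication: split the heaviest walker into two copies of half weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers += [(state, weight / 2), (state, weight / 2)]
    # Pruning: merge the two lightest walkers; the survivor is chosen with
    # probability proportional to its weight and carries the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        keep = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(keep, w1 + w2)] + walkers[2:]
    return walkers
```

In a full weighted ensemble run this function would be called for every bin after each short MD segment; the dynamics themselves are left untouched.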
Affiliation(s)
- Upendra Adhikari: Department of Biomedical Engineering, School of Medicine, Oregon Health & Science University, Portland, OR 97239
- Barmak Mostofian: Department of Biomedical Engineering, School of Medicine, Oregon Health & Science University, Portland, OR 97239
- Jeremy Copperman: Department of Biomedical Engineering, School of Medicine, Oregon Health & Science University, Portland, OR 97239
- Andrew A. Petersen: NCSU Data Science Resources, North Carolina State University, Raleigh, NC 27695
- Daniel M. Zuckerman: Department of Biomedical Engineering, School of Medicine, Oregon Health & Science University, Portland, OR 97239
3
Affiliation(s)
- Brooke E. Husic: Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Vijay S. Pande: Department of Chemistry, Stanford University, Stanford, California 94305, United States
4
Cieplak M, Banavar JR. Energy landscape and dynamics of proteins: an exact analysis of a simplified lattice model. Phys Rev E Stat Nonlin Soft Matter Phys 2013; 88:040702. [PMID: 24229101 DOI: 10.1103/physreve.88.040702] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/09/2013] [Indexed: 06/02/2023]
Abstract
We present the results of exact numerical studies of the energy landscape and the dynamics of a 12-monomer chain with contact interactions encoding the ground state on a square lattice. In spite of its simplicity, the model is shown to exhibit behavior at odds with the standard picture of proteins.
Affiliation(s)
- Marek Cieplak: Institute of Physics, Polish Academy of Sciences, 02-668 Warsaw, Poland
5
Bhatt D, Bahar I. An adaptive weighted ensemble procedure for efficient computation of free energies and first passage rates. J Chem Phys 2012; 137:104101. [PMID: 22979844 PMCID: PMC3460967 DOI: 10.1063/1.4748278] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 04/21/2012] [Accepted: 08/14/2012] [Indexed: 01/20/2023]
Abstract
We introduce an adaptive weighted-ensemble procedure (aWEP) for efficient and accurate evaluation of first-passage rates between states for two-state systems. The basic idea that distinguishes aWEP from conventional weighted-ensemble (WE) methodology is the division of the configuration space into smaller regions and equilibration of the trajectories within each region upon adaptive partitioning of the regions themselves into small grids. The equilibrated conditional/transition probabilities between each pair of regions lead to the determination of populations of the regions and the first-passage times between regions, which in turn are combined to evaluate the first-passage times for the forward and backward transitions between the two states. The application of the procedure to a non-trivial coarse-grained model of a 70-residue calcium binding domain of calmodulin is shown to efficiently yield information on the equilibrium probabilities of the two states as well as their first-passage times. Notably, the new procedure is significantly more efficient than the canonical implementation of the WE procedure, and this improvement becomes even more significant at low temperatures.
Affiliation(s)
- Divesh Bhatt: Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA
6
Structured pathway across the transition state for peptide folding revealed by molecular dynamics simulations. PLoS Comput Biol 2011; 7:e1002137. [PMID: 21931542 PMCID: PMC3169518 DOI: 10.1371/journal.pcbi.1002137] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/23/2010] [Accepted: 06/09/2011] [Indexed: 11/25/2022]
Abstract
Small globular proteins and peptides commonly exhibit two-state folding kinetics in which the rate limiting step of folding is the surmounting of a single free energy barrier at the transition state (TS) separating the folded and the unfolded states. An intriguing question is whether the polypeptide chain reaches, and leaves, the TS by completely random fluctuations, or whether there is a directed, stepwise process. Here, the folding TS of a 15-residue β-hairpin peptide, Peptide 1, is characterized using independent 2.5 μs-long unbiased atomistic molecular dynamics (MD) simulations (a total of 15 μs). The trajectories were started from fully unfolded structures. Multiple (spontaneous) folding events to the NMR-derived conformation are observed, allowing both structural and dynamical characterization of the folding TS. A common loop-like topology is observed in all the TS structures with native end-to-end and turn contacts, while the central segments of the strands are not in contact. Non-native sidechain contacts are present in the TS between the only tryptophan (W11) and the turn region (P7-G9). Prior to the TS the turn is found to be already locked by the W11 sidechain, while the ends are apart. Once the ends have also come into contact, the TS is reached. Finally, along the reactive folding paths the cooperative loss of the W11 non-native contacts and the formation of the central inter-strand native contacts lead to the peptide rapidly proceeding from the TS to the native state. The present results indicate a directed stepwise process to folding the peptide.
The folding dynamics of many small proteins/peptides investigated recently are described in terms of a simple two-state model in which only two populations exist (folded and unfolded), separated by a single free energy barrier with only one kinetically important transition state (TS). However, dynamical characterization of the folding TS is challenging. We have used independent unbiased atomistic molecular dynamics simulations with clear folding-unfolding transitions to characterize structural and dynamical features of the transition state ensemble of Peptide 1. A common loop-like topology is observed in all TS structures extracted from multiple simulations. The trajectories were used to examine the mechanism by which the TS is reached and subsequent events in folding pathways. The folding TS is reached and crossed in a directed stagewise process rather than through random fluctuations. Specific structures are formed before, during, and after the transition state, indicating a clear structured folding pathway.
7
Radford IH, Fersht AR, Settanni G. Combination of Markov state models and kinetic networks for the analysis of molecular dynamics simulations of peptide folding. J Phys Chem B 2011; 115:7459-71. [PMID: 21553833 PMCID: PMC3106446 DOI: 10.1021/jp112158w] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
Atomistic molecular dynamics simulations of the TZ1 beta-hairpin peptide have been carried out using an implicit model for the solvent. The trajectories have been analyzed using a Markov state model defined on the projections along two significant observables and a kinetic network approach. The Markov state model allowed for an unbiased identification of the metastable states of the system, and provided the basis for commitment probability calculations performed on the kinetic network. The kinetic network analysis served to extract the main transition state for folding of the peptide and to validate the results from the Markov state analysis. The combination of the two techniques allowed for a consistent and concise characterization of the dynamics of the peptide. The slowest relaxation process identified is the exchange between variably folded and denatured species, and the second slowest process is the exchange between two different subsets of the denatured state which could not be otherwise identified by simple inspection of the projected trajectory. The third slowest process is the exchange between a fully native and a partially folded intermediate state characterized by a native turn with a proximal backbone H-bond, and frayed side-chain packing and termini. The transition state for the main folding reaction is similar to the intermediate state, although a more native like side-chain packing is observed.
8
Zheng W, Gallicchio E, Deng N, Andrec M, Levy RM. Kinetic network study of the diversity and temperature dependence of Trp-Cage folding pathways: combining transition path theory with stochastic simulations. J Phys Chem B 2011; 115:1512-23. [PMID: 21254767 PMCID: PMC3059588 DOI: 10.1021/jp1089596] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
We present a new approach to study a multitude of folding pathways and different folding mechanisms for the 20-residue mini-protein Trp-Cage using the combined power of replica exchange molecular dynamics (REMD) simulations for conformational sampling, transition path theory (TPT) for constructing folding pathways, and stochastic simulations for sampling the pathways in a high dimensional structure space. REMD simulations of Trp-Cage with 16 replicas at temperatures between 270 and 566 K are carried out with an all-atom force field (OPLSAA) and an implicit solvent model (AGBNP). The conformations sampled from all temperatures are collected. They form a discretized state space that can be used to model the folding process. The equilibrium population for each state at a target temperature can be calculated using the weighted-histogram-analysis method (WHAM). By connecting states with similar structures and creating edges satisfying detailed balance conditions, we construct a kinetic network that preserves the equilibrium population distribution of the state space. After defining the folded and unfolded macrostates, committor probabilities (P(fold)) are calculated by solving a set of linear equations for each node in the network and pathways are extracted together with their fluxes using the TPT algorithm. By clustering the pathways into folding "tubes", a more physically meaningful picture of the diversity of folding routes emerges. Stochastic simulations are carried out on the network, and a procedure is developed to project sampled trajectories onto the folding tubes. The fluxes through the folding tubes calculated from the stochastic trajectories are in good agreement with the corresponding values obtained from the TPT analysis. The temperature dependence of the ensemble of Trp-Cage folding pathways is investigated. Above the folding temperature, a large number of diverse folding pathways with comparable fluxes flood the energy landscape. 
At low temperature, however, the folding transition is dominated by only a few localized pathways.
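The committor calculation mentioned in this abstract ("solving a set of linear equations for each node") can be sketched on a toy network. The sketch below is only an illustration under stated assumptions: it uses a discrete-time transition matrix `T` rather than the rate network of the paper, the function name `committor` is hypothetical, and the linear system q_i = sum_j T_ij q_j (with q = 0 on unfolded nodes and q = 1 on folded nodes) is solved by simple Gauss-Seidel sweeps instead of a direct solver.

```python
def committor(T, unfolded, folded, tol=1e-12, max_iter=100000):
    """Folding probability q_i for every node of a small Markov network.

    T:        row-stochastic transition matrix as a list of lists.
    unfolded: set of node indices where q is fixed to 0.
    folded:   set of node indices where q is fixed to 1.
    """
    n = len(T)
    q = [1.0 if i in folded else 0.0 for i in range(n)]
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            if i in unfolded or i in folded:
                continue  # boundary values stay fixed
            new_value = sum(T[i][j] * q[j] for j in range(n))
            largest_change = max(largest_change, abs(new_value - q[i]))
            q[i] = new_value
        if largest_change < tol:
            break
    return q


# Four-state linear chain: node 0 is unfolded, node 3 is folded,
# and the two intermediates hop left or right with equal probability.
chain = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
q = committor(chain, unfolded={0}, folded={3})
```

For this chain the two intermediate nodes satisfy q_1 = q_2 / 2 and q_2 = (q_1 + 1) / 2, so the solution is q_1 = 1/3 and q_2 = 2/3.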
Affiliation(s)
- Weihua Zheng: Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854
- Emilio Gallicchio: Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854
- Nanjie Deng: Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854
- Michael Andrec: Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854
- Ronald M. Levy: Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854
9
Abstract
The minimal folding pathway or trajectory for a biopolymer can be defined as the transformation that minimizes the total distance traveled between a folded and an unfolded structure. This involves generalizing the usual Euclidean distance from points to one-dimensional objects such as a polymer. We apply this distance here to find minimal folding pathways for several candidate protein fragments, including the helix, the beta-hairpin, and a nonplanar structure where chain noncrossing is important. Comparing the distances traveled with root mean-squared distance and mean root-squared distance, we show that chain noncrossing can have large effects on the kinetic proximity of apparently similar conformations. Structures that are aligned to the beta-hairpin by minimizing mean root-squared distance, a quantity that closely approximates the true distance for long chains, show globally different orientation than structures aligned by minimizing root mean-squared distance.
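The two reference measures compared in this abstract differ only in where the square root is taken. The sketch below assumes the usual reading: root mean-squared distance (RMSD) is the square root of the mean squared per-residue displacement, and mean root-squared distance (MRSD) is the mean of the per-residue displacements. The function names are hypothetical, and no superposition or alignment is performed, so both structures are assumed to be in the same frame already.

```python
import math


def squared_displacements(a, b):
    """Squared distance between corresponding points of two conformations."""
    return [sum((x - y) ** 2 for x, y in zip(p, q)) for p, q in zip(a, b)]


def rmsd(a, b):
    """Root of the mean squared displacement."""
    d2 = squared_displacements(a, b)
    return math.sqrt(sum(d2) / len(d2))


def mrsd(a, b):
    """Mean of the root squared displacements, i.e. the mean distance moved."""
    d2 = squared_displacements(a, b)
    return sum(math.sqrt(v) for v in d2) / len(d2)


# Two-bead example: one bead stays put, the other moves by 2 length units.
conf_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
```

Here `rmsd(conf_a, conf_b)` is sqrt(2) while `mrsd(conf_a, conf_b)` is 1, showing that RMSD weights large individual displacements more heavily; MRSD never exceeds RMSD.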
10
Zhou HX. A minimum-reaction-flux solution to master-equation models of protein folding. J Chem Phys 2008; 128:195104. [PMID: 18500902 DOI: 10.1063/1.2929824] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
Master equations are widely used for modeling protein folding. Here an approximate solution to such master equations is presented. The approach used may be viewed as a discrete variational transition-state theory. The folding rate constant $k_f$ is approximated by the outgoing reaction flux $J$, when the unfolded set of macrostates assumes an equilibrium distribution. Correspondingly the unfolding rate constant $k_u$ is calculated as $J p_u/(1-p_u)$, where $p_u$ is the equilibrium fraction of the unfolded state; this follows from the detailed-balance condition $k_f p_u = k_u (1-p_u)$. The dividing surface between the unfolded and folded states is chosen to minimize the reaction flux $J$. This minimum-reaction-flux surface plays the role of the transition-state ensemble and identifies rate-limiting steps. Test against exact results of master-equation models of Zwanzig [Proc. Natl. Acad. Sci. USA 92, 9801 (1995)] and Munoz et al. [Proc. Natl. Acad. Sci. USA 95, 5872 (1998)] shows that the minimum-reaction-flux solution works well. Macrostates separated by the minimum-reaction-flux surface show a gap in $p_{\mathrm{fold}}$ values. The approach presented here significantly simplifies the solution of master-equation models and, at the same time, directly yields insight into folding mechanisms.
Affiliation(s)
- Huan-Xiang Zhou: Department of Physics and Institute of Molecular Biophysics and School of Computational Science, Florida State University, Tallahassee, Florida 32306, USA
11
Tang X, Thomas S, Tapia L, Giedroc DP, Amato NM. Simulating RNA folding kinetics on approximated energy landscapes. J Mol Biol 2008; 381:1055-67. [PMID: 18639245 DOI: 10.1016/j.jmb.2008.02.007] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 10/08/2007] [Revised: 01/26/2008] [Accepted: 02/03/2008] [Indexed: 12/28/2022]
Abstract
We present a general computational approach to simulate RNA folding kinetics that can be used to extract population kinetics, folding rates and the formation of particular substructures that might be intermediates in the folding process. Simulating RNA folding kinetics can provide unique insight into RNA whose functions are dictated by folding kinetics and not always by nucleotide sequence or the structure of the lowest free-energy state. The method first builds an approximate map (or model) of the folding energy landscape from which the population kinetics are analyzed by solving the master equation on the map. We present results obtained using an analysis technique, map-based Monte Carlo simulation, which stochastically extracts folding pathways from the map. Our method compares favorably with other computational methods that begin with a comprehensive free-energy landscape, illustrating that the smaller, approximate map captures the major features of the complete energy landscape. As a result, our method scales to larger RNAs. For example, here we validate kinetics of RNA of more than 200 nucleotides. Our method accurately computes the kinetics-based functional rates of wild-type and mutant ColE1 RNAII and MS2 phage RNAs showing excellent agreement with experiment.
12
Li J, Wang J, Wang W. Identifying folding nucleus based on residue contact networks of proteins. Proteins 2008; 71:1899-907. [DOI: 10.1002/prot.21891] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
13
Harland B, Sun SX. Path ensembles and path sampling in nonequilibrium stochastic systems. J Chem Phys 2007; 127:104103. [PMID: 17867733 DOI: 10.1063/1.2775439] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Abstract
Markovian models based on the stochastic master equation are often encountered in single molecule dynamics, reaction networks, and nonequilibrium problems in chemistry, physics, and biology. An efficient and convenient method to simulate these systems is the kinetic Monte Carlo algorithm which generates continuous-time stochastic trajectories. We discuss an alternative simulation method based on sampling of stochastic paths. Utilizing known probabilities of stochastic paths, it is possible to apply Metropolis Monte Carlo in path space to generate a desired ensemble of stochastic paths. The method is a generalization of the path sampling idea to stochastic dynamics, and is especially suited for the analysis of rare paths which are not often produced in the standard kinetic Monte Carlo procedure. Two generic examples are presented to illustrate the methodology.
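For readers unfamiliar with the baseline method named in this abstract, the following is a minimal sketch of a standard kinetic Monte Carlo (Gillespie-type) trajectory generator for a master-equation model. It is the conventional algorithm, not the path-sampling method the paper proposes, and the function name `kmc_trajectory`, the nested-dictionary rate table, and the two-state example rates are all assumptions made for illustration.

```python
import random


def kmc_trajectory(rates, start, t_max, rng=random):
    """Generate one continuous-time stochastic trajectory.

    rates: dict mapping state -> {next_state: rate constant} (hypothetical layout).
    Returns a list of (time, state) pairs starting at (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        outgoing = rates.get(state, {})
        total = sum(outgoing.values())
        if total <= 0.0:
            break  # absorbing state: nothing more can happen
        # Waiting time is exponentially distributed with the total escape rate.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Pick the next state with probability proportional to its rate.
        threshold = rng.random() * total
        cumulative = 0.0
        for next_state, k in outgoing.items():
            cumulative += k
            if threshold < cumulative:
                break
        state = next_state
        path.append((t, state))
    return path


# Two-state folding model with illustrative rate constants.
two_state = {"U": {"F": 2.0}, "F": {"U": 1.0}}
path = kmc_trajectory(two_state, "U", 10.0, random.Random(1))
```

Rare paths are visited only in proportion to their probability under this scheme, which is the limitation that motivates sampling directly in path space.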
Affiliation(s)
- Ben Harland: Department of Mechanical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
14
Efficient and verified simulation of a path ensemble for conformational change in a united-residue model of calmodulin. Proc Natl Acad Sci U S A 2007; 104:18043-8. [PMID: 17984047 DOI: 10.1073/pnas.0706349104] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.8] [Indexed: 11/18/2022]
Abstract
The computational sampling of rare, large-scale, conformational transitions in proteins is a well-appreciated challenge, for which a number of potentially efficient path-sampling methodologies have been proposed. Here, we study a large-scale transition in a united-residue model of calmodulin using the "weighted ensemble" (WE) approach of Huber and Kim. Because of the model's relative simplicity, we are able to compare our results with brute-force simulations. The comparison indicates that the WE approach quantitatively reproduces the brute-force results, as assessed by considering (i) the reaction rate, (ii) the distribution of event durations, and (iii) structural distributions describing the heterogeneity of the paths. Importantly, the WE method is readily applied to more chemically accurate models, and by studying a series of lower temperatures, our results suggest that the WE method can increase efficiency by orders of magnitude in more challenging systems.
15
Lam AR, Borreguero JM, Ding F, Dokholyan NV, Buldyrev SV, Stanley HE, Shakhnovich E. Parallel folding pathways in the SH3 domain protein. J Mol Biol 2007; 373:1348-60. [PMID: 17900612 DOI: 10.1016/j.jmb.2007.08.032] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 12/25/2006] [Revised: 08/06/2007] [Accepted: 08/14/2007] [Indexed: 11/16/2022]
Abstract
The transition-state ensemble (TSE) is the set of protein conformations with an equal probability to fold or unfold. Its characterization is crucial for an understanding of the folding process. We determined the TSE of the src-SH3 domain protein by using extensive molecular dynamics simulations of the Go model and computing the folding probability of a generated set of TSE candidate conformations. We found that the TSE possesses a well-defined hydrophobic core with variable enveloping structures resulting from the superposition of three parallel folding pathways. The most preferred pathway agrees with the experimentally determined TSE, while the two least preferred pathways differ significantly. The knowledge of the different pathways allows us to design the interactions between amino acids that guide the protein to fold through the least preferred pathway. This particular design is akin to a circular permutation of the protein. The finding motivates the hypothesis that the different experimentally observed TSEs in homologous proteins and circular permutants may represent potentially available pathways to the wild-type protein.
Affiliation(s)
- A R Lam: Center for Polymer Studies, Department of Physics, Boston University, Boston, MA 02215, USA
16
Tapia L, Tang X, Thomas S, Amato NM. Kinetics analysis methods for approximate folding landscapes. Bioinformatics 2007; 23:i539-48. [PMID: 17646341 DOI: 10.1093/bioinformatics/btm199] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Protein motions play an essential role in many biochemical processes. Lab studies often quantify these motions in terms of their kinetics such as the speed at which a protein folds or the population of certain interesting states like the native state. Kinetic metrics give quantifiable measurements of the folding process that can be compared across a group of proteins such as a wild-type protein and its mutants.
RESULTS: We present two new techniques, map-based master equation solution and map-based Monte Carlo simulation, to study protein kinetics through folding rates and population kinetics from approximate folding landscapes, models called maps. From these two new techniques, interesting metrics that describe the folding process, such as reaction coordinates, can also be studied. In this article we focus on two metrics, formation of helices and structure formation around tryptophan residues. These two metrics are often studied in the lab through circular dichroism (CD) spectra analysis and tryptophan fluorescence experiments, respectively. The approximated landscape models we use here are the maps of protein conformations and their associated transitions that we have presented and validated previously. In contrast to other methods such as the traditional master equation and Monte Carlo simulation, our techniques are both fast and can easily be computed for full-length detailed protein models. We validate our map-based kinetics techniques by comparing folding rates to known experimental results. We also look in depth at the population kinetics, helix formation and structure near tryptophan residues for a variety of proteins.
AVAILABILITY: We invite the community to help us enrich our publicly available database of motions and kinetics analysis by submitting to our server: http://parasol.tamu.edu/foldingserver/.
Affiliation(s)
- Lydia Tapia: Parasol Lab, Department of Computer Science, Texas A&M University, College Station, TX 77843, USA
17
Olivares-Quiroz L, Garcia-Colin LS. Protein's native state stability in a chemically induced denaturation mechanism. J Theor Biol 2007; 246:214-24. [PMID: 17306831 DOI: 10.1016/j.jtbi.2006.12.020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/24/2006] [Revised: 12/06/2006] [Accepted: 12/12/2006] [Indexed: 11/23/2022]
Abstract
In this work, we present a generalization of Zwanzig's protein unfolding analysis [Zwanzig, R., 1997. Two-state models of protein folding kinetics. Proc. Natl Acad. Sci. USA 94, 148-150; Zwanzig, R., 1995. Simple model of protein folding kinetics. Proc. Natl Acad. Sci. USA 92, 9801], in order to calculate the free energy change $\Delta_N^D F$ between the protein's native state N and its unfolded state D in a chemically induced denaturation. This Extended Zwanzig Model (EZM) is based both on an equilibrium statistical mechanics approach and on the inclusion of experimental denaturation curves. It enables us to construct a suitable partition function $Z$ and to derive an analytical formula for $\Delta_N^D F$ in terms of the number $K$ of residues of the macromolecule, the average number $\nu$ of accessible states for each single amino acid and the concentration $C_{1/2}$ where the midpoint of the $N \rightleftharpoons D$ transition occurs. The results of the EZM for proteins where chemical denaturation follows a sigmoidal-type profile, as it occurs for the case of the T70N human variant of lysozyme (PDB code: T70N) [Esposito, G., et al., 2003. J. Biol. Chem. 278, 25910-25918], can be split into two lines. First, EZM shows that for sigmoidal denaturation profiles, the internal degrees of freedom of the chain play an outstanding role in the stability of the native state. Second, under certain conditions $\Delta F$ can be written as a quadratic polynomial in the concentration $C_{1/2}$, i.e., $\Delta F \approx a C_{1/2}^2 + b C_{1/2} + c$, where $a$, $b$, $c$ are constant coefficients directly linked to the protein's size $K$ and the averaged number of non-native conformations $\nu$. Such a functional form for $\Delta F$ has been widely known to fit experimental measures in chemically induced protein denaturation [Yagi, M., et al., 2003. J. Biol. Chem. 278, 47009-47015; Asgeirsson, B., Guojonsdottir, K., 2006. Biochim. Biophys. Acta 1764, 190-198; Sharma, S., et al., 2006. Protein Pept. Lett. 13(4), 323-329; Salem, M., et al., 2006. Biochim. Biophys. Acta 1764(5), 903-912], so EZM can shed some light on the physical meaning of the experimental values for the $a$, $b$, $c$ coefficients.
Affiliation(s)
- L Olivares-Quiroz: Departamento de Fisica, Universidad Autonoma Metropolitana-Iztapalapa, Mexico DF 09340, Mexico
18
Pandit AD, Jha A, Freed KF, Sosnick TR. Small Proteins Fold Through Transition States With Native-like Topologies. J Mol Biol 2006; 361:755-70. [PMID: 16876194 DOI: 10.1016/j.jmb.2006.06.041] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Received: 03/13/2006] [Revised: 06/12/2006] [Accepted: 06/16/2006] [Indexed: 10/24/2022]
Abstract
The folding pathway of common-type acyl phosphatase (ctAcP) is characterized using psi-analysis, which identifies specific chain-chain contacts using bi-histidine (biHis) metal-ion binding sites. In the transition state ensemble (TSE), the majority of the protein is structured with a near-native topology, lacking only one beta-strand and an alpha-helix. psi-Values are zero or unity for all sites except one at the amino terminus of helix H2. This fractional psi-value remains unchanged when three metal ions of differing coordination geometries are used, indicating that this end of the helix experiences microscopic heterogeneity through fraying in the TSE. Ubiquitin, the other globular protein characterized using psi-analysis, also exhibits a single consensus TSE structure. Hence, the TSEs of both proteins have converged to a single configuration, albeit one that contains some fraying at the periphery. Models of the TSE of both proteins are created with all-atom Langevin dynamics simulations using distance constraints derived from the experimental psi-values. For both proteins, the relative contact order of the TS models is approximately 80% of the native value. This shared value, viewed in the context of the known correlation between contact order and folding rates, suggests that other proteins will have a similarly high fraction of the native contact order. This constraint greatly limits the range of possible configurations at the rate-limiting step.
Affiliation(s)
- Adarsh D Pandit
- Department of Biochemistry and Molecular Biology, and the Institute for Biophysical Dynamics, University of Chicago, 929 E. 57th St., Chicago, IL 60637, USA
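The relative contact order quoted in the abstract above can be illustrated with a minimal sketch. It assumes the common definition (mean sequence separation of contacting residue pairs, divided by chain length); the abstract does not state the contact criterion used in the paper, and the contact lists and the function name `relative_contact_order` below are hypothetical.

```python
def relative_contact_order(contacts, chain_length):
    """Mean sequence separation of contacting residue pairs, divided by chain length."""
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


# Hypothetical contact lists for a 10-residue chain (illustrative only, not data from the paper).
native_contacts = [(1, 5), (2, 10), (3, 9), (4, 8), (1, 10)]
# Toy "transition state": the native contacts minus the longest-range one.
ts_contacts = [(1, 5), (2, 10), (3, 9), (4, 8)]

rco_native = relative_contact_order(native_contacts, 10)
rco_ts = relative_contact_order(ts_contacts, 10)
print(f"native RCO = {rco_native:.2f}, TS RCO = {rco_ts:.2f}, ratio = {rco_ts / rco_native:.2f}")
```

Dividing the transition-state value by the native value gives the kind of fraction the abstract reports, here for toy data only.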
|
19
|
Sullivan DC, Lim C. Quantifying Polypeptide Conformational Space: Sensitivity to Conformation and Ensemble Definition. J Phys Chem B 2006; 110:16707-17. [PMID: 16913810 DOI: 10.1021/jp0569133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Quantifying the density of conformations over phase space (the conformational distribution) is needed to model important macromolecular processes such as protein folding. In this work, we quantify the conformational distribution for a simple polypeptide (N-mer polyalanine) using the cumulative distribution function (CDF), which gives the probability that two randomly selected conformations are separated by less than a "conformational" distance and whose inverse gives conformation counts as a function of conformational radius. An important finding is that the conformation counts obtained by the CDF inverse depend critically on the assignment of a conformation's distance span and the ensemble (e.g., unfolded state model): varying ensemble and conformation definition (1 → 2 Å) varies the CDF-based conformation counts for Ala$_{50}$ from $10^{11}$ to $10^{69}$. In particular, relatively short molecular dynamics (MD) relaxation of Ala$_{50}$'s random-walk ensemble reduces the number of conformers from $10^{55}$ to $10^{14}$ (using a 1 Å root-mean-square-deviation radius conformation definition), pointing to potential disconnections in comparing the results from simplified models of unfolded proteins with those from all-atom MD simulations. Explicit waters are found to roughen the landscape considerably. Under some common conformation definitions, the results herein provide (i) an upper limit to the number of accessible conformations that compose unfolded states of proteins, (ii) the optimal clustering radius/conformation radius for counting conformations for a given energy and solvent model, (iii) a means of comparing various studies, and (iv) an assessment of the applicability of random search in protein folding.
Affiliation(s)
- David C Sullivan
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan.
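The CDF-based counting described in the abstract above can be sketched as follows. This is one reading of the abstract, assuming the conformation count at a given radius is simply the reciprocal of the pairwise CDF at that radius; the one-dimensional "conformations", the distance function, and all names are hypothetical stand-ins for real structures and an RMSD metric.

```python
import itertools
import math


def pairwise_cdf(conformations, radius, distance):
    """Fraction of conformation pairs separated by less than `radius`."""
    pairs = list(itertools.combinations(conformations, 2))
    close = sum(1 for a, b in pairs if distance(a, b) < radius)
    return close / len(pairs)


def conformation_count(conformations, radius, distance):
    """Estimated number of distinct conformations: reciprocal of the CDF at `radius`."""
    cdf = pairwise_cdf(conformations, radius, distance)
    return math.inf if cdf == 0 else 1.0 / cdf


def toy_distance(a, b):
    # Stand-in for an RMSD between two structures (assumption for illustration).
    return abs(a - b)


# Two tight groups of toy "conformations" on a line.
toy_ensemble = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]

for radius in (0.05, 1.0, 10.0):
    print(radius, conformation_count(toy_ensemble, radius, toy_distance))
```

Even in this toy case the count changes strongly with the chosen radius, which is the sensitivity the abstract emphasizes.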
|
20
|
Cecconi F, Guardiani C, Livi R. Testing simplified proteins models of the hPin1 WW domain. Biophys J 2006; 91:694-704. [PMID: 16648162 PMCID: PMC1483113 DOI: 10.1529/biophysj.105.069138] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 06/22/2005] [Accepted: 04/06/2006] [Indexed: 11/18/2022]
Abstract
The WW domain of the human Pin1 protein, owing to its simple topology and the large amount of experimental data available, is an ideal candidate for assessing theoretical approaches to protein folding. The purpose of this work is to compare the reliability of the chemically based Sorenson/Head-Gordon (SHG) model and a standard native-centric (Gō) model in reproducing, through molecular dynamics simulations, some of the well-known features of the folding transition of this small domain. Our results show that the Gō model correctly reproduces the cooperative, two-state folding mechanism of the WW domain, while the SHG model predicts a transition occurring in two stages: a collapse, followed by a structural rearrangement. The lack of cooperative folding in the SHG simulations appears to be related to the nonfunnel shape of the energy landscape, which features a partitioning of the native valley into subbasins corresponding to different chain chiralities. However, the SHG approach remains more reliable than the Gō-like description in estimating the phi-values. This may suggest that the WW-domain folding process is steered by energetic as well as topological factors, and it highlights the better suitability of chemically based models for simulating mutations.
Affiliation(s)
- Fabio Cecconi
- INFM-CNR Istituto dei Sistemi Complessi, Rome, Italy.
|
21
|
Sun SX. Path summation formulation of the master equation. Phys Rev Lett 2006; 96:210602. [PMID: 16803224 DOI: 10.1103/physrevlett.96.210602] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 02/23/2006] [Indexed: 05/10/2023]
Abstract
Markovian dynamics, modeled by the kinetic master equation, has wide-ranging applications in chemistry, physics, and biology. We derive an exact expression for the probability of a Markovian path in discrete state space for an arbitrary number of states and path length. The total probability of paths repeatedly visiting a set of states can be explicitly summed. The transition probability between states can be expressed as a sum over all possible paths connecting the states. The derived path probabilities satisfy the fluctuation theorem. The paths can be the starting point for a path-space Monte Carlo procedure, which can serve as an alternative algorithm to analyze pathways in a complex reaction network.
Affiliation(s)
- Sean X Sun
- Department of Mechanical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
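The statement in the abstract above that a transition probability equals a sum over all connecting paths can be checked numerically in a simplified setting. The sketch below assumes a discrete-time Markov chain with a hypothetical 3-state transition matrix `T`; the paper itself treats the continuous-time master equation, so this is an analogue for illustration and not the paper's expression.

```python
import itertools


def path_probability(T, path):
    """Probability of following `path` given that it starts in path[0]."""
    prob = 1.0
    for a, b in zip(path, path[1:]):
        prob *= T[a][b]
    return prob


def transition_by_path_sum(T, start, end, n_steps):
    """Sum the probabilities of every n_steps-long path from `start` to `end`."""
    n_states = len(T)
    total = 0.0
    for middle in itertools.product(range(n_states), repeat=n_steps - 1):
        total += path_probability(T, (start, *middle, end))
    return total


def transition_by_matrix_power(T, start, end, n_steps):
    """Reference value: the (start, end) entry of T raised to n_steps."""
    n = len(T)
    P = [row[:] for row in T]
    for _ in range(n_steps - 1):
        P = [[sum(P[i][k] * T[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return P[start][end]


# Hypothetical row-stochastic transition matrix (each row sums to 1).
T = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]

path_sum = transition_by_path_sum(T, 0, 2, 4)
reference = transition_by_matrix_power(T, 0, 2, 4)
print(path_sum, reference)
```

The brute-force enumeration grows exponentially with path length, which is why an explicit summation formula or a path-space Monte Carlo procedure, as mentioned in the abstract, is attractive.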
|
22
|
McLeish TCB. Diffusive searches in high-dimensional spaces and apparent 'two-state' behaviour in protein folding. J Phys Condens Matter 2006; 18:1861-1868. [PMID: 21697560 DOI: 10.1088/0953-8984/18/6/003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
We extend a simple model for protein folding as a high-dimensional diffusive search. By solving a steady-state diffusion equation on a hypersphere centred on an absorbing 'native state' we find the general property that the kinetics of such a search will always be nearly single exponential. This explains the common observation of such simple 'two-state' folding kinetics in models that contain considerable intermediate structure. It also suggests that the experimental signature of single-exponential folding kinetics does not imply a simple two-state structure to the folding space.
Affiliation(s)
- T C B McLeish
- Department of Physics and Astronomy and Astbury Centre for Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
|
23
|
Tang X, Kirkpatrick B, Thomas S, Song G, Amato NM. Using motion planning to study RNA folding kinetics. J Comput Biol 2005; 12:862-81. [PMID: 16108722 DOI: 10.1089/cmb.2005.12.862] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
We propose a novel, motion-planning-based approach to approximately map the energy landscape of an RNA molecule. A key feature of our method is that it provides a sparse map that captures the main features of the energy landscape and can be analyzed to compute folding kinetics. Our method is based on probabilistic roadmap motion planners that we have previously applied successfully to protein folding. In this paper, we provide evidence that this approach is also well suited to RNA. We compute population kinetics and transition rates on our roadmaps using the master equation for a few moderately sized RNAs and show that our results compare favorably with those of other existing methods.
Affiliation(s)
- Xinyu Tang
- Parasol Lab, Dept. of Computer Science, Texas A&M University, 301 Harvey R. Bright Building, College Station, TX 77843-3112, USA
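Population kinetics from a master equation, as mentioned in the abstract above, can be sketched on a tiny network. The example assumes a hypothetical 3-node roadmap with made-up rate constants and integrates dp/dt = Kp with a fixed-step fourth-order Runge-Kutta scheme; the paper's own roadmap construction and solver are not described in the abstract.

```python
def build_rate_matrix(rates, n_states):
    """K[dst][src] holds the rate src -> dst; each diagonal entry is minus the total outflow."""
    K = [[0.0] * n_states for _ in range(n_states)]
    for (src, dst), k in rates.items():
        K[dst][src] += k
        K[src][src] -= k
    return K


def derivative(K, p):
    """Right-hand side of the master equation dp/dt = K p."""
    n = len(p)
    return [sum(K[i][j] * p[j] for j in range(n)) for i in range(n)]


def propagate(K, p0, t_end, dt=0.01):
    """Integrate the populations from time 0 to t_end with fixed-step RK4."""
    p = list(p0)
    for _ in range(round(t_end / dt)):
        k1 = derivative(K, p)
        k2 = derivative(K, [x + 0.5 * dt * d for x, d in zip(p, k1)])
        k3 = derivative(K, [x + 0.5 * dt * d for x, d in zip(p, k2)])
        k4 = derivative(K, [x + dt * d for x, d in zip(p, k3)])
        p = [x + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


# Hypothetical rates on a linear roadmap 0 <-> 1 <-> 2 (state 2 plays the folded state).
rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0, (2, 1): 0.5}
K = build_rate_matrix(rates, 3)

p_final = propagate(K, [1.0, 0.0, 0.0], t_end=20.0)
print([round(x, 4) for x in p_final])
```

For these rates, detailed balance gives equilibrium populations proportional to 1 : 2 : 12, so the long-time result can be checked by hand.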
|
24
|
Abstract
The viability of a biological system depends upon careful regulation of the rates of various processes. These rates have limits imposed by intrinsic chemical or physical steps (e.g., diffusion). These limits can be expanded by interactions and dynamics of the biomolecules. For example, (a) a chemical reaction is catalyzed when its transition state is preferentially bound to an enzyme; (b) the folding of a protein molecule is speeded up by specific interactions within the transition-state ensemble and may be assisted by molecular chaperones; (c) the rate of specific binding of a protein molecule to a cellular target can be enhanced by mechanisms such as long-range electrostatic interactions, nonspecific binding and folding upon binding; (d) directional movement of motor proteins is generated by capturing favorable Brownian motion through intermolecular binding energy; and (e) conduction and selectivity of ions through membrane channels are controlled by interactions and the dynamics of channel proteins. Simple physical models are presented here to illustrate these processes and provide a unifying framework for understanding speed attainment and regulation in biomolecular systems.
Affiliation(s)
- Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics and School of Computational Science, Florida State University, Tallahassee, FL 32306, USA.
|
25
|
Andrec M, Felts AK, Gallicchio E, Levy RM. Protein folding pathways from replica exchange simulations and a kinetic network model. Proc Natl Acad Sci U S A 2005; 102:6801-6. [PMID: 15800044 PMCID: PMC1100763 DOI: 10.1073/pnas.0408970102] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.6] [Received: 12/02/2004] [Indexed: 11/18/2022]
Abstract
We present an approach to the study of protein folding that uses the combined power of replica exchange simulations and a network model for the kinetics. We carry out replica exchange simulations to generate a large (approximately $10^6$) set of states with an all-atom effective potential function and construct a kinetic model for folding, using an ansatz that allows kinetic transitions between states based on structural similarity. We use this network to perform random walks in the state space and examine the overall network structure. Results are presented for the C-terminal peptide from the B1 domain of protein G. The kinetics is two-state after small temperature perturbations. However, the coil-to-hairpin folding is dominated by pathways that visit metastable helical conformations. We propose possible mechanisms for the alpha-helix/beta-hairpin interconversion.
Affiliation(s)
- Michael Andrec
- Department of Chemistry and Chemical Biology and BIOMAPS Institute for Quantitative Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
26
|
Chekmarev DS, Ishida T, Levy RM. Long-Time Conformational Transitions of Alanine Dipeptide in Aqueous Solution: Continuous and Discrete-State Kinetic Models. J Phys Chem B 2004. [DOI: 10.1021/jp048540w] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.7] [Indexed: 11/29/2022]
Affiliation(s)
- Dmitriy S. Chekmarev
- Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854
- Tateki Ishida
- Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854
- Ronald M. Levy
- Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, the State University of New Jersey, 610 Taylor Road, Piscataway, New Jersey 08854
|
27
|
Berezhkovskii A, Szabo A. Ensemble of transition states for two-state protein folding from the eigenvectors of rate matrices. J Chem Phys 2004; 121:9186-7. [PMID: 15527389 DOI: 10.1063/1.1802674] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
|
28
|
Abstract
We present a solvable model that predicts the folding kinetics of two-state proteins from their native structures. The model is based on conditional chain entropies. It assumes that folding processes are dominated by small-loop closure events that can be inferred from native structures. For CI2, the src SH3 domain, TNfn3, and protein L, the model reproduces two-state kinetics, and it predicts well the average Phi-values for secondary structures. The barrier to folding is the formation of predominantly local structures such as helices and hairpins, which are needed to bring nonlocal pairs of amino acids into contact.
Affiliation(s)
- Thomas R Weikl
- Department of Pharmaceutical Chemistry, University of California, San Francisco, 94143, USA.
|
29
|
Swope WC, Pitera JW, Suits F. Describing Protein Folding Kinetics by Molecular Dynamics Simulations. 1. Theory. J Phys Chem B 2004. [DOI: 10.1021/jp037421y] [Citation(s) in RCA: 332] [Impact Index Per Article: 16.6] [Indexed: 11/28/2022]
Affiliation(s)
- William C. Swope
- IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120
- Jed W. Pitera
- IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120
- Frank Suits
- IBM Watson Research Center, Route 134, Yorktown Heights, New York 10598
|
30
|
Wolfinger MT, Svrcek-Seiler WA, Flamm C, Hofacker IL, Stadler PF. Efficient computation of RNA folding dynamics. J Phys A Math Gen 2004. [DOI: 10.1088/0305-4470/37/17/005] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.6] [Indexed: 11/11/2022]
|
31
|
|
32
|
Zhang W, Chen SJ. Analyzing the biopolymer folding rates and pathways using kinetic cluster method. J Chem Phys 2003; 119:8716-8729. [PMID: 19079645 DOI: 10.1063/1.1613255] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
A kinetic cluster method enables us to analyze biopolymer folding kinetics with discrete rate-limiting steps by classifying biopolymer conformations into pre-equilibrated clusters. The overall folding kinetics is determined by the intercluster transitions. Due to the complex energy landscapes of biopolymers, the intercluster transitions have multiple pathways and can have kinetic intermediates (local free-energy minima) distributed on the intercluster pathways. We focus on the RNA secondary structure folding kinetics. The dominant folding pathways and the kinetic partitioning mechanism can be identified and quantified from the rate constants for different intercluster pathways. Moreover, the temperature dependence of the folding rate can be analyzed from the interplay between the stabilities of the on-pathway (nativelike) and off-pathway (misfolded) conformations and from the kinetic partitioning between different intercluster pathways. The predicted folding kinetics can be directly tested against experiments.
Affiliation(s)
- Wenbing Zhang
- Department of Physics and Astronomy and Department of Biochemistry, University of Missouri, Columbia, Missouri 65211
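The idea of pre-equilibrated clusters in the abstract above can be illustrated with a generic coarse-graining sketch. It assumes that conformations inside a cluster are Boltzmann-distributed and that the effective intercluster rate is the population-weighted sum of the microscopic rates leaving the cluster; this is a standard lumping argument and not necessarily the exact expression used in the paper. The energies, rates, and function name are hypothetical.

```python
import math


def intercluster_rate(energies, rates, cluster_a, cluster_b, kT=1.0):
    """Effective rate A -> B, assuming fast Boltzmann equilibration inside cluster A."""
    weights = {i: math.exp(-energies[i] / kT) for i in cluster_a}
    partition = sum(weights.values())
    return sum(
        weights[i] / partition * rates.get((i, j), 0.0)
        for i in cluster_a
        for j in cluster_b
    )


# Hypothetical conformations 0 and 1 form cluster A; conformation 2 forms cluster B.
energies = {0: 0.0, 1: 0.0, 2: -1.0}
# Microscopic rate constants (src, dst) -> rate, in arbitrary units.
rates = {(0, 2): 1.0, (1, 2): 3.0, (0, 1): 50.0, (1, 0): 50.0}

k_ab = intercluster_rate(energies, rates, cluster_a=[0, 1], cluster_b=[2])
print(k_ab)
```

With equal energies inside cluster A, each conformation carries half the population, so the effective rate is the average of the two exit rates.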
|