1
Donovan-Maiye RM, Langmead CJ, Zuckerman DM. Systematic Testing of Belief-Propagation Estimates for Absolute Free Energies in Atomistic Peptides and Proteins. J Chem Theory Comput 2018; 14:426-443. [PMID: 29185777 PMCID: PMC5933972 DOI: 10.1021/acs.jctc.7b00775] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
Motivated by the extremely high computing costs associated with estimates of free energies for biological systems using molecular simulations, we further the exploration of existing "belief propagation" (BP) algorithms for fixed-backbone peptide and protein systems. The precalculation of pairwise interactions among discretized libraries of side-chain conformations, along with representation of protein side chains as nodes in a graphical model, enables direct application of the BP approach, which requires only ∼1 s of single-processor run time after the precalculation stage. We use a "loopy BP" algorithm, which can be seen as an approximate generalization of the transfer-matrix approach to highly connected (i.e., loopy) graphs and which has previously been applied to protein calculations. We examine the application of loopy BP to several peptides as well as the binding site of the T4 lysozyme L99A mutant. The present study reports on (i) the comparison of the approximate BP results with estimates from unbiased estimators based on the Amber99SB force field; (ii) investigation of the effects of varying library size on BP predictions; and (iii) a theoretical discussion of the discretization effects that can arise in BP calculations. The data suggest that, despite their approximate nature, BP free-energy estimates are highly accurate; indeed, they never fall outside confidence intervals from unbiased estimators for the systems where independent results could be obtained. Furthermore, we find that libraries of sufficiently fine discretization (which diminish library-size sensitivity) can be obtained with standard computing resources in most cases. Altogether, the extremely low computing times and accurate results suggest the BP approach warrants further study.
Affiliation(s)
- Christopher J Langmead
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Daniel M Zuckerman
  - Department of Biomedical Engineering, Oregon Health and Science University, Portland, Oregon 97239, United States
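The abstract above describes loopy BP as an approximate generalization of the transfer-matrix approach. As a minimal sketch of that exact, loop-free limiting case (not the authors' implementation), the following Python example computes the free energy of a chain of discretized nodes by passing a single message along the chain, then checks the result against brute-force enumeration. The energy tables, library sizes, and function names are hypothetical, and kT is set to 1.

```python
import itertools
import numpy as np

def chain_free_energy(single, pair, kT=1.0):
    """Transfer-matrix (sum-product) free energy of a chain.

    single[i][a]    : hypothetical self-energy of node i in library state a
    pair[i][a, b]   : hypothetical interaction of node i (state a) with node i+1 (state b)
    """
    # Message holds the partial partition sum for each state of the current node.
    msg = np.exp(-single[0] / kT)
    for i in range(1, len(single)):
        transfer = np.exp(-pair[i - 1] / kT)
        # Sum over states of the previous node, then weight by the new node's self-energy.
        msg = (msg @ transfer) * np.exp(-single[i] / kT)
    return float(-kT * np.log(msg.sum()))

def brute_force_free_energy(single, pair, kT=1.0):
    """Reference value: enumerate every combination of library states."""
    z = 0.0
    for states in itertools.product(*(range(len(s)) for s in single)):
        energy = sum(single[i][s] for i, s in enumerate(states))
        energy += sum(pair[i][states[i], states[i + 1]] for i in range(len(pair)))
        z += np.exp(-energy / kT)
    return float(-kT * np.log(z))

# Toy system: four nodes with small, unequal library sizes and random energies.
rng = np.random.default_rng(0)
sizes = [3, 4, 2, 5]
single = [rng.normal(size=n) for n in sizes]
pair = [rng.normal(size=(sizes[i], sizes[i + 1])) for i in range(len(sizes) - 1)]

f_transfer = chain_free_energy(single, pair)
f_brute = brute_force_free_energy(single, pair)
print(f_transfer, f_brute)
```

On a chain the message-passing result is exact, and its cost grows linearly with the number of nodes rather than exponentially. On graphs with loops the analogous iteration is only approximate, which is the regime the study examines.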
2
Giovan SM, Scharein RG, Hanke A, Levene SD. Free-energy calculations for semi-flexible macromolecules: applications to DNA knotting and looping. J Chem Phys 2014; 141:174902. [PMID: 25381542 PMCID: PMC4241824 DOI: 10.1063/1.4900657] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/07/2014] [Accepted: 10/18/2014] [Indexed: 12/16/2022]
Abstract
We present a method to obtain numerically accurate values of configurational free energies of semiflexible macromolecular systems, based on the technique of thermodynamic integration combined with normal-mode analysis of a reference system subject to harmonic constraints. Compared with previous free-energy calculations that depend on a reference state, our approach introduces two innovations, namely, the use of internal coordinates to constrain the reference states and the ability to freely select these reference states. As a consequence, it is possible to explore systems that undergo substantially larger fluctuations than those considered in previous calculations, including semiflexible biopolymers having arbitrary ratios of contour length L to persistence length P. To validate the method, high accuracy is demonstrated for free energies of prime DNA knots with L/P = 20 and L/P = 40, corresponding to DNA lengths of 3000 and 6000 base pairs, respectively. We then apply the method to study the free-energy landscape for a model of a synaptic nucleoprotein complex containing a pair of looped domains, revealing a bifurcation in the location of optimal synapse (crossover) sites. This transition is relevant to target-site selection by DNA-binding proteins that occupy multiple DNA sites separated by large linear distances along the genome, a problem that arises naturally in gene regulation, DNA recombination, and the action of type-II topoisomerases.
Affiliation(s)
- Stefan M Giovan
  - Department of Molecular and Cell Biology, University of Texas at Dallas, Richardson, Texas 75083, USA
- Andreas Hanke
  - Department of Physics and Astronomy, University of Texas at Brownsville, Brownsville, Texas 78520, USA
- Stephen D Levene
  - Department of Molecular and Cell Biology, University of Texas at Dallas, Richardson, Texas 75083, USA
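The method in the abstract above rests on thermodynamic integration from a harmonically constrained reference system. As a rough illustration of thermodynamic integration alone (not the authors' internal-coordinate or normal-mode machinery), the sketch below integrates the average of dU/dλ between two one-dimensional harmonic wells, where the exact answer is known. The spring constants, grid size, sample count, and function name are all assumptions chosen for the example.

```python
import numpy as np

def ti_free_energy_difference(k0, k1, n_lambda=21, n_samples=50_000, kT=1.0, seed=0):
    """Thermodynamic integration for U_lambda(x) = 0.5 * k(lambda) * x**2,
    with k(lambda) = k0 + lambda * (k1 - k0)."""
    rng = np.random.default_rng(seed)
    lambdas = np.linspace(0.0, 1.0, n_lambda)
    mean_dudl = np.empty(n_lambda)
    for i, lam in enumerate(lambdas):
        k = k0 + lam * (k1 - k0)
        # A harmonic well is sampled exactly: x is Gaussian with variance kT / k.
        x = rng.normal(scale=np.sqrt(kT / k), size=n_samples)
        # dU/dlambda = 0.5 * (k1 - k0) * x**2, averaged at this lambda.
        mean_dudl[i] = np.mean(0.5 * (k1 - k0) * x**2)
    # Trapezoid rule over the lambda grid.
    widths = np.diff(lambdas)
    return float(np.sum(0.5 * (mean_dudl[1:] + mean_dudl[:-1]) * widths))

delta_f = ti_free_energy_difference(k0=1.0, k1=4.0)
exact = 0.5 * np.log(4.0 / 1.0)  # analytic result for kT = 1: 0.5 * kT * ln(k1 / k0)
print(delta_f, exact)
```

In a realistic calculation each λ window would be sampled by simulation rather than drawn exactly, and the endpoint free energy of the reference state would come from a separate analytic treatment.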
3
Mamonov AB, Lettieri S, Ding Y, Sarver JL, Palli R, Cunningham TF, Saxena S, Zuckerman DM. Tunable, mixed-resolution modeling using library-based Monte Carlo and graphics processing units. J Chem Theory Comput 2012; 8:2921-2929. [PMID: 23162384 PMCID: PMC3496292 DOI: 10.1021/ct300263z] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/19/2022]
Abstract
Building on our recently introduced library-based Monte Carlo (LBMC) approach, we describe a flexible protocol for mixed coarse-grained (CG)/all-atom (AA) simulation of proteins and ligands. In the present implementation of LBMC, protein side chain configurations are pre-calculated and stored in libraries, while bonded interactions along the backbone are treated explicitly. Because the AA side chain coordinates are maintained at minimal run-time cost, arbitrary sites and interaction terms can be turned on to create mixed-resolution models. For example, an AA region of interest such as a binding site can be coupled to a CG model for the rest of the protein. We have additionally developed a hybrid implementation of the generalized Born/surface area (GBSA) implicit solvent model suitable for mixed-resolution models, which in turn was ported to a graphics processing unit (GPU) for faster calculation. The new software was applied to study two systems: (i) the behavior of spin labels on the B1 domain of protein G (GB1) and (ii) docking of randomly initialized estradiol configurations to the ligand binding domain of the estrogen receptor (ERα). The performance of the GPU version of the code was also benchmarked in a number of additional systems.
4
Do H, Hirst JD, Wheatley RJ. Calculation of Partition Functions and Free Energies of a Binary Mixture Using the Energy Partitioning Method: Application to Carbon Dioxide and Methane. J Phys Chem B 2012; 116:4535-42. [DOI: 10.1021/jp212168f] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 12/19/2022]
Affiliation(s)
- Hainam Do
  - School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, United Kingdom
- Jonathan D. Hirst
  - School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, United Kingdom
- Richard J. Wheatley
  - School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, United Kingdom
5
Somani S, Gilson MK. Accelerated convergence of molecular free energy via superposition approximation-based reference states. J Chem Phys 2011; 134:134107. [PMID: 21476743 PMCID: PMC3094129 DOI: 10.1063/1.3571441] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/16/2010] [Accepted: 03/09/2011] [Indexed: 11/14/2022]
Abstract
The free energy of a molecular system can, at least in principle, be computed by thermodynamic perturbation from a reference system whose free energy is known. The convergence of such a calculation depends critically on the conformational overlap between the reference and the physical systems. One approach to defining a suitable reference system is to construct it from the one-dimensional marginal probability distribution functions (PDFs) of internal coordinates observed in a molecular simulation. However, the conformational overlap of this reference system tends to decline steeply with increasing dimensionality, due to the neglect of correlations among the coordinates. Here, we test a reference system that can account for pairwise correlations among the internal coordinates, as captured by their two-dimensional marginal PDFs derived from a molecular simulation. Incorporating pairwise correlations in the reference system is found to dramatically improve the convergence of the free energy estimates relative to the first-order reference system, due to increased conformational overlap with the physical distribution.
Affiliation(s)
- Sandeep Somani
  - Institute for Bioscience and Biotechnology Research, Rockville, Maryland 20850, USA
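The abstract above relies on thermodynamic perturbation from a reference system of known free energy, F = F_ref - kT ln⟨exp(-(U - U_ref)/kT)⟩_ref. A minimal sketch of that estimator follows, assuming a one-dimensional toy problem in which the reference is a plain harmonic well rather than the marginal-PDF reference systems the paper constructs. The potentials, sample count, and function names are hypothetical.

```python
import numpy as np

def u_ref(x):
    # Hypothetical reference potential with an analytically known partition function.
    return 0.5 * x**2

def u_target(x):
    # Hypothetical "physical" potential: the reference plus an anharmonic term.
    return 0.5 * x**2 + 0.1 * x**4

def perturbation_free_energy(n_samples=200_000, kT=1.0, seed=0):
    """Estimate F_target by perturbation from the harmonic reference."""
    rng = np.random.default_rng(seed)
    # Exact samples from the reference Boltzmann distribution (Gaussian, variance kT).
    x = rng.normal(scale=np.sqrt(kT), size=n_samples)
    f_ref = -kT * np.log(np.sqrt(2.0 * np.pi * kT))
    boltz = np.exp(-(u_target(x) - u_ref(x)) / kT)
    return float(f_ref - kT * np.log(np.mean(boltz)))

def quadrature_free_energy(kT=1.0):
    """Independent check: direct numerical integration of the partition function."""
    x = np.linspace(-10.0, 10.0, 200_001)
    dx = x[1] - x[0]
    z = np.sum(np.exp(-u_target(x) / kT)) * dx
    return float(-kT * np.log(z))

f_pert = perturbation_free_energy()
f_quad = quadrature_free_energy()
print(f_pert, f_quad)
```

The estimate converges quickly here because the reference and target distributions overlap strongly. In many dimensions that overlap degrades, which is the problem the pairwise-correlated reference system is meant to address.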
6
Mamonov AB, Zhang X, Zuckerman DM. Rapid sampling of all-atom peptides using a library-based polymer-growth approach. J Comput Chem 2011; 32:396-405. [PMID: 20734315 PMCID: PMC3005036 DOI: 10.1002/jcc.21626] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/12/2010] [Revised: 05/17/2010] [Accepted: 06/12/2010] [Indexed: 12/30/2022]
Abstract
We adapted existing polymer growth strategies for equilibrium sampling of peptides described by modern atomistic forcefields with a simple uniform dielectric solvent. The main novel feature of our approach is the use of precalculated statistical libraries of molecular fragments. A molecule is sampled by combining fragment configurations (of single residues in this study), which are stored in the libraries. Ensembles generated from the independent libraries are reweighted to conform with the Boltzmann-factor distribution of the forcefield describing the full molecule. In this way, high-quality equilibrium sampling of small peptides (4-8 residues) typically requires less than one hour of single-processor wallclock time and can be significantly faster than Langevin simulations. Furthermore, approximate, clash-free ensembles can be generated for larger peptides (up to 32 residues in this study) in less than a minute of single-processor computing. We discuss possible applications of our growth procedure to free energy calculation, fragment assembly protein-structure prediction protocols, and to "multi-resolution" sampling.
Affiliation(s)
- Artem B Mamonov
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
7
Lettieri S, Mamonov AB, Zuckerman DM. Extending fragment-based free energy calculations with library Monte Carlo simulation: annealing in interaction space. J Comput Chem 2010; 32:1135-43. [PMID: 21387340 DOI: 10.1002/jcc.21695] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/19/2010] [Accepted: 09/10/2010] [Indexed: 11/09/2022]
Abstract
Pre-calculated libraries of molecular fragment configurations have previously been used as a basis for both equilibrium sampling (via library-based Monte Carlo) and for obtaining absolute free energies using a polymer-growth formalism. Here, we combine the two approaches to extend the size of systems for which free energies can be calculated. We study a series of all-atom poly-alanine systems in a simple dielectric solvent and find that precise free energies can be obtained rapidly. For instance, for 12 residues, less than an hour of single-processor time is required. The combined approach is formally equivalent to the annealed importance sampling algorithm; instead of annealing by decreasing temperature, however, interactions among fragments are gradually added as the molecule is grown. We discuss implications for future binding affinity calculations in which a ligand is grown into a binding site.
Affiliation(s)
- Steven Lettieri
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
8
Zhang X, Bhatt D, Zuckerman DM. Automated sampling assessment for molecular simulations using the effective sample size. J Chem Theory Comput 2010; 6:3048-3057. [PMID: 21221418 PMCID: PMC3017371 DOI: 10.1021/ct1002384] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]
Abstract
To quantify the progress in the development of algorithms and forcefields used in molecular simulations, a general method for the assessment of the sampling quality is needed. Statistical mechanics principles suggest the populations of physical states characterize equilibrium sampling in a fundamental way. We therefore develop an approach for analyzing the variances in state populations, which quantifies the degree of sampling in terms of the effective sample size (ESS). The ESS estimates the number of statistically independent configurations contained in a simulated ensemble. The method is applicable to both traditional dynamics simulations and more modern (e.g., multi-canonical) approaches. Our procedure is tested in a variety of systems from toy models to atomistic protein simulations. We also introduce a simple automated procedure to obtain approximate physical states from dynamic trajectories: this allows sample-size estimation in systems for which physical states are not known in advance.
Affiliation(s)
- Xin Zhang
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
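The abstract above quantifies sampling through the variance of state populations. One way to see the idea, sketched here under the assumption that a state's population estimated from N independent configurations has binomial variance p(1 - p)/N, is to invert that relation and read N off the observed variance across repeated estimates. The function name and the synthetic data are hypothetical, and the paper's actual procedure may differ in its details.

```python
import numpy as np

def effective_sample_size(populations):
    """Estimate ESS from repeated estimates of one state's population.

    Assumes binomial statistics: var(p_hat) = p * (1 - p) / N_eff,
    so N_eff = p * (1 - p) / var(p_hat).
    """
    p = np.asarray(populations, dtype=float)
    mean = p.mean()
    var = p.var(ddof=1)  # unbiased sample variance across the repeated estimates
    return float(mean * (1.0 - mean) / var)

# Synthetic check: each "run" contains 500 truly independent configurations,
# and the state of interest has a true population of 0.3.
rng = np.random.default_rng(0)
n_per_run, n_runs, p_true = 500, 2000, 0.3
populations = rng.binomial(n_per_run, p_true, size=n_runs) / n_per_run

ess = effective_sample_size(populations)
print(ess)  # should land near n_per_run when the configurations are independent
```

For correlated trajectory data the same calculation would return a value well below the number of stored frames, which is what makes the quantity useful as a sampling diagnostic.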
9
Ding Y, Mamonov AB, Zuckerman DM. Efficient equilibrium sampling of all-atom peptides using library-based Monte Carlo. J Phys Chem B 2010; 114:5870-7. [PMID: 20380366 PMCID: PMC2882875 DOI: 10.1021/jp910112d] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
We applied our previously developed library-based Monte Carlo (LBMC) to equilibrium sampling of several implicitly solvated all-atom peptides. LBMC can perform equilibrium sampling of molecules using precalculated statistical libraries of molecular-fragment configurations and energies. For this study, we employed residue-based fragments distributed according to the Boltzmann factor of the optimized potential for liquid simulations all-atom (OPLS-AA) forcefield describing the individual fragments. Two solvent models were employed: a simple uniform dielectric and the generalized Born/surface area (GBSA) model. The efficiency of LBMC was compared to standard Langevin dynamics (LD) using three different statistical tools. The statistical analyses indicate that LBMC is more than 100 times faster than LD not only for the simple solvent model but also for GBSA.
Affiliation(s)
- Ying Ding
  - Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Artem B. Mamonov
  - Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Daniel M. Zuckerman
  - Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
10
Bhatt D, Zuckerman DM. Absolute free energies and equilibrium ensembles of dense fluids computed from a nondynamic growth method. J Chem Phys 2009; 131:214110. [PMID: 19968340 DOI: 10.1063/1.3269674] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
We demonstrate a nondynamical Monte Carlo method to compute free energies and generate equilibrium ensembles of dense fluids. In this method, based on step-by-step polymer growth algorithms, an ensemble of n+1 particles is obtained from an ensemble of n particles by generating configurations of the n+1st particle. A statistically rigorous resampling scheme is utilized to remove configurations with low weights and to avoid a combinatorial explosion; the free energy is obtained from the sum of the weights. In addition to the free energy, the method generates an equilibrium ensemble of the full system. We consider two different system sizes for a Lennard-Jones fluid and compare the results with conventional Monte Carlo methods.
Affiliation(s)
- Divesh Bhatt
  - Department of Computational Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
11
Mamonov AB, Bhatt D, Cashman DJ, Ding Y, Zuckerman DM. General library-based Monte Carlo technique enables equilibrium sampling of semi-atomistic protein models. J Phys Chem B 2009; 113:10891-904. [PMID: 19594147 DOI: 10.1021/jp901322v] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
We introduce "library-based Monte Carlo" (LBMC) simulation, which performs Boltzmann sampling of molecular systems based on precalculated statistical libraries of molecular-fragment configurations, energies, and interactions. The library for each fragment can be Boltzmann distributed and thus account for all correlations internal to the fragment. LBMC can be applied to both atomistic and coarse-grained models, as we demonstrate in this "proof-of-principle" report. We first verify the approach in a toy model and in implicitly solvated all-atom polyalanine systems. We next study five proteins, up to 309 residues in size. On the basis of atomistic equilibrium libraries of peptide-plane configurations, the proteins are modeled with fully atomistic backbones and simplified Go-like interactions among residues. We show that full equilibrium sampling can be obtained in days to weeks on a single processor, suggesting that more accurate models are well within reach. For the future, LBMC provides a convenient platform for constructing adjustable or mixed-resolution models: the configurations of all atoms can be stored at no run-time cost, while an arbitrary subset of interactions is "turned on".
Affiliation(s)
- Artem B Mamonov
  - Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA