1
Wang J, Charron N, Husic B, Olsson S, Noé F, Clementi C. Multi-body effects in a coarse-grained protein force field. J Chem Phys 2021; 154:164113. [PMID: 33940848 DOI: 10.1063/5.0041022] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/23/2023]
Abstract
The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.
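For orientation, a multi-body expansion of a CG energy is commonly written in the following generic form. This is a sketch only: the symbols $U$, $u^{(n)}$, and $\mathbf{x}_i$ are assumed notation and are not taken from the paper.

```latex
% Generic multi-body expansion of a CG effective energy over N beads.
% u^{(n)} denotes an n-body term; the notation is assumed for illustration.
U(\mathbf{x}_1,\dots,\mathbf{x}_N) \approx
    \sum_{i} u^{(1)}(\mathbf{x}_i)
  + \sum_{i<j} u^{(2)}(\mathbf{x}_i,\mathbf{x}_j)
  + \sum_{i<j<k} u^{(3)}(\mathbf{x}_i,\mathbf{x}_j,\mathbf{x}_k)
  + \dots
```

In this notation, the slow convergence reported in the abstract corresponds to needing terms up to $u^{(5)}$ before the atomistic free energy is reproduced.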
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Nicholas Charron
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Brooke Husic
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Simon Olsson
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
2
Jernigan RL, Sankar K, Jia K, Faraggi E, Kloczkowski A. Computational Ways to Enhance Protein Inhibitor Design. Front Mol Biosci 2021; 7:607323. [PMID: 33614705 PMCID: PMC7886686 DOI: 10.3389/fmolb.2020.607323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2020] [Accepted: 12/08/2020] [Indexed: 11/22/2022]
Abstract
Two new computational approaches are described to aid in the design of new peptide-based drugs: evaluating ensembles of protein structures from their dynamics, and assessing structures using empirical contact potentials. These approaches build on the concept that conformational variability can aid in the binding process and, for disordered proteins, can even facilitate the binding of more diverse ligands. This latter consideration indicates that such a design process should be less restrictive so that multiple inhibitors might be effective. The example chosen here focuses on proteins/peptides that bind to hemagglutinin (HA) to block the large-scale conformational change for activation. Variability in the conformations is considered from sets of experimental structures or, as an alternative, from their simple computed dynamics. The large set considered is the set of peptides/small proteins designed by the David Baker lab to bind to hemagglutinin, which is assessed with the new empirical contact potentials.
Affiliation(s)
- Robert L. Jernigan
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, United States
- Kannan Sankar
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, United States
- Kejue Jia
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, United States
- Eshel Faraggi
  - Research and Information Systems, LLC, Indianapolis, IN, United States
  - Department of Physics, Indiana University Purdue University Indianapolis, Indianapolis, IN, United States
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, United States
  - Department of Pediatrics, The Ohio State University, Columbus, OH, United States
3
Knowledge-based entropies improve the identification of native protein structures. Proc Natl Acad Sci U S A 2017; 114:2928-2933. [PMID: 28265078 DOI: 10.1073/pnas.1613331114] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 12/19/2022]
Abstract
Evaluating protein structures requires reliable free energies with good estimates of both potential energies and entropies. Although there are many demonstrated successes from using knowledge-based potential energies, computing entropies of proteins has lagged far behind. Here we take an entirely different approach and evaluate knowledge-based conformational entropies of proteins based on the observed frequencies of contact changes between amino acids in a set of 167 diverse proteins, each of which has two alternative structures. The results show that charged and polar interactions break more often than hydrophobic pairs. This pattern correlates strongly with the average solvent exposure of amino acids in globular proteins, as well as with polarity indices and the sizes of the amino acids. Knowledge-based entropies are derived by using the inverse Boltzmann relationship, in a manner analogous to the way that knowledge-based potentials have been extracted. Including these new knowledge-based entropies almost doubles the performance of knowledge-based potentials in selecting the native protein structures from decoy sets. Beyond the overall energy-entropy compensation, a similar compensation is seen for individual pairs of interacting amino acids. The entropies in this report have immediate applications for 3D structure prediction, protein model assessment, and protein engineering and design.
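The inverse Boltzmann relationship mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the toy counts, the choice of reference distribution, and `kT = 1.0` are all assumptions made for the example.

```python
import math

def inverse_boltzmann(observed, reference, kT=1.0):
    """Return -kT * ln(p_obs / p_ref) for each key (hypothetical helper).

    `observed` and `reference` map a pair label to a positive count.
    Zero counts would need pseudocounts, which this sketch omits.
    """
    obs_total = sum(observed.values())
    ref_total = sum(reference.values())
    scores = {}
    for key, count in observed.items():
        p_obs = count / obs_total              # observed frequency
        p_ref = reference[key] / ref_total     # reference (expected) frequency
        scores[key] = -kT * math.log(p_obs / p_ref)
    return scores

# Toy counts, invented for illustration only.
observed = {"LYS-GLU": 30, "LEU-ILE": 10}
reference = {"LYS-GLU": 20, "LEU-ILE": 20}
scores = inverse_boltzmann(observed, reference)
```

A pair observed more often than the reference predicts receives a negative score, and a pair observed less often receives a positive one.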
4
Sankar K, Liu J, Wang Y, Jernigan RL. Distributions of experimental protein structures on coarse-grained free energy landscapes. J Chem Phys 2016; 143:243153. [PMID: 26723638 DOI: 10.1063/1.4937940] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/15/2022]
Abstract
Predicting conformational changes of proteins is needed in order to fully comprehend functional mechanisms. With the large number of available structures in sets of related proteins, it is now possible to directly visualize the clusters of conformations and their conformational transitions through the use of principal component analysis. The most striking observation about the distributions of the structures along the principal components is their highly non-uniform distributions. In this work, we use principal component analysis of experimental structures of 50 diverse proteins to extract the most important directions of their motions, sample structures along these directions, and estimate their free energy landscapes by combining knowledge-based potentials and entropy computed from elastic network models. When these resulting motions are visualized upon their coarse-grained free energy landscapes, the basis for conformational pathways becomes readily apparent. Using three well-studied proteins, T4 lysozyme, serum albumin, and sarco-endoplasmic reticular Ca²⁺ adenosine triphosphatase (SERCA), as examples, we show that such free energy landscapes of conformational changes provide meaningful insights into the functional dynamics and suggest transition pathways between different conformational states. As a further example, we also show that Monte Carlo simulations on the coarse-grained landscape of HIV-1 protease can directly yield pathways for force-driven conformational changes.
Affiliation(s)
- Kannan Sankar
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa 50011, USA
- Jie Liu
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa 50011, USA
- Yuan Wang
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa 50011, USA
- Robert L Jernigan
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa 50011, USA
5
Zhu Y, Chen SJ. Many-body effect in ion binding to RNA. J Chem Phys 2014; 141:055101. [PMID: 25106614 PMCID: PMC4119196 DOI: 10.1063/1.4890656] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/13/2014] [Accepted: 06/30/2014] [Indexed: 01/07/2023]
Abstract
Ion-mediated electrostatic interactions play an important role in RNA folding stability. For an RNA in a solution with higher Mg²⁺ ion concentration, more counterions in the solution can bind to the RNA, causing a strong many-body coupling between the bound ions. The many-body effect can change the effective potential of mean force between the tightly bound ions. This effect tends to dampen ion binding and lower RNA folding stability. Neglecting the many-body effect leads to a systematic error (over-estimation) of RNA folding stability at high Mg²⁺ ion concentrations. Using the tightly bound ion model combined with a conformational ensemble model, we investigate the influence of the many-body effect on the ion-dependent RNA folding stability. Comparisons with the experimental data indicate that including the many-body effect leads to much improved predictions for RNA folding stability at high Mg²⁺ ion concentrations. The results suggest that the many-body effect can be important for RNA folding in high concentrations of multivalent ions. Further investigation showed that the many-body effect can influence the spatial distribution of the tightly bound ions, and that the effect is more pronounced for compact RNA structures and structures prone to local clustering of ions.
Affiliation(s)
- Yuhong Zhu
  - Department of Physics, Zhejiang University, Hangzhou, Zhejiang 310027, China
- Shi-Jie Chen
  - Department of Physics and Department of Biochemistry, University of Missouri, Columbia, Missouri 65211, USA
6
Tang K, Zhang J, Liang J. Fast protein loop sampling and structure prediction using distance-guided sequential chain-growth Monte Carlo method. PLoS Comput Biol 2014; 10:e1003539. [PMID: 24763317 PMCID: PMC3998890 DOI: 10.1371/journal.pcbi.1003539] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 08/29/2013] [Accepted: 02/01/2014] [Indexed: 11/18/2022]
Abstract
Loops in proteins are flexible regions connecting regular secondary structures. They are often involved in protein functions through interacting with other molecules. The irregularity and flexibility of loops make their structures difficult to determine experimentally and challenging to model computationally. Conformation sampling and energy evaluation are the two key components in loop modeling. We have developed a new method for loop conformation sampling and prediction based on a chain growth sequential Monte Carlo sampling strategy, called Distance-guided Sequential chain-Growth Monte Carlo (DISGRO). With an energy function designed specifically for loops, our method can efficiently generate high quality loop conformations with low energy that are enriched with near-native loop structures. The average minimum global backbone RMSD for 1,000 conformations of 12-residue loops is 1.53 Å, with a lowest energy RMSD of 2.99 Å, and an average ensemble RMSD of 5.23 Å. A novel geometric criterion is applied to speed up calculations. The computational cost of generating 1,000 conformations for each of the x loops in a benchmark dataset is only about 10 cpu minutes for 12-residue loops, compared to ca. 180 cpu minutes using the FALCm method. Test results on benchmark datasets show that DISGRO performs comparably to or better than previous successful methods, while requiring far less computing time. DISGRO is especially effective in modeling longer loops (10-17 residues).
Affiliation(s)
- Ke Tang
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Jinfeng Zhang
  - Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
  - * E-mail: (JZ); (JL)
- Jie Liang
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
  - * E-mail: (JZ); (JL)
7
Zheng W. All-atom and coarse-grained simulations of the forced unfolding pathways of the SNARE complex. Proteins 2014; 82:1376-86. [DOI: 10.1002/prot.24505] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/16/2013] [Revised: 12/11/2013] [Accepted: 01/06/2014] [Indexed: 01/03/2023]
Affiliation(s)
- Wenjun Zheng
  - Department of Physics; University at Buffalo, State University of New York; New York
8
Su JG, Du HJ, Hao R, Xu XJ, Li CH, Chen WZ, Wang CX. Identification of functionally key residues in AMPA receptor with a thermodynamic method. J Phys Chem B 2013; 117:8689-96. [PMID: 23822189 DOI: 10.1021/jp402290t] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
Abstract
The AMPA receptor mediates fast excitatory synaptic transmission in the central nervous system, and it is activated by the binding of glutamate, which results in the opening of the transmembrane ion channel. In the present work, the thermodynamic method developed by our group was improved and then applied to identify the functionally key residues that regulate the glutamate-binding affinity of the AMPA receptor. In our method, the key residues are identified as those whose perturbation substantially changes the ligand binding free energy of the protein. It is found that, besides the ligand binding sites, other residues distant from the binding cleft can also influence the glutamate binding affinity through long-range allosteric regulation. These allosteric sites include the hinge region of the ligand binding cleft, the dimer interface of the ligand binding domain, the linkers between the ligand binding domain and the transmembrane domain, and the interface between the N-terminal domain and the ligand binding domain. Our calculation results are consistent with the available experimental data. The results are helpful for understanding the mechanism of long-range allosteric communication in the AMPA receptor and the mechanism of channel opening triggered by glutamate binding.
Affiliation(s)
- Ji Guo Su
  - College of Science, Yanshan University, Qinhuangdao, China
9
Zacharias M. Combining coarse-grained nonbonded and atomistic bonded interactions for protein modeling. Proteins 2012; 81:81-92. [PMID: 22911567 DOI: 10.1002/prot.24164] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 06/19/2012] [Revised: 08/09/2012] [Accepted: 08/15/2012] [Indexed: 12/12/2022]
Abstract
A hybrid coarse-grained (CG) and atomistic (AT) model for protein simulations and rapid searching and refinement of peptide-protein complexes has been developed. In contrast to other hybrid models that typically represent spatially separate parts of a protein by either a CG or an AT force field model, the present approach simultaneously represents the protein by an AT (united atom) and a CG model. The interactions of the protein main chain are described based on the united atom force field, allowing a realistic representation of protein secondary structures. In addition, the AT description of all other bonded interactions keeps the protein compatible with a realistic bonded geometry. Nonbonded interactions between side chains, and between side chains and the main chain, are calculated at the level of a CG model using a knowledge-based potential. Unrestrained molecular dynamics simulations on several test proteins resulted in trajectories in reasonable agreement with the corresponding experimental structures. Applications to the refinement of docked peptide-protein complexes resulted in improved complex structures. Application to the rapid refinement of docked protein-protein complexes is also possible but requires further optimization of force field parameters.
Affiliation(s)
- Martin Zacharias
  - Physik-Department T38, Technische Universität München, James Franck Str. 1, 85748 Garching, Germany.
10
Gniewek P, Kolinski A, Jernigan RL, Kloczkowski A. Elastic network normal modes provide a basis for protein structure refinement. J Chem Phys 2012; 136:195101. [PMID: 22612113 DOI: 10.1063/1.4710986] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 01/13/2023]
Abstract
It is well recognized that thermal motions of atoms in the protein native state, the fluctuations about the minimum of the global free energy, are well reproduced by simple elastic network models (ENMs) such as the anisotropic network model (ANM). Elastic network models represent protein dynamics as vibrations of a network of nodes (usually represented by positions of the heavy atoms, or by the Cα atoms only for coarse-grained representations) in which spatially close nodes are connected by harmonic springs. These models provide a reliable representation of the fluctuational dynamics of proteins and RNA, and explain various conformational changes in protein structures, including those important for ligand binding. In the present paper, we study the problem of protein structure refinement by analyzing thermal motions of proteins in non-native states. We represent the conformational space close to the native state by a set of decoys generated by the I-TASSER protein structure prediction server utilizing template-free modeling. The protein substates are selected by hierarchical structure clustering. The main finding is that thermal motions for some substates overlap significantly with the deformations necessary to reach the native state. Additionally, more mobile residues yield higher overlaps with the required deformations than do the less mobile ones. These findings suggest that structural refinement of poorly resolved protein models can be significantly enhanced by reduction of the conformational space to the motions imposed by the dominant normal modes.
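The network-of-springs picture described in the abstract can be illustrated by assembling an ANM-style Hessian from node coordinates. This is a sketch under stated assumptions, not the authors' code: the function name `anm_hessian`, the 15 Å cutoff, the uniform spring constant `gamma`, and the toy coordinates are all chosen for illustration.

```python
def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build a 3N x 3N ANM-style Hessian (hypothetical helper).

    Nodes closer than `cutoff` are joined by a harmonic spring of
    strength `gamma`. Nodes are assumed to have distinct positions.
    """
    n = len(coords)
    hess = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[j][k] - coords[i][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            if r2 > cutoff * cutoff:
                continue  # nodes too far apart: no spring
            for a in range(3):
                for b in range(3):
                    block = -gamma * d[a] * d[b] / r2
                    # Off-diagonal 3x3 blocks couple nodes i and j.
                    hess[3 * i + a][3 * j + b] += block
                    hess[3 * j + a][3 * i + b] += block
                    # Diagonal blocks are minus the sum of the off-diagonal blocks.
                    hess[3 * i + a][3 * i + b] -= block
                    hess[3 * j + a][3 * j + b] -= block
    return hess

# Toy four-node "structure", coordinates in Å (invented for illustration).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
hessian = anm_hessian(toy_coords)
```

Diagonalizing such a matrix with any symmetric eigensolver gives the normal modes, the lowest-frequency nonzero ones being the dominant modes the abstract refers to.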
Affiliation(s)
- Pawel Gniewek
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
11
Khashan R, Zheng W, Tropsha A. Scoring protein interaction decoys using exposed residues (SPIDER): a novel multibody interaction scoring function based on frequent geometric patterns of interfacial residues. Proteins 2012; 80:2207-17. [PMID: 22581643 DOI: 10.1002/prot.24110] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 01/11/2012] [Revised: 04/05/2012] [Accepted: 04/20/2012] [Indexed: 01/14/2023]
Abstract
Accurate prediction of the structure of protein-protein complexes in computational docking experiments remains a formidable challenge. It has been recognized that identifying native or native-like poses among multiple decoys is the major bottleneck of the current scoring functions used in docking. We have developed a novel multibody pose-scoring function that has no theoretical limit on the number of residues contributing to the individual interaction terms. We use a coarse-grained representation of a protein-protein complex in which each residue is represented by its side chain centroid. We apply a computational geometry approach called Almost-Delaunay tessellation that transforms protein-protein complexes into a residue contact network, or an undirected graph in which residues are vertices connected by edges. This treatment forms a family of interfacial graphs representing a dataset of protein-protein complexes. We then employ a frequent subgraph mining approach to identify common interfacial residue patterns that appear in at least a subset of native protein-protein interfaces. The geometrical parameters and frequency of occurrence of each "native" pattern in the training set are used to develop the new SPIDER scoring function. SPIDER was validated using the standard "ZDOCK" benchmark dataset, which was not used in the development of SPIDER. We demonstrate that the SPIDER scoring function ranks native and native-like poses above geometrical decoys and that it outperforms the popular ZRANK scoring function. SPIDER was ranked among the top scoring functions in a recent round of the CAPRI (Critical Assessment of PRedicted Interactions) blind test of protein-protein docking methods.
Affiliation(s)
- Raed Khashan
  - Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA
12
Zimmermann MT, Leelananda SP, Kloczkowski A, Jernigan RL. Combining statistical potentials with dynamics-based entropies improves selection from protein decoys and docking poses. J Phys Chem B 2012; 116:6725-31. [PMID: 22490366 DOI: 10.1021/jp2120143] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Protein structure prediction and protein-protein docking are important and widely used tools, but methods to confidently evaluate the quality of a predicted structure or binding pose have had limited success. Typically, either knowledge-based or physics-based energy functions are employed to evaluate a set of predicted structures (termed "decoys" in structure prediction and "poses" in docking), with the lowest energy structure being assumed to be the one closest to the native state. While successful for many cases, failures are still common. Thus, improvements to structure evaluation methods are essential for future improvements. In this work, we combine multibody statistical potentials with dynamics models, evaluating fluctuation-based entropies that include contributions from the entire structure. This leads to enhanced selection of native-like structures for CASP9 decoys, refined ClusPro docking poses, as well as large sets of docking poses from the Benchmark 3.0 and Dockground data sets. The data used include both bound and unbound docking, and positive results are found for each type. Not only does this method yield improved average results, but for high quality docking poses, we often pick the best pose.
Affiliation(s)
- Michael T Zimmermann
  - Bioinformatics and Computational Biology Interdepartmental Graduate Program, Iowa State University, Ames, Iowa 50011, USA
13
Protein Loop Dynamics Are Complex and Depend on the Motions of the Whole Protein. Entropy 2012. [DOI: 10.3390/e14040687] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]