1
Padhi AK, Kalita P, Maurya S, Poluri KM, Tripathi T. From De Novo Design to Redesign: Harnessing Computational Protein Design for Understanding SARS-CoV-2 Molecular Mechanisms and Developing Therapeutics. J Phys Chem B 2023; 127:8717-8735. [PMID: 37815479 DOI: 10.1021/acs.jpcb.3c04542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023]
Abstract
The continuous emergence of novel SARS-CoV-2 variants and subvariants serves as compelling evidence that COVID-19 is an ongoing concern. The swift, well-coordinated response to the pandemic highlights how technological advancements can accelerate the detection, monitoring, and treatment of the disease. Robust surveillance systems have been established to understand the clinical characteristics of new variants, although the unpredictable nature of these variants presents significant challenges. Some variants have shown resistance to current treatments, but innovative technologies like computational protein design (CPD) offer promising solutions and versatile therapeutics against SARS-CoV-2. Advances in computing power, coupled with open-source platforms like AlphaFold and RFdiffusion (employing deep neural network and diffusion generative models), among many others, have accelerated the design of protein therapeutics with precise structures and intended functions. CPD has played a pivotal role in developing peptide inhibitors, mini proteins, protein mimics, decoy receptors, nanobodies, monoclonal antibodies, identifying drug-resistance mutations, and even redesigning native SARS-CoV-2 proteins. Pending regulatory approval, these designed therapies hold the potential for a lasting impact on human health and sustainability. As SARS-CoV-2 continues to evolve, use of such technologies enables the ongoing development of alternative strategies, thus equipping us for the "New Normal".
Affiliation(s)
- Aditya K Padhi
- Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi 221005, Uttar Pradesh, India
- Parismita Kalita
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Shweata Maurya
- Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi 221005, Uttar Pradesh, India
- Krishna Mohan Poluri
- Department of Biosciences and Bioengineering, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
- Centre for Nanotechnology, Indian Institute of Technology Roorkee, Roorkee 247667, Uttarakhand, India
- Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India
- Department of Zoology, School of Life Sciences, North-Eastern Hill University, Shillong 793022, India
2
3
On the ability of molecular dynamics simulation and continuum electrostatics to treat interfacial water molecules in protein-protein complexes. Sci Rep 2016; 6:38259. [PMID: 27905545 PMCID: PMC5131287 DOI: 10.1038/srep38259] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/01/2016] [Accepted: 10/28/2016] [Indexed: 01/23/2023] Open
Abstract
Interfacial waters are increasingly appreciated as playing a key role in protein-protein interactions. We report on a study of the prediction of interfacial water positions by both Molecular Dynamics and explicit solvent-continuum electrostatics based on the Dipolar Poisson-Boltzmann Langevin (DPBL) model, for three test cases: (i) the barnase/barstar complex, (ii) the complex between the DNase domain of colicin E2 and its cognate Im2 immunity protein, and (iii) the highly unusual anti-freeze protein Maxi, which contains a large number of waters in its interior. We characterize the waters at the interface and in the core of the Maxi protein by the statistics of correctly predicted positions with respect to crystallographic water positions in the PDB files, as well as by the dynamic measures of diffusion constants and position lifetimes. Our approach provides a methodology for the evaluation of predicted interfacial water positions through an investigation of water-mediated inter-chain contacts. While our results show satisfactory behaviour for molecular dynamics simulation, they also highlight the need for improvement of continuum methods.
4
Gil VA, Lecina D, Grebner C, Guallar V. Enhancing backbone sampling in Monte Carlo simulations using internal coordinates normal mode analysis. Bioorg Med Chem 2016; 24:4855-4866. [PMID: 27436808 DOI: 10.1016/j.bmc.2016.07.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/20/2016] [Revised: 07/01/2016] [Accepted: 07/02/2016] [Indexed: 10/21/2022]
Abstract
Normal mode methods are becoming a popular alternative to sample the conformational landscape of proteins. In this study, we describe the implementation of an internal coordinate normal mode analysis method and its application in exploring protein flexibility by using the Monte Carlo method PELE. This new method alternates two different stages, a perturbation of the backbone through the application of torsional normal modes, and a resampling of the side chains. We have evaluated the new approach using two test systems, ubiquitin and c-Src kinase, and the differences to the original ANM method are assessed by comparing both results to reference molecular dynamics simulations. The results suggest that the sampled phase space in the internal coordinate approach is closer to the molecular dynamics phase space than the one coming from a Cartesian coordinate anisotropic network model. In addition, the new method shows a great speedup (∼5-7×), making it a good candidate for future normal mode implementations in Monte Carlo methods.
Affiliation(s)
- Victor A Gil
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Daniel Lecina
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, 08034 Barcelona, Spain
- Christoph Grebner
- Department of Medicinal Chemistry, CVMD iMed, AstraZeneca, S-43183 Mölndal, Sweden
- Victor Guallar
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, 08034 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, E-08010 Barcelona, Spain.
5
Parmar AS, James JK, Grisham DR, Pike DH, Nanda V. Dissecting Electrostatic Contributions to Folding and Self-Assembly Using Designed Multicomponent Peptide Systems. J Am Chem Soc 2016; 138:4362-7. [PMID: 26966815 DOI: 10.1021/jacs.5b10304] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 11/28/2022]
Abstract
We investigate formation of nano- to microscale peptide fibers and sheets where assembly requires association of two distinct collagen mimetic peptides (CMPs). The multicomponent nature of these designs allows the decoupling of amino acid contributions to peptide folding versus higher-order assembly. While both arginine and lysine containing CMP sequences can favor triple-helix folding, only arginine promotes rapid supramolecular assembly in each of the three two-component systems examined. Unlike lysine, the polyvalent guanidyl group of arginine is capable of both intra- and intermolecular contacts, promoting assembly. This is consistent with the supramolecular diversity of CMP morphologies observed throughout the literature. It also connects CMP self-assembly with a broad range of biomolecular interaction phenomena, providing general principles for modeling and design.
Affiliation(s)
- Avanish S Parmar
- Department of Physics, Indian Institute of Technology (Banaras Hindu University), Varanasi 221005, Uttar Pradesh, India
- Jose K James
- Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, 679 Hoes Lane West, Piscataway, New Jersey 08854, United States
- Daniel R Grisham
- Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, 679 Hoes Lane West, Piscataway, New Jersey 08854, United States
- Douglas H Pike
- Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, 679 Hoes Lane West, Piscataway, New Jersey 08854, United States
- Vikas Nanda
- Center for Advanced Biotechnology and Medicine, Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, 679 Hoes Lane West, Piscataway, New Jersey 08854, United States
6
LuCore SD, Litman JM, Powers KT, Gao S, Lynn AM, Tollefson WTA, Fenn TD, Washington MT, Schnieders MJ. Dead-End Elimination with a Polarizable Force Field Repacks PCNA Structures. Biophys J 2015; 109:816-26. [PMID: 26287633 PMCID: PMC4547145 DOI: 10.1016/j.bpj.2015.06.062] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/16/2015] [Revised: 06/07/2015] [Accepted: 06/29/2015] [Indexed: 11/15/2022] Open
Abstract
A balance of van der Waals, electrostatic, and hydrophobic forces drives the folding and packing of protein side chains. Although such interactions between residues are often approximated as being pairwise additive, in reality, higher-order many-body contributions that depend on environment drive hydrophobic collapse and cooperative electrostatics. Beginning from dead-end elimination, we derive the first algorithm, to our knowledge, capable of deterministic global repacking of side chains compatible with many-body energy functions. The approach is applied to seven PCNA x-ray crystallographic data sets with resolutions 2.5-3.8 Å (mean 3.0 Å) using open-source software. While PDB_REDO models average an Rfree value of 29.5% and MOLPROBITY score of 2.71 Å (77th percentile), dead-end elimination with the polarizable AMOEBA force field lowered Rfree by 2.8-26.7% and improved mean MOLPROBITY score to atomic resolution at 1.25 Å (100th percentile). For structural biology applications that depend on side-chain repacking, including x-ray refinement, homology modeling, and protein design, the accuracy limitations of pairwise additivity can now be eliminated via polarizable or quantum mechanical potentials.
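For orientation, the pairwise-additive starting point that the abstract mentions can be sketched with the classic Goldstein elimination criterion. This is not the many-body algorithm derived in the paper; the function name `goldstein_dee` and the table layout (`self_e[i][r]` for self energies, `pair_e[(i, j)][r][s]` for pair energies with `i < j`) are assumptions made only for this illustration.

```python
def goldstein_dee(self_e, pair_e):
    """Prune rotamers with the pairwise Goldstein dead-end elimination criterion.

    self_e[i][r]         : self energy of rotamer r at position i
    pair_e[(i, j)][r][s] : pair energy of rotamer r at i and rotamer s at j (i < j)
    Returns a list of sets holding the surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]

    def pe(i, r, j, s):
        # Look up a pair energy regardless of the order of the two positions.
        return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]

    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    # Rotamer r is a dead end if switching to t lowers the energy
                    # even in the environment most favourable to r.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(pe(i, r, j, s) - pe(i, t, j, s)
                                       for s in alive[j])
                    if gap > 0:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


# Toy system: two positions, two rotamers each (energies are made up).
toy_self = [[0.0, 0.5], [0.0, 0.0]]
toy_pair = {(0, 1): [[0.0, 2.0], [-3.0, 0.0]]}
survivors = goldstein_dee(toy_self, toy_pair)
```

A rotamer is only removed when another surviving rotamer at the same position provably beats it, so the global minimum-energy conformation of a pairwise model is never pruned. Extending the criterion to many-body energy functions is the contribution the abstract describes.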
Affiliation(s)
- Stephen D LuCore
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa
- Jacob M Litman
- Department of Biochemistry, University of Iowa, Iowa City, Iowa
- Kyle T Powers
- Department of Biochemistry, University of Iowa, Iowa City, Iowa
- Shibo Gao
- Department of Biochemistry, University of Iowa, Iowa City, Iowa
- Ava M Lynn
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa
- Michael J Schnieders
- Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa; Department of Biochemistry, University of Iowa, Iowa City, Iowa.
7
OptMAVEn--a new framework for the de novo design of antibody variable region models targeting specific antigen epitopes. PLoS One 2014; 9:e105954. [PMID: 25153121 PMCID: PMC4143332 DOI: 10.1371/journal.pone.0105954] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 05/22/2014] [Accepted: 07/29/2014] [Indexed: 01/12/2023] Open
Abstract
Antibody-based therapeutics provide novel and efficacious treatments for a number of diseases. Traditional experimental approaches for designing therapeutic antibodies rely on raising antibodies against a target antigen in an immunized animal or directed evolution of antibodies with low affinity for the desired antigen. However, these methods remain time-consuming, cannot target a specific epitope, and do not lead to broad design principles informing other studies. Computational design methods can overcome some of these limitations by using biophysics models to rationally select antibody parts that maximize affinity for a target antigen epitope. This has been addressed to some extent by OptCDR for the design of complementarity determining regions. Here, we extend this earlier contribution by addressing the de novo design of a model of the entire antibody variable region against a given antigen epitope while safeguarding for immunogenicity (Optimal Method for Antibody Variable region Engineering, OptMAVEn). OptMAVEn simulates in silico the in vivo steps of antibody generation and evolution, and is capable of capturing the critical structural features responsible for affinity maturation of antibodies. In addition, a humanization procedure was developed and incorporated into OptMAVEn to minimize the potential immunogenicity of the designed antibody models. As case studies, OptMAVEn was applied to design models of neutralizing antibodies targeting influenza hemagglutinin and HIV gp120. For both HA and gp120, novel computational antibody models with numerous interactions with their target epitopes were generated. The observed rates of mutations and types of amino acid changes during in silico affinity maturation are consistent with what has been observed during in vivo affinity maturation. The results demonstrate that OptMAVEn can efficiently generate diverse computational antibody models with both optimized binding affinity to antigens and reduced immunogenicity.
8
Gaillard T, Simonson T. Pairwise decomposition of an MMGBSA energy function for computational protein design. J Comput Chem 2014; 35:1371-87. [PMID: 24854675 DOI: 10.1002/jcc.23637] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 01/16/2014] [Revised: 04/14/2014] [Accepted: 05/01/2014] [Indexed: 02/02/2023]
Abstract
Computational protein design (CPD) aims at predicting new proteins or modifying existing ones. The computational challenge is huge as it requires exploring an enormous sequence and conformation space. The difficulty can be reduced by considering a fixed backbone and a discrete set of sidechain conformations. Another common strategy consists in precalculating a pairwise energy matrix, from which the energy of any sequence/conformation can be quickly obtained. In this work, we examine the pairwise decomposition of protein MMGBSA energy functions from a general theoretical perspective, and an implementation proposed earlier for CPD. It includes a Generalized Born term, whose many-body character is overcome using an effective dielectric environment, and a Surface Area term, for which we present an improved pairwise decomposition. A detailed evaluation of the error introduced by the decomposition on the different energy components is performed. We show that the error remains reasonable, compared to other uncertainties.
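The precalculated pairwise energy matrix mentioned in the abstract can be illustrated with a minimal sketch: once self and pair terms are tabulated for a fixed backbone and a discrete rotamer set, the energy of any sequence/conformation is a sum of table lookups. The function name `total_energy`, the table layout, and the numbers are hypothetical and are not taken from the paper's MMGBSA implementation.

```python
def total_energy(self_e, pair_e, assignment):
    """Energy of one rotamer assignment from precomputed tables.

    self_e[i][r]         : self (backbone plus intra-residue) energy of rotamer r at position i
    pair_e[(i, j)][r][s] : interaction energy of rotamer r at i with rotamer s at j (i < j)
    assignment[i]        : index of the rotamer chosen at position i
    """
    n = len(assignment)
    # One lookup per position for the self terms.
    energy = sum(self_e[i][assignment[i]] for i in range(n))
    # One lookup per position pair for the interaction terms.
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_e[(i, j)][assignment[i]][assignment[j]]
    return energy


# Made-up tables for two positions with two rotamers each.
toy_self = [[1.0, 2.0], [0.5, 0.0]]
toy_pair = {(0, 1): [[0.1, -1.0], [0.3, 0.2]]}
e_a = total_energy(toy_self, toy_pair, (0, 1))  # 1.0 + 0.0 - 1.0
e_b = total_energy(toy_self, toy_pair, (1, 0))  # 2.0 + 0.5 + 0.3
```

Each evaluation costs a number of lookups quadratic in the number of positions and needs no force-field call, which is why many-body terms such as Generalized Born and surface area must first be approximated in pairwise form before they can be stored this way.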
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
9
Huang X, Yang J, Zhu Y. A solvated ligand rotamer approach and its application in computational protein design. J Mol Model 2012. [PMID: 23192355 DOI: 10.1007/s00894-012-1695-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Abstract
The structure-based design of protein-ligand interfaces with respect to different small molecules is of great significance in the discovery of functional proteins. By statistical analysis of a set of protein-ligand complex structures, it was determined that water-mediated hydrogen bonding at the protein-ligand interface plays a crucial role in governing the binding between the protein and the ligand. Based on the novel statistical results, a solvated ligand rotamer approach was developed to explicitly describe the key water molecules at the protein-ligand interface, and a water-mediated hydrogen bonding model was applied in the computational protein design context to complement the continuum solvent model. The solvated ligand rotamer approach produces only one additional solvated rotamer for each rotamer in the ligand rotamer library and does not change the number of side-chain rotamers at each protein design site. This has greatly reduced the total combinatorial number in sequence selection for protein design, and the accuracy of the model was confirmed by two tests. For the water placement test, 61% of the crystal water molecules were predicted correctly in five protein-ligand complex structures. For the sequence recapitulation test, 44.7% of the amino acid identities were recovered using the solvated ligand rotamer approach and the water-mediated hydrogen bonding model, while only 30.4% were recovered when the explicitly bound waters were removed. These results indicated that the developed solvated ligand rotamer approach is promising for functional protein design targeting novel protein-ligand interactions.
Affiliation(s)
- Xiaoqiang Huang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
10
Designing electrostatic interactions in biological systems via charge optimization or combinatorial approaches: insights and challenges with a continuum electrostatic framework. Theor Chem Acc 2012. [DOI: 10.1007/s00214-012-1252-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
11
In Silico Strategies Toward Enzyme Function and Dynamics. Advances in Protein Chemistry and Structural Biology 2012. [DOI: 10.1016/b978-0-12-398312-1.00009-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
12
Menikarachchi LC, Gascón JA. An extrapolation method for computing protein solvation energies based on density fragmentation of a graphical surface tessellation. J Mol Graph Model 2011; 30:38-45. [PMID: 21715202 DOI: 10.1016/j.jmgm.2011.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/02/2011] [Revised: 05/26/2011] [Accepted: 06/02/2011] [Indexed: 11/28/2022]
Abstract
Modeling chemical events inside proteins often requires the incorporation of solvent effects via continuum polarizable models. One of these approaches is based on the assumption that the interface between solute and solvent acts as a conductor. Image charges are added on the molecular surface to satisfy the appropriate conductor boundary conditions in the presence of solute charges. As in the case of other polarizable continuum models that are based on surface tessellation, the simplest implementation of this approach is often limited to several hundred atoms due to a matrix inversion, which scales as the cube of the number of tesserae. For larger systems, approaches that use iterative matrix solvers coupled to fast summation methods must be used. In the present work, we develop a self-consistent approach to obtain conductor-like screening charges suitable for applications in proteins. The approach is based on a density fragmentation of a graphical surface tessellation. This method, although approximate, provides a straightforward scheme of parallelization, which can in principle be added to existing linear scaling implementations of conductor-like models. We implement this method in conjunction with a fixed charge model for the protein, as well as with a moving domain QM/MM description of the protein. In the latter case, the overall result leads to a charge distribution within the protein determined by self-polarization and polarization due to solvent.
Affiliation(s)
- Lochana C Menikarachchi
- Department of Chemistry, University of Connecticut, 55 North Eagleville Rd., Unit 3060, Storrs, CT 06269, USA
13
Abstract
The ability to engineer novel proteins using the principles of molecular structure and energetics is a stringent test of our basic understanding of how proteins fold and maintain structure. The design of protein self-assembly has the potential to impact many fields of biology from molecular recognition to cell signaling to biomaterials. Most progress in computational design of protein self-assembly has focused on α-helical systems, exploring ways to concurrently optimize the stability and specificity of a target state. Applying these methods to collagen self-assembly is very challenging, due to fundamental differences in folding and structure of α- versus triple-helices. Here, we explore various computational methods for designing stable and specific oligomeric systems, with a focus on α-helix and collagen self-assembly.
14
Bueno M, Temiz NA, Camacho CJ. Novel modulation factor quantifies the role of water molecules in protein interactions. Proteins 2011; 78:3226-34. [PMID: 20665475 DOI: 10.1002/prot.22805] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
Water molecules decrease the potential of mean force of a hydrogen bond (H-bond), as well as modulate (de)solvation forces, but exactly how much has not been easy to determine. Crystallographic water molecules provide snapshots of optimal solutions for the role of solvent in protein interactions, information that is often ignored by implicit solvent models. Motivated by high-resolution crystal structures, we describe a simple quantitative approach to explicitly incorporate the role of molecular water in protein interactions. Applications to protein-DNA interactions show that the accuracy of binding free-energy estimates improves significantly if a distinction is made between H-bonds that are desolvated (or only contact crystal waters), solvated by mobile waters trapped at the binding interface, or partially solvated through connections to bulk water. These different environments are modeled by a unique "water" scaling factor that decreases or increases the strength of hydrogen bonds depending on whether water contacts the acceptor or donor atoms or the bond is fully desolvated, respectively. Our empirical energies are fully consistent with mobile water molecules having a strong polarization effect in direct intermolecular interactions.
Affiliation(s)
- Marta Bueno
- Department of Pathology, Division of Transplant Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania 15213, USA.
15
The empirical valence bond model: theory and applications. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.10] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.7] [Indexed: 12/13/2022]
16
Chen Z, Wilmanns M, Zeng AP. Structural synthetic biotechnology: from molecular structure to predictable design for industrial strain development. Trends Biotechnol 2010; 28:534-42. [PMID: 20727604 DOI: 10.1016/j.tibtech.2010.07.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 04/27/2010] [Revised: 07/14/2010] [Accepted: 07/15/2010] [Indexed: 10/19/2022]
Abstract
The future of industrial biotechnology requires efficient development of highly productive and robust strains of microorganisms. Present praxis of strain development cannot adequately fulfill this requirement, primarily owing to the inability to control reactions precisely at a molecular level, or to predict reliably the behavior of cells upon perturbation. Recent developments in two areas of biology are changing the situation rapidly: structural biology has revealed details about enzymes and associated bioreactions at an atomic level; and synthetic biology has provided tools to design and assemble precisely controllable modules for re-programming cellular metabolic circuitry. However, because of different emphases, to date, these two areas have developed separately. A linkage between them is desirable to harness their concerted potential. We therefore propose structural synthetic biotechnology as a new field in biotechnology, specifically for application to the development of industrial microbial strains.
Affiliation(s)
- Zhen Chen
- Institute of Bioprocess and Biosystems Engineering, Hamburg University of Technology, Denickestrasse 15, D-21073 Hamburg, Germany
17
Green DF. A Statistical Framework for Hierarchical Methods in Molecular Simulation and Design. J Chem Theory Comput 2010; 6:1682-97. [PMID: 26615700 DOI: 10.1021/ct9004504] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
A statistical framework for performance analysis in hierarchical methods is described, with a focus on applications in molecular design. A theory is derived from statistical principles, describing the relationships between the results of each hierarchical level by a functional correlation and an error model for how values are distributed around the correlation curve. Two key measures are then defined for evaluating a hierarchical approach (completeness and excess cost), conceptually similar to the sensitivity and specificity of dichotomous prediction methods. We demonstrate the use of this method using a simple model problem in conformational search, refining the results of an in vacuo search of glucose conformations with a continuum solvent model. Second, we show the usefulness of this approach when structural hierarchies are used to efficiently make use of large rotamer libraries with the Dead-end Elimination and A* algorithms for protein design. The framework described is applicable not only to the specific examples given but to any problem in molecular simulation or design that involves a hierarchical approach.
Affiliation(s)
- David F Green
- Department of Applied Mathematics and Statistics and Graduate Program in Biochemistry and Structural Biology, Stony Brook University, Stony Brook, New York 11794-3600
18
De novo self-assembling collagen heterotrimers using explicit positive and negative design. Biochemistry 2010; 49:2307-16. [PMID: 20170197 DOI: 10.1021/bi902077d] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
We sought to computationally design model collagen peptides that specifically associate as heterotrimers. Computational design has been successfully applied to the creation of new protein folds and functions. Despite the high abundance of collagen and its key role in numerous biological processes, fibrous proteins have received little attention as computational design targets. Collagens are composed of three polypeptide chains that wind into triple helices. We developed a discrete computational model to design heterotrimer-forming collagen-like peptides. Stability and specificity of oligomerization were concurrently targeted using a combined positive and negative design approach. The sequences of three 30-residue peptides, A, B, and C, were optimized to favor charge-pair interactions in an ABC heterotrimer, while disfavoring the 26 competing oligomers (e.g., AAA, ABB, BCA). Peptides were synthesized and characterized for thermal stability and triple-helical structure by circular dichroism and NMR. A unique A:B:C-type species was not achieved. Negative design was partially successful, with only A + B and B + C competing mixtures formed. Analysis of computed versus experimental stabilities helps to clarify the role of electrostatics and secondary-structure propensities determining collagen stability and to provide important insight into how subsequent designs can be improved.
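The count of 26 competing oligomers follows from treating the three strand positions of the triple helix as ordered: three peptides in three positions give 27 states, of which one is the ABC target. The sketch below enumerates them and computes a simple specificity gap of the kind a combined positive and negative design would try to widen. The helper names and the energy values are hypothetical and do not come from the paper's scoring model.

```python
from itertools import product


def enumerate_states(chains="ABC", target="ABC"):
    """List every ordered three-strand combination and the competitors of the target."""
    states = ["".join(p) for p in product(chains, repeat=3)]
    competitors = [s for s in states if s != target]
    return states, competitors


def specificity_gap(energies, target="ABC"):
    """Energy of the most stable competitor minus the energy of the target.

    A positive gap means the target is more stable than every competing state.
    """
    best_competitor = min(e for s, e in energies.items() if s != target)
    return best_competitor - energies[target]


states, competitors = enumerate_states()

# Made-up energies: every competitor at 0.0, the target stabilised to -2.0.
toy_energies = {s: 0.0 for s in states}
toy_energies["ABC"] = -2.0
gap = specificity_gap(toy_energies)
```

Positive design lowers the target energy, while negative design raises the energies of the competitors; tracking the gap rather than the target energy alone captures both goals.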
19
Kamerlin SCL, Warshel A. The EVB as a quantitative tool for formulating simulations and analyzing biological and chemical reactions. Faraday Discuss 2010; 145:71-106. [PMID: 25285029 PMCID: PMC4184467 DOI: 10.1039/b907354j] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 11/21/2022]
Abstract
Recent years have seen dramatic improvements in computer power, allowing ever more challenging problems to be approached. In light of this, it is imperative to have a quantitative model for examining chemical reactivity, both in the condensed phase and in solution, as well as to accurately quantify physical organic chemistry (particularly as experimental approaches can often be inconclusive). Similarly, computational approaches allow for great progress in studying enzyme catalysis, as they allow for the separation of the relevant energy contributions to catalysis. Due to the complexity of the problems that need addressing, there is a need for an approach that can combine reliability with an ability to capture complex systems in order to resolve long-standing controversies in a unique way. Herein, we will demonstrate that the empirical valence bond (EVB) approach provides a powerful way to connect the classical concepts of physical organic chemistry to the actual energies of enzymatic reactions by means of computation. Additionally, we will discuss the proliferation of this approach, as well as attempts to capture its basic chemistry and repackage it under different names. We believe that the EVB approach is the most powerful tool that is currently available for studies of chemical processes in the condensed phase in general and enzymes in particular, particularly when trying to explore the different proposals about the origin of the catalytic power of enzymes.
Affiliation(s)
- Shina C. L. Kamerlin
- Department of Chemistry SGM418, University of Southern California, 3620 McClintock Ave., Los Angeles, CA-90089, USA
- Arieh Warshel
- Department of Chemistry SGM418, University of Southern California, 3620 McClintock Ave., Los Angeles, CA-90089, USA
20
Suárez M, Jaramillo A. Challenges in the computational design of proteins. J R Soc Interface 2009; 6 Suppl 4:S477-91. [PMID: 19324680 PMCID: PMC2843960 DOI: 10.1098/rsif.2008.0508.focus] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 11/30/2008] [Accepted: 02/04/2009] [Indexed: 11/12/2022] Open
Abstract
Protein design has many applications not only in biotechnology but also in basic science. It uses our current knowledge in structural biology to predict, by computer simulations, an amino acid sequence that would produce a protein with targeted properties. As in other examples of synthetic biology, this approach allows the testing of many hypotheses in biology. The recent development of automated computational methods to design proteins has enabled proteins to be designed that are very different from any known ones. Moreover, some of those methods mostly rely on a physical description of atomic interactions, which allows the designed sequences not to be biased towards known proteins. In this paper, we will describe the use of energy functions in computational protein design, the use of atomic models to evaluate the free energy in the unfolded and folded states, the exploration and optimization of amino acid sequences, the problem of negative design and the design of biomolecular function. We will also consider its use together with the experimental techniques such as directed evolution. We will end by discussing the challenges ahead in computational protein design and some of their future applications.
Affiliation(s)
- María Suárez
- Laboratoire de Biochimie, Ecole Polytechnique, CNRS, 91128 Palaiseau Cedex, France
- Epigenomics Project, Genopole, Université d'Evry Val d'Essonne-Genopole-CNRS, Tour Evry2, Etage 10, Terrasses de l'Agora, 91034 Evry Cedex, France
| | - Alfonso Jaramillo
- Laboratoire de Biochimie, Ecole Polytechnique, CNRS, 91128 Palaiseau Cedex, France
- Epigenomics Project, Genopole, Université d'Evry Val d'Essonne-Genopole-CNRS, Tour Evry2, Etage 10, Terrasses de l'Agora, 91034 Evry Cedex, France
| |
|
21
|
Suárez M, Tortosa P, Jaramillo A. PROTDES: CHARMM toolbox for computational protein design. SYSTEMS AND SYNTHETIC BIOLOGY 2009; 2:105-13. [PMID: 19572216 PMCID: PMC2735645 DOI: 10.1007/s11693-009-9026-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/22/2008] [Revised: 05/17/2009] [Accepted: 05/30/2009] [Indexed: 12/13/2022]
Abstract
We present open-source software that can automatically mutate any residue positions and find the best amino acids in an arbitrary protein structure without requiring pairwise approximations. Our software, PROTDES, is based on CHARMM and searches automatically for mutations that optimize a protein folding free energy. PROTDES allows the integration of molecular dynamics within protein design. We have implemented a heuristic optimization algorithm that iteratively searches for the best amino acids and their conformations for an arbitrary set of positions within a structure. Our software allows CHARMM users to perform protein design calculations and to create their own procedures for protein design using their own energy functions. We show this by implementing three different energy functions based on different solvent treatments: surface area accessibility, generalized Born using molecular volume and an effective energy function. PROTDES, a tutorial, parameter sets, configuration tools and examples are freely available at http://soft.synth-bio.org/protdes.html.
Affiliation(s)
- María Suárez
- Biochemistry Laboratory, CNRS—UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
- SYNTH-BIO group Epigenomics Project, Genopole Tour Evry2, etage 10, 523, Terrasses de l’Agora, 91034 Evry Cedex, France
| | - Pablo Tortosa
- Biochemistry Laboratory, CNRS—UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
| | - Alfonso Jaramillo
- Biochemistry Laboratory, CNRS—UMR 7654, Ecole Polytechnique, 91128 Palaiseau, France
- SYNTH-BIO group Epigenomics Project, Genopole Tour Evry2, etage 10, 523, Terrasses de l’Agora, 91034 Evry Cedex, France
| |
|
22
|
Moltó G, Suárez M, Tortosa P, Alonso JM, Hernández V, Jaramillo A. Protein Design Based on Parallel Dimensional Reduction. J Chem Inf Model 2009; 49:1261-71. [DOI: 10.1021/ci8004594] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Germán Moltó
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| | - María Suárez
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| | - Pablo Tortosa
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| | - José M. Alonso
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| | - Vicente Hernández
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| | - Alfonso Jaramillo
- Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 46022 Valencia, Spain, Epigenomics Project, Genopole-Université d'Évry Val d'Essonne-CNRS UPS 3201, 91034 Évry, France, and Laboratoire de Biochimie, École Polytechnique-CNRS UMR 7654, 91128, Palaiseau, France
| |
|
23
|
Sciretti D, Bruscolini P, Pelizzola A, Pretti M, Jaramillo A. Computational protein design with side-chain conformational entropy. Proteins 2009; 74:176-91. [PMID: 18618711 DOI: 10.1002/prot.22145] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
Abstract
Recent advances in modeling protein structures at the atomic level have made it possible to tackle "de novo" computational protein design. Most procedures are based on combinatorial optimization using a scoring function that estimates the folding free energy of a protein sequence on a given main-chain structure. However, the computation of the conformational entropy in the folded state is generally an intractable problem, and its contribution to the free energy is not properly evaluated. In this article, we propose a new automated protein design methodology that incorporates such conformational entropy based on statistical mechanics principles. We define the free energy of a protein sequence by the corresponding partition function over rotamer states. The free energy is written in variational form in a pairwise approximation and minimized using the Belief Propagation algorithm. In this way, a free energy is associated to each amino acid sequence: we use this insight to rescore the results obtained with a standard minimization method, with the energy as the cost function. Then, we set up a design method that directly uses the free energy as a cost function in combination with a stochastic search in the sequence space. We validate the methods on the design of three superficial sites of a small SH3 domain, and then apply them to the complete redesign of 27 proteins. Our results indicate that accounting for entropic contribution in the score function affects the outcome in a highly nontrivial way, and might improve current computational design techniques based on protein stability.
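The idea of defining a free energy through a partition function over rotamer states can be illustrated with a minimal sketch. It assumes, purely for illustration, that each site has its own independent list of rotamer energies (in kcal/mol), so the partition function factorizes over sites; the abstract's actual method uses a pairwise variational form minimized with Belief Propagation, which this toy example does not implement. The function names `site_free_energy` and `sequence_free_energy` are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def site_free_energy(rotamer_energies, temperature=298.15):
    """Return F = -kT ln(sum_r exp(-E_r / kT)) for one site.

    The minimum energy is factored out before exponentiating so that
    large energies do not underflow.
    """
    if not rotamer_energies:
        raise ValueError("need at least one rotamer energy")
    kt = K_B * temperature
    e_min = min(rotamer_energies)
    z_shifted = sum(math.exp(-(e - e_min) / kt) for e in rotamer_energies)
    return e_min - kt * math.log(z_shifted)


def sequence_free_energy(per_site_energies, temperature=298.15):
    """Sum site free energies, assuming non-interacting sites (toy model)."""
    return sum(site_free_energy(e, temperature) for e in per_site_energies)
```

With a single rotamer the free energy equals its energy; with several accessible rotamers the free energy drops below the minimum energy, which is the entropic contribution that a pure energy minimization ignores.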
Affiliation(s)
- Daniele Sciretti
- Departamento de Física Teórica, Universidad de Zaragoza, c. Pedro Cerbuna 12, Zaragoza 50009, Spain
| | | | | | | | | |
|
24
|
Tatsis VA, Stavrakoudis A, Demetropoulos IN. Molecular Dynamics as a pattern recognition tool: An automated process detects peptides that preserve the 3D arrangement of Trypsin's Active Site. Biophys Chem 2008; 133:36-44. [DOI: 10.1016/j.bpc.2007.11.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/24/2007] [Revised: 11/24/2007] [Accepted: 11/26/2007] [Indexed: 11/25/2022]
|
25
|
Bueno M, Camacho CJ. Acidic groups docked to well defined wetted pockets at the core of the binding interface: a tale of scoring and missing protein interactions in CAPRI. Proteins 2008; 69:786-92. [PMID: 17803211 DOI: 10.1002/prot.21722] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
Some challenging targets in CAPRI (T24/25 and T26) involve binding solvent accessible acidic residues at the core of the binding interface, where they are always found immersed in crystal waters. In fact, Asp and Glu residues are more likely to form part of the hydrogen bond network of their surrounding crystal water molecules than to form a buried salt bridge. Interestingly, many of the crystal waters mediating the intermolecular interactions of the acidic groups are already present in the unbound structure, reinforcing the notion that some water molecules behave as an extension of the protein structure. This is in contrast to acidic groups found in the periphery of the binding interface that form ubiquitous salt bridges that cement the high affinity complex, while at the same time they are exposed to rapidly exchanging water molecules. Because of this dichotomy, implicit solvent scoring functions fail to properly rank these complexes by prioritizing salt bridges rather than water mediated contacts. A detailed analysis of Target 24, for which our group predicted two out of the four successful homology model complex structures, and Target 26 reveals how crystal waters shape the binding cavities of acidic groups prior to binding, in agreement with the theory of anchor residues as mediators of protein recognition.
Affiliation(s)
- Marta Bueno
- Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA
| | | |
|
26
|
Schnieders MJ, Ponder JW. Polarizable Atomic Multipole Solutes in a Generalized Kirkwood Continuum. J Chem Theory Comput 2007; 3:2083-97. [PMID: 26636202 PMCID: PMC4767294 DOI: 10.1021/ct7001336] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Indexed: 11/28/2022]
Abstract
The generalized Born (GB) model of continuum electrostatics is an analytic approximation to the Poisson equation useful for predicting the electrostatic component of the solvation free energy for solutes ranging in size from small organic molecules to large macromolecular complexes. This work presents a new continuum electrostatics model based on Kirkwood's analytic result for the electrostatic component of the solvation free energy for a solute with arbitrary charge distribution. Unlike GB, which is limited to monopoles, our generalized Kirkwood (GK) model can treat solute electrostatics represented by any combination of permanent and induced atomic multipole moments of arbitrary degree. Here we apply the GK model to the newly developed Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field, which includes permanent atomic multipoles through the quadrupole and treats polarization via induced dipoles. A derivation of the GK gradient is presented, which enables energy minimization or molecular dynamics of an AMOEBA solute within a GK continuum. For a series of 55 proteins, GK electrostatic solvation free energies are compared to the Polarizable Multipole Poisson-Boltzmann (PMPB) model and yield a mean unsigned relative difference of 0.9%. Additionally, the reaction field of GK compares well to that of the PMPB model, as shown by a mean unsigned relative difference of 2.7% in predicting the total solvated dipole moment for each protein in this test set. The CPU time needed for GK relative to vacuum AMOEBA calculations is approximately a factor of 3, making it suitable for applications that require significant sampling of configuration space.
Affiliation(s)
- Michael J. Schnieders
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
| | - Jay W. Ponder
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110
| |
|
27
|
Lippow SM, Tidor B. Progress in computational protein design. Curr Opin Biotechnol 2007; 18:305-11. [PMID: 17644370 PMCID: PMC3495006 DOI: 10.1016/j.copbio.2007.04.009] [Citation(s) in RCA: 161] [Impact Index Per Article: 9.5] [Received: 04/02/2007] [Accepted: 04/17/2007] [Indexed: 11/25/2022]
Abstract
Current progress in computational structure-based protein design is reviewed in the areas of methodology and applications. Foundational advances include new potential functions, more efficient ways of computing energetics, flexible treatments of solvent, and useful energy function approximations, as well as ensemble-based approaches to scoring designs for inclusion of entropic effects, improvements to guaranteed and to stochastic search techniques, and methods to design combinatorial libraries for screening and selection. Applications include new approaches and successes in the design of specificity for protein folding, binding, and catalysis, in the redesign of proteins for enhanced binding affinity, and in the application of design technology to study and alter enzyme catalysis. Computational protein design continues to mature and advance.
Affiliation(s)
- Shaun M Lippow
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
| | | |
|
28
|
Ozkan SB, Wu GA, Chodera JD, Dill KA. Protein folding by zipping and assembly. Proc Natl Acad Sci U S A 2007; 104:11987-92. [PMID: 17620603 PMCID: PMC1924571 DOI: 10.1073/pnas.0703700104] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.0] [Received: 04/13/2006] [Indexed: 11/18/2022]
Abstract
How do proteins fold so quickly? Some denatured proteins fold to their native structures in only microseconds, on average, implying that there is a folding "mechanism," i.e., a particular set of events by which the protein short-circuits a broader conformational search. Predicting protein structures using atomically detailed physical models is currently challenging. The most definitive proof of a putative folding mechanism would be whether it speeds up protein structure prediction in physical models. In the zipping and assembly (ZA) mechanism, local structuring happens first at independent sites along the chain, then those structures either grow (zip) or coalesce (assemble) with other structures. Here, we apply the ZA search mechanism to protein native structure prediction by using the AMBER96 force field with a generalized Born/surface area implicit solvent model and sampling by replica exchange molecular dynamics. Starting from open denatured conformations, our algorithm, called the ZA method, converges to an average of 2.2 Å from the Protein Data Bank native structures of eight of nine proteins that we tested, which ranged from 25 to 73 aa in length. In addition, experimental Phi values, where available on these proteins, are consistent with the predicted routes. We conclude that ZA is a viable model for how proteins physically fold. The present work also shows that physics-based force fields are quite good and that physics-based protein structure prediction may be practical, at least for some small proteins.
Affiliation(s)
| | | | - John D. Chodera
- Graduate Group in Biophysics, University of California, San Francisco, CA 94143
| | | |
|
29
|
Lopes A, Alexandrov A, Bathelt C, Archontis G, Simonson T. Computational sidechain placement and protein mutagenesis with implicit solvent models. Proteins 2007; 67:853-67. [PMID: 17348031 DOI: 10.1002/prot.21379] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 11/10/2022]
Abstract
Structure prediction and computational protein design should benefit from accurate solvent models. We have applied implicit solvent models to two problems that are central to this area. First, we performed sidechain placement for 29 proteins, using a solvent model that combines a screened Coulomb term with an Accessible Surface Area term (CASA model). With optimized parameters, the prediction quality is comparable with earlier work that omitted electrostatics and solvation altogether. Second, we computed the stability changes associated with point mutations involving ionized sidechains. For over 1000 mutations, including many fully or partly buried positions, we compared CASA and two generalized Born models (GB) with a more accurate model, which solves the Poisson equation of continuum electrostatics numerically. CASA predicts the correct sign and order of magnitude of the stability change for 81% of the mutations, compared to 97% with the best GB. We also considered 140 mutations for which experimental data are available. Comparing to experiment requires additional assumptions about the unfolded protein structure, protein relaxation in response to the mutations, and contributions from the hydrophobic effect. With a simple, commonly-used unfolded state model, the mean unsigned error is 2.1 kcal/mol with both CASA and the best GB. Overall, the electrostatic model is not important for sidechain placement; CASA and GB are equivalent for surface mutations, while GB is far superior for fully or partly buried positions. Thus, for problems like protein design that involve all these aspects, the most recent GB models represent an important step forward. Along with the recent discovery of efficient, pairwise implementations of GB, this will open new possibilities for the computational engineering of proteins.
Affiliation(s)
- Anne Lopes
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France
| | | | | | | | | |
|
30
|
Leaver-Fay A, Butterfoss GL, Snoeyink J, Kuhlman B. Maintaining solvent accessible surface area under rotamer substitution for protein design. J Comput Chem 2007; 28:1336-41. [PMID: 17285560 DOI: 10.1002/jcc.20626] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
Abstract
Although quantities derived from solvent accessible surface areas (SASA) are useful in many applications in protein design and structural biology, the computational cost of accurate SASA calculation makes SASA-based scores difficult to integrate into commonly used protein design methodologies. We demonstrate a method for maintaining accurate SASA during a Monte Carlo search of sequence and rotamer space for a fixed protein backbone. We extend the fast Le Grand and Merz algorithm (Le Grand and Merz, J Comput Chem, 14, 349), which discretizes the solvent accessible surface for each atom by placing dots on a sphere and combines Boolean masks to determine which dots are exposed. By replacing semigroup operations with group operations (from Boolean logic to counting dot coverage) we support SASA updates. Our algorithm takes time proportional to the number of atoms affected by rotamer substitution, rather than the number of atoms in the protein. For design simulations with a one hundred residue protein our approach is approximately 145 times faster than performing a Le Grand and Merz SASA calculation from scratch following each rotamer substitution. To demonstrate practical effectiveness, we optimize a SASA-based measure of protein packing in the complete redesign of a large set of proteins and protein-protein interfaces.
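The switch from Boolean masks to dot-coverage counts can be sketched as follows. This is an illustrative toy, not the authors' implementation: the class name `DotCoverage` and the representation of a neighbor's mask as a set of dot indices are assumptions. The point it shows is that a Boolean OR cannot be undone when a neighbor moves away, whereas a per-dot counter can simply be decremented, so only the affected atoms need updating.

```python
class DotCoverage:
    """Track how many neighboring atoms cover each surface dot of one atom."""

    def __init__(self, n_dots):
        # counts[i] is the number of neighbors currently covering dot i.
        self.counts = [0] * n_dots

    def add_mask(self, covered_dots):
        """A neighbor arrives: increment every dot it covers."""
        for i in covered_dots:
            self.counts[i] += 1

    def remove_mask(self, covered_dots):
        """A neighbor leaves (e.g. after a rotamer substitution): decrement."""
        for i in covered_dots:
            if self.counts[i] == 0:
                raise ValueError(f"dot {i} was not covered by this mask")
            self.counts[i] -= 1

    def exposed_count(self):
        """Number of dots covered by no neighbor at all."""
        return sum(1 for c in self.counts if c == 0)


# Usage: two overlapping neighbors, then one is substituted away.
atom = DotCoverage(8)
atom.add_mask({0, 1, 2})
atom.add_mask({2, 3})
exposed_before = atom.exposed_count()
atom.remove_mask({0, 1, 2})
exposed_after = atom.exposed_count()
```

Multiplying the exposed-dot count by the area each dot represents would give the atom's accessible area; that scaling is omitted here.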
Affiliation(s)
- Andrew Leaver-Fay
- Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina 27599, USA
| | | | | | | |
|
31
|
Bueno M, Camacho CJ, Sancho J. SIMPLE estimate of the free energy change due to aliphatic mutations: Superior predictions based on first principles. Proteins 2007; 68:850-62. [PMID: 17523191 DOI: 10.1002/prot.21453] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
The bioinformatics revolution of the last decade has been instrumental in the development of empirical potentials to quantitatively estimate protein interactions for modeling and design. Although computationally efficient, these potentials hide most of the relevant thermodynamics in 5-to-40 parameters that are fitted against a large experimental database. Here, we revisit this longstanding problem and show that a careful consideration of the change in hydrophobicity, electrostatics, and configurational entropy between the folded and unfolded state of aliphatic point mutations predicts 20-30% fewer false positives and yields more accurate predictions than any published empirical energy function. This significant improvement is achieved with essentially no free parameters, validating past theoretical and experimental efforts to understand the thermodynamics of protein folding. Our first-principles analysis strongly suggests that both the solute-solute van der Waals interactions in the folded state and the electrostatics free energy change of exposed aliphatic mutations are almost completely compensated by similar interactions operating in the unfolded ensemble. Not surprisingly, the problem of properly accounting for the solvent contribution to the free energy of polar and charged group mutations, as well as of mutations that disrupt the protein backbone, remains open.
Affiliation(s)
- Marta Bueno
- Department of Computational Biology, University of Pittsburgh, Pennsylvania, USA
| | | | | |
|
32
|
Relating destabilizing regions to known functional sites in proteins. BMC Bioinformatics 2007; 8:141. [PMID: 17470296 PMCID: PMC1890302 DOI: 10.1186/1471-2105-8-141] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 11/02/2006] [Accepted: 04/30/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Most methods for predicting functional sites in protein 3D structures rely on information on related proteins and cannot be applied to proteins with no known relatives. Another limitation of these methods is the lack of a well annotated set of functional sites to use as a benchmark for validating their predictions. Experimental findings and theoretical considerations suggest that residues involved in function often contribute unfavorably to the native state stability. We examine the possibility of systematically exploiting this intrinsic property to identify functional sites using an original procedure that detects destabilizing regions in protein structures. In addition, to relate destabilizing regions to known functional sites, a novel benchmark consisting of a diverse set of hand-curated protein functional sites is derived. RESULTS: A procedure for detecting clusters of destabilizing residues in protein structures is presented. Individual residue contributions to protein stability are evaluated using detailed atomic models and a force-field successfully applied in computational protein design. The most destabilizing residues, and some of their closest neighbours, are clustered into destabilizing regions following a rigorous protocol. Our procedure is applied to high quality apo-structures of 63 unrelated proteins. The biologically relevant binding sites of these proteins were annotated using all available information, including structural data and literature curation, resulting in the largest hand-curated dataset of binding sites in proteins available to date. Comparing the destabilizing regions with the annotated binding sites in these proteins, we find that the overlap is on average limited, but significantly better than random. Results depend on the type of bound ligand. Significant overlap is obtained for most polysaccharide- and small ligand-binding sites, whereas no overlap is observed for most nucleic acid binding sites. These differences are rationalised in terms of the geometry and energetics of the binding site. CONCLUSION: We find that although destabilizing regions as detected here cannot in general be used to predict binding sites in protein structures, they can provide useful information, particularly on the location of functional sites that bind polysaccharides and small ligands. This information can be exploited in methods for predicting function in protein structures with no known relatives. Our publicly available benchmark of hand-curated functional sites in proteins should help other workers derive and validate new prediction methods.
|
33
|
Rakhmanov SV, Makeev VJ. Atomic hydration potentials using a Monte Carlo Reference State (MCRS) for protein solvation modeling. BMC STRUCTURAL BIOLOGY 2007; 7:19. [PMID: 17397537 PMCID: PMC1852318 DOI: 10.1186/1472-6807-7-19] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 09/15/2006] [Accepted: 03/30/2007] [Indexed: 11/10/2022]
Abstract
Background: Accurate description of protein interaction with aqueous solvent is crucial for modeling of protein folding, protein-protein interaction, and drug design. Efforts to build a working description of solvation, both by continuous models and by molecular dynamics, yield controversial results. Specifically constructed knowledge-based potentials appear to be promising for accounting for the solvation at the molecular level, yet have not been used for this purpose. Results: We developed original knowledge-based potentials to study protein hydration at the level of atom contacts. The potentials were obtained using a new Monte Carlo reference state (MCRS), which simulates the expected probability density of atom-atom contacts via exhaustive sampling of structure space with random probes. Using the MCRS allowed us to calculate the expected atom contact densities with high resolution over a broad distance range including very short distances. Knowledge-based potentials for hydration of protein atoms of different types were obtained based on frequencies of their contacts at different distances with protein-bound water molecules, in a non-redundant training database of 1776 proteins with known 3D structures. Protein hydration sites were predicted in a test set of 12 proteins with experimentally determined water locations. The MCRS greatly improves prediction of water locations over existing methods. In addition, the contribution of the energy of macromolecular solvation to the total folding free energy was estimated, and tested in fold recognition experiments. The correct folds were preferred over all the misfolded decoys for the majority of proteins from the improved Rosetta decoy set based on the structure hydration energy alone. Conclusion: MCRS atomic hydration potentials provide a detailed distance-dependent description of hydropathies of individual protein atoms. This allows placement of water molecules on the surface of proteins and in protein interfaces with much higher precision. The potentials provide a means to estimate the total solvation energy for a protein structure, in many cases achieving a successful fold recognition. Possible applications of atomic hydration potentials to structure verification, protein folding and stability, and protein-protein interactions are discussed.
Affiliation(s)
- Sergei V Rakhmanov
- Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
| | - Vsevolod J Makeev
- Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova str. 32, Moscow, Russia
| |
|
34
|
Schnieders MJ, Baker NA, Ren P, Ponder JW. Polarizable atomic multipole solutes in a Poisson-Boltzmann continuum. J Chem Phys 2007; 126:124114. [PMID: 17411115 PMCID: PMC2430168 DOI: 10.1063/1.2714528] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022]
Abstract
Modeling the change in the electrostatics of organic molecules upon moving from vacuum into solvent, due to polarization, has long been an interesting problem. In vacuum, experimental values for the dipole moments and polarizabilities of small, rigid molecules are known to high accuracy; however, it has generally been difficult to determine these quantities for a polar molecule in water. A theoretical approach introduced by Onsager [J. Am. Chem. Soc. 58, 1486 (1936)] used vacuum properties of small molecules, including polarizability, dipole moment, and size, to predict experimentally known permittivities of neat liquids via the Poisson equation. Since this important advance in understanding the condensed phase, a large number of computational methods have been developed to study solutes embedded in a continuum via numerical solutions to the Poisson-Boltzmann equation. Only recently have the classical force fields used for studying biomolecules begun to include explicit polarization in their functional forms. Here the authors describe the theory underlying a newly developed polarizable multipole Poisson-Boltzmann (PMPB) continuum electrostatics model, which builds on the atomic multipole optimized energetics for biomolecular applications (AMOEBA) force field. As an application of the PMPB methodology, results are presented for several small folded proteins studied by molecular dynamics in explicit water as well as embedded in the PMPB continuum. The dipole moment of each protein increased on average by a factor of 1.27 in explicit AMOEBA water and 1.26 in continuum solvent. The essentially identical electrostatic response in both models suggests that PMPB electrostatics offers an efficient alternative to sampling explicit solvent molecules for a variety of interesting applications, including binding energies, conformational analysis, and pK(a) prediction. 
Introduction of 150 mM salt lowered the electrostatic solvation energy by between 2 and 13 kcal/mole, depending on the formal charge of the protein, but had only a small influence on dipole moments.
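For orientation on the 150 mM salt condition mentioned above, the following is a minimal sketch of the Debye screening length that enters the linearized Poisson-Boltzmann equation. It is not the PMPB model itself. The 1:1 electrolyte, the temperature of 298.15 K, the relative permittivity of 78.5, and the function name `debye_length_angstrom` are all assumptions made for illustration; the abstract states none of them.

```python
import math

# Physical constants in SI units (CODATA 2018 values).
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_AVOGADRO = 6.02214076e23   # Avogadro constant, 1/mol
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K


def debye_length_angstrom(ionic_strength_molar, temperature_k=298.15, eps_r=78.5):
    """Debye length in angstroms for a given ionic strength in mol/L.

    Assumed defaults: water-like permittivity (eps_r) and room temperature.
    For a 1:1 salt the ionic strength equals the salt concentration.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = (2.0 * N_AVOGADRO * E_CHARGE ** 2 * ionic_strength_si
                / (EPS_0 * eps_r * K_BOLTZMANN * temperature_k))
    return 1e10 / math.sqrt(kappa_sq)  # metres -> angstroms


length_150mm = debye_length_angstrom(0.150)
print(f"Debye length at 150 mM: {length_150mm:.1f} angstroms")
```

Under these assumptions the screening length comes out a little under 8 Å, so at this salt concentration charge-charge interactions are damped over distances comparable to the size of a small protein.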
Affiliation(s)
- Michael J. Schnieders
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
- Nathan A. Baker
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110
- Pengyu Ren
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, TX 78712
- Jay W. Ponder
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110
35
Vieceli J, Müllegger J, Tehrani A. Computer-assisted design of industrial enzymes: The resurgence of rational design and in silico mutagenesis. Ind Biotechnol (New Rochelle N Y) 2006. [DOI: 10.1089/ind.2006.2.303] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Affiliation(s)
- Ali Tehrani
- Zymeworks Incorporated, 201–1401 West Broadway Avenue, Vancouver, British Columbia V6H 1H6, Canada
36
Huang A, Stultz CM. Conformational sampling with implicit solvent models: application to the PHF6 peptide in tau protein. Biophys J 2006; 92:34-45. [PMID: 17040986 PMCID: PMC1697846 DOI: 10.1529/biophysj.106.091207] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022]
Abstract
Implicit solvent models approximate the effects of solvent through a potential of mean force and therefore make solvated simulations computationally efficient. Yet despite their computational efficiency, the inherent approximations made by implicit solvent models can sometimes lead to inaccurate results. To test the accuracy of a number of popular implicit solvent models, we determined whether implicit solvent simulations can reproduce the set of potential energy minima obtained from explicit solvent simulations. For these studies, we focus on a six-residue amino-acid sequence, referred to as the paired helical filament 6 (PHF6), which may play an important role in the formation of intracellular aggregates in patients with Alzheimer's disease. Several implicit solvent models form the basis of this work: two based on the generalized Born formalism, and one based on a Gaussian solvent-exclusion model. All three implicit solvent models generate minima that are in good agreement with minima obtained from simulations with explicit solvent. Moreover, free-energy profiles generated with each implicit solvent model agree with free-energy profiles obtained with explicit solvent. For the Gaussian solvent-exclusion model, we demonstrate that a straightforward ranking of the relative stability of each minimum suggests that the most stable structure is extended, a result in excellent agreement with the free-energy profiles. Overall, our data demonstrate that for some peptides like PHF6, implicit solvent can accurately reproduce the set of local energy minima arising from quenched dynamics simulations with explicit solvent. More importantly, all solvent models predict that PHF6 forms extended beta-structures in solution, a finding consistent with the notion that PHF6 initiates neurofibrillary tangle formation in patients with Alzheimer's disease.
Affiliation(s)
- Austin Huang
- Harvard-MIT Division of Health Science and Technology, MIT Department of Electrical Engineering and Computer Science, Cambridge, Massachusetts, USA
37
Lahiri A, Sarzynska J, Nilsson L, Kulinski T. Molecular dynamics simulation of the preferred conformations of 2-thiouridine in aqueous solution. Theor Chem Acc 2006. [DOI: 10.1007/s00214-006-0141-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
38
Méndez R, Leplae R, Lensink MF, Wodak SJ. Assessment of CAPRI predictions in rounds 3-5 shows progress in docking procedures. Proteins 2006; 60:150-69. [PMID: 15981261 DOI: 10.1002/prot.20551] [Citation(s) in RCA: 269] [Impact Index Per Article: 14.9] [Indexed: 11/11/2022]
Abstract
The current status of docking procedures for predicting protein-protein interactions starting from their three-dimensional (3D) structure is reassessed by evaluating blind predictions, performed during 2003-2004 as part of Rounds 3-5 of the community-wide experiment on Critical Assessment of PRedicted Interactions (CAPRI). Ten newly determined structures of protein-protein complexes were used as targets for these rounds. They comprised 2 enzyme-inhibitor complexes, 2 antigen-antibody complexes, 2 complexes involved in cellular signaling, 2 homo-oligomers, and a complex between 2 components of the bacterial cellulosome. For most targets, the predictors were given the experimental structures of 1 unbound and 1 bound component, with the latter in a random orientation. For some, the structure of the free component was derived from that of a related protein, requiring the use of homology modeling. In some of the targets, significant differences in conformation were displayed between the bound and unbound components, representing a major challenge for the docking procedures. For 1 target, predictions could not go to completion. In total, 1866 predictions submitted by 30 groups were evaluated. Over one-third of these groups applied completely novel docking algorithms and scoring functions, with several of them specifically addressing the challenge of dealing with side-chain and backbone flexibility. The quality of the predicted interactions was evaluated by comparison to the experimental structures of the targets, made available for the evaluation, using the well-agreed-upon criteria used previously. Twenty-four groups, which for the first time included an automatic Web server, produced predictions ranking from acceptable to highly accurate for all targets, including those where the structures of the bound and unbound forms differed substantially. 
These results and a brief survey of the methods used by participants of CAPRI Rounds 3-5 suggest that genuine progress in the performance of docking methods is being achieved, with CAPRI acting as the catalyst.
Affiliation(s)
- Raúl Méndez
- Service de Conformation de Macromolécules Biologiques et Bioinformatique, Centre de Biologie Structurale et Bioinformatique, Université Libre de Bruxelles, Bruxelles, Belgium
39
Affiliation(s)
- Ninad Prabhu
- Johnson Research Foundation, Dept. of Biochemistry and Biophysics, University of Pennsylvania
- Kim Sharp
- Johnson Research Foundation, Dept. of Biochemistry and Biophysics, University of Pennsylvania
40
Chen J, Im W, Brooks CL. Balancing solvation and intramolecular interactions: toward a consistent generalized Born force field. J Am Chem Soc 2006; 128:3728-36. [PMID: 16536547 PMCID: PMC2596729 DOI: 10.1021/ja057216r] [Citation(s) in RCA: 284] [Impact Index Per Article: 15.8] [Indexed: 11/28/2022]
Abstract
The efficient and accurate characterization of solvent effects is a key element in the theoretical and computational study of biological problems. Implicit solvent models, particularly generalized Born (GB) continuum electrostatics, have emerged as an attractive tool to study the structure and dynamics of biomolecules in various environments. Despite recent advances in this methodology, there remain limitations in the parametrization of many of these models. In the present work, we demonstrate that it is possible to achieve a balanced implicit solvent force field by further optimizing the input atomic radii in combination with adjusting the protein backbone torsional energetics. This parameter optimization is guided by the potentials of mean force (PMFs) between amino acid polar groups, calculated from explicit solvent free energy simulations, and by conformational equilibria of short peptides, obtained from extensive folding and unfolding replica exchange molecular dynamics (REX-MD) simulations. Through the application of this protocol, the delicate balance between the competing solvation forces and intramolecular forces appears to be better captured, and correct conformational equilibria for a range of both helical and beta-hairpin peptides are obtained. The same optimized force field also successfully folds both beta-hairpin trpzip2 and mini-protein Trp-Cage, indicating that it is quite robust. Such a balanced, physics-based force field will be highly applicable to a range of biological problems including protein folding and protein structural dynamics.
Affiliation(s)
- Jianhan Chen
- Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA
41
Archontis G, Simonson T. A Residue-Pairwise Generalized Born Scheme Suitable for Protein Design Calculations. J Phys Chem B 2005; 109:22667-73. [PMID: 16853951 DOI: 10.1021/jp055282+] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
We describe an efficient generalized Born (GB) approximation for proteins, in which the interaction energy between two amino acids depends on the whole protein structure, but can be accurately computed from residue-pairwise information. Two results make the scheme pairwise. First, an accurate expression exists for the interaction energy between two residues R and R' that depends on the product B = B_R B_R' of their residue Born solvation radii. Second, this expression is accurately fitted by a parabolic function of B; the (three) fitting coefficients depend only on the pair RR', not on its environment. In effect, the quantity B captures all the information that is relevant about the pair's dielectric environment. The method is tested with calculations on several hundred structures of the proteins trpcage, BPTI, ubiquitin, and thioredoxin. It yields solvation energies in better agreement with Poisson calculations than a traditional GB formulation. We also compute the effect of the protein/solvent environment on the interactions between pairs of charged residues in the active site of the enzyme aspartyl-tRNA synthetase. Our method captures this effect as accurately as traditional GB. Because it is residue-pairwise, the method can be incorporated into efficient protocols for rotamer placement and computational protein design.
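As an illustration of why a product of Born radii is a natural variable, the sketch below uses the widely cited atom-pair GB interpolation of Still and co-workers, in which the two radii enter only through their product, and then passes a parabola in B through three sampled energies. This is not the residue-level expression or the fitting procedure of the paper. The dielectric constants, the charges, the 6 Å separation, the sampled B values, and the helper names `f_gb`, `gb_pair_energy`, and `parabola_through` are all hypothetical choices made for the example.

```python
import math

COULOMB_KCAL = 332.06  # approximate Coulomb constant, kcal*angstrom/(mol*e^2)


def f_gb(r, b_product):
    """Still-style GB distance function; radii enter only via b_product = B_i * B_j."""
    return math.sqrt(r * r + b_product * math.exp(-r * r / (4.0 * b_product)))


def gb_pair_energy(q_i, q_j, r, b_product, eps_in=1.0, eps_out=80.0):
    """GB solvent-screening term for one pair of distinct charges, in kcal/mol."""
    tau = 1.0 / eps_in - 1.0 / eps_out
    return -COULOMB_KCAL * tau * q_i * q_j / f_gb(r, b_product)


def parabola_through(points):
    """Return the quadratic (Lagrange form) through three (B, energy) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def poly(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return poly


# Hypothetical +1/-1 charge pair at 6 angstroms; B sampled in angstrom^2.
r = 6.0
samples = [(b, gb_pair_energy(1.0, -1.0, r, b)) for b in (4.0, 10.0, 16.0)]
fit = parabola_through(samples)

b_test = 7.0
exact = gb_pair_energy(1.0, -1.0, r, b_test)
print(f"exact {exact:.2f} kcal/mol, parabola {fit(b_test):.2f} kcal/mol")
```

For opposite charges the screening term is positive, since solvent weakens the direct Coulomb attraction. The example only shows that a quadratic in B can track this particular function over a modest range; it makes no claim about the accuracy reported in the paper.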
Affiliation(s)
- Georgios Archontis
- Department of Physics, University of Cyprus, PO20537, CY1678, Nicosia, Cyprus
42
Liu Z, Chan HS. Solvation and desolvation effects in protein folding: native flexibility, kinetic cooperativity and enthalpic barriers under isostability conditions. Phys Biol 2005; 2:S75-85. [PMID: 16280624 DOI: 10.1088/1478-3975/2/4/s01] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]
Abstract
As different parts of a protein chain approach one another during folding, they are expected to encounter desolvation barriers before optimal packing is achieved. This impediment originates from the water molecule's finite size, which entails a net energetic cost for water exclusion when the formation of compensating close intraprotein contacts is not yet complete. Based on recent advances, we extend our exploration of these microscopic elementary desolvation barriers' roles in the emergence of generic properties of protein folding. Using continuum Gō-like C(alpha) chain models of chymotrypsin inhibitor 2 (CI2) and barnase as examples, we underscore that elementary desolvation barriers between a protein's constituent groups can significantly reduce native conformational fluctuations relative to model predictions that neglected these barriers. An increasing height of elementary desolvation barriers leads to thermodynamically more cooperative folding/unfolding transitions (i.e., higher overall empirical folding barriers) and higher degrees of kinetic cooperativity as manifested by more linear rate-stability relationships under constant temperature. Applying a spatially non-uniform thermodynamic parametrization we recently introduced for the pairwise C(alpha) potentials of mean force, the present barnase model further illustrates that desolvation is a probable physical underpinning for the experimentally observed high intrinsic enthalpic folding barrier under isostability conditions.
Affiliation(s)
- Zhirong Liu
- Department of Biochemistry, and Department of Medical Genetics & Microbiology, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada
43
Vizcarra CL, Mayo SL. Electrostatics in computational protein design. Curr Opin Chem Biol 2005; 9:622-6. [PMID: 16257567 DOI: 10.1016/j.cbpa.2005.10.014] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 09/02/2005] [Accepted: 10/11/2005] [Indexed: 11/18/2022]
Abstract
Catalytic activity and protein-protein recognition have proven to be significant challenges for computational protein design. Electrostatic interactions are crucial for these and other protein functions, and therefore accurate modeling of electrostatics is necessary for successfully advancing protein design into the realm of protein function. This review focuses on recent progress in modeling electrostatic interactions in computational protein design, with particular emphasis on continuum models.
Affiliation(s)
- Christina L Vizcarra
- Division of Chemistry and Chemical Engineering, Division of Biology and Howard Hughes Medical Institute, California Institute of Technology, Pasadena, California 91125, USA
44
Im W, Chen J, Brooks CL. Peptide and protein folding and conformational equilibria: theoretical treatment of electrostatics and hydrogen bonding with implicit solvent models. Adv Protein Chem 2005; 72:173-98. [PMID: 16581377 DOI: 10.1016/s0065-3233(05)72007-6] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Indexed: 01/26/2023]
Abstract
Since biomolecules exist in aqueous and membrane environments, the accurate modeling of solvation, and hydrogen bonding interactions in particular, is essential for the exploration of structure and function in theoretical and computational studies. In this chapter, we focus on alternatives to explicit solvent models and discuss recent advances in generalized Born (GB) implicit solvent theories. We present a brief review of the successes and shortcomings of the application of these theories to biomolecular problems that are strongly linked to backbone H-bonding and electrostatics. This discussion naturally leads us to explore existing areas for improvement in current GB theories and our approach towards addressing a number of the key issues that remain in the refinement of these models. Specifically, the critical importance of balancing solvation forces and intramolecular forces in GB models is illustrated by examining the influence of backbone hydrogen bond strength and backbone dihedral energetics on conformational equilibria of small peptides.
Affiliation(s)
- Wonpil Im
- Department of Molecular Biology and Center for Theoretical Biological Physics, The Scripps Research Institute, La Jolla, California 92037