1
|
Kim DN, Jacobs TM, Kuhlman B. Boosting protein stability with the computational design of β-sheet surfaces. Protein Sci 2016; 25:702-10. [PMID: 26701383 DOI: 10.1002/pro.2869] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/20/2015] [Revised: 12/18/2015] [Accepted: 12/21/2015] [Indexed: 11/09/2022]
Abstract
β-sheets often have one face packed against the core of the protein and the other facing solvent. Mutational studies have indicated that the solvent-facing residues can contribute significantly to protein stability, and that the preferred amino acid at each sequence position is dependent on the precise structure of the protein backbone and the identity of the neighboring amino acids. This suggests that the most advantageous methods for designing β-sheet surfaces will be approaches that take into account the multiple energetic factors at play including side chain rotamer preferences, van der Waals forces, electrostatics, and desolvation effects. Here, we show that the protein design software Rosetta, which models these energetic factors, can be used to dramatically increase protein stability by optimizing interactions on the surfaces of small β-sheet proteins. Two design variants of the β-sandwich protein from tenascin were made with 7 and 14 mutations respectively on its β-sheet surfaces. These changes raised the thermal midpoint for unfolding from 45°C to 64°C and 74°C. Additionally, we tested an empirical approach based on increasing the number of potential salt bridges on the surfaces of the β-sheets. This was not a robust strategy for increasing stability, as three of the four variants tested were unfolded.
Affiliation(s)
- Doo Nam Kim
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Timothy M Jacobs
- Program in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
2
|
Wannier TM, Moore MM, Mou Y, Mayo SL. Computational Design of the β-Sheet Surface of a Red Fluorescent Protein Allows Control of Protein Oligomerization. PLoS One 2015; 10:e0130582. [PMID: 26075618 PMCID: PMC4468108 DOI: 10.1371/journal.pone.0130582] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/25/2015] [Accepted: 05/21/2015] [Indexed: 01/28/2023]
Abstract
Computational design has been used with mixed success for the design of protein surfaces, with directed evolution heretofore providing better practical solutions than explicit design. Directed evolution, however, requires a tractable high-throughput screen because the random nature of mutation does not enrich for desired traits. Here we demonstrate the successful design of the β-sheet surface of a red fluorescent protein (RFP), enabling control over its oligomerization. To isolate the problem of surface design, we created a hybrid RFP from DsRed and mCherry with a stabilized protein core that allows for monomerization without loss of fluorescence. We designed an explicit library for which 93 of 96 (97%) of the protein variants are soluble, stably fluorescent, and monomeric. RFPs are heavily used in biology, but are natively tetrameric, and creating RFP monomers has proven extremely difficult. We show that surface design and core engineering are separate problems in RFP development and that the next generation of RFP markers will depend on improved methods for core design.
Affiliation(s)
- Timothy M. Wannier
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Matthew M. Moore
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
- Yun Mou
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
- Stephen L. Mayo
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, United States of America
|
3
|
Evaluating and optimizing computational protein design force fields using fixed composition-based negative design. Proc Natl Acad Sci U S A 2008; 105:12242-7. [PMID: 18708527 DOI: 10.1073/pnas.0805858105] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
An accurate force field is essential to computational protein design and protein fold prediction studies. Proper force field tuning is problematic, however, due in part to the incomplete modeling of the unfolded state. Here, we evaluate and optimize a protein design force field by constraining the amino acid composition of the designed sequences to that of a well behaved model protein. According to the random energy model, unfolded state energies are dependent only on amino acid composition and not the specific arrangement of amino acids. Therefore, energy discrepancies between computational predictions and experimental results, for sequences of identical composition, can be directly attributed to flaws in the force field's ability to properly account for folded state sequence energies. This aspect of fixed composition design allows for force field optimization by focusing solely on the interactions in the folded state. Several rounds of fixed composition optimization of the 56-residue beta1 domain of protein G yielded force field parameters with significantly greater predictive power: Optimized sequences exhibited higher wild-type sequence identity in critical regions of the structure, and the wild-type sequence showed an improved Z-score. Experimental studies revealed a designed 24-fold mutant to be stably folded with a melting temperature similar to that of the wild-type protein. Sequence designs using engrailed homeodomain as a scaffold produced similar results, suggesting the tuned force field parameters were not specific to protein G.
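The Z-score mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact definition used, so the conventional form is assumed here: the energy of a reference sequence measured against the mean and standard deviation of a set of alternative sequence energies. The function name and the example energies are hypothetical.

```python
from statistics import mean, pstdev

def z_score(reference_energy, alternative_energies):
    """Return (E_ref - mean) / std over a set of alternative energies.

    A more negative value means the reference sequence stands out more
    strongly as low in energy (assumed convention, not taken from the paper).
    """
    mu = mean(alternative_energies)
    sigma = pstdev(alternative_energies)  # population standard deviation
    if sigma == 0:
        raise ValueError("alternative energies must not all be identical")
    return (reference_energy - mu) / sigma

# Hypothetical energies in arbitrary units.
z = z_score(-10.0, [-2.0, 0.0, 2.0])
```

Under this convention, a force field change that makes the wild-type Z-score more negative corresponds to the "improved Z-score" reported in the abstract.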
|
4
|
Ogata K, Soejima K, Higo J. A Monte Carlo sampling method of amino acid sequences adaptable to given main-chain atoms in the proteins. J Biochem 2006; 140:543-52. [PMID: 16945938 DOI: 10.1093/jb/mvj184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
We have developed a computational method of protein design to detect amino acid sequences that are adaptable to given main-chain coordinates of a protein. In this method, the selection of amino acid types employs a Metropolis Monte Carlo method with a scoring function in conjunction with the approximation of free energies computed from 3D structures. To compute the scoring function, a side-chain prediction using another Metropolis Monte Carlo method was performed to select structurally suitable side-chain conformations from a side-chain library. In total, two layers of Monte Carlo procedures were performed, first to select amino acid types (1st layer Monte Carlo) and then to predict side-chain conformations (2nd layer Monte Carlo). We applied this method to sequence design for the entire sequence on the SH3 domain, Protein G, and BPTI. The predicted sequences were similar to those of the wild-type proteins. We compared the results of the predictions with and without the 2nd layer Monte Carlo method. The results revealed that the two-layer Monte Carlo method produced better sequence similarity to the wild-type proteins than the one-layer method. Finally, we applied this method to neuraminidase of influenza virus. The results were consistent with the sequences identified from the isolated viruses.
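The two-layer procedure described in this abstract can be sketched roughly as nested Metropolis loops: an outer loop that proposes amino acid changes and an inner loop that samples side-chain conformations to score each proposal. The sketch below is not the authors' implementation. The alphabet, rotamer count, temperatures, step counts, and especially `toy_energy` are hypothetical stand-ins for the structure-based scoring function mentioned in the abstract.

```python
import math
import random

AMINO_ACIDS = "AVLK"   # toy alphabet (assumption)
N_ROTAMERS = 3         # toy number of rotamers per residue (assumption)

def toy_energy(seq, rots):
    """Hypothetical energy: a per-residue term plus a nearest-neighbour term."""
    e = 0.0
    for i, (aa, r) in enumerate(zip(seq, rots)):
        e += ((ord(aa) + r + i) % 5) - 2.0
        if i + 1 < len(seq):
            e += 0.5 * (((ord(aa) + ord(seq[i + 1]) + r + rots[i + 1]) % 3) - 1.0)
    return e

def metropolis_accept(delta, temperature, rng):
    """Standard Metropolis rule: always accept downhill, sometimes uphill."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)

def pack_side_chains(seq, rng, steps=200, temperature=0.5):
    """2nd layer: sample rotamers for a fixed sequence, return the best energy seen."""
    rots = [0] * len(seq)
    energy = toy_energy(seq, rots)
    best = energy
    for _ in range(steps):
        i = rng.randrange(len(seq))
        old = rots[i]
        rots[i] = rng.randrange(N_ROTAMERS)
        new_energy = toy_energy(seq, rots)
        if metropolis_accept(new_energy - energy, temperature, rng):
            energy = new_energy
            best = min(best, energy)
        else:
            rots[i] = old  # reject: restore the previous rotamer
    return best

def design_sequence(length, rng, steps=100, temperature=0.5):
    """1st layer: sample amino acid types, scoring each proposal with the 2nd layer."""
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = pack_side_chains(seq, rng)
    best_seq, best_score = "".join(seq), score
    for _ in range(steps):
        i = rng.randrange(length)
        old = seq[i]
        seq[i] = rng.choice(AMINO_ACIDS)
        new_score = pack_side_chains(seq, rng)
        if metropolis_accept(new_score - score, temperature, rng):
            score = new_score
            if score < best_score:
                best_seq, best_score = "".join(seq), score
        else:
            seq[i] = old  # reject: restore the previous amino acid
    return best_seq, best_score

seq, score = design_sequence(6, random.Random(0), steps=20)
```

Skipping `pack_side_chains` and scoring every sequence with fixed rotamers would correspond to the one-layer variant that the abstract compares against.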
Affiliation(s)
- Koji Ogata
- Centre for Computational Biology, The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada
|
5
|
Pokala N, Handel TM. Energy Functions for Protein Design: Adjustment with Protein–Protein Complex Affinities, Models for the Unfolded State, and Negative Design of Solubility and Specificity. J Mol Biol 2005; 347:203-27. [PMID: 15733929 DOI: 10.1016/j.jmb.2004.12.019] [Citation(s) in RCA: 157] [Impact Index Per Article: 8.3] [Received: 07/13/2004] [Revised: 12/05/2004] [Accepted: 12/09/2004] [Indexed: 11/16/2022]
Abstract
The development of the EGAD program and energy function for protein design is described. In contrast to most protein design methods, which require several empirical parameters or heuristics such as patterning of residues or rotamers, EGAD has a minimalist philosophy; it uses very few empirical factors to account for inaccuracies resulting from the use of fixed backbones and discrete rotamers in protein design calculations, and describes the unfolded state, aggregates, and alternative conformers explicitly with physical models instead of fitted parameters. This approach unveils important issues in protein design that are often camouflaged by heuristic-emphasizing methods. Inter-atom energies are modeled with the OPLS-AA all-atom forcefield, electrostatics with the generalized Born continuum model, and the hydrophobic effect with a solvent-accessible surface area-dependent term. Experimental characterization of proteins designed with an unmodified version of the energy function revealed problems with under-packing, stability, aggregation, and structural specificity. Under-packing was addressed by modifying the van der Waals function. By optimizing only three parameters, the effects of >400 mutations on protein-protein complex formation were predicted to within 1.0 kcal/mol. As an independent test, this modified energy function was used to predict the stabilities of >1500 mutants to within 1.0 kcal/mol; this required a physical model of the unfolded state that includes more interactions than traditional tripeptide-based models. Solubility and structural specificity were addressed with simple physical approximations of aggregation and conformational equilibria. The complete energy function can design protein sequences that have high levels of identity with their natural counterparts, and have predicted structural properties more consistent with soluble and uniquely folded proteins than the initial designs.
Affiliation(s)
- Navin Pokala
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
|
6
|
Shifman JM, Mayo SL. Exploring the origins of binding specificity through the computational redesign of calmodulin. Proc Natl Acad Sci U S A 2003; 100:13274-9. [PMID: 14597710 PMCID: PMC263780 DOI: 10.1073/pnas.2234277100] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.6] [Indexed: 11/18/2022]
Abstract
Calmodulin (CaM) is a second messenger protein that has evolved to bind tightly to a variety of targets and, as such, exhibits low binding specificity. We redesigned CaM by using a computational protein design algorithm to improve its binding specificity for one of its targets, smooth muscle myosin light chain kinase (smMLCK). Residues in or near the CaM/smMLCK binding interface were optimized; CaM interactions with alternative targets were not directly considered in the optimization. The predicted CaM sequences were constructed and tested for binding to a set of eight targets including smMLCK. The best CaM variant, obtained from a calculation that emphasized intermolecular interactions, showed up to a 155-fold increase in binding specificity. The increase in binding specificity was not due to improved binding to smMLCK, but due to decreased binding to the alternative targets. This finding is consistent with the fact that the sequence of wild-type CaM is nearly optimal for interactions with numerous targets.
Affiliation(s)
- Julia M Shifman
- Howard Hughes Medical Institute and Division of Biology, California Institute of Technology, Mail Code 114-96, Pasadena, CA 91125, USA
|
7
|
Qin M, Wang J, Tang Y, Wang W. Folding behaviors of lattice model proteins with three kinds of contact potentials. Phys Rev E Stat Nonlin Soft Matter Phys 2003; 67:061905. [PMID: 16241259 DOI: 10.1103/physreve.67.061905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Received: 01/30/2003] [Indexed: 05/04/2023]
Abstract
The interaction potentials between the amino acids are very important in the study of protein folding and design. In this work, the folding behaviors of lattice model protein chains are studied using three kinds of contact potentials between the beads. For these three cases, a number of sequences are designed using the Z-score method, and then their folding behaviors are obtained via Monte Carlo simulations for different sizes of the chains. It is found that the proper weakening of hydrophobicity may speed up the folding and the elimination of the mixing interaction terms may deteriorate the foldability. The different features of the foldability are discussed by comparing the characteristics of the energy landscapes of these model chains. The formations of various contacts are also analyzed, which provide us with some microscopic information on the model systems and interaction potentials.
Affiliation(s)
- Meng Qin
- National Laboratory of Solid State Microstructure and Department of Physics, Nanjing University, China
|
8
|
Jin W, Kambara O, Sasakawa H, Tamura A, Takada S. De novo design of foldable proteins with smooth folding funnel: automated negative design and experimental verification. Structure 2003; 11:581-90. [PMID: 12737823 DOI: 10.1016/s0969-2126(03)00075-3] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.1] [Indexed: 11/20/2022]
Abstract
De novo sequence design of foldable proteins provides a way of investigating principles of protein architecture. We performed fully automated sequence design for a target structure having a three-helix bundle topology and synthesized the designed sequences. Our design principle is different from the conventional approach, in that instead of optimizing interactions within the target structure, we design the global shape of the protein folding funnel. This includes automated implementation of negative design by explicitly requiring higher free energy of the denatured state. The designed sequences do not have significant similarity to those of any natural proteins. The NMR and CD spectroscopic data indicated that one designed sequence has a well-defined three-dimensional structure as well as alpha-helical content consistent with the target.
Affiliation(s)
- Wenzhen Jin
- Graduate School of Science and Technology, Japan Science and Technology Corporation, Kobe University, Rokkodai, Nada, 657-8501, Kobe, Japan
|
9
|
Zou J, Saven JG. Using self-consistent fields to bias Monte Carlo methods with applications to designing and sampling protein sequences. J Chem Phys 2003. [DOI: 10.1063/1.1539845] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
|
10
|
Abstract
Computational methods play a central role in the rational design of novel proteins. The present work describes a new hybrid exact rotamer optimization (HERO) method that builds on previous dead-end elimination algorithms to yield dramatic performance enhancements. Measured on experimentally validated physical models, these improvements make it possible to perform previously intractable designs of entire protein core, surface, or boundary regions. Computational demonstrations include a full core design of the variable domains of the light and heavy chains of catalytic antibody 48G7 FAB with 74 residues and 10^128 conformations, a full core/boundary design of the beta1 domain of protein G with 25 residues and 10^53 conformations, and a full surface design of the beta1 domain of protein G with 27 residues and 10^60 conformations. In addition, a full sequence design of the beta1 domain of protein G is used to demonstrate the strong dependence of algorithm performance on the exact form of the potential function and the fidelity of the rotamer library. These results emphasize that search algorithm performance for protein design can only be meaningfully evaluated on physical models that have been subjected to experimental scrutiny. The new algorithm greatly facilitates ongoing efforts to engineer increasingly complex protein features.
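For orientation, the dead-end elimination idea that HERO is said to build on can be sketched with the original single-rotamer criterion: a rotamer is removed when its best possible energy is still worse than the worst possible energy of some competing rotamer at the same position. This is not the HERO algorithm itself, whose details are not given in the abstract. The data layout (`self_e`, `pair_e`) and the example energies are hypothetical.

```python
def dee_eliminate(self_e, pair_e):
    """Apply the basic dead-end elimination criterion until nothing changes.

    self_e[i][r]       : self energy of rotamer r at position i
    pair_e[(i, j)][r][s]: pair energy of rotamer r at i with rotamer s at j (i < j)
    Returns a list of sets with the surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]

    def pair(i, r, j, s):
        # The pair table is stored once per unordered position pair.
        return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                # Best case for r: most favourable partner at every other position.
                best_r = self_e[i][r] + sum(
                    min(pair(i, r, j, s) for s in alive[j])
                    for j in range(n) if j != i
                )
                for t in alive[i] - {r}:
                    # Worst case for t: least favourable partner everywhere.
                    worst_t = self_e[i][t] + sum(
                        max(pair(i, t, j, s) for s in alive[j])
                        for j in range(n) if j != i
                    )
                    if best_r > worst_t:
                        alive[i].discard(r)  # r cannot be in the global minimum
                        changed = True
                        break
    return alive

# Hypothetical two-position example with two rotamers each.
example_self = [[0.0, 10.0], [0.0, 0.0]]
example_pair = {(0, 1): [[0.0, 1.0], [0.0, 1.0]]}
survivors = dee_eliminate(example_self, example_pair)
```

Because a rotamer is only discarded when a competitor at the same position survives, every position keeps at least one rotamer, and the pruned space still contains the global minimum-energy combination.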
Affiliation(s)
- D Benjamin Gordon
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02142, USA
|
11
|
Abstract
We report the computational redesign of the protein-binding interface of calmodulin (CaM), a small, ubiquitous Ca2+-binding protein that is known to bind to and regulate a variety of functionally and structurally diverse proteins. The CaM binding interface was optimized to improve binding specificity towards one of its natural targets, smooth muscle myosin light chain kinase (smMLCK). The optimization was performed using optimization of rotamers by iterative techniques (ORBIT), a protein design program that utilizes a physically based force-field and the Dead-End Elimination theorem to compute sequences that are optimal for a given protein scaffold. Starting from the structure of the CaM-smMLCK complex, the program considered 10^22 amino acid residue sequences to obtain the lowest-energy CaM sequence. The resulting eightfold mutant, CaM_8, was constructed and tested for binding to a set of seven CaM target peptides. CaM_8 displayed high binding affinity to the smMLCK peptide (1.3 nM), similar to that of the wild-type protein (1.8 nM). The affinity of CaM_8 to six other target peptides was reduced, as intended, by 1.5-fold to 86-fold. Hence, CaM_8 exhibited increased binding specificity, preferring the smMLCK peptide to the other targets. Studies of this type may increase our understanding of the origins of binding specificity in protein-ligand complexes and may provide valuable information that can be used in the design of novel protein receptors and/or ligands.
Affiliation(s)
- Julia M Shifman
- Howard Hughes Medical Institute and Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA
|
12
|
Abstract
Plastocyanin, like many other metalloproteins, does not undergo reversible folding, which is thought to be due to an irreversible conformational change in the copper-binding site. Moreover, apoplastocyanin's ability to adopt a native tertiary structure is highly salt-dependent, and even in high salt, it has an irreversible thermal denaturation. Here, we report a designed apoplastocyanin variant, PCV, that is well folded and has reversible folding in both high and low salt conditions. This variant provides a tractable model for understanding and designing protein beta-sheets.
Affiliation(s)
- Deepshikha Datta
- Division of Biology (Biochemistry and Molecular Biophysics option), California Institute of Technology, 1200 East California Blvd, Pasadena, CA 91125, USA
|
13
|
Abstract
Predicting protein sequences that fold into specific native three-dimensional structures is a problem of great potential complexity. Although the complete solution is ultimately rooted in understanding the physical chemistry underlying the complex interactions between amino acid residues that determine protein stability, recent work shows that empirical information about these first principles is embedded in the statistics of protein sequence and structure databases. This review focuses on the use of 'knowledge-based' potentials derived from these databases in designing proteins. In addition, the data suggest how the study of these empirical potentials might impact our fundamental understanding of the energetic principles of protein structure.
Affiliation(s)
- William P Russ
- Howard Hughes Medical Institute and Department of Pharmacology, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas 75390-9050, USA
|
14
|
Abstract
The progress achieved by several groups in the field of computational protein design shows that successful design methods include two major features: efficient algorithms to deal with the combinatorial exploration of sequence space and optimal energy functions to rank sequences according to their fitness for the given fold.
Affiliation(s)
- Joaquim Mendes
- European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
|
15
|
Abstract
The field of computational protein design is reaching its adolescence. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. Further developments of scoring functions, sampling strategies, and optimization methods will expand the range of applicability of computational protein design to larger and more varied systems, with greater incidence of success. Developments in this field are beginning to have significant impact on biotechnology and chemical biology.
Affiliation(s)
- C M Kraemer-Pecore
- The Pennsylvania State University, Department of Chemistry, Chandlee Laboratory, University Park, PA 16802, USA
|
16
|
Affiliation(s)
- J G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
17
|
Abstract
Protein design has become a powerful approach for understanding the relationship between amino acid sequence and 3-dimensional structure. In the past 5 years, there have been many breakthroughs in the development of computational methods that allow the selection of novel sequences given the structure of a protein backbone. Successful design of protein scaffolds has now paved the way for new endeavors to design function. The ability to design sequences compatible with a fold may also be useful in structural and functional genomics by expanding the range of proteins used for fold recognition and for the identification of functionally important domains from multiple sequence alignments.
Affiliation(s)
- N Pokala
- Department of Molecular and Cell Biology, University of California, 229 Stanley Hall, Berkeley, California 94720, USA
|
18
|
Marshall SA, Mayo SL. Achieving stability and conformational specificity in designed proteins via binary patterning. J Mol Biol 2001; 305:619-31. [PMID: 11152617 DOI: 10.1006/jmbi.2000.4319] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.5] [Indexed: 11/22/2022]
Abstract
We have developed a method to determine the optimal binary pattern (arrangement of hydrophobic and polar amino acids) of a target protein fold prior to amino acid sequence selection in protein design studies. A solvent accessible surface is generated for a target fold using its backbone coordinates and "generic" side-chains, which are constructs whose size and shape are similar to an average amino acid. Each position is classified as hydrophobic or polar according to the solvent exposure of its generic side-chain. The method was tested by analyzing a set of proteins in the Protein Data Bank and by experimentally constructing and analyzing a set of engrailed homeodomain variants whose binary patterns were systematically varied. Selection of the optimal binary pattern results in a designed protein that is monomeric, well-folded, and hyperthermophilic. Homeodomain variants with fewer hydrophobic residues are destabilized, while additional hydrophobic residues induce aggregation. Binary patterning, in conjunction with a force field that models folded state energies, appears sufficient to satisfy two basic goals of protein design: stability and conformational specificity.
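The classification step described in this abstract amounts to thresholding a per-position exposure value. A minimal sketch follows, assuming the exposure of each generic side chain has already been computed as a fraction between 0 and 1. The cutoff value, the function name, and the example numbers are hypothetical; the abstract does not say how the boundary between hydrophobic and polar positions is chosen.

```python
def binary_pattern(exposure, cutoff=0.3):
    """Label each position H (hydrophobic) or P (polar) from solvent exposure.

    exposure: per-position exposed fraction of a generic side chain (0 to 1)
    cutoff  : hypothetical threshold; positions above it are treated as polar
    """
    return "".join("P" if x > cutoff else "H" for x in exposure)

# Hypothetical exposure values for a four-residue stretch.
pattern = binary_pattern([0.05, 0.80, 0.30, 0.31])
```

Lowering the cutoff yields fewer hydrophobic positions and raising it yields more, which mirrors the systematic variation of binary patterns tested on the homeodomain variants.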
Affiliation(s)
- S A Marshall
- Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 East California Blvd., Pasadena, CA 91125, USA
|