601
Yi Q, Rajagopal P, Klevit RE, Baker D. Structural and kinetic characterization of the simplified SH3 domain FP1. Protein Sci 2003; 12:776-83. [PMID: 12649436 PMCID: PMC2323857 DOI: 10.1110/ps.0238603] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Abstract
The simplified SH3 domain sequence, FP1, obtained in phage display selection experiments has an amino acid composition that is 95% Ile, Lys, Glu, Ala, Gly. Here we use NMR to investigate the tertiary structure of FP1. We find that the overall topology of FP1 resembles that of the src SH3 domain, the hydrogen-deuterium exchange and chemical shift perturbation profiles are similar to those of naturally occurring SH3 domains, and the ¹⁵N relaxation rates are in the range of naturally occurring small proteins. Guided by the structure, we further simplify the FP1 sequence and compare the effects on folding kinetics of point mutations in FP1 and the wild-type src SH3 domain. The results suggest that the folding transition state of FP1 is similar to but somewhat less polarized than that of the wild-type src SH3 domain.
Affiliation(s)
- Qian Yi
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
602
McFarland BJ, Kortemme T, Yu SF, Baker D, Strong RK. Symmetry recognizing asymmetry: analysis of the interactions between the C-type lectin-like immunoreceptor NKG2D and MHC class I-like ligands. Structure 2003; 11:411-22. [PMID: 12679019 DOI: 10.1016/s0969-2126(03)00047-9] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
Abstract
Engagement of diverse protein ligands (MIC-A/B, ULBP, Rae-1, or H60) by NKG2D immunoreceptors mediates elimination of tumorigenic or virally infected cells by natural killer and T cells. Three previous NKG2D-ligand complex structures show the homodimeric receptor interacting with the monomeric ligands in similar 2:1 complexes, with an equivalent surface on each NKG2D monomer binding intimately to a total of six distinct ligand surfaces. Here, the crystal structure of free human NKG2D and in silico and in vitro alanine-scanning mutagenesis analyses of the complex interfaces indicate that NKG2D recognition degeneracy is not explained by a classical induced-fit mechanism. Rather, the divergent ligands appear to utilize different strategies to interact with structurally conserved elements of the consensus NKG2D binding site.
Affiliation(s)
- Benjamin J McFarland
- The Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
603
Kortemme T, Morozov AV, Baker D. An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein-protein complexes. J Mol Biol 2003; 326:1239-59. [PMID: 12589766 DOI: 10.1016/s0022-2836(03)00021-4] [Citation(s) in RCA: 379] [Impact Index Per Article: 18.0] [Indexed: 10/27/2022]
Abstract
Hydrogen bonding is a key contributor to the specificity of intramolecular and intermolecular interactions in biological systems. Here, we develop an orientation-dependent hydrogen bonding potential based on the geometric characteristics of hydrogen bonds in high-resolution protein crystal structures, and evaluate it using four tests related to the prediction and design of protein structures and protein-protein complexes. The new potential is superior to the widely used Coulomb model of hydrogen bonding in prediction of the sequences of proteins and protein-protein interfaces from their structures, and improves discrimination of correctly docked protein-protein complexes from large sets of alternative structures.
Affiliation(s)
- Tanja Kortemme
- Howard Hughes Medical Institute and Department of Biochemistry, J-567 Health Sciences, Box 357350, University of Washington, Seattle, WA 98195-7350, USA
604
Morozov AV, Kortemme T, Baker D. Evaluation of Models of Electrostatic Interactions in Proteins. J Phys Chem B 2003. [DOI: 10.1021/jp0267555] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Affiliation(s)
- Alexandre V. Morozov
- Department of Physics, University of Washington, Box 351560, Seattle, Washington 98195-1560, and Department of Biochemistry, University of Washington, Box 357350, Seattle, Washington 98195-7350
- Tanja Kortemme
- Department of Physics, University of Washington, Box 351560, Seattle, Washington 98195-1560, and Department of Biochemistry, University of Washington, Box 357350, Seattle, Washington 98195-7350
- David Baker
- Department of Physics, University of Washington, Box 351560, Seattle, Washington 98195-1560, and Department of Biochemistry, University of Washington, Box 357350, Seattle, Washington 98195-7350
605
Ogata K, Jaramillo A, Cohen W, Briand JP, Connan F, Choppin J, Muller S, Wodak SJ. Automatic sequence design of major histocompatibility complex class I binding peptides impairing CD8+ T cell recognition. J Biol Chem 2003; 278:1281-90. [PMID: 12411444 DOI: 10.1074/jbc.m206853200] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022] Open
Abstract
An automatic protein design procedure was used to compute amino acid sequences of peptides likely to bind the HLA-A2 major histocompatibility complex (MHC) class I allele. The only information used by the procedure is a structural template, a rotamer library, and a well established classical empirical force field. The calculations are performed on six different templates from x-ray structures of HLA-A0201-peptide complexes. Each template consists of the bound peptide backbone and the full atomic coordinates of the MHC protein. Sequences within 2 kcal/mol of the minimum energy sequence are computed for each template, and the sequences from all the templates are combined and ranked by their energies. The five lowest energy peptide sequences and five other low energy sequences re-ranked on the basis of their similarity to peptides known to bind the same MHC allele are chemically synthesized and tested for their ability to bind and form stable complexes with the HLA-A2 molecule. The most efficient binders are also tested for inhibition of the T cell receptor recognition of two known CD8+ T effectors. Results show that all 10 peptides bind the expected MHC protein. The six strongest binders also form stable HLA-A2-peptide complexes, albeit to varying degrees, and three peptides display significant inhibition of CD8+ T cell recognition. These results are rationalized in light of our knowledge of the three-dimensional structures of the HLA-A2-peptide and HLA-A2-peptide-T cell receptor complexes.
Affiliation(s)
- Koji Ogata
- Service de Conformation de Macromolécules Biologiques et Bioinformatique, CP263, Centre de Biologie Structurale et Bioinformatique, Université Libre de Bruxelles, Blvd. du Triomphe, Belgium
606
Nauli S, Kuhlman B, Le Trong I, Stenkamp RE, Teller D, Baker D. Crystal structures and increased stabilization of the protein G variants with switched folding pathways NuG1 and NuG2. Protein Sci 2002; 11:2924-31. [PMID: 12441390 PMCID: PMC2373753 DOI: 10.1110/ps.0216902] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.9] [Indexed: 10/27/2022]
Abstract
We recently described two protein G variants (NuG1 and NuG2) with redesigned first hairpins that were almost twice as stable, folded 100-fold faster, and had a switched folding mechanism relative to the wild-type protein. To test the structural accuracy of our design algorithm and to provide insights into the dramatic changes in the kinetics and thermodynamics of folding, we have now determined the crystal structures of NuG1 and NuG2 to 1.8 Å and 1.85 Å, respectively. We find that they adopt hairpin structures that are closer to the computational models than to wild-type protein G; the RMSD of the NuG1 hairpin to the design model and the wild-type structure are 1.7 Å and 5.1 Å, respectively. The crystallographic B factor in the redesigned first hairpin of NuG1 is systematically higher than the second hairpin, suggesting that the redesigned region is somewhat less rigid. A second round of structure-based design yielded new variants of NuG1 and NuG2, which are further stabilized by 0.5 kcal/mole and 0.9 kcal/mole.
Affiliation(s)
- Sehat Nauli
- Department of Biochemistry, University of Washington, Seattle 98195, USA
607
Larson SM, England JL, Desjarlais JR, Pande VS. Thoroughly sampling sequence space: large-scale protein design of structural ensembles. Protein Sci 2002; 11:2804-13. [PMID: 12441379 PMCID: PMC2373757 DOI: 10.1110/ps.0203902] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.7] [Received: 02/05/2002] [Revised: 08/16/2002] [Accepted: 09/04/2002] [Indexed: 10/27/2022]
Abstract
Modeling the inherent flexibility of the protein backbone as part of computational protein design is necessary to capture the behavior of real proteins and is a prerequisite for the accurate exploration of protein sequence space. We present the results of a broad exploration of sequence space, with backbone flexibility, through a novel approach: large-scale protein design to structural ensembles. A distributed computing architecture has allowed us to generate hundreds of thousands of diverse sequences for a set of 253 naturally occurring proteins, allowing exciting insights into the nature of protein sequence space. Designing to a structural ensemble produces a much greater diversity of sequences than previous studies have reported, and homology searches using profiles derived from the designed sequences against the Protein Data Bank show that the relevance and quality of the sequences is not diminished. The designed sequences have greater overall diversity than corresponding natural sequence alignments, and no direct correlations are seen between the diversity of natural sequence alignments and the diversity of the corresponding designed sequences. For structures in the same fold, the sequence entropies of the designed sequences cluster together tightly. This tight clustering of sequence entropies within a fold and the separation of sequence entropy distributions for different folds suggest that the diversity of designed sequences is primarily determined by a structure's overall fold, and that the designability principle postulated from studies of simple models holds in real proteins. This has important implications for experimental protein design and engineering, as well as providing insight into protein evolution.
Affiliation(s)
- Stefan M Larson
- Chemistry Department and Biophysics Program, Stanford University, California 94305, USA
608
Gan HH, Perlow RA, Roy S, Ko J, Wu M, Huang J, Yan S, Nicoletta A, Vafai J, Sun D, Wang L, Noah JE, Pasquali S, Schlick T. Analysis of protein sequence/structure similarity relationships. Biophys J 2002; 83:2781-91. [PMID: 12414710 PMCID: PMC1302362 DOI: 10.1016/s0006-3495(02)75287-9] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 11/16/2022] Open
Abstract
Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S(?)) and expected/unexpected dissimilarity (D and D(?)) relationships. In the expected similarity region S, we show that trends in the sequence/structure relation can be derived based on the requirement of protein stability and the energetics of sequence and structural changes. Specifically, we derive a formula relating sequence and structural deviations to a parameter characterizing protein stiffness; the formula fits the data reasonably well. We suggest that the absence of data in region S(?) (high structural but low sequence similarity) is due to unfavorable energetics. In contrast to region S, region D(?) (high sequence but low structural similarity) is well-represented by proteins that can accommodate large structural changes. Our analyses indicate that there are several categories of similarity relationships and that protein energetics provide a basis for understanding these relationships.
Affiliation(s)
- Hin Hark Gan
- Department of Chemistry, New York University Medical School, New York, NY 10012, USA
609
Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein-protein complexes. Proc Natl Acad Sci U S A 2002; 99:14116-21. [PMID: 12381794 PMCID: PMC137846 DOI: 10.1073/pnas.202485799] [Citation(s) in RCA: 610] [Impact Index Per Article: 27.7] [Received: 05/06/2002] [Accepted: 08/12/2002] [Indexed: 11/18/2022] Open
Abstract
Protein-protein recognition plays a central role in most biological processes. Although the structures of many protein-protein complexes have been solved in molecular detail, general rules describing affinity and selectivity of protein-protein interactions do not accurately account for the extremely diverse nature of the interfaces. We investigate the extent to which a simple physical model can account for the wide range of experimentally measured free energy changes brought about by alanine mutation at protein-protein interfaces. The model successfully predicts the results of alanine scanning experiments on globular proteins (743 mutations) and 19 protein-protein interfaces (233 mutations) with average unsigned errors of 0.81 kcal/mol and 1.06 kcal/mol, respectively. The results test our understanding of the dominant contributions to the free energy of protein-protein interactions, can guide experiments aimed at the design of protein interaction inhibitors, and provide a stepping-stone to important applications such as interface redesign.
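The "average unsigned error" quoted in this abstract is the mean absolute difference between predicted and measured free energy changes. A minimal sketch of that statistic follows; the function name and the three ddG values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_unsigned_error(predicted, observed):
    """Mean absolute difference between predicted and measured ddG values (kcal/mol)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(predicted) == 0:
        raise ValueError("need at least one mutation")
    return mean([abs(p - o) for p, o in zip(predicted, observed)])


# Hypothetical ddG values for three alanine mutations (illustrative only).
predicted_ddg = [1.5, 0.25, 2.0]
observed_ddg = [1.0, 0.75, 3.0]
print(average_unsigned_error(predicted_ddg, observed_ddg))
```

Because the errors are unsigned, over-predictions and under-predictions do not cancel, so the statistic reflects typical deviation rather than systematic bias.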
Affiliation(s)
- Tanja Kortemme
- Howard Hughes Medical Institute and Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
610
Jaramillo A, Wernisch L, Héry S, Wodak SJ. Folding free energy function selects native-like protein sequences in the core but not on the surface. Proc Natl Acad Sci U S A 2002; 99:13554-9. [PMID: 12368470 PMCID: PMC129712 DOI: 10.1073/pnas.212068599] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.6] [Indexed: 11/18/2022] Open
Abstract
An automatic protein design procedure is used to select amino acid sequences that optimize the folding free energy function for a given protein. The only information used in designing the sequences is a set of known backbone structures for each protein, a rotamer library, and a well established classical empirical force field, which relies on basic physical chemical principles that underlie molecular interactions and protein stability, and has not been adjusted to yield native-like sequences. Applying the procedure to 7 different known protein folds, representing a total of 45 different native protein structures, yields ensembles of designed sequences displaying remarkable similarity to their natural counterparts in the protein core, but which are distinctly non-native on the protein surface. We show that natural and designed sequences for a given fold score significantly higher than random sequences against profiles derived from both designed and natural sequence ensembles. Furthermore, we find that designed sequence profiles can be used to retrieve the native sequences for many of the analyzed proteins using standard PSI-BLAST searches in sequence databases. These findings may have important implications for our understanding of the selection pressures operating on natural protein sequences and hold promise for improving fold recognition.
Affiliation(s)
- Alfonso Jaramillo
- Unité de Conformation de Macromolécules Biologiques, CP160/16, Université Libre de Bruxelles, 50 Avenue F. D. Roosevelt, 1050 Brussels, Belgium
611
Chevalier BS, Kortemme T, Chadsey MS, Baker D, Monnat RJ, Stoddard BL. Design, activity, and structure of a highly specific artificial endonuclease. Mol Cell 2002; 10:895-905. [PMID: 12419232 DOI: 10.1016/s1097-2765(02)00690-1] [Citation(s) in RCA: 164] [Impact Index Per Article: 7.5] [Indexed: 10/26/2022]
Abstract
We have generated an artificial highly specific endonuclease by fusing domains of homing endonucleases I-DmoI and I-CreI and creating a new 1400 Å² protein interface between these domains. Protein engineering was accomplished by combining computational redesign and an in vivo protein-folding screen. The resulting enzyme, E-DreI (Engineered I-DmoI/I-CreI), binds a long chimeric DNA target site with nanomolar affinity, cleaving it precisely at a rate equivalent to its natural parents. The structure of an E-DreI/DNA complex demonstrates the accuracy of the protein interface redesign algorithm and reveals how catalytic function is maintained during the creation of the new endonuclease. These results indicate that it may be possible to generate novel highly specific DNA binding proteins from homing endonucleases.
Affiliation(s)
- Brett S Chevalier
- Fred Hutchinson Cancer Research Center and Graduate Program in Molecular and Cell Biology, University of Washington, 1100 Fairview Avenue N. A3-023, Seattle, WA 98109, USA
612
Bonneau R, Tsai J, Ruczinski I, Chivian D, Rohl C, Strauss CE, Baker D. Rosetta in CASP4: progress in ab initio protein structure prediction. Proteins 2002; Suppl 5:119-26. [PMID: 11835488 DOI: 10.1002/prot.1170] [Citation(s) in RCA: 178] [Impact Index Per Article: 8.1] [Indexed: 11/06/2022]
Abstract
Rosetta ab initio protein structure predictions in CASP4 were considerably more consistent and more accurate than previous ab initio structure predictions. Large segments were correctly predicted (>50 residues superimposed within an RMSD of 6.5 Å) for 16 of the 21 domains under 300 residues for which models were submitted. Models with the global fold largely correct were produced for several targets with new folds, and for several difficult fold recognition targets, the Rosetta models were more accurate than those produced with traditional fold recognition models. These promising results suggest that Rosetta may soon be able to contribute to the interpretation of genome sequence information.
Affiliation(s)
- R Bonneau
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
613
Xia Y, Levitt M. Roles of mutation and recombination in the evolution of protein thermodynamics. Proc Natl Acad Sci U S A 2002; 99:10382-7. [PMID: 12149452 PMCID: PMC124923 DOI: 10.1073/pnas.162097799] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.5] [Received: 02/16/2002] [Indexed: 11/18/2022] Open
Abstract
We present a comprehensive study of the evolutionary origin of the thermodynamic behavior of proteins. With the use of a simplified model, we exhaustively enumerate the space of all sequences and the space of all structures, simulate the evolutionary relationship between sequences and structures, and characterize the steady-state sequence distribution for all structures in terms of several thermodynamic variables. We assess the effects of two major forces of evolution: mutation and recombination. Three simplifications are made. First, a two-dimensional lattice model is used to represent protein sequences and structures. Second, proteins undergo neutral evolution so that the fitness landscape has a flat allowed region inside of which all sequences are equally fit. Third, we ignore otherwise important factors such as finite population size and evolutionary time. Two scenarios emerge from our study. The first occurs when evolution is dominated by mutation events. Even though the prototype sequence that is most mutationally robust is preferred by evolution, the preference is not strong enough to offset the huge size of sequence space. Most native sequences are located near the boundary of the fitness region and are marginally compatible with the native structure. The second scenario occurs when evolution is dominated by recombination events. Now evolutionary preference for prototype sequence is strong enough to overcome the size of sequence space so that most native sequences are located near the center of sequence-structure compatibility. We conclude that the relative frequency of mutation and recombination events is a major determinant of how optimal protein sequences are for their structures.
Affiliation(s)
- Yu Xia
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA.
614
Abstract
Rotamer libraries are widely used in protein structure prediction, protein design, and structure refinement. As the size of the structure data base has increased rapidly in recent years, it has become possible to derive well-refined rotamer libraries using strict criteria for data inclusion and for studying dependence of rotamer populations and dihedral angles on local structural features.
Affiliation(s)
- Roland L Dunbrack
- Institute for Cancer Research, Fox Chase Cancer Center, 7701 Burholme Avenue, Philadelphia PA 19111, USA.
615
Abstract
The progress achieved by several groups in the field of computational protein design shows that successful design methods include two major features: efficient algorithms to deal with the combinatorial exploration of sequence space and optimal energy functions to rank sequences according to their fitness for the given fold.
Affiliation(s)
- Joaquim Mendes
- European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
616
Shimizu S, Chan HS. Anti-cooperativity and cooperativity in hydrophobic interactions: Three-body free energy landscapes and comparison with implicit-solvent potential functions for proteins. Proteins 2002; 48:15-30. [PMID: 12012334 DOI: 10.1002/prot.10108] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
Abstract
Potentials of mean force (PMFs) of three-body hydrophobic association are investigated to gain insight into similar processes in protein folding. Free energy landscapes obtained from explicit simulations of three methanes in water are compared with that predicted by popular implicit-solvent effective potentials for the study of proteins. Explicit-water simulations show that for an extended range of three-methane configurations, hydrophobic association at 25 degrees C under atmospheric pressure is mostly anti-cooperative, that is, less favorable than if the interaction free energies were pairwise additive. Effects of free energy nonadditivity on the kinetic path of association and the temperature dependence of additivity are explored by using a three-methane system and simplified chain models. The prevalence of anti-cooperativity under ambient conditions suggests that driving forces other than hydrophobicity also play critical roles in protein thermodynamic cooperativity. We evaluate the effectiveness of several implicit-solvent potentials in mimicking explicit water simulated three-body PMFs. The favorability of the contact free energy minimum is found to be drastically overestimated by solvent accessible surface area (SASA). Both the SASA and a volume-based Gaussian solvent exclusion model fail to predict the desolvation barrier. However, this barrier is qualitatively captured by the molecular surface area model and a recent "hydrophobic force field." None of the implicit-solvent models tested are accurate for the entire range of three-methane configurations and several other thermodynamic signatures considered.
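The abstract defines anti-cooperativity as a three-body association that is less favorable than the pairwise-additive estimate. A minimal sketch of that comparison is given here, assuming the three-body free energy and the three two-body PMF values are already available at the configuration of interest. The function names and all numbers are hypothetical and do not come from the paper.

```python
def nonadditivity(w3, w2_pairs):
    """Three-body free energy minus the sum of the three two-body PMFs (same units)."""
    if len(w2_pairs) != 3:
        raise ValueError("a three-body configuration has exactly three pair terms")
    return w3 - sum(w2_pairs)


def classify(delta, tol=1e-9):
    """Label the sign of the non-additive term; lower free energy is more favorable."""
    if delta > tol:
        return "anti-cooperative"  # less favorable than pairwise additivity
    if delta < -tol:
        return "cooperative"  # more favorable than pairwise additivity
    return "additive"


# Hypothetical free energies in kcal/mol for one three-methane configuration.
delta = nonadditivity(-1.5, [-0.7, -0.7, -0.7])
print(classify(delta))
```

The tolerance `tol` is an arbitrary choice for this sketch; in practice it would be set from the statistical uncertainty of the simulated PMFs.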
Affiliation(s)
- Seishi Shimizu
- Department of Biochemistry and Department of Medical Genetics and Microbiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
617
Filikov AV, Hayes RJ, Luo P, Stark DM, Chan C, Kundu A, Dahiyat BI. Computational stabilization of human growth hormone. Protein Sci 2002; 11:1452-61. [PMID: 12021444 PMCID: PMC2373623 DOI: 10.1110/ps.3500102] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.4] [Indexed: 10/14/2022]
Abstract
Recombinant human growth hormone (hGH) is used worldwide for the treatment of pediatric hypopituitary dwarfism and in children suffering from low levels of hGH. It has limited stability in solution, and because of poor oral absorption, is administered by injection, typically several times a week. Development has therefore focused on more stable or sustained-release formulations and alternatives to injectable delivery that would increase bioavailability and make it easier for patients to use. We redesigned hGH computationally to improve its thermostability. A more stable variant of hGH could have improved pharmacokinetics or enhanced shelf-life, or be more amenable to use in alternate delivery systems and formulations. The computational design was performed using a previously developed combinatorial optimization algorithm based on the dead-end elimination theorem. The algorithm uses an empirical free energy function for scoring designed sequences. This function was augmented with a term that accounts for the loss of backbone and side-chain conformational entropy. The weighting factors for this term, the electrostatic interaction term, and the polar hydrogen burial term were optimized by minimizing the number of mutations designed by the algorithm relative to wild-type. Forty-five residues in the core of the protein were selected for optimization with the modified potential function. The proteins designed using the developed scoring function contained six to 10 mutations, showed enhancement in the melting temperature of up to 16 degrees C, and were biologically active in cell proliferation studies. These results show the utility of our free energy function in automated protein design.
618
Robson B, Mordasini T, Curioni A. Studies in the assessment of folding quality for protein modeling and structure prediction. J Proteome Res 2002; 1:115-33. [PMID: 12643532 DOI: 10.1021/pr0155228] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
A diagnostic for assessing the quality of a fold has been developed to which further criteria can be progressively added. The goal is to create a measure that can follow the status of a protein structure in a simulation or modeling process, when the answer (the experimental structure) is not known in advance, rather than simply reject deliberate misfolds. This places greater emphasis on the need to study, and calibrate against, marginal cases, i.e., unusual native structures, incomplete structures, partially erroneous X-ray structures, good models, poor models, and the effect of cofactors. The first three terms introduced in the diagnostic are appropriate core-forming properties or noncore properties of residues in relation to tertiary structure, appropriate neighboring structure density for each residue in relation to tertiary structure, and secondary structure consistency. While the method emerges as a useful simulation analysis tool, we find a need for further fine-tuning to diminish sensitivity to minor conformational changes that retain essential features of the fold, balanced against the need to obtain a more sensitive response when a conformational change involves less physically meaningful interatomic interactions. This dual utility is difficult to obtain: the investigation highlights some of the issues. Initial attempts to obtain it have led to terms in the diagnostic that are admittedly complex: simplifications must also be explored.
Affiliation(s)
- Barry Robson
- IBM Research, T. J. Watson Research Laboratory, Yorktown Heights, New York 10598, USA
619
Koehl P, Levitt M. Protein topology and stability define the space of allowed sequences. Proc Natl Acad Sci U S A 2002; 99:1280-5. [PMID: 11805293 PMCID: PMC122181 DOI: 10.1073/pnas.032405199] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022] Open
Abstract
We describe a new approach to explore and quantify the sequence space associated with a given protein structure. A set of sequences are optimized for a given target structure, using all-atom models and a physical energy function. Specificity of the sequence for its target is ensured by using the random energy model, which keeps the amino acid composition of the sequence constant. The designed sequences provide a multiple sequence alignment that describes the sequence space compatible with the structure of interest; here the size of this space is estimated by using an information entropy measure. In parallel, multiple alignments of naturally occurring sequences can be derived by using either sequence or structure alignments. We compared these 3 independent multiple sequence alignments for 10 different proteins, ranging in size from 56 to 310 residues. We observed that the subset of the sequence space derived by using our design procedure is similar in size to the sequence spaces observed in nature. These results suggest that the volume of sequence space compatible with a given protein fold is defined by the length of the protein as well as by the topology (i.e., geometry of the polypeptide chain) and the stability (i.e., free energy of denaturation) of the fold.
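The abstract states that the size of sequence space is estimated with an information entropy measure but does not give the formula. As an illustration only, the sketch below assumes the common per-column Shannon entropy of a multiple sequence alignment. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in nats) of the residue frequencies in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-column entropies for equal-length aligned sequences."""
    if not sequences:
        raise ValueError("need at least one sequence")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment of four hypothetical designed sequences.
toy_alignment = ["AKEIG", "AKELG", "AEEIG", "AKDIG"]
entropies = alignment_entropy(toy_alignment)
print(entropies)
```

Fully conserved columns give zero entropy, and more variable columns give larger values. Summing the column entropies gives one rough, position-independent estimate of how much sequence variation a fold tolerates.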
Affiliation(s)
- Patrice Koehl
- Department of Structural Biology, Fairchild Building, D109, Stanford University, Stanford, CA 94305, USA.
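The abstract of entry 619 estimates the size of a sequence space with an information entropy measure over a multiple sequence alignment. The sketch below illustrates one common form of such a measure, the Shannon entropy of each alignment column summed over columns. The toy alignment, the function names, and the choice to sum independent per-column entropies are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in nats) of the residue frequencies in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Sum of per-column entropies for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*sequences) walks the alignment column by column.
    return sum(column_entropy(col) for col in zip(*sequences))


# Hypothetical four-sequence alignment, used only to exercise the functions.
toy_alignment = ["ACDE", "ACDF", "AGDE", "ACDE"]
total = alignment_entropy(toy_alignment)
print(f"total entropy: {total:.3f} nats")
# exp(mean entropy per site) reads as an effective number of residue types per site.
print(f"effective residues per site: {math.exp(total / len(toy_alignment[0])):.3f}")
```

A fully conserved column contributes zero, and a column split evenly between two residues contributes ln 2, so a larger total indicates a more permissive set of sequences.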
|
620
|
Saarela JTA, Tuppurainen K, Peräkylä M, Santa H, Laatikainen R. Correlative motions and memory effects in molecular dynamics simulations of molecules: principal components and rescaled range analysis suggest that the motions of native BPTI are more correlated than those of its mutants. Biophys Chem 2002; 95:49-57. [PMID: 11880172 DOI: 10.1016/s0301-4622(01)00250-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
In this work MD simulations of the native bovine pancreatic trypsin inhibitor (BPTI) and 16 mutants were done in vacuum in order to study memory effects in the mutants using principal component analysis (PCA) and the rescaled range analysis (Hurst exponents). Both PCA and the rescaled range analysis support our previous proposition, based on PCA of lysozyme, that the motions of a native protein are more correlated than those of mutants. The methods are compared, the nature and applications of the rule and the role of the long-range correlations in MD time series (i.e. memory) are discussed in the context of collective motions.
Affiliation(s)
- Janne T A Saarela
- Department of Chemistry, University of Kuopio, P.O. Box 1627, FIN-70211, Kuopio, Finland
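Entry 620 quantifies memory in MD time series with rescaled range analysis (Hurst exponents). The sketch below shows the textbook form of that analysis: for several window sizes, average the rescaled range R/S over non-overlapping windows, then take the slope of log(R/S) against log(window size) as the Hurst exponent estimate. The window sizes, function names, and synthetic input series are assumptions for illustration. The paper's own protocol is not described in the abstract and may differ.

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic for one window, or None if the window has zero variance."""
    mean = sum(chunk) / len(chunk)
    cumulative = 0.0
    lowest = highest = 0.0
    for x in chunk:
        cumulative += x - mean          # running sum of deviations from the mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / len(chunk))
    return (highest - lowest) / std if std > 0 else None


def hurst_exponent(series, window_sizes=(8, 16, 32, 64, 128)):
    """Least-squares slope of log(mean R/S) versus log(window size)."""
    log_n, log_rs = [], []
    for n in window_sizes:
        values = []
        for start in range(0, len(series) - n + 1, n):
            rs = rescaled_range(series[start:start + n])
            if rs is not None:
                values.append(rs)
        if values:
            log_n.append(math.log(n))
            log_rs.append(math.log(sum(values) / len(values)))
    if len(log_n) < 2:
        raise ValueError("series too short for the requested window sizes")
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_n)
    return numerator / denominator


# Synthetic data only: uncorrelated noise and its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
walk, position = [], 0.0
for step in noise:
    position += step
    walk.append(position)

print(f"Hurst estimate, white noise: {hurst_exponent(noise):.2f}")
print(f"Hurst estimate, random walk: {hurst_exponent(walk):.2f}")
```

Uncorrelated noise should give an estimate in the neighbourhood of 0.5 (somewhat higher for short windows), while a strongly persistent series such as a random walk should give a value close to 1.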
|
621
|
Kuhlman B, O'Neill JW, Kim DE, Zhang KYJ, Baker D. Accurate computer-based design of a new backbone conformation in the second turn of protein L. J Mol Biol 2002; 315:471-7. [PMID: 11786026 DOI: 10.1006/jmbi.2001.5229] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.0] [Indexed: 11/22/2022]
Abstract
The rational design of loops and turns is a key step towards creating proteins with new functions. We used a computational design procedure to create new backbone conformations in the second turn of protein L. The Protein Data Bank was searched for alternative turn conformations, and sequences optimal for these turns in the context of protein L were identified using a Monte Carlo search procedure and an energy function that favors close packing. Two variants containing 12 and 14 mutations were found to be as stable as wild-type protein L. The crystal structure of one of the variants has been solved at a resolution of 1.9 Å, and the backbone conformation in the second turn is remarkably close to that of the in silico model (1.1 Å RMSD) while it differs significantly from that of wild-type protein L (the turn residues are displaced by an average of 7.2 Å). The folding rates of the redesigned proteins are greater than that of the wild-type protein and, in contrast to wild-type protein L, the second beta-turn appears to be formed at the rate-limiting step in folding.
Affiliation(s)
- Brian Kuhlman
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
|
622
|
Abstract
The field of computational protein design is reaching its adolescence. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. Further developments of scoring functions, sampling strategies, and optimization methods will expand the range of applicability of computational protein design to larger and more varied systems, with greater incidence of success. Developments in this field are beginning to have significant impact on biotechnology and chemical biology.
Affiliation(s)
- C M Kraemer-Pecore
- The Pennsylvania State University, Department of Chemistry, Chandlee Laboratory, University Park, PA 16802, USA
|
623
|
Abstract
We have developed a new method for the prediction of peptide sequences that bind to a protein, given a three-dimensional structure of the protein in complex with a peptide. By applying a recently developed sequence prediction algorithm and a novel ensemble averaging calculation, we generate a diverse collection of peptide sequences that are predicted to have significant affinity for the protein. Using output from the simulations, we create position-specific scoring matrices, or virtual interaction profiles (VIPs). Comparison of VIPs for a collection of binding motifs to sequences determined experimentally indicates that the prediction algorithm is accurate and applicable to a diverse range of structures. With these VIPs, one can scan protein sequence databases rapidly to seek binding partners of potential biological significance. Overall, this method can significantly enhance the information contained within a protein-peptide crystal structure, and enrich the data obtained by experimental selection methods such as phage display.
Affiliation(s)
- A M Wollacott
- Department of Chemistry, Pennsylvania State University, 406 Chandlee Laboratory, PA 16802, USA
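Entry 623 turns a set of predicted binding peptides into position-specific scoring matrices and uses them to scan sequence databases. The sketch below shows a generic way to do both steps: build a log-odds matrix from equal-length peptides, then slide it along a longer sequence and report the best-scoring window. The peptide set, the uniform background frequency, the pseudocount, and all function names are assumptions for illustration and do not reproduce the authors' ensemble-averaging calculation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Log-odds scoring matrix (one dict per position) from equal-length peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)      # assumed uniform background
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        pssm.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_window(pssm, sequence):
    """Return (score, start index) of the highest-scoring window in sequence."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the scoring matrix")
    scored = [
        (sum(col[aa] for col, aa in zip(pssm, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical peptides and target sequence (standard residues only).
peptides = ["PPLP", "PPVP", "APLP", "PPLP"]
profile = build_pssm(peptides)
score, start = best_window(profile, "GGSPPLPKKA")
print(f"best window starts at index {start} with score {score:.2f}")
```

The pseudocount keeps residues never observed at a position from scoring negative infinity, which matters when the peptide set is small.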
|
624
|
Affiliation(s)
- J G Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
625
|
Kuhlman B, O'Neill JW, Kim DE, Zhang KY, Baker D. Conversion of monomeric protein L to an obligate dimer by computational protein design. Proc Natl Acad Sci U S A 2001; 98:10687-91. [PMID: 11526208 PMCID: PMC58527 DOI: 10.1073/pnas.181354398] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.7] [Received: 04/04/2001] [Accepted: 07/11/2001] [Indexed: 11/18/2022]
Abstract
Protein L consists of a single alpha-helix packed on a four-stranded beta-sheet formed by two symmetrically opposed beta-hairpins. We use a computer-based protein design procedure to stabilize a domain-swapped dimer of protein L in which the second beta-turn straightens and the C-terminal strand inserts into the beta-sheet of the partner. The designed obligate dimer contains three mutations (A52V, N53P, and G55A) and has a dissociation constant of approximately 700 pM, which is comparable to the dissociation constant of many naturally occurring protein dimers. The structure of the dimer has been determined by x-ray crystallography and is close to the in silico model.
Affiliation(s)
- B Kuhlman
- Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
626
|
Verrelli BC, Eanes WF. The functional impact of Pgm amino acid polymorphism on glycogen content in Drosophila melanogaster. Genetics 2001; 159:201-10. [PMID: 11560897 PMCID: PMC1461781 DOI: 10.1093/genetics/159.1.201] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
Abstract
Earlier studies of the common PGM allozymes in Drosophila melanogaster reported no in vitro activity differences. However, our study of nucleotide variation observed that PGM allozymes are a heterogeneous mixture of amino acid polymorphisms. In this study, we analyze 10 PGM protein haplotypes with respect to PGM activity, thermostability, and adult glycogen content. We find a twofold difference in activity among PGM protein haplotypes that is associated with a threefold difference in glycogen content. The latitudinal clines for several Pgm amino acid polymorphisms show that high PGM activity, and apparently higher flux to glycogen synthesis, parallel the low activity clines at G6PD for reduced pentose shunt flux in northern latitudes. This suggests that amino acid polymorphism is under selection at this branch point and may be favored for increased metabolic storage associated with stress resistance and adaptation to temperate regions.
Affiliation(s)
- B C Verrelli
- Department of Ecology and Evolution, State University of New York, Stony Brook, New York 11794-5245, USA.
|
627
|
Mirny L, Shakhnovich E. Protein folding theory: from lattice to all-atom models. Annu Rev Biophys Biomol Struct 2001; 30:361-96. [PMID: 11340064 DOI: 10.1146/annurev.biophys.30.1.361] [Citation(s) in RCA: 232] [Impact Index Per Article: 10.1] [Indexed: 11/09/2022]
Abstract
This review focuses on recent advances in understanding protein folding kinetics in the context of nucleation theory. We present basic concepts such as nucleation, folding nucleus, and transition state ensemble and then discuss recent advances and challenges in theoretical understanding of several key aspects of protein folding kinetics. We cover recent topology-based approaches as well as evolutionary studies and molecular dynamics approaches to determine protein folding nucleus and analyze other aspects of folding kinetics. Finally, we briefly discuss successful all-atom Monte-Carlo simulations of protein folding and conclude with a brief outlook for the future.
Affiliation(s)
- L Mirny
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA.
|
628
|
Abstract
Protein design has become a powerful approach for understanding the relationship between amino acid sequence and 3-dimensional structure. In the past 5 years, there have been many breakthroughs in the development of computational methods that allow the selection of novel sequences given the structure of a protein backbone. Successful design of protein scaffolds has now paved the way for new endeavors to design function. The ability to design sequences compatible with a fold may also be useful in structural and functional genomics by expanding the range of proteins used for fold recognition and for the identification of functionally important domains from multiple sequence alignments.
Affiliation(s)
- N Pokala
- Department of Molecular and Cell Biology, University of California, 229 Stanley Hall, Berkeley, California 94720, USA
|
629
|
Abstract
The strong correlation between protein folding rates and the contact order suggests that folding rates are largely determined by the topology of the native structure. However, for a given topology, there may be several possible low free energy paths to the native state and the path that is chosen (the lowest free energy path) may depend on differences in interaction energies and local free energies of ordering in different parts of the structure. For larger proteins whose folding is assisted by chaperones, such as the Escherichia coli chaperonin GroEL, advances have been made in understanding both the aspects of an unfolded protein that GroEL recognizes and the mode of binding to the chaperonin. The possibility that GroEL can remove non-native proteins from kinetic traps by unfolding them either during polypeptide binding to the chaperonin or during the subsequent ATP-dependent formation of folding-active complexes with the co-chaperonin GroES has also been explored.
Affiliation(s)
- V Grantcharova
- Center for Genomics Research, Harvard University, Cambridge, MA 02138, USA
|
630
|
Bowers PM, Strauss CE, Baker D. De novo protein structure determination using sparse NMR data. J Biomol NMR 2000; 18:311-318. [PMID: 11200525 DOI: 10.1023/a:1026744431105] [Citation(s) in RCA: 105] [Impact Index Per Article: 4.4] [Indexed: 05/23/2023]
Abstract
We describe a method for generating moderate to high-resolution protein structures using limited NMR data combined with the ab initio protein structure prediction method Rosetta. Peptide fragments are selected from proteins of known structure based on sequence similarity and consistency with chemical shift and NOE data. Models are built from these fragments by minimizing an energy function that favors hydrophobic burial, strand pairing, and satisfaction of NOE constraints. Models generated using this procedure with approximately 1 NOE constraint per residue are in some cases closer to the corresponding X-ray structures than the published NMR solution structures. The method requires only the sparse constraints available during initial stages of NMR structure determination, and thus holds promise for increasing the speed with which protein solution structures can be determined.
Affiliation(s)
- P M Bowers
- Department of Biochemistry, University of Washington School of Medicine, Seattle 98195, USA
|