51
Abstract
Functional characterization of a protein is often facilitated by its 3D structure. However, the fraction of experimentally known 3D models is currently less than 1% due to the inherently time-consuming and complicated nature of structure determination techniques. Computational approaches are employed to bridge the gap between the number of known sequences and that of 3D models. Template-based protein structure modeling techniques rely on the study of the principles that dictate the 3D structure of natural proteins, viewed from an evolutionary perspective. Strategies for template-based structure modeling are discussed with a focus on comparative modeling, by reviewing the techniques available for each major step of the comparative modeling pipeline.
Affiliation(s)
- Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
52
Barwell J, Miller PS, Donnelly D, Poyner DR. Mapping interaction sites within the N-terminus of the calcitonin gene-related peptide receptor; the role of residues 23-60 of the calcitonin receptor-like receptor. Peptides 2010; 31:170-6. [PMID: 19913063 PMCID: PMC2809212 DOI: 10.1016/j.peptides.2009.10.021] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 09/18/2009] [Revised: 10/29/2009] [Accepted: 10/29/2009] [Indexed: 11/21/2022]
Abstract
The calcitonin receptor-like receptor (CLR) acts as a receptor for the calcitonin gene-related peptide (CGRP) but in order to recognize CGRP, it must form a complex with an accessory protein, receptor activity modifying protein 1 (RAMP1). Identifying the protein/protein and protein/ligand interfaces in this unusual complex would aid drug design. The role of the extreme N-terminus of CLR (Glu23-Ala60) was examined by an alanine scan and the results were interpreted with the help of a molecular model. The potency of CGRP at stimulating cAMP production was reduced at Leu41Ala, Gln45Ala, Cys48Ala and Tyr49Ala; furthermore, CGRP-induced receptor internalization at all of these receptors was also impaired. Ile32Ala, Gly35Ala and Thr37Ala all increased CGRP potency. CGRP specific binding was abolished at Leu41Ala, Ala44Leu, Cys48Ala and Tyr49Ala. There was significant impairment of cell surface expression of Gln45Ala, Cys48Ala and Tyr49Ala. Cys48 takes part in a highly conserved disulfide bond and is probably needed for correct folding of CLR. The model suggests that Gln45 and Tyr49 mediate their effects by interacting with RAMP1 whereas Leu41 and Ala44 are likely to be involved in binding CGRP. Ile32, Gly35 and Thr37 form a separate cluster of residues which modulate CGRP binding. The results from this study may be applicable to other family B GPCRs which can associate with RAMPs.
Affiliation(s)
- James Barwell
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK
- Philip S. Miller
- Institute of Membrane & Systems Biology, LIGHT Laboratories, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
- Dan Donnelly
- Institute of Membrane & Systems Biology, LIGHT Laboratories, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
- David R. Poyner
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK
- Corresponding author. Tel.: +44 121 204 3997; fax: +44 121 359 5142.
53
Bornot A, Etchebest C, de Brevern AG. A new prediction strategy for long local protein structures using an original description. Proteins 2009; 76:570-87. [PMID: 19241475 DOI: 10.1002/prot.22370] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022]
Abstract
A relevant and accurate description of three-dimensional (3D) protein structures can be achieved by characterizing recurrent local structures. In a previous study, we developed a library of 120 3D structural prototypes encompassing all known 11-residue-long local protein structures and ensuring a good quality of structural approximation. A local structure prediction method was also proposed. Here, overlapping properties of local protein structures in global ones are taken into account to characterize frequent local networks. At the same time, we propose a new long local structure prediction strategy which involves the use of evolutionary information coupled with Support Vector Machines (SVMs). Our prediction is evaluated by a stringent geometrical assessment. Every local structure prediction with a Cα RMSD less than 2.5 Å from the true local structure is considered correct. A global prediction rate of 63.1% is then reached, corresponding to an improvement of 7.7 points compared with the previous strategy. In the same way, the prediction of 88.33% of the 120 structural classes is improved, with an 8.65% mean gain. 85.33% of proteins have better prediction results, with a 9.43% average gain. An analysis of prediction rate per local network also supports the global improvement and gives insights into the potential of our method for predicting super local structures. Moreover, a confidence index for the direct estimation of prediction quality is proposed. Finally, our method is shown to be very competitive with cutting-edge strategies encompassing three categories of local structure predictions.
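As an illustration of the geometric assessment described in this abstract, the following minimal Python sketch counts a predicted fragment as correct when its Cα RMSD to the true fragment is below 2.5 Å. The function names are hypothetical, and the sketch assumes the two fragments are already superimposed (no optimal fitting is performed); it is not the authors' implementation.

```python
import math


def ca_rmsd(pred, ref):
    """RMSD (in the units of the input, e.g. angstroms) between two equally
    long lists of (x, y, z) C-alpha coordinates.

    Assumption: the fragments are already superimposed.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("fragments must be non-empty and of equal length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def is_correct(pred, ref, cutoff=2.5):
    """A prediction counts as correct when its RMSD is below the cutoff."""
    return ca_rmsd(pred, ref) < cutoff


def prediction_rate(pairs, cutoff=2.5):
    """Percentage of (predicted, true) fragment pairs judged correct."""
    hits = sum(is_correct(pred, ref, cutoff) for pred, ref in pairs)
    return 100.0 * hits / len(pairs)
```

A prediction rate such as the 63.1% quoted above would correspond to the value returned by `prediction_rate` over all evaluated fragments, under the stated assumptions.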
Affiliation(s)
- Aurélie Bornot
- INSERM UMR-S, Université Paris Diderot, Institut National de la Transfusion Sanguine, France.
54
Zhu J, Fan H, Periole X, Honig B, Mark AE. Refining homology models by combining replica-exchange molecular dynamics and statistical potentials. Proteins 2009; 72:1171-88. [PMID: 18338384 DOI: 10.1002/prot.22005] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 12/25/2022]
Abstract
A protocol is presented for the global refinement of homology models of proteins. It combines the advantages of temperature-based replica-exchange molecular dynamics (REMD) for conformational sampling and the use of statistical potentials for model selection. The protocol was tested using 21 models. Of these, 14 were models of 10 small proteins for which high-resolution crystal structures were available; the remainder were targets of the recent CASPR exercise. It was found that REMD in combination with currently available force fields could sample near-native conformational states starting from high-quality homology models. Conformations in which the backbone RMSD of secondary structure elements (SSE-RMSD) was lower than the starting value by 0.5-1.0 Å were found for 15 out of the 21 cases (average 0.82 Å). Furthermore, when a simple scoring function consisting of two statistical potentials was used to rank the structures, one or more structures with an SSE-RMSD at least 0.2 Å lower than the starting value were found among the five best-ranked structures in 11 out of the 21 cases. The average improvement in SSE-RMSD for the best models was 0.42 Å. However, none of the scoring functions tested identified the structures with the lowest SSE-RMSD as the best models, although all identified the native conformation as the one with the lowest energy. This suggests that while the proposed protocol proved effective for the refinement of high-quality models of small proteins, scoring functions remain one of the major limiting factors in structure refinement. This and other aspects by which the methodology could be further improved are discussed.
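Temperature-based replica exchange, as mentioned in this abstract, is commonly implemented with a Metropolis criterion on the potential energies and inverse temperatures of two neighboring replicas. The sketch below shows only that standard criterion. The function name, the choice of kcal/mol units and the example values in the usage note are assumptions, since the abstract does not give the protocol's settings.

```python
import math

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872041


def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis probability of exchanging the configurations of two replicas.

    e_i, e_j: potential energies (kcal/mol) of the configurations currently
              held at temperatures t_i and t_j (kelvin).
    Returns min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)
```

For example, `swap_probability(-100.0, 300.0, -120.0, 320.0)` is 1.0, because the exchange moves the lower-energy configuration to the colder replica, while the reverse situation is accepted only with a probability below 1.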
Affiliation(s)
- Jiang Zhu
- Howard Hughes Medical Institute and Columbia University, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, USA
55
Cui M, Mezei M, Osman R. Prediction of protein loop structures using a local move Monte Carlo approach and a grid-based force field. Protein Eng Des Sel 2008; 21:729-35. [PMID: 18957407 PMCID: PMC2597363 DOI: 10.1093/protein/gzn056] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 05/14/2008] [Revised: 09/18/2008] [Accepted: 09/23/2008] [Indexed: 11/14/2022]
Abstract
We have developed an improved local move Monte Carlo (LMMC) loop sampling approach for loop predictions. The method generates loop conformations based on simple moves of the torsion angles of side chains and local moves of the backbone of loops. To reduce the computational costs for energy evaluations, we developed a grid-based force field to represent the protein environment and solvation effect. Simulated annealing has been used to enhance the efficiency of the LMMC loop sampling and identify low-energy loop conformations. The prediction quality is evaluated on a set of protein loops with known crystal structure that has been previously used by others to test different loop prediction methods. The results show that this approach can reproduce the experimental results with a root mean square deviation within 1.8 Å for all the test cases. The LMMC loop prediction approach developed here could be useful for improving the quality of the loop regions in homology models, and in flexible protein-ligand and protein-protein docking studies.
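To make the role of simulated annealing in such a sampling scheme concrete, here is a generic Metropolis Monte Carlo sketch over a list of torsion angles with a geometric cooling schedule. It is not the LMMC move set or the grid-based force field described above: `toy_energy` is a hypothetical stand-in energy, and all parameter names and default values are assumptions chosen for illustration.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in energy: minimal when every torsion is -60 degrees."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)


def anneal(angles, energy, t_start=2.0, t_end=0.01, steps=5000,
           max_move=30.0, seed=0):
    """Simulated annealing over torsion angles given in degrees.

    Each step perturbs one randomly chosen torsion and accepts or rejects
    the move with the Metropolis criterion at the current temperature.
    Returns the lowest-energy conformation seen and its energy.
    """
    rng = random.Random(seed)
    current = list(angles)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        trial = list(current)
        k = rng.randrange(len(trial))
        # Perturb one torsion and wrap it back into [-180, 180).
        moved = trial[k] + rng.uniform(-max_move, max_move)
        trial[k] = (moved + 180.0) % 360.0 - 180.0
        e_trial = energy(trial)
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / t):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best
```

In a real loop predictor the single-torsion perturbation would be replaced by moves that keep the loop closed, and `toy_energy` by a physical or grid-based energy.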
Affiliation(s)
- Meng Cui
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
- Department of Physiology and Biophysics, Virginia Commonwealth University, 1101 East Marshall Street, PO Box 980551, Richmond, VA 23298, USA
- Mihaly Mezei
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
- Roman Osman
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
56
Yao P, Dhanik A, Marz N, Propper R, Kou C, Liu G, van den Bedem H, Latombe JC, Halperin-Landsberg I, Altman RB. Efficient algorithms to explore conformation spaces of flexible protein loops. IEEE/ACM Trans Comput Biol Bioinform 2008; 5:534-45. [PMID: 18989041 PMCID: PMC2794838 DOI: 10.1109/tcbb.2008.96] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 05/27/2023]
Abstract
Several applications in biology - e.g., incorporation of protein flexibility in ligand docking algorithms, interpretation of fuzzy X-ray crystallographic data, and homology modeling - require computing the internal parameters of a flexible fragment (usually, a loop) of a protein in order to connect its termini to the rest of the protein without causing any steric clash. One must often sample many such conformations in order to explore and adequately represent the conformational range of the studied loop. While sampling must be fast, it is made difficult by the fact that two conflicting constraints - kinematic closure and clash avoidance - must be satisfied concurrently. This paper describes two efficient and complementary sampling algorithms to explore the space of closed clash-free conformations of a flexible protein loop. The "seed sampling" algorithm samples broadly from this space, while the "deformation sampling" algorithm uses seed conformations as starting points to explore the conformation space around them at a finer grain. Computational results are presented for various loops ranging from 5 to 25 residues. More specific results also show that the combination of the sampling algorithms with a functional site prediction software (FEATURE) makes it possible to compute and recognize calcium-binding loop conformations. The sampling algorithms are implemented in a toolkit (LoopTK), which is available at https://simtk.org/home/looptk.
Affiliation(s)
- Peggy Yao
- The Computer Science and Biomedical Informatics Departments, Stanford University, S240 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Ankur Dhanik
- The Computer Science and Mechanical Engineering Departments, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Nathan Marz
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Ryan Propper
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Charles Kou
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Guanfeng Liu
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Henry van den Bedem
- The Stanford Linear Accelerator Center, SSRL/Joint Center for Structural Genomics, MS 69, 2575 Sand Hill Road, Menlo Park, CA 94025.
- Jean-Claude Latombe
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Inbal Halperin-Landsberg
- The Department of Genetics, Stanford University, S240 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Russ Biagio Altman
- The Department of Bioengineering, Stanford University, 318 Campus Drive S172, Stanford, CA 94305-5444.
57
Eyrisch S, Helms V. What induces pocket openings on protein surface patches involved in protein-protein interactions? J Comput Aided Mol Des 2008; 23:73-86. [PMID: 18777159 DOI: 10.1007/s10822-008-9239-y] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 03/31/2008] [Accepted: 08/12/2008] [Indexed: 12/25/2022]
Abstract
We previously showed for the proteins BCL-X(L), IL-2, and MDM2 that transient pockets at their protein-protein binding interfaces can be identified by applying the PASS algorithm to molecular dynamics (MD) snapshots. We now investigated which aspects of the natural conformational dynamics of proteins induce the formation of such pockets. The pocket detection protocol was applied to three different conformational ensembles for the same proteins that were extracted from MD simulations of the inhibitor bound crystal conformation in water and the free crystal/NMR structure in water and in methanol. Additional MD simulations studied the impact of backbone mobility. The more efficient CONCOORD or normal mode analysis (NMA) techniques gave significantly smaller pockets than MD simulations, whereas tCONCOORD generated pockets comparable to those observed in MD simulations for two of the three systems. Our findings emphasize the influence of solvent polarity and backbone rearrangements on the formation of pockets on protein surfaces and should be helpful in future generation of transient pockets as putative ligand binding sites at protein-protein interfaces.
Affiliation(s)
- Susanne Eyrisch
- Center for Bioinformatics, Building C7 1, P.O. Box 151150, D-66041 Saarbruecken, Germany
58
Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci 2008; Chapter 2:Unit 2.9. [PMID: 18429317 DOI: 10.1002/0471140864.ps0209s50] [Citation(s) in RCA: 757] [Impact Index Per Article: 47.3] [Indexed: 12/25/2022]
Abstract
Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Narayanan Eswar
- University of California at San Francisco, San Francisco, California, USA
59
Barzon L, Masi G, Boschin IM, Lavezzo E, Pacenti M, Casal Ide E, Toniato A, Toppo S, Palù G, Pelizzo MR. Characterization of a novel complex BRAF mutation in a follicular variant papillary thyroid carcinoma. Eur J Endocrinol 2008; 159:77-80. [PMID: 18426810 DOI: 10.1530/eje-08-0239] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
INTRODUCTION Activating mutations of the BRAF oncogene are frequently detected in papillary thyroid carcinoma (PTC) and have been associated with a worse prognosis. The amino acid substitution V600E accounts for 90% of all oncogenic BRAF mutations and is typically detected in classic PTCs, whereas other less frequent BRAF mutations seem to be associated with other PTC histotypes. CASE Screening for activating BRAF mutations in a series of 83 PTCs identified the most common V600E mutation in 39 cases (histologically, 38 classic PTCs and 1 sclerosing variant PTC) and a complex in-frame mutation involving amino acids V600-S605 in a stage III multicentric follicular variant PTC, occurring in a 50-year-old female patient, who was affected by hypothyroidism in autoimmune thyroiditis and had a family history of PTC and autoimmune thyroiditis. Since the identified BRAF mutation was novel in the literature, bioinformatic modeling was performed to predict its impact on BRAF activity. Although the mutation resulted in loss of a phosphorylation site in the activation loop of BRAF, it was predicted to increase BRAF kinase activity by mimicking an activating phosphorylation. CONCLUSIONS This study, which reports a new BRAF mutation, highlights the usefulness of bioinformatic modeling in the prediction of functional effects of new mutations and indicates that mutation-specific screening tests might miss some rare BRAF mutations. These facts should be taken into consideration in the molecular diagnosis of thyroid cancer and in the design of therapeutic protocols based on inhibitors of the BRAF pathway.
Affiliation(s)
- Luisa Barzon
- Department of Histology, Microbiology, and Medical Biotechnologies, University of Padova, I-35121 Padova, Italy
60
Olson MA, Feig M, Brooks CL. Prediction of protein loop conformations using multiscale modeling methods with physical energy scoring functions. J Comput Chem 2008; 29:820-31. [PMID: 17876760 DOI: 10.1002/jcc.20827] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]
Abstract
This article examines ab initio methods for the prediction of protein loops by a computational strategy of multiscale conformational sampling and physical energy scoring functions. Our approach consists of initial sampling of loop conformations from lattice-based low-resolution models followed by refinement using all-atom simulations. To allow enhanced conformational sampling, the replica exchange method was implemented. Physical energy functions based on CHARMM19 and CHARMM22 parameterizations with generalized Born (GB) solvent models were applied in scoring loop conformations extracted from the lattice simulations and, in the case of all-atom simulations, the ensemble of conformations was generated and scored with these models. Predictions are reported for 25 loop segments, each eight residues long and taken from a diverse set of 22 protein structures. We find that the simulations generally sampled conformations with low global root-mean-square deviation (RMSD) for loop backbone coordinates from the known structures, whereas clustering conformations in RMSD space and scoring detected less favorable loop structures. Specifically, the lattice simulations sampled basins that exhibited an average global RMSD of 2.21 ± 1.42 Å, whereas clustering and scoring the loop conformations determined an RMSD of 3.72 ± 1.91 Å. Using CHARMM19/GB to refine the lattice conformations improved the sampling RMSD to 1.57 ± 0.98 Å and detection to 2.58 ± 1.48 Å. We found that further improvement could be gained from extending the upper temperature in the all-atom refinement from 400 to 800 K, where the results typically yield a reduction of approximately 1 Å or greater in the RMSD of the detected loop. Overall, CHARMM19 with a simple pairwise GB solvent model is more efficient at sampling low-RMSD loop basins than CHARMM22 with a higher-resolution modified analytical GB model; however, the latter simulation method provides a more accurate description of the all-atom energy surface, yet demands a much greater computational cost.
Affiliation(s)
- Mark A Olson
- Department of Cell Biology and Biochemistry, U.S. Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702, USA.
61
Ytreberg FM, Zuckerman DM. A black-box re-weighting analysis can correct flawed simulation data. Proc Natl Acad Sci U S A 2008; 105:7982-7. [PMID: 18544653 PMCID: PMC2786942 DOI: 10.1073/pnas.0706063105] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 06/27/2007] [Indexed: 11/18/2022]
Abstract
There is a great need for improved statistical sampling in a range of physical, chemical, and biological systems. Even simulations based on correct algorithms suffer from statistical error, which can be substantial or even dominant when slow processes are involved. Further, in key biomolecular applications, such as the determination of protein structures from NMR data, non-Boltzmann-distributed ensembles are generated. We therefore have developed the "black-box" strategy for re-weighting a set of configurations generated by arbitrary means to produce an ensemble distributed according to any target distribution. In contrast to previous algorithmic efforts, the black-box approach exploits the configuration-space density observed in a simulation, rather than assuming a desired distribution has been generated. Successful implementations of the strategy, which reduce both statistical error and bias, are developed for a one-dimensional system, and a 50-atom peptide, for which the correct 250-to-1 population ratio is recovered from a heavily biased ensemble.
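The re-weighting idea in this abstract can be illustrated in one dimension: each sampled configuration receives a weight proportional to the target density divided by the density actually observed in the sample. The sketch below estimates the observed density with a simple histogram, which is an assumption made here for illustration. The abstract does not describe the authors' density estimator, and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def reweight(samples, target_density, bin_width):
    """Return normalized weights for 1-D samples.

    weight_i is proportional to target_density(x_i) / observed_density(x_i),
    where the observed density is a histogram estimate with the given
    bin width (an illustrative choice, not the published estimator).
    """
    bins = [math.floor(x / bin_width) for x in samples]
    counts = Counter(bins)
    n = len(samples)
    raw = []
    for x, b in zip(samples, bins):
        observed = counts[b] / (n * bin_width)
        raw.append(target_density(x) / observed)
    total = sum(raw)
    return [w / total for w in raw]
```

As a usage example, take a heavily biased two-state sample with 10 points at x = 0.5 and 90 points at x = 1.5, and a target density that is uniform on [0, 2). With `bin_width=1.0`, the weights returned by `reweight` restore equal populations for the two states.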
Affiliation(s)
- F. Marty Ytreberg
- Department of Physics, University of Idaho, Moscow, ID 83844-0903
- Daniel M. Zuckerman
- Department of Computational Biology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260
62
Ngan SC, Hung LH, Liu T, Samudrala R. Scoring functions for de novo protein structure prediction revisited. Methods Mol Biol 2008; 413:243-81. [PMID: 18075169 DOI: 10.1007/978-1-59745-574-9_10] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 04/08/2023]
Abstract
De novo protein structure prediction methods attempt to predict tertiary structures from sequences based on general principles that govern protein folding energetics and/or statistical tendencies of conformational features that native structures acquire, without the use of explicit templates. A general paradigm for de novo prediction involves sampling the conformational space, guided by scoring functions and other sequence-dependent biases, such that a large set of candidate ("decoy") structures are generated, and then selecting native-like conformations from those decoys using scoring functions as well as conformer clustering. High-resolution refinement is sometimes used as a final step to fine-tune native-like structures. There are two major classes of scoring functions. Physics-based functions are based on mathematical models describing aspects of the known physics of molecular interaction. Knowledge-based functions are formed with statistical models capturing aspects of the properties of native protein conformations. We discuss the implementation and use of some of the scoring functions from these two classes for de novo structure prediction in this chapter.
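A knowledge-based scoring function of the kind described in this abstract is often built by inverse Boltzmann statistics: feature frequencies observed in native structures are compared with a reference distribution and converted to pseudo-energies, E = -kT ln(P_obs / P_ref). The sketch below illustrates that general construction only. The binning of features, the pseudocount, the kT value (about 0.593 kcal/mol near room temperature) and the function names are assumptions, not details taken from the chapter.

```python
import math


def statistical_potential(observed, reference, kt=0.593, pseudocount=1.0):
    """Convert feature counts into pseudo-energies per bin.

    observed:  dict mapping a feature bin (e.g. a distance bin) to its count
               in native structures.
    reference: dict mapping the same bins to counts in a reference state.
    Returns a dict of energies, E = -kt * ln(P_obs / P_ref).
    """
    bins = sorted(set(observed) | set(reference))
    n_obs = sum(observed.get(b, 0) + pseudocount for b in bins)
    n_ref = sum(reference.get(b, 0) + pseudocount for b in bins)
    energies = {}
    for b in bins:
        p_obs = (observed.get(b, 0) + pseudocount) / n_obs
        p_ref = (reference.get(b, 0) + pseudocount) / n_ref
        energies[b] = -kt * math.log(p_obs / p_ref)
    return energies


def score(decoy_bins, energies):
    """Score a decoy as the sum of pseudo-energies of its feature bins."""
    return sum(energies[b] for b in decoy_bins)
```

Bins that are over-represented in native structures relative to the reference receive negative (favorable) energies, so decoys with lower total scores are ranked as more native-like.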
Affiliation(s)
- Shing-Chung Ngan
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA, USA
63
Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics 2008; Chapter 5:Unit-5.6. [PMID: 18428767 DOI: 10.1002/0471250953.bi0506s15] [Citation(s) in RCA: 1775] [Impact Index Per Article: 110.9] [Indexed: 12/28/2022]
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Narayanan Eswar
- University of California at San Francisco, San Francisco, California
- Ben Webb
- University of California at San Francisco, San Francisco, California
- M S Madhusudhan
- University of California at San Francisco, San Francisco, California
- David Eramian
- University of California at San Francisco, San Francisco, California
- Min-Yi Shen
- University of California at San Francisco, San Francisco, California
- Ursula Pieper
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
64
Felts AK, Gallicchio E, Chekmarev D, Paris KA, Friesner RA, Levy RM. Prediction of Protein Loop Conformations using the AGBNP Implicit Solvent Model and Torsion Angle Sampling. J Chem Theory Comput 2008; 4:855-868. [PMID: 18787648 DOI: 10.1021/ct800051k] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
The OPLS-AA all-atom force field and the Analytical Generalized Born plus Non-Polar (AGBNP) implicit solvent model, in conjunction with torsion angle conformational search protocols based on the Protein Local Optimization Program (PLOP), are shown to be effective in predicting the native conformations of 57 9-residue and 35 13-residue loops of a diverse series of proteins with low sequence identity. The novel nonpolar solvation free energy estimator implemented in AGBNP, augmented by correction terms aimed at reducing the occurrence of ion pairing, is important to achieve the best prediction accuracy. Extended versions of the previously developed PLOP-based conformational search schemes based on calculations in the crystal environment are reported that are suitable for application to loop homology modeling without the crystal environment. Our results suggest that in general the loop backbone conformation is not strongly influenced by crystal packing. The application of the temperature Replica Exchange Molecular Dynamics (T-REMD) sampling method to a few examples where PLOP sampling is insufficient is also reported. The results reported indicate that the OPLS-AA/AGBNP effective potential is suitable for high-resolution modeling of proteins in the final stages of homology modeling and/or protein crystallographic refinement.
Affiliation(s)
- Anthony K Felts
- Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey 08854
65
Abstract
We describe a fast and accurate protocol, LoopBuilder, for the prediction of loop conformations in proteins. The procedure includes extensive sampling of backbone conformations, side chain addition, the use of a statistical potential to select a subset of these conformations, and, finally, an energy minimization and ranking with an all-atom force field. We find that the Direct Tweak algorithm used in the previously developed LOOPY program is successful in generating an ensemble of conformations that on average are closer to the native conformation than those generated by other methods. An important feature of Direct Tweak is that it checks for interactions between the loop and the rest of the protein during the loop closure process. DFIRE is found to be a particularly effective statistical potential that can bias conformation space toward conformations that are close to the native structure. Its application as a filter prior to a full molecular mechanics energy minimization both improves prediction accuracy and offers a significant savings in computer time. Final scoring is based on the OPLS/SBG-NP force field implemented in the PLOP program. The approach is also shown to be quite successful in predicting loop conformations for cases where the native side chain conformations are assumed to be unknown, suggesting that it will prove effective in real homology modeling applications. Proteins 2008. © 2007 Wiley-Liss, Inc.
Affiliation(s)
- Cinque S Soto
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
66
Knight JL, Zhou Z, Gallicchio E, Himmel DM, Friesner RA, Arnold E, Levy RM. Exploring structural variability in X-ray crystallographic models using protein local optimization by torsion-angle sampling. Acta Crystallogr D Biol Crystallogr 2008; 64:383-96. [PMID: 18391405 PMCID: PMC2631124 DOI: 10.1107/s090744490800070x] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 11/05/2007] [Accepted: 01/08/2008] [Indexed: 11/10/2022]
Abstract
Modeling structural variability is critical for understanding protein function and for modeling reliable targets for in silico docking experiments. Because of the time-intensive nature of manual X-ray crystallographic refinement, automated refinement methods that thoroughly explore conformational space are essential for the systematic construction of structurally variable models. Using five proteins spanning resolutions of 1.0-2.8 Å, it is demonstrated how torsion-angle sampling of backbone and side-chain libraries with filtering against both the chemical energy, using a modern effective potential, and the electron density, coupled with minimization of a reciprocal-space X-ray target function, can generate multiple structurally variable models which fit the X-ray data well. Torsion-angle sampling as implemented in the Protein Local Optimization Program (PLOP) has been used in this work. Models with the lowest R(free) values are obtained when electrostatic and implicit solvation terms are included in the effective potential. HIV-1 protease, calmodulin and SUMO-conjugating enzyme illustrate how variability in the ensemble of structures captures structural variability that is observed across multiple crystal structures and is linked to functional flexibility at hinge regions and binding interfaces. An ensemble-refinement procedure is proposed to differentiate between variability that is a consequence of physical conformational heterogeneity and that which reflects uncertainty in the atomic coordinates.
Affiliation(s)
- Emilio Gallicchio
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Daniel M. Himmel
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Eddy Arnold
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Ronald M. Levy
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
67
|
Lin MS, Head-Gordon T. Improved Energy Selection of Nativelike Protein Loops from Loop Decoys. J Chem Theory Comput 2008; 4:515-21. [DOI: 10.1021/ct700292u] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Affiliation(s)
- Matthew S. Lin
- UCSF/UCB Joint Graduate Group in Bioengineering, Berkeley, California 94720, and Department of Bioengineering, University of California, Berkeley, California 94720
- Teresa Head-Gordon
- UCSF/UCB Joint Graduate Group in Bioengineering, Berkeley, California 94720, and Department of Bioengineering, University of California, Berkeley, California 94720
|
68
|
Furnham N, de Bakker PI, Gore S, Burke DF, Blundell TL. Comparative modelling by restraint-based conformational sampling. BMC Struct Biol 2008; 8:7. [PMID: 18237407 PMCID: PMC2275734 DOI: 10.1186/1472-6807-8-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 10/10/2007] [Accepted: 01/31/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Although comparative modelling is routinely used to produce three-dimensional models of proteins, very few automated approaches are formulated in a way that allows inclusion of restraints derived from experimental data as well as those from the structures of homologues. Furthermore, proteins are usually described as a single conformer, rather than an ensemble that represents the heterogeneity and inaccuracy of experimentally determined protein structures. Here we address these issues by exploring the application of the restraint-based conformational space search engine, RAPPER, which has previously been developed for rebuilding experimentally defined protein structures and for fitting models to electron density derived from X-ray diffraction analyses. RESULTS A new application of RAPPER for comparative modelling uses positional restraints and knowledge-based sampling to generate models with accuracies comparable to other leading modelling tools. Knowledge-based predictions are based on geometrical features of the homologous templates and rules concerning main-chain and side-chain conformations. By directly changing the restraints derived from available templates we estimate the accuracy limits of the method in comparative modelling. CONCLUSION The application of RAPPER to comparative modelling provides an effective means of exploring the conformational space available to a target sequence. Enhanced methods for generating positional restraints can greatly improve structure prediction. Generation of an ensemble of solutions that are consistent with both target sequence and knowledge derived from the template structures provides a more appropriate representation of a structural prediction than a single model. By formulating homologous structural information as sets of restraints we can begin to consider how comparative models might be used to inform conformer generation from sparse experimental data.
Affiliation(s)
- Nicholas Furnham
- Department of Biochemistry, Sanger Building, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK.
|
69
|
Zhu K, Shirts MR, Friesner RA. Improved Methods for Side Chain and Loop Predictions via the Protein Local Optimization Program: Variable Dielectric Model for Implicitly Improving the Treatment of Polarization Effects. J Chem Theory Comput 2007; 3:2108-19. [DOI: 10.1021/ct700166f] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.3] [Indexed: 11/28/2022]
Affiliation(s)
- Kai Zhu
- Department of Chemistry, Columbia University, New York, New York 10027
- Michael R. Shirts
- Department of Chemistry, Columbia University, New York, New York 10027
|
70
|
Rykunov D, Fiser A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins 2007; 67:559-68. [PMID: 17335003 DOI: 10.1002/prot.21279] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 11/06/2022]
Abstract
Statistical distance-dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.
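As an illustration of the kind of potential this abstract discusses, the following is a minimal sketch of the generic inverse-Boltzmann form, E(r) = -kT ln(P_obs(r) / P_ref(r)), computed per distance bin. The function name, the counts, and the flat reference distribution are assumptions made for the example; this is not the reference state proposed in the paper.

```python
import math

def pair_potential(observed, reference, kT=1.0, pseudocount=0.0):
    """Return E(r) = -kT * ln(P_obs(r) / P_ref(r)) for each distance bin.

    observed/reference are raw counts per bin (hypothetical inputs).
    Bins with zero probability yield NaN, which is one way sparse
    statistics show up in practice.
    """
    total_obs = sum(observed) + pseudocount * len(observed)
    total_ref = sum(reference) + pseudocount * len(reference)
    energies = []
    for n_obs, n_ref in zip(observed, reference):
        p_obs = (n_obs + pseudocount) / total_obs
        p_ref = (n_ref + pseudocount) / total_ref
        if p_obs == 0.0 or p_ref == 0.0:
            energies.append(float("nan"))  # undefined for empty bins
        else:
            energies.append(-kT * math.log(p_obs / p_ref))
    return energies

# Hypothetical counts for one residue-pair type across four distance bins.
observed = [5, 40, 30, 25]
reference = [25, 25, 25, 25]  # flat reference, chosen only for illustration
print(pair_potential(observed, reference))
```

When the observed and reference distributions coincide, every bin evaluates to zero, which is the expectation for randomized models that the abstract uses as its baseline.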
Affiliation(s)
- Dmitry Rykunov
- Department of Biochemistry, Seaver Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
|
71
|
Eyrisch S, Helms V. Transient pockets on protein surfaces involved in protein-protein interaction. J Med Chem 2007; 50:3457-64. [PMID: 17602601 DOI: 10.1021/jm070095g] [Citation(s) in RCA: 173] [Impact Index Per Article: 10.2] [Indexed: 11/28/2022]
Abstract
A new pocket detection protocol successfully identified transient pockets on the protein surfaces of BCL-XL, IL-2, and MDM2. Because the native inhibitor binding pocket was absent or only partly detectable in the unbound proteins, these crystal structures were used as starting points for 10 ns long molecular dynamics simulations. Trajectory snapshots were scanned for cavities on the protein surface using the program PASS. The detected cavities were clustered to determine several distinct transient pockets. They all opened within 2.5 ps, and most of them appeared multiple times. All three systems gave similar results overall. At the native binding site, pockets of similar size compared with a known inhibitor bound could be observed for all three systems. AutoDock could successfully place inhibitor molecules into these transient pockets with less than 2 Å rms deviation from their crystal structures, suggesting this protocol as a viable tool to identify transient ligand binding pockets on protein surfaces.
Affiliation(s)
- Susanne Eyrisch
- Center for Bioinformatics, Building C7 1, P.O. Box 151150, D-66041 Saarbruecken, Germany
|
72
|
Punta M, Forrest LR, Bigelow H, Kernytsky A, Liu J, Rost B. Membrane protein prediction methods. Methods 2007; 41:460-74. [PMID: 17367718 PMCID: PMC1934899 DOI: 10.1016/j.ymeth.2006.07.026] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Received: 03/22/2006] [Accepted: 07/05/2006] [Indexed: 10/23/2022]
Abstract
We survey computational approaches that tackle membrane protein structure and function prediction. While describing the main ideas that have led to the development of the most relevant and novel methods, we also discuss pitfalls, provide practical hints and highlight the challenges that remain. The methods covered include: sequence alignment, motif search, functional residue identification, transmembrane segment and protein topology predictions, homology and ab initio modeling. In general, predictions of functional and structural features of membrane proteins are improving, although progress is hampered by the limited amount of high-resolution experimental information available. While predictions of transmembrane segments and protein topology rank among the most accurate methods in computational biology, more attention and effort will be required in the future to ameliorate database search, homology and ab initio modeling.
Affiliation(s)
- Marco Punta
- Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Ave., New York, NY 10032, USA
|
73
|
Gore SP, Karmali AM, Blundell TL. Rappertk: a versatile engine for discrete restraint-based conformational sampling of macromolecules. BMC Struct Biol 2007; 7:13. [PMID: 17376228 PMCID: PMC1847436 DOI: 10.1186/1472-6807-7-13] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/18/2006] [Accepted: 03/21/2007] [Indexed: 11/18/2022]
Abstract
Background Macromolecular structures are modeled by conformational optimization within experimental and knowledge-based restraints. Discrete restraint-based sampling generates high-quality structures within these restraints and facilitates further refinement in a continuous all-atom energy landscape. This approach has been used successfully for protein loop modeling, comparative modeling and electron density fitting in X-ray crystallography. Results Here we present a software toolkit (Rappertk) which generalizes discrete restraint-based sampling for use in structural biology. Modular design and multi-layered architecture enables Rappertk to sample conformations of any macromolecule at many levels of detail and within a variety of experimental restraints. Performance against a Cα-tracing benchmark shows that the efficiency has not suffered despite the overhead required by this flexibility. We demonstrate the toolkit's capabilities by building high-quality β-sheets and by introducing restraint-driven sampling. RNA sampling is demonstrated by rebuilding a protein-RNA interface. Ability to construct arbitrary ligands is used in sampling protein-ligand interfaces within electron density. Finally, secondary structure and shape information derived from EM are combined to generate multiple conformations of a protein consistent with the observed density. Conclusion Through its modular design and ease of use, Rappertk enables exploration of a wide variety of interesting avenues in structural biology. This toolkit, with illustrative examples, is freely available to academic users from .
Affiliation(s)
- Swanand P Gore
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA UK
- Anjum M Karmali
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA UK
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA UK
|
74
|
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state of the art by a number of specific examples.
|
75
|
Zhu J, Xie L, Honig B. Structural refinement of protein segments containing secondary structure elements: Local sampling, knowledge-based potentials, and clustering. Proteins 2006; 65:463-79. [PMID: 16927337 DOI: 10.1002/prot.21085] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]
Abstract
In this article, we present an iterative, modular optimization (IMO) protocol for the local structure refinement of protein segments containing secondary structure elements (SSEs). The protocol is based on three modules: a torsion-space local sampling algorithm, a knowledge-based potential, and a conformational clustering algorithm. Alternative methods are tested for each module in the protocol. For each segment, random initial conformations were constructed by perturbing the native dihedral angles of loops (and SSEs) of the segment to be refined while keeping the protein body fixed. Two refinement procedures based on molecular mechanics force fields - using either energy minimization or molecular dynamics - were also tested but were found to be less successful than the IMO protocol. We found that DFIRE is a particularly effective knowledge-based potential and that clustering algorithms that are biased by the DFIRE energies improve the overall results. Results were further improved by adding an energy minimization step to the conformations generated with the IMO procedure, suggesting that hybrid strategies that combine both knowledge-based and physical effective energy functions may prove to be particularly effective in future applications.
Affiliation(s)
- Jiang Zhu
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, Room 815, New York, New York 10032, USA
|
76
|
Mehler EL, Hassan SA, Kortagere S, Weinstein H. Ab initio computational modeling of loops in G-protein-coupled receptors: lessons from the crystal structure of rhodopsin. Proteins 2006; 64:673-90. [PMID: 16729264 DOI: 10.1002/prot.21022] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Abstract
With the help of the crystal structure of rhodopsin an ab initio method has been developed to calculate the three-dimensional structure of the loops that connect the transmembrane helices (TMHs). The goal of this procedure is to calculate the loop structures in other G-protein coupled receptors (GPCRs) for which only model coordinates of the TMHs are available. To mimic this situation a construct of rhodopsin was used that only includes the experimental coordinates of the TMHs while the rest of the structure, including the terminal domains, has been removed. To calculate the structure of the loops a method was designed based on Monte Carlo (MC) simulations which use a temperature annealing protocol, and a scaled collective variables (SCV) technique with proper structural constraints. Because only part of the protein is used in the calculations the usual approach of modeling loops, which consists of finding a single, lowest energy conformation of the system, is abandoned because such a single structure may not be a representative member of the native ensemble. Instead, the method was designed to generate structural ensembles from which the single lowest free energy ensemble is identified as representative of the native folding of the loop. To find the native ensemble a successive series of SCV-MC simulations are carried out to allow the loops to undergo structural changes in a controlled manner. To increase the chances of finding the native funnel for the loop, some of the SCV-MC simulations are carried out at elevated temperatures. The native ensemble can be identified by an MC search starting from any conformation already in the native funnel. The hypothesis is that native structures are trapped in the conformational space because of the high-energy barriers that surround the native funnel. 
The existence of such ensembles is demonstrated by generating multiple copies of the loops from their crystal structures in rhodopsin and carrying out an extended SCV-MC search. For the extracellular loops e1 and e3, and the intracellular loop i1 that were used in this work, the procedure resulted in dense clusters of structures with Cα-RMSD of approximately 0.5 Å. To test the predictive power of the method the crystal structure of each loop was replaced by its extended conformations. For e1 and i1 the procedure identifies native clusters with Cα-RMSD of approximately 0.5 Å and good structural overlap of the side chains; for e3, two clusters were found with Cα-RMSD of approximately 1.1 Å each, but with poor overlap of the side chains. Further searching led to a single cluster with lower Cα-RMSD but higher energy than the two previous clusters. This discrepancy was found to be due to the missing elements in the constructs available from experiment for use in the calculations. Because this problem will likely appear whenever parts of the structural information are missing, possible solutions are discussed.
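The Monte Carlo search with a temperature annealing protocol mentioned in this abstract can be illustrated with a generic Metropolis sketch. Everything below is an assumption made for illustration: the one-dimensional toy energy, the temperature schedule, and the function names are hypothetical, and the scaled collective variables (SCV) technique itself is not implemented.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/T).

    temperature must be positive (in the same units as delta_e).
    """
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(energy, x0, temperatures, steps_per_temp=200, step_size=0.5, seed=0):
    """Run Metropolis sampling at each temperature in turn; track the best state."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for t in temperatures:
        for _ in range(steps_per_temp):
            x_new = x + rng.uniform(-step_size, step_size)
            e_new = energy(x_new)
            if metropolis_accept(e_new - e, t, rng):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
    return best_x, best_e

def toy_energy(x):
    """Hypothetical double-well landscape standing in for a loop energy."""
    return (x * x - 1.0) ** 2 + 0.3 * x

# Start in the shallower well and cool from a high to a low temperature.
print(anneal(toy_energy, 1.0, [2.0, 1.0, 0.5, 0.1]))
```

The elevated starting temperature plays the role the abstract describes: it lets the walker cross barriers early in the search, before cooling restricts it to one basin.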
Affiliation(s)
- Ernest L Mehler
- Department of Physiology and Biophysics, Weill Medical College of Cornell University, New York, New York 10021, USA.
|
77
|
Hoskins J, Lovell S, Blundell TL. An algorithm for predicting protein-protein interaction sites: Abnormally exposed amino acid residues and secondary structure elements. Protein Sci 2006; 15:1017-29. [PMID: 16641487 PMCID: PMC2242518 DOI: 10.1110/ps.051589106] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Indexed: 10/24/2022]
Abstract
Multiprotein systems mediate most regulatory processes in living organisms. Although the structures of the individual proteins are often defined, less is known of the structures of multiprotein systems. Computational methods for predicting interfaces, using evolutionary conservation and/or physicochemical data, have been developed. Here we consider the use of solvent accessibility, residue propensity, and hydrophobicity, in conjunction with secondary structure data, as prediction parameters. We analyze the influence of residue type and secondary structure on solvent accessibility and define a measure of "relative exposedness." Clustering abnormally high scoring residues provides a basis for predicting interaction sites. The analysis is extended to investigate abnormally exposed secondary structure elements, particularly beta-sheet strands. We show that surface-exposed beta-strands lacking protective features are more likely to be found at protein-protein interfaces, allowing us to create an algorithm with approximately 68% and approximately 75% accuracy in differentiating between interacting and edge strands in isolated beta-strands and beta-sheet strands, respectively. These methods of identifying abnormally exposed surface regions are combined in an algorithm, which, on a data set of 77 unbound and disjoint (single chain extracted from complex) structures, predicts 79% of the protein-protein interfaces correctly. If enzyme-inhibitor complexes, where the inhibitor mimics a nonprotein substrate, are excluded, the accuracy increases to 85%.
Affiliation(s)
- Jemima Hoskins
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, United Kingdom.
|
78
|
Kortagere S, Roy A, Mehler EL. Ab initio computational modeling of long loops in G-protein coupled receptors. J Comput Aided Mol Des 2006; 20:427-36. [PMID: 16972169 DOI: 10.1007/s10822-006-9056-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 06/08/2006] [Accepted: 07/11/2006] [Indexed: 12/27/2022]
Abstract
A newly developed approach for predicting the structure of segments that connect known elements of secondary structure in proteins has been applied to some of the longer loops in the G-protein coupled receptors (GPCRs) rhodopsin and the dopamine receptor D2R. The algorithm uses Monte Carlo (MC) simulation in a temperature annealing protocol combined with a scaled collective variables (SCV) technique to search conformation space for loop structures that could belong to the native ensemble. Except for rhodopsin, structural information is only available for the transmembrane helices (TMHs), and therefore the usual approach of finding a single conformation of lowest energy has to be abandoned. Instead the MC search aims to find the ensemble located at the absolute minimum free energy, i.e., the native ensemble. It is assumed that structures in the native ensemble can be found by an MC search starting from any conformation in the native funnel. The hypothesis is that native structures are trapped in this part of conformational space because of the high-energy barriers that surround the native funnel. In this work it is shown that the crystal structure of the second extracellular loop (e2) of rhodopsin is a member of this loop's native ensemble. In contrast, the crystal structure of the third intracellular loop is quite different in the different crystal structures that have been reported. Our calculations indicate that, of the three crystal structures examined, two show features characteristic of native ensembles while the other one does not. Finally the protocol is used to calculate the structure of the e2 loop in D2R. Here, the crystal structure is not known, but it is shown that several side chains that are involved in interaction with a class of substituted benzamides assume conformations that point into the active site. Thus, they are poised to interact with the incoming ligand.
Affiliation(s)
- Sandhya Kortagere
- Department of Physiology and Biophysics, Weill-Cornell Medical College, 1300 York Avenue, New York, NY 10021, USA
|
79
|
Chivian D, Baker D. Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection. Nucleic Acids Res 2006; 34:e112. [PMID: 16971460 PMCID: PMC1635247 DOI: 10.1093/nar/gkl480] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.9] [Indexed: 11/12/2022]
Abstract
The accuracy of a homology model based on the structure of a distant relative or other topologically equivalent protein is primarily limited by the quality of the alignment. Here we describe a systematic approach for sequence-to-structure alignment, called ‘K*Sync’, in which alignments are generated by dynamic programming using a scoring function that combines information on many protein features, including a novel measure of how obligate a sequence region is to the protein fold. By systematically varying the weights on the different features that contribute to the alignment score, we generate very large ensembles of diverse alignments, each optimal under a particular constellation of weights. We investigate a variety of approaches to select the best models from the ensemble, including consensus of the alignments, a hydrophobic burial measure, low- and high-resolution energy functions, and combinations of these evaluation methods. The effect on model quality and selection resulting from loop modeling and backbone optimization is also studied. The performance of the method on a benchmark set is reported and shows the approach to be effective at both generating and selecting accurate alignments. The method serves as the foundation of the homology modeling module in the Robetta server.
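The idea of scoring alignments by dynamic programming with a weighted sum of feature terms, and then varying the weights, can be sketched as follows. This is a toy Needleman-Wunsch scorer under stated assumptions: the two features (`identity`, `same_polarity`), the hydrophobic residue set, the linear gap penalty, and all function names are hypothetical and are not the features used by K*Sync. Only the optimal score is returned; a traceback would be needed to recover the alignment itself.

```python
HYDROPHOBIC = set("AVILMFWC")  # hypothetical residue classification

def identity(a, i, b, j):
    """Feature 1: reward identical residues, penalize mismatches."""
    return 1.0 if a[i] == b[j] else -1.0

def same_polarity(a, i, b, j):
    """Feature 2: reward pairs that fall in the same hydrophobicity class."""
    return 1.0 if (a[i] in HYDROPHOBIC) == (b[j] in HYDROPHOBIC) else 0.0

def align_score(seq_a, seq_b, features, weights, gap=-1.0):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty)."""
    def match(i, j):
        return sum(w * f(seq_a, i, seq_b, j) for w, f in zip(weights, features))

    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + match(i - 1, j - 1),  # align residue pair
                dp[i - 1][j] + gap,                      # gap in seq_b
                dp[i][j - 1] + gap,                      # gap in seq_a
            )
    return dp[n][m]

def scan_weights(seq_a, seq_b, weight_sets):
    """Score the same pair under several weight tuples (one entry per tuple)."""
    features = [identity, same_polarity]
    return {w: align_score(seq_a, seq_b, features, w) for w in weight_sets}

print(scan_weights("ACD", "AD", [(1.0, 0.0), (1.0, 0.5), (0.5, 1.0)]))
```

Each weight tuple defines its own optimum, which is the sense in which a sweep over weights yields an ensemble rather than a single answer.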
Affiliation(s)
- Dylan Chivian
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- To whom correspondence should be addressed at Department of Biochemistry and HHMI, University of Washington, Box 357350, Seattle, WA 98195, USA. Tel: +1 206 543 1295; Fax: +1 206 685 1792
|
80
|
Weissman KJ, Hong H, Popovic B, Meersman F. Evidence for a protein-protein interaction motif on an acyl carrier protein domain from a modular polyketide synthase. Chem Biol 2006; 13:625-36. [PMID: 16793520 DOI: 10.1016/j.chembiol.2006.04.010] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 12/13/2005] [Revised: 04/24/2006] [Accepted: 04/25/2006] [Indexed: 11/19/2022]
Abstract
During biosynthesis on modular polyketide synthases (PKSs), chain extension intermediates are tethered to acyl carrier protein (ACP) domains through phosphopantetheinyl prosthetic groups. Each ACP must therefore interact with every other domain within the module, and also with a downstream acceptor domain. The nature of these interactions is key to our understanding of the topology and operation of these multienzymes. Sequence analysis and homology modeling implicate a potential helical region (helix II) on the ACPs as a protein-protein interaction motif. Using site-directed mutagenesis, we show that residues along this putative helix lie at the interface between the ACP and the phosphopantetheinyl transferase that catalyzes its activation. Our results accord with previous studies of discrete ACP proteins from fatty acid and aromatic polyketide biosynthesis, suggesting that helix II may also serve as a universal interaction motif in modular PKSs.
Affiliation(s)
- Kira J Weissman
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, United Kingdom.
|
81
|
Zhu K, Pincus DL, Zhao S, Friesner RA. Long loop prediction using the protein local optimization program. Proteins 2006; 65:438-52. [PMID: 16927380 DOI: 10.1002/prot.21040] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.8] [Indexed: 12/25/2022]
Abstract
We have developed an improved sampling algorithm and energy model for protein loop prediction, the combination of which has yielded the first methodology capable of achieving good results for the prediction of loop backbone conformations of 11 residue length or greater. Applied to our newly constructed test suite of 104 loops ranging from 11 to 13 residues, our method obtains average/median global backbone root-mean-square deviations (RMSDs) to the native structure (superimposing the body of the protein, not the loop itself) of 1.00/0.62 Å for 11 residue loops, 1.15/0.60 Å for 12 residue loops, and 1.25/0.76 Å for 13 residue loops. Sampling errors are virtually eliminated, while energy errors leading to large backbone RMSDs are very infrequent compared to any previously reported efforts, including our own previous study. We attribute this success to both an improved sampling algorithm and, more critically, the inclusion of a hydrophobic term, which appears to approximately fix a major flaw in the SGB solvation model that we have been employing. A discussion of these results in the context of the general question of the accuracy of continuum solvation models is presented.
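The "global" RMSD quoted here is measured on the loop atoms after the rest of the protein has been superimposed, so no further fitting is applied to the loop itself. A minimal sketch of that final step follows, assuming the predicted and native coordinates are already expressed in the same body-aligned frame; the function names and the sample values are hypothetical.

```python
import math
from statistics import mean, median

def global_rmsd(predicted, native):
    """RMSD over matched atoms with no additional superposition.

    Both inputs are lists of (x, y, z) tuples assumed to be in the same
    frame, i.e. the protein body has already been superimposed.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(predicted, native):
        sq += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(sq / len(predicted))

def summarize(rmsds):
    """Return (average, median) over a set of per-loop RMSD values."""
    return mean(rmsds), median(rmsds)

# Hypothetical per-loop RMSD values, not data from the paper.
print(summarize([0.4, 0.6, 2.0]))
```

Reporting both statistics is informative because a few badly mispredicted loops raise the average well above the median, as in the figures quoted in the abstract.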
Affiliation(s)
- Kai Zhu
- Department of Chemistry, Columbia University, New York, New York 10027, USA
|
82
|
Fernandez-Fuentes N, Fiser A. Saturating representation of loop conformational fragments in structure databanks. BMC Struct Biol 2006; 6:15. [PMID: 16820050 PMCID: PMC1574324 DOI: 10.1186/1472-6807-6-15] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/27/2006] [Accepted: 07/04/2006] [Indexed: 11/30/2022]
Abstract
Background: Short fragments of proteins are fundamental starting points in various structure prediction applications, such as fragment-based loop modeling methods and various full-structure build-up procedures. The applicability and performance of these approaches depend on the availability of short fragments in structure databanks.
Results: We studied the representation of protein loop fragments up to 14 residues in length. All possible query fragments found in sequence databases (Sequence Space) were clustered and cross-referenced with available structural fragments in the Protein Data Bank (Structure Space). We found that the expansion of the PDB in the last few years resulted in a dense coverage of loop conformational fragments. For each loop of length 8 in the current Sequence Space there is at least one loop in Structure Space with 50% or higher sequence identity. By correlating sequence and structure clusters of loops we found that a 50% sequence identity generally guarantees structural similarity. Coverage at the 50% sequence identity cutoff drops to 96, 94, 68, 53, 33 and 13% for loops of length 9, 10, 11, 12, 13 and 14, respectively. There is not a single loop in the current Sequence Space at any length up to 14 residues that is not matched with a conformational segment that shares at least 20% sequence identity. This minimum observed identity is 40% for loops of 12 residues or shorter and is as high as 50% for loops of 10 residues or shorter. We also assessed the impact of rapidly growing sequence databanks on the estimated number of new loop conformations and found that, while the number of unique sequence segments increased about sixfold during the last five years, there are almost no unique conformational segments among these fragments of up to 12 residues.
Conclusion: The results suggest that fragment-based prediction approaches are no longer limited by the completeness of fragments in databanks but rather by the effectiveness of the scoring and search algorithms used to locate them. The current favorable coverage and trends observed will be further accentuated with the progress of the Protein Structure Initiative, which targets new protein folds and ultimately aims at providing an exhaustive coverage of the structure space.
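As a rough illustration of the coverage measure described in this abstract, the Python sketch below counts how many query loops have at least one same-length structural loop at or above a sequence identity cutoff. The function names, the toy fragments, and the brute-force comparison are assumptions made for illustration only; the study clustered fragments before cross-referencing them, which this sketch does not attempt.

```python
def sequence_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length fragments."""
    if len(a) != len(b) or not a:
        raise ValueError("fragments must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def coverage(queries, structures, cutoff=0.5):
    """Fraction of query loops matched by a same-length structural loop.

    A query counts as covered when at least one structural fragment of the
    same length reaches the identity cutoff (hypothetical default: 50%).
    """
    if not queries:
        return 0.0
    covered = 0
    for q in queries:
        if any(len(s) == len(q) and sequence_identity(q, s) >= cutoff
               for s in structures):
            covered += 1
    return covered / len(queries)


# Toy data, invented for this example.
queries = ["GSGDGKTA", "PNGSLLDE"]
structures = ["GSGDAKTV", "AAAAAAAA"]
print(coverage(queries, structures, cutoff=0.5))
```

With these toy fragments only the first query finds a match at the 50% cutoff, so half of the query set is covered.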
Affiliation(s)
- Narcis Fernandez-Fuentes
- Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- András Fiser
- Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA

83

Ngan SC, Inouye MT, Samudrala R. A knowledge-based scoring function based on residue triplets for protein structure prediction. Protein Eng Des Sel 2006; 19:187-93. [PMID: 16533801 PMCID: PMC5441915 DOI: 10.1093/protein/gzj018] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 08/23/2005] [Revised: 12/30/2005] [Accepted: 01/09/2006] [Indexed: 11/29/2022]
Abstract
One of the general paradigms for ab initio protein structure prediction involves sampling the conformational space such that a large set of decoy (candidate) structures is generated and then selecting native-like conformations from those decoys using various scoring functions. In this study, based on a physical/geometric approach first suggested by Banavar and colleagues, we formulate a knowledge-based scoring function, which uses the radii of curvature formed among triplets of residues in a protein conformation. By analyzing its performance on various decoy sets, we determine a good set of parameters (the distance cutoff and the number of distance bins) to use for configuring such a function. Furthermore, we investigate the effect of using various approaches for compiling the prior distribution on the performance of the knowledge-based function. Possible extensions to the current form of the residue triplet scoring function are discussed.
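The geometric quantity named in this abstract, the radius of curvature defined by a triplet of residues, can be sketched as the radius of the circle through three points. The Python example below is a minimal illustration under that assumption; the function name and the choice of one representative point per residue are hypothetical, and the binning and statistics of the published scoring function are not reproduced here.

```python
import math


def circumradius(p1, p2, p3):
    """Radius of the circle through three 3D points (R = abc / 4K).

    Returns math.inf for collinear points, where no finite circle exists.
    """
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Triangle area K from the cross product of two edge vectors.
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    area = 0.5 * math.sqrt(sum(x * x for x in cross))
    if area == 0.0:
        return math.inf
    return a * b * c / (4.0 * area)


# Three points forming a 3-4-5 right triangle; the circumradius is half
# the hypotenuse.
print(circumradius((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)))
```

A triplet-based score would presumably collect such radii over all residue triplets within a distance cutoff and compare their distribution with a reference; that step is left out because the abstract does not specify it.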
Affiliation(s)
- Shing-Chung Ngan
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Michael T. Inouye
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Ram Samudrala
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA

84

Fernandez-Fuentes N, Oliva B, Fiser A. A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucleic Acids Res 2006; 34:2085-97. [PMID: 16617149 PMCID: PMC1440879 DOI: 10.1093/nar/gkl156] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022]
Abstract
We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105 950 loop fragments and loop flanking secondary structures. Besides the length of the loops and the types of bracing secondary structures, the database is organized along four internal coordinates, a distance and three types of angles characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints of the stems. They are subsequently inserted in the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step, remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length that identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned, or optionally, these can be completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% and with an accuracy of at least 0.22, 1.38 and 2.47 Å r.m.s.d., respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of ∼5:1.
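The stem-fit step mentioned in this abstract can be illustrated with a small Python sketch that computes the r.m.s.d. between paired stem coordinates and keeps candidates under a threshold. This is an assumption-laden simplification: the function names, the candidate dictionary, and the 2.0 Å cutoff are hypothetical, no superposition is performed (the candidates are taken to be already placed in the query frame), and the clash check and Z-score ranking of the published method are not modeled.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def rank_by_stem_rmsd(candidates, target_stems, max_rmsd):
    """Return candidate names sorted by stem r.m.s.d., dropping poor fits."""
    scored = [(rmsd(coords, target_stems), name)
              for name, coords in candidates.items()]
    return [name for score, name in sorted(scored) if score <= max_rmsd]


# Toy stem coordinates in angstroms, invented for this example.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidates = {
    "frag_a": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 angstrom
    "frag_b": [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],  # shifted by 3 angstroms
}
print(rank_by_stem_rmsd(candidates, target, max_rmsd=2.0))
```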
Affiliation(s)
- Baldomero Oliva
- Structural Bioinformatics Group (GRIB), Universitat Pompeu Fabra, C/Doctor Aiguader 80, 08003 Barcelona, Catalonia, Spain
- András Fiser
- To whom correspondence should be addressed. Tel: +1 718 430 3233; Fax: +1 718 430 856;

85

de Bakker PIW, Furnham N, Blundell TL, DePristo MA. Conformer generation under restraints. Curr Opin Struct Biol 2006; 16:160-5. [PMID: 16483766 DOI: 10.1016/j.sbi.2006.02.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 11/16/2005] [Revised: 01/17/2006] [Accepted: 02/06/2006] [Indexed: 10/25/2022]
Abstract
Conformational sampling by direct optimization of an all-atom energy function is ineffective and inefficient because of the ruggedness of the energy landscape. Discrete sampling schemes represent an attractive alternative for generating ensembles of conformers consistent with spatial restraints derived from empirical data. Conformational sampling is becoming increasingly important for structure prediction as the bottleneck in accurate prediction shifts from energy functions to the methods used to find low-energy conformers. Experimental structure determination remains a perennial challenge as investigators tackle larger macromolecular systems, and begin to incorporate more complete descriptions of uncertainty, heterogeneity and dynamics into their models. Computational approaches that combine dense, discrete sampling with all-atom energy evaluation and refinement may help to overcome the remaining barriers to solving these problems.
Affiliation(s)
- Paul I W de Bakker
- Department of Molecular Biology and Center for Human Genetic Research, Massachusetts General Hospital, and Department of Genetics, Harvard Medical School, Boston, MA 02114-2790, USA

86

Abstract
The structure prediction of loops with flexible stem residues is addressed in this article. While the secondary structure of the stem residues is assumed to be known, the geometry of the protein into which the loop must fit is considered to be unknown in our methodology. As a consequence, the compatibility of the loop with the remainder of the protein is not used as a criterion to reject loop decoys. Loop structure prediction with flexible stems is more difficult than fitting loops into a known protein structure in that a larger conformational space has to be covered. The main focus of the study is to assess the precision of loop structure prediction if no information on the protein geometry is available. The proposed approach is based on (1) dihedral angle sampling, (2) structure optimization by energy minimization with a physically based energy function, (3) clustering, and (4) a comparison of strategies for the selection of loops identified in (3). Steps (1) and (2) have similarities to previous approaches to loop structure prediction with fixed stems. Step (3) is based on a new iterative approach to clustering that is tailored for the loop structure prediction problem with flexible stems. In this new approach, clustering is used not only to identify conformers that are likely to be close to the native structure but also to identify far-from-native decoys. By discarding these decoys iteratively, the overall quality of the ensemble and the loop structure prediction is improved. Step (4) provides a comparative study of criteria for loop selection based on energy, colony energy, cluster density, and a hybrid criterion introduced here. The proposed method is tested on a large set of 3215 loops from proteins in the PDB-Select25 set and on 179 loops from proteins from the CASP6 experiment.
Affiliation(s)
- M Mönnigmann
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA

87

Baerga-Ortiz A, Popovic B, Siskos AP, O'Hare HM, Spiteller D, Williams MG, Campillo N, Spencer JB, Leadlay PF. Directed Mutagenesis Alters the Stereochemistry of Catalysis by Isolated Ketoreductase Domains from the Erythromycin Polyketide Synthase. Chem Biol 2006; 13:277-85. [PMID: 16638533 DOI: 10.1016/j.chembiol.2006.01.004] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.0] [Received: 09/26/2005] [Revised: 01/10/2006] [Accepted: 01/10/2006] [Indexed: 11/18/2022]
Abstract
The ketoreductase (KR) domains eryKR(1) and eryKR(2) from the erythromycin-producing polyketide synthase (PKS) reduce 3-ketoacyl-thioester intermediates with opposite stereospecificity. Modeling of eryKR(1) and eryKR(2) showed that conserved amino acids previously correlated with production of alternative alcohol configurations lie in the active site. eryKR(1) domains mutated at these positions showed an altered stereochemical outcome in reduction of (2R, S)-2-methyl-3-oxopentanoic acid N-acetylcysteamine thioester. The wild-type eryKR(1) domain exclusively gave the (2S, 3R)-3-hydroxy-2-methylpentanoic acid N-acetylcysteamine thioester, while the double mutant (F141W, P144G) gave only the (2S, 3S) isomer, a switch of the alcohol stereochemistry. Mutation of the eryKR(2) domain, in contrast, greatly increased the proportion of the wild-type (2R, 3S)-alcohol product. These data confirm the role of key residues in stereocontrol and suggest an additional way to make rational alterations in polyketide antibiotic structure.
Affiliation(s)
- Abel Baerga-Ortiz
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, United Kingdom

88

Szarecka A, Meirovitch H. Optimization of the GB/SA solvation model for predicting the structure of surface loops in proteins. J Phys Chem B 2006; 110:2869-80. [PMID: 16471897 PMCID: PMC1945207 DOI: 10.1021/jp055771+] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
Implicit solvation models are commonly optimized with respect to experimental data or Poisson-Boltzmann (PB) results obtained for small molecules, where the force field is sometimes not considered. In previous studies, we have developed an optimization procedure for cyclic peptides and surface loops in proteins based on the entire system studied and the specific force field used. Thus, the loop has been modeled by the simplified solvation function $E_{\mathrm{tot}} = E_{\mathrm{FF}}(\varepsilon = 2r) + \sum_i \sigma_i A_i$, where $E_{\mathrm{FF}}(\varepsilon = nr)$ is the AMBER force field energy with a distance-dependent dielectric function, $\varepsilon = nr$, $A_i$ is the solvent accessible surface area of atom $i$, and $\sigma_i$ is its atomic solvation parameter. During the optimization process, the loop is free to move while the protein template is held fixed in its X-ray structure. To improve on the results of this model, in the present work we apply our optimization procedure to the physically more rigorous solvation model, the generalized Born with surface area (GB/SA) (together with the all-atom AMBER force field) as suggested by Still and co-workers (J. Phys. Chem. A 1997, 101, 3005). The six parameters of the GB/SA model, namely, $P_1$-$P_5$ and the surface area parameter, $\sigma$ (programmed in the TINKER package), are reoptimized for a "training" group of nine loops, and a best-fit set is defined from the individual sets of optimized parameters. The best-fit set and Still's original set of parameters (where Lys, Arg, His, Glu, and Asp are charged or neutralized) were applied to the training group as well as to a "test" group of seven loops, and the energy gaps and the corresponding RMSD values were calculated. These GB/SA results based on the three sets of parameters have been found to be comparable; surprisingly, however, they are somewhat inferior (e.g., with larger energy gaps) to those obtained previously from the simplified model described above. We discuss recent results for loops obtained by other solvation models and potential directions for future studies.
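The simplified solvation function quoted in this abstract, a force field energy plus a sum of atomic solvation parameters times solvent accessible surface areas, can be written out as a short Python sketch. The function name and the numerical values are hypothetical placeholders; in practice the force field energy and the per-atom areas would come from a molecular modeling package, which this example does not call.

```python
def total_energy(e_ff, sigmas, areas):
    """Force field energy plus the surface-area solvation term.

    e_ff   -- force field energy, assumed to be precomputed elsewhere
    sigmas -- atomic solvation parameter for each atom
    areas  -- solvent accessible surface area for each atom
    """
    if len(sigmas) != len(areas):
        raise ValueError("need one solvation parameter per atomic area")
    solvation = sum(s * a for s, a in zip(sigmas, areas))
    return e_ff + solvation


# Invented numbers: two atoms whose solvation contributions cancel.
print(total_energy(-120.0, [0.01, -0.02], [10.0, 5.0]))
```

Optimizing the solvation parameters, as the abstract describes, would amount to adjusting `sigmas` so that native-like loop conformations score lowest; that search is outside the scope of this sketch.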
Affiliation(s)
- Agnieszka Szarecka
- Department of Computational Biology, University of Pittsburgh School of Medicine, Suite 3064, BST 3, 3501 Fifth Avenue, Pittsburgh, PA 15213
- Hagai Meirovitch
- Department of Computational Biology, University of Pittsburgh School of Medicine, Suite 3064, BST 3, 3501 Fifth Avenue, Pittsburgh, PA 15213

89

Floudas C, Fung H, McAllister S, Mönnigmann M, Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 2006. [DOI: 10.1016/j.ces.2005.04.009] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 01/01/2023]

90

Reinert DJ, Carpusca I, Aktories K, Schulz GE. Structure of the mosquitocidal toxin from Bacillus sphaericus. J Mol Biol 2006; 357:1226-36. [PMID: 16483607 DOI: 10.1016/j.jmb.2006.01.025] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 10/10/2005] [Revised: 12/29/2005] [Accepted: 01/05/2006] [Indexed: 11/19/2022]
Abstract
The catalytic domain of a mosquitocidal toxin prolonged by a C-terminal 44 residue linker connecting to four ricin B-like domains was crystallized. Three crystal structures were established at resolutions between 2.5 Å and 3.0 Å using multi-wavelength and single-wavelength anomalous X-ray diffraction as well as molecular replacement phasing techniques. The chain fold of the toxin fragment corresponds to those of ADP-ribosylating enzymes. At pH 4.3 the fragment is associated in a C(7)-symmetric heptamer in agreement with an aggregate of similar size observed by size-exclusion chromatography. In two distinct crystal forms, the heptamers formed nearly spherical, D(7)-symmetric tetradecamers. Another crystal form obtained at pH 6.3 contained a recurring C(2)-symmetric tetramer, which, however, was not stable in solution. On the basis of the common chain fold and NAD(+)-binding site of all ADP-ribosyl transferases, the NAD(+)-binding site of the toxin was assigned at a high confidence level. In all three crystal forms the NAD(+) site was occupied by part of the 44 residue linker, explaining the known inhibitory effect of this polypeptide region. The structure showed that the cleavage site for toxin activation is in a highly mobile loop that is exposed in the monomer. Since it contains the inhibitory linker as a crucial part of the association contact, the observed heptamer is inactive. Moreover, the heptamer cannot be activated by proteolysis because the activation loop is at the ring center and not accessible for proteases. Therefore the heptamer, or possibly the tetradecamer, seems to represent an inactive storage form of the toxin.
Affiliation(s)
- Dirk J Reinert
- Institut für Organische Chemie und Biochemie, Albert-Ludwigs-Universität, Albertstrasse 21, 79104 Freiburg im Breisgau, Germany

91

White RP, Meirovitch H. Minimalist explicit solvation models for surface loops in proteins. J Chem Theory Comput 2006; 2:1135-1151. [PMID: 17429495 PMCID: PMC1851699 DOI: 10.1021/ct0503217] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
We have performed molecular dynamics simulations of protein surface loops solvated by explicit water, where a prime focus of the study is the small numbers (e.g., ~100) of explicit water molecules employed. The models include only part of the protein (typically 500 - 1000 atoms), and the water molecules are restricted to a region surrounding the loop. In this study, the number of water molecules (N(w)) is systematically varied, and convergence with large N(w) is monitored to reveal N(w)(min), the minimum number required for the loop to exhibit realistic (fully hydrated) behavior. We have also studied protein surface coverage, as well as diffusion and residence times for water molecules as a function of N(w). A number of other modeling parameters are also tested. These include the number of environmental protein atoms explicitly considered in the model, as well as two ways to constrain the water molecules to the vicinity of the loop (where we find one of these methods to perform better when N(w) is small). The results (for RMSD and its fluctuations for four loops) are further compared to much larger, fully solvated systems (using ~10,000 water molecules under periodic boundary conditions and Ewald electrostatics), and to results for the GBSA implicit solvation model. We find that the loop backbone can stabilize with a surprisingly small number of water molecules (as low as 5 molecules per amino acid residue). The side chains of the loop require somewhat larger N(w), where the atomic fluctuations become too small if N(w) is further reduced. Thus, in general, we find adequate hydration to occur at roughly 12 water molecules per residue. This is an important result, because at this hydration level, computational times are comparable to those required for GBSA. Therefore these "minimalist explicit models" can provide a viable and potentially more accurate alternative. 
The importance of protein loop modeling is discussed in the context of these, and other, loop models, along with other challenges including the relevance of appropriate free energy simulation methodology for assessment of conformational stability.
Affiliation(s)
- Ronald P. White
- Department of Computational Biology, University of Pittsburgh School of Medicine, Biomedical Science Tower 3, 3064 Pittsburgh, PA 15260
- Hagai Meirovitch
- Department of Computational Biology, University of Pittsburgh School of Medicine, Biomedical Science Tower 3, 3064 Pittsburgh, PA 15260

92

Conner AC, Simms J, Howitt SG, Wheatley M, Poyner DR. The second intracellular loop of the calcitonin gene-related peptide receptor provides molecular determinants for signal transduction and cell surface expression. J Biol Chem 2005; 281:1644-51. [PMID: 16293613 DOI: 10.1074/jbc.m510064200] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
Abstract
The calcitonin gene-related peptide (CGRP) receptor is a heterodimer of a family B G-protein-coupled receptor, calcitonin receptor-like receptor (CLR), and the accessory protein receptor activity modifying protein 1. It couples to G(s), but it is not known which intracellular loops mediate this. We have identified the boundaries of this loop based on the relative position and length of the juxtamembrane transmembrane regions 3 and 4. The loop has been analyzed by systematic mutagenesis of all residues to alanine, measuring cAMP accumulation, CGRP affinity, and receptor expression. Unlike rhodopsin, ICL2 of the CGRP receptor plays a part in the conformational switch after agonist interaction. His-216 and Lys-227 were essential for a functional CGRP-induced cAMP response. The effect of (H216A)CLR is due to a disruption to the cell surface transport or surface stability of the mutant receptor. In contrast, (K227A)CLR had wild-type expression and agonist affinity, suggesting a direct disruption to the downstream signal transduction mechanism of the CGRP receptor. Modeling suggests that the loop undergoes a significant shift in position during receptor activation, exposing a potential G-protein binding pocket. Lys-227 changes position to point into the pocket, potentially allowing it to interact with bound G-proteins. His-216 occupies a position similar to that of Tyr-136 in bovine rhodopsin, part of the DRY motif of the latter receptor. This is the first comprehensive analysis of an entire intracellular loop within the calcitonin family of G-protein-coupled receptors. These data help to define the structural and functional characteristics of the CGRP receptor and of family B G-protein-coupled receptors in general.
Affiliation(s)
- Alex C Conner
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom

93

Jiang H, Blouin C. Ab initio construction of all-atom loop conformations. J Mol Model 2005; 12:221-8. [PMID: 16247602 DOI: 10.1007/s00894-005-0030-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 11/15/2004] [Accepted: 06/23/2005] [Indexed: 11/28/2022]
Abstract
In this study, a new ab initio method named CLOOP has been developed to build all-atom loop conformations. In this method, a loop main-chain conformation is generated by sampling main-chain dihedral angles from a restrained ϕ/ψ set, and the side-chain conformations are built randomly. The CHARMM all-atom force field was used to evaluate the loop conformations. Soft core potentials were used to treat the non-bond interactions, and a designed energy-minimization technique was used to close and optimize the loop conformations. It is shown that the two strategies improve the computational efficiency and the loop-closure rate substantially compared to normal minimization methods. CLOOP was used to construct the conformations of 4-, 8-, and 12-residue loops in Fiser's test set. The average main-chain root-mean-square deviations obtained in 1,000 trials for the 10 different loops of each size are 0.33, 1.27, and 2.77 Å, respectively. CLOOP can build all-atom loop conformations with a sampling accuracy comparable with previous loop main-chain construction algorithms.
Affiliation(s)
- Haiyan Jiang
- Faculty of Computer Science, Dalhousie University, Halifax, NS, B3H 1W5, Canada.

94

Depristo MA, de Bakker PIW, Johnson RJK, Blundell TL. Crystallographic Refinement by Knowledge-Based Exploration of Complex Energy Landscapes. Structure 2005; 13:1311-9. [PMID: 16154088 DOI: 10.1016/j.str.2005.06.008] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 03/17/2005] [Revised: 06/03/2005] [Accepted: 06/08/2005] [Indexed: 11/24/2022]
Abstract
Although X-ray crystallography remains the most versatile method to determine the three-dimensional atomic structure of proteins and much progress has been made in model building and refinement techniques, it remains a challenge to elucidate accurately the structure of proteins in medium-resolution crystals. This is largely due to the difficulty of exploring an immense conformational space to identify the set of conformers that collectively best fits the experimental diffraction pattern. We show here that combining knowledge-based conformational sampling in RAPPER with molecular dynamics/simulated annealing (MD/SA) vastly improves the quality and power of refinement compared to MD/SA alone. The utility of this approach is highlighted by the automated determination of a lysozyme mutant from a molecular replacement solution that is in congruence with a model prepared independently by crystallographers. Finally, we discuss the implications of this work on structure determination in particular and conformational sampling and energy minimization in general.
Affiliation(s)
- Mark A Depristo
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, United Kingdom.

95

Forrest LR, Honig B. An assessment of the accuracy of methods for predicting hydrogen positions in protein structures. Proteins 2005; 61:296-309. [PMID: 16114036 DOI: 10.1002/prot.20601] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 12/25/2022]
Abstract
The addition of hydrogen atoms to models or experimental structures of proteins that contain only non-hydrogen atoms is a common step in crystallographic structure refinement, in theoretical studies of proteins, and in protein structure prediction. Accurate prediction of the hydrogen positions is essential, since they constitute around half of the atoms in proteins and hence contribute significantly to their energetics. Many computational tools exist for predicting hydrogen positions, although to date no quantitative comparison has been made of their accuracy or efficiency. Here we take advantage of the recent increase in ultra-high-resolution X-ray crystal structures (< 0.9 Å resolution), as well as of a number of relatively high-resolution neutron diffraction structures (< 1.8 Å resolution), to compare the quality of the predictions generated by a large set of commonly used methods. These include CHARMM, CNS, GROMACS, MCCE, MolProbity, WHAT IF, and X-PLOR. The hydrogen atoms that lack a rotational degree of freedom are mostly, but not always, accurately predicted. For hydrogens with a rotational degree of freedom, all the methods give much less accurate predictions. The predictions for the hydroxyl hydrogens are analyzed in detail, particularly those buried within the protein, and some explanation is provided for the errors observed. The results provide a means to make informed decisions regarding the choice and implementation of methodologies for placing hydrogens on structures of proteins. They also point to shortcomings in current force fields and suggest the need for improved descriptions of hydrogen bonding energetics.
Affiliation(s)
- Lucy R Forrest
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA

96

Fogolari F, Tosatto SCE. Application of MM/PBSA colony free energy to loop decoy discrimination: toward correlation between energy and root mean square deviation. Protein Sci 2005; 14:889-901. [PMID: 15772305 PMCID: PMC2253447 DOI: 10.1110/ps.041004105] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 10/25/2022]
Abstract
Accurate free energy estimation is needed in many predictive tasks. The molecular mechanics/Poisson-Boltzmann solvent accessible surface area (MM/PBSA) approach has proven to be accurate. However, the correlation between the estimated free energy and the distance (e.g., root mean square deviation [RMSD]) from the most stable conformation is hindered by the strong free energy dependence on minor conformational variations. In this paper, a protocol for MM/PBSA free energy estimation is designed and tested on several loop decoy sets. We show that further integration of the MM/PBSA free energy estimator with the colony energy approach makes the correlation between the free energy and RMSD from the native structure apparent, for the test sets on which it could be applied. Our results suggest that (1) the MM/PBSA free energy estimator is able to detect native-like structures for most decoy sets, and (2) application of the colony energy approach greatly dampens the strong dependence of the MM/PBSA energy on minor conformational changes.
Affiliation(s)
- Federico Fogolari
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, Piazzale Kolbe 4, 33100 Udine, Italy.

97

Abstract
Energy functions are crucial ingredients of protein tertiary structure prediction methods. Assessing the quality of energy functions is therefore of prime importance. It requires the elaboration of a standard evaluation scheme, whose key elements are: (i) sets that contain the native and several non-native structures of proteins (decoys) in order to test whether the energy functions display the expected quality features and (ii) measures to evaluate the reliability of energy functions. We present here a survey of the recent advances in these two related fields. In the first part, we analyze and review the large number of decoy sets that are available on the web, and we summarize the characteristics of a challenging decoy set. We then discuss how to define the quality of energy functions and review the measures related to it.
Affiliation(s)
- D Gilis
- Center of Applied Molecular Engineering, Institute of Chemistry and Biochemistry, University of Salzburg, Jakob Haringerstraße 3, A-5020 Salzburg, Austria.

98

Rayan A, Senderowitz H, Goldblum A. Exploring the conformational space of cyclic peptides by a stochastic search method. J Mol Graph Model 2004; 22:319-33. [PMID: 15099829 DOI: 10.1016/j.jmgm.2003.12.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 10/26/2022]
Abstract
A stochastic search algorithm is applied in order to probe the conformations of cyclic peptides. The search is conducted in two stages. In the first stage, random conformations are generated and evaluated by a penalty function for ring closure ability, following a stepwise construction of each amino acid into the peptide by a random choice of one of its allowed conformations. The allowed conformational ranges of backbone dihedral angles for each amino acid have been extracted from a data bank of diverse proteins. Values of dihedral angles that do not contribute favorably to the scoring of ring closure are retained or discarded by a statistical test. Values are discarded up to a point from which all remaining combinations of angles are constructed, scored, sorted, and clustered. In the second stage, side chains were added and fast optimization was applied to the set of diverse conformations in a "united atoms" approach, with the "Kollman forcefield" of Sybyl 6.8. This iterative stochastic elimination algorithm finds the global minimum and most of the best results, when compared to a full exhaustive search in appropriately sized problems. In larger problems, we compare the results to experimental structures. The root mean square deviations (RMSDs) of our best results from crystal structures of cyclic peptides of 4 to 15 amino acids are mostly below 1.0 Å for peptides of up to 8 residues and under 2.0 Å for larger cyclic peptides.
Affiliation(s)
- Anwar Rayan
- Department of Medicinal Chemistry and Natural Products, David R. Bloom Center for Pharmacy, School of Pharmacy, The Hebrew University of Jerusalem, Jerusalem 91120, Israel.
|
99
|
Núñez Miguel R, Sanders J, Jeffreys J, Depraetere H, Evans M, Richards T, Blundell TL, Rees Smith B, Furmaniak J. Analysis of the thyrotropin receptor-thyrotropin interaction by comparative modeling. Thyroid 2004; 14:991-1011. [PMID: 15650352 DOI: 10.1089/thy.2004.14.991] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022]
Abstract
We have used the most advanced programs currently available to construct the first three-domain structure of the human thyrotropin receptor (TSHR) using comparative modeling. The model consists of a leucine-rich domain (LRD; amino acids 36-281; porcine ribonuclease inhibitor used as a template for modeling), a cleavage domain (CD; amino acids 282-409; tissue inhibitor of matrix metalloproteinases 2 as template) and a transmembrane domain (TMD; amino acids 410-699; bovine rhodopsin as template). Models of human, porcine, and bovine TSH were also constructed (human chorionic gonadotropin [hCG] and human follicle stimulating hormone [hFSH] as templates). The LRD has a characteristic horseshoe shape with 10 tandem homologous repeats. The CD consists of beta-barrel and alpha helix structures (OB-like fold) with two disulfide bridges, and the structure around these disulfide bridges remains stable after cleavage. The TMD presents the typical seven membrane-spanning helices. The TSH, LRD, CD, and TMD models were brought together in an extensive series of docking experiments. Known features of the TSH-TSHR interaction were used for selection of appropriate complexes that were then validated using a different set of experimental data. A similar approach was used to build a model of a complex between the TSHR and a monoclonal TSHR antibody with weak thyroid stimulating activity. Human thyrotropin (hTSH) alpha chains were found to make contact with many amino acids on the LRD surface and CD surface, whereas no interaction between the beta chains and the CD was found. The higher affinity of bovine thyrotropin (bTSH) and porcine thyrotropin (pTSH) (relative to hTSH) for the TSHR is explained well by the models in terms of charge-charge interactions between their alpha chains and the receptor. Experimental observations showing increased sensitivity of the TSHR to hCG after mutation of TSHR Lys209 to Glu are explained well by our model. Furthermore, several mutations in the TMD that are associated with increased TSHR basal activity are predicted from our model to be caused by the formation of new interactions that stabilize the activated form of the TMD.
|
100
|
Jacobson MP, Pincus DL, Rapp CS, Day TJF, Honig B, Shaw DE, Friesner RA. A hierarchical approach to all-atom protein loop prediction. Proteins 2004; 55:351-67. [PMID: 15048827 DOI: 10.1002/prot.10613] [Citation(s) in RCA: 1753] [Impact Index Per Article: 87.7] [Indexed: 12/25/2022]
Abstract
The application of all-atom force fields (and explicit or implicit solvent models) to protein homology-modeling tasks such as side-chain and loop prediction remains challenging both because of the expense of the individual energy calculations and because of the difficulty of sampling the rugged all-atom energy surface. Here we address this challenge for the problem of loop prediction through the development of numerous new algorithms, with an emphasis on multiscale and hierarchical techniques. As a first step in evaluating the performance of our loop prediction algorithm, we have applied it to the problem of reconstructing loops in native structures; we also explicitly include crystal packing to provide a fair comparison with crystal structures. In brief, large numbers of loops are generated by using a dihedral angle-based buildup procedure followed by iterative cycles of clustering, side-chain optimization, and complete energy minimization of selected loop structures. We evaluate this method by using the largest test set yet used for validation of a loop prediction method, with a total of 833 loops ranging from 4 to 12 residues in length. Average/median backbone root-mean-square deviations (RMSDs) to the native structures (superimposing the body of the protein, not the loop itself) are 0.42/0.24 Å for 5 residue loops, 1.00/0.44 Å for 8 residue loops, and 2.47/1.83 Å for 11 residue loops. Median RMSDs are substantially lower than the averages because of a small number of outliers; the causes of these failures are examined in some detail, and many can be attributed to errors in assignment of protonation states of titratable residues, omission of ligands from the simulation, and, in a few cases, probable errors in the experimentally determined structures. When these obvious problems in the data sets are filtered out, average RMSDs to the native structures improve to 0.43 Å for 5 residue loops, 0.84 Å for 8 residue loops, and 1.63 Å for 11 residue loops. In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that sampling rarely limits prediction accuracy. The overall results are, to our knowledge, the best reported to date, and we attribute this success to the combination of an accurate all-atom energy function, efficient methods for loop buildup and side-chain optimization, and, especially for the longer loops, the hierarchical refinement protocol.
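The evaluation metric described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the predicted and native structures have already been superimposed on the body of the protein, so the loop coordinates are compared directly without refitting. The function names and the RMSD values in the demo are hypothetical.

```python
import math
import statistics


def loop_rmsd(predicted, native):
    """RMSD over matching loop atoms, with no superposition of the loop itself.

    Both arguments are sequences of (x, y, z) tuples in the same frame,
    i.e. after the protein bodies have been superimposed elsewhere.
    """
    if not predicted or len(predicted) != len(native):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((p - q) ** 2
                  for atom_p, atom_n in zip(predicted, native)
                  for p, q in zip(atom_p, atom_n))
    return math.sqrt(squared / len(predicted))


def summarize(rmsds):
    """Return (mean, median) of a list of per-loop RMSDs."""
    return statistics.mean(rmsds), statistics.median(rmsds)


if __name__ == "__main__":
    # Hypothetical per-loop RMSDs in angstroms, with one outlier.
    example = [0.2, 0.3, 0.25, 0.3, 4.0]
    mean, median = summarize(example)
    print(f"mean={mean:.2f} median={median:.2f}")
```

A single badly predicted loop pulls the mean up much more than the median, which is the effect the abstract attributes to its small number of outliers.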
Affiliation(s)
- Matthew P Jacobson
- Department of Pharmaceutical Chemistry, University of California, San Francisco 94143-2240, USA.
|