51
Goldfeld DA, Zhu K, Beuming T, Friesner RA. Loop prediction for a GPCR homology model: Algorithms and results. Proteins 2012; 81:214-28. [DOI: 10.1002/prot.24178] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/22/2012] [Revised: 08/13/2012] [Accepted: 08/25/2012] [Indexed: 11/07/2022]
52
Abstract
The prediction of loop structures is considered one of the main challenges in the protein folding problem. Regardless of the dependence of the overall algorithm on the protein data bank, the flexibility of loop regions dictates the need for special attention to their structures. In this article, we present algorithms for loop structure prediction with fixed stem and flexible stem geometry. In the flexible stem geometry problem, only the secondary structure of three stem residues on either side of the loop is known. In the fixed stem geometry problem, the structure of the three stem residues on either side of the loop is also known. Initial loop structures are generated using a probability database for the flexible stem geometry problem, and using torsion angle dynamics for the fixed stem geometry problem. Three rotamer optimization algorithms are introduced to alleviate steric clashes between the generated backbone structures and the side chain rotamers. The structures are optimized by energy minimization using an all-atom force field. The optimized structures are clustered using a traveling salesman problem-based clustering algorithm. The structures in the densest clusters are then utilized to refine dihedral angle bounds on all amino acids in the loop. The entire procedure is carried out for a number of iterations, leading to improved structure prediction and refined dihedral angle bounds. The algorithms presented in this article have been tested on 3190 loops from the PDBSelect25 data set and on targets from the recently concluded CASP9 community-wide experiment.
Affiliation(s)
- A. Subramani
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
53
St-Pierre JF, Mousseau N. Large loop conformation sampling using the activation relaxation technique, ART-nouveau method. Proteins 2012; 80:1883-94. [PMID: 22488731 DOI: 10.1002/prot.24085] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/29/2011] [Revised: 03/19/2011] [Accepted: 03/30/2012] [Indexed: 12/25/2022]
Abstract
We present an adaptation of the ART nouveau energy surface sampling method to the problem of loop structure prediction. This method, previously used to study protein folding pathways and peptide aggregation, is well suited to sampling the conformational space of large loops because it targets probable folding pathways instead of exhaustively sampling that space. The number of sampled conformations needed by ART nouveau to find the global energy minimum for a loop was found to scale linearly with the sequence length of the loop for loops between 8 and about 20 amino acids. Because the computational cost of sampling each new conformation also scales linearly with loop sequence length, we estimate that the total computational cost of sampling larger loops scales quadratically, compared with the exponential scaling of exhaustive search methods.
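The quadratic estimate in this abstract follows from multiplying two linear factors. The derivation below is a short sketch; the symbols L (loop length), a and b (proportionality constants), and k (number of discrete states per residue in an exhaustive search) are notation assumed here for illustration and do not appear in the abstract.

```latex
N_{\mathrm{conf}}(L) \approx a\,L, \qquad c_{\mathrm{conf}}(L) \approx b\,L
\quad\Longrightarrow\quad
C_{\mathrm{total}}(L) = N_{\mathrm{conf}}(L)\, c_{\mathrm{conf}}(L) \approx a b\, L^{2},
\qquad\text{versus}\qquad
C_{\mathrm{exhaustive}}(L) \sim k^{L}.
```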
Affiliation(s)
- Jean-François St-Pierre
- Département de Physique and Regroupement Québécois sur les Matériaux de Pointe, Université de Montréal, CP 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7
54
Lee GR, Shin WH, Park HB, Shin SM, Seok CO. Conformational Sampling of Flexible Ligand-binding Protein Loops. Bull Korean Chem Soc 2012. [DOI: 10.5012/bkcs.2012.33.3.770] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
55
56
Skliros A, Zimmermann MT, Chakraborty D, Saraswathi S, Katebi AR, Leelananda SP, Kloczkowski A, Jernigan RL. The importance of slow motions for protein functional loops. Phys Biol 2012; 9:014001. [PMID: 22314977 DOI: 10.1088/1478-3975/9/1/014001] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]
Abstract
Loops in proteins that connect secondary structures such as alpha-helices and beta-sheets are often on the surface and may play a critical role in some functions of a protein. The mobility of loops is central for the motional freedom and flexibility requirements of active-site loops and may play a critical role for some functions. The structures and behaviors of loops have not been studied much in the context of the whole structure and its overall motions, especially how these might be coupled. Here we investigate loop motions by using coarse-grained structures (Cα atoms only) to solve the motions of the system by applying Lagrange equations with elastic network models to learn about which loops move in an independent fashion and which move in coordination with domain motions, faster and slower, respectively. The normal modes of the system are calculated using eigen-decomposition of the stiffness matrix. The contribution of individual modes and groups of modes is investigated for their effects on all residues in each loop by using Fourier analyses. Our results indicate that the motions of functional sets of loops behave in ways similar to those of the whole structure. Overall, however, only relatively few loops move in coordination with the dominant slow modes of motion, and these are often closely related to function.
Affiliation(s)
- Aris Skliros
- L. H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA. Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50011, USA
57
Wickstrom L, Gallicchio E, Levy RM. The linear interaction energy method for the prediction of protein stability changes upon mutation. Proteins 2011; 80:111-25. [PMID: 22038697 DOI: 10.1002/prot.23168] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 05/02/2011] [Revised: 07/28/2011] [Accepted: 08/06/2011] [Indexed: 12/25/2022]
Abstract
The coupling of protein energetics and sequence changes is a critical aspect of computational protein design, as well as for the understanding of protein evolution, human disease, and drug resistance. To study the molecular basis for this coupling, computational tools must be sufficiently accurate and computationally inexpensive enough to handle large amounts of sequence data. We have developed a computational approach based on the linear interaction energy (LIE) approximation to predict the changes in the free-energy of the native state induced by a single mutation. This approach was applied to a set of 822 mutations in 10 proteins which resulted in an average unsigned error of 0.82 kcal/mol and a correlation coefficient of 0.72 between the calculated and experimental ΔΔG values. The method is able to accurately identify destabilizing hot spot mutations; however, it has difficulty in distinguishing between stabilizing and destabilizing mutations because of the distribution of stability changes for the set of mutations used to parameterize the model. In addition, the model also performs quite well in initial tests on a small set of double mutations. On the basis of these promising results, we can begin to examine the relationship between protein stability and fitness, correlated mutations, and drug resistance.
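The two accuracy figures quoted in this abstract, the average unsigned error and the correlation coefficient between calculated and experimental ΔΔG values, can be computed as in the minimal sketch below. The function names and the four toy ΔΔG values are illustrative assumptions and are not data from the study; the correlation is assumed here to be the Pearson coefficient.

```python
import math


def average_unsigned_error(calc, expt):
    """Mean absolute difference between calculated and experimental values."""
    if len(calc) != len(expt) or not calc:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(c - e) for c, e in zip(calc, expt)) / len(calc)


def pearson_r(calc, expt):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(calc) != len(expt) or len(calc) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(calc)
    mean_c = sum(calc) / n
    mean_e = sum(expt) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov / math.sqrt(var_c * var_e)


# Hypothetical ΔΔG values in kcal/mol (not taken from the paper).
calc_ddg = [1.0, 2.0, 3.0, 4.0]
expt_ddg = [1.5, 1.5, 3.5, 3.5]

aue = average_unsigned_error(calc_ddg, expt_ddg)
r = pearson_r(calc_ddg, expt_ddg)
print(f"AUE = {aue:.2f} kcal/mol, r = {r:.2f}")
```

Reporting both numbers is useful because a model can rank mutations well (high correlation) while still being offset in absolute terms (high unsigned error), or the reverse.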
Affiliation(s)
- Lauren Wickstrom
- Department of Chemistry and Chemical Biology, BioMaPS Institute for Quantitative Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA
58
Joo H, Chavan AG, Day R, Lennox KP, Sukhanov P, Dahl DB, Vannucci M, Tsai J. Near-native protein loop sampling using nonparametric density estimation accommodating sparcity. PLoS Comput Biol 2011; 7:e1002234. [PMID: 22028638 PMCID: PMC3197639 DOI: 10.1371/journal.pcbi.1002234] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/03/2011] [Accepted: 09/01/2011] [Indexed: 11/29/2022] Open
Abstract
Unlike for the core structural elements of a protein, such as regular secondary structure, template-based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD <2.0 Å), the DPM-HMM method performs as well as or better than the best templates, demonstrating that our automated method recaptures these canonical loops without inclusion of any IgG specific terms or manual intervention. In cases with poor or few good templates (mean RMSD >7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct comparison of sampling with the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method to the loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
A protein's structure consists of elements of regular secondary structure connected by less regular stretches of loop segments. The irregularity of the loop structure makes loop modeling quite challenging. More accurate sampling of these loop conformations has a direct impact on protein modeling, design, function classification, as well as protein interactions. A method has been developed that extends a more comprehensive knowledge-based approach to producing models of the loop regions of protein structure. Most physical models cannot adequately sample the large conformational space, while the more discrete knowledge-based libraries are conformationally limited. To address both of these problems, we introduce a novel statistical method that produces a continuous yet weighted estimation of loop conformational space from a discrete library of structures by using a Dirichlet process mixture of hidden Markov models (DPM-HMM). Applied to loop structure sampling, the results of a number of tests demonstrate that our approach quickly generates large numbers of candidates with near native loop conformations. Most significantly, in the cases where the template sampling is sparse and/or far from native conformations, the DPM-HMM method samples close to the native space and produces a population of accurate loop structures.
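The abstract mentions that sampled models were enriched with an end-to-end distance filter but gives no details. The sketch below shows one plausible reading, assuming the filter keeps candidates whose first-to-last atom distance is close to the gap the loop must span. The function name, the tolerance value, and the toy coordinates are all hypothetical and are not taken from the paper or its software.

```python
import math


def end_to_end_filter(candidates, target_distance, tolerance=1.0):
    """Keep candidates whose end-to-end distance is within `tolerance` of the target.

    Each candidate is a list of (x, y, z) tuples, for example one point per
    residue. The tolerance of 1.0 (in the same units as the coordinates) is an
    illustrative default, not a published value.
    """
    kept = []
    for coords in candidates:
        distance = math.dist(coords[0], coords[-1])
        if abs(distance - target_distance) <= tolerance:
            kept.append(coords)
    return kept


# Toy candidates with end-to-end distances of 10.0, 12.5 and 9.4.
candidates = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0), (12.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (9.4, 0.0, 0.0)],
]

kept = end_to_end_filter(candidates, target_distance=10.0, tolerance=1.0)
print(len(kept))
```

A filter of this kind is cheap because it needs only two points per candidate, so it can discard loops that cannot close before any more expensive scoring is attempted.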
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Archana G. Chavan
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Ryan Day
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Kristin P. Lennox
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Paul Sukhanov
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- David B. Dahl
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Marina Vannucci
- Department of Statistics, Rice University, Houston, Texas, United States of America
- Jerry Tsai
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
59
Integrating structure-based and ligand-based approaches for computational drug design. Future Med Chem 2011; 3:735-50. [PMID: 21554079 DOI: 10.4155/fmc.11.18] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.9] [Indexed: 12/18/2022] Open
Abstract
Methods utilized in computer-aided drug design can be classified into two major categories: structure based and ligand based, using information on the structure of the protein or on the biological and physicochemical properties of bound ligands, respectively. In recent years there has been a trend towards integrating these two methods in order to enhance the reliability and efficiency of computer-aided drug-design approaches by combining information from both the ligand and the protein. This trend resulted in a variety of methods that include: pseudoreceptor methods, pharmacophore methods, fingerprint methods and approaches integrating docking with similarity-based methods. In this article, we will describe the concepts behind each method and selected applications.
60
Zhao S, Zhu K, Li J, Friesner RA. Progress in super long loop prediction. Proteins 2011; 79:2920-35. [PMID: 21905115 DOI: 10.1002/prot.23129] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 11/09/2010] [Revised: 05/06/2011] [Accepted: 06/15/2011] [Indexed: 11/07/2022]
Abstract
Sampling errors are very common in super long loop (referring here to loops that have more than thirteen residues) prediction, simply because the sampling space is vast. We have developed a dipeptide segment sampling algorithm to solve this problem. As a first step in evaluating the performance of this algorithm, it was applied to the problem of reconstructing loops in native protein structures. With a newly constructed test set of 89 loops ranging from 14 to 17 residues, this method obtains average/median global backbone root-mean-square deviations (RMSDs) to the native structure (superimposing the body of the protein, not the loop itself) of 1.46/0.68 Å. Specifically, results for loops of various lengths are 1.19/0.67 Å for 36 fourteen-residue loops, 1.55/0.75 Å for 30 fifteen-residue loops, 1.43/0.80 Å for 14 sixteen-residue loops, and 2.30/1.92 Å for nine seventeen-residue loops. In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that the new sampling method is successful and rarely limits prediction accuracy. Median RMSDs are substantially lower than the averages because of a small number of outliers. The causes of these failures are examined in some detail, and some can be attributed to flaws in the energy function, such as π-π interactions not being accurately accounted for by the OPLS-AA force field employed in this study. By introducing a new energy model which has a superior description of π-π interactions, significantly better results were achieved for quite a few former outliers. Crystal packing is explicitly included in order to provide a fair comparison with crystal structures.
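Two points in this abstract lend themselves to a small illustration: a global RMSD is measured on loop atoms without re-fitting the loop itself, and a few outliers can pull the average well above the median. The sketch below assumes both structures are already expressed in the same frame (the protein body has been superimposed beforehand); the function name and all numbers are hypothetical and are not values from the paper.

```python
import math
from statistics import mean, median


def global_rmsd(pred, native):
    """RMSD over matched atoms with no further superposition of the loop.

    Assumes `pred` and `native` are equal-length lists of (x, y, z) tuples in a
    common frame, i.e. the rest of the protein was superimposed beforehand.
    """
    if len(pred) != len(native) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((p[i] - n[i]) ** 2 for p, n in zip(pred, native) for i in range(3))
    return math.sqrt(squared / len(pred))


# A loop that is rigidly shifted by 1 unit along z relative to the native loop.
native_loop = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
shifted_loop = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(global_rmsd(shifted_loop, native_loop))  # 1.0

# Hypothetical per-loop RMSDs with a single outlier.
per_loop_rmsd = [0.5, 0.6, 0.7, 0.8, 5.0]
print(round(mean(per_loop_rmsd), 2), median(per_loop_rmsd))
```

The shifted loop has exactly the native shape, so superimposing the loop on itself would report an RMSD of zero; the global measure still reports the displacement, which is why it is the stricter of the two.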
Affiliation(s)
- Suwen Zhao
- Department of Chemistry, Columbia University, New York, New York 10027, USA
61
Li J, Abel R, Zhu K, Cao Y, Zhao S, Friesner RA. The VSGB 2.0 model: a next generation energy model for high resolution protein structure modeling. Proteins 2011; 79:2794-812. [PMID: 21905107 DOI: 10.1002/prot.23106] [Citation(s) in RCA: 725] [Impact Index Per Article: 55.8] [Received: 12/23/2010] [Revised: 05/03/2011] [Accepted: 05/13/2011] [Indexed: 02/06/2023]
Abstract
A novel energy model (VSGB 2.0) for high resolution protein structure modeling is described, which features an optimized implicit solvent model as well as physics-based corrections for hydrogen bonding, π-π interactions, self-contact interactions, and hydrophobic interactions. Parameters of the VSGB 2.0 model were fit to a crystallographic database of 2239 single side chain and 100 11-13 residue loop predictions. Combined with an advanced method of sampling and a robust algorithm for protonation state assignment, the VSGB 2.0 model was validated by predicting 115 super long loops up to 20 residues. Despite the dramatically increasing difficulty in reconstructing longer loops, a high accuracy was achieved: all of the lowest energy conformations have global backbone RMSDs better than 2.0 Å from the native conformations. Average global backbone RMSDs of the predictions are 0.51, 0.63, 0.70, 0.62, 0.80, 1.41, and 1.59 Å for 14, 15, 16, 17, 18, 19, and 20 residue loop predictions, respectively. When these results are corrected for possible statistical bias as explained in the text, the average global backbone RMSDs are 0.61, 0.71, 0.86, 0.62, 1.06, 1.67, and 1.59 Å. Given the precision and robustness of the calculations, we believe that the VSGB 2.0 model is suitable to tackle "real" problems, such as biological function modeling and structure-based drug discovery.
Affiliation(s)
- Jianing Li
- Department of Chemistry, Columbia University, New York, New York 10027, USA
62
Olson MA, Chaudhury S, Lee MS. Comparison between self-guided Langevin dynamics and molecular dynamics simulations for structure refinement of protein loop conformations. J Comput Chem 2011; 32:3014-22. [PMID: 21793008 DOI: 10.1002/jcc.21883] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 05/16/2011] [Accepted: 06/19/2011] [Indexed: 11/09/2022]
Abstract
This article presents a comparative analysis of two replica-exchange simulation methods for the structure refinement of protein loop conformations, starting from low-resolution predictions. The methods are self-guided Langevin dynamics (SGLD) and molecular dynamics (MD) with a Nosé-Hoover thermostat. We investigated a small dataset of 8- and 12-residue loops, with the shorter loops placed initially from a coarse-grained lattice model and the longer loops from an enumeration assembly method (the Loopy program). The CHARMM22 + CMAP force field with a generalized Born implicit solvent model (molecular-surface parameterized GBSW2) was used to explore conformational space. We also assessed two empirical scoring methods to detect nativelike conformations from decoys: the all-atom distance-scaled ideal-gas reference state (DFIRE-AA) statistical potential and the Rosetta energy function. Among the eight-residue loop targets, SGLD outperformed MD in all cases, with a median of 0.48 Å reduction in global root-mean-square deviation (RMSD) of the loop backbone coordinates from the native structure. Among the more challenging 12-residue loop targets, SGLD improved the prediction accuracy over MD by a median of 1.31 Å, representing a substantial improvement. The overall median RMSD for SGLD simulations of 12-residue loops was 0.91 Å, yielding refinement of a median 2.70 Å from initial loop placement. Results from DFIRE-AA and the Rosetta model applied to rescoring conformations failed to improve the overall detection calculated from the CHARMM force field. We illustrate the advantage of SGLD over the MD simulation model by presenting potential-energy landscapes for several loop predictions. Our results demonstrate that SGLD significantly outperforms traditional MD in the generation and populating of nativelike loop conformations and that the CHARMM force field performs comparably to other empirical force fields in identifying these conformations from the resulting ensembles.
Affiliation(s)
- Mark A Olson
- Department of Cell Biology and Biochemistry, US Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702, USA.
63
Li Y, Rata I, Jakobsson E. Sampling multiple scoring functions can improve protein loop structure prediction accuracy. J Chem Inf Model 2011; 51:1656-66. [PMID: 21702492 PMCID: PMC3211142 DOI: 10.1021/ci200143u] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Abstract
Accurately predicting loop structures is important for understanding functions of many proteins. In order to obtain loop models with high accuracy, efficiently sampling the loop conformation space to discover reasonable structures is a critical step. In loop conformation sampling, coarse-grain energy (scoring) functions coupled with reduced protein representations are often used to reduce the number of degrees of freedom as well as the sampling computational time. However, because reduced representations account for many factors only implicitly, the coarse-grain scoring functions may be insensitive and inaccurate, which can mislead the sampling process and cause important loop conformations to be missed. In this paper, we present a new computational sampling approach to obtain reasonable loop backbone models, the so-called Pareto optimal sampling (POS) method. The rationale of the POS method is to sample the function space of multiple, carefully selected scoring functions to discover an ensemble of diversified structures that are Pareto optimal with respect to all sampled conformations. The POS method can efficiently tolerate insensitivity and inaccuracy in individual scoring functions and thereby lead to significant accuracy improvement in loop structure prediction. We apply the POS method to a set of 4-12-residue loop targets using a function space composed of backbone-only Rosetta, the distance-scaled finite ideal-gas reference (DFIRE) potential, and a triplet backbone dihedral potential developed in our lab. Our computational results show that in 501 out of 502 targets, the model sets generated by POS contain structure models within subangstrom resolution. Moreover, the top-ranked models have a root mean square deviation (rmsd) less than 1 Å in 96.8, 84.1, and 72.2% of the short (4-6 residues), medium (7-9 residues), and long (10-12 residues) targets, respectively, when the all-atom models are generated by local optimization from the backbone models and are ranked by our recently developed Pareto optimal consensus (POC) method. Similar sampling effectiveness can also be found in a set of 13-residue loop targets.
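The notion of Pareto optimality that this abstract relies on can be illustrated with a short sketch: a conformation is kept if no other conformation is at least as good under every scoring function and strictly better under at least one. The code below only selects the non-dominated members of a fixed set and is not the POS sampling procedure itself; the function names, the lower-is-better convention, and the three-score toy vectors are assumptions made for illustration.

```python
def dominates(a, b):
    """True if score vector `a` dominates `b` (lower scores are better).

    `a` dominates `b` when it is no worse in every score and strictly better
    in at least one.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores of four conformations under three scoring functions.
scores = [
    (1.0, 5.0, 3.0),
    (2.0, 2.0, 2.0),
    (3.0, 3.0, 3.0),  # worse than the second conformation in every score
    (0.5, 6.0, 4.0),
]

print(pareto_front(scores))  # [0, 1, 3]
```

Keeping the whole front rather than the single best conformation under one function is what lets the ensemble tolerate an individual scoring function being wrong about a particular structure.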
Affiliation(s)
- Yaohang Li
- Department of Computer Science, Old Dominion University
- Ionel Rata
- Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign
- Eric Jakobsson
- Department of Molecular and Integrative Physiology, Beckman Institute, and National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
64
Successful prediction of the intra- and extracellular loops of four G-protein-coupled receptors. Proc Natl Acad Sci U S A 2011; 108:8275-80. [PMID: 21536915 DOI: 10.1073/pnas.1016951108] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022] Open
Abstract
We present results of the restoration of all crystallographically available intra- and extracellular loops of four G-protein-coupled receptors (GPCRs): bovine rhodopsin (bRh), the turkey β-1 adrenergic receptor (β1Ar), and the human β-2 adrenergic (β2Ar) and A2A adenosine (A2Ar) receptors. We use our Protein Local Optimization Program (PLOP), which samples conformational space from first principles to build sets of loop candidates and then discriminates between them using our physics-based, all-atom energy function with implicit solvent. We also discuss a new kind of explicit membrane calculation developed for GPCR loops that interact, either in the native structure or in low-energy false-positive structures, with the membrane, and thus exist in a multiphase environment not previously incorporated in PLOP. Our results demonstrate a significant advance over previous work reported in the literature, and of particular note we are able to accurately restore the extremely long second extracellular loop (ECL2), which is also key for GPCR ligand binding. In the case of β2Ar, accurate ECL2 restoration required seeding a small helix into the loop in the appropriate region, based on alignment with the β1Ar ECL2 loop, and then running loop reconstruction simulations with and without the seeded helix present; simulations containing the helix attain significantly lower total energies than those without the helix, and have rmsds close to the native structure. For β1Ar, the same protocol was used, except the alignment was done to β2Ar. These results represent an encouraging start for the more difficult problem of accurate loop refinement for GPCR homology modeling.
65
Kokh DB, Wade RC, Wenzel W. Receptor flexibility in small‐molecule docking calculations. Wiley Interdiscip Rev Comput Mol Sci 2011. [DOI: 10.1002/wcms.29] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 12/25/2022]
Affiliation(s)
- Daria B. Kokh
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany
- Rebecca C. Wade
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany
- Wolfgang Wenzel
- Karlsruhe Institute of Technology, Institute of Nanotechnology, Karlsruhe, Germany
66
Xu M, Lill MA. Significant enhancement of docking sensitivity using implicit ligand sampling. J Chem Inf Model 2011; 51:693-706. [PMID: 21375306 DOI: 10.1021/ci100457t] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
The efficient and accurate quantification of protein-ligand interactions using computational methods is still a challenging task. Two factors strongly contribute to the failure of docking methods to predict free energies of binding accurately: the insufficient incorporation of protein flexibility coupled to ligand binding and the neglected dynamics of the protein-ligand complex in current scoring schemes. We have developed a new methodology, named the 'ligand-model' concept, to sample protein conformations that are relevant for binding structurally diverse sets of ligands. In the ligand-model concept, molecular-dynamics (MD) simulations are performed with a virtual ligand, represented by a collection of functional groups that binds to the protein and dynamically changes its shape and properties during the simulation. The ligand model essentially represents a large ensemble of different chemical species binding to the same target protein. Representative protein structures were obtained from the MD simulation, and docking was performed into this ensemble of protein conformations. Similar binding poses were clustered, and the averaged score was utilized to rerank the poses. We demonstrate that the ligand-model approach yields significant improvements in predicting native-like binding poses and quantifying binding affinities compared to static docking and ensemble docking simulations into protein structures generated from an apo MD simulation.
Affiliation(s)
- Mengang Xu
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana 47907, United States
67
Arnautova YA, Abagyan RA, Totrov M. Development of a new physics-based internal coordinate mechanics force field and its application to protein loop modeling. Proteins 2011; 79:477-98. [PMID: 21069716 PMCID: PMC3057902 DOI: 10.1002/prot.22896] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 12/25/2022]
Abstract
We report the development of the internal coordinate mechanics force field (ICMFF), a new force field parameterized using a combination of experimental data for crystals of small molecules and quantum mechanics calculations. The main features of ICMFF include: (a) parameterization for the dielectric constant relevant to the condensed state (ε = 2) instead of vacuum, (b) an improved description of hydrogen-bond interactions using duplicate sets of van der Waals parameters for heavy atom-hydrogen interactions, and (c) improved backbone covalent geometry and energetics achieved using novel backbone torsional potentials and inclusion of the bond angles at the Cα atoms into the internal variable set. The performance of ICMFF was evaluated through loop modeling simulations for 4-13 residue loops. ICMFF was combined with a solvent-accessible surface area solvation model optimized using a large set of loop decoys. Conformational sampling was carried out using the biased probability Monte Carlo method. Average/median backbone root-mean-square deviations of the lowest energy conformations from the native structures were 0.25/0.21 Å for four-residue loops, 0.84/0.46 Å for eight-residue loops, and 1.16/0.73 Å for 12-residue loops. To our knowledge, these results are significantly better than or comparable with those reported to date for any loop modeling method that does not take crystal packing into account. Moreover, the accuracy of our method is on par with the best previously reported results obtained considering the crystal environment. We attribute this success to the high accuracy of the new ICM force field achieved by meticulous parameterization, to the optimized solvent model, and to the efficiency of the search method.
Affiliation(s)
- Yelena A Arnautova
- Molsoft LLC, 3366 North Torrey Pines Court, Suite 300, La Jolla, California 92037, USA
68
Pinson JA, Schmidt-Kittler O, Zhu J, Jennings IG, Kinzler KW, Vogelstein B, Chalmers DK, Thompson PE. Thiazolidinedione-based PI3Kα inhibitors: an analysis of biochemical and virtual screening methods. ChemMedChem 2011; 6:514-22. [PMID: 21360822 PMCID: PMC3187668 DOI: 10.1002/cmdc.201000467] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/29/2010] [Revised: 11/29/2010] [Indexed: 12/27/2022]
Abstract
A series of synthesized and commercially available compounds were assessed against PI3Kα for in vitro inhibitory activity and the results compared to binding calculated in silico. Using published crystal structures of PI3Kγ and PI3Kδ co-crystallized with inhibitors as a template, docking was able to identify the majority of potent inhibitors from a decoy set of 1000 compounds. On the other hand, PI3Kα in the apo-form, modeled by induced fit docking, or built as a homology model gave only poor results. A PI3Kα homology model derived from a ligand-bound PI3Kδ crystal structure was developed that has a good ability to identify active compounds. The docking results identified binding poses for active compounds that differ from those identified to date and can contribute to our understanding of structure-activity relationships for PI3K inhibitors.
Affiliation(s)
- Jo-Anne Pinson
- Medicinal Chemistry & Drug Action, Monash Institute of Pharmaceutical Sciences, Parkville, Victoria 3052, Australia
|
69
|
Abstract
Antibodies are one of the critical molecules of our immune system and are unique in their enormous diversity required for recognizing various antigens. Antibodies are protein molecules and their antigen interacting region, the fragment variable (Fv), is typically composed of a light (VL) and heavy (VH) chain. In particular, three loops each at the tip of the VL and the VH, known as the complementarity determining region (CDR) loops, are responsible for binding to the antigen. While the framework regions of the VL and VH are relatively constant across the entire repertoire of antibodies, the conformation of the CDR loops varies extensively to enable the antibody to recognize different antigens. Three-dimensional structures of antibodies illustrating the VL-VH relative orientation and the CDR conformations are needed to gain insight into antibody stability, immunogenicity, and antibody-antigen interactions. Computational modeling provides a fast and inexpensive route for generating antibody structural models. This chapter highlights the various features crucial for creating a successful antibody homology model.
Affiliation(s)
- Aroop Sircar
- EMD Serono Research Center, Inc., Billerica, MA, USA.
|
70
|
Abstract
Loop modeling is crucial for high-quality homology model construction outside conserved secondary structure elements. Dozens of loop modeling protocols involving a range of database and ab initio search algorithms and a variety of scoring functions have been proposed. Knowledge-based loop modeling methods are very fast and some can successfully and reliably predict loops up to about eight residues long. Several recent ab initio loop simulation methods can be used to construct accurate models of loops up to 12-13 residues long, albeit at a substantial computational cost. Major current challenges are the simulations of loops longer than 12-13 residues, the modeling of multiple interacting flexible loops, and the sensitivity of the loop predictions to the accuracy of the loop environment.
|
71
|
Lee J, Lee D, Park H, Coutsias EA, Seok C. Protein loop modeling by using fragment assembly and analytical loop closure. Proteins 2010; 78:3428-36. [PMID: 20872556 PMCID: PMC2976774 DOI: 10.1002/prot.22849] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 05/19/2010] [Revised: 07/16/2010] [Accepted: 07/31/2010] [Indexed: 12/27/2022]
Abstract
Protein loops are often involved in important biological functions such as molecular recognition, signal transduction, or enzymatic action. The three dimensional structures of loops can provide essential information for understanding molecular mechanisms behind protein functions. In this article, we develop a novel method for protein loop modeling, where the loop conformations are generated by fragment assembly and analytical loop closure. The fragment assembly method reduces the conformational space drastically, and the analytical loop closure method finds the geometrically consistent loop conformations efficiently. We also derive an analytic formula for the gradient of any analytical function of dihedral angles in the space of closed loops. The gradient can be used to optimize various restraints derived from experiments or databases, for example restraints for preferential interactions between specific residues or for preferred backbone angles. We demonstrate that the current loop modeling method outperforms previous methods that employ residue-based torsion angle maps or different loop closure strategies when tested on two sets of loop targets of lengths ranging from 4 to 12.
Affiliation(s)
- Julian Lee
- Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea
- Dongseon Lee
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
- Hahnbeom Park
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
- Evangelos A. Coutsias
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
|
72
|
Sellers BD, Nilmeier JP, Jacobson MP. Antibodies as a model system for comparative model refinement. Proteins 2010; 78:2490-505. [PMID: 20602354 DOI: 10.1002/prot.22757] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022]
Abstract
Predicting the conformations of loops is a critical aspect of protein comparative (homology) modeling. Despite considerable advances in developing loop prediction algorithms, refining loops in homology models remains challenging. In this work, we use antibodies as a model system to investigate strategies for more robustly predicting loop conformations when the protein model contains errors in the conformations of side chains and protein backbone surrounding the loop in question. Specifically, our test system consists of partial models of antibodies in which the "scaffold" (i.e., the portion other than the complementarity determining region, CDR, loops) retains native backbone conformation, whereas the CDR loops are predicted using a combination of knowledge-based modeling (H1, H2, L1, L2, and L3) and ab initio loop prediction (H3). H3 is the most variable of the CDRs. Using a previously published method, a test set of 10 shorter H3 loops (5-7 residues) are predicted to an average backbone (N-Cα-C-O) RMSD of 2.7 Å while 11 longer loops (8-9 residues) are predicted to 5.1 Å, thus recapitulating the difficulties in refining loops in models. By contrast, in control calculations predicting the same loops in crystal structures, the same method reconstructs the loops to an average of 0.5 and 1.4 Å for the shorter and longer loops, respectively. We modify the loop prediction method to improve the ability to sample near-native loop conformations in the models, primarily by reducing the sensitivity of the sampling to the loop surroundings, and allowing the other CDR loops to optimize with the H3 loop. The new method improves the average accuracy significantly to 1.3 Å RMSD and 3.1 Å RMSD for the shorter and longer loops, respectively. Finally, we present results predicting 8-10 residue loops within complete comparative models of five nonantibody proteins. While anecdotal, these mixed, full-model results suggest our approach is a promising step toward more accurately predicting loops in homology models. Furthermore, while significant challenges remain, our method is a potentially useful tool for predicting antibody structures based on a known Fv scaffold.
Affiliation(s)
- Benjamin D Sellers
- Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94158-2517, USA
|
73
|
Geitmann M, Retra K, de Kloe GE, Homan E, Smit AB, de Esch IJP, Danielson UH. Interaction Kinetic and Structural Dynamic Analysis of Ligand Binding to Acetylcholine-Binding Protein. Biochemistry 2010; 49:8143-54. [DOI: 10.1021/bi1006354] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Affiliation(s)
- Kim Retra
- Leiden/Amsterdam Center for Drug Research (LACDR), Division of BioMolecular Analysis, Department of Chemistry and Pharmaceutical Sciences, Faculty of Sciences, VU University, Amsterdam, The Netherlands
- Gerdien E. de Kloe
- Leiden/Amsterdam Center for Drug Research (LACDR), Division of Medicinal Chemistry, Department of Chemistry and Pharmaceutical Sciences, Faculty of Sciences, VU University, Amsterdam, The Netherlands
- Evert Homan
- Beactica AB, Box 567, SE-751 22 Uppsala, Sweden
- August B. Smit
- Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Neuroscience Campus Amsterdam, VU University, Amsterdam, The Netherlands
- Iwan J. P. de Esch
- Leiden/Amsterdam Center for Drug Research (LACDR), Division of Medicinal Chemistry, Department of Chemistry and Pharmaceutical Sciences, Faculty of Sciences, VU University, Amsterdam, The Netherlands
- U. Helena Danielson
- Beactica AB, Box 567, SE-751 22 Uppsala, Sweden
- Department of Biochemistry and Organic Chemistry, Uppsala University, BMC, Box 576, SE-751 23 Uppsala, Sweden
|
74
|
Li Y, Rata I, Chiu SW, Jakobsson E. Improving predicted protein loop structure ranking using a Pareto-optimality consensus method. BMC Structural Biology 2010; 10:22. [PMID: 20642859 PMCID: PMC2914074 DOI: 10.1186/1472-6807-10-22] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/27/2009] [Accepted: 07/20/2010] [Indexed: 11/10/2022]
Abstract
Background: Accurate protein loop structure models are important to understand functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. Results: We have developed a Pareto Optimal Consensus (POC) method, which is a consensus model ranking approach to integrate multiple knowledge- or physics-based scoring functions. The procedure of identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length using a functional space composed of several carefully selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which are typically composed of ~20% or less of the overall decoys in a set, have a good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms the other popularly used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods. Conclusions: By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set.
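Step 1 of the procedure described in this abstract, identifying the decoys on the Pareto optimal front, can be illustrated with a short sketch. It assumes that every scoring function is oriented so that a lower value is better; the function names and the toy score table are hypothetical, and the fuzzy dominance ranking of step 2 is not implemented here.

```python
def dominates(a, b):
    """True if score vector a is no worse than b in every scoring
    function and strictly better in at least one (lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(scores):
    """Indices of decoys that no other decoy dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]

# Hypothetical decoy set scored by two functions (rows are decoys).
decoy_scores = [
    [1.0, 5.0],
    [2.0, 2.0],
    [5.0, 1.0],
    [3.0, 3.0],  # dominated by decoy 1
    [6.0, 6.0],  # dominated by every other decoy
]
print(pareto_front(decoy_scores))  # prints [0, 1, 2]
```

The pairwise comparison costs O(n²) per decoy set, which is adequate for a sketch; a production implementation would likely sort on one objective first.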
Affiliation(s)
- Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA.
|
75
|
Danielson ML, Lill MA. New computational method for prediction of interacting protein loop regions. Proteins 2010; 78:1748-59. [PMID: 20186974 DOI: 10.1002/prot.22690] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/26/2022]
Abstract
Flexible loop regions of proteins play a crucial role in many biological functions such as protein-ligand recognition, enzymatic catalysis, and protein-protein association. To date, most computational methods that predict the conformational states of loops only focus on individual loop regions. However, loop regions are often spatially in close proximity to one another and their mutual interactions stabilize their conformations. We have developed a new method, titled CorLps, capable of simultaneously predicting such interacting loop regions. First, an ensemble of individual loop conformations is generated for each loop region. The members of the individual ensembles are combined and are accepted or rejected based on a steric clash filter. After a subsequent side-chain optimization step, the resulting conformations of the interacting loops are ranked by the statistical scoring function DFIRE that originated from protein structure prediction. Our results show that predicting interacting loops with CorLps is superior to sequential prediction of the two interacting loop regions, and our method is comparable in accuracy to single loop predictions. Furthermore, improved predictive accuracy of the top-ranked solution is achieved for 12-residue length loop regions by diversifying the initial pool of individual loop conformations using a quality threshold clustering algorithm.
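The combination step described in this abstract (pairing members of the individual loop ensembles and keeping only pairs that pass a steric clash filter) can be sketched as follows. This is an illustration under stated assumptions, not the CorLps implementation: each conformation is reduced to a list of (x, y, z) atom positions, the 3.0 Å cutoff is an arbitrary placeholder, and the side-chain optimization and DFIRE ranking steps are omitted.

```python
import math
from itertools import product

def has_clash(conf_a, conf_b, cutoff=3.0):
    """True if any atom of conf_a lies closer than `cutoff` (Å, assumed
    value) to any atom of conf_b."""
    return any(math.dist(p, q) < cutoff for p in conf_a for q in conf_b)

def combine_loops(ensemble_a, ensemble_b, cutoff=3.0):
    """Index pairs (i, j) of conformations from the two ensembles that
    can coexist without a steric clash."""
    return [(i, j)
            for (i, a), (j, b) in product(enumerate(ensemble_a),
                                          enumerate(ensemble_b))
            if not has_clash(a, b, cutoff)]

# Hypothetical ensembles with two conformations each.
ensemble_a = [[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
              [(0.0, 5.0, 0.0), (1.5, 5.0, 0.0)]]
ensemble_b = [[(0.0, 1.0, 0.0)],
              [(0.0, 10.0, 0.0)]]
print(combine_loops(ensemble_a, ensemble_b))  # prints [(0, 1), (1, 0), (1, 1)]
```

Pair (0, 0) is rejected because two atoms sit only 1 Å apart; the remaining three pairs would move on to side-chain optimization and scoring.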
Affiliation(s)
- Matthew L Danielson
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, USA
|
76
|
Application of biasing-potential replica-exchange simulations for loop modeling and refinement of proteins in explicit solvent. Proteins 2010; 78:2809-19. [DOI: 10.1002/prot.22796] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
|
77
|
Berrondo M, Gray JJ, Schleif R. Computational predictions of the mutant behavior of AraC. J Mol Biol 2010; 398:462-70. [PMID: 20338183 DOI: 10.1016/j.jmb.2010.03.021] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/28/2009] [Revised: 02/16/2010] [Accepted: 03/11/2010] [Indexed: 11/29/2022]
Abstract
An algorithm implemented in Rosetta correctly predicts the folding capabilities of the 17-residue N-terminal arm of the AraC gene regulatory protein when arabinose is bound to the protein and the dramatically different structure of this arm when arabinose is absent. The transcriptional activity of 43 mutant AraC proteins with alterations in the arm sequences was measured in vivo and compared with their predicted folding properties. Seventeen of the mutants possessed regulatory properties that could be directly compared with folding predictions. Sixteen of the 17 mutants were correctly predicted. The algorithm predicts that the N-terminal arm sequences of AraC homologs fold to the Escherichia coli AraC arm structure. In contrast, it predicts that random sequences of the same length and many partially randomized E. coli arm sequences do not fold to the E. coli arm structure. The high level of success shows that relatively "simple" computational methods can in some cases predict the behavior of mutant proteins with good reliability.
Affiliation(s)
- Monica Berrondo
- Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218, USA
|
78
|
Nikiforovich GV, Taylor CM, Marshall GR, Baranski TJ. Modeling the possible conformations of the extracellular loops in G-protein-coupled receptors. Proteins 2010; 78:271-85. [PMID: 19731375 DOI: 10.1002/prot.22537] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 12/25/2022]
Abstract
This study presents the results of a de novo approach modeling possible conformational dynamics of the extracellular (EC) loops in G-protein-coupled receptors (GPCRs), specifically in bovine rhodopsin (bRh), squid rhodopsin (sRh), human beta-2 adrenergic receptor (beta2AR), turkey beta-1 adrenergic receptor (beta1AR), and human A2 adenosine receptor (A2AR). The approach deliberately sacrificed a detailed description of any particular 3D structure of the loops in GPCRs in favor of a less precise description of many possible structures. Despite this, the approach found ensembles of the low-energy conformers of the EC loops that contained structures close to the available X-ray snapshots. For the smaller EC1 and EC3 loops (6-11 residues), our results were comparable with the best recent results obtained by other authors using much more sophisticated techniques. For the larger EC2 loops (25-34 residues), our results consistently yielded structures significantly closer to the X-ray snapshots than the results of the other authors for loops of similar size. The results suggested possible large-scale movements of the EC loops in GPCRs that might determine their conformational dynamics. The approach was also validated by accurately reproducing the docking poses for low-molecular-weight ligands in beta2AR, beta1AR, and A2AR, demonstrating the possible influence of the conformations of the EC loops on the binding sites of ligands. The approach correctly predicted the system of disulfide bridges between the EC loops in A2AR and elucidated the probable pathways for forming this system.
|
79
|
Tyagi M, Bornot A, Offmann B, de Brevern AG. Analysis of loop boundaries using different local structure assignment methods. Protein Sci 2009; 18:1869-81. [PMID: 19606500 DOI: 10.1002/pro.198] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022]
Abstract
Loops connect regular secondary structures. In many instances, they are known to play important biological roles. Analysis and prediction of loop conformations depend directly on the definition of repetitive structures. Nonetheless, the secondary structure assignment methods (SSAMs) often lead to divergent assignments. In this study, we analyzed, from both the structure and sequence points of view, how the divergence between different SSAMs affects boundary definitions of loops connecting regular secondary structures. The analysis of SSAMs underlines that no clear consensus between the different SSAMs can be easily found. Because the SSAMs greatly influence the loop boundary definitions, important variations are indeed observed, that is, capping positions are shifted between different SSAMs. On the other hand, our results show that the sequence information in these capping regions is more stable than expected, and classical and equivalent sequence patterns were found for most of the SSAMs. This is, to our knowledge, the most exhaustive survey in this field, as (i) various databanks have been used, leading to similar results without implication of protein redundancy, and (ii) various SSAMs have been used for the first time. This work hence gives new insights into the difficult question of assignment of repetitive structures and addresses the issue of loop boundary definition. Although SSAMs give very different local structure assignments, capping sequence patterns remain stable.
Affiliation(s)
- Manoj Tyagi
- Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, BP 7151, 15 avenue René Cassin, 97715 Saint Denis Messag Cedex 09, La Réunion, France
|
80
|
Bornot A, Etchebest C, de Brevern AG. A new prediction strategy for long local protein structures using an original description. Proteins 2009; 76:570-87. [PMID: 19241475 DOI: 10.1002/prot.22370] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/10/2022]
Abstract
A relevant and accurate description of three-dimensional (3D) protein structures can be achieved by characterizing recurrent local structures. In a previous study, we developed a library of 120 3D structural prototypes encompassing all known 11-residue-long local protein structures and ensuring a good quality of structural approximation. A local structure prediction method was also proposed. Here, overlapping properties of local protein structures in global ones are taken into account to characterize frequent local networks. At the same time, we propose a new long local structure prediction strategy which involves the use of evolutionary information coupled with Support Vector Machines (SVMs). Our prediction is evaluated by a stringent geometrical assessment. Every local structure prediction with a Cα RMSD less than 2.5 Å from the true local structure is considered correct. A global prediction rate of 63.1% is then reached, corresponding to an improvement of 7.7 points compared with the previous strategy. In the same way, the prediction of 88.33% of the 120 structural classes is improved, with an 8.65% mean gain. 85.33% of proteins have better prediction results, with a 9.43% average gain. An analysis of prediction rate per local network also supports the global improvement and gives insights into the potential of our method for predicting super local structures. Moreover, a confidence index for the direct estimation of prediction quality is proposed. Finally, our method is shown to be very competitive with cutting-edge strategies encompassing three categories of local structure predictions.
Affiliation(s)
- Aurélie Bornot
- INSERM UMR-S, Université Paris Diderot, Institut National de la Transfusion Sanguine, France.
|
81
|
Liu P, Zhu F, Rassokhin DN, Agrafiotis DK. A self-organizing algorithm for modeling protein loops. PLoS Comput Biol 2009; 5:e1000478. [PMID: 19696883 PMCID: PMC2719875 DOI: 10.1371/journal.pcbi.1000478] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 03/18/2009] [Accepted: 07/20/2009] [Indexed: 11/19/2022] Open
Abstract
Protein loops, the flexible short segments connecting two stable secondary structural units in proteins, play a critical role in protein structure and function. Constructing chemically sensible conformations of protein loops that seamlessly bridge the gap between the anchor points without introducing any steric collisions remains an open challenge. A variety of algorithms have been developed to tackle the loop closure problem, ranging from inverse kinematics to knowledge-based approaches that utilize pre-existing fragments extracted from known protein structures. However, many of these approaches focus on the generation of conformations that mainly satisfy the fixed end point condition, leaving the steric constraints to be resolved in subsequent post-processing steps. In the present work, we describe a simple solution that simultaneously satisfies not only the end point and steric conditions, but also chirality and planarity constraints. Starting from random initial atomic coordinates, each individual conformation is generated independently by using a simple alternating scheme of pairwise distance adjustments of randomly chosen atoms, followed by fast geometric matching of the conformationally rigid components of the constituent amino acids. The method is conceptually simple, numerically stable and computationally efficient. Very importantly, additional constraints, such as those derived from NMR experiments, hydrogen bonds or salt bridges, can be incorporated into the algorithm in a straightforward and inexpensive way, making the method ideal for solving more complex multi-loop problems. The remarkable performance and robustness of the algorithm are demonstrated on a set of protein loops of length 4, 8, and 12 that have been used in previous studies.
Protein loops play an important role in protein function, such as ligand binding, recognition, and allosteric regulation. However, due to their flexibility, it is notoriously difficult to determine their 3D structures using traditional experimental techniques. As a result, one can often find protein structures with missing loops in the Protein Data Bank. Their sequence variability also presents a particular challenge for homology modeling methods, which can only yield good overall structures given sufficient sequence identity and good experimental reference structures. Despite extensive research, the construction of protein loop 3D structures remains an open problem, since a sensible conformation should seamlessly bridge the anchor points without introducing steric clashes within the loop itself or between the loop and its surrounding environment. Here, we present a conceptually simple, mathematically straightforward, numerically robust and computationally efficient approach for building protein loop conformations that simultaneously satisfy end-point, steric, planar and chiral constraints. More importantly, additional constraints derived from experimental sources can be incorporated in a straightforward manner, allowing the processing of more complex structures involving multiple interlocking loops.
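The "pairwise distance adjustments of randomly chosen atoms" mentioned in this abstract can be illustrated with a small sketch. It is a simplified stand-in, not the authors' algorithm: the constraint format `(i, j, lower, upper)`, the function names, the 3.8 Å and 4.0 Å values and the five-atom toy chain are all assumptions made for illustration, and the chirality, planarity and rigid-fragment matching steps are left out.

```python
import math
import random

def adjust_pair(coords, i, j, lower, upper, fixed=frozenset(), rate=1.0):
    """Move atoms i and j along their connecting line so that their
    distance returns to [lower, upper]. Atoms in `fixed` never move."""
    d = math.dist(coords[i], coords[j])
    target = min(max(d, lower), upper)  # nearest allowed distance
    movable = [a for a in (i, j) if a not in fixed]
    if d == 0.0 or d == target or not movable:
        return
    # Split the correction between the atoms that are allowed to move.
    share = rate * (target - d) / d / len(movable)
    for k in range(3):
        step = share * (coords[j][k] - coords[i][k])
        if i not in fixed:
            coords[i][k] -= step
        if j not in fixed:
            coords[j][k] += step

def self_organize(coords, constraints, fixed=frozenset(), steps=20000, seed=0):
    """Repeatedly pick a random distance constraint and enforce it."""
    rng = random.Random(seed)
    for _ in range(steps):
        i, j, lower, upper = rng.choice(constraints)
        adjust_pair(coords, i, j, lower, upper, fixed)
    return coords

# Toy loop: five pseudo-atoms, atoms 0 and 4 act as fixed anchor points.
rng = random.Random(1)
coords = ([[0.0, 0.0, 0.0]]
          + [[rng.uniform(0.0, 10.0) for _ in range(3)] for _ in range(3)]
          + [[9.0, 0.0, 0.0]])
bonds = [(i, i + 1, 3.8, 3.8) for i in range(4)]          # exact spacing
contacts = [(i, j, 4.0, math.inf)                          # steric floor
            for i in range(5) for j in range(i + 2, 5)]
self_organize(coords, bonds + contacts, fixed={0, 4})
```

Keeping the anchors in `fixed` is one simple way to express the end point condition: the free atoms absorb the whole correction whenever a constraint involves an anchor.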
Affiliation(s)
- Pu Liu
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- Fangqiang Zhu
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- Dmitrii N. Rassokhin
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- Dimitris K. Agrafiotis
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
|
82
|
Tyagi M, Bornot A, Offmann B, de Brevern AG. Protein short loop prediction in terms of a structural alphabet. Comput Biol Chem 2009; 33:329-33. [PMID: 19625218 DOI: 10.1016/j.compbiolchem.2009.06.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 12/12/2008] [Revised: 06/17/2009] [Accepted: 06/17/2009] [Indexed: 11/20/2022]
Abstract
Loops connect regular secondary structures. In many instances, they are known to play crucial biological roles. To bypass the limitation of secondary structure description, we previously defined a structural alphabet composed of 16 structural prototypes, called Protein Blocks (PBs). It leads to an accurate description of every region of 3D protein backbones and has been used in local structure prediction. In the present study, we used our structural alphabet to predict the loops connecting two repetitive structures. We show the benefit of taking the flanking regions into account, which improves the prediction rate by up to 19.8%, but we also underline the sensitivity of such an approach. This research can be used to propose different structures for the loops and to probe and sample their flexibility. It is a useful tool for ab initio loop prediction and leads to insights into flexible docking approaches.
Affiliation(s)
- Manoj Tyagi
- Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, BP 7151, 15 avenue René Cassin, 97715 Saint Denis Messag Cedex 09, La Réunion, France
|
83
|
Sircar A, Kim ET, Gray JJ. RosettaAntibody: antibody variable region homology modeling server. Nucleic Acids Res 2009; 37:W474-9. [PMID: 19458157 PMCID: PMC2703951 DOI: 10.1093/nar/gkp387] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.9] [Indexed: 01/22/2023] Open
Abstract
The RosettaAntibody server (http://antibody.graylab.jhu.edu) predicts the structure of an antibody variable region given the amino-acid sequences of the respective light and heavy chains. In an initial stage, the server identifies and displays the most sequence homologous template structures for the light and heavy framework regions and each of the complementarity determining region (CDR) loops. Subsequently, the most homologous templates are assembled into a side-chain optimized crude model, and the server returns a picture and coordinate file. For users requesting a high-resolution model, the server executes the full RosettaAntibody protocol which additionally models the hyper-variable CDR H3 loop. The high-resolution protocol also relieves steric clashes by optimizing the CDR backbone torsion angles and by simultaneously perturbing the relative orientation of the light and heavy chains. RosettaAntibody generates 2000 independent structures, and the server returns pictures, coordinate files, and detailed scoring information for the 10 top-scoring models. The 10 models enable users to use rational judgment in choosing the best model or to use the set as an ensemble for further studies such as docking. The high-resolution models generated by RosettaAntibody have been used for the successful prediction of antibody–antigen complex structures.
Affiliation(s)
- Aroop Sircar
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218, USA
|
84
|
Hildebrand PW, Goede A, Bauer RA, Gruening B, Ismer J, Michalsky E, Preissner R. SuperLooper--a prediction server for the modeling of loops in globular and membrane proteins. Nucleic Acids Res 2009; 37:W571-4. [PMID: 19429894 PMCID: PMC2703960 DOI: 10.1093/nar/gkp338] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 11/29/2022] Open
Abstract
SuperLooper provides the first online interface for the automatic, quick and interactive search and placement of loops in proteins (LIP). A database containing half a billion segments of water-soluble proteins with lengths up to 35 residues can be screened for candidate loops. Alternatively, a dedicated database containing 180 000 membrane loops in proteins (LIMP) can be searched. Loop candidates are scored based on sequence criteria and the root mean square deviation (RMSD) of the stem atoms. Searching LIP, the average global RMSD of the respective top-ranked loops to the original loops is benchmarked to be <2 Å for loops of up to six residues, or <3 Å for loops shorter than 10 residues. Other suitable conformations may be selected and directly visualized on the web server from a top-50 list. For user guidance, the sequence homology between the template and the original sequence, proline or glycine exchanges, and close contacts between a loop candidate and the remainder of the protein are denoted. For membrane proteins, the expansions of the lipid bilayer are automatically modeled using the TMDET algorithm. This allows the user to select the optimal membrane protein loop concerning its relative orientation to the lipid bilayer. The server has been online since October 2007 and can be freely accessed at URL: http://bioinformatics.charite.de/superlooper/
Affiliation(s)
- Peter W Hildebrand
- Institute of Medical Physics and Biophysics, Charité, University of Medicine, Berlin, Germany.
|
85
|
Stroganov OV, Novikov FN, Stroylov VS, Kulkov V, Chilov GG. Lead finder: an approach to improve accuracy of protein-ligand docking, binding energy estimation, and virtual screening. J Chem Inf Model 2009; 48:2371-85. [PMID: 19007114 DOI: 10.1021/ci800166p] [Citation(s) in RCA: 146] [Impact Index Per Article: 9.7] [Indexed: 11/30/2022]
Abstract
An innovative molecular docking algorithm and three specialized high accuracy scoring functions are introduced in the Lead Finder docking software. Lead Finder's algorithm for ligand docking combines the classical genetic algorithm with various local optimization procedures and resourceful exploitation of the knowledge generated during the docking process. Lead Finder's scoring functions are based on a molecular mechanics functional which explicitly accounts for different types of energy contributions scaled with empiric coefficients to produce three scoring functions tailored for (a) accurate binding energy predictions; (b) correct energy-ranking of docked ligand poses; and (c) correct rank-ordering of active and inactive compounds in virtual screening experiments. The predicted values of the free energy of protein-ligand binding were benchmarked against a set of experimentally measured binding energies for 330 diverse protein-ligand complexes yielding rmsd of 1.50 kcal/mol. The accuracy of ligand docking was assessed on a set of 407 structures, which included almost all published test sets of the following programs: FlexX, Glide SP, Glide XP, Gold, LigandFit, MolDock, and Surflex. An rmsd of 2 Å or less was observed for 80-96% of the structures in the test sets (80.0% on the Glide XP and FlexX test sets, 96.0% on the Surflex and MolDock test sets). The ability of Lead Finder to distinguish between active and inactive compounds during virtual screening experiments was benchmarked against 34 therapeutically relevant protein targets. Impressive enrichment factors were obtained for almost all of the targets with the average area under receiver operator curve being equal to 0.92.
Affiliation(s)
- Oleg V Stroganov
- MolTech Ltd., Leninskie gory, 1/75A, Moscow 119992, Russian Federation, and BioMolTech Corp., 226 York Mills Road, Toronto, Ontario M2L 1L1, Canada
|
86
|
Sivasubramanian A, Sircar A, Chaudhury S, Gray JJ. Toward high-resolution homology modeling of antibody Fv regions and application to antibody-antigen docking. Proteins 2009; 74:497-514. [PMID: 19062174 DOI: 10.1002/prot.22309] [Citation(s) in RCA: 145] [Impact Index Per Article: 9.7] [Indexed: 12/27/2022]
Abstract
High-resolution homology models are useful in structure-based protein engineering applications, especially when a crystallographic structure is unavailable. Here, we report the development and implementation of RosettaAntibody, a protocol for homology modeling of antibody variable regions. The protocol combines comparative modeling of canonical complementarity determining region (CDR) loop conformations and de novo loop modeling of CDR H3 conformation with simultaneous optimization of V(L)-V(H) rigid-body orientation and CDR backbone and side-chain conformations. The protocol was tested on a benchmark of 54 antibody crystal structures. The median root mean square deviation (rmsd) of the antigen binding pocket comprised of all the CDR residues was 1.5 Å with 80% of the targets having an rmsd lower than 2.0 Å. The median backbone heavy atom global rmsd of the CDR H3 loop prediction was 1.6, 1.9, 2.4, 3.1, and 6.0 Å for very short (4-6 residues), short (7-9), medium (10-11), long (12-14) and very long (17-22) loops, respectively. When the set of ten top-scoring antibody homology models are used in local ensemble docking to antigen, a moderate-to-high accuracy docking prediction was achieved in seven of fifteen targets. This success in computational docking with high-resolution homology models is encouraging, but challenges still remain in modeling antibody structures for sequences with long H3 loops. This first large-scale antibody-antigen docking study using homology models reveals the level of "functional accuracy" of these structural models toward protein engineering applications.
Affiliation(s)
- Arvind Sivasubramanian
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, USA
|
87
|
Cui M, Mezei M, Osman R. Prediction of protein loop structures using a local move Monte Carlo approach and a grid-based force field. Protein Eng Des Sel 2008; 21:729-35. [PMID: 18957407 PMCID: PMC2597363 DOI: 10.1093/protein/gzn056] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 05/14/2008] [Revised: 09/18/2008] [Accepted: 09/23/2008] [Indexed: 11/14/2022]
Abstract
We have developed an improved local move Monte Carlo (LMMC) loop sampling approach for loop predictions. The method generates loop conformations based on simple moves of the torsion angles of side chains and local moves of backbone of loops. To reduce the computational costs for energy evaluations, we developed a grid-based force field to represent the protein environment and solvation effect. Simulated annealing has been used to enhance the efficiency of the LMMC loop sampling and identify low-energy loop conformations. The prediction quality is evaluated on a set of protein loops with known crystal structure that has been previously used by others to test different loop prediction methods. The results show that this approach can reproduce the experimental results with the root mean square deviation within 1.8 Å for all the test cases. The LMMC loop prediction approach developed here could be useful for improvement in the quality of the loop regions in homology models, flexible protein-ligand and protein-protein docking studies.
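The combination of Monte Carlo moves with simulated annealing mentioned above can be sketched generically. This is an illustration of the Metropolis criterion under a cooling schedule, not the LMMC method itself: the geometric schedule, the temperature range, the function names, and the one-dimensional toy "energy" are all assumptions made for the example, and no local backbone move set or grid-based force field is implemented.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def metropolis_accept(delta_e, temperature, rng=random):
    """Accept downhill moves always; accept uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / (K_B * temperature))


def anneal(energy, propose, state, t_start=1000.0, t_end=300.0, steps=1000, seed=0):
    """Simulated annealing with a geometric cooling schedule (hypothetical choice).

    `energy(state)` returns an energy in kcal/mol and `propose(state, rng)`
    returns a perturbed copy of the state. Returns the lowest-energy state seen.
    """
    rng = random.Random(seed)
    current_e = energy(state)
    best, best_e = state, current_e
    for i in range(steps):
        # Interpolate the temperature geometrically from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = propose(state, rng)
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - current_e, temperature, rng):
            state, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = state, current_e
    return best, best_e


if __name__ == "__main__":
    # Toy stand-in for a torsion angle with a single energy minimum at 2.0.
    toy_energy = lambda x: (x - 2.0) ** 2
    toy_move = lambda x, rng: x + rng.uniform(-0.5, 0.5)
    print(anneal(toy_energy, toy_move, 10.0))
```

In a loop-sampling setting the state would hold the loop torsion angles and `propose` would apply a local move that keeps the loop closed; both are left out here.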
Affiliation(s)
- Meng Cui
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
- Department of Physiology and Biophysics, Virginia Commonwealth University, 1101 East Marshall Street, PO Box 980551, Richmond, VA 23298, USA
- Mihaly Mezei
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
- Roman Osman
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, NYU, Box 1218, New York, NY 10029
|
88
|
Sellers BD, Zhu K, Zhao S, Friesner RA, Jacobson MP. Toward better refinement of comparative models: predicting loops in inexact environments. Proteins 2008; 72:959-71. [PMID: 18300241 DOI: 10.1002/prot.21990] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Indexed: 11/12/2022]
Abstract
Achieving atomic-level accuracy in comparative protein models is limited by our ability to refine the initial, homolog-derived model closer to the native state. Despite considerable effort, progress in developing a generalized refinement method has been limited. In contrast, methods have been described that can accurately reconstruct loop conformations in native protein structures. We hypothesize that loop refinement in homology models is much more difficult than loop reconstruction in crystal structures, in part, because side-chain, backbone, and other structural inaccuracies surrounding the loop create a challenging sampling problem; the loop cannot be refined without simultaneously refining adjacent portions. In this work, we single out one sampling issue in an artificial but useful test set and examine how loop refinement accuracy is affected by errors in surrounding side-chains. In 80 high-resolution crystal structures, we first perturbed 6-12 residue loops away from the crystal conformation, and placed all protein side chains in non-native but low energy conformations. Even these relatively small perturbations in the surroundings made the loop prediction problem much more challenging. Using a previously published loop prediction method, median backbone (N-Cα-C-O) RMSD's for groups of 6, 8, 10, and 12 residue loops are 0.3/0.6/0.4/0.6 Å, respectively, on native structures and increase to 1.1/2.2/1.5/2.3 Å on the perturbed cases. We then augmented our previous loop prediction method to simultaneously optimize the rotamer states of side chains surrounding the loop. Our results show that this augmented loop prediction method can recover the native state in many perturbed structures where the previous method failed; the median RMSD's for the 6, 8, 10, and 12 residue perturbed loops improve to 0.4/0.8/1.1/1.2 Å. Finally, we highlight three comparative models from blind tests, in which our new method predicted loops closer to the native conformation than first modeled using the homolog template, a task generally understood to be difficult. Although many challenges remain in refining full comparative models to high accuracy, this work offers a methodical step toward that goal.
Affiliation(s)
- Benjamin D Sellers
- Graduate Group in Biophysics, University of California, San Francisco, California 94158-2517, USA
|
89
|
Olson MA, Feig M, Brooks CL. Prediction of protein loop conformations using multiscale modeling methods with physical energy scoring functions. J Comput Chem 2008; 29:820-31. [PMID: 17876760 DOI: 10.1002/jcc.20827] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]
Abstract
This article examines ab initio methods for the prediction of protein loops by a computational strategy of multiscale conformational sampling and physical energy scoring functions. Our approach consists of initial sampling of loop conformations from lattice-based low-resolution models followed by refinement using all-atom simulations. To allow enhanced conformational sampling, the replica exchange method was implemented. Physical energy functions based on CHARMM19 and CHARMM22 parameterizations with generalized Born (GB) solvent models were applied in scoring loop conformations extracted from the lattice simulations and, in the case of all-atom simulations, the ensemble of conformations were generated and scored with these models. Predictions are reported for 25 loop segments, each eight residues long and taken from a diverse set of 22 protein structures. We find that the simulations generally sampled conformations with low global root-mean-square-deviation (RMSD) for loop backbone coordinates from the known structures, whereas clustering conformations in RMSD space and scoring detected less favorable loop structures. Specifically, the lattice simulations sampled basins that exhibited an average global RMSD of 2.21 ± 1.42 Å, whereas clustering and scoring the loop conformations determined an RMSD of 3.72 ± 1.91 Å. Using CHARMM19/GB to refine the lattice conformations improved the sampling RMSD to 1.57 ± 0.98 Å and detection to 2.58 ± 1.48 Å. We found that further improvement could be gained from extending the upper temperature in the all-atom refinement from 400 to 800 K, where the results typically yield a reduction of approximately 1 Å or greater in the RMSD of the detected loop. Overall, CHARMM19 with a simple pairwise GB solvent model is more efficient at sampling low-RMSD loop basins than CHARMM22 with a higher-resolution modified analytical GB model; however, the latter simulation method provides a more accurate description of the all-atom energy surface, yet demands a much greater computational cost.
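For readers unfamiliar with the replica exchange method named in this abstract, the standard swap test between two temperature replicas can be written in a few lines. This is the textbook acceptance rule, min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/(k_B T); it is not taken from the paper, and the function name, energies, and temperatures below are illustrative assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(e_i, t_i, e_j, t_j):
    """Probability of swapping configurations between replicas i and j.

    e_i, e_j are potential energies (kcal/mol) of the configurations currently
    held at temperatures t_i, t_j (K).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    if exponent >= 0.0:
        return 1.0  # the swap is always accepted; also avoids overflow in exp
    return math.exp(exponent)


if __name__ == "__main__":
    # Hot replica holds the lower-energy configuration: swap always accepted.
    print(exchange_probability(-90.0, 400.0, -100.0, 800.0))
    # Cold replica already holds the lower-energy configuration: swap is rare.
    print(exchange_probability(-100.0, 400.0, -90.0, 800.0))
```

Raising the top temperature, as the abstract reports doing, widens the range of configurations the hottest replica can reach, at the cost of needing more replicas to keep neighbouring swaps likely.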
Affiliation(s)
- Mark A Olson
- Department of Cell Biology and Biochemistry, U.S. Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702, USA.
|
90
|
Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics 2008; Chapter 5:Unit-5.6. [PMID: 18428767 DOI: 10.1002/0471250953.bi0506s15] [Citation(s) in RCA: 1775] [Impact Index Per Article: 110.9] [Indexed: 12/28/2022]
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Narayanan Eswar
- University of California at San Francisco San Francisco, California
- Ben Webb
- University of California at San Francisco San Francisco, California
- M S Madhusudhan
- University of California at San Francisco San Francisco, California
- David Eramian
- University of California at San Francisco San Francisco, California
- Min-Yi Shen
- University of California at San Francisco San Francisco, California
- Ursula Pieper
- University of California at San Francisco San Francisco, California
- Andrej Sali
- University of California at San Francisco San Francisco, California
|
91
|
Felts AK, Gallicchio E, Chekmarev D, Paris KA, Friesner RA, Levy RM. Prediction of Protein Loop Conformations using the AGBNP Implicit Solvent Model and Torsion Angle Sampling. J Chem Theory Comput 2008; 4:855-868. [PMID: 18787648 DOI: 10.1021/ct800051k] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
The OPLS-AA all-atom force field and the Analytical Generalized Born plus Non-Polar (AGBNP) implicit solvent model, in conjunction with torsion angle conformational search protocols based on the Protein Local Optimization Program (PLOP), are shown to be effective in predicting the native conformations of 57 9-residue and 35 13-residue loops of a diverse series of proteins with low sequence identity. The novel nonpolar solvation free energy estimator implemented in AGBNP augmented by correction terms aimed at reducing the occurrence of ion pairing are important to achieve the best prediction accuracy. Extended versions of the previously developed PLOP-based conformational search schemes based on calculations in the crystal environment are reported that are suitable for application to loop homology modeling without the crystal environment. Our results suggest that in general the loop backbone conformation is not strongly influenced by crystal packing. The application of the temperature Replica Exchange Molecular Dynamics (T-REMD) sampling method for a few examples where PLOP sampling is insufficient are also reported. The results reported indicate that the OPLS-AA/AGBNP effective potential is suitable for high-resolution modeling of proteins in the final stages of homology modeling and/or protein crystallographic refinement.
Affiliation(s)
- Anthony K Felts
- Department of Chemistry and Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey 08854
|
92
|
Abstract
We describe a fast and accurate protocol, LoopBuilder, for the prediction of loop conformations in proteins. The procedure includes extensive sampling of backbone conformations, side chain addition, the use of a statistical potential to select a subset of these conformations, and, finally, an energy minimization and ranking with an all-atom force field. We find that the Direct Tweak algorithm used in the previously developed LOOPY program is successful in generating an ensemble of conformations that on average are closer to the native conformation than those generated by other methods. An important feature of Direct Tweak is that it checks for interactions between the loop and the rest of the protein during the loop closure process. DFIRE is found to be a particularly effective statistical potential that can bias conformation space toward conformations that are close to the native structure. Its application as a filter prior to a full molecular mechanics energy minimization both improves prediction accuracy and offers a significant savings in computer time. Final scoring is based on the OPLS/SBG-NP force field implemented in the PLOP program. The approach is also shown to be quite successful in predicting loop conformations for cases where the native side chain conformations are assumed to be unknown, suggesting that it will prove effective in real homology modeling applications. Proteins 2008. © 2007 Wiley-Liss, Inc.
Affiliation(s)
- Cinque S Soto
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
93
|
Knight JL, Zhou Z, Gallicchio E, Himmel DM, Friesner RA, Arnold E, Levy RM. Exploring structural variability in X-ray crystallographic models using protein local optimization by torsion-angle sampling. Acta Crystallogr D Biol Crystallogr 2008; 64:383-96. [PMID: 18391405 PMCID: PMC2631124 DOI: 10.1107/s090744490800070x] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 11/05/2007] [Accepted: 01/08/2008] [Indexed: 11/10/2022]
Abstract
Modeling structural variability is critical for understanding protein function and for modeling reliable targets for in silico docking experiments. Because of the time-intensive nature of manual X-ray crystallographic refinement, automated refinement methods that thoroughly explore conformational space are essential for the systematic construction of structurally variable models. Using five proteins spanning resolutions of 1.0-2.8 Å, it is demonstrated how torsion-angle sampling of backbone and side-chain libraries with filtering against both the chemical energy, using a modern effective potential, and the electron density, coupled with minimization of a reciprocal-space X-ray target function, can generate multiple structurally variable models which fit the X-ray data well. Torsion-angle sampling as implemented in the Protein Local Optimization Program (PLOP) has been used in this work. Models with the lowest R(free) values are obtained when electrostatic and implicit solvation terms are included in the effective potential. HIV-1 protease, calmodulin and SUMO-conjugating enzyme illustrate how variability in the ensemble of structures captures structural variability that is observed across multiple crystal structures and is linked to functional flexibility at hinge regions and binding interfaces. An ensemble-refinement procedure is proposed to differentiate between variability that is a consequence of physical conformational heterogeneity and that which reflects uncertainty in the atomic coordinates.
Affiliation(s)
- Emilio Gallicchio
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Daniel M. Himmel
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Eddy Arnold
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Ronald M. Levy
- Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
94
|
Dong Q, Wang X, Lin L, Wang Y. Analysis and prediction of protein local structure based on structure alphabets. Proteins 2008; 72:163-72. [DOI: 10.1002/prot.21904] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/25/2022]
|
95
|
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. This chapter presents an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of similar protocols has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
Affiliation(s)
- Narayanan Eswar
- Department of Biopharmaceutical Sciences and California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, CA, USA
|
96
|
Zhu K, Shirts MR, Friesner RA. Improved Methods for Side Chain and Loop Predictions via the Protein Local Optimization Program: Variable Dielectric Model for Implicitly Improving the Treatment of Polarization Effects. J Chem Theory Comput 2007; 3:2108-19. [DOI: 10.1021/ct700166f] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.3] [Indexed: 11/28/2022]
Affiliation(s)
- Kai Zhu
- Department of Chemistry, Columbia University, New York, New York 10027
- Michael R. Shirts
- Department of Chemistry, Columbia University, New York, New York 10027
|
97
|
Li X, Jacobson MP, Zhu K, Zhao S, Friesner RA. Assignment of polar states for protein amino acid residues using an interaction cluster decomposition algorithm and its application to high resolution protein structure modeling. Proteins 2007; 66:824-37. [PMID: 17154422 DOI: 10.1002/prot.21125] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
We have developed a new method (Independent Cluster Decomposition Algorithm, ICDA) for creating all-atom models of proteins given the heavy-atom coordinates, provided by X-ray crystallography, and the pH. In our method the ionization states of titratable residues, the crystallographic mis-assignment of amide orientations in Asn/Gln, and the orientations of OH/SH groups are addressed under the unified framework of polar states assignment. To address the large number of combinatorial possibilities for the polar hydrogen states of the protein, we have devised a novel algorithm to decompose the system into independent interacting clusters, based on the observation of the crucial interdependence between the short range hydrogen bonding network and polar residue states, thus significantly reducing the computational complexity of the problem and making our algorithm tractable using relatively modest computational resources. We utilize an all atom protein force field (OPLS) and a Generalized Born continuum solvation model, in contrast to the various empirical force fields adopted in most previous studies. We have compared our prediction results with a few well-documented methods in the literature (WHATIF, REDUCE). In addition, as a preliminary attempt to couple our polar state assignment method with real structure predictions, we further validate our method using single side chain prediction, which has been demonstrated to be an effective way of validating structure prediction methods without incurring sampling problems. Comparisons of single side chain prediction results after the application of our polar state prediction method with previous results with default polar state assignments indicate a significant improvement in the single side chain predictions for polar residues.
Affiliation(s)
- Xin Li
- Department of Chemistry, Columbia University, New York, NY 10027, USA
|