Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Oliva B, Bates PA, Querol E, Avilés FX, Sternberg MJ. An automated classification of the structure of protein loops. J Mol Biol 1997;266:814-30. [PMID: 9102471 DOI: 10.1006/jmbi.1996.0819] [Citation(s) in RCA: 161] [Impact Index Per Article: 6.0] [Indexed: 02/04/2023]

Cited by Other Article(s)

Zou D, He Z, He J. Beta-hairpin prediction with quadratic discriminant analysis using diversity measure. J Comput Chem 2009;30:2277-84. [PMID: 19263434 DOI: 10.1002/jcc.21229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]

Understanding hydrogen-bond patterns in proteins using network motifs. Bioinformatics 2009;25:2921-8. [DOI: 10.1093/bioinformatics/btp541] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open

Hvidsten TR, Kryshtafovych A, Fidelis K. Local descriptors of protein structure: a systematic analysis of the sequence-structure relationship in proteins using short- and long-range interactions. Proteins 2009;75:870-84. [PMID: 19025980 DOI: 10.1002/prot.22296] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]

Abstract

Local protein structure representations that incorporate long-range contacts between residues are often considered in protein structure comparison but have found relatively little use in structure prediction where assembly from single backbone fragments dominates. Here, we introduce the concept of local descriptors of protein structure to characterize local neighborhoods of amino acids including short- and long-range interactions. We build a library of recurring local descriptors and show that this library is general enough to allow assembly of unseen protein structures. The library could on average re-assemble 83% of 119 unseen structures, and showed little or no performance decrease between homologous targets and targets with folds not represented among domains used to build it. We then systematically evaluate the descriptor library to establish the level of the sequence signal in sets of protein fragments of similar geometrical conformation. In particular, we test whether that signal is strong enough to facilitate correct assignment and alignment of these local geometries to new sequences. We use the signal to assign descriptors to a test set of 479 sequences with less than 40% sequence identity to any domain used to build the library, and show that on average more than 50% of the backbone fragments constituting descriptors can be correctly aligned. We also use the assigned descriptors to infer SCOP folds, and show that correct predictions can be made in many of the 151 cases where PSI-BLAST was unable to detect significant sequence similarity to proteins in the library. Although the combinatorial problem of simultaneously aligning several fragments to sequence is a major bottleneck compared with single fragment methods, the advantage of the current approach is that correct alignments imply correct long range distance constraints. 
The lack of these constraints is most likely the major reason why structure prediction methods fail to consistently produce adequate models when good templates are unavailable or undetectable. Thus, we believe that the current study offers new and valuable insight into the prediction of sequence-structure relationships in proteins.

Liu P, Zhu F, Rassokhin DN, Agrafiotis DK. A self-organizing algorithm for modeling protein loops. PLoS Comput Biol 2009;5:e1000478. [PMID: 19696883 PMCID: PMC2719875 DOI: 10.1371/journal.pcbi.1000478] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 03/18/2009] [Accepted: 07/20/2009] [Indexed: 11/19/2022] Open

Abstract

Protein loops, the flexible short segments connecting two stable secondary structural units in proteins, play a critical role in protein structure and function. Constructing chemically sensible conformations of protein loops that seamlessly bridge the gap between the anchor points without introducing any steric collisions remains an open challenge. A variety of algorithms have been developed to tackle the loop closure problem, ranging from inverse kinematics to knowledge-based approaches that utilize pre-existing fragments extracted from known protein structures. However, many of these approaches focus on the generation of conformations that mainly satisfy the fixed end point condition, leaving the steric constraints to be resolved in subsequent post-processing steps. In the present work, we describe a simple solution that simultaneously satisfies not only the end point and steric conditions, but also chirality and planarity constraints. Starting from random initial atomic coordinates, each individual conformation is generated independently by using a simple alternating scheme of pairwise distance adjustments of randomly chosen atoms, followed by fast geometric matching of the conformationally rigid components of the constituent amino acids. The method is conceptually simple, numerically stable and computationally efficient. Very importantly, additional constraints, such as those derived from NMR experiments, hydrogen bonds or salt bridges, can be incorporated into the algorithm in a straightforward and inexpensive way, making the method ideal for solving more complex multi-loop problems. The remarkable performance and robustness of the algorithm are demonstrated on a set of protein loops of length 4, 8, and 12 that have been used in previous studies.

Protein loops play an important role in protein function, such as ligand binding, recognition, and allosteric regulation. However, due to their flexibility, it is notoriously difficult to determine their 3D structures using traditional experimental techniques. As a result, one can often find protein structures with missing loops in the Protein Data Bank. Their sequence variability also presents a particular challenge for homology modeling methods, which can only yield good overall structures given sufficient sequence identity and good experimental reference structures. Despite extensive research, the construction of protein loop 3D structures remains an open problem, since a sensible conformation should seamlessly bridge the anchor points without introducing steric clashes within the loop itself or between the loop and its surrounding environment. Here, we present a conceptually simple, mathematically straightforward, numerically robust and computationally efficient approach for building protein loop conformations that simultaneously satisfy end-point, steric, planar and chiral constraints. More importantly, additional constraints derived from experimental sources can be incorporated in a straightforward manner, allowing the processing of more complex structures involving multiple interlocking loops.
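
The alternating pairwise distance adjustment mentioned in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: the `targets` dictionary, the function names, and the linearly decaying step size are assumptions made here for illustration, and the rigid-fragment matching, chirality, planarity and steric terms described in the abstract are omitted.

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def adjust_pairs(coords, targets, steps=20000, rate=1.0, seed=0):
    """Move randomly chosen atom pairs toward their target distances.

    coords  : list of [x, y, z] starting positions (may be random)
    targets : dict mapping (i, j) index pairs to a desired distance
              (hypothetical input format, assumed for this sketch)
    """
    rng = random.Random(seed)
    pairs = list(targets)
    pts = [list(p) for p in coords]
    for step in range(steps):
        # Step size decays linearly from `rate` toward zero (an assumption).
        lam = rate * (1.0 - step / steps)
        i, j = rng.choice(pairs)
        delta = [a - b for a, b in zip(pts[i], pts[j])]
        dist = math.sqrt(sum(d * d for d in delta)) or 1e-9
        # With lam == 1 the pair lands exactly on its target distance;
        # each atom absorbs half of the correction.
        shift = lam * 0.5 * (targets[(i, j)] - dist) / dist
        for k in range(3):
            pts[i][k] += shift * delta[k]
            pts[j][k] -= shift * delta[k]
    return pts
```

A typical call would pass random starting coordinates and a small set of distance targets, for example consecutive-atom distances plus a fixed end-to-end distance standing in for the anchor-point condition.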

Recognition of β-hairpin motifs in proteins by using the composite vector. Amino Acids 2009;38:915-21. [DOI: 10.1007/s00726-009-0299-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/10/2008] [Accepted: 04/20/2009] [Indexed: 10/20/2022]

Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci 2008;Chapter 2:Unit 2.9. [PMID: 18429317 DOI: 10.1002/0471140864.ps0209s50] [Citation(s) in RCA: 754] [Impact Index Per Article: 47.1] [Indexed: 12/25/2022]

Olson MA, Feig M, Brooks CL. Prediction of protein loop conformations using multiscale modeling methods with physical energy scoring functions. J Comput Chem 2008;29:820-31. [PMID: 17876760 DOI: 10.1002/jcc.20827] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]

Abstract

This article examines ab initio methods for the prediction of protein loops by a computational strategy of multiscale conformational sampling and physical energy scoring functions. Our approach consists of initial sampling of loop conformations from lattice-based low-resolution models followed by refinement using all-atom simulations. To allow enhanced conformational sampling, the replica exchange method was implemented. Physical energy functions based on CHARMM19 and CHARMM22 parameterizations with generalized Born (GB) solvent models were applied in scoring loop conformations extracted from the lattice simulations and, in the case of all-atom simulations, the ensemble of conformations were generated and scored with these models. Predictions are reported for 25 loop segments, each eight residues long and taken from a diverse set of 22 protein structures. We find that the simulations generally sampled conformations with low global root-mean-square-deviation (RMSD) for loop backbone coordinates from the known structures, whereas clustering conformations in RMSD space and scoring detected less favorable loop structures. Specifically, the lattice simulations sampled basins that exhibited an average global RMSD of 2.21 ± 1.42 Å, whereas clustering and scoring the loop conformations determined an RMSD of 3.72 ± 1.91 Å. Using CHARMM19/GB to refine the lattice conformations improved the sampling RMSD to 1.57 ± 0.98 Å and detection to 2.58 ± 1.48 Å. We found that further improvement could be gained from extending the upper temperature in the all-atom refinement from 400 to 800 K, where the results typically yield a reduction of approximately 1 Å or greater in the RMSD of the detected loop. Overall, CHARMM19 with a simple pairwise GB solvent model is more efficient at sampling low-RMSD loop basins than CHARMM22 with a higher-resolution modified analytical GB model; however, the latter simulation method provides a more accurate description of the all-atom energy surface, yet demands a much greater computational cost.
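
The abstract above does not describe how replica exchange was configured, but the standard Metropolis swap criterion behind the method is short enough to sketch. The function name, the choice of kcal/mol energy units, and the example temperatures are assumptions made here for illustration only.

```python
import math

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption.
K_B = 0.0019872041


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Standard Metropolis acceptance probability for swapping two replicas.

    P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_B * T).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if exponent >= 0 else math.exp(exponent)
```

A swap that hands the lower-energy conformation to the colder replica is always accepted; the reverse swap is accepted only occasionally, which is what lets high-temperature replicas feed new basins to the low-temperature ones.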

Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics 2008;Chapter 5:Unit 5.6. [PMID: 18428767 DOI: 10.1002/0471250953.bi0506s15] [Citation(s) in RCA: 1766] [Impact Index Per Article: 110.4] [Indexed: 12/28/2022]

Hermoso A, Espadaler J, Enrique Querol E, Aviles FX, Sternberg MJ, Oliva B, Fernandez-Fuentes N. Including Functional Annotations and Extending the Collection of Structural Classifications of Protein Loops (ArchDB). Bioinform Biol Insights 2008. [DOI: 10.1177/117793220700100004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open

Abstract

Loops represent an important part of protein structures. The study of loops is critical for two main reasons: First, loops are often involved in protein function, stability and folding. Second, despite improvements in experimental and computational structure prediction methods, modeling the conformation of loops remains problematic. Here, we present a structural classification of loops, ArchDB, a mine of information with application in both mentioned fields: loop structure prediction and function prediction. ArchDB ( http://sbi.imim.es/archdb ) is a database of classified protein loop motifs. The current database provides four different classification sets tailored for different purposes. ArchDB-40 is a loop classification derived from SCOP40, well suited for modeling common loop motifs. Since features relevant to loop structure or function can be more easily determined on well-populated clusters, we have developed ArchDB-95, a loop classification derived from SCOP95. This new classification set shows a ~40% increase in the number of subclasses, and a large 7-fold increase in the number of putative structure/function-related subclasses. We also present ArchDB-EC, a classification of loop motifs from enzymes, and ArchDB-KI, a manually annotated classification of loop motifs from kinases. Information about ligand contacts and PDB sites has been included in all classification sets. Improvements in our classification scheme are described, as well as several new database features, such as the ability to query by conserved annotations, sequence similarity, or uploading 3D coordinates of a protein. The lengths of classified loops range between 0 and 36 residues. ArchDB offers an exhaustive sampling of loop structures. Functional information about loops and links with related biological databases are also provided. All this information and the possibility to browse/query the database through a web-server outline a useful tool with application in the comparative study of loops, in the analysis of loops involved in protein function, and in obtaining templates for loop modeling.

Hu XZ, Li QZ. Prediction of the β-Hairpins in Proteins Using Support Vector Machine. Protein J 2007;27:115-22. [DOI: 10.1007/s10930-007-9114-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]

Prakash T, Sandhu KS, Singh NK, Bhasin Y, Ramakrishnan C, Brahmachari SK. Structural assessment of glycyl mutations in invariantly conserved motifs. Proteins 2007;69:617-32. [PMID: 17623846 DOI: 10.1002/prot.21488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Abstract

Motifs that are evolutionarily conserved in proteins are crucial to their structure and function. In one of our earlier studies, we demonstrated that the conserved motifs occurring invariantly across several organisms could act as structural determinants of the proteins. We observed the abundance of glycyl residues in these invariantly conserved motifs. The role of glycyl residues in highly conserved motifs has not been studied extensively. Thus, it would be interesting to examine the structural perturbations induced by mutation in these conserved glycyl sites. In this work, we selected a representative set of invariant signature (IS) peptides for which both the PDB structure and mutation information was available. We thoroughly analyzed the conformational features of the glycyl sites and their local interactions with the surrounding residues. Using Ramachandran angles, we showed that the glycyl residues occurring in these IS peptides, which have undergone mutation, occurred more often in the L-disallowed as compared with the L-allowed region of the Ramachandran plot. Short range contacts around the mutation site were analyzed to study the steric effects. With the results obtained from our analysis, we hypothesize that any change of activity arising because of such mutations must be attributed to the long-range interaction(s) of the new residue if the glycyl residue in the IS peptide occurred in the L-allowed region of the Ramachandran plot. However, the mutation of those conserved glycyl residues that occurred in the L-disallowed region of the Ramachandran plot might lead to an altered activity of the protein as a result of an altered conformation of the backbone in the immediate vicinity of the glycyl residue, in addition to long range effects arising from the long side chains of the new residue. 
Thus, the loss of activity because of mutation in the conserved glycyl site might either relate to long range interactions or to local perturbations around the site depending upon the conformational preference of the glycyl residue.

De Brevern AG, Etchebest C, Benros C, Hazout S. "Pinning strategy": a novel approach for predicting the backbone structure in terms of protein blocks from sequence. J Biosci 2007;32:51-70. [PMID: 17426380 DOI: 10.1007/s12038-007-0006-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]

Peng HP, Yang AS. Modeling protein loops with knowledge-based prediction of sequence-structure alignment. Bioinformatics 2007;23:2836-42. [PMID: 17827204 DOI: 10.1093/bioinformatics/btm456] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open

Kanagasabai V, Arunachalam J, Prasad PA, Gautham N. Exploring the conformational space of protein loops using a mean field technique with MOLS sampling. Proteins 2007;67:908-21. [PMID: 17357159 DOI: 10.1002/prot.21333] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]

Eswar N, Sali A. Comparative Modeling of Drug Target Proteins. Comprehensive Medicinal Chemistry II 2007. [PMCID: PMC7151936 DOI: 10.1016/b0-08-045044-x/00251-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]

Fernandez-Fuentes N, Zhai J, Fiser A. ArchPRED: a template based loop structure prediction server. Nucleic Acids Res 2006;34:W173-6. [PMID: 16844985 PMCID: PMC1538831 DOI: 10.1093/nar/gkl113] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.2] [Indexed: 11/14/2022] Open

Espadaler J, Querol E, Aviles FX, Oliva B. Identification of function-associated loop motifs and application to protein function prediction. Bioinformatics 2006;22:2237-43. [PMID: 16870939 DOI: 10.1093/bioinformatics/btl382] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022] Open

Hennetin J, Jullian B, Steven AC, Kajava AV. Standard Conformations of β-Arches in β-Solenoid Proteins. J Mol Biol 2006;358:1094-105. [PMID: 16580019 DOI: 10.1016/j.jmb.2006.02.039] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 10/28/2005] [Revised: 02/13/2006] [Accepted: 02/15/2006] [Indexed: 11/15/2022]

Fernandez-Fuentes N, Oliva B, Fiser A. A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucleic Acids Res 2006;34:2085-97. [PMID: 16617149 PMCID: PMC1440879 DOI: 10.1093/nar/gkl156] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022] Open

Abstract

We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105 950 loop fragments and loop flanking secondary structures. Besides the length of the loops and types of bracing secondary structures, the database is organized along four internal coordinates, a distance and three types of angles characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints of the stems. They are subsequently inserted in the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step, remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length that identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned, or optionally, these can be completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% and an r.m.s.d. accuracy of at least 0.22, 1.38 and 2.47 Å, respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of ∼5:1.
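
The final ranking step in the abstract above combines two kinds of evidence into a Z-score. The abstract does not say how the two terms are weighted, so the following is only a sketch under stated assumptions: each candidate carries a hypothetical sequence-similarity score and a hypothetical dihedral-fit score, each is standardized across the candidate set, and the two Z-scores are summed with equal weight.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to zero mean and unit (population) deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All candidates score the same; none stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_candidates(candidates):
    """Rank loop candidates by a combined Z-score, best first.

    candidates: list of (name, sequence_similarity, dihedral_fit) tuples.
    Higher raw scores are assumed to be better for both terms, and the
    equal-weight sum is an illustrative choice, not the published formula.
    """
    seq_z = z_scores([c[1] for c in candidates])
    dih_z = z_scores([c[2] for c in candidates])
    combined = [
        (cand[0], s + d) for cand, s, d in zip(candidates, seq_z, dih_z)
    ]
    return sorted(combined, key=lambda item: item[1], reverse=True)
```

A length-specific confidence cut-off, as described in the abstract, would then be applied to the top-ranked combined score to decide whether the fragment prediction is trusted.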

Benros C, de Brevern AG, Etchebest C, Hazout S. Assessing a novel approach for predicting local 3D protein structures from sequence. Proteins 2006;62:865-80. [PMID: 16385557 DOI: 10.1002/prot.20815] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 11/11/2022]

Carrega L, Mosbah A, Ferrat G, Beeton C, Andreotti N, Mansuelle P, Darbon H, De Waard M, Sabatier JM. The impact of the fourth disulfide bridge in scorpion toxins of the alpha-KTx6 subfamily. Proteins 2006;61:1010-23. [PMID: 16247791 DOI: 10.1002/prot.20681] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]

Fernandez-Fuentes N, Querol E, Aviles FX, Sternberg MJE, Oliva B. Prediction of the conformation and geometry of loops in globular proteins: testing ArchDB, a structural classification of loops. Proteins 2006;60:746-57. [PMID: 16021623 DOI: 10.1002/prot.20516] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]

Szarecka A, Meirovitch H. Optimization of the GB/SA solvation model for predicting the structure of surface loops in proteins. J Phys Chem B 2006;110:2869-80. [PMID: 16471897 PMCID: PMC1945207 DOI: 10.1021/jp055771+] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]

Abstract

Implicit solvation models are commonly optimized with respect to experimental data or Poisson-Boltzmann (PB) results obtained for small molecules, where the force field is sometimes not considered. In previous studies, we have developed an optimization procedure for cyclic peptides and surface loops in proteins based on the entire system studied and the specific force field used. Thus, the loop has been modeled by the simplified solvation function $E_{tot} = E_{FF}(\epsilon = 2r) + \sum_i \sigma_i A_i$, where $E_{FF}(\epsilon = nr)$ is the AMBER force field energy with a distance-dependent dielectric function, $\epsilon = nr$, $A_i$ is the solvent accessible surface area of atom $i$, and $\sigma_i$ is its atomic solvation parameter. During the optimization process, the loop is free to move while the protein template is held fixed in its X-ray structure. To improve on the results of this model, in the present work we apply our optimization procedure to the physically more rigorous solvation model, the generalized Born with surface area (GB/SA) (together with the all-atom AMBER force field) as suggested by Still and co-workers (J. Phys. Chem. A 1997, 101, 3005). The six parameters of the GB/SA model, namely, $P_1$-$P_5$ and the surface area parameter, $\sigma$ (programmed in the TINKER package) are reoptimized for a "training" group of nine loops, and a best-fit set is defined from the individual sets of optimized parameters. The best-fit set and Still's original set of parameters (where Lys, Arg, His, Glu, and Asp are charged or neutralized) were applied to the training group as well as to a "test" group of seven loops, and the energy gaps and the corresponding RMSD values were calculated. These GB/SA results based on the three sets of parameters have been found to be comparable; surprisingly, however, they are somewhat inferior (e.g., of larger energy gaps) to those obtained previously from the simplified model described above. We discuss recent results for loops obtained by other solvation models and potential directions for future studies.
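
The simplified solvation function quoted in the abstract above adds a surface-area term, the sum over atoms of each atomic solvation parameter times that atom's solvent accessible surface area, to the force field energy. A minimal sketch of that sum follows; the function name and argument layout are assumptions made here, and the force field energy is taken as a precomputed number rather than evaluated.

```python
def total_energy(force_field_energy, sigmas, areas):
    """Force field energy plus the surface-area solvation term.

    force_field_energy : precomputed E_FF for the conformation
    sigmas             : atomic solvation parameters, one per atom
    areas              : solvent accessible surface areas, one per atom
    """
    if len(sigmas) != len(areas):
        raise ValueError("need one solvation parameter per atom area")
    # Sum over atoms of sigma_i * A_i.
    solvation = sum(s * a for s, a in zip(sigmas, areas))
    return force_field_energy + solvation
```

In an optimization of the kind the abstract describes, the solvation parameters would be the quantities being tuned, with the energy gap between native-like and non-native loop conformations serving as the objective.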

White RP, Meirovitch H. Minimalist explicit solvation models for surface loops in proteins. J Chem Theory Comput 2006;2:1135-1151. [PMID: 17429495 PMCID: PMC1851699 DOI: 10.1021/ct0503217] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]

Abstract

We have performed molecular dynamics simulations of protein surface loops solvated by explicit water, where a prime focus of the study is the small numbers (e.g., ~100) of explicit water molecules employed. The models include only part of the protein (typically 500 - 1000 atoms), and the water molecules are restricted to a region surrounding the loop. In this study, the number of water molecules (N(w)) is systematically varied, and convergence with large N(w) is monitored to reveal N(w)(min), the minimum number required for the loop to exhibit realistic (fully hydrated) behavior. We have also studied protein surface coverage, as well as diffusion and residence times for water molecules as a function of N(w). A number of other modeling parameters are also tested. These include the number of environmental protein atoms explicitly considered in the model, as well as two ways to constrain the water molecules to the vicinity of the loop (where we find one of these methods to perform better when N(w) is small). The results (for RMSD and its fluctuations for four loops) are further compared to much larger, fully solvated systems (using ~10,000 water molecules under periodic boundary conditions and Ewald electrostatics), and to results for the GBSA implicit solvation model. We find that the loop backbone can stabilize with a surprisingly small number of water molecules (as low as 5 molecules per amino acid residue). The side chains of the loop require somewhat larger N(w), where the atomic fluctuations become too small if N(w) is further reduced. Thus, in general, we find adequate hydration to occur at roughly 12 water molecules per residue. This is an important result, because at this hydration level, computational times are comparable to those required for GBSA. Therefore these "minimalist explicit models" can provide a viable and potentially more accurate alternative. 
The importance of protein loop modeling is discussed in the context of these, and other, loop models, along with other challenges including the relevance of appropriate free energy simulation methodology for assessment of conformational stability.

Bujnicki JM. Protein-Structure Prediction by Recombination of Fragments. Chembiochem 2005;7:19-27. [PMID: 16317788 DOI: 10.1002/cbic.200500235] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]

De S, Sur K, Dasgupta S. Characterization of the nonregular regions of proteins by a contortion index. Biopolymers 2005;79:63-73. [PMID: 15962279 DOI: 10.1002/bip.20333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]

Tendulkar AV, Sohoni MA, Ogunnaike B, Wangikar PP. A geometric invariant-based framework for the analysis of protein conformational space. Bioinformatics 2005;21:3622-8. [PMID: 16096349 DOI: 10.1093/bioinformatics/bti621] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022] Open

Lee MC, Deng J, Briggs JM, Duan Y. Large-scale conformational dynamics of the HIV-1 integrase core domain and its catalytic loop mutants. Biophys J 2005;88:3133-46. [PMID: 15731379 PMCID: PMC1305464 DOI: 10.1529/biophysj.104.058446] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Indexed: 11/18/2022] Open

Panchenko AR, Madej T. Structural similarity of loops in protein families: toward the understanding of protein evolution. BMC Evol Biol 2005;5:10. [PMID: 15691378 PMCID: PMC549550 DOI: 10.1186/1471-2148-5-10] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Received: 10/06/2004] [Accepted: 02/03/2005] [Indexed: 11/16/2022] Open

Rayan A, Senderowitz H, Goldblum A. Exploring the conformational space of cyclic peptides by a stochastic search method. J Mol Graph Model 2004;22:319-33. [PMID: 15099829 DOI: 10.1016/j.jmgm.2003.12.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 10/26/2022]

Fernandez-Fuentes N, Hermoso A, Espadaler J, Querol E, Aviles FX, Oliva B. Classification of common functional loops of kinase super-families. Proteins 2004;56:539-55. [PMID: 15229886 DOI: 10.1002/prot.20136] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]

Jacobson MP, Pincus DL, Rapp CS, Day TJF, Honig B, Shaw DE, Friesner RA. A hierarchical approach to all-atom protein loop prediction. Proteins 2004;55:351-67. [PMID: 15048827 DOI: 10.1002/prot.10613] [Citation(s) in RCA: 1687] [Impact Index Per Article: 84.4] [Indexed: 12/25/2022]

Abstract

The application of all-atom force fields (and explicit or implicit solvent models) to protein homology-modeling tasks such as side-chain and loop prediction remains challenging both because of the expense of the individual energy calculations and because of the difficulty of sampling the rugged all-atom energy surface. Here we address this challenge for the problem of loop prediction through the development of numerous new algorithms, with an emphasis on multiscale and hierarchical techniques. As a first step in evaluating the performance of our loop prediction algorithm, we have applied it to the problem of reconstructing loops in native structures; we also explicitly include crystal packing to provide a fair comparison with crystal structures. In brief, large numbers of loops are generated by using a dihedral angle-based buildup procedure followed by iterative cycles of clustering, side-chain optimization, and complete energy minimization of selected loop structures. We evaluate this method by using the largest test set yet used for validation of a loop prediction method, with a total of 833 loops ranging from 4 to 12 residues in length. Average/median backbone root-mean-square deviations (RMSDs) to the native structures (superimposing the body of the protein, not the loop itself) are 0.42/0.24 Å for 5 residue loops, 1.00/0.44 Å for 8 residue loops, and 2.47/1.83 Å for 11 residue loops. Median RMSDs are substantially lower than the averages because of a small number of outliers; the causes of these failures are examined in some detail, and many can be attributed to errors in assignment of protonation states of titratable residues, omission of ligands from the simulation, and, in a few cases, probable errors in the experimentally determined structures. When these obvious problems in the data sets are filtered out, average RMSDs to the native structures improve to 0.43 Å for 5 residue loops, 0.84 Å for 8 residue loops, and 1.63 Å for 11 residue loops. In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that sampling rarely limits prediction accuracy. The overall results are, to our knowledge, the best reported to date, and we attribute this success to the combination of an accurate all-atom energy function, efficient methods for loop buildup and side-chain optimization, and, especially for the longer loops, the hierarchical refinement protocol.
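
The accuracy measure in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the predicted and native loop backbone coordinates are already expressed in the same frame, that is, the body of the protein has been superimposed beforehand (that step is not shown), so the loop itself is never re-fitted. The function names and the sample RMSD values are hypothetical and serve only to show how a single outlier pulls the average above the median.

```python
import math
import statistics


def loop_rmsd(predicted, native):
    """RMSD between two equal-length lists of (x, y, z) backbone atoms.

    Assumption: both coordinate sets are already in the frame obtained by
    superimposing the protein body, so no fitting is done on the loop.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (px - nx) ** 2 + (py - ny) ** 2 + (pz - nz) ** 2
        for (px, py, pz), (nx, ny, nz) in zip(predicted, native)
    )
    return math.sqrt(squared / len(predicted))


def summarize(rmsds):
    """Return (average, median) of a list of per-loop RMSD values."""
    return statistics.mean(rmsds), statistics.median(rmsds)


if __name__ == "__main__":
    # Hypothetical per-loop RMSDs in angstroms: four good predictions, one outlier.
    example = [0.20, 0.30, 0.25, 0.20, 3.00]
    average, median = summarize(example)
    print(f"average = {average:.2f} A, median = {median:.2f} A")
```

Reporting both statistics, as the abstract does, separates typical accuracy (the median) from the effect of a few failed predictions (the average).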


Dasgupta B, Pal L, Basu G, Chakrabarti P. Expanded turn conformations: characterization and sequence-structure correspondence in alpha-turns with implications in helix folding. Proteins 2004;55:305-15. [PMID: 15048823 DOI: 10.1002/prot.20064] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Camproux AC, Gautier R, Tufféry P. A hidden markov model derived structural alphabet for proteins. J Mol Biol 2004;339:591-605. [PMID: 15147844 DOI: 10.1016/j.jmb.2004.04.005] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2003] [Revised: 03/30/2004] [Accepted: 04/05/2004] [Indexed: 10/26/2022]

Rohl CA, Strauss CEM, Chivian D, Baker D. Modeling structurally variable regions in homologous proteins with Rosetta. Proteins 2004;55:656-77. [PMID: 15103629 DOI: 10.1002/prot.10629] [Citation(s) in RCA: 242] [Impact Index Per Article: 12.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Abstract

A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models. Initial conformations for short segments are selected from the protein structure database, whereas longer segments are built up by using three- and nine-residue fragments drawn from the database and combined by using the Rosetta algorithm. A gap closure term in the potential in combination with modified Newton's method for gradient descent minimization is used to ensure continuity of the peptide backbone. Conformations of variable regions are refined in the context of a fixed template structure using Monte Carlo minimization together with rapid repacking of side-chains to iteratively optimize backbone torsion angles and side-chain rotamers. For short loops, mean accuracies of 0.69, 1.45, and 3.62 Å are obtained for 4, 8, and 12 residue loops, respectively. In addition, the method can provide reasonable models of conformations of longer protein segments: predicted conformations of 3 Å root-mean-square deviation or better were obtained for 5 of 10 examples of segments ranging from 13 to 34 residues. In combination with a sequence alignment algorithm, this method generates complete, ungapped models of protein structures, including regions both similar to and divergent from a homologous structure. This combined method was used to make predictions for 28 protein domains in the Critical Assessment of Protein Structure 4 (CASP 4) and 59 domains in CASP 5, where the method ranked highly among comparative modeling and fold recognition methods. Model accuracy in these blind predictions is dominated by alignment quality, but in the context of accurate alignments, long protein segments can be accurately modeled. Notably, the method correctly predicted the local structure of a 39-residue insertion into a TIM barrel in CASP 5 target T0186.


Fourrier L, Benros C, de Brevern AG. Use of a structural alphabet for analysis of short loops connecting repetitive structures. BMC Bioinformatics 2004;5:58. [PMID: 15140270 PMCID: PMC450294 DOI: 10.1186/1471-2105-5-58] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2004] [Accepted: 05/12/2004] [Indexed: 12/02/2022] Open

Abstract

Background

Because loops connect regular secondary structures, analysis of the former depends directly on the definition of the latter. The numerous assignment methods, however, can offer different definitions. In a previous study, we defined a structural alphabet composed of 16 average protein fragments, which we called Protein Blocks (PBs). They allow an accurate description of every region of 3D protein backbones and have been used in local structure prediction. In the present study, we use this structural alphabet to analyze and predict the loops connecting two repetitive structures.

Results

We first analyzed the secondary structure assignments. Use of five different assignment methods (DSSP, DEFINE, PCURVE, STRIDE and PSEA) showed the absence of consensus: 20% of the residues were assigned to different states. The discrepancies were particularly important at the extremities of the repetitive structures. We used PBs to describe and predict the short loops because they can help analyze and in part explain these discrepancies. An analysis of the PB distribution in these regions showed some specificities in the sequence-structure relationship. Of the amino acid over- or under-representations observed in the short loop databank, 20% did not appear in the entire databank. Finally, predicting 3D structure in terms of PBs with a Bayesian approach yielded an accuracy rate of 36.0% for all loops and 41.2% for the short loops. Specific learning in the short loops increased the latter by 1%.

Conclusion

This work highlights the difficulties of assigning repetitive structures and the advantages of using more precise descriptions, that is, PBs. We observed some new amino acid distributions in the short loops and used this information to enhance local prediction. Instead of describing entire loops, our approach predicts each position in the loops locally. It can thus be used to propose many different structures for the loops and to probe and sample their flexibility. It can be a useful tool in ab initio loop prediction.


Law RJ, Sansom MSP. Homology modelling and molecular dynamics simulations: comparative studies of human aquaporin-1. Eur Biophys J 2004;33:477-89. [PMID: 15071758 DOI: 10.1007/s00249-004-0398-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/18/2003] [Revised: 02/10/2004] [Accepted: 02/12/2004] [Indexed: 10/26/2022]

Tendulkar AV, Joshi AA, Sohoni MA, Wangikar PP. Clustering of Protein Structural Fragments Reveals Modular Building Block Approach of Nature. J Mol Biol 2004;338:611-29. [PMID: 15081817 DOI: 10.1016/j.jmb.2004.02.047] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2003] [Revised: 02/11/2004] [Accepted: 02/17/2004] [Indexed: 11/29/2022]

Cortés J, Siméon T, Remaud-Siméon M, Tran V. Geometric algorithms for the conformational analysis of long protein loops. J Comput Chem 2004;25:956-67. [PMID: 15027107 DOI: 10.1002/jcc.20021] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Zhang C, Liu S, Zhou Y. Accurate and efficient loop selections by the DFIRE-based all-atom statistical potential. Protein Sci 2004;13:391-9. [PMID: 14739324 PMCID: PMC2286705 DOI: 10.1110/ps.03411904] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2003] [Revised: 10/17/2003] [Accepted: 10/17/2003] [Indexed: 10/26/2022]

Espadaler J, Fernandez-Fuentes N, Hermoso A, Querol E, Aviles FX, Sternberg MJE, Oliva B. ArchDB: automated protein loop classification as a tool for structural genomics. Nucleic Acids Res 2004;32:D185-8. [PMID: 14681390 PMCID: PMC308737 DOI: 10.1093/nar/gkh002] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Comparative Protein Structure Modeling and its Applications to Drug Discovery. Annu Rep Med Chem 2004. [DOI: 10.1016/s0065-7743(04)39020-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/16/2023]

Marti‐Renom MA, Madhusudhan M, Eswar N, Pieper U, Shen M, Sali A, Fiser A, Mirkovic N, John B, Stuart A. Modeling Protein Structure from its Sequence. Curr Protoc Bioinformatics 2003. [DOI: 10.1002/0471250953.bi0501s03] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Duarte CM, Wadley LM, Pyle AM. RNA structure comparison, motif search and discovery using a reduced representation of RNA conformational space. Nucleic Acids Res 2003;31:4755-61. [PMID: 12907716 PMCID: PMC169959 DOI: 10.1093/nar/gkg682] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Haspel N, Tsai CJ, Wolfson H, Nussinov R. Reducing the computational complexity of protein folding via fragment folding and assembly. Protein Sci 2003;12:1177-87. [PMID: 12761388 PMCID: PMC2323902 DOI: 10.1110/ps.0232903] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2002] [Revised: 12/23/2002] [Accepted: 02/23/2003] [Indexed: 10/27/2022]

Pal M, Dasgupta S. The nature of the turn in omega loops of proteins. Proteins 2003;51:591-606. [PMID: 12784218 DOI: 10.1002/prot.10376] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Haspel N, Tsai CJ, Wolfson H, Nussinov R. Hierarchical protein folding pathways: a computational study of protein fragments. Proteins 2003;51:203-15. [PMID: 12660989 DOI: 10.1002/prot.10294] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Berezovsky IN. Discrete structure of van der Waals domains in globular proteins. Protein Eng Des Sel 2003;16:161-7. [PMID: 12702795 DOI: 10.1093/proeng/gzg026] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Mosbah A, Campanacci V, Lartigue A, Tegoni M, Cambillau C, Darbon H. Solution structure of a chemosensory protein from the moth Mamestra brassicae. Biochem J 2003;369:39-44. [PMID: 12217077 PMCID: PMC1223053 DOI: 10.1042/bj20021217] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2002] [Accepted: 09/09/2002] [Indexed: 11/17/2022]


Kolodny R, Koehl P, Guibas L, Levitt M. Small libraries of protein fragments model native protein structures accurately. J Mol Biol 2002;323:297-307. [PMID: 12381322 DOI: 10.1016/s0022-2836(02)00942-7] [Citation(s) in RCA: 144] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Abstract

Prediction of protein structure depends on the accuracy and complexity of the models used. Here, we represent the polypeptide chain by a sequence of rigid fragments that are concatenated without any degrees of freedom. Fragments chosen from a library of representative fragments are fit to the native structure using a greedy build-up method. This gives a one-dimensional representation of native protein three-dimensional structure whose quality depends on the nature of the library. We use a novel clustering method to construct libraries that differ in the fragment length (four to seven residues) and number of representative fragments they contain (25-300). Each library is characterized by the quality of fit (accuracy) and the number of allowed states per residue (complexity). We find that the accuracy depends on the complexity and varies from 2.9 Å for a 2.7-state model on the basis of fragments of length 7 to 0.76 Å for a 15-state model on the basis of fragments of length 5. Our goal is to find representations that are both accurate and economical (low complexity). The models defined here are substantially better in this regard: with ten states per residue we approximate native protein structure to 1 Å compared to over 20 states per residue needed previously. For the same complexity, we find that longer fragments provide better fits. Unfortunately, libraries of longer fragments must be much larger (for ten states per residue, a seven-residue library is 100 times larger than a five-residue library). As the number of known protein native structures increases, it will be possible to construct larger libraries to better exploit this correlation between neighboring residues. Our fragment libraries, which offer a wide range of optimal fragments suited to different accuracies of fit, may prove to be useful for generating better decoy sets for ab initio protein folding and for generating accurate loop conformations in homology modeling.
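
The relation between library size and complexity quoted in the abstract above can be checked with a short Python sketch. The abstract does not state the formula. The sketch assumes that consecutive fragments overlap by three residues, so each fragment of length f contributes f - 3 new residues and a library of s fragments corresponds to s^(1/(f - 3)) states per residue. The overlap value and both function names are assumptions made for illustration. Under that assumption the numbers in the abstract are mutually consistent.

```python
def states_per_residue(library_size, fragment_length, overlap=3):
    """Complexity of a fragment library, in states per residue.

    Assumption (not stated in the abstract): consecutive fragments overlap
    by `overlap` residues, so each fragment adds fragment_length - overlap
    new residues to the chain.
    """
    new_residues = fragment_length - overlap
    if new_residues <= 0:
        raise ValueError("fragment_length must exceed overlap")
    return library_size ** (1.0 / new_residues)


def library_size_for(states, fragment_length, overlap=3):
    """Library size needed to reach a given number of states per residue."""
    new_residues = fragment_length - overlap
    if new_residues <= 0:
        raise ValueError("fragment_length must exceed overlap")
    return states ** new_residues


if __name__ == "__main__":
    # Ten states per residue: seven-residue versus five-residue library.
    ratio = library_size_for(10, 7) // library_size_for(10, 5)
    print("size ratio at ten states per residue:", ratio)  # 100

    # A 15-state model with fragments of length 5 implies 225 fragments,
    # which lies inside the 25-300 range quoted in the abstract.
    print("library size for 15 states, length 5:", library_size_for(15, 5))  # 225
```

The exponential dependence on f - 3 is why longer fragments need much larger libraries at the same complexity.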
