Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kolodny R, Koehl P, Guibas L, Levitt M. Small libraries of protein fragments model native protein structures accurately. J Mol Biol 2002;323:297-307. [PMID: 12381322 DOI: 10.1016/s0022-2836(02)00942-7] [Citation(s) in RCA: 144] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]

Cited by Other Article(s)

Feng Q, Hou M, Liu J, Zhao K, Zhang G. Construct a variable-length fragment library for de novo protein structure prediction. Brief Bioinform 2022;23:6547572. [PMID: 35284936 DOI: 10.1093/bib/bbac086] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/03/2022] [Revised: 02/10/2022] [Accepted: 02/20/2022] [Indexed: 11/12/2022]

Konagurthu AS, Subramanian R, Allison L, Abramson D, Stuckey PJ, Garcia de la Banda M, Lesk AM. Universal Architectural Concepts Underlying Protein Folding Patterns. Front Mol Biosci 2021;7:612920. [PMID: 33996891 PMCID: PMC8120156 DOI: 10.3389/fmolb.2020.612920] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/01/2020] [Accepted: 12/16/2020] [Indexed: 11/17/2022]

Gao W, Mahajan SP, Sulam J, Gray JJ. Deep Learning in Protein Structural Modeling and Design. PATTERNS (NEW YORK, N.Y.) 2020;1:100142. [PMID: 33336200 PMCID: PMC7733882 DOI: 10.1016/j.patter.2020.100142] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Indexed: 12/13/2022]

Wen Z, He J, Huang SY. Topology-independent and global protein structure alignment through an FFT-based algorithm. Bioinformatics 2020;36:478-486. [PMID: 31384919 DOI: 10.1093/bioinformatics/btz609] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/18/2019] [Revised: 07/22/2019] [Accepted: 08/02/2019] [Indexed: 12/12/2022]

Biophysical prediction of protein-peptide interactions and signaling networks using machine learning. Nat Methods 2020;17:175-183. [PMID: 31907444 PMCID: PMC7004877 DOI: 10.1038/s41592-019-0687-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 05/20/2019] [Accepted: 11/15/2019] [Indexed: 12/17/2022]

Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019;18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023]

Investigating the Formation of Structural Elements in Proteins Using Local Sequence-Dependent Information and a Heuristic Search Algorithm. Molecules 2019;24:molecules24061150. [PMID: 30909488 PMCID: PMC6471799 DOI: 10.3390/molecules24061150] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/21/2019] [Revised: 03/14/2019] [Accepted: 03/19/2019] [Indexed: 11/22/2022]

Estaña A, Sibille N, Delaforge E, Vaisset M, Cortés J, Bernadó P. Realistic Ensemble Models of Intrinsically Disordered Proteins Using a Structure-Encoding Coil Database. Structure 2019;27:381-391.e2. [DOI: 10.1016/j.str.2018.10.016] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/09/2018] [Revised: 07/13/2018] [Accepted: 10/19/2018] [Indexed: 11/27/2022]

Trevizani R, Custódio FL. Supersecondary Structures and Fragment Libraries. Methods Mol Biol 2019;1958:283-295. [PMID: 30945224 DOI: 10.1007/978-1-4939-9161-7_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Navigating Among Known Structures in Protein Space. Methods Mol Biol 2018. [PMID: 30298400 DOI: 10.1007/978-1-4939-8736-8_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

Abstract

Present-day protein space is the result of 3.7 billion years of evolution, constrained by the underlying physicochemical qualities of the proteins. It is difficult to differentiate between evolutionary traces and effects of physicochemical constraints. Nonetheless, as a rule of thumb, instances of structural reuse, or focusing on structural similarity, are likely attributable to physicochemical constraints, whereas sequence reuse, or focusing on sequence similarity, may be more indicative of evolutionary relationships. Both types of relationships have been studied and can provide meaningful insights into protein biophysics and evolution, which in turn can lead to better algorithms for protein search, annotation, and maybe even design.
In broad strokes, studies of protein space vary in the entities they represent, the similarity measure comparing these entities, and the representation used. The entities can be, for example, protein chains, domains, supra-domains, or smaller protein sub-parts denoted themes. The measures of similarity between the entities can be based on sequence, structure, function, or any combination of these. The representation can be global, encompassing the whole space, or local, focusing on a particular region surrounding protein(s) of interest. Global representations include lists of grouped proteins, protein networks, and maps. Networks are the abstraction that is derived most directly from the similarity data: each node is the protein entity (e.g., a domain), and edges connect similar domains. Selecting the entities, the similarity measure, and the abstraction are three intertwined decisions: the similarity measures allow us to identify the entities, and the selection of entities influences what is a meaningful similarity measure. Similarly, we seek entities that are related to each other in a way for which a simple representation describes their relationships succinctly and accurately.
This chapter will cover studies that rely on different entities, similarity measures, and a range of representations to better understand protein structure space. Scholars may use publicly available navigators offering a global representation, and in particular the hierarchical classifications SCOP, CATH, and ECOD, or a local representation, which encompasses structural alignment algorithms. Alternatively, scholars can configure their own navigator using existing tools. To demonstrate this DIY (do it yourself) approach for navigating in protein space, we investigate substrate-binding proteins. By presenting sequence similarities among this large and diverse protein family as a network, we can infer that one member (pdb ID 4ntl; of yet unknown function) may bind methionine and suggest a putative binding mechanism.
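
The network abstraction described in this abstract (nodes are protein entities, edges connect similar ones) can be sketched in a few lines of Python. This is a minimal illustration only, not the chapter's tooling: the entity names, the similarity scores, and the 0.8 threshold are hypothetical, and `build_network` and `connected_components` are helper names introduced here.

```python
def build_network(similarities, threshold):
    """Link every pair of entities whose similarity score meets the threshold."""
    graph = {}
    for (a, b), score in similarities.items():
        # Register both entities so that unlinked ones still appear as nodes.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes that are reachable from one another (depth-first search)."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical pairwise similarity scores between five made-up domains.
toy_similarities = {
    ("domA", "domB"): 0.92,
    ("domB", "domC"): 0.81,
    ("domC", "domD"): 0.30,
    ("domD", "domE"): 0.88,
}

network = build_network(toy_similarities, threshold=0.8)
groups = connected_components(network)
print(groups)  # [['domA', 'domB', 'domC'], ['domD', 'domE']]
```

Changing the threshold changes which entities end up grouped together, which is one concrete way in which the choice of similarity measure and the resulting representation are intertwined.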

Kunzmann P, Hamacher K. Biotite: a unifying open source computational biology framework in Python. BMC Bioinformatics 2018;19:346. [PMID: 30285630 PMCID: PMC6167853 DOI: 10.1186/s12859-018-2367-z] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 04/06/2018] [Accepted: 09/10/2018] [Indexed: 12/18/2022]

SAFlex: A structural alphabet extension to integrate protein structural flexibility and missing data information. PLoS One 2018;13:e0198854. [PMID: 29975698 PMCID: PMC6033379 DOI: 10.1371/journal.pone.0198854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2017] [Accepted: 05/25/2018] [Indexed: 11/19/2022]

Wang T, Yang Y, Zhou Y, Gong H. LRFragLib: an effective algorithm to identify fragments for de novo protein structure prediction. Bioinformatics 2017;33:677-684. [PMID: 27797773 DOI: 10.1093/bioinformatics/btw668] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/23/2016] [Accepted: 10/18/2016] [Indexed: 11/13/2022]

Vetrivel I, Mahajan S, Tyagi M, Hoffmann L, Sanejouand YH, Srinivasan N, de Brevern AG, Cadet F, Offmann B. Knowledge-based prediction of protein backbone conformation using a structural alphabet. PLoS One 2017;12:e0186215. [PMID: 29161266 PMCID: PMC5697859 DOI: 10.1371/journal.pone.0186215] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/30/2017] [Accepted: 09/27/2017] [Indexed: 01/19/2023]

Abstract

Libraries of structural prototypes that abstract protein local structures are known as structural alphabets and have proven to be very useful in various aspects of protein structure analyses and predictions. One such library, Protein Blocks, is composed of 16 standard 5-residue-long structural prototypes. This form of analysis involves representing a protein's structure as a string of Protein Blocks. Predicting the local structure of a protein in terms of protein blocks is the general objective of this work. A new approach, PB-kPRED, is proposed towards this aim. It involves (i) organizing the structural knowledge in the form of a database of pentapeptide fragments extracted from all protein structures in the PDB and (ii) applying a knowledge-based algorithm that does not rely on any secondary structure predictions and/or sequence alignment profiles, to scan this database and predict most probable backbone conformations for the protein local structures. PB-kPRED does, however, preferentially use structural information from homologues when it is available. The predictions were evaluated rigorously on 15,544 query proteins representing a non-redundant subset of the PDB filtered at 30% sequence identity cut-off. We have shown that the kPRED method was able to achieve mean accuracies ranging from 40.8% to 66.3% depending on the availability of homologues. The impact of the different strategies for scanning the database on the prediction was evaluated and is discussed. Our results highlight the usefulness of the method in the context of proteins without any known structural homologues. A scoring function that gives a good estimate of the accuracy of prediction was further developed. This score estimates very well the accuracy of the algorithm (R² of 0.82). An online version of the tool is provided freely for non-commercial usage at http://www.bo-protscience.fr/kpred/.
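
The general idea of scanning a pentapeptide database to predict a structural-alphabet string can be illustrated with a toy lookup. The sketch below is not the PB-kPRED algorithm, whose scanning strategies and scoring are not specified here; it assumes a simple majority vote per pentapeptide, and the sequences, annotations, the `Z` placeholder for unassigned termini, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_fragment_db(annotated_chains, width=5):
    """Count which block letter was observed at the centre of each pentapeptide."""
    half = width // 2
    db = defaultdict(Counter)
    for sequence, blocks in annotated_chains:
        for i in range(len(sequence) - width + 1):
            db[sequence[i:i + width]][blocks[i + half]] += 1
    return db


def predict_blocks(sequence, db, width=5, unknown="?"):
    """Assign each central residue the block letter most often seen for its pentapeptide."""
    half = width // 2
    prediction = [unknown] * len(sequence)
    for i in range(len(sequence) - width + 1):
        counts = db.get(sequence[i:i + width])
        if counts:
            prediction[i + half] = counts.most_common(1)[0][0]
    return "".join(prediction)


# Hypothetical training data: (amino-acid sequence, block letter per residue).
toy_chains = [
    ("ACDEFGH", "ZZmmmZZ"),
    ("CDEFGHI", "ZZmmnZZ"),
]

db = build_fragment_db(toy_chains)
predicted = predict_blocks("ACDEFGHI", db)
print(predicted)  # ??mmmn??
```

Residues whose pentapeptide is absent from the database stay marked as `?`, which mirrors in a crude way why accuracy depends on how much relevant structural data is available.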

Complex evolutionary footprints revealed in an analysis of reused protein segments of diverse lengths. Proc Natl Acad Sci U S A 2017;114:11703-11708. [PMID: 29078314 PMCID: PMC5676897 DOI: 10.1073/pnas.1707642114] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Indexed: 01/13/2023]

Abstract

We question a central paradigm: namely, that the protein domain is the “atomic unit” of evolution. In conflict with the current textbook view, our results unequivocally show that duplication of protein segments happens both above and below the domain level among amino acid segments of diverse lengths. Indeed, we show that significant evolutionary information is lost when the protein is approached as a string of domains. Our finer-grained approach reveals a far more complicated picture, where reused segments often intertwine and overlap with each other. Our results are consistent with a recursive model of evolution, in which segments of various lengths, typically smaller than domains, “hop” between environments. The fit segments remain, leaving traces that can still be detected.

Proteins share similar segments with one another. Such “reused parts”—which have been successfully incorporated into other proteins—are likely to offer an evolutionary advantage over de novo evolved segments, as most of the latter will not even have the capacity to fold. To systematically explore the evolutionary traces of segment “reuse” across proteins, we developed an automated methodology that identifies reused segments from protein alignments. We search for “themes”—segments of at least 35 residues of similar sequence and structure—reused within representative sets of 15,016 domains [Evolutionary Classification of Protein Domains (ECOD) database] or 20,398 chains [Protein Data Bank (PDB)]. We observe that theme reuse is highly prevalent and that reuse is more extensive when the length threshold for identifying a theme is lower. Structural domains, the best characterized form of reuse in proteins, are just one of many complex and intertwined evolutionary traces. Others include long themes shared among a few proteins, which encompass and overlap with shorter themes that recur in numerous proteins. The observed complexity is consistent with evolution by duplication and divergence, and some of the themes might include descendants of ancestral segments. The observed recursive footprints, where the same amino acid can simultaneously participate in several intertwined themes, could be a useful concept for protein design. Data are available at http://trachel-srv.cs.haifa.ac.il/rachel/ppi/themes/.

Mackenzie CO, Grigoryan G. Protein structural motifs in prediction and design. Curr Opin Struct Biol 2017;44:161-167. [PMID: 28460216 PMCID: PMC5513761 DOI: 10.1016/j.sbi.2017.03.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/08/2016] [Revised: 03/18/2017] [Accepted: 03/28/2017] [Indexed: 01/11/2023]

Sequence statistics of tertiary structural motifs reflect protein stability. PLoS One 2017;12:e0178272. [PMID: 28552940 PMCID: PMC5446159 DOI: 10.1371/journal.pone.0178272] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/23/2017] [Accepted: 05/10/2017] [Indexed: 11/19/2022]

Abstract

The Protein Data Bank (PDB) has been a key resource for learning general rules of sequence-structure relationships in proteins. Quantitative insights have been gained by defining geometric descriptors of structure (e.g., distances, dihedral angles, solvent exposure, etc.) and observing their distributions and sequence preferences. Here we argue that as the PDB continues to grow, it may become unnecessary to reduce structure into a set of elementary descriptors. Instead, it could be possible to deduce quantitative sequence-structure relationships in the context of precisely-defined complex structural motifs by mining the PDB for closely matching backbone geometries. To validate this idea, we turned to the task of predicting changes in protein stability upon amino-acid substitution—a difficult problem of broad significance. We defined non-contiguous tertiary motifs (TERMs) around a protein site of interest and extracted sequence preferences from ensembles of closely-matching substructures in the PDB to predict mutational stability changes at the site, ΔΔG_m. We demonstrate that these ensemble statistics predict ΔΔG_m on par with state-of-the-art statistical and machine-learning methods on large thermodynamic datasets, and outperform these, along with a leading structure-based modeling approach, when tested in the context of unbiased diverse mutations. Further, we show that the performance of the TERM-based method is directly related to the amount of available relevant structural data, automatically improving with the growing PDB. This enables a means of estimating prediction accuracy. Our results clearly demonstrate that: 1) statistics of non-contiguous structural motifs in the PDB encode fundamental sequence-structure relationships related to protein thermodynamic stability, and 2) the PDB is now large enough that such statistics are already useful in practice, with their accuracy expected to continue increasing as the database grows. These observations suggest new ways of using structural data towards addressing problems of computational structural biology.

Thangappan J, Wu S, Lee SG. Joint-based description of protein structure: its application to the geometric characterization of membrane proteins. Sci Rep 2017;7:1056. [PMID: 28432363 PMCID: PMC5430719 DOI: 10.1038/s41598-017-01011-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/09/2016] [Accepted: 03/28/2017] [Indexed: 11/17/2022]

Koehl P. Minimum action transition paths connecting minima on an energy surface. J Chem Phys 2017;145:184111. [PMID: 27846680 DOI: 10.1063/1.4966974] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/22/2022]

Abstract

Dynamics is essential to the biological functions of many bio-molecules, yet our knowledge of dynamics remains fragmented. Experimental techniques for studying bio-molecules either provide high resolution information on static conformations of the molecule or provide low-resolution, ensemble information that does not shed light on single molecule dynamics. In parallel, bio-molecular dynamics occur at time scales that are not yet attainable through detailed simulation methods. These limitations are especially noticeable when studying transition paths. To address this issue, we report in this paper two methods that derive meaningful trajectories for proteins between two of their conformations. The first method, MinActionPath, uses approximations of the potential energy surface for the molecule to derive an analytical solution of the equations of motion related to the concept of minimum action path. The second method, RelaxPath, follows the same principle of minimum action path but implements a more sophisticated potential, including a mixed elastic potential and a collision term to alleviate steric clashes. Using this new potential, the equations of motion cannot be solved analytically. We have introduced a relaxation method for solving those equations. We describe both the theories behind the two methods and their implementations, focusing on the specific techniques we have used that make those implementations amenable to the study of large molecular systems. We have illustrated the performance of RelaxPath on simple 2D systems. We have also compared MinActionPath and RelaxPath to other methods for generating transition paths on a well-suited test set of large proteins, for which the end points of the trajectories as well as an intermediate conformation between those end points are known. We have shown that RelaxPath outperforms those other methods, including MinActionPath, in its ability to generate trajectories that get close to the known intermediates. We have also shown that the structures along the RelaxPath trajectories remain protein-like. Open source versions of the two programs MinActionPath and RelaxPath are available by request.

Critical Features of Fragment Libraries for Protein Structure Prediction. PLoS One 2017;12:e0170131. [PMID: 28085928 PMCID: PMC5235372 DOI: 10.1371/journal.pone.0170131] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/29/2016] [Accepted: 12/29/2016] [Indexed: 11/19/2022]

Fourati Z, Ruza RR, Laverty D, Drège E, Delarue-Cochin S, Joseph D, Koehl P, Smart T, Delarue M. Barbiturates Bind in the GLIC Ion Channel Pore and Cause Inhibition by Stabilizing a Closed State. J Biol Chem 2016;292:1550-1558. [PMID: 27986812 DOI: 10.1074/jbc.m116.766964] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/08/2016] [Revised: 12/06/2016] [Indexed: 12/12/2022]

Kozyrev SV. Model of protein fragments and statistical potentials. ACTA ACUST UNITED AC 2016. [DOI: 10.1134/s2070046616040051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]

Tertiary alphabet for the observable protein structural universe. Proc Natl Acad Sci U S A 2016;113:E7438-E7447. [PMID: 27810958 DOI: 10.1073/pnas.1607178113] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Indexed: 11/18/2022]

Characterization and Prediction of Protein Flexibility Based on Structural Alphabets. BIOMED RESEARCH INTERNATIONAL 2016;2016:4628025. [PMID: 27660756 PMCID: PMC5021887 DOI: 10.1155/2016/4628025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/13/2016] [Accepted: 08/02/2016] [Indexed: 11/25/2022]

Kolodny R, Guibas L, Levitt M, Koehl P. Inverse Kinematics in Biology: The Protein Loop Closure Problem. Int J Rob Res 2016. [DOI: 10.1177/0278364905050352] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Indexed: 11/15/2022]

Craveur P, Joseph AP, Esque J, Narwani TJ, Noël F, Shinada N, Goguet M, Leonard S, Poulain P, Bertrand O, Faure G, Rebehmed J, Ghozlane A, Swapna LS, Bhaskara RM, Barnoud J, Téletchéa S, Jallu V, Cerny J, Schneider B, Etchebest C, Srinivasan N, Gelly JC, de Brevern AG. Protein flexibility in the light of structural alphabets. Front Mol Biosci 2015;2:20. [PMID: 26075209 PMCID: PMC4445325 DOI: 10.3389/fmolb.2015.00020] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 02/28/2015] [Accepted: 04/30/2015] [Indexed: 01/01/2023]

Affiliation(s)

Pierrick Craveur Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Agnel P Joseph Rutherford Appleton Laboratory, Science and Technology Facilities Council Didcot, UK
Jeremy Esque Institut National de la Santé et de la Recherche Médicale U964,7 UMR Centre National de la Recherche Scientifique 7104, IGBMC, Université de Strasbourg Illkirch, France
Tarun J Narwani Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Floriane Noël Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Nicolas Shinada Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Matthieu Goguet Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Sylvain Leonard Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Pierre Poulain Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Ets Poulain Pointe-Noire, Congo
Olivier Bertrand Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Guilhem Faure National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health Bethesda, MD, USA
Joseph Rebehmed Centre National de la Recherche Scientifique UMR7590, Sorbonne Universités, Université Pierre et Marie Curie - MNHN - IRD - IUC Paris, France
Amine Ghozlane Metagenopolis, INRA Jouy-en-Josas, France
Lakshmipuram S Swapna Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India ; Hospital for Sick Children, and Departments of Biochemistry and Molecular Genetics, University of Toronto Toronto, ON, Canada
Ramachandra M Bhaskara Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India ; Department of Theoretical Biophysics, Max Planck Institute of Biophysics Frankfurt, Germany
Jonathan Barnoud Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Laboratoire de Physique, École Normale Supérieure de Lyon, Université de Lyon, Centre National de la Recherche Scientifique UMR 5672 Lyon, France
Stéphane Téletchéa Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France ; Faculté des Sciences et Techniques, Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines, Centre National de la Recherche Scientifique UMR 6286, Université Nantes Nantes, France
Vincent Jallu Platelet Unit, Institut National de la Transfusion Sanguine Paris, France
Jiri Cerny Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
Bohdan Schneider Institute of Biotechnology, The Czech Academy of Sciences Prague, Czech Republic
Catherine Etchebest Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Narayanaswamy Srinivasan Molecular Biophysics Unit, Indian Institute of Science, Bangalore Bangalore, India
Jean-Christophe Gelly Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France
Alexandre G de Brevern Institut National de la Santé et de la Recherche Médicale U 1134 Paris, France ; UMR_S 1134, DSIMB, Université Paris Diderot, Sorbonne Paris Cite Paris, France ; Institut National de la Transfusion Sanguine, DSIMB Paris, France ; UMR_S 1134, DSIMB, Laboratory of Excellence GR-Ex Paris, France

Abbass J, Nebel JC. Customised fragments libraries for protein structure prediction based on structural class annotations. BMC Bioinformatics 2015;16:136. [PMID: 25925397 PMCID: PMC4419399 DOI: 10.1186/s12859-015-0576-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/08/2014] [Accepted: 04/17/2015] [Indexed: 12/05/2022]

Abstract

Background

Since experimental techniques are time-consuming and costly, in silico protein structure prediction is essential to produce conformations of protein targets. When homologous structures are not available, fragment-based protein structure prediction has become the approach of choice. However, it still has many issues, including poor performance when targets’ lengths are above 100 residues, excessive running times and sub-optimal energy functions. Taking advantage of the reliable performance of structural class prediction software, we propose to address some of the limitations of fragment-based methods by integrating structural constraints in their fragment selection process.

Results

Using Rosetta, a state-of-the-art fragment-based protein structure prediction package, we evaluated our proposed pipeline on 70 former CASP targets containing up to 150 amino acids. Using either CATH- or SCOP-based structural class annotations, enhancement of structure prediction performance is highly significant in terms of both GDT_TS (at least +2.6, p-values < 0.0005) and RMSD (−0.4, p-values < 0.005). Although CATH and SCOP classifications are different, they perform similarly. Moreover, proteins from all structural classes benefit from the proposed methodology. Further analysis also shows that methods relying on class-based fragments produce conformations that are more relevant to the user and converge more quickly towards the best model as estimated by GDT_TS (up to 10% on average). This substantiates our hypothesis that using structurally relevant templates not only reduces the size of the conformation space to be explored but also focuses the search on a more relevant area.

Conclusions

Since our methodology produces models whose quality is up to 7% higher on average than that of models generated by a standard fragment-based predictor, we believe it should be considered before conducting any fragment-based protein structure prediction. Despite such progress, ab initio prediction remains a challenging task, especially for proteins of average and large sizes. Apart from improving search strategies and energy functions, integration of additional constraints seems a promising route, especially if they can be accurately predicted from sequence alone.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0576-2) contains supplementary material, which is available to authorized users.

de Oliveira SHP, Shi J, Deane CM. Building a better fragment library for de novo protein structure prediction. PLoS One 2015;10:e0123998. [PMID: 25901595 PMCID: PMC4406757 DOI: 10.1371/journal.pone.0123998] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2014] [Accepted: 02/25/2015] [Indexed: 01/11/2023] Open

Lushington GH. Comparative modeling of proteins. Methods Mol Biol 2015;1215:309-30. [PMID: 25330969 DOI: 10.1007/978-1-4939-1465-4_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]

Rysavy SJ, Beck DAC, Daggett V. Dynameomics: data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction. Protein Sci 2014;23:1584-95. [PMID: 25142412 DOI: 10.1002/pro.2537] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2014] [Revised: 07/30/2014] [Accepted: 08/17/2014] [Indexed: 12/26/2022]

Molloy K, Van MJ, Barbara D, Shehu A. Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space. BMC Bioinformatics 2014;15 Suppl 8:S4. [PMID: 25080993 PMCID: PMC4120149 DOI: 10.1186/1471-2105-15-s8-s4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons can detect close homologs of a protein of interest, so that functional information can be transferred from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs, but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported by the analysis in this paper, that such maps preserve functional co-localization of the protein structure space.

METHODS

Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain.

RESULTS

We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.

CONCLUSIONS

This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools.

Sterpone F, Melchionna S, Tuffery P, Pasquali S, Mousseau N, Cragnolini T, Chebaro Y, St-Pierre JF, Kalimeri M, Barducci A, Laurin Y, Tek A, Baaden M, Nguyen PH, Derreumaux P. The OPEP protein model: from single molecules, amyloid formation, crowding and hydrodynamics to DNA/RNA systems. Chem Soc Rev 2014;43:4871-93. [PMID: 24759934 PMCID: PMC4426487 DOI: 10.1039/c4cs00048j] [Citation(s) in RCA: 120] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]

Schneider B, Černý J, Svozil D, Čech P, Gelly JC, de Brevern AG. Bioinformatic analysis of the protein/DNA interface. Nucleic Acids Res 2014;42:3381-94. [PMID: 24335080 PMCID: PMC3950675 DOI: 10.1093/nar/gkt1273] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2013] [Revised: 11/14/2013] [Accepted: 11/14/2013] [Indexed: 01/04/2023] Open

Affiliation(s)

Bohdan Schneider, Jiří Černý, Daniel Svozil, Petr Čech, Jean-Christophe Gelly, Alexandre G. de Brevern
Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France

Ma J, Wang S. Algorithms, Applications, and Challenges of Protein Structure Alignment. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2014;94:121-75. [DOI: 10.1016/b978-0-12-800168-4.00005-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Shen Y, Picord G, Guyon F, Tuffery P. Detecting protein candidate fragments using a structural alphabet profile comparison approach. PLoS One 2013;8:e80493. [PMID: 24303019 PMCID: PMC3841190 DOI: 10.1371/journal.pone.0080493] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2013] [Accepted: 10/03/2013] [Indexed: 01/28/2023] Open

Edwards H, Abeln S, Deane CM. Exploring fold space preferences of new-born and ancient protein superfamilies. PLoS Comput Biol 2013;9:e1003325. [PMID: 24244135 PMCID: PMC3828129 DOI: 10.1371/journal.pcbi.1003325] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2013] [Accepted: 09/23/2013] [Indexed: 11/18/2022] Open

Soong TT, Hwang MJ, Chen CM. Discovery of Recurrent Structural Motifs for Approximating Three-Dimensional Protein Structures. J CHIN CHEM SOC-TAIP 2013. [DOI: 10.1002/jccs.200400164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Molloy K, Saleh S, Shehu A. Probabilistic search and energy guidance for biased decoy sampling in ab initio protein structure prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013;10:1162-1175. [PMID: 24384705 DOI: 10.1109/tcbb.2013.29] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]

Dhingra P, Jayaram B. A homology/ab initio hybrid algorithm for sampling near-native protein conformations. J Comput Chem 2013;34:1925-36. [PMID: 23728619 DOI: 10.1002/jcc.23339] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2012] [Revised: 03/09/2013] [Accepted: 04/21/2013] [Indexed: 12/19/2022]

Johansson MU, Zoete V, Guex N. Recurrent structural motifs in non-homologous protein structures. Int J Mol Sci 2013;14:7795-814. [PMID: 23574940 PMCID: PMC3645717 DOI: 10.3390/ijms14047795] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2013] [Revised: 03/27/2013] [Accepted: 04/01/2013] [Indexed: 11/18/2022] Open

Gullotto D, Nolassi MS, Bernini A, Spiga O, Niccolai N. Probing the protein space for extending the detection of weak homology folds. J Theor Biol 2013;320:152-8. [DOI: 10.1016/j.jtbi.2012.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2012] [Revised: 11/03/2012] [Accepted: 12/05/2012] [Indexed: 12/19/2022]

Kolodny R, Kosloff M. From Protein Structure to Function via Computational Tools and Approaches. Isr J Chem 2013. [DOI: 10.1002/ijch.201200078] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Sunami T, Kono H. Local conformational changes in the DNA interfaces of proteins. PLoS One 2013;8:e56080. [PMID: 23418514 PMCID: PMC3571985 DOI: 10.1371/journal.pone.0056080] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2012] [Accepted: 01/03/2013] [Indexed: 11/18/2022] Open

Finding short structural motifs for re-construction of proteins 3D structure. Appl Soft Comput 2013. [DOI: 10.1016/j.asoc.2012.10.027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Xu D, Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins 2012;81:229-39. [PMID: 22972754 DOI: 10.1002/prot.24179] [Citation(s) in RCA: 170] [Impact Index Per Article: 14.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2012] [Revised: 08/06/2012] [Accepted: 09/03/2012] [Indexed: 01/03/2023]

Yadav A, Jayaraman VK. Structure based function prediction of proteins using fragment library frequency vectors. Bioinformation 2012;8:953-6. [PMID: 23144557 PMCID: PMC3488839 DOI: 10.6026/97320630008953] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2012] [Accepted: 09/19/2012] [Indexed: 11/23/2022] Open

Maadooliat M, Gao X, Huang JZ. Assessing protein conformational sampling methods based on bivariate lag-distributions of backbone angles. Brief Bioinform 2012;14:724-36. [PMID: 22926831 DOI: 10.1093/bib/bbs052] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open

Chellapa GD, Rose GD. Reducing the dimensionality of the protein-folding search problem. Protein Sci 2012;21:1231-40. [PMID: 22692765 DOI: 10.1002/pro.2106] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2012] [Revised: 06/04/2012] [Accepted: 06/05/2012] [Indexed: 11/10/2022]

Joseph AP, Valadié H, Srinivasan N, de Brevern AG. Local structural differences in homologous proteins: specificities in different SCOP classes. PLoS One 2012;7:e38805. [PMID: 22745680 PMCID: PMC3382195 DOI: 10.1371/journal.pone.0038805] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2011] [Accepted: 05/10/2012] [Indexed: 11/19/2022] Open

Abstract

The constant increase in the number of solved protein structures is of great help in understanding the basic principles behind protein folding and evolution. 3-D structural knowledge is valuable in designing and developing methods for comparison, modelling and prediction of protein structures. These approaches for structure analysis can be directly applied to studying protein function and to drug design. The backbone of a protein structure favours certain local conformations which include α-helices, β-strands and turns. Libraries of a limited number of local conformations (Structural Alphabets) were developed in the past to obtain a useful categorization of backbone conformation. Protein Block (PB) is one such Structural Alphabet that gave a reasonable structure approximation of 0.42 Å. In this study, we use the PB description of local structures to analyse conformations that are preferred sites for structural variations and insertions, among groups of related folds. This knowledge can be utilized in improving tools for structure comparison that work by analysing local structure similarities. Conformational differences between homologous proteins are known to occur often in the regions comprising turns and loops. Interestingly, these differences are found to have specific preferences depending upon the structural classes of proteins. Such class-specific preferences are mainly seen in the all-β class, with changes involving short helical conformations and hairpin turns. A test carried out on a benchmark dataset also indicates that the use of knowledge on the class-specific variations can improve the performance of a PB-based structure comparison approach. The preferences for the indel sites also seem to be confined to a few backbone conformations involving β-turns and helix C-caps. These are mainly associated with short loops joining the regular secondary structures that mediate a reversal in the chain direction. Rare β-turns of type I’ and II’ are also identified as preferred sites for insertions.

Moreno-Hernández S, Levitt M. Comparative modeling and protein-like features of hydrophobic-polar models on a two-dimensional lattice. Proteins 2012;80:1683-93. [PMID: 22411636 DOI: 10.1002/prot.24067] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2011] [Revised: 02/26/2012] [Accepted: 03/03/2012] [Indexed: 11/07/2022]