1
Affiliation(s)
- Rachel Kolodny
- Department of Computer Science, University of Haifa, Haifa 31905, Israel.
- Leonid Pereyaslavets
- Department of Structural Biology, Stanford University, Stanford, California 94305.
- Michael Levitt
- Department of Structural Biology, Stanford University, Stanford, California 94305.
2
Sippl MJ, Wiederstein M. Detection of spatial correlations in protein structures and molecular complexes. Structure 2012; 20:718-28. [PMID: 22483118 PMCID: PMC3320710 DOI: 10.1016/j.str.2012.01.024] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 10/14/2011] [Revised: 01/09/2012] [Accepted: 01/31/2012] [Indexed: 10/28/2022]
Abstract
Protein structures are frequently related by spectacular and often surprising similarities. Structural correlations among protein chains are routinely detected by various structure-matching techniques, but the comparison of oligomers and molecular complexes is largely uncharted territory. Here we solve the structure-matching problem for oligomers and large molecular aggregates, including the largest molecular complexes known today. We provide several challenging examples that cannot be handled by conventional structure-matching techniques and we report on a number of remarkable correlations. The examples cover the cell-puncturing device of bacteriophage T4, the secretion system of P. aeruginosa, members of the dehydrogenase family, DNA clamps, ferredoxin iron-storage cages, and virus capsids.
Affiliation(s)
- Manfred J Sippl
- Division of Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria.
3
Suhrer SJ, Gruber M, Wiederstein M, Sippl MJ. Effective techniques for protein structure mining. Methods Mol Biol 2012; 857:33-54. [PMID: 22323216 DOI: 10.1007/978-1-61779-588-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Retrieval and characterization of protein structure relationships are instrumental in a wide range of tasks in structural biology. The classification of protein structures (COPS) is a web service that provides efficient access to structure and sequence similarities for all currently available protein structures. Here, we focus on the application of COPS to the problem of template selection in homology modeling.
Affiliation(s)
- Stefan J Suhrer
- Center of Applied Molecular Engineering, Division of Bioinformatics, University of Salzburg, Salzburg, Austria.
4
Analysis of RNA binding by the dengue virus NS5 RNA capping enzyme. PLoS One 2011; 6:e25795. [PMID: 22022449 PMCID: PMC3192115 DOI: 10.1371/journal.pone.0025795] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 07/28/2011] [Accepted: 09/11/2011] [Indexed: 01/16/2023] Open
Abstract
Flaviviruses are small, capped positive sense RNA viruses that replicate in the cytoplasm of infected cells. Dengue virus and other related flaviviruses have evolved RNA capping enzymes to form the viral RNA cap structure that protects the viral genome and directs efficient viral polyprotein translation. The N-terminal domain of NS5 possesses the methyltransferase and guanylyltransferase activities necessary for forming mature RNA cap structures. The mechanism for flavivirus guanylyltransferase activity is currently unknown, and how the capping enzyme binds its diphosphorylated RNA substrate is important for deciphering how the flavivirus guanylyltransferase functions. In this report we examine how flavivirus NS5 N-terminal capping enzymes bind to the 5′ end of the viral RNA using a fluorescence polarization-based RNA binding assay. We observed that the KD for RNA binding is approximately 200 nM for the Dengue, Yellow Fever, and West Nile virus capping enzymes. Removal of one or both of the 5′ phosphates reduces binding affinity, indicating that the terminal phosphates contribute significantly to binding. RNA binding affinity is negatively affected by the presence of GTP or ATP and positively affected by S-adenosyl methionine (SAM). Structural superpositioning of the dengue virus capping enzyme with the Vaccinia virus VP39 protein bound to RNA suggests how the flavivirus capping enzyme may bind RNA, and mutagenesis analysis of residues in the putative RNA binding site demonstrates that several basic residues are critical for RNA binding. Several mutants show differential binding to 5′ di-, mono-, and un-phosphorylated RNAs. The mode of RNA binding appears similar to that found with other methyltransferase enzymes, and a discussion of diphosphorylated RNA binding is presented.
5
Teyra J, Hawkins J, Zhu H, Pisabarro MT. Studies on the inference of protein binding regions across fold space based on structural similarities. Proteins 2011; 79:499-508. [PMID: 21069715 DOI: 10.1002/prot.22897] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/27/2023]
Abstract
The emerging picture of a continuous protein fold space highlights the existence of non-obvious structural similarities between proteins with apparently different topologies. The identification of structural resemblances across fold space and the analysis of similar recognition regions may be a valuable source of information towards protein structure-based functional characterization. In this work, we use non-sequential structural alignment methods (ns-SAs) to identify structural similarities between protein pairs independently of their SCOP hierarchy, and we calculate the significance of binding region conservation using the interacting residues overlap in the ns-SA. We cluster the binding inferences for each family to distinguish already known family binding regions from putative new ones. Our methodology exploits the enormous amount of data available in the PDB to identify binding region similarities within protein families and to propose putative binding regions. Our results indicate that there is a plethora of structurally common binding regions among proteins, independently of current fold classifications. We obtain a 6- to 8-fold enrichment of novel binding regions, and identify binding inferences for 728 protein families that so far lack binding information in the PDB. We explore binding mode analogies between ligands from commonly clustered binding regions to investigate the utility of our methodology. A comprehensive analysis of the obtained binding inferences may help in the functional characterization of protein recognition and assist rational engineering. The data obtained in this work are available via the download link at www.scowlp.org.
Affiliation(s)
- Joan Teyra
- Structural Bioinformatics, BIOTEC, Technical University of Dresden, Tatzberg 47-51, 01307 Dresden, Germany.
6
Fernandez-Fuentes N, Dybas JM, Fiser A. Structural characteristics of novel protein folds. PLoS Comput Biol 2010; 6:e1000750. [PMID: 20421995 PMCID: PMC2858679 DOI: 10.1371/journal.pcbi.1000750] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Received: 06/16/2009] [Accepted: 03/19/2010] [Indexed: 11/29/2022] Open
Abstract
Folds are the basic building blocks of protein structures. Understanding the emergence of novel protein folds is an important step towards understanding the rules governing the evolution of protein structure and function and for developing tools for protein structure modeling and design. We explored the frequency of occurrences of an exhaustively classified library of supersecondary structural elements (Smotifs) in protein structures, in order to identify features that would define a fold as novel compared to previously known structures. We found that a surprisingly small set of Smotifs is sufficient to describe all known folds. Furthermore, novel folds do not require novel Smotifs, but rather are a new combination of existing ones. Novel folds can be typified by the inclusion of a relatively higher number of rarely occurring Smotifs in their structures and, to a lesser extent, by a novel topological combination of commonly occurring Smotifs. When investigating the structural features of Smotifs, we found that the top 10% of most frequent ones have a higher fraction of internal contacts, while some of the rarest motifs are larger and contain a longer loop region.
Structural genomics efforts aim at exploring the repertoire of three-dimensional structures of protein molecules. While genome scale sequencing projects have already provided us with all the genes of many organisms, it is the three-dimensional shape of gene-encoded proteins that defines all the interactions among these components. Understanding the versatility and, ultimately, the role of all possible molecular shapes in the cell is a necessary step toward understanding how organisms function. In this work we explored the rules that identify certain shapes as novel compared to all already known structures. The findings of this work provide possible insights into the rules that can be used in future works to identify or design new molecular shapes or to relate folds with each other in a quantitative manner.
Affiliation(s)
- Narcis Fernandez-Fuentes
- University of Leeds, Leeds Institute of Molecular Medicine Section of Experimental Therapeutics, St. James's University Hospital, Leeds, United Kingdom
- Joseph M. Dybas
- Department of Systems and Computational Biology, Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
- Andras Fiser
- Department of Systems and Computational Biology, Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, United States of America
7
Cuff A, Redfern OC, Greene L, Sillitoe I, Lewis T, Dibley M, Reid A, Pearl F, Dallman T, Todd A, Garratt R, Thornton J, Orengo C. The CATH hierarchy revisited-structural divergence in domain superfamilies and the continuity of fold space. Structure 2010; 17:1051-62. [PMID: 19679085 PMCID: PMC2741583 DOI: 10.1016/j.str.2009.06.015] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 07/21/2008] [Revised: 06/24/2009] [Accepted: 06/25/2009] [Indexed: 11/29/2022]
Abstract
This paper explores the structural continuum in CATH and the extent to which superfamilies adopt distinct folds. Although most superfamilies are structurally conserved, in some of the most highly populated superfamilies (4% of all superfamilies) there is considerable structural divergence. While relatives share a similar fold in the evolutionarily conserved core, diverse elaborations to this core can result in significant differences in the global structures. Applying similar protocols to examine the extent to which structural overlaps occur between different fold groups, it appears this effect is confined to just a few architectures and is largely due to small, recurring super-secondary motifs (e.g., αβ-motifs, α-hairpins). Although 24% of superfamilies overlap with superfamilies having different folds, only 14% of nonredundant structures in CATH are involved in overlaps. Nevertheless, the existence of these overlaps suggests that, in some regions of structure space, the fold universe should be seen as more continuous.
Affiliation(s)
- Alison Cuff
- Institute of Structural and Molecular Biology, University College London, London, UK.
8
Sippl MJ. Fold space unlimited. Curr Opin Struct Biol 2009; 19:312-20. [DOI: 10.1016/j.sbi.2009.03.010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 01/07/2009] [Revised: 02/16/2009] [Accepted: 03/16/2009] [Indexed: 11/25/2022]
9
Suhrer SJ, Wiederstein M, Gruber M, Sippl MJ. COPS--a novel workbench for explorations in fold space. Nucleic Acids Res 2009; 37:W539-44. [PMID: 19465386 PMCID: PMC2703906 DOI: 10.1093/nar/gkp411] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022] Open
Abstract
The COPS (Classification Of Protein Structures) web server provides access to the complete repertoire of known protein structures and protein structural domains. The COPS classification encodes pairwise structural similarities as quantified metric relationships. The resulting metrical structure is mapped to a hierarchical tree, which is largely equivalent to the structure of a file browser. Exploiting this relationship we implemented the Fold Space Navigator, a tool that makes navigation in fold space as convenient as browsing through a file system. Moreover, pairwise structural similarities among the domains can be visualized and inspected instantaneously. COPS is updated weekly and stays concurrent with the PDB repository. The server also exposes the COPS classification pipeline. Newly determined structures uploaded to the server are chopped into domains, the locations of the new domains in the classification tree are determined, and their neighborhood can be immediately explored through the Fold Space Navigator. The COPS web server is accessible at http://cops.services.came.sbg.ac.at/.
Affiliation(s)
- Stefan J Suhrer
- Center of Applied Molecular Engineering, Division of Bioinformatics, University of Salzburg, Hellbrunnerstrasse 34, 5020 Salzburg, Austria
10
Dessailly BH, Redfern OC, Cuff A, Orengo CA. Exploiting structural classifications for function prediction: towards a domain grammar for protein function. Curr Opin Struct Biol 2009; 19:349-56. [PMID: 19398323 DOI: 10.1016/j.sbi.2009.03.009] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 01/16/2009] [Revised: 02/17/2009] [Accepted: 03/16/2009] [Indexed: 12/28/2022]
Abstract
The ability to assign function to proteins has become a major bottleneck for comprehensively understanding cellular mechanisms at the molecular level. Here we discuss the extent to which structural domain classifications can help in deciphering the complex relationship between the functions of proteins and their sequences and structures. Structural classifications are particularly helpful in understanding the mosaic manner in which new proteins and functions emerge through evolution. This is partly because they provide reliable and concrete domain definitions and enable the detection of very remote structural similarities and homologies. It is also because structural data can illuminate more clearly the mechanisms by which a broader functional repertoire can emerge during evolution.
Affiliation(s)
- Benoît H Dessailly
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom
11
Cuff AL, Sillitoe I, Lewis T, Redfern OC, Garratt R, Thornton J, Orengo CA. The CATH classification revisited--architectures reviewed and new ways to characterize structural divergence in superfamilies. Nucleic Acids Res 2008; 37:D310-4. [PMID: 18996897 PMCID: PMC2686597 DOI: 10.1093/nar/gkn877] [Citation(s) in RCA: 157] [Impact Index Per Article: 9.8] [Indexed: 11/13/2022] Open
Abstract
The latest version of CATH (class, architecture, topology, homology) (version 3.2), released in July 2008 (http://www.cathdb.info), contains 114,215 domains, 2178 homologous superfamilies and 1110 fold groups. We have assigned 20,330 new domains, 87 new homologous superfamilies and 26 new folds since CATH release version 3.1. A total of 28,064 new domains have been assigned since our NAR 2007 database publication (CATH version 3.0). The CATH website has been completely redesigned and includes more comprehensive documentation. We have revisited the CATH architecture level as part of the development of a 'Protein Chart' and present information on the population of each architecture. The CATHEDRAL structure comparison algorithm has been improved and used to characterize structural diversity in CATH superfamilies and structural overlaps between superfamilies. Although the majority of superfamilies in CATH are not structurally diverse and do not overlap significantly with other superfamilies, approximately 4% of superfamilies are very diverse and these are the superfamilies that are most highly populated in both the PDB and in the genomes. Information on the degree of structural diversity in each superfamily and structural overlaps between superfamilies can now be downloaded from the CATH website.
Affiliation(s)
- Alison L Cuff
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK.
12
Carrillo-Tripp M, Shepherd CM, Borelli IA, Venkataraman S, Lander G, Natarajan P, Johnson JE, Brooks CL, Reddy VS. VIPERdb2: an enhanced and web API enabled relational database for structural virology. Nucleic Acids Res 2008; 37:D436-42. [PMID: 18981051 PMCID: PMC2686430 DOI: 10.1093/nar/gkn840] [Citation(s) in RCA: 275] [Impact Index Per Article: 17.2] [Indexed: 11/12/2022] Open
Abstract
VIPERdb (http://viperdb.scripps.edu) is a relational database and a web portal for icosahedral virus capsid structures. Our aim is to provide a comprehensive resource specific to the needs of the virology community, with an emphasis on the description and comparison of derived data from structural and computational analyses of the virus capsids. In the current release, VIPERdb(2), we implemented a useful and novel method to represent capsid protein residues in the icosahedral asymmetric unit (IAU) using azimuthal polar orthographic projections, otherwise known as Φ-Ψ (Phi-Psi) diagrams. In conjunction with a new Application Programming Interface (API), these diagrams can be used as a dynamic interface to the database to map residues (categorized as surface, interface and core residues) and identify family wide conserved residues including hotspots at the interfaces. Additionally, we enhanced the interactivity with the database by interfacing with web-based tools. In particular, the applications Jmol and STRAP were implemented to visualize and interact with the virus molecular structures and provide sequence-structure alignment capabilities. Together with extended curation practices that maintain data uniformity, a relational database implementation based on a schema for macromolecular structures and the APIs provided will greatly enhance the ability to do structural bioinformatics analysis of virus capsids.
Affiliation(s)
- Mauricio Carrillo-Tripp
- Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
13
Redfern OC, Dessailly B, Orengo CA. Exploring the structure and function paradigm. Curr Opin Struct Biol 2008; 18:394-402. [PMID: 18554899 DOI: 10.1016/j.sbi.2008.05.007] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Received: 03/18/2008] [Revised: 04/16/2008] [Accepted: 05/07/2008] [Indexed: 11/29/2022]
Abstract
Advances in protein structure determination, led by the structural genomics initiatives, have increased the proportion of novel folds deposited in the Protein Data Bank. However, these structures are often not accompanied by functional annotations with experimental confirmation. In this review, we reassess the meaning of structural novelty and examine its relevance to the complexity of the structure-function paradigm. Recent advances in the prediction of protein function from structure are discussed, as well as new sequence-based methods for partitioning large, diverse superfamilies into biologically meaningful clusters. Obtaining structural data for these functionally coherent groups of proteins will allow us to better understand the relationship between structure and function.
Affiliation(s)
- Oliver C Redfern
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom
14