Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sadreyev RI, Kim BH, Grishin NV. Discrete-continuous duality of protein structure space. Curr Opin Struct Biol 2009;19:321-8. [PMID: 19482467 PMCID: PMC3688466 DOI: 10.1016/j.sbi.2009.04.009] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2009] [Revised: 04/29/2009] [Accepted: 04/29/2009] [Indexed: 11/30/2022]

For:	Sadreyev RI, Kim BH, Grishin NV. Discrete-continuous duality of protein structure space. Curr Opin Struct Biol 2009;19:321-8. [PMID: 19482467 PMCID: PMC3688466 DOI: 10.1016/j.sbi.2009.04.009] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2009] [Revised: 04/29/2009] [Accepted: 04/29/2009] [Indexed: 11/30/2022]

Number

Cited by Other Article(s)

Porter LL. Fluid protein fold space and its implications. Bioessays 2023;45:e2300057. [PMID: 37431685 PMCID: PMC10529699 DOI: 10.1002/bies.202300057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2023] [Revised: 06/21/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]

Schaeffer RD, Zhang J, Kinch LN, Pei J, Cong Q, Grishin NV. Classification of domains in predicted structures of the human proteome. Proc Natl Acad Sci U S A 2023;120:e2214069120. [PMID: 36917664 PMCID: PMC10041065 DOI: 10.1073/pnas.2214069120] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2022] [Accepted: 02/06/2023] [Indexed: 03/16/2023] Open

Sykes J, Holland BR, Charleston MA. A review of visualisations of protein fold networks and their relationship with sequence and function. Biol Rev Camb Philos Soc 2023;98:243-262. [PMID: 36210328 PMCID: PMC10092621 DOI: 10.1111/brv.12905] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2021] [Revised: 09/08/2022] [Accepted: 09/09/2022] [Indexed: 01/12/2023]

Bourne PE, Draizen EJ, Mura C. The curse of the protein ribbon diagram. PLoS Biol 2022;20:e3001901. [PMID: 36508416 PMCID: PMC9744277 DOI: 10.1371/journal.pbio.3001901] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open

Gullotto D. Fine tuned exploration of evolutionary relationships within the protein universe. Stat Appl Genet Mol Biol 2021;20:17-36. [PMID: 33594839 DOI: 10.1515/sagmb-2019-0039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2019] [Accepted: 01/12/2021] [Indexed: 11/15/2022]

Searching protein space for ancient sub-domain segments. Curr Opin Struct Biol 2021;68:105-112. [PMID: 33476896 DOI: 10.1016/j.sbi.2020.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2020] [Accepted: 11/29/2020] [Indexed: 01/08/2023]

Mura C, Veretnik S, Bourne PE. The Urfold: Structural similarity just above the superfold level? Protein Sci 2019;28:2119-2126. [PMID: 31599042 PMCID: PMC6863707 DOI: 10.1002/pro.3742] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2019] [Revised: 09/30/2019] [Accepted: 10/01/2019] [Indexed: 01/16/2023]

Verma R, Pandit SB. Unraveling the structural landscape of intra-chain domain interfaces: Implication in the evolution of domain-domain interactions. PLoS One 2019;14:e0220336. [PMID: 31374091 PMCID: PMC6677297 DOI: 10.1371/journal.pone.0220336] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2019] [Accepted: 07/12/2019] [Indexed: 12/22/2022] Open

Abstract

Intra-chain domain interactions are known to play a significant role in the function and stability of multidomain proteins. These interactions are mediated through a physical interaction at domain-domain interfaces (DDIs). With a motivation to understand evolution of interfaces, we have investigated similarities among DDIs. Even though interfaces of protein-protein interactions (PPIs) have been previously studied by structurally aligning interfaces, similar analyses have not yet been performed on DDIs of either multidomain proteins or PPIs. For studying the structural landscape of DDIs, we have used iAlign to structurally align intra-chain domain interfaces of domains. The interface alignment of spatially constrained domains (due to inter-domain linkers) showed that ~88% of these could identify a structural matching interface having similar C-alpha geometry and contact pattern despite that aligned domain pairs are not structurally related. Moreover, the mean interface similarity score (IS-score) is 0.307, which is higher compared to the average random IS-score (0.207) suggesting domain interfaces are not random. The structural space of DDIs is highly connected as ~84% of all possible directed edges among interfaces are found to have at most path length of 8 when 0.26 is IS-score threshold. At this threshold, ~83% of interfaces form the largest strongly connected component. Thus, suggesting that structural space of intra-chain domain interfaces is degenerate and highly connected, as has been found in PPI interfaces. Interestingly, searching for structural neighbors of inter-chain interfaces among intra-chain interfaces showed that ~86% could find a statistically significant match to intra-chain interface with a mean IS-score of 0.311. This implies that domain interfaces are degenerate whether formed within a protein or between proteins. The interface degeneracy is most likely due to limited possible ways of packing secondary structures. In principle, interface similarities can be exploited to accurately model domain interfaces in structure prediction of multidomain proteins.

Collapse

Herman JL. Enhancing Statistical Multiple Sequence Alignment and Tree Inference Using Structural Information. Methods Mol Biol 2019;1851:183-214. [PMID: 30298398 DOI: 10.1007/978-1-4939-8736-8_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Kumirov VK, Dykstra EM, Hall BM, Anderson WJ, Szyszka TN, Cordes MHJ. Multistep mutational transformation of a protein fold through structural intermediates. Protein Sci 2018;27:1767-1779. [PMID: 30051937 DOI: 10.1002/pro.3488] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2018] [Revised: 07/24/2018] [Accepted: 07/25/2018] [Indexed: 12/24/2022]

Jaña GA, Mendoza F, Osorio MI, Alderete JB, Fernandes PA, Ramos MJ, Jiménez VA. A QM/MM approach on the structural and stereoelectronic factors governing glycosylation by GTF-SI fromStreptococcus mutans. Org Biomol Chem 2018;16:2438-2447. [DOI: 10.1039/c8ob00284c] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Dybas JM, Fiser A. Development of a motif-based topology-independent structure comparison method to identify evolutionarily related folds. Proteins 2016;84:1859-1874. [PMID: 27671894 PMCID: PMC5118133 DOI: 10.1002/prot.25169] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/09/2022]

Margelevičius M. Bayesian nonparametrics in protein remote homology search. Bioinformatics 2016;32:2744-52. [DOI: 10.1093/bioinformatics/btw213] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2015] [Accepted: 04/14/2016] [Indexed: 11/14/2022] Open

Das S, Dawson NL, Orengo CA. Diversity in protein domain superfamilies. Curr Opin Genet Dev 2015;35:40-9. [PMID: 26451979 PMCID: PMC4686048 DOI: 10.1016/j.gde.2015.09.005] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2015] [Revised: 09/07/2015] [Accepted: 09/08/2015] [Indexed: 01/25/2023]

Machine Learnable Fold Space Representation based on Residue Cluster Classes. Comput Biol Chem 2015;59 Pt A:1-7. [PMID: 26366526 DOI: 10.1016/j.compbiolchem.2015.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2014] [Revised: 07/17/2015] [Accepted: 07/25/2015] [Indexed: 11/21/2022]

Edwards H, Deane CM. Structural Bridges through Fold Space. PLoS Comput Biol 2015;11:e1004466. [PMID: 26372166 PMCID: PMC4570669 DOI: 10.1371/journal.pcbi.1004466] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2015] [Accepted: 07/12/2015] [Indexed: 12/05/2022] Open

Abstract

Several protein structure classification schemes exist that partition the protein universe into structural units called folds. Yet these schemes do not discuss how these units sit relative to each other in a global structure space. In this paper we construct networks that describe such global relationships between folds in the form of structural bridges. We generate these networks using four different structural alignment methods across multiple score thresholds. The networks constructed using the different methods remain a similar distance apart regardless of the probability threshold defining a structural bridge. This suggests that at least some structural bridges are method specific and that any attempt to build a picture of structural space should not be reliant on a single structural superposition method. Despite these differences all representations agree on an organisation of fold space into five principal community structures: all-α, all-β sandwiches, all-β barrels, α/β and α + β. We project estimated fold ages onto the networks and find that not only are the pairings of unconnected folds associated with higher age differences than bridged folds, but this difference increases with the number of networks displaying an edge. We also examine different centrality measures for folds within the networks and how these relate to fold age. While these measures interpret the central core of fold space in varied ways they all identify the disposition of ancestral folds to fall within this core and that of the more recently evolved structures to provide the peripheral landscape. These findings suggest that evolutionary information is encoded along these structural bridges. Finally, we identify four highly central pivotal folds representing dominant topological features which act as key attractors within our landscapes.

Folds are considered to be the structural units which make up the protein universe. Structural classification schemes focus on the assignment and organisation of protein domains into folds. However, they do not suggest how different folds might relate to one another in a global way. We introduce the concept of bridges through fold space: significant similarities between these units. We consider four alignment methods and a dynamic approach to placing these bridges. A greater consensus between these methods cannot be achieved by simply increasing the stringency with which edges are assigned. Instead, we emphasise the importance of considering consensus maps and only report results where there is agreement across all networks. It is possible that a study of the bridges may reveal evolutionary relationships. Based on a phylogenetic analysis of structures, we find that bridges consistently fall between folds which evolved at similar times. Moreover, the landscapes all consist of a core of older folds, with younger structures more often seen at the periphery. Finally we identify four pivotal folds in the landscapes. They contain topological motifs which unite disparate regions of fold space.

Collapse

Rackovsky S. Nonlinearities in protein space limit the utility of informatics in protein biophysics. Proteins 2015;83:1923-8. [PMID: 26315852 DOI: 10.1002/prot.24916] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2015] [Revised: 08/12/2015] [Accepted: 08/20/2015] [Indexed: 11/08/2022]

Holzgräfe C, Wallin S. Smooth functional transition along a mutational pathway with an abrupt protein fold switch. Biophys J 2015;107:1217-1225. [PMID: 25185557 DOI: 10.1016/j.bpj.2014.07.020] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2014] [Revised: 06/25/2014] [Accepted: 07/01/2014] [Indexed: 10/24/2022] Open

De novo protein design: how do we expand into the universe of possible protein structures? Curr Opin Struct Biol 2015;33:16-26. [DOI: 10.1016/j.sbi.2015.05.009] [Citation(s) in RCA: 128] [Impact Index Per Article: 14.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2015] [Revised: 05/15/2015] [Accepted: 05/25/2015] [Indexed: 01/08/2023]

Lhota J, Hauptman R, Hart T, Ng C, Xie L. A new method to improve network topological similarity search: applied to fold recognition. Bioinformatics 2015;31:2106-14. [PMID: 25717198 DOI: 10.1093/bioinformatics/btv125] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2014] [Accepted: 02/21/2015] [Indexed: 11/14/2022] Open

Affiliation(s)

John Lhota Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A
Ruth Hauptman Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A
Thomas Hart Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A
Clara Ng Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A
Lei Xie Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A. Hunter College High School, New York, NY 10128, U.S.A., Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, U.S.A., Department of Biological Sciences, Hunter College, The City University of New York New York, NY 10065, U.S.A. and The Graduate Center, The City University of New York, New York, NY 10016, U.S.A

Collapse

Minami S, Sawada K, Chikenji G. How a spatial arrangement of secondary structure elements is dispersed in the universe of protein folds. PLoS One 2014;9:e107959. [PMID: 25243952 PMCID: PMC4171485 DOI: 10.1371/journal.pone.0107959] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2014] [Accepted: 08/18/2014] [Indexed: 11/18/2022] Open

Global view of the protein universe. Proc Natl Acad Sci U S A 2014;111:11691-6. [PMID: 25071170 DOI: 10.1073/pnas.1403395111] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Tóth-Petróczy A, Tawfik DS. The robustness and innovability of protein folds. Curr Opin Struct Biol 2014;26:131-8. [PMID: 25038399 DOI: 10.1016/j.sbi.2014.06.007] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2013] [Revised: 06/26/2014] [Accepted: 06/26/2014] [Indexed: 11/30/2022]

Myers-Turnbull D, Bliven SE, Rose PW, Aziz ZK, Youkharibache P, Bourne PE, Prlić A. Systematic detection of internal symmetry in proteins using CE-Symm. J Mol Biol 2014;426:2255-68. [PMID: 24681267 DOI: 10.1016/j.jmb.2014.03.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2013] [Revised: 03/17/2014] [Accepted: 03/18/2014] [Indexed: 11/26/2022]

Gao M, Skolnick J. A comprehensive survey of small-molecule binding pockets in proteins. PLoS Comput Biol 2013;9:e1003302. [PMID: 24204237 PMCID: PMC3812058 DOI: 10.1371/journal.pcbi.1003302] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2013] [Accepted: 09/11/2013] [Indexed: 11/19/2022] Open

Abstract

Many biological activities originate from interactions between small-molecule ligands and their protein targets. A detailed structural and physico-chemical characterization of these interactions could significantly deepen our understanding of protein function and facilitate drug design. Here, we present a large-scale study on a non-redundant set of about 20,000 known ligand-binding sites, or pockets, of proteins. We find that the structural space of protein pockets is crowded, likely complete, and may be represented by about 1,000 pocket shapes. Correspondingly, the growth rate of novel pockets deposited in the Protein Data Bank has been decreasing steadily over the recent years. Moreover, many protein pockets are promiscuous and interact with ligands of diverse scaffolds. Conversely, many ligands are promiscuous and interact with structurally different pockets. Through a physico-chemical and structural analysis, we provide insights into understanding both pocket promiscuity and ligand promiscuity. Finally, we discuss the implications of our study for the prediction of protein-ligand interactions based on pocket comparison.

The life of a living cell relies on many distinct proteins to carry out their functions. Most of these functions are rooted in interactions between the proteins and metabolites, small-molecules essential for life. By targeting specific proteins relevant to a disease, drug molecules may provide a cure. A deep understanding of the nature of interactions between proteins and small-molecules (or ligands) through analyzing their structures may help predict protein function or improve drug design. In this contribution, we present a large-scale analysis of a non-redundant set of over 20,000 experimental protein-ligand complex structures available in the current Protein Data Bank. We seek answers to several fundamental questions: How many representative pockets are there that serve as ligand-binding sites in proteins? To what extent can we infer a similar protein-ligand interaction by matching the structures of protein pockets? How different are the ligands found in the same pocket? For a promiscuous protein pocket, how does a pocket maintain favorable interactions with very different ligands? Conversely, how different are those pockets that interact with the same ligand? We find the structural space of protein pocket is small and that both protein promiscuity and ligand promiscuity are very common in Nature.

Collapse

Hadzipasic O, Wrabl JO, Hilser VJ. A horizontal alignment tool for numerical trend discovery in sequence data: application to protein hydropathy. PLoS Comput Biol 2013;9:e1003247. [PMID: 24130469 PMCID: PMC3794901 DOI: 10.1371/journal.pcbi.1003247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2013] [Accepted: 07/10/2013] [Indexed: 11/19/2022] Open

Ingles-Prieto A, Ibarra-Molero B, Delgado-Delgado A, Perez-Jimenez R, Fernandez JM, Gaucher EA, Sanchez-Ruiz JM, Gavira JA. Conservation of protein structure over four billion years. Structure 2013;21:1690-7. [PMID: 23932589 PMCID: PMC3774310 DOI: 10.1016/j.str.2013.06.020] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2013] [Revised: 06/07/2013] [Accepted: 06/26/2013] [Indexed: 01/07/2023]

Topham CM, Rouquier M, Tarrat N, André I. Adaptive Smith-Waterman residue match seeding for protein structural alignment. Proteins 2013;81:1823-39. [DOI: 10.1002/prot.24327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2013] [Revised: 04/22/2013] [Accepted: 05/15/2013] [Indexed: 12/30/2022]

Lenart A, Dudkiewicz M, Grynberg M, Pawłowski K. CLCAs - a family of metalloproteases of intriguing phylogenetic distribution and with cases of substituted catalytic sites. PLoS One 2013;8:e62272. [PMID: 23671590 PMCID: PMC3650047 DOI: 10.1371/journal.pone.0062272] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2012] [Accepted: 03/19/2013] [Indexed: 01/08/2023] Open

Abstract

The zinc-dependent metalloproteases with His-Glu-x-x-His (HExxH) active site motif, zincins, are a broad group of proteins involved in many metabolic and regulatory functions, and found in all forms of life. Human genome contains more than 100 genes encoding proteins with known zincin-like domains. A survey of all proteins containing the HExxH motif shows that approximately 52% of HExxH occurrences fall within known protein structural domains (as defined in the Pfam database). Domain families with majority of members possessing a conserved HExxH motif include, not surprisingly, many known and putative metalloproteases. Furthermore, several HExxH-containing protein domains thus identified can be confidently predicted to be putative peptidases of zincin fold. Thus, we predict zincin-like fold for eight uncharacterised Pfam families. Besides the domains with the HExxH motif strictly conserved, and those with sporadic occurrences, intermediate families are identified that contain some members with a conserved HExxH motif, but also many homologues with substitutions at the conserved positions. Such substitutions can be evolutionarily conserved and non-random, yet functional roles of these inactive zincins are not known. The CLCAs are a novel zincin-like protease family with many cases of substituted active sites. We show that this allegedly metazoan family has a number of bacterial and archaeal members. An extremely patchy phylogenetic distribution of CLCAs in prokaryotes and their conserved protein domain composition strongly suggests an evolutionary scenario of horizontal gene transfer (HGT) from multicellular eukaryotes to bacteria, providing an example of eukaryote-derived xenologues in bacterial genomes. Additionally, in a protein family identified here as closely homologous to CLCA, the CLCA_X (CLCA-like) family, a number of proteins is found in phages and plasmids, supporting the HGT scenario.

Collapse

Kolodny R, Pereyaslavets L, Samson AO, Levitt M. On the Universe of Protein Folds. Annu Rev Biophys 2013;42:559-82. [DOI: 10.1146/annurev-biophys-083012-130432] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Wu Y, Punta M, Xiao R, Acton TB, Sathyamoorthy B, Dey F, Fischer M, Skerra A, Rost B, Montelione GT, Szyperski T. NMR structure of lipoprotein YxeF from Bacillus subtilis reveals a calycin fold and distant homology with the lipocalin Blc from Escherichia coli. PLoS One 2012;7:e37404. [PMID: 22693626 PMCID: PMC3367933 DOI: 10.1371/journal.pone.0037404] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2012] [Accepted: 04/19/2012] [Indexed: 11/18/2022] Open

Affiliation(s)

Yibing Wu Department of Chemistry, State University of New York at Buffalo, Buffalo, New York, United States of America Northeast Structural Genomics Consortium
Marco Punta Department of Computer Science and Institute for Advanced Study, Technical University of Munich, Munich, Germany Northeast Structural Genomics Consortium
Rong Xiao Center of Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Robert Wood Johnson Medical School, The State University of New Jersey, Piscataway, New Jersey, United States of America Northeast Structural Genomics Consortium
Thomas B. Acton Center of Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Robert Wood Johnson Medical School, The State University of New Jersey, Piscataway, New Jersey, United States of America Northeast Structural Genomics Consortium
Bharathwaj Sathyamoorthy Department of Chemistry, State University of New York at Buffalo, Buffalo, New York, United States of America
Fabian Dey Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America Northeast Structural Genomics Consortium
Markus Fischer Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America Northeast Structural Genomics Consortium
Arne Skerra Munich Center for Integrated Protein Science, CIPS-M, and Lehrstuhl für Biologische Chemie, Technische Universität München, Freising-Weihenstephan, Germany
Burkhard Rost Department of Computer Science and Institute for Advanced Study, Technical University of Munich, Munich, Germany Northeast Structural Genomics Consortium
Gaetano T. Montelione Center of Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Robert Wood Johnson Medical School, The State University of New Jersey, Piscataway, New Jersey, United States of America Northeast Structural Genomics Consortium
Thomas Szyperski Department of Chemistry, State University of New York at Buffalo, Buffalo, New York, United States of America Northeast Structural Genomics Consortium * E-mail:

Collapse

Hensen U, Meyer T, Haas J, Rex R, Vriend G, Grubmüller H. Exploring protein dynamics space: the dynasome as the missing link between protein structure and function. PLoS One 2012;7:e33931. [PMID: 22606222 PMCID: PMC3350514 DOI: 10.1371/journal.pone.0033931] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2011] [Accepted: 02/20/2012] [Indexed: 12/25/2022] Open

Skolnick J, Zhou H, Brylinski M. Further evidence for the likely completeness of the library of solved single domain protein structures. J Phys Chem B 2012;116:6654-64. [PMID: 22272723 DOI: 10.1021/jp211052j] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Daniels NM, Kumar A, Cowen LJ, Menke M. Touring protein space with Matt. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012;9:286-93. [PMID: 21464511 PMCID: PMC3355523 DOI: 10.1109/tcbb.2011.70] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Brylinski M, Gao M, Skolnick J. Why not consider a spherical protein? Implications of backbone hydrogen bonding for protein structure and function. Phys Chem Chem Phys 2011;13:17044-55. [PMID: 21655593 DOI: 10.1039/c1cp21140d] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Abstract

The intrinsic ability of protein structures to exhibit the geometric features required for molecular function in the absence of evolution is examined in the context of three systems: the reference set of real, single domain protein structures, a library of computationally generated, compact homopolypeptides, artificial structures with protein-like secondary structural elements, and quasi-spherical random proteins packed at the same density as proteins but lacking backbone secondary structure and hydrogen bonding. Without any evolutionary selection, the library of artificial structures has similar backbone hydrogen bonding, global shape, surface to volume ratio and statistically significant structural matches to real protein global structures. Moreover, these artificial structures have native like ligand binding cavities, and a tiny subset has interfacial geometries consistent with native-like protein-protein interactions and DNA binding. In contrast, the quasi-spherical random proteins, being devoid of secondary structure, have a lower surface to volume ratio and lack ligand binding pockets and intermolecular interaction interfaces. Surprisingly, these quasi-spherical random proteins exhibit protein like distributions of virtual bond angles and almost all have a statistically significant structural match to real protein structures. This implies that it is local chain stiffness, even without backbone hydrogen bonding, and compactness that give rise to the likely completeness of the library solved single domain protein structures. These studies also suggest that the packing of secondary structural elements generates the requisite geometry for intermolecular binding. Thus, backbone hydrogen bonding plays an important role not only in protein structure but also in protein function. Such ability to bind biological molecules is an inherent feature of protein structure; if combined with appropriate protein sequences, it could provide the non-zero background probability for low-level function that evolution requires for selection to occur.

Collapse

Abroi A, Gough J. Are viruses a source of new protein folds for organisms? - Virosphere structure space and evolution. Bioessays 2011;33:626-35. [DOI: 10.1002/bies.201000126] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Hollup SM, Sadowski MI, Jonassen I, Taylor WR. Exploring the limits of fold discrimination by structural alignment: a large scale benchmark using decoys of known fold. Comput Biol Chem 2011;35:174-88. [PMID: 21704264 PMCID: PMC3145973 DOI: 10.1016/j.compbiolchem.2011.04.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2011] [Accepted: 04/23/2011] [Indexed: 11/10/2022]

Dai L, Zhou Y. Characterizing the existing and potential structural space of proteins by large-scale multiple loop permutations. J Mol Biol 2011;408:585-95. [PMID: 21376059 DOI: 10.1016/j.jmb.2011.02.056] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2010] [Revised: 02/22/2011] [Accepted: 02/24/2011] [Indexed: 10/18/2022]

Abstract

Worldwide structural genomics projects are increasing structure coverage of sequence space but have not significantly expanded the protein structure space itself (i.e., number of unique structural folds) since 2007. Discovering new structural folds experimentally by directed evolution and random recombination of secondary-structure blocks is also proved rarely successful. Meanwhile, previous computational efforts for large-scale mapping of protein structure space are limited to simple model proteins and led to an inconclusive answer on the completeness of the existing observed protein structure space. Here, we build novel protein structures by extending naturally occurring circular (single-loop) permutation to multiple loop permutations (MLPs). These structures are clustered by structural similarity measure called TM-score. The computational technique allows us to produce different structural clusters on the same naturally occurring, packed, stable core but with alternatively connected secondary-structure segments. A large-scale MLP of 2936 domains from structural classification of protein domains reproduces those existing structural clusters (63%) mostly as hubs for many nonredundant sequences and illustrates newly discovered novel clusters as islands adopted by a few sequences only. Results further show that there exist a significant number of novel potentially stable clusters for medium-size or large-size single-domain proteins, in particular, >100 amino acid residues, that are either not yet adopted by nature or adopted only by a few sequences. This study suggests that MLP provides a simple yet highly effective tool for engineering and design of novel protein structures (including naturally knotted proteins). The implication of recovering new-fold targets from critical assessment of structure prediction techniques (CASP) by MLP on template-based structure prediction is also discussed. Our MLP structures are available for download at the publication page of the Web site http://sparks.informatics.iupui.edu.

Collapse

Schaeffer RD, Daggett V. Protein folds and protein folding. Protein Eng Des Sel 2011;24:11-9. [PMID: 21051320 PMCID: PMC3003448 DOI: 10.1093/protein/gzq096] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2010] [Revised: 10/09/2010] [Accepted: 10/11/2010] [Indexed: 01/07/2023] Open

Structural space of protein-protein interfaces is degenerate, close to complete, and highly connected. Proc Natl Acad Sci U S A 2010;107:22517-22. [PMID: 21149688 DOI: 10.1073/pnas.1012820107] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Vuong TV, Wilson DB. Glycoside hydrolases: catalytic base/nucleophile diversity. Biotechnol Bioeng 2010;107:195-205. [PMID: 20552664 DOI: 10.1002/bit.22838] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Wu S, Zhang Y. Recognizing protein substructure similarity using segmental threading. Structure 2010;18:858-67. [PMID: 20637422 DOI: 10.1016/j.str.2010.04.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2010] [Revised: 04/02/2010] [Accepted: 04/03/2010] [Indexed: 11/15/2022]

Chubb D, Jefferys BR, Sternberg MJE, Kelley LA. Sequencing delivers diminishing returns for homology detection: implications for mapping the protein universe. ACTA ACUST UNITED AC 2010;26:2664-71. [PMID: 20843957 DOI: 10.1093/bioinformatics/btq527] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

On the evolutionary origins of "Fold Space Continuity": a study of topological convergence and divergence in mixed alpha-beta domains. J Struct Biol 2010;172:244-52. [PMID: 20691788 DOI: 10.1016/j.jsb.2010.07.016] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2010] [Revised: 06/25/2010] [Accepted: 07/31/2010] [Indexed: 11/21/2022]

Di Lena P, Fariselli P, Margara L, Vassura M, Casadio R. Fast overlapping of protein contact maps by alignment of eigenvectors. ACTA ACUST UNITED AC 2010;26:2250-8. [PMID: 20610612 DOI: 10.1093/bioinformatics/btq402] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Omelchenko MV, Galperin MY, Wolf YI, Koonin EV. Non-homologous isofunctional enzymes: a systematic analysis of alternative solutions in enzyme evolution. Biol Direct 2010;5:31. [PMID: 20433725 PMCID: PMC2876114 DOI: 10.1186/1745-6150-5-31] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2010] [Accepted: 04/30/2010] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Evolutionarily unrelated proteins that catalyze the same biochemical reactions are often referred to as analogous - as opposed to homologous - enzymes. The existence of numerous alternative, non-homologous enzyme isoforms presents an interesting evolutionary problem; it also complicates genome-based reconstruction of the metabolic pathways in a variety of organisms. In 1998, a systematic search for analogous enzymes resulted in the identification of 105 Enzyme Commission (EC) numbers that included two or more proteins without detectable sequence similarity to each other, including 34 EC nodes where proteins were known (or predicted) to have distinct structural folds, indicating independent evolutionary origins. In the past 12 years, many putative non-homologous isofunctional enzymes were identified in newly sequenced genomes. In addition, efforts in structural genomics resulted in a vastly improved structural coverage of proteomes, providing for definitive assessment of (non)homologous relationships between proteins.

RESULTS

We report the results of a comprehensive search for non-homologous isofunctional enzymes (NISE) that yielded 185 EC nodes with two or more experimentally characterized - or predicted - structurally unrelated proteins. Of these NISE sets, only 74 were from the original 1998 list. Structural assignments of the NISE show over-representation of proteins with the TIM barrel fold and the nucleotide-binding Rossmann fold. From the functional perspective, the set of NISE is enriched in hydrolases, particularly carbohydrate hydrolases, and in enzymes involved in defense against oxidative stress.

CONCLUSIONS

These results indicate that at least some of the non-homologous isofunctional enzymes were recruited relatively recently from enzyme families that are active against related substrates and are sufficiently flexible to accommodate changes in substrate specificity.

Collapse

Alva V, Remmert M, Biegert A, Lupas AN, Söding J. A galaxy of folds. Protein Sci 2010;19:124-30. [PMID: 19937658 DOI: 10.1002/pro.297] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]

Wrabl JO, Hilser VJ. Investigating homology between proteins using energetic profiles. PLoS Comput Biol 2010;6:e1000722. [PMID: 20361049 PMCID: PMC2845653 DOI: 10.1371/journal.pcbi.1000722] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2009] [Accepted: 02/25/2010] [Indexed: 11/19/2022] Open

Abstract

Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.

Protein structure and function are fundamentally determined by thermodynamics. However, for technical as well as historical reasons, current evolutionary classification schemes and bioinformatics tools do not fully utilize thermodynamic information to describe or analyze proteins. In this work, we address this deficiency by computationally estimating the position-specific thermodynamic quantities of stability (ΔG), enthalpy (ΔH), and entropy (TΔS) for a large and diverse representative sample of protein structures. The sample was drawn from an expertly curated database, such that accepted evolutionary relationships existed for all protein pairs. Importantly, trivial relationships between pairs highly similar in amino acid sequence were explicitly excluded. We found that all position-specific thermodynamic quantities ΔG, ΔH, and TΔS were more similar between proteins that were evolutionarily related (i.e., homologous), and were less similar between proteins that were not evolutionarily related (i.e., non-homologous), with stability being particularly similar between homologous proteins. However, interesting statistically significant exceptions to these trends were observed, exceptions that could indicate novel processes of functional adaptation or evolutionary fold change, mediated by thermodynamics, for the proteins involved. Taken together, these results expand our understanding of the role of thermodynamics in protein evolution and suggest an organizational framework for a future thermodynamically-informed classification of protein homology.

Collapse

Xu J, Zhang Y. How significant is a protein structure similarity with TM-score = 0.5? ACTA ACUST UNITED AC 2010;26:889-95. [PMID: 20164152 DOI: 10.1093/bioinformatics/btq066] [Citation(s) in RCA: 542] [Impact Index Per Article: 38.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Sadowski MI, Taylor WR. Protein structures, folds and fold spaces. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2010;22:033103. [PMID: 21386276 DOI: 10.1088/0953-8984/22/3/033103] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]