Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sawada Y, Honda S. Structural diversity of protein segments follows a power-law distribution. Biophys J 2006;91:1213-23. [PMID: 16731566 PMCID: PMC1518648 DOI: 10.1529/biophysj.105.076661] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

For:	Sawada Y, Honda S. Structural diversity of protein segments follows a power-law distribution. Biophys J 2006;91:1213-23. [PMID: 16731566 PMCID: PMC1518648 DOI: 10.1529/biophysj.105.076661] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Number

Cited by Other Article(s)

Bagrova O, Lapshina K, Sidorova A, Shpigun D, Lutsenko A, Belova E. Secondary structure analysis of proteins within the same topology group. Biochem Biophys Res Commun 2024;734:150613. [PMID: 39222577 DOI: 10.1016/j.bbrc.2024.150613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2024] [Revised: 08/13/2024] [Accepted: 08/27/2024] [Indexed: 09/04/2024]

Jones AA, Snow CD. Porous protein crystals: synthesis and applications. Chem Commun (Camb) 2024;60:5790-5803. [PMID: 38756076 DOI: 10.1039/d4cc00183d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/18/2024]

Complex evolutionary footprints revealed in an analysis of reused protein segments of diverse lengths. Proc Natl Acad Sci U S A 2017;114:11703-11708. [PMID: 29078314 PMCID: PMC5676897 DOI: 10.1073/pnas.1707642114] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open

Abstract

We question a central paradigm: namely, that the protein domain is the “atomic unit” of evolution. In conflict with the current textbook view, our results unequivocally show that duplication of protein segments happens both above and below the domain level among amino acid segments of diverse lengths. Indeed, we show that significant evolutionary information is lost when the protein is approached as a string of domains. Our finer-grained approach reveals a far more complicated picture, where reused segments often intertwine and overlap with each other. Our results are consistent with a recursive model of evolution, in which segments of various lengths, typically smaller than domains, “hop” between environments. The fit segments remain, leaving traces that can still be detected.

Proteins share similar segments with one another. Such “reused parts”—which have been successfully incorporated into other proteins—are likely to offer an evolutionary advantage over de novo evolved segments, as most of the latter will not even have the capacity to fold. To systematically explore the evolutionary traces of segment “reuse” across proteins, we developed an automated methodology that identifies reused segments from protein alignments. We search for “themes”—segments of at least 35 residues of similar sequence and structure—reused within representative sets of 15,016 domains [Evolutionary Classification of Protein Domains (ECOD) database] or 20,398 chains [Protein Data Bank (PDB)]. We observe that theme reuse is highly prevalent and that reuse is more extensive when the length threshold for identifying a theme is lower. Structural domains, the best characterized form of reuse in proteins, are just one of many complex and intertwined evolutionary traces. Others include long themes shared among a few proteins, which encompass and overlap with shorter themes that recur in numerous proteins. The observed complexity is consistent with evolution by duplication and divergence, and some of the themes might include descendants of ancestral segments. The observed recursive footprints, where the same amino acid can simultaneously participate in several intertwined themes, could be a useful concept for protein design. Data are available at http://trachel-srv.cs.haifa.ac.il/rachel/ppi/themes/.

Collapse

Paeth M, Stapleton J, Dougherty ML, Fischesser H, Shepherd J, McCauley M, Falatach R, Page RC, Berberich JA, Konkolewicz D. Approaches for Conjugating Tailor-Made Polymers to Proteins. Methods Enzymol 2017;590:193-224. [DOI: 10.1016/bs.mie.2016.12.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Williams C, Dougherty ML, Makaroff K, Stapleton J, Konkolewicz D, Berberich JA, Page RC. Strategies for Biophysical Characterization of Protein–Polymer Conjugates. Methods Enzymol 2017;590:93-114. [DOI: 10.1016/bs.mie.2016.11.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

Motomura K, Fujita T, Tsutsumi M, Kikuzato S, Nakamura M, Otaki JM. Word decoding of protein amino Acid sequences with availability analysis: a linguistic approach. PLoS One 2012;7:e50039. [PMID: 23185527 PMCID: PMC3503725 DOI: 10.1371/journal.pone.0050039] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2012] [Accepted: 10/15/2012] [Indexed: 11/19/2022] Open

Abstract

The amino acid sequences of proteins determine their three-dimensional structures and functions. However, how sequence information is related to structures and functions is still enigmatic. In this study, we show that at least a part of the sequence information can be extracted by treating amino acid sequences of proteins as a collection of English words, based on a working hypothesis that amino acid sequences of proteins are composed of short constituent amino acid sequences (SCSs) or "words". We first confirmed that the English language highly likely follows Zipf's law, a special case of power law. We found that the rank-frequency plot of SCSs in proteins exhibits a similar distribution when low-rank tails are excluded. In comparison with natural English and "compressed" English without spaces between words, amino acid sequences of proteins show larger linear ranges and smaller exponents with heavier low-rank tails, demonstrating that the SCS distribution in proteins is largely scale-free. A distribution pattern of SCSs in proteins is similar among species, but species-specific features are also present. Based on the availability scores of SCSs, we found that sequence motifs are enriched in high-availability sites (i.e., "key words") and vice versa. In fact, the highest availability peak within a given protein sequence often directly corresponds to a sequence motif. The amino acid composition of high-availability sites within motifs is different from that of entire motifs and all protein sequences, suggesting the possible functional importance of specific SCSs and their compositional amino acids within motifs. We anticipate that our availability-based word decoding approach is complementary to sequence alignment approaches in predicting functionally important sites of unknown proteins from their amino acid sequences.

Collapse

Petersen SB, Neves-Petersen MT, Henriksen SB, Mortensen RJ, Geertz-Hansen HM. Scale-free behaviour of amino acid pair interactions in folded proteins. PLoS One 2012;7:e41322. [PMID: 22848462 PMCID: PMC3406053 DOI: 10.1371/journal.pone.0041322] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2012] [Accepted: 06/20/2012] [Indexed: 11/19/2022] Open

Tomii K, Sawada Y, Honda S. Convergent evolution in structural elements of proteins investigated using cross profile analysis. BMC Bioinformatics 2012;13:11. [PMID: 22244085 PMCID: PMC3398312 DOI: 10.1186/1471-2105-13-11] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2011] [Accepted: 01/16/2012] [Indexed: 11/10/2022] Open

Abstract

Background

Evolutionary relations of similar segments shared by different protein folds remain controversial, even though many examples of such segments have been found. To date, several methods such as those based on the results of structure comparisons, sequence-based classifications, and sequence-based profile-profile comparisons have been applied to identify such protein segments that possess local similarities in both sequence and structure across protein folds. However, to capture more precise sequence-structure relations, no method reported to date combines structure-based profiles, and sequence-based profiles based on evolutionary information. The former are generally regarded as representing the amino acid preferences at each position of a specific conformation of protein segment. They might reflect the nature of ancient short peptide ancestors, using the results of structural classifications of protein segments.

Results

This report describes the development and use of "Cross Profile Analysis" to compare sequence-based profiles and structure-based profiles based on amino acid occurrences at each position within a protein segment cluster. Using systematic cross profile analysis, we found structural clusters of 9-residue and 15-residue segments showing remarkably strong correlation with particular sequence profiles. These correlations reflect structural similarities among constituent segments of both sequence-based and structure-based profiles. We also report previously undetectable sequence-structure patterns that transcend protein family and fold boundaries, and present results of the conformational analysis of the deduced peptide of a segment cluster. These results suggest the existence of ancient short-peptide ancestors.

Conclusions

Cross profile analysis reveals the polyphyletic and convergent evolution of β-hairpin-like structures, which were verified both experimentally and computationally. The results presented here give us new insights into the evolution of short protein segments.

Collapse

Ito JI, Sonobe Y, Ikeda K, Tomii K, Higo J. Universal partitioning of the hierarchical fold network of 50-residue segments in proteins. BMC STRUCTURAL BIOLOGY 2009;9:34. [PMID: 19454039 PMCID: PMC2693521 DOI: 10.1186/1472-6807-9-34] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/07/2008] [Accepted: 05/20/2009] [Indexed: 11/23/2022]

Abstract

Background

Several studies have demonstrated that protein fold space is structured hierarchically and that power-law statistics are satisfied in relation between the numbers of protein families and protein folds (or superfamilies). We examined the internal structure and statistics in the fold space of 50 amino-acid residue segments taken from various protein folds. We used inter-residue contact patterns to measure the tertiary structural similarity among segments. Using this similarity measure, the segments were classified into a number (K_c) of clusters. We examined various K_cvalues for the clustering. The special resolution to differentiate the segment tertiary structures increases with increasing K_c. Furthermore, we constructed networks by linking structurally similar clusters.

Results

The network was partitioned persistently into four regions for K_c≥ 1000. This main partitioning is consistent with results of earlier studies, where similar partitioning was reported in classifying protein domain structures. Furthermore, the network was partitioned naturally into several dozens of sub-networks (i.e., communities). Therefore, intra-sub-network clusters were mutually connected with numerous links, although inter-sub-network ones were rarely done with few links. For K_c≥ 1000, the major sub-networks were about 40; the contents of the major sub-networks were conserved. This sub-partitioning is a novel finding, suggesting that the network is structured hierarchically: Segments construct a cluster, clusters form a sub-network, and sub-networks constitute a region. Additionally, the network was characterized by non-power-law statistics, which is also a novel finding.

Conclusion

Main findings are: (1) The universe of 50 residue segments found here was characterized by non-power-law statistics. Therefore, the universe differs from those ever reported for the protein domains. (2) The 50-residue segments were partitioned persistently and universally into some dozens (ca. 40) of major sub-networks, irrespective of the number of clusters. (3) These major sub-networks encompassed 90% of all segments. Consequently, the protein tertiary structure is constructed using the dozens of elements (sub-networks).

Collapse

Nacher JC, Hayashida M, Akutsu T. Emergence of scale-free distribution in protein-protein interaction networks based on random selection of interacting domain pairs. Biosystems 2008;95:155-9. [PMID: 19010382 DOI: 10.1016/j.biosystems.2008.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2008] [Revised: 10/03/2008] [Accepted: 10/05/2008] [Indexed: 11/18/2022]

Honda S, Akiba T, Kato YS, Sawada Y, Sekijima M, Ishimura M, Ooishi A, Watanabe H, Odahara T, Harata K. Crystal structure of a ten-amino acid protein. J Am Chem Soc 2008;130:15327-31. [PMID: 18950166 DOI: 10.1021/ja8030533] [Citation(s) in RCA: 130] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Sawada Y, Honda S. ProSeg: a database of local structures of protein segments. J Comput Aided Mol Des 2008;23:163-9. [DOI: 10.1007/s10822-008-9248-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2008] [Accepted: 09/26/2008] [Indexed: 11/29/2022]

Ikeda K, Hirokawa T, Higo J, Tomii K. Protein-segment universe exhibiting transitions at intermediate segment length in conformational subspaces. BMC STRUCTURAL BIOLOGY 2008;8:37. [PMID: 18700043 PMCID: PMC2529298 DOI: 10.1186/1472-6807-8-37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2008] [Accepted: 08/13/2008] [Indexed: 12/05/2022]

Abstract

Background

Many studies have examined rules governing two aspects of protein structures: short segments and proteins' structural domains. Nevertheless, the organization and nature of the conformational space of segments with intermediate length between short segments and domains remain unclear. Conformational spaces of intermediate length segments probably differ from those of short segments. We investigated the identification and characterization of the boundary(s) between peptide-like (short segment) and protein-like (long segment) distributions. We generated ensembles embedded in globular proteins comprising segments 10–50 residues long. We explored the relationships between the conformational distribution of segments and their lengths, and also protein structural classes using principal component analysis based on the intra-segment C_α-C_αatomic distances.

Results

Our statistical analyses of segment conformations and length revealed critical dual transitions in their conformational distribution with segments derived from all four structural classes. Dual transitions were identified with the intermediate phase between the short segments and domains. Consequently, protein segment universes were categorized. i) Short segments (10–22 residues) showed a distribution with a high frequency of secondary structure clusters. ii) Medium segments (23–26 residues) showed a distribution corresponding to an intermediate state of transitions. iii) Long segments (27–50 residues) showed a distribution converging on one huge cluster containing compact conformations with a smaller radius of gyration. This distribution reflects the protein structures' organization and protein domains' origin. Three major conformational components (radius of gyration, structural symmetry with respect to the N-terminal and C-terminal halves, and single-turn/two-turn structure) well define most of the segment universes. Furthermore, we identified several conformational components that were unique to each structural class. Those characteristics suggest that protein segment conformation is described by compositions of the three common structural variables with large contributions and specific structural variables with small contributions.

Conclusion

The present results of the analyses of four protein structural classes show the universal role of three major components as segment conformational descriptors. The obtained perspectives of distribution changes related to the segment lengths using the three key components suggest both the adequacy and the possibility of further progress on the prediction strategies used in the recent de novo structure-prediction methods.

Collapse

Manikandan K, Pal D, Ramakumar S, Brener NE, Iyengar SS, Seetharaman G. Functionally important segments in proteins dissected using Gene Ontology and geometric clustering of peptide fragments. Genome Biol 2008;9:R52. [PMID: 18331637 PMCID: PMC2397504 DOI: 10.1186/gb-2008-9-3-r52] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2007] [Revised: 02/24/2008] [Accepted: 03/10/2008] [Indexed: 11/25/2022] Open