Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Karlin S, Brocchieri L, Bergman A, Mrazek J, Gentles AJ. Amino acid runs in eukaryotic proteomes and disease associations. Proc Natl Acad Sci U S A 2002;99:333-8. [PMID: 11782551 PMCID: PMC117561 DOI: 10.1073/pnas.012608599] [Citation(s) in RCA: 168] [Impact Index Per Article: 7.6] [Accepted: 11/14/2001] [Indexed: 11/18/2022] Open


Cited by Other Article(s)

Non-random distribution of homo-repeats: links with biological functions and human diseases. Sci Rep 2016;6:26941. [PMID: 27256590 PMCID: PMC4891720 DOI: 10.1038/srep26941] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 02/22/2016] [Accepted: 05/06/2016] [Indexed: 12/22/2022] Open

Zheng X, Li Y, Zhao J, Wang D, Xia H, Mao Q. Production and Characterization of Monoclonal Antibodies against Human Nuclear Protein FAM76B. PLoS One 2016;11:e0152237. [PMID: 27018871 PMCID: PMC4809503 DOI: 10.1371/journal.pone.0152237] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/03/2015] [Accepted: 03/10/2016] [Indexed: 11/18/2022] Open

Wu R, Liu Q, Zhang P, Liang D. Tandem amino acid repeats in the green anole (Anolis carolinensis) and other squamates may have a role in increasing genetic variability. BMC Genomics 2016;17:109. [PMID: 26868501 PMCID: PMC4751654 DOI: 10.1186/s12864-016-2430-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2015] [Accepted: 02/02/2016] [Indexed: 01/04/2023] Open

Abstract

Background

Tandem amino acid repeats are characterised by the consecutive recurrence of a single amino acid. They exhibit high rates of length mutations in addition to point mutations and have been proposed to be involved in genetic plasticity. Squamate reptiles (lizards and snakes) diversify in both morphology and physiology. The underlying mechanism is yet to be understood. In a previous phylogenomic analysis of reptiles, the density of tandem repeats in an anole lizard diverged heavily from that of the other reptiles. To gain further insight into the tandem amino acid repeats in squamates, we analysed the repeat content in the green anole (Anolis carolinensis) proteome and compared the amino acid repeats in a large orthologous protein data set from six vertebrates (the Western clawed frog, the green anole, the Chinese softshell turtle, the zebra finch, mouse and human).

Results

Our results revealed that the number of amino acid repeats in the green anole exceeded those found in the other five species studied. Species-only repeats were found in high proportion in the green anole but not in the other five species, suggesting that the green anole had gained many amino acid repeats in either the Anolis or the squamate lineage. Since the amino acid repeat containing genes in the green anole were highly enriched in genes related to transcription and development, an important family of developmental genes, i.e., the Hox family, was further studied in a wide collection of squamates. Abundant amino acid repeats were also observed, implying the general high tolerance of amino acid repeats in squamates. A particular enrichment of amino acid repeats was observed in the central class Hox genes that are known to be responsible for defining cervical to lumbar regions.

Conclusions

Our study suggests that the abundant amino acid repeats in the green anole, and possibly in other squamates, may play a role in increasing the genetic variability, and contribute to the evolutionary diversity of this clade.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-2430-y) contains supplementary material, which is available to authorized users.

Wu LZ, Xu XY, Liu YF, Ge X, Wang XJ. Expansion of polyalanine tracts in the QA domain may play a critical role in the clavicular development of cleidocranial dysplasia. J Genet 2015;94:551-3. [PMID: 26440098 DOI: 10.1007/s12041-015-0551-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]

Martins F, Gonçalves R, Oliveira J, Cruz-Monteagudo M, Nieto-Villar JM, Paz-y-Miño C, Rebelo I, Tejera E. Unravelling the relationship between protein sequence and low-complexity regions entropies: Interactome implications. J Theor Biol 2015;382:320-7. [PMID: 26164061 DOI: 10.1016/j.jtbi.2015.06.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2015] [Revised: 06/12/2015] [Accepted: 06/28/2015] [Indexed: 10/23/2022]

Radó-Trilla N, Arató K, Pegueroles C, Raya A, de la Luna S, Albà MM. Key Role of Amino Acid Repeat Expansions in the Functional Diversification of Duplicated Transcription Factors. Mol Biol Evol 2015;32:2263-72. [PMID: 25931513 PMCID: PMC4540963 DOI: 10.1093/molbev/msv103] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 12/27/2022] Open

Wear MP, Kryndushkin D, O’Meally R, Sonnenberg JL, Cole RN, Shewmaker FP. Proteins with Intrinsically Disordered Domains Are Preferentially Recruited to Polyglutamine Aggregates. PLoS One 2015;10:e0136362. [PMID: 26317359 PMCID: PMC4552826 DOI: 10.1371/journal.pone.0136362] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 04/30/2015] [Accepted: 07/31/2015] [Indexed: 12/12/2022] Open

Wei W, Davis RE, Suo X, Zhao Y. Occurrence, distribution and possible functional roles of simple sequence repeats in phytoplasma genomes. Int J Syst Evol Microbiol 2015;65:2748-2760. [DOI: 10.1099/ijs.0.000273] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023] Open

Abstract

Phytoplasmas are unculturable, cell-wall-less bacteria that parasitize plants and insects. This transkingdom life cycle requires rapid responses to vastly different environments, including transitions from plant phloem sieve elements to various insect tissues and alternations among diverse plant hosts. Features that enable such flexibility in other microbes include simple sequence repeats (SSRs), mutation-prone, phase-variable short DNA tracts that function as ‘evolutionary rheostats’ and enhance rapid adaptations. To gain insights into the occurrence, distribution and potentially functional roles of SSRs in phytoplasmas, we performed computational analysis on the genomes of five completely sequenced phytoplasma strains, ‘Candidatus Phytoplasma asteris’-related strains OYM and AYWB, ‘Candidatus Phytoplasma australiense’-related strains CBWB and SLY and ‘Candidatus Phytoplasma mali’-related strain AP-AT. The overall density of SSRs in phytoplasma genomes was higher than in representative strains of other prokaryotes. While mono- and trinucleotide SSRs were significantly overrepresented in the phytoplasma genomes, dinucleotide SSRs and other higher-order SSRs were underrepresented. The occurrence and distribution of long SSRs in the prophage islands and phytoplasma-unique genetic loci indicated that SSRs played a role in compounding the complexity of sequence mosaics in individual genomes and in increasing allelic diversity among genomes. Findings from computational analyses were further complemented by an examination of SSRs in varied additional phytoplasma strains, with a focus on potential contingency genes. Some SSRs were located in regions that could profoundly alter the regulation of transcription and translation of affected genes and/or the composition of protein products.

Lu X, Murphy RM. Asparagine Repeat Peptides: Aggregation Kinetics and Comparison with Glutamine Repeats. Biochemistry 2015. [PMID: 26204228 DOI: 10.1021/acs.biochem.5b00644] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]

Banerji J. Asparaginase treatment side-effects may be due to genes with homopolymeric Asn codons (Review-Hypothesis). Int J Mol Med 2015;36:607-26. [PMID: 26178806 PMCID: PMC4533780 DOI: 10.3892/ijmm.2015.2285] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/15/2015] [Accepted: 07/15/2015] [Indexed: 12/14/2022] Open

Arthur LL, Pavlovic-Djuranovic S, Koutmou KS, Green R, Szczesny P, Djuranovic S. Translational control by lysine-encoding A-rich sequences. Sci Adv 2015;1:e1500154. [PMID: 26322332 PMCID: PMC4552401 DOI: 10.1126/sciadv.1500154] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Indexed: 05/09/2023]

Bina M, Wyss P. Impact of the MLL1 morphemes on codon utilization and preservation in CpG Islands. Biopolymers 2015;103:480-90. [PMID: 25991579 DOI: 10.1002/bip.22681] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/26/2014] [Revised: 05/04/2015] [Accepted: 05/13/2015] [Indexed: 11/07/2022]

Pandya S, Struck TJ, Mannakee BK, Paniscus M, Gutenkunst RN. Testing whether metazoan tyrosine loss was driven by selection against promiscuous phosphorylation. Mol Biol Evol 2015;32:144-52. [PMID: 25312910 PMCID: PMC4271526 DOI: 10.1093/molbev/msu284] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022] Open

Kumari B, Kumar R, Kumar M. Low complexity and disordered regions of proteins have different structural and amino acid preferences. Mol Biosyst 2014;11:585-94. [PMID: 25468592 DOI: 10.1039/c4mb00425f] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 12/31/2022]

Mandal A, Mandal S, Park MH. Genome-wide analyses and functional classification of proline repeat-rich proteins: potential role of eIF5A in eukaryotic evolution. PLoS One 2014;9:e111800. [PMID: 25364902 PMCID: PMC4218817 DOI: 10.1371/journal.pone.0111800] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Received: 07/25/2014] [Accepted: 10/06/2014] [Indexed: 12/16/2022] Open

Imprasittichail W, Roytrakul S, Krungkrai SR, Krungkrail J. A unique insertion of low complexity amino acid sequence underlies protein-protein interaction in human malaria parasite orotate phosphoribosyltransferase and orotidine 5'-monophosphate decarboxylase. Asian Pac J Trop Med 2014;7:184-92. [PMID: 24507637 DOI: 10.1016/s1995-7645(14)60018-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/10/2013] [Revised: 09/15/2013] [Accepted: 01/15/2014] [Indexed: 11/17/2022] Open

Lu X, Murphy RM. Synthesis and disaggregation of asparagine repeat-containing peptides. J Pept Sci 2014;20:860-7. [PMID: 25044797 DOI: 10.1002/psc.2677] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/18/2014] [Revised: 06/12/2014] [Accepted: 06/26/2014] [Indexed: 01/21/2023]

Abstract

Of all amino acid repeats in eukaryotes, polyglutamine (polyQ) is the most frequent, followed by polyasparagine (polyN). Glutamine repeats are expanded in proteins associated with several neurodegenerative disorders. The expanded polyQ domain is known to induce aggregation, and it is hypothesized that aggregation is directly causative of pathology. Despite the widespread presence of asparagine repeats in invertebrate eukaryotes, polyN is curiously quite rare in vertebrates. Several investigators have characterized the conformational and aggregation properties of polyQ-containing peptides and proteins, and to a lesser extent, peptides containing mixed glutamine and asparagine, but to our knowledge, there is no detailed characterization of polyN-containing peptides. Such a comparison could elucidate reasons for the paucity of asparagine repeats in humans. In this study, we synthesized a peptide containing a 24-asparagine repeat (N24). For aggregation studies, it is critical to start with monomeric unaggregated peptide. A protocol involving dissolution in mixed trifluoroacetic acid and hexafluoroisopropanol (TFA + HFIP) solvents is widely used for disaggregation of polyQ peptides. We used the same protocol for N24 but discovered that there was both oxidative damage and insufficient disaggregation. Oxidation of tryptophan, used as a flanking residue, was common. Moreover, we found evidence of Förster resonance energy transfer between Trp and its oxidation product N-formylkynurenine, even in chemical denaturants. This suggested that N24 was insufficiently disaggregated, a conclusion that was further supported by gel electrophoresis analysis. Oxidation was reduced, but not eliminated, by addition of methionine to the buffer. Formic acid proved to be a better disaggregator and caused no oxidative damage. 
The glutamine repeat peptide Q24 also underwent some oxidation after extended incubation in TFA + HFIP, but there was no evidence of Förster resonance energy transfer, and samples appeared monomeric by gel electrophoresis. This result indicates that polyN-containing peptides self-associate more strongly than polyQ-containing peptides. Circular dichroism spectra reveal a greater propensity for β-turn formation in polyN than polyQ, providing an explanation for the increased stability of polyN aggregates relative to polyQ.

Perticaroli S, Nickels JD, Ehlers G, Mamontov E, Sokolov AP. Dynamics and rigidity in an intrinsically disordered protein, β-casein. J Phys Chem B 2014;118:7317-26. [PMID: 24918971 DOI: 10.1021/jp503788r] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 12/23/2022]

Zorgani MA, Patron K, Desvaux M. New insight in the structural features of haloadaptation in α-amylases from halophilic Archaea following homology modeling strategy: folded and stable conformation maintained through low hydrophobicity and highly negative charged surface. J Comput Aided Mol Des 2014;28:721-34. [DOI: 10.1007/s10822-014-9754-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 02/13/2014] [Accepted: 05/16/2014] [Indexed: 11/24/2022]

Wolfe KJ, Ren HY, Trepte P, Cyr DM. Polyglutamine-rich suppressors of huntingtin toxicity act upstream of Hsp70 and Sti1 in spatial quality control of amyloid-like proteins. PLoS One 2014;9:e95914. [PMID: 24828240 PMCID: PMC4020751 DOI: 10.1371/journal.pone.0095914] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 12/20/2013] [Accepted: 04/01/2014] [Indexed: 11/30/2022] Open

Abstract

Protein conformational maladies such as Huntington Disease are characterized by accumulation of intracellular and extracellular protein inclusions containing amyloid-like proteins. There is an inverse correlation between proteotoxicity and aggregation, so facilitated protein aggregation appears cytoprotective. To define mechanisms for protective protein aggregation, a screen for suppressors of nuclear huntingtin (Htt103Q) toxicity was conducted. Nuclear Htt103Q is highly toxic and less aggregation prone than its cytosolic form, so we identified suppressors of cytotoxicity caused by Htt103Q tagged with a nuclear localization signal (NLS). High copy suppressors of Htt103Q-NLS toxicity include the polyQ-domain containing proteins Nab3, Pop2, and Cbk1, and each suppresses Htt toxicity via a different mechanism. Htt103Q-NLS appears to inactivate the essential functions of Nab3 in RNA processing in the nucleus. Function of Pop2 and Cbk1 is not impaired by nuclear Htt103Q, as their respective polyQ-rich domains are sufficient to suppress Htt103Q toxicity. Pop2 is a subunit of an RNA processing complex and is localized throughout the cytoplasm. Expression of just the Pop2 polyQ domain and an adjacent proline-rich stretch is sufficient to suppress Htt103Q toxicity. The proline-rich domain in Pop2 resembles an aggresome targeting signal, so Pop2 may act in trans to positively impact spatial quality control of Htt103Q. Cbk1 accumulates in discrete perinuclear foci and overexpression of the Cbk1 polyQ domain concentrates diffuse Htt103Q into these foci, which correlates with suppression of Htt toxicity. Protective action of Pop2 and Cbk1 in spatial quality control is dependent upon the Hsp70 co-chaperone Sti1, which packages amyloid-like proteins into benign foci. Protein:protein interactions between Htt103Q and its intracellular neighbors lead to toxic and protective outcomes. 
A subset of polyQ-rich proteins buffer amyloid toxicity by funneling toxic aggregation intermediates to the Hsp70/Sti1 system for spatial organization into benign species.

Ahmed Z, Gurusaran M, Narayana P, Kumar KSD, Mohanapriya J, Vaishnavi MK, Sekar K. PPS: A computing engine to find Palindromes in all Protein sequences. Bioinformation 2014;10:48-51. [PMID: 24516327 PMCID: PMC3916820 DOI: 10.6026/97320630010048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2014] [Revised: 01/23/2014] [Accepted: 01/24/2014] [Indexed: 11/23/2022] Open

Persi E, Horn D. Systematic analysis of compositional order of proteins reveals new characteristics of biological functions and a universal correlate of macroevolution. PLoS Comput Biol 2013;9:e1003346. [PMID: 24278003 PMCID: PMC3836704 DOI: 10.1371/journal.pcbi.1003346] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/21/2013] [Accepted: 10/03/2013] [Indexed: 01/01/2023] Open

Abstract

We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces.

Variations in compositionally ordered (CO) sections of proteins, such as amino acid runs, tandem repeats and low complexity regions, are often considered as a third type of genomic variation along with SNP and CNV. At the microevolutionary scale, they are involved in the rapid evolution of numerous biological functions and the development of novel phenotypic complex traits, including disease in human, in particular neurodegeneration and cancer. At the macroevolutionary scale, the best discriminating proteomic factor between super-kingdoms is the prevalence of CO proteins in eukaryotes. The analysis of CO structures has so far been quite eclectic. Here we introduce a novel unifying methodology, accounting for all types of low-complexity regions and repetitive phenomena, including the existence of large periodic structures in protein sequences. We define new CO measures providing insights into the correlation of CO with protein function and with evolution. In particular, a large-scale analysis of 94 proteomes shows that the CO vocabulary of frequently appearing amino acid triplets serves as a measure of taxonomic ordering separating major clades from each other. It unravels a missing genomic correlate of macroevolution and serves as a novel phylogenetic tool. This suggests that major CO generation occurs during the creation of a completely new species, i.e. during macroevolutionary events.

Filisetti D, Théobald-Dietrich A, Mahmoudi N, Rudinger-Thirion J, Candolfi E, Frugier M. Aminoacylation of Plasmodium falciparum tRNA(Asn) and insights in the synthesis of asparagine repeats. J Biol Chem 2013;288:36361-71. [PMID: 24196969 DOI: 10.1074/jbc.m113.522896] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022] Open

Asparagine repeats in Plasmodium falciparum proteins: good for nothing? PLoS Pathog 2013;9:e1003488. [PMID: 23990777 PMCID: PMC3749963 DOI: 10.1371/journal.ppat.1003488] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 01/11/2023] Open

Willadsen K, Cao MD, Wiles J, Balasubramanian S, Bodén M. Repeat-encoded poly-Q tracts show statistical commonalities across species. BMC Genomics 2013;14:76. [PMID: 23374135 PMCID: PMC3617014 DOI: 10.1186/1471-2164-14-76] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/07/2012] [Accepted: 01/18/2013] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Among repetitive genomic sequence, the class of tri-nucleotide repeats has received much attention due to their association with human diseases. Tri-nucleotide repeat diseases are caused by excessive sequence length variability; diseases such as Huntington's disease and Fragile X syndrome are tied to an increase in the number of repeat units in a tract. Motivated by the recent discovery of a tri-nucleotide repeat associated genetic defect in Arabidopsis thaliana, this study takes a cross-species approach to investigating these repeat tracts, with the goal of using commonalities between species to identify potential disease-related properties.

RESULTS

We find that statistical enrichment in regulatory function associations for coding region repeats - previously observed in human - is consistent across multiple organisms. By distinguishing between homo-amino acid tracts that are encoded by tri-nucleotide repeats, and those encoded by varying codons, we show that amino acid repeats - not tri-nucleotide repeats - fully explain these regulatory associations. Using this same separation between repeat- and non-repeat-encoded homo-amino acid tracts, we show that poly-glutamine tracts are disproportionately encoded by tri-nucleotide repeats, and those tracts that are encoded by tri-nucleotide repeats are also significantly longer; these results are consistent across multiple species.

CONCLUSION

These findings establish similarities in tri-nucleotide repeats across species at the level of protein functionality and protein sequence. The tendency of tri-nucleotide repeats to encode longer poly-glutamine tracts indicates a link with the poly-glutamine repeat diseases. The cross-species nature of this tendency suggests that unknown repeat diseases are yet to be uncovered in other species. Future discoveries of new non-human repeat associated defects may provide the breadth of information needed to unravel the mechanisms that underpin this class of human disease.

Tompa P. Hydrogel formation by multivalent IDPs: A reincarnation of the microtrabecular lattice? Intrinsically Disord Proteins 2013;1:e24068. [PMID: 28516006 PMCID: PMC5424804 DOI: 10.4161/idp.24068] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 01/08/2013] [Revised: 01/31/2013] [Accepted: 02/21/2013] [Indexed: 02/03/2023]

Exploring charged biased regions in the human proteome. Gene 2012;515:277-80. [PMID: 23266628 DOI: 10.1016/j.gene.2012.11.077] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 08/31/2012] [Revised: 11/12/2012] [Accepted: 11/28/2012] [Indexed: 11/23/2022]

Khan MKA, Bowler BE. Conformational properties of polyglutamine sequences in guanidine hydrochloride solutions. Biophys J 2012. [PMID: 23199927 DOI: 10.1016/j.bpj.2012.09.041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 02/06/2023] Open

Background-dependent effects of polyglutamine variation in the Arabidopsis thaliana gene ELF3. Proc Natl Acad Sci U S A 2012;109:19363-7. [PMID: 23129635 DOI: 10.1073/pnas.1211021109] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022] Open

Radó-Trilla N, Albà M. Dissecting the role of low-complexity regions in the evolution of vertebrate proteins. BMC Evol Biol 2012;12:155. [PMID: 22920595 PMCID: PMC3523016 DOI: 10.1186/1471-2148-12-155] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 04/17/2012] [Accepted: 07/30/2012] [Indexed: 11/10/2022] Open

Jorda J, Baudrand T, Kajava AV. PRDB: Protein Repeat DataBase. Proteomics 2012;12:1333-6. [DOI: 10.1002/pmic.201100534] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]

Lobanov MY, Bogatyreva NS, Galzitskaya OV. Occurrence of six-amino-acid motifs in three eukaryotic proteomes. Mol Biol 2012. [DOI: 10.1134/s0026893312010128] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]

Ramazzotti M, Monsellier E, Kamoun C, Degl'Innocenti D, Melki R. Polyglutamine repeats are associated to specific sequence biases that are conserved among eukaryotes. PLoS One 2012;7:e30824. [PMID: 22312432 PMCID: PMC3270027 DOI: 10.1371/journal.pone.0030824] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 10/06/2011] [Accepted: 12/23/2011] [Indexed: 12/20/2022] Open

Schaefer MH, Wanker EE, Andrade-Navarro MA. Evolution and function of CAG/polyglutamine repeats in protein-protein interaction networks. Nucleic Acids Res 2012;40:4273-87. [PMID: 22287626 PMCID: PMC3378862 DOI: 10.1093/nar/gks011] [Citation(s) in RCA: 151] [Impact Index Per Article: 12.6] [Indexed: 12/21/2022] Open

Lobanov MY, Galzitskaya OV. Occurrence of disordered patterns and homorepeats in eukaryotic and bacterial proteomes. Mol Biosyst 2012;8:327-37. [DOI: 10.1039/c1mb05318c] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 01/22/2023]

Faux N. Single amino acid and trinucleotide repeats: function and evolution. Adv Exp Med Biol 2012;769:26-40. [PMID: 23560303 DOI: 10.1007/978-1-4614-5434-2_3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]

Zhou Y, Liu J, Han L, Li ZG, Zhang Z. Comprehensive analysis of tandem amino acid repeats from ten angiosperm genomes. BMC Genomics 2011;12:632. [PMID: 22195734 PMCID: PMC3283746 DOI: 10.1186/1471-2164-12-632] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 08/02/2011] [Accepted: 12/23/2011] [Indexed: 11/30/2022] Open

Laurie S, Toll-Riera M, Radó-Trilla N, Albà MM. Sequence shortening in the rodent ancestor. Genome Res 2011;22:478-85. [PMID: 22128134 DOI: 10.1101/gr.121897.111] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Luo H, Lin K, David A, Nijveen H, Leunissen JAM. ProRepeat: an integrated repository for studying amino acid tandem repeats in proteins. Nucleic Acids Res 2011;40:D394-9. [PMID: 22102581 PMCID: PMC3245022 DOI: 10.1093/nar/gkr1019] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open

Toll-Riera M, Radó-Trilla N, Martys F, Albà MM. Role of low-complexity sequences in the formation of novel protein coding sequences. Mol Biol Evol 2011;29:883-6. [PMID: 22045997 DOI: 10.1093/molbev/msr263] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open

Behura SK, Haugen M, Flannery E, Sarro J, Tessier CR, Severson DW, Duman-Scheel M. Comparative genomic analysis of Drosophila melanogaster and vector mosquito developmental genes. PLoS One 2011;6:e21504. [PMID: 21754989 PMCID: PMC3130749 DOI: 10.1371/journal.pone.0021504] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2011] [Accepted: 05/30/2011] [Indexed: 11/18/2022] Open

Łabaj PP, Sykacek P, Kreil DP. An analysis of single amino acid repeats as use case for application specific background models. BMC Bioinformatics 2011;12:173. [PMID: 21595908 PMCID: PMC3124433 DOI: 10.1186/1471-2105-12-173] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2010] [Accepted: 05/19/2011] [Indexed: 11/30/2022] Open

Abstract

Background

Sequence analysis aims to identify biologically relevant signals against a backdrop of functionally meaningless variation. Increasingly, it is recognized that the quality of the background model directly affects the performance of analyses. State-of-the-art approaches rely on classical sequence models that are adapted to the studied dataset. Although performing well in the analysis of globular protein domains, these models break down in regions of stronger compositional bias or low complexity. While these regions are typically filtered, there is increasing anecdotal evidence of functional roles. This motivates an exploration of more complex sequence models and application-specific approaches for the investigation of biased regions.

Results

Traditional Markov-chains and application-specific regression models are compared using the example of predicting runs of single amino acids, a particularly simple class of biased regions. Cross-fold validation experiments reveal that the alternative regression models capture the multi-variate trends well, despite their low dimensionality and in contrast even to higher-order Markov-predictors. We show how the significance of unusual observations can be computed for such empirical models. The power of a dedicated model in the detection of biologically interesting signals is then demonstrated in an analysis identifying the unexpected enrichment of contiguous leucine-repeats in signal-peptides. Considering different reference sets, we show how the question examined actually defines what constitutes the 'background'. Results can thus be highly sensitive to the choice of appropriate model training sets. Conversely, the choice of reference data determines the questions that can be investigated in an analysis.

Conclusions

Using a specific case of studying biased regions as an example, we have demonstrated that the construction of application-specific background models is both necessary and feasible in a challenging sequence analysis situation.

Jorda J, Kajava AV. Protein homorepeats sequences, structures, evolution, and functions. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2011;79:59-88. [PMID: 20621281 DOI: 10.1016/s1876-1623(10)79002-7] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Haerty W, Golding GB. Low-complexity sequences and single amino acid repeats: not just "junk" peptide sequences. Genome 2011;53:753-62. [PMID: 20962881 DOI: 10.1139/g10-063] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

Role of Everlasting Triplet Expansions in Protein Evolution. J Mol Evol 2010;72:232-9. [DOI: 10.1007/s00239-010-9425-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2010] [Accepted: 12/01/2010] [Indexed: 02/05/2023]

Tian X, Strassmann JE, Queller DC. Genome nucleotide composition shapes variation in simple sequence repeats. Mol Biol Evol 2010;28:899-909. [PMID: 20943830 DOI: 10.1093/molbev/msq266] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Kaundal R, Saini R, Zhao PX. Combining machine learning and homology-based approaches to accurately predict subcellular localization in Arabidopsis. PLANT PHYSIOLOGY 2010;154:36-54. [PMID: 20647376 PMCID: PMC2938157 DOI: 10.1104/pp.110.156851] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/25/2010] [Accepted: 07/13/2010] [Indexed: 05/20/2023]

Abstract

A complete map of the Arabidopsis (Arabidopsis thaliana) proteome is clearly a major goal for the plant research community in terms of determining the function and regulation of each encoded protein. Developing genome-wide prediction tools such as for localizing gene products at the subcellular level will substantially advance Arabidopsis gene annotation. To this end, we performed a comprehensive study in Arabidopsis and created an integrative support vector machine-based localization predictor called AtSubP (for Arabidopsis subcellular localization predictor) that is based on the combinatorial presence of diverse protein features, such as its amino acid composition, sequence-order effects, terminal information, Position-Specific Scoring Matrix, and similarity search-based Position-Specific Iterated-Basic Local Alignment Search Tool information. When used to predict seven subcellular compartments through a 5-fold cross-validation test, our hybrid-based best classifier achieved an overall sensitivity of 91% with high-confidence precision and Matthews correlation coefficient values of 90.9% and 0.89, respectively. Benchmarking AtSubP on two independent data sets, one from Swiss-Prot and another containing green fluorescent protein- and mass spectrometry-determined proteins, showed a significant improvement in the prediction accuracy of species-specific AtSubP over some widely used "general" tools such as TargetP, LOCtree, PA-SUB, MultiLoc, WoLF PSORT, Plant-PLoc, and our newly created All-Plant method. Cross-comparison of AtSubP on six nontrained eukaryotic organisms (rice [Oryza sativa], soybean [Glycine max], human [Homo sapiens], yeast [Saccharomyces cerevisiae], fruit fly [Drosophila melanogaster], and worm [Caenorhabditis elegans]) revealed inferior predictions. AtSubP significantly outperformed all the prediction tools being currently used for Arabidopsis proteome annotation and, therefore, may serve as a better complement for the plant research community. 
A supplemental Web site that hosts all the training/testing data sets and whole proteome predictions is available at http://bioinfo3.noble.org/AtSubP/.

Francis DM, Page R. Strategies to optimize protein expression in E. coli. CURRENT PROTOCOLS IN PROTEIN SCIENCE 2010;Chapter 5:5.24.1-5.24.29. [PMID: 20814932 PMCID: PMC7162232 DOI: 10.1002/0471140864.ps0524s61] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]

Łabaj PP, Leparc GG, Bardet AF, Kreil G, Kreil DP. Single amino acid repeats in signal peptides. FEBS J 2010;277:3147-57. [DOI: 10.1111/j.1742-4658.2010.07720.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Mojsin M, Kovacevic-Grujicic N, Krstic A, Popovic J, Milivojevic M, Stevanovic M. Comparative analysis of SOX3 protein orthologs: Expansion of homopolymeric amino acid tracts during vertebrate evolution. Biochem Genet 2010;48:612-23. [PMID: 20495863 DOI: 10.1007/s10528-010-9343-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2009] [Accepted: 01/25/2010] [Indexed: 10/19/2022]