1
Moldovean-Cioroianu NS. Reviewing the Structure-Function Paradigm in Polyglutamine Disorders: A Synergistic Perspective on Theoretical and Experimental Approaches. Int J Mol Sci 2024; 25:6789. [PMID: 38928495 PMCID: PMC11204371 DOI: 10.3390/ijms25126789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 06/13/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024] Open
Abstract
Polyglutamine (polyQ) disorders are a group of neurodegenerative diseases characterized by the excessive expansion of CAG (cytosine, adenine, guanine) repeats within host proteins. The quest to unravel the complex disease mechanisms has led researchers to adopt both theoretical and experimental methods, each offering unique insights into the underlying pathogenesis. This review emphasizes the significance of combining multiple approaches in the study of polyQ disorders, focusing on the structure-function correlations and the relevance of polyQ-related protein dynamics in neurodegeneration. By integrating computational/theoretical predictions with experimental observations, one can establish robust structure-function correlations, aiding in the identification of key molecular targets for therapeutic interventions. PolyQ proteins' dynamics, influenced by their length and interactions with other molecular partners, play a pivotal role in the polyQ-related pathogenic cascade. Moreover, conformational dynamics of polyQ proteins can trigger aggregation, leading to toxic assemblies that hinder proper cellular homeostasis. Understanding these intricacies offers new avenues for therapeutic strategies by fine-tuning polyQ kinetics, in order to prevent and control disease progression. Last but not least, this review highlights the importance of integrating multidisciplinary efforts to advance research in this field, bringing us closer to the ultimate goal of finding effective treatments against polyQ disorders.
Affiliation(s)
- Nastasia Sanda Moldovean-Cioroianu
  - Institute of Materials Science, Bioinspired Materials and Biosensor Technologies, Kiel University, Kaiserstraße 2, 24143 Kiel, Germany
  - Faculty of Physics, Babeș-Bolyai University, Kogălniceanu 1, RO-400084 Cluj-Napoca, Romania

2
Teekas L, Sharma S, Vijay N. Terminal regions of a protein are a hotspot for low complexity regions and selection. Open Biol 2024; 14:230439. [PMID: 38862022 DOI: 10.1098/rsob.230439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/13/2024] [Indexed: 06/13/2024] Open
Abstract
Volatile low complexity regions (LCRs) are a novel source of adaptive variation, functional diversification and evolutionary novelty. An interplay of selection and mutation governs the composition and length of low complexity regions. High %GC and mutations provide length variability because of mechanisms like replication slippage. Owing to the complex dynamics between selection and mutation, we need a better understanding of their coexistence. Our findings underscore that positively selected sites (PSS) and low complexity regions prefer the terminal regions of genes, co-occurring in most Tetrapoda clades. We observed that positively selected sites within a gene have position-specific roles. Central-positively selected site genes primarily participate in defence responses, whereas terminal-positively selected site genes exhibit non-specific functions. Low complexity region-containing genes in the Tetrapoda clade exhibit a significantly higher %GC and lower ω (dN/dS: non-synonymous substitution rate/synonymous substitution rate) compared with genes without low complexity regions. This lower ω implies that despite providing rapid functional diversity, low complexity region-containing genes are subjected to intense purifying selection. Furthermore, we observe that low complexity regions consistently display ubiquitous prevalence at lower purity levels, but exhibit a preference for specific positions within a gene as the purity of the low complexity region stretch increases, implying a composition-dependent evolutionary role. Our findings collectively contribute to the understanding of how genetic diversity and adaptation are shaped by the interplay of selection and low complexity regions in the Tetrapoda clade.
Affiliation(s)
- Lokdeep Teekas
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Sandhya Sharma
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India

3
Mier P, Andrade-Navarro MA, Morett E. Homorepeat variability within the human population. NAR Genom Bioinform 2024; 6:lqae053. [PMID: 38774515 PMCID: PMC11106027 DOI: 10.1093/nargab/lqae053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 04/12/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024] Open
Abstract
Genetic variation within populations plays a crucial role in driving evolution. Unlike the average protein sequence, the evolution of homorepeats can be influenced by DNA replication slippage, when DNA polymerases either add or skip repeats of nucleotides. While there are some diseases known to be caused by abnormal changes in the length of amino acid homorepeats, naturally occurring variations in homorepeat length remain relatively unexplored. In our study, we examined the variation in amino acid homorepeat length of human individuals by analyzing 125 748 exomes, as well as 15 708 whole genomes. Our analyses revealed significant variability in homorepeat length across the human population, indicating that these motifs are prone to mutations at higher rates than non-repeat sequences. We focused our study on glutamine homorepeats, also known as polyQ sequences, and found that shorter polyQ sequences tend to exhibit greater length variation, while longer ones primarily undergo deletions. Notably, polyQ sequences that are more conserved across primates tend to show less variation within the human population, indicating stronger selective pressure to maintain their length. Overall, our results demonstrate that there is large natural variation in the length of homorepeats within the human population, with no apparent impact on observable traits.
Affiliation(s)
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Enrique Morett
  - Departamento de Ingeniería Celular y Biocatálisis, Instituto de Biotecnología, Universidad Nacional Autónoma de México (UNAM), Av. Universidad 2001, Cuernavaca, Morelos 62210, Mexico

4
Dickson ZW, Golding GB. Evolution of Transcript Abundance is Influenced by Indels in Protein Low Complexity Regions. J Mol Evol 2024; 92:153-168. [PMID: 38485789 DOI: 10.1007/s00239-024-10158-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 01/24/2024] [Indexed: 04/02/2024]
Abstract
Protein low complexity regions (LCRs) are compositionally biased amino acid sequences, many of which have significant evolutionary impacts on the proteins which contain them. They are mutationally unstable, experiencing higher rates of indels and substitutions than higher complexity regions. LCRs also impact the expression of their proteins, likely through multiple effects along the path from gene transcription, through translation, and eventual protein degradation. It has been observed that proteins which contain LCRs are associated with elevated transcript abundance (TAb), despite having lower protein abundance. We have gathered and integrated human data to investigate the co-evolution of TAb and LCRs through ancestral reconstructions and model inference using an approximate Bayesian calculation based method. We observe that on short evolutionary timescales TAb evolution is significantly impacted by changes in LCR length, with insertions driving TAb down. But in contrast, the observed data is best explained by indel rates in LCRs which are unaffected by shifts in TAb. Our work demonstrates a coupling between LCR and TAb evolution, and the utility of incorporating multiple responses into evolutionary analyses.
Affiliation(s)
- G Brian Golding
  - Department of Biology, McMaster University, Hamilton, ON, Canada

5
Jaramillo Ponce JR, Frugier M. Plasmodium, the Apicomplexa Outlier When It Comes to Protein Synthesis. Biomolecules 2023; 14:46. [PMID: 38254646 PMCID: PMC10813123 DOI: 10.3390/biom14010046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024] Open
Abstract
Plasmodium is an obligate intracellular parasite that has numerous interactions with different hosts during its elaborate life cycle. This is also the case for the other parasites belonging to the same phylum Apicomplexa. In this study, we bioinformatically identified the components of the multi-synthetase complexes (MSCs) of several Apicomplexa parasites and modelled their assembly using AlphaFold2. It appears that none of these MSCs resemble the two MSCs that we have identified and characterized in Plasmodium. Indeed, tRip, the central protein involved in the association of the two Plasmodium MSCs is different from its homologues, suggesting also that the tRip-dependent import of exogenous tRNAs is not conserved in other apicomplexan parasites. Based on this observation, we searched for obvious differences that could explain the singularity of Plasmodium protein synthesis by comparing tRNA genes and amino acid usage in the different genomes. We noted a contradiction between the large number of asparagine residues used in Plasmodium proteomes and the single gene encoding the tRNA that inserts them into proteins. This observation remains true for all the Plasmodia strains studied, even those that do not contain long asparagine homorepeats.
Affiliation(s)
- Magali Frugier
  - Université de Strasbourg, CNRS, Architecture et Réactivité de l’ARN, UPR 9002, F-67084 Strasbourg, France

6
Elena-Real CA, Mier P, Sibille N, Andrade-Navarro MA, Bernadó P. Structure-function relationships in protein homorepeats. Curr Opin Struct Biol 2023; 83:102726. [PMID: 37924569 DOI: 10.1016/j.sbi.2023.102726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 11/06/2023]
Abstract
Homorepeats (or polyX), protein segments containing repetitions of the same amino acid, are abundant in proteomes from all kingdoms of life and are involved in crucial biological functions as well as several neurodegenerative and developmental diseases. Mainly inserted in disordered segments of proteins, the structure/function relationships of homorepeats remain largely unexplored. In this review, we summarize present knowledge for the most abundant homorepeats, highlighting the role of the inherent structure and the conformational influence exerted by their flanking regions. Recent experimental and computational methods enable residue-specific investigations of these regions and promise novel structural and dynamic information for this elusive group of proteins. This information should increase our knowledge about the structural bases of phenomena such as liquid-liquid phase separation and trinucleotide repeat disorders.
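As an illustration of the definition used in this abstract (a homorepeat as a run of the same amino acid), the following is a minimal Python sketch of how such runs could be located in a one-letter protein sequence. It is not the method of any of the cited studies: the function name `find_homorepeats`, the `min_len` threshold of 4, and the toy sequence are all assumptions made for the example, and published surveys typically use their own length and purity criteria.

```python
import re
from typing import List, Tuple


def find_homorepeats(seq: str, min_len: int = 4) -> List[Tuple[str, int, int]]:
    """Return (residue, start, length) for each run of >= min_len identical residues.

    `min_len` is an illustrative threshold, not a value taken from the cited papers.
    Start positions are 0-based.
    """
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    # ([A-Z]) captures one residue; \1{k,} requires k or more further copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequence (not a real protein): a polyQ run and a polyA run.
toy = "MATLEK" + "Q" * 8 + "PP" + "A" * 5 + "S"
print(find_homorepeats(toy))
```

Only perfect runs are reported here; tolerating interruptions within a repeat would need a different, purity-aware definition.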
Affiliation(s)
- Carlos A Elena-Real
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France. https://twitter.com/carloselenareal
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz. Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Nathalie Sibille
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz. Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Pau Bernadó
  - Centre de Biologie Structurale (CBS), Université de Montpellier, INSERM, CNRS. 29 rue de Navacelles, 34090 Montpellier, France

7
Lynch VJ, Wagner GP. Cooption of polyalanine tract into a repressor domain in the mammalian transcription factor HoxA11. J Exp Zool B Mol Dev Evol 2023; 340:486-495. [PMID: 34125492 DOI: 10.1002/jez.b.23063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2020] [Revised: 04/21/2021] [Accepted: 04/26/2021] [Indexed: 06/12/2023]
Abstract
An enduring problem in biology is explaining how novel functions of genes originated and how those functions diverge between species. Despite detailed studies on the functional evolution of a few proteins, the molecular mechanisms by which protein functions have evolved are almost entirely unknown. Here, we show that a polyalanine tract in the homeodomain transcription factor HoxA11 arose in the stem-lineage of mammals and functions as an autonomous repressor module by physically interacting with the PAH domains of SIN3 proteins. These results suggest that long polyalanine tracts, which are common in transcription factors and often associated with disease, may tend to function as repressor domains and can contribute to the diversification of transcription factor functions despite the deleterious consequences of polyalanine tract expansion.
Affiliation(s)
- Vincent J Lynch
  - Department of Biological Sciences, University at Buffalo, Buffalo, New York, USA
- Gunter P Wagner
  - Department of Ecology and Evolutionary Biology and Yale Systems Biology Institute, Yale University, New Haven, Connecticut, USA

8
Neville N, Lehotsky K, Yang Z, Klupt KA, Denoncourt A, Downey M, Jia Z. Modification of histidine repeat proteins by inorganic polyphosphate. Cell Rep 2023; 42:113082. [PMID: 37660293 DOI: 10.1016/j.celrep.2023.113082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 06/29/2023] [Accepted: 08/16/2023] [Indexed: 09/05/2023] Open
Abstract
Inorganic polyphosphate (polyP) is a linear polymer of orthophosphate that is present in nearly all organisms studied to date. A remarkable function of polyP involves its attachment to lysine residues via non-enzymatic post-translational modification (PTM), which is presumed to be covalent. Here, we show that proteins containing tracts of consecutive histidine residues exhibit a similar modification by polyP, which confers an electrophoretic mobility shift on NuPAGE gels. Our screen uncovers 30 human and yeast histidine repeat proteins that undergo histidine polyphosphate modification (HPM). This polyP modification is histidine dependent and non-covalent in nature, although remarkably it withstands harsh denaturing conditions, a hallmark of covalent PTMs. Importantly, we show that HPM disrupts phase separation and the phosphorylation activity of the human protein kinase DYRK1A, and inhibits the activity of the transcription factor MafB, highlighting HPM as a potential protein regulatory mechanism.
Affiliation(s)
- Nolan Neville
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L 3N6, Canada
- Kirsten Lehotsky
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L 3N6, Canada
- Zhiyun Yang
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L 3N6, Canada
- Kody A Klupt
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L 3N6, Canada
- Alix Denoncourt
  - Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada; Ottawa Institute of Systems Biology, Ottawa, ON K1H 8M5, Canada
- Michael Downey
  - Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada; Ottawa Institute of Systems Biology, Ottawa, ON K1H 8M5, Canada
- Zongchao Jia
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON K7L 3N6, Canada

9
Odelin G, Faucherre A, Marchese D, Pinard A, Jaouadi H, Le Scouarnec S, Chiarelli R, Achouri Y, Faure E, Herbane M, Théron A, Avierinos JF, Jopling C, Collod-Béroud G, Rezsohazy R, Zaffran S. Variations in the poly-histidine repeat motif of HOXA1 contribute to bicuspid aortic valve in mouse and zebrafish. Nat Commun 2023; 14:1543. [PMID: 36941270 PMCID: PMC10027860 DOI: 10.1038/s41467-023-37110-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/03/2022] [Accepted: 03/02/2023] [Indexed: 03/23/2023] Open
Abstract
Bicuspid aortic valve (BAV), the most common cardiovascular malformation, occurs in 0.5-1.2% of the population. Although highly heritable, few causal mutations have been identified in BAV patients. Here, we report the targeted sequencing of HOXA1 in a cohort of BAV patients and the identification of rare indel variants in the homopolymeric histidine tract of HOXA1. In vitro analysis shows that disruption of this motif leads to a significant reduction in protein half-life and defective transcriptional activity of HOXA1. In zebrafish, targeting the hoxa1a ortholog results in aortic valve defects. In vivo assays indicate that these variants behave as dominant negatives, leading to abnormal valve development. In mice, deletion of Hoxa1 leads to BAV with a very small, rudimentary non-coronary leaflet. We also show that 17% of homozygous Hoxa1-1His knock-in mice present a similar phenotype. Genetic lineage tracing in Hoxa1-/- mutant mice reveals an abnormal reduction of neural crest-derived cells in the valve leaflet, which is caused by a failure of early migration of these cells.
Affiliation(s)
- Gaëlle Odelin
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
- Adèle Faucherre
  - Institute of Functional Genomics, University of Montpellier, CNRS, INSERM, Montpellier, France
- Damien Marchese
  - Animal Molecular and Cellular Biology group, Louvain Institute of Biomolecular Science and Technology, Université catholique de Louvain, 5 (L7.07.10) place Croix du Sud, 1348, Louvain-la-Neuve, Belgium
- Amélie Pinard
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
- Hager Jaouadi
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
- Raphaël Chiarelli
  - Animal Molecular and Cellular Biology group, Louvain Institute of Biomolecular Science and Technology, Université catholique de Louvain, 5 (L7.07.10) place Croix du Sud, 1348, Louvain-la-Neuve, Belgium
- Younes Achouri
  - Transgenesis Platform, de Duve Institute, Université Catholique de Louvain, 1200, Brussels, Belgium
- Emilie Faure
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
- Marine Herbane
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
- Alexis Théron
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
  - Service de Chirurgie Cardiaque, AP-HM, Hôpital de la Timone, 13005, Marseille, France
- Jean-François Avierinos
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France
  - Service de Cardiologie, AP-HM, Hôpital de la Timone, 13005, Marseille, France
- Chris Jopling
  - Institute of Functional Genomics, University of Montpellier, CNRS, INSERM, Montpellier, France
- René Rezsohazy
  - Animal Molecular and Cellular Biology group, Louvain Institute of Biomolecular Science and Technology, Université catholique de Louvain, 5 (L7.07.10) place Croix du Sud, 1348, Louvain-la-Neuve, Belgium
- Stéphane Zaffran
  - Aix Marseille Univ, INSERM, MMG, U1251, 13005, Marseille, France

10
Park SH, Xu Y, Park YS, Seo JT, Gye MC. Glycogen Synthase Kinase-3 Isoform Variants and Their Inhibitory Phosphorylation in Human Testes and Spermatozoa. World J Mens Health 2023; 41:215-226. [PMID: 36047078 PMCID: PMC9826905 DOI: 10.5534/wjmh.220108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 06/30/2022] [Accepted: 07/13/2022] [Indexed: 01/21/2023] Open
Abstract
PURPOSE: To clarify (phospho-) glycogen synthase kinase-3 (GSK3) isoform variants in the germline and soma of human testes and spermatozoa.
MATERIALS AND METHODS: GSK3 isoform variants in normospermatogenic and Sertoli cell-only (SCO) testicular biopsies and spermatozoa were examined.
RESULTS: In normospermatogenic testes, GSK3α and GSK3β variants 1 and 2, which differ in their low complexity region (LCR), were expressed, and their levels were decreased in SCO testes. GSK3β variant 3 was only expressed in SCO testes. GSK3β as well as GSK3α, the dominant isoforms in testes, were decreased in SCO testes. In normospermatogenic testes, GSK3β was found in spermatogonia and markedly decreased in meiotic germ cells, in which GSK3α was dominant. p-GSK3α/β were marginal in spermatogonia and early spermatocytes. In SCO testes, GSK3α/β immunoreactivity in seminiferous epithelia was weaker than that of normospermatogenic testes, whereas p-GSK3α/β(Ser) immunoreactivity was visibly increased in Sertoli cells. GSK3α was dominant in ejaculated spermatozoa, in which GSK3α and p-GSK3α(Ser) were found in the head, midpiece, and tail. In acrosome-reacted spermatozoa, GSK3α was found in the equatorial region of the head, midpiece, and tail, and p-GSK3α(Ser) was only found in the midpiece. During sperm capacitation, p-GSK3α(Ser) was significantly increased together with phosphotyrosine proteins and motility.
CONCLUSIONS: In human male germ cells, GSK3 isoforms that differ in LCRs switch from GSK3β to GSK3α during meiotic entry, suggesting isoform-specific roles of GSK3α and GSK3β in meiosis and in stemness or proliferation of spermatogonia, respectively. In dormant Sertoli cells of SCO testes, kinase activity of GSK3 might be downregulated via inhibitory phosphorylation. In spermatozoa, inhibitory phosphorylation of GSK3α might be coupled with activation of motility during capacitation.
Affiliation(s)
- Seung Hyun Park
  - Department of Life Science and Institute for Natural Sciences, Hanyang University, Seoul, Korea
- Yang Xu
  - Department of Life Science and Institute for Natural Sciences, Hanyang University, Seoul, Korea
- Yong-Seog Park
  - Laboratory of Reproductive Medicine, Cheil General Hospital & Women's Healthcare Center, Dankook University College of Medicine, Seoul, Korea
- Ju Tae Seo
  - Department of Urology, Cheil General Hospital and Women's Healthcare Center, Dankook University College of Medicine, Seoul, Korea
- Myung Chan Gye
  - Department of Life Science and Institute for Natural Sciences, Hanyang University, Seoul, Korea

11
Ito Y, Chadani Y, Niwa T, Yamakawa A, Machida K, Imataka H, Taguchi H. Nascent peptide-induced translation discontinuation in eukaryotes impacts biased amino acid usage in proteomes. Nat Commun 2022; 13:7451. [PMID: 36460666 PMCID: PMC9718836 DOI: 10.1038/s41467-022-35156-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/04/2022] [Accepted: 11/18/2022] [Indexed: 12/04/2022] Open
Abstract
Robust translation elongation of any given amino acid sequence is required to shape proteomes. Nevertheless, nascent peptides occasionally destabilize ribosomes, since consecutive negatively charged residues in bacterial nascent chains can stochastically induce discontinuation of translation, in a phenomenon termed intrinsic ribosome destabilization (IRD). Here, using budding yeast and a human factor-based reconstituted translation system, we show that IRD also occurs in eukaryotic translation. Nascent chains enriched in aspartic acid (D) or glutamic acid (E) in their N-terminal regions alter canonical ribosome dynamics, stochastically aborting translation. Although eukaryotic ribosomes are more robust to ensure uninterrupted translation, we find many endogenous D/E-rich peptidyl-tRNAs in the N-terminal regions in cells lacking a peptidyl-tRNA hydrolase, indicating that the translation of the N-terminal D/E-rich sequences poses an inherent risk of failure. Indeed, a bioinformatics analysis reveals that the N-terminal regions of ORFs lack D/E enrichment, implying that the translation defect partly restricts the overall amino acid usage in proteomes.
Affiliation(s)
- Yosuke Ito
  - School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
- Yuhei Chadani
  - Cell Biology Center, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
- Tatsuya Niwa
  - School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
  - Cell Biology Center, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
- Ayako Yamakawa
  - School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
- Kodai Machida
  - Graduate School of Engineering, University of Hyogo, Himeji, Hyogo 671-2280 Japan
- Hiroaki Imataka
  - Graduate School of Engineering, University of Hyogo, Himeji, Hyogo 671-2280 Japan
- Hideki Taguchi
  - School of Life Science and Technology, Tokyo Institute of Technology, Yokohama, 226-8503 Japan
  - Cell Biology Center, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, 226-8503 Japan

12
Shukla S, Lazarchuk P, Pavlova MN, Sidorova JM. Genome-wide survey of D/E repeats in human proteins uncovers their instability and aids in identifying their role in the chromatin regulator ATAD2. iScience 2022; 25:105464. [PMCID: PMC9672403 DOI: 10.1016/j.isci.2022.105464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 08/03/2022] [Accepted: 10/26/2022] [Indexed: 11/15/2022] Open
Abstract
D/E repeats are stretches of aspartic and/or glutamic acid residues found in over 150 human proteins. We examined genomic stability of D/E repeats and functional characteristics of D/E repeat-containing proteins vis-à-vis the proteins with poly-Q or poly-A repeats, which are known to undergo pathologic expansions. Mining of tumor sequencing data revealed that D/E repeat-coding regions are similar to those coding poly-Qs and poly-As in increased incidence of trinucleotide insertions/deletions but differ in types and incidence of substitutions. D/E repeat-containing proteins preferentially function in chromatin metabolism and are the more likely to be nuclear and interact with core histones, the longer their repeats are. One of the longest D/E repeats of unknown function is in ATAD2, a bromodomain family ATPase frequently overexpressed in tumors. We demonstrate that D/E repeat deletion in ATAD2 suppresses its binding to nascent and mature chromatin and to the constitutive pericentromeric heterochromatin, where ATAD2 represses satellite transcription.
- Many human proteins contain runs of aspartic/glutamic acid residues (D/E repeats)
- D/E repeats show increased incidence of in-frame insertions/deletions in tumors
- Nuclear and histone-interacting proteins often have long D/E repeats
- D/E repeat of the oncogene ATAD2 controls its binding to pericentric chromatin
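The definition given in this abstract (a D/E repeat as a stretch of aspartic and/or glutamic acid residues) can be illustrated with a short Python sketch. This is an assumption-laden example rather than the survey procedure used by the authors: the function name `find_de_repeats`, the `min_len` cutoff of 5, and the toy sequence are hypothetical, and the abstract does not state which length threshold the study applied.

```python
import re
from typing import List, Tuple


def find_de_repeats(seq: str, min_len: int = 5) -> List[Tuple[int, str]]:
    """Return (start, run) for each stretch of >= min_len consecutive D/E residues.

    `min_len` is an illustrative cutoff, not one taken from the cited study.
    Start positions are 0-based.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # [DE]{n,} matches n or more residues, each being aspartate (D) or glutamate (E).
    pattern = re.compile(r"[DE]{%d,}" % min_len)
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence (not a real protein): one long mixed D/E run, one short one.
toy = "MK" + "DEEDDEED" + "LK" + "DE" + "G"
print(find_de_repeats(toy))
```

Unlike a homorepeat search, the character class lets D and E mix freely within one run, which matches the "aspartic and/or glutamic acid" wording of the abstract.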
Affiliation(s)
- Shalabh Shukla
  - Department of Laboratory Medicine and Pathology, University of Washington, 1959 NE Pacific St., Box 357705, Seattle, WA 98195, USA
- Pavlo Lazarchuk
  - Department of Laboratory Medicine and Pathology, University of Washington, 1959 NE Pacific St., Box 357705, Seattle, WA 98195, USA
- Maria N. Pavlova
  - Department of Laboratory Medicine and Pathology, University of Washington, 1959 NE Pacific St., Box 357705, Seattle, WA 98195, USA
- Julia M. Sidorova
  - Department of Laboratory Medicine and Pathology, University of Washington, 1959 NE Pacific St., Box 357705, Seattle, WA 98195, USA
  - Corresponding author

13
Teekas L, Sharma S, Vijay N. Lineage-specific protein repeat expansions and contractions reveal malleable regions of immune genes. Genes Immun 2022; 23:218-234. [PMID: 36203090 DOI: 10.1038/s41435-022-00186-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 09/21/2022] [Accepted: 09/22/2022] [Indexed: 01/07/2023]
Abstract
Functional diversification, a higher evolutionary rate, and intense positive selection help a limited number of immune genes interact with many pathogens. Repeats in protein-coding regions are a well-known source of functional diversification, adaptive variation, and evolutionary novelty in a short time. Repeats play a crucial role in biochemical functions like functional diversification of transcription regulation, protein kinases, cell adhesion, signaling pathways, morphogenesis, DNA repair, recombination, and RNA processing. Repeat length variation can change the associated protein's interaction, efficacy, and overall protein network. Repeats have an intrinsic unstable nature and can potentially evolve rapidly and expedite the acquisition of complex phenotypic traits and functions. Because of their ability to generate rapid, adaptive variations over short evolutionary distances, repeats are considered "tuning knobs." Repeat length variation in specific genes, like RUNX2 and ALX4, is associated with morphological and physiological changes across vertebrates. Here we study repeat length variation as a potent source of species-specific immune diversification across several clades of tetrapods. Moreover, we provide a clade-wise comprehensive list of immune genes with repeat types for future studies of morphological/evolutionary changes within species groups. We observe significant repeat length variation of FASLG and C1QC in Rodentia and Primates' contrasting species groups, respectively.
Affiliation(s)
- Lokdeep Teekas
- Department of Biological Sciences, Computational Evolutionary Genomics Lab, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Sandhya Sharma
- Department of Biological Sciences, Computational Evolutionary Genomics Lab, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
- Department of Biological Sciences, Computational Evolutionary Genomics Lab, IISER Bhopal, Bhauri, Madhya Pradesh, India.
14
Lee J, Cho H, Kwon I. Phase separation of low-complexity domains in cellular function and disease. Exp Mol Med 2022; 54:1412-1422. [PMID: 36175485 PMCID: PMC9534829 DOI: 10.1038/s12276-022-00857-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/07/2022] [Revised: 07/15/2022] [Accepted: 07/19/2022] [Indexed: 11/09/2022]
Abstract
In this review, we discuss the ways in which recent studies of low-complexity (LC) domains have challenged our understanding of the mechanisms underlying cellular organization. LC sequences, long believed to function in the absence of a molecular structure, are abundant in the proteomes of all eukaryotic organisms. Over the past decade, the phase separation of LC domains has emerged as a fundamental mechanism driving dynamic multivalent interactions of many cellular processes. We review the key evidence showing the role of phase separation of individual proteins in organizing cellular assemblies and facilitating biological function while implicating the dynamics of phase separation as a key to biological validity and functional utility. We also highlight the evidence showing that pathogenic LC proteins alter various phase separation-dependent interactions to elicit debilitating human diseases, including cancer and neurodegenerative diseases. Progress in understanding the biology of phase separation may offer useful hints toward possible therapeutic interventions to combat the toxicity of pathogenic proteins.
Affiliation(s)
- Jiwon Lee
- Department of Anatomy and Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea
- Hana Cho
- Department of Physiology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea.
- Ilmin Kwon
- Department of Anatomy and Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea.
15
Bigman LS, Iwahara J, Levy Y. Negatively Charged Disordered Regions are Prevalent and Functionally Important Across Proteomes. J Mol Biol 2022; 434:167660. [PMID: 35659505 DOI: 10.1016/j.jmb.2022.167660] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/28/2022] [Revised: 05/20/2022] [Accepted: 05/24/2022] [Indexed: 01/12/2023]
Abstract
Intrinsically disordered regions (IDRs) of proteins are often characterized by a high fraction of charged residues, but differ in their overall net charge and in the organization of the charged residues. The function-encoding information stored via IDR charge composition and organization remains elusive. Here, we aim to decipher the sequence-function relationship in IDRs by presenting a comprehensive bioinformatic analysis of the charge properties of IDRs in the human, mouse, and yeast proteomes. About 50% of the proteins comprise at least a single IDR, which is either positively or negatively charged. Highly negatively charged IDRs are longer and possess greater net charge per residue compared with highly positively charged IDRs. A striking difference between positively and negatively charged IDRs is the characteristics of the repeated units, specifically, of consecutive Lys or Arg residues (K/R repeats) and Asp or Glu (D/E repeats) residues. D/E repeats are found to be about five times longer than K/R repeats, with the longest found containing 49 residues. Long stretches of consecutive D and E are found to be more prevalent in nucleic acid-related proteins. They are less common in prokaryotes, and in eukaryotes their abundance increases with genome size. The functional role of D/E repeats and the profound differences between them and K/R repeats are discussed.
Affiliation(s)
- Lavi S Bigman
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel. https://twitter.com/LaviBigman
- Junji Iwahara
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, TX 77555, United States
- Yaakov Levy
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel.
16
Kim JJ, Lee SY, Hwang Y, Kim S, Chung JM, Park S, Yoon J, Yun H, Ji JH, Chae S, Cho H, Kim CG, Dawson TM, Kim H, Dawson VL, Kang HC. USP39 promotes non-homologous end-joining repair by poly(ADP-ribose)-induced liquid demixing. Nucleic Acids Res 2021; 49:11083-11102. [PMID: 34614178 PMCID: PMC8565343 DOI: 10.1093/nar/gkab892] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/03/2021] [Revised: 09/16/2021] [Accepted: 09/20/2021] [Indexed: 12/18/2022] Open
Abstract
Mutual crosstalk among poly(ADP-ribose) (PAR), activated PAR polymerase 1 (PARP1) metabolites, and DNA repair machinery has emerged as a key regulatory mechanism of the DNA damage response (DDR). However, there is no conclusive evidence of how PAR precisely controls DDR. Herein, six deubiquitinating enzymes (DUBs) associated with PAR-coupled DDR were identified, and the role of USP39, an inactive DUB involved in spliceosome assembly, was characterized. USP39 rapidly localizes to DNA lesions in a PAR-dependent manner, where it regulates non-homologous end-joining (NHEJ) via a tripartite RG motif located in the N-terminus comprising 46 amino acids (N46). Furthermore, USP39 acts as a molecular trigger for liquid demixing in a PAR-coupled N46-dependent manner, thereby directly interacting with the XRCC4/LIG4 complex during NHEJ. In parallel, the USP39-associated spliceosome complex controls homologous recombination repair in a PAR-independent manner. These findings provide mechanistic insights into how PAR chains precisely control DNA repair processes in the DDR.
Affiliation(s)
- Jae Jin Kim
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Life Science, Hallym University, Chuncheon 24252, Republic of Korea
- Seo Yun Lee
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Yiseul Hwang
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Soyeon Kim
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Jee Min Chung
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Sangwook Park
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Junghyun Yoon
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Hansol Yun
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Jae-Hoon Ji
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Biochemistry & Structural Biology, University of Texas Health Science Center, San Antonio, TX, USA
- Sunyoung Chae
- Institute of Medical Science, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Hyeseong Cho
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
- Chan Gil Kim
- Department of Biotechnology, Konkuk University, Chungju 380-701, Republic of Korea
- Ted M Dawson
- Neuroregeneration and Stem Cell Programs, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Hongtae Kim
- Center for Genomic Integrity Institute for Basic Science (IBS), Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea.,School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea
- Valina L Dawson
- Neuroregeneration and Stem Cell Programs, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.,Department of Physiology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Ho Chul Kang
- Genomic Instability Research Center, Ajou University School of Medicine, Suwon, Gyeonggi, 16499, Republic of Korea.,Department of Physiology, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea.,Department of Biomedical Sciences, Ajou University School of Medicine, Suwon, Gyeonggi 16499, Republic of Korea
17
Huang L, Agrawal T, Zhu G, Yu S, Tao L, Lin J, Marmorstein R, Shorter J, Yang X. DAXX represents a new type of protein-folding enabler. Nature 2021; 597:132-137. [PMID: 34408321 PMCID: PMC8485697 DOI: 10.1038/s41586-021-03824-5] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 12/21/2019] [Accepted: 07/15/2021] [Indexed: 12/31/2022]
Abstract
Protein quality control systems are crucial for cellular function and organismal health. At present, most known protein quality control systems are multicomponent machineries that operate via ATP-regulated interactions with non-native proteins to prevent aggregation and promote folding, and few systems that can broadly enable protein folding by a different mechanism have been identified. Moreover, proteins that contain the extensively charged poly-Asp/Glu (polyD/E) region are common in eukaryotic proteomes, but their biochemical activities remain undefined. Here we show that DAXX, a polyD/E protein that has been implicated in diverse cellular processes, possesses several protein-folding activities. DAXX prevents aggregation, solubilizes pre-existing aggregates and unfolds misfolded species of model substrates and neurodegeneration-associated proteins. Notably, DAXX effectively prevents and reverses aggregation of its in vivo-validated client proteins, the tumour suppressor p53 and its principal antagonist MDM2. DAXX can also restore native conformation and function to tumour-associated, aggregation-prone p53 mutants, reducing their oncogenic properties. These DAXX activities are ATP-independent and instead rely on the polyD/E region. Other polyD/E proteins, including ANP32A and SET, can also function as stand-alone, ATP-independent molecular chaperones, disaggregases and unfoldases. Thus, polyD/E proteins probably constitute a multifunctional protein quality control system that operates via a distinctive mechanism.
Affiliation(s)
- Liangqian Huang
- Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Trisha Agrawal
- Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Wilson Sonsini Goodrich & Rosati LP, New York, NY, USA
- Guixin Zhu
- Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Sixiang Yu
- Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Liming Tao
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- JiaBei Lin
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Ronen Marmorstein
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- James Shorter
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Xiaolu Yang
- Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
18
Zaharias S, Zhang Z, Davis K, Fargason T, Cashman D, Yu T, Zhang J. Intrinsically disordered electronegative clusters improve stability and binding specificity of RNA-binding proteins. J Biol Chem 2021; 297:100945. [PMID: 34246632 PMCID: PMC8348266 DOI: 10.1016/j.jbc.2021.100945] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/23/2021] [Revised: 06/28/2021] [Accepted: 07/07/2021] [Indexed: 11/25/2022] Open
Abstract
RNA-binding proteins play crucial roles in various cellular functions and contain abundant disordered protein regions. The disordered regions in RNA-binding proteins are rich in repetitive sequences, such as poly-K/R, poly-N/Q, poly-A, and poly-G residues. Our bioinformatic analysis identified a largely neglected repetitive sequence family we define as electronegative clusters (ENCs) that contain acidic residues and/or phosphorylation sites. The abundance and length of ENCs exceed those of other known repetitive sequences. Despite their abundance, the functions of ENCs in RNA-binding proteins are still elusive. To investigate the impacts of ENCs on protein stability, RNA-binding affinity, and specificity, we selected one RNA-binding protein, the ribosomal biogenesis factor 15 (Nop15), as a model. We found that the Nop15 ENC increases protein stability and inhibits nonspecific RNA binding, but minimally interferes with specific RNA binding. To investigate the effect of ENCs on sequence specificity of RNA binding, we grafted an ENC to another RNA-binding protein, Ser/Arg-rich splicing factor 3. Using RNA Bind-n-Seq, we found that the engineered ENC inhibits disparate RNA motifs differently, instead of weakening all RNA motifs to the same extent. The motif site directly involved in electrostatic interaction is more susceptible to the ENC inhibition. These results suggest that one of the functions of ENCs is to regulate RNA binding via electrostatic interaction. This is consistent with our finding that ENCs are also overrepresented in DNA-binding proteins, but underrepresented in halophiles, in which nonspecific nucleic acid binding is inhibited by high concentrations of salts.
Affiliation(s)
- Steve Zaharias
- Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Zihan Zhang
- Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Kenneth Davis
- Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Talia Fargason
- Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Derek Cashman
- Department of Chemistry, Tennessee Technological University, Cookeville, Tennessee, USA
- Tao Yu
- Department of Chemistry, University of North Dakota, Grand Forks, North Dakota, USA
- Jun Zhang
- Department of Chemistry, College of Arts and Sciences, University of Alabama at Birmingham, Birmingham, Alabama, USA.
19
Cappannini A, Forcelloni S, Giansanti A. Evolutionary pressures and codon bias in low complexity regions of plasmodia. Genetica 2021; 149:217-237. [PMID: 34254217 DOI: 10.1007/s10709-021-00126-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2020] [Accepted: 06/30/2021] [Indexed: 11/25/2022]
Abstract
The biological meaning of low complexity regions in the proteins of Plasmodium species is a topic of discussion in evolutionary biology. There is a debate between selectionists and neutralists, who respectively do or do not attribute to low-complexity regions an effect on the fitness of these parasites. In this work, we comparatively study 22 Plasmodium species to understand whether their low complexity regions undergo a neutral or, rather, a selective and species-dependent evolution. The focus is on the connection between the codon repertoire of the genetic coding sequences and the occurrence of low complexity regions in the corresponding proteins. The first part of the work concerns the correlation between the length of plasmodial proteins and their propensity to embed low complexity regions. Relative synonymous codon usage, entropy, and other indicators reveal that the incidence of low complexity regions and their codon bias are species-specific and subject to selective evolutionary pressure. We also observed that protein length, a relaxed selective pressure, and a broad repertoire of codons in proteins are strongly correlated with the occurrence of low complexity regions. Overall, it seems plausible that the codon bias of low-complexity regions contributes to functional innovation and codon bias enhancement of proteins on which Plasmodium species rest as successful evolutionary parasites.
Affiliation(s)
- Andrea Cappannini
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy.
- Sergio Forcelloni
- Max Planck Institute of Biochemistry, 82152, Martinsried, Germany.,Department of Chemistry, Technical University of Munich, 85748, Garching, Germany
- Andrea Giansanti
- Department of Physics, Sapienza, University of Rome, P.le A. Moro 5, 00185, Roma, Italy.,Istituto Nazionale di Fisica Nucleare, INFN, Roma1 section. 00185, Roma, Italy
20
Lei Y, Zhou Y, Price M, Song Z. Genome-wide characterization of microsatellite DNA in fishes: survey and analysis of their abundance and frequency in genome-specific regions. BMC Genomics 2021; 22:421. [PMID: 34098869 PMCID: PMC8186053 DOI: 10.1186/s12864-021-07752-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/08/2021] [Accepted: 05/24/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Microsatellite repeats are ubiquitous in organism genomes and play an important role in chromatin organization, regulation of gene activity, recombination and DNA replication. Although microsatellite distribution patterns have been studied in most phylogenetic lineages, they are unclear in fish species. RESULTS: Here, we present the first systematic examination of microsatellite distribution in coding and non-coding regions of 14 fish genomes. Our study showed that the number and type of microsatellites displayed nonrandom distribution for both intragenic and intergenic regions, suggesting that they have potential roles in transcriptional or translational regulation and that DNA replication slippage theories alone are insufficient to explain the distribution patterns. Our results showed that microsatellites are dominant in non-coding regions. The total number of microsatellites ranged from 78,378 to 1,012,084, and the relative density varied from 4925.76 bp/Mb to 25,401.97 bp/Mb. Overall, (A + T)-rich repeats were dominant. Repeat abundance showed a very similar decrease with the length of the repeated unit (1-6 nt), whereas in most species more tri-nucleotide than tetra-nucleotide repeats were found in exonic regions. Moreover, the incidence of different repeat types appeared species- and genome-specific. These results highlight potential mechanisms for maintaining microsatellite distribution, such as selective forces and mismatch repair systems. CONCLUSIONS: Our data could be beneficial for studies of genome evolution and microsatellite DNA evolutionary dynamics, and could facilitate the exploration of microsatellite structure, function, composition mode and molecular marker development in these species.
Affiliation(s)
- Yi Lei
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
- Yu Zhou
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
- Megan Price
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China
- Zhaobin Song
- Sichuan Key Laboratory of Conservation Biology on Endangered Wildlife, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China.
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, People's Republic of China.
21
Cascarina SM, King DC, Osborne Nishimura E, Ross ED. LCD-Composer: an intuitive, composition-centric method enabling the identification and detailed functional mapping of low-complexity domains. NAR Genom Bioinform 2021; 3:lqab048. [PMID: 34056598 PMCID: PMC8153834 DOI: 10.1093/nargab/lqab048] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/10/2020] [Revised: 04/13/2021] [Accepted: 05/06/2021] [Indexed: 02/07/2023] Open
Abstract
Low complexity domains (LCDs) in proteins are regions predominantly composed of a small subset of the possible amino acids. LCDs are involved in a variety of normal and pathological processes across all domains of life. Existing methods define LCDs using information-theoretical complexity thresholds, sequence alignment with repetitive regions, or statistical overrepresentation of amino acids relative to whole-proteome frequencies. While these methods have proven valuable, they are all indirectly quantifying amino acid composition, which is the fundamental and biologically-relevant feature related to protein sequence complexity. Here, we present a new computational tool, LCD-Composer, that directly identifies LCDs based on amino acid composition and linear amino acid dispersion. Using LCD-Composer's default parameters, we identified simple LCDs across all organisms available through UniProt and provide the resulting data in an accessible form as a resource. Furthermore, we describe large-scale differences between organisms from different domains of life and explore organisms with extreme LCD content for different LCD classes. Finally, we illustrate the versatility and specificity achievable with LCD-Composer by identifying diverse classes of LCDs using both simple and multifaceted composition criteria. We demonstrate that the ability to dissect LCDs based on these multifaceted criteria enhances the functional mapping and classification of LCDs.
Affiliation(s)
- Sean M Cascarina
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- David C King
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Erin Osborne Nishimura
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Eric D Ross
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
22
Influence of nascent polypeptide positive charges on translation dynamics. Biochem J 2021; 477:2921-2934. [PMID: 32797214 DOI: 10.1042/bcj20200303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/19/2020] [Revised: 07/17/2020] [Accepted: 07/23/2020] [Indexed: 01/05/2023]
Abstract
Protein segments with a high concentration of positively charged amino acid residues are often used in reporter constructs designed to activate ribosomal mRNA/protein decay pathways, such as those involving nonstop mRNA decay (NSD), no-go mRNA decay (NGD) and the ribosome quality control (RQC) complex. It has been proposed that the electrostatic interaction of the positively charged nascent peptide with the negatively charged ribosomal exit tunnel leads to translation arrest. When stalled long enough, the translation process is terminated with the degradation of the transcript and an incomplete protein. Although early experiments made a strong argument for this mechanism, other features associated with positively charged reporters, such as codon bias and mRNA and protein structure, have emerged as potent inducers of ribosome stalling. We carefully reviewed the published data on the protein and mRNA expression of artificial constructs with diverse compositions as assessed in different organisms. We concluded that, although polybasic sequences generally lead to lower translation efficiency, it appears that an aggravating factor, such as a nonoptimal codon composition, is necessary to cause translation termination events.
23
Determination of seventeen free amino acids in human urine and plasma samples using quadruple isotope dilution mass spectrometry combined with hydrophilic interaction liquid chromatography - Tandem mass spectrometry. J Chromatogr A 2021; 1641:461970. [PMID: 33611120 DOI: 10.1016/j.chroma.2021.461970] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/02/2021] [Revised: 02/01/2021] [Accepted: 02/03/2021] [Indexed: 11/23/2022]
Abstract
Taking into account the growing demand for new analytical procedures that are appropriate for analysis of complex biological samples with increased sensitivity, accuracy and precision, a novel analytical method is described for the determination of underivatized amino acids in human plasma and urine samples. The presented analytical procedure involved the direct analysis of urine samples and the analysis of plasma samples followed by a simple protein precipitation protocol. Samples were analyzed using a simple and fast chromatographic method developed for the determination of 17 different amino acids by liquid chromatography - tandem mass spectrometry. The limits of detection and quantification for the amino acids ranged from 0.03 to 2.26 µmol kg⁻¹ and from 0.09 to 7.54 µmol kg⁻¹, respectively. Matrix effects of plasma and urine on the quantification of analytes were determined by spiking experiments. The accuracy of the method was evaluated by matrix matching and quadruple isotope dilution strategies. Excellent accuracy and precision were obtained with the use of isotope-labeled amino acids, demonstrating the high reliability and reproducibility of the proposed method. The percent recovery values were between 98.70 and 101.68% with %RSD below 1.62% for human plasma, and between 99.14 and 101.78% with %RSD below 2.44% for urine samples.
24
Barros GC, Requião RD, Carneiro RL, Masuda CA, Moreira MH, Rossetto S, Domitrovic T, Palhano FL. Rqc1 and other yeast proteins containing highly positively charged sequences are not targets of the RQC complex. J Biol Chem 2021; 296:100586. [PMID: 33774050 PMCID: PMC8102910 DOI: 10.1016/j.jbc.2021.100586] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/21/2020] [Revised: 03/12/2021] [Accepted: 03/23/2021] [Indexed: 02/06/2023] Open
Abstract
Previous work has suggested that highly positively charged protein segments coded by rare codons or poly (A) stretches induce ribosome stalling and translational arrest through electrostatic interactions with the negatively charged ribosome exit tunnel, leading to inefficient elongation. This arrest leads to the activation of the Ribosome Quality Control (RQC) pathway and results in low expression of these reporter proteins. However, the only endogenous yeast proteins known to activate the RQC are Rqc1, a protein essential for RQC function, and Sdd1, a protein with unknown function, both of which contain polybasic sequences. To explore the generality of this phenomenon, we investigated whether the RQC complex controls the expression of other proteins with polybasic sequences. We showed by ribosome profiling data analysis and western blot that proteins containing polybasic sequences similar to, or even more positively charged than those of Rqc1 and Sdd1, were not targeted by the RQC complex. We also observed that the previously reported Ltn1-dependent regulation of Rqc1 is posttranslational, independent of the RQC activity. Taken together, our results suggest that RQC should not be regarded as a general regulatory pathway for the expression of highly positively charged proteins in yeast.
Affiliation(s)
- Géssica C Barros
  - Programa de Biologia Estrutural, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Rodrigo D Requião
  - Programa de Biologia Estrutural, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Rodolfo L Carneiro
  - Programa de Biologia Estrutural, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Claudio A Masuda
  - Programa de Biologia Molecular e Biotecnologia, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Mariana H Moreira
  - Programa de Biologia Estrutural, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Silvana Rossetto
  - Departamento de Ciência da Computação, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Tatiana Domitrovic
  - Departamento de Virologia, Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Fernando L Palhano
  - Programa de Biologia Estrutural, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
25
Mier P, Andrade-Navarro MA. Assessing the low complexity of protein sequences via the low complexity triangle. PLoS One 2020; 15:e0239154. [PMID: 33378336 PMCID: PMC7773278 DOI: 10.1371/journal.pone.0239154] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/24/2020] [Accepted: 08/31/2020] [Indexed: 11/24/2022] Open
Abstract
Background: Proteins with low complexity regions (LCRs) have atypical sequence and structural features. Their amino acid composition varies from the expected, determined proteome-wise, and they do not follow the rules of structural folding that prevail in globular regions. One way to characterize these regions is by assessing the repeatability of a sequence, that is, calculating the local propensity of a region to be part of a repeat. Results: We combine two local measures of low complexity, repeatability (using the RES algorithm) and fraction of the most frequent amino acid, to evaluate different proteomes, datasets of protein regions with specific features, and individual cases of proteins with extreme compositions. We apply a representation called ‘low complexity triangle’ as a proof-of-concept to represent the low complexity measured values. Results show that proteomes have distinct signatures in the low complexity triangle, and that these signatures are associated with complexity features of the sequences. We developed a web tool called LCT (http://cbdm-01.zdv.uni-mainz.de/~munoz/lct/) to allow users to calculate the low complexity triangle of a given protein or region of interest. Conclusions: The low complexity triangle proves to be a suitable procedure to represent the general low complexity of a sequence or protein dataset. Homorepeats, direpeats, compositionally biased regions and globular regions occupy characteristic positions in the triangle. The described pipeline can be used to characterize LCRs and may help in quantifying the content of degenerated tandem repeats in proteins and proteomes.
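One of the two measures named in this abstract, the fraction of the most frequent amino acid, can be illustrated with a minimal sketch. The function name `top_residue_fraction` and the example regions are hypothetical and not taken from the paper or from the LCT tool; the repeatability measure (RES) is not reproduced here because the abstract does not specify it.

```python
from collections import Counter


def top_residue_fraction(region):
    """Return the most frequent residue in `region` and its fraction of the region."""
    if not region:
        raise ValueError("region must be a non-empty sequence")
    # most_common(1) yields a single (residue, count) pair for the top residue.
    residue, count = Counter(region).most_common(1)[0]
    return residue, count / len(region)


# Hypothetical example regions, chosen only for illustration.
homorepeat = "QQQQQQQQQQ"   # perfect homorepeat
biased = "QQQPQQAQQQ"       # compositionally biased, two impurities
mixed = "MKVLAAGIDE"        # globular-like, no dominant residue

for name, seq in [("homorepeat", homorepeat), ("biased", biased), ("mixed", mixed)]:
    residue, fraction = top_residue_fraction(seq)
    print(f"{name}: most frequent residue {residue}, fraction {fraction:.2f}")
```

A value near 1.0 indicates a homorepeat-like region, while values near the reciprocal of the number of distinct residues suggest no compositional bias.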
Affiliation(s)
- Pablo Mier
  - Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
- Miguel A. Andrade-Navarro
  - Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
26
Persi E, Wolf YI, Horn D, Ruppin E, Demichelis F, Gatenby RA, Gillies RJ, Koonin EV. Mutation-selection balance and compensatory mechanisms in tumour evolution. Nat Rev Genet 2020; 22:251-262. [PMID: 33257848 DOI: 10.1038/s41576-020-00299-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Accepted: 10/16/2020] [Indexed: 12/11/2022]
Abstract
Intratumour heterogeneity and phenotypic plasticity, sustained by a range of somatic aberrations, as well as epigenetic and metabolic adaptations, are the principal mechanisms that enable cancers to resist treatment and survive under environmental stress. A comprehensive picture of the interplay between different somatic aberrations, from point mutations to whole-genome duplications, in tumour initiation and progression is lacking. We posit that different genomic aberrations generally exhibit a temporal order, shaped by a balance between the levels of mutations and selective pressures. Repeat instability emerges first, followed by larger aberrations, with compensatory effects leading to robust tumour fitness maintained throughout the tumour progression. A better understanding of the interplay between genetic aberrations, the microenvironment, and epigenetic and metabolic cellular states is essential for early detection and prevention of cancer as well as development of efficient therapeutic strategies.
Affiliation(s)
- Erez Persi
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- David Horn
  - School of Physics and Astronomy, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel-Aviv University, Tel-Aviv, Israel
- Eytan Ruppin
  - Cancer Data Science Lab, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Francesca Demichelis
  - Department for Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
  - Caryl and Israel Englander Institute for Precision Medicine, New York Presbyterian Hospital, Weill Cornell Medicine, New York, NY, USA
- Robert A Gatenby
  - Integrated Mathematical Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Robert J Gillies
  - Department of Cancer Physiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
27
Mary Rajathei D, Parthasarathy S, Selvaraj S. HPREP: a comprehensive database for human proteome repeats. J Integr Bioinform 2020; 0:/j/jib.ahead-of-print/jib-2020-0024/jib-2020-0024.xml. [PMID: 33136065 DOI: 10.1515/jib-2020-0024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2020] [Accepted: 09/17/2020] [Indexed: 11/15/2022] Open
Abstract
Amino acid repeats play important roles in both the structure and function of proteins. They are found in all kingdoms of life, especially in eukaryotes, and a large fraction of human proteins are composed of repeats. Further, abnormal expansions of shorter repeats cause various diseases in humans. Therefore, analysis of the repeats of the entire human proteome, along with functional, mutational and disease information, would help to better understand their roles in proteins. To fulfill this need, we developed a web database, HPREP (http://bioinfo.bdu.ac.in/hprep), for human proteome repeats using Perl and HTML programming. We identified different categories of well-characterized repeats and domain repeats that are present in the human proteome of UniProtKB/Swiss-Prot by using in-house Perl programming, and novel repeats by using the repeat detection tool T-REKS as well as the XSTREAM web server. Further, these proteins are annotated with functional, mutational and disease information and grouped according to specific repeat types. The database enables users to search by specific repeat type in order to understand their involvement in proteins. Thus, the HPREP database is expected to be a useful resource for gaining better insight into the different repeats in the human proteome and their biological roles.
Affiliation(s)
- David Mary Rajathei
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
- Subbiah Parthasarathy
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
- Samuel Selvaraj
  - Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli 620 024, India
28
Wilken SE, Seppälä S, Lankiewicz TS, Saxena M, Henske JK, Salamov AA, Grigoriev IV, O’Malley MA. Genomic and proteomic biases inform metabolic engineering strategies for anaerobic fungi. Metab Eng Commun 2020; 10:e00107. [PMID: 31799118 PMCID: PMC6883316 DOI: 10.1016/j.mec.2019.e00107] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/10/2019] [Revised: 10/24/2019] [Accepted: 11/04/2019] [Indexed: 12/22/2022] Open
Abstract
Anaerobic fungi (Neocallimastigomycota) are emerging non-model hosts for biotechnology due to their wealth of biomass-degrading enzymes, yet tools to engineer these fungi have not yet been established. Here, we show that the anaerobic gut fungi have the most GC depleted genomes among 443 sequenced organisms in the fungal kingdom, which has ramifications for heterologous expression of genes as well as for emerging CRISPR-based genome engineering approaches. Comparative genomic analyses suggest that anaerobic fungi may contain cellular machinery to aid in sexual reproduction, yet a complete mating pathway was not identified. Predicted proteomes of the anaerobic fungi also contain an unusually large fraction of proteins with homopolymeric amino acid runs consisting of five or more identical consecutive amino acids. In particular, threonine runs are especially enriched in anaerobic fungal carbohydrate active enzymes (CAZymes) and this, together with a high abundance of predicted N-glycosylation motifs, suggests that gut fungal CAZymes are heavily glycosylated, which may impact heterologous production of these biotechnologically useful enzymes. Finally, we present a codon optimization strategy to aid in the development of genetic engineering tools tailored to these early-branching anaerobic fungi.
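The definition used in this abstract, a homopolymeric run of five or more identical consecutive amino acids, can be sketched with a regular-expression backreference. This is a minimal illustration under that stated definition only; the function name `homopolymer_runs` and the example sequence are hypothetical and do not come from the paper.

```python
import re


def homopolymer_runs(sequence, min_len=5):
    """Return (residue, start, length) for each run of >= min_len identical residues."""
    # (.) captures one residue; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0))) for m in pattern.finditer(sequence)]


# Hypothetical sequence: a run of 6 T, a run of 5 Q, and a run of 4 N (too short).
example = "MK" + "T" * 6 + "SA" + "Q" * 5 + "LL" + "N" * 4 + "G"
runs = homopolymer_runs(example)

for residue, start, length in runs:
    print(f"{residue} run of length {length} starting at index {start}")
```

Start positions are 0-based. Counting the proteins in a proteome that contain at least one such run would give the kind of fraction the abstract refers to, although the authors' exact pipeline is not described here.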
Affiliation(s)
- St. Elmo Wilken
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
- Susanna Seppälä
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
- Thomas S. Lankiewicz
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
  - Department of Evolution Ecology and Marine Biology, University of California, Santa Barbara, CA, 93106, USA
- Mohan Saxena
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
- John K. Henske
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
- Asaf A. Salamov
  - US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Igor V. Grigoriev
  - US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Michelle A. O’Malley
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, 93106, USA
29
Pavlovic Djuranovic S, Erath J, Andrews RJ, Bayguinov PO, Chung JJ, Chalker DL, Fitzpatrick JAJ, Moss WN, Szczesny P, Djuranovic S. Plasmodium falciparum translational machinery condones polyadenosine repeats. eLife 2020; 9:e57799. [PMID: 32469313 PMCID: PMC7295572 DOI: 10.7554/elife.57799] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/16/2020] [Accepted: 05/28/2020] [Indexed: 01/04/2023] Open
Abstract
Plasmodium falciparum is a causative agent of human malaria. Sixty percent of mRNAs from its extremely AT-rich (81%) genome harbor long polyadenosine (polyA) runs within their ORFs, distinguishing the parasite from its hosts and other sequenced organisms. Recent studies indicate polyA runs cause ribosome stalling and frameshifting, triggering mRNA surveillance pathways and attenuating protein synthesis. Here, we show that P. falciparum is an exception to this rule. We demonstrate that both endogenous genes and reporter sequences containing long polyA runs are efficiently and accurately translated in P. falciparum cells. We show that polyA runs do not elicit any response from No Go Decay (NGD) or result in the production of frameshifted proteins. This is in stark contrast to what we observe in human cells or T. thermophila, an organism with similar AT-content. Finally, using stalling reporters we show that Plasmodium cells evolved not to have a fully functional NGD pathway.
Affiliation(s)
- Jessey Erath
  - Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, United States
- Ryan J Andrews
  - Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, United States
- Peter O Bayguinov
  - Washington University Center for Cellular Imaging, Washington University School of Medicine, St. Louis, United States
- Joyce J Chung
  - Department of Biology, Washington University, St Louis, United States
- James AJ Fitzpatrick
  - Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, United States
  - Washington University Center for Cellular Imaging, Washington University School of Medicine, St. Louis, United States
  - Department of Neuroscience, Washington University School of Medicine, St. Louis, United States
  - Department of Biomedical Engineering, Washington University, St Louis, United States
- Walter N Moss
  - Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, United States
- Pawel Szczesny
  - Institute of Biochemistry and Biophysics Polish Academy of Sciences, Department of Bioinformatics, Warsaw, Poland
- Sergej Djuranovic
  - Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, United States
30
Mier P, Elena-Real C, Urbanek A, Bernadó P, Andrade-Navarro MA. The importance of definitions in the study of polyQ regions: A tale of thresholds, impurities and sequence context. Comput Struct Biotechnol J 2020; 18:306-313. [PMID: 32071707 PMCID: PMC7016039 DOI: 10.1016/j.csbj.2020.01.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/23/2019] [Revised: 12/13/2019] [Accepted: 01/30/2020] [Indexed: 12/18/2022] Open
Abstract
Polyglutamine (polyQ) regions are one of the most prevalent homorepeats in eukaryotes. It is, however, difficult to evaluate their prevalence because various studies claim different results. The reason is the lack of a consensus on what defines a polyQ region. We have tackled this issue by studying how the use of different thresholds (i.e., the minimum number of glutamines required in a protein region of a given size) to detect polyQ regions in the human proteome influences not only their prevalence but also their general features and sequence context. Threshold definition shapes the length distribution of the polyQ dataset, and changes the observed number and position of impurities (amino acids other than glutamine) within polyQ regions. Irrespective of the chosen threshold, leucine and proline residues are enriched both within and around polyQ. While leucine is enriched at the N-terminus of polyQ, especially at position -1 (the amino acid preceding the polyQ), proline is prevalent at the C-terminus (positions +1 to +5, that is, the first five amino acids after the polyQ). We also checked the suitability of these thresholds for other species, and compared their polyQ features with those found in humans. As the sequence context and features of polyQ regions are threshold-dependent, we propose a method to quickly scan the polyQ landscape of a proteome. We complement our results with a summarized overview of which biases are to be expected per threshold when studying polyQ regions.
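The threshold idea described in this abstract (a minimum number of glutamines within a protein region of a given size) can be sketched as a sliding-window scan. The sketch below is an assumption-laden illustration, not the authors' method: the function name `polyq_regions`, the default values `min_q=8` and `window=10`, the rule of merging overlapping windows, and the example sequence are all hypothetical.

```python
def polyq_regions(sequence, min_q=8, window=10):
    """Return merged (start, end) spans where a window holds at least min_q glutamines.

    Spans are 0-based and half-open. Overlapping or touching windows are merged,
    which is an assumption made for this sketch.
    """
    regions = []
    for start in range(len(sequence) - window + 1):
        end = start + window
        if sequence[start:end].count("Q") >= min_q:
            if regions and start <= regions[-1][1]:
                # Extend the previous span instead of opening a new one.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Hypothetical sequence: a polyQ tract interrupted by one proline "impurity".
seq = "MA" + "Q" * 4 + "P" + "Q" * 6 + "LL" + "A" * 10

print(polyq_regions(seq, min_q=8, window=10))   # permissive threshold
print(polyq_regions(seq, min_q=10, window=10))  # pure-repeat threshold
```

With the permissive threshold the interrupted tract is reported as one region, whereas requiring ten glutamines in ten positions reports nothing, which mirrors the abstract's point that prevalence and observed impurities depend on the chosen threshold.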
Affiliation(s)
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Carlos Elena-Real
  - Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090 Montpellier, France
- Annika Urbanek
  - Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090 Montpellier, France
- Pau Bernadó
  - Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 29, rue de Navacelles, 34090 Montpellier, France
- Miguel A. Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
31
Atypical structural tendencies among low-complexity domains in the Protein Data Bank proteome. PLoS Comput Biol 2020; 16:e1007487. [PMID: 31986130 PMCID: PMC7004392 DOI: 10.1371/journal.pcbi.1007487] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/10/2019] [Revised: 02/06/2020] [Accepted: 12/23/2019] [Indexed: 11/29/2022] Open
Abstract
A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the Protein Data Bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure tendencies across the entire PDB proteome. Secondary structure tendencies varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure tendencies. Comparison of LCD secondary structure tendencies with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure tendencies as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural tendencies among LCDs parsed by the nature and magnitude of single amino acid enrichment. The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. 
For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
32
Lucchese G, Flöel A, Stahl B. A Peptide Link Between Human Cytomegalovirus Infection, Neuronal Migration, and Psychosis. Front Psychiatry 2020; 11:349. [PMID: 32457660 PMCID: PMC7225321 DOI: 10.3389/fpsyt.2020.00349] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/21/2019] [Accepted: 04/06/2020] [Indexed: 01/28/2023] Open
Abstract
Alongside biological, psychological, and social risk factors, psychotic syndromes may be related to disturbances of neuronal migration. This highly complex process characterizes the developing brain of the fetus, the early postnatal brain, and the adult brain, as reflected by changes within the subventricular zone and the dentate gyrus of the hippocampus, where neurogenesis persists throughout life. Psychosis also appears to be linked to human cytomegalovirus (HCMV) infection. However, little is known about the connection between psychosis, HCMV infection, and disruption of neuronal migration. The present study addresses the hypothesis that HCMV infection may lead to mental disorders through mechanisms of autoimmune cross-reactivity. Searching for common peptides that underlie immune cross-reactions, the analyses focus on HCMV and human proteins involved in neuronal migration. Results demonstrate a large overlap of viral peptides with human proteins associated with neuronal migration, such as ventral anterior homeobox 1 and cell adhesion molecule 1 implicated in GABAergic and glutamatergic neurotransmission. The present findings support the possibility of immune cross-reactivity between HCMV and human proteins that, when altered, mutated, or improperly functioning, may disrupt normal neuronal migration. In addition, these findings are consistent with a molecular and mechanistic framework for pathological sequences of events, beginning with HCMV infection, followed by immune activation, cross-reactivity, and neuronal protein variations that may ultimately contribute to the emergence of mental disorders, including psychosis.
Affiliation(s)
- Guglielmo Lucchese
  - Department of Neurology, University of Greifswald, Greifswald, Germany
  - Department of Computing, Goldsmiths, University of London, London, United Kingdom
- Agnes Flöel
  - Department of Neurology, University of Greifswald, Greifswald, Germany
  - Partner Site Rostock/Greifswald, German Center for Neurodegenerative Diseases, Greifswald, Germany
- Benjamin Stahl
  - Department of Neurology, University of Greifswald, Greifswald, Germany
  - Department of Neurology, Charité Universitätsmedizin Berlin, Berlin, Germany
  - Department of Neurophysics, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Psychologische Hochschule Berlin, Berlin, Germany
33
Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD. Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved. Nucleic Acids Res 2019; 47:9998-10009. [PMID: 31504783 PMCID: PMC6821194 DOI: 10.1093/nar/gkz730] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/15/2019] [Revised: 07/16/2019] [Accepted: 08/15/2019] [Indexed: 01/27/2023] Open
Abstract
We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs actually have a significant, persistent and highly conserved presence and role in many and diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal-ions and polysaccharides. In addition, LCRs have been repeatedly identified in very ancient, and usually highly expressed, proteins of the translation machinery. Finally, based on the amino acid content enriched in certain categories, we have developed a neural network web server to identify LCRs and accurately predict whether they can bind nucleic acids or metal-ions, or are involved in chaperone functions. An evaluation of the tool showed that it is highly accurate for eukaryotic proteins as well.
Affiliation(s)
- Chrysa Ntountoumi
  - Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Panayotis Vlastaridis
  - Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Dimitris Mossialos
  - Microbial Biotechnology-Molecular Bacteriology-Virology Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Vasilios Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, New Campus, University of Cyprus, PO Box 20537, CY-1678 Nicosia, Cyprus
- Stephen G Oliver
  - Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, CB2 1GA, UK
- Grigoris D Amoutzias
  - Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
34
Repeatability in protein sequences. J Struct Biol 2019; 208:86-91. [PMID: 31408700 DOI: 10.1016/j.jsb.2019.08.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/03/2019] [Revised: 08/06/2019] [Accepted: 08/08/2019] [Indexed: 02/07/2023]
Abstract
Low complexity regions (LCRs) in protein sequences have special properties that are very different from those of globular proteins. The rules that define secondary structure elements do not apply when the distribution of amino acids becomes biased. While there is a tendency towards structural disorder in LCRs, various examples, particularly homorepeats of single amino acids, suggest that very short repeats could adopt structures that are very difficult to predict. These structures are possibly variable and dependent on the context of intra- or inter-molecular interactions. In general, short repeats in LCRs can induce structure. This could explain the observation that very short (non-perfect) repeats are widespread and that many define regions with a function in protein interactions. For these reasons, we have developed an algorithm to quickly analyze local repeatability along protein sequences, that is, how close a protein fragment is to a perfect repeat. Using this algorithm, we found that the proteins of the yeast Saccharomyces cerevisiae are depleted in short repeats (approximate or not) of odd length, while human proteins are not; that the fish Danio rerio has many proteins with repeats of length two; and that the plant Arabidopsis thaliana has an unusually large number of repeats of length seven. Our method (REpeatability Scanner, RES, accessible at http://cbdm-01.zdv.uni-mainz.de/~munoz/res/) makes it possible to find regions with approximate short repeats in protein sequences, and helps to characterize the variable use of LCRs and compositional bias in different organisms.
35
Proteomic and genomic signatures of repeat instability in cancer and adjacent normal tissues. Proc Natl Acad Sci U S A 2019; 116:16987-16996. [PMID: 31387980 DOI: 10.1073/pnas.1908790116] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022] Open
Abstract
Repetitive sequences are hotspots of evolution at multiple levels. However, due to difficulties involved in their assembly and analysis, the role of repeats in tumor evolution is poorly understood. We developed a rigorous motif-based methodology to quantify variations in the repeat content, beyond microsatellites, in proteomes and genomes directly from proteomic and genomic raw data. This method was applied to a wide range of tumors and normal tissues. We identify high similarity between repeat instability patterns in tumors and their patient-matched adjacent normal tissues. Nonetheless, tumor-specific signatures both in protein expression and in the genome strongly correlate with cancer progression and robustly predict the tumorigenic state. In a patient, the hierarchy of genomic repeat instability signatures accurately reconstructs tumor evolution, with primary tumors differentiated from metastases. We observe an inverse relationship between repeat instability and point mutation load within and across patients independent of other somatic aberrations. Thus, repeat instability is a distinct, transient, and compensatory adaptive mechanism in tumor evolution and a potential signal for early detection.
36
Dynamics of repeat-associated plasticity in the aaap gene family in Anaplasma marginale. Gene X 2019; 721S:100010. [PMID: 32099970 PMCID: PMC7041399 DOI: 10.1016/j.gene.2019.100010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2019] [Revised: 02/08/2019] [Accepted: 02/14/2019] [Indexed: 11/23/2022] Open
Abstract
Anaplasmosis, the most prevalent tick-transmitted disease of cattle, is caused by the rickettsial intracellular parasite Anaplasma marginale. The pathogen replicates within a parasitophorous vacuole formed from the invagination of the erythrocyte membrane. Several strains of A. marginale form "tails" or "appendages" which are attached to, and extend out from, the cytoplasmic side of the parasitophorous vacuole. Genomic analysis of the parasite antigen distributed along the appendage led to the discovery of the aaap (Anaplasma appendage associated protein) gene family located within a highly plastic region in the genome. The aaap gene family consists of aaap and several alps (for aaap-like proteins), depending on the strain. These genes/proteins are characterized by repeat sequences. To investigate locus plasticity, different versions of the locus were cloned from the same strain as well as from different strains, sequenced and aligned to identify changes. Our findings show that repeat sequences both within and between genes facilitated rearrangement events within the locus. Structural variation of the locus in the St. Maries strain was further investigated during infection of different cellular environments, i.e., bovine erythrocytes and tick cells, with a reduction in subpopulations of the aaap locus within the tick as compared to erythrocytes. Interestingly, subpopulations bearing alternative locus structures began to arise again when the pathogen was transferred from the tick environment into a naïve calf. Additionally, the Aaap protein expression profile between blood and tick samples showed a regulatory shift, indicating a host-specific response. Alignment of the protein sequences from different species of Anaplasma reveals six similar repeating motifs that appear to be unique to a few species of Anaplasma. 
The role the aaap locus may play in the pathogenesis of the bovine host or in tick infection/transmission remains unknown; however, the changes in aaap locus subpopulations, locus structure, and protein expression indicate that these genes have a role in strain diversification.
|
37
|
Cascarina SM, Ross ED. Proteome-scale relationships between local amino acid composition and protein fates and functions. PLoS Comput Biol 2018; 14:e1006256. [PMID: 30248088 PMCID: PMC6171957 DOI: 10.1371/journal.pcbi.1006256] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/30/2018] [Revised: 10/04/2018] [Accepted: 08/16/2018] [Indexed: 11/26/2022] Open
Abstract
Proteins with low-complexity domains continue to emerge as key players in both normal and pathological cellular processes. Although low-complexity domains are often grouped into a single class, individual low-complexity domains can differ substantially with respect to amino acid composition. These differences may strongly influence the physical properties, cellular regulation, and molecular functions of low-complexity domains. Therefore, we developed a bioinformatic approach to explore relationships between amino acid composition, protein metabolism, and protein function. We find that local compositional enrichment within protein sequences is associated with differences in translation efficiency, abundance, half-life, protein-protein interaction promiscuity, subcellular localization, and molecular functions of proteins on a proteome-wide scale. However, local enrichment of related amino acids is sometimes associated with opposite effects on protein regulation and function, highlighting the importance of distinguishing between different types of low-complexity domains. Furthermore, many of these effects are discernible at amino acid compositions below those required for classification as low-complexity or statistically-biased by traditional methods and in the absence of homopolymeric amino acid repeats, indicating that thresholds employed by classical methods may not reflect biologically relevant criteria. Application of our analyses to composition-driven processes, such as the formation of membraneless organelles, reveals distinct composition profiles even for closely related organelles. Collectively, these results provide a unique perspective and detailed insights into relationships between amino acid composition, protein metabolism, and protein functions.
Low-complexity domains in protein sequences are regions that are composed of only a few amino acids in the protein “alphabet”. These domains often have unique chemical properties and play important biological roles in both normal and disease-related processes. While a number of approaches have been developed to define low-complexity domains, these methods each possess conceptual limitations. Therefore, we developed a complementary approach that focuses on local amino acid composition (i.e. the amino acid composition within small regions of proteins). We find that high local composition of individual amino acids is associated with pervasive effects on protein metabolism, subcellular localization, and molecular function on a proteome-wide scale. Importantly, the nature of the effects depends on the type of amino acid enriched within the examined domains, and the effects are observable in the absence of classically-defined low-complexity (and related) domains. Furthermore, we define the compositions of proteins involved in the formation of membraneless, protein-rich organelles such as stress granules and P-bodies. Our results provide a coherent view and unprecedented resolution of the effects of local amino acid enrichment on protein biology.
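The notion of local amino acid composition described in this abstract can be illustrated with a sliding window. The following is a minimal sketch, not the authors' pipeline: the function name `max_local_fraction`, the default window of 20 residues, and the toy sequences are all assumptions made for illustration.

```python
def max_local_fraction(seq: str, residue: str, window: int = 20) -> float:
    """Return the highest fraction of `residue` in any window of `window` residues.

    Hypothetical helper: the window size is an illustrative default,
    not a value taken from the cited study.
    """
    if not seq:
        return 0.0
    if len(seq) < window:
        # Sequence shorter than one window: use the whole sequence.
        return seq.count(residue) / len(seq)
    count = seq[:window].count(residue)
    best = count
    for i in range(window, len(seq)):
        # Slide by one residue: add the entering one, drop the leaving one.
        count += (seq[i] == residue) - (seq[i - window] == residue)
        best = max(best, count)
    return best / window


if __name__ == "__main__":
    toy = "AAAAQQQQQQAAAA"  # toy sequence, not a real protein
    print(max_local_fraction(toy, "Q", window=6))
```

A sequence can score highly here without containing a long uninterrupted repeat, which is the distinction the abstract draws between local enrichment and homopolymeric repeats.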
Affiliation(s)
- Sean M. Cascarina
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO, United States of America
| | - Eric D. Ross
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO, United States of America
| |
|
38
|
Arthur LL, Djuranovic S. PolyA tracks, polybasic peptides, poly-translational hurdles. Wiley Interdiscip Rev RNA 2018; 9:e1486. [PMID: 29869837 PMCID: PMC6281860 DOI: 10.1002/wrna.1486] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/22/2017] [Revised: 04/25/2018] [Accepted: 04/26/2018] [Indexed: 12/26/2022]
Abstract
The abundance of messenger RNA (mRNA) is one of the major determinants of protein synthesis. As such, factors that influence mRNA stability often contribute to gene regulation. Polyadenylation of the 3' end of mRNA transcripts, the poly(A) tail, has long been recognized as one of these regulatory elements given its influence on translation efficiency and mRNA stability. Unwanted translation of the poly(A) tail signals to the cell an aberrant polyadenylation event or the lack of stop codons, which makes this sequence an important element in translation fidelity and mRNA surveillance response. Consequently, investigations into the effects of the poly(A) tail led to the discoveries that poly-lysine as well as other polybasic peptide sequences and, to a much greater extent, polyA mRNA sequences within the open reading frame influence mRNA stability and translational efficiency. Conservation and evolutionary selection of codon usage in polyA track sequences across multiple organisms suggest a biological significance for coding polyA tracks in the regulation of gene expression. Here, we discuss the cellular responses and consequences of coding polyA track translation and synthesis of polybasic peptides. This article is categorized under: Translation > Translation Mechanisms; Translation > Translation Regulation; RNA Turnover and Surveillance > Turnover/Surveillance Mechanisms.
Affiliation(s)
- Laura L Arthur
- Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, Missouri
| | - Sergej Djuranovic
- Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, Missouri
| |
|
39
|
Press MO, McCoy RC, Hall AN, Akey JM, Queitsch C. Massive variation of short tandem repeats with functional consequences across strains of Arabidopsis thaliana. Genome Res 2018; 28:1169-1178. [PMID: 29970452 PMCID: PMC6071631 DOI: 10.1101/gr.231753.117] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 11/01/2017] [Accepted: 06/26/2018] [Indexed: 11/24/2022]
Abstract
Short tandem repeat (STR) mutations may comprise more than half of the mutations in eukaryotic coding DNA, yet STR variation is rarely examined as a contributor to complex traits. We assessed this contribution across a collection of 96 strains of Arabidopsis thaliana, genotyping 2046 STR loci each, using highly parallel STR sequencing with molecular inversion probes. We found that 95% of examined STRs are polymorphic, with a median of six alleles per STR across these strains. STR expansions (large copy number increases) are found in most strains, several of which have evident functional effects. These include three of six intronic STR expansions we found to be associated with intron retention. Coding STRs were depleted of variation relative to noncoding STRs, and we detected a total of 56 coding STRs (11%) showing low variation consistent with the action of purifying selection. In contrast, some STRs show hypervariable patterns consistent with diversifying selection. Finally, we detected 133 novel STR-phenotype associations under stringent criteria, most of which could not be detected with SNPs alone, and validated some with follow-up experiments. Our results support the conclusion that STRs constitute a large, unascertained reservoir of functionally relevant genomic variation.
Affiliation(s)
- Maximilian O Press
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Rajiv C McCoy
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Ashley N Hall
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Molecular and Cellular Biology Program, University of Washington, Seattle, Washington 98195, USA
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Christine Queitsch
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| |
|
40
|
Jamsheer K M, Shukla BN, Jindal S, Gopan N, Mannully CT, Laxmi A. The FCS-like zinc finger scaffold of the kinase SnRK1 is formed by the coordinated actions of the FLZ domain and intrinsically disordered regions. J Biol Chem 2018; 293:13134-13150. [PMID: 29945970 DOI: 10.1074/jbc.ra118.002073] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 01/24/2018] [Revised: 06/05/2018] [Indexed: 11/06/2022] Open
Abstract
The SNF1-related protein kinase 1 (SnRK1) is a heterotrimeric eukaryotic kinase that interacts with diverse proteins and regulates their activity in response to starvation and stress signals. Recently, the FCS-like zinc finger (FLZ) proteins were identified as a potential scaffold for SnRK1 in plants. However, the evolutionary and mechanistic aspect of this complex formation is currently unknown. Here, in silico analyses predicted that FLZ proteins possess conserved intrinsically disordered regions (IDRs) with a propensity for protein binding in the N and C termini across the plant lineage. We observed that the Arabidopsis FLZ proteins promiscuously interact with SnRK1 subunits, which formed different isoenzyme complexes. The FLZ domain was essential for mediating the interaction with SnRK1α subunits, whereas the IDRs in the N termini facilitated interactions with the β and βγ subunits of SnRK1. Furthermore, the IDRs in the N termini were important for mediating dimerization of different FLZ proteins. Of note, the interaction of FLZ with SnRK1 was confined to cytoplasmic foci, which colocalized with the endoplasmic reticulum. An evolutionary analysis revealed that in general, the IDR-rich regions are under more relaxed selection than the FLZ domain. In summary, the findings in our study reveal the structural details, origin, and evolution of a land plant-specific scaffold of SnRK1 formed by the coordinated actions of IDRs and structured regions in the FLZ proteins. We propose that the FLZ protein complex might be involved in providing flexibility, thus enhancing the binding repertoire of the SnRK1 hub in land plants.
Affiliation(s)
- Muhammed Jamsheer K
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067
| | - Brihaspati N Shukla
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067
| | - Sunita Jindal
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067
| | - Nandu Gopan
- Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur, Bengaluru-560064, India
| | | | - Ashverya Laxmi
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi-110067
| |
|
41
|
Urbanek A, Morató A, Allemand F, Delaforge E, Fournet A, Popovic M, Delbecq S, Sibille N, Bernadó P. A General Strategy to Access Structural Information at Atomic Resolution in Polyglutamine Homorepeats. Angew Chem Int Ed Engl 2018; 57:3598-3601. [PMID: 29359503 PMCID: PMC5901001 DOI: 10.1002/anie.201711530] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 11/09/2017] [Revised: 12/28/2017] [Indexed: 12/31/2022]
Abstract
Homorepeat (HR) proteins are involved in key biological processes and multiple pathologies, however their high-resolution characterization has been impaired due to their homotypic nature. To overcome this problem, we have developed a strategy to isotopically label individual glutamines within HRs by combining nonsense suppression and cell-free expression. Our method has enabled the NMR investigation of huntingtin exon1 with a 16-residue polyglutamine (poly-Q) tract, and the results indicate the presence of an N-terminal α-helix at near neutral pH that vanishes towards the end of the HR. The generality of the strategy was demonstrated by introducing a labeled glutamine into a pathological version of huntingtin with 46 glutamines. This methodology paves the way to decipher the structural and dynamic perturbations induced by HR extensions in poly-Q-related diseases. Our approach can be extended to other amino acids to investigate biological processes involving proteins containing low-complexity regions (LCRs).
Affiliation(s)
- Annika Urbanek
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Anna Morató
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Frédéric Allemand
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Elise Delaforge
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Aurélie Fournet
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Matija Popovic
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Stephane Delbecq
- Laboratoire de Biologie Cellulaire et Moléculaire, (LBCM-EA4558 Vaccination Antiparasitaire); UFR Pharmacie; Université de Montpellier; Montpellier France
| | - Nathalie Sibille
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Pau Bernadó
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| |
|
42
|
Urbanek A, Morató A, Allemand F, Delaforge E, Fournet A, Popovic M, Delbecq S, Sibille N, Bernadó P. A General Strategy to Access Structural Information at Atomic Resolution in Polyglutamine Homorepeats. Angew Chem Int Ed Engl 2018. [DOI: 10.1002/ange.201711530] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
Affiliation(s)
- Annika Urbanek
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Anna Morató
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Frédéric Allemand
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Elise Delaforge
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Aurélie Fournet
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Matija Popovic
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Stephane Delbecq
- Laboratoire de Biologie Cellulaire et Moléculaire, (LBCM-EA4558 Vaccination Antiparasitaire); UFR Pharmacie; Université de Montpellier; Montpellier France
| | - Nathalie Sibille
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| | - Pau Bernadó
- Centre de Biochimie Structurale (CBS), INSERM, CNRS; Université de Montpellier; 29 rue de Navacelles 34090 Montpellier France
| |
|
43
|
Abstract
Genome sequencing has greatly contributed to our understanding of parasitic protozoa. This is particularly the case for Cryptosporidium species (phylum Apicomplexa), which are difficult to propagate. Because of their polymorphic nature, simple sequence repeats have been used extensively as genotypic markers to differentiate between isolates, but no global analysis of amino acid repeats in Cryptosporidium genomes has been reported. Taking advantage of several newly sequenced Cryptosporidium genomes, a comparative analysis of single-amino-acid repeats (SAARs) in seven species was undertaken. This analysis revealed a striking difference between the SAAR profile of the gastric and intestinal species which infect mammals and one species which infects birds. On average, total SAAR length in gastric species is only 25% of the cumulative SAAR length in the genomes of Cryptosporidium parvum, Cryptosporidium hominis and Cryptosporidium meleagridis, species infectious to humans. The SAAR profile in the avian parasite Cryptosporidium baileyi stands out due to the presence of long asparagine repeats. Cryptosporidium baileyi proteins with repeats ⩾20 residues are significantly enriched in regulatory functions. As postulated for the related apicomplexan species Plasmodium falciparum, these observations suggest that Cryptosporidium SAARs evolve in response to selective pressure. The putative selective mechanisms driving SAAR evolution in Cryptosporidium species are unknown.
|
44
|
Takahashi M, Takahashi E, Joudeh LI, Marini M, Das G, Elshenawy MM, Akal A, Sakashita K, Alam I, Tehseen M, Sobhy MA, Stingl U, Merzaban JS, Di Fabrizio E, Hamdan SM. Dynamic structure mediates halophilic adaptation of a DNA polymerase from the deep-sea brines of the Red Sea. FASEB J 2018; 32:3346-3360. [PMID: 29401622 PMCID: PMC6051491 DOI: 10.1096/fj.201700862rr] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/07/2023]
Abstract
The deep-sea brines of the Red Sea are remote and unexplored environments characterized by high temperatures, anoxic water, and elevated concentrations of salt and heavy metals. This environment provides a rare system to study the interplay between halophilic and thermophilic adaptation in biologic macromolecules. The present article reports the first DNA polymerase with halophilic and thermophilic features. Biochemical and structural analysis by Raman and circular dichroism spectroscopy showed that the charge distribution on the protein’s surface mediates the structural balance between stability for thermal adaptation and flexibility for counteracting the salt-induced rigid and nonfunctional hydrophobic packing. Salt bridge interactions via increased negative and positive charges contribute to structural stability. Salt tolerance, conversely, is mediated by a dynamic structure that becomes more fixed and functional with increasing salt concentration. We propose that repulsive forces among excess negative charges, in addition to a high percentage of negatively charged random coils, mediate this structural dynamism. This knowledge enabled us to engineer a halophilic version of Thermococcus kodakarensis DNA polymerase.—Takahashi, M., Takahashi, E., Joudeh, L. I., Marini, M., Das, G., Elshenawy, M. M., Akal, A., Sakashita, K., Alam, I., Tehseen, M., Sobhy, M. A., Stingl, U., Merzaban, J. S., Di Fabrizio, E., Hamdan, S. M. Dynamic structure mediates halophilic adaptation of a DNA polymerase from the deep-sea brines of the Red Sea.
Affiliation(s)
- Masateru Takahashi
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Etsuko Takahashi
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Luay I Joudeh
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Monica Marini
- Physical Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Gobind Das
- Physical Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Mohamed M Elshenawy
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Anastassja Akal
- Physical Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia; KAUST Catalysis Center, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Kosuke Sakashita
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Intikhab Alam
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Muhammad Tehseen
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Mohamed A Sobhy
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Ulrich Stingl
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia; Fort Lauderdale Research and Education Center, University of Florida, Davie, Florida, USA
| | - Jasmeen S Merzaban
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Enzo Di Fabrizio
- Physical Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| | - Samir M Hamdan
- Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
| |
|
45
|
Hecel A, Wątły J, Rowińska-Żyrek M, Świątek-Kozłowska J, Kozłowski H. Histidine tracts in human transcription factors: insight into metal ion coordination ability. J Biol Inorg Chem 2018; 23:81-90. [PMID: 29218639 PMCID: PMC5756558 DOI: 10.1007/s00775-017-1512-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/29/2017] [Accepted: 11/03/2017] [Indexed: 12/19/2022]
Abstract
Consecutive histidine repeats are chosen both by nature and by molecular biologists due to their high affinity towards metal ions. Screening of the human genome showed that transcription factors are extremely rich in His tracts. In this work, we examine two such His-rich regions from forkhead box and MAFA proteins, MB3 (containing 18 His residues) and MB6 (containing 21 His residues), focusing on the affinity and binding modes of Cu2+ and Zn2+ towards the two His-rich regions. In the case of Zn2+ species, the availability of imidazole nitrogen donors enhances metal complex stability. Interestingly, the opposite tendency is observed for Cu2+ complexes above physiological pH, in which amide nitrogens participate in binding.
Affiliation(s)
- Aleksandra Hecel
- Faculty of Chemistry, University of Wroclaw, F. Joliot-Curie 14, 50-383, Wrocław, Poland.
| | - Joanna Wątły
- Faculty of Chemistry, University of Wroclaw, F. Joliot-Curie 14, 50-383, Wrocław, Poland
| | | | | | - Henryk Kozłowski
- Public Higher Medical Professional School in Opole, Katowicka 68, 45-060, Opole, Poland.
- Wroclaw Research Centre EIT+, Stabłowicka 147, 54-066, Wrocław, Poland.
| |
|
46
|
Constraints and consequences of the emergence of amino acid repeats in eukaryotic proteins. Nat Struct Mol Biol 2017; 24:765-777. [PMID: 28805808 DOI: 10.1038/nsmb.3441] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 02/14/2017] [Accepted: 06/23/2017] [Indexed: 12/21/2022]
Abstract
Proteins with amino acid homorepeats have the potential to be detrimental to cells and are often associated with human diseases. Why, then, are homorepeats prevalent in eukaryotic proteomes? In yeast, homorepeats are enriched in proteins that are essential and pleiotropic and that buffer environmental insults. The presence of homorepeats increases the functional versatility of proteins by mediating protein interactions and facilitating spatial organization in a repeat-dependent manner. During evolution, homorepeats are preferentially retained in proteins with stringent proteostasis, which might minimize repeat-associated detrimental effects such as unregulated phase separation and protein aggregation. Their presence facilitates rapid protein divergence through accumulation of amino acid substitutions, which often affect linear motifs and post-translational-modification sites. These substitutions may result in rewiring protein interaction and signaling networks. Thus, homorepeats are distinct modules that are often retained in stringently regulated proteins. Their presence facilitates rapid exploration of the genotype-phenotype landscape of a population, thereby contributing to adaptation and fitness.
|
47
|
Screening of nucleotide variations in genomic sequences encoding charged protein regions in the human genome. BMC Genomics 2017; 18:588. [PMID: 28789634 PMCID: PMC5549384 DOI: 10.1186/s12864-017-4000-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2017] [Accepted: 08/01/2017] [Indexed: 11/24/2022] Open
Abstract
Background: Studying genetic variation distribution in proteins containing charged regions, called charge clusters (CCs), is of great interest to unravel their functional role. Charge clusters are 20 to 75 residue segments with high net positive charge, high net negative charge, or high total charge relative to the overall charge composition of the protein. We previously developed a bioinformatics tool (FCCP) to detect charge clusters in proteomes and scanned the human proteome for the occurrence of CCs. In this paper we investigate the genetic variations in the human proteins harbouring CCs.
Results: We studied the coding regions of 317 positively charged clusters and 1020 negatively charged ones previously detected in human proteins. Results revealed that coding parts of CCs are richer in sequence variants than their corresponding genes, full mRNAs, and exonic + intronic sequences and that these variants are predominately rare (minor allele frequency < 0.005). Furthermore, variants occurring in the coding parts of positively charged regions of proteins are more often pathogenic than those occurring in negatively charged ones. Classification of variants according to their types showed that substitution is the major type, followed by indels (insertions-deletions). Concerning substitutions, it was found that within clusters of both charges, the charged amino acids were the greatest loser groups whereas polar residues were the greatest gainers.
Conclusions: Our findings highlight the prominent features of the human charged regions from the DNA up to the protein sequence, which might provide potential clues to improve the current understanding of those charged regions and their implication in the emergence of diseases.
|
48
|
Single Amino Acid Repeats in the Proteome World: Structural, Functional, and Evolutionary Insights. PLoS One 2016; 11:e0166854. [PMID: 27893794 PMCID: PMC5125637 DOI: 10.1371/journal.pone.0166854] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 08/28/2016] [Accepted: 11/05/2016] [Indexed: 12/15/2022] Open
Abstract
Microsatellites or simple sequence repeats (SSR) are abundant, highly diverse stretches of short DNA repeats present in all genomes. Tandem mono/tri/hexanucleotide repeats in the coding regions contribute to single amino acid repeats (SAARs) in the proteome. While SSRs in the coding region always result in amino acid repeats, a majority of SAARs arise due to a combination of various codons representing the same amino acid and not as a consequence of SSR events. Certain amino acids are abundant in repeat regions, indicating a positive selection pressure behind the accumulation of SAARs. By analysing 22 proteomes including the human proteome, we explored the functional and structural relationship of amino acid repeats in an evolutionary context. Only ~15% of repeats are present in any known functional domain, while ~74% of repeats are present in the disordered regions, suggesting that SAARs add to the functionality of proteins by providing flexibility and stability and by acting as linker elements between domains. Comparison of SAAR-containing proteins across species reveals that while shorter repeats are conserved among orthologs, proteins with longer repeats, >15 amino acids, are unique to the respective organism. Lysine repeats are well conserved among orthologs with respect to their length and number of occurrences in a protein. Other amino acids such as glutamic acid, proline, serine and alanine repeats are generally conserved among the orthologs with varying repeat lengths. These findings suggest that SAARs have accumulated in the proteome under positive selection pressure and that they provide flexibility for optimal folding of functional/structural domains of proteins. The insights gained from our observations can help in effective designing and engineering of proteins with novel features.
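Locating single amino acid repeats in a protein sequence can be sketched with a regular-expression backreference. This is a minimal illustration rather than the method used in the study: the function name `find_saars`, the minimum run length of 5, and the toy sequence are assumptions, since the abstract does not state the detection threshold.

```python
import re


def find_saars(seq: str, min_len: int = 5):
    """Return (residue, start, length) for each run of one residue >= min_len.

    Hypothetical helper: `min_len` is an illustrative threshold only.
    Start positions are 0-based.
    """
    # ([A-Z]) captures one residue; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), m.end() - m.start())
            for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy = "MKQQQQQQAPSSSSSG"  # toy sequence, not a real protein
    for residue, start, length in find_saars(toy):
        print(residue, start, length)
```

Because the search works on the protein sequence, it finds repeats regardless of whether they are encoded by a single repeated codon or by a mix of synonymous codons, the two origins the abstract distinguishes.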
|
49
|
Abstract
Genes carrying mutations associated with genetic diseases are present in all human cells; yet, clinical manifestations of genetic diseases are usually highly tissue-specific. Although some disease genes are expressed only in selected tissues, the expression patterns of disease genes alone cannot explain the observed tissue specificity of human diseases. Here we hypothesize that for a disease to manifest itself in a particular tissue, a whole functional subnetwork of genes (disease module) needs to be expressed in that tissue. Driven by this hypothesis, we conducted a systematic study of the expression patterns of disease genes within the human interactome. We find that genes expressed in a specific tissue tend to be localized in the same neighborhood of the interactome. By contrast, genes expressed in different tissues are segregated in distinct network neighborhoods. Most important, we show that it is the integrity and the completeness of the expression of the disease module that determines disease manifestation in selected tissues. This approach allows us to construct a disease-tissue network that confirms known and predicts unexpected disease-tissue associations.
|
50
|
Saffert P, Adamla F, Schieweck R, Atkins JF, Ignatova Z. An Expanded CAG Repeat in Huntingtin Causes +1 Frameshifting. J Biol Chem 2016; 291:18505-13. [PMID: 27382061 DOI: 10.1074/jbc.m116.744326] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/19/2016] [Indexed: 01/08/2023]
Abstract
Maintenance of triplet decoding is crucial for the expression of functional protein because the products of deviations into either the -1 or +1 reading frame are often non-functional. We report here that huntingtin (Htt) exon 1 with expanded CAG repeats, implicated in Huntington pathology, undergoes a sporadic +1 frameshift during expression, generating from the CAG repeat a trans-frame product encoded by AGC repeats. This +1 recoding is exclusively detected in pathological Htt variants, i.e. those with expanded repeats of more than 35 consecutive CAG codons. An atypical +1 shift site, UUC C at the 5' end of the CAG repeats, which has some resemblance to the influenza A virus shift site, triggers the +1 frameshifting; the frameshifting is enhanced by the increased propensity of the expanded CAG repeats to form a stem-loop structure. The +1 trans-frame-encoded product can directly influence the aggregation of the parental Htt exon 1.
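The change of reading frame described above can be made concrete with a small sketch. It translates a short CAG repeat in frame 0 and in the +1 frame using only the three codons that occur in such a repeat (CAG = glutamine, AGC = serine, GCA = alanine in the standard genetic code). The function name `translate`, the six-codon repeat and the partial codon table are assumptions for illustration; the sketch does not model the UUC C shift site or the stem-loop structure.

```python
# Partial standard genetic code: only the codons found in a pure CAG repeat.
CODON_TABLE = {"CAG": "Q", "AGC": "S", "GCA": "A"}

def translate(seq, frame=0):
    """Translate seq starting at offset `frame`; unknown codons become 'X'."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# Hypothetical short repeat; pathological variants carry more than 35 codons.
repeat = "CAG" * 6

in_frame = translate(repeat, frame=0)   # reads CAG CAG ... (polyglutamine)
plus_one = translate(repeat, frame=1)   # reads AGC AGC ... (polyserine)
print(in_frame, plus_one)
```

Reading the same nucleotides one position downstream turns the glutamine-encoding CAG codons into serine-encoding AGC codons, which is why the trans-frame product is described as AGC repeat-encoded.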
Affiliation(s)
- Paul Saffert
  - Institute of Biochemistry, University of Potsdam, 14467 Potsdam, Germany
- Frauke Adamla
  - Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
- Rico Schieweck
  - Institute of Biochemistry, University of Potsdam, 14467 Potsdam, Germany
- John F Atkins
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland, and the Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112
- Zoya Ignatova
  - Institute of Biochemistry, University of Potsdam, 14467 Potsdam, Germany; Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
|