1
Hausmann S, Geiser J, Allen GE, Geslain SAM, Valentini M. Intrinsically disordered regions regulate RhlE RNA helicase functions in bacteria. Nucleic Acids Res 2024:gkae511. [PMID: 38874491 DOI: 10.1093/nar/gkae511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 05/29/2024] [Accepted: 06/03/2024] [Indexed: 06/15/2024]
Abstract
RNA helicases, central enzymes in RNA metabolism, often feature intrinsically disordered regions (IDRs) that enable phase separation and complex molecular interactions. In the bacterial pathogen Pseudomonas aeruginosa, the non-redundant RhlE1 and RhlE2 RNA helicases share a conserved REC catalytic core but differ in C-terminal IDRs. Here, we show how the IDR diversity defines RhlE RNA helicase specificity of function. Both IDRs facilitate RNA binding and phase separation, localizing proteins in cytoplasmic clusters. However, the RhlE2 IDR is more efficient in enhancing REC core RNA unwinding, exhibits a greater tendency for phase separation, and interacts with the RNase E endonuclease, a crucial player in mRNA degradation. Swapping IDRs results in chimeric proteins that are biochemically active but functionally distinct as compared to their native counterparts. The RECRhlE1-IDRRhlE2 chimera improves cold growth of a rhlE1 mutant, gains interaction with RNase E and affects a subset of both RhlE1 and RhlE2 RNA targets. The RECRhlE2-IDRRhlE1 chimera instead hampers bacterial growth at low temperatures in the absence of RhlE1, with its detrimental effect linked to aberrant RNA droplets. By showing that IDRs modulate both protein core activities and subcellular localization, our study defines the impact of IDR diversity on the functional differentiation of RNA helicases.
Affiliation(s)
- Stéphane Hausmann
  - Department of Microbiology and Molecular Medicine, CMU, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Johan Geiser
  - Department of Microbiology and Molecular Medicine, CMU, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- George Edward Allen
  - Department of Microbiology and Molecular Medicine, CMU, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Sandra Amandine Marie Geslain
  - Department of Microbiology and Molecular Medicine, CMU, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Martina Valentini
  - Department of Microbiology and Molecular Medicine, CMU, Faculty of Medicine, University of Geneva, Geneva, Switzerland
2
Yang Y, Braga MV, Dean MD. Insertion-Deletion Events Are Depleted in Protein Regions with Predicted Secondary Structure. Genome Biol Evol 2024; 16:evae093. [PMID: 38735759 PMCID: PMC11102076 DOI: 10.1093/gbe/evae093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 04/16/2024] [Accepted: 04/21/2024] [Indexed: 05/14/2024]
Abstract
A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
Affiliation(s)
- Yi Yang
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew V Braga
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew D Dean
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
3
Lin L, Huang Y, McIntyre J, Chang CH, Colmenares S, Lee YCG. Prevalent fast evolution of genes involved in heterochromatin functions. bioRxiv 2024:2024.03.03.583199. [PMID: 38496614 PMCID: PMC10942301 DOI: 10.1101/2024.03.03.583199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Heterochromatin is a gene-poor and repeat-rich genomic compartment ubiquitously found in eukaryotes. Despite its low transcriptional activity, heterochromatin plays important roles in maintaining genome stability, organizing chromosomes, and suppressing transposable elements (TEs). Given the importance of these functions, it is expected that the genes involved in heterochromatin regulation would be highly conserved. Yet, a handful of these genes have been found to evolve rapidly. To investigate whether these previous findings are anecdotal or general to genes modulating heterochromatin, we compiled an exhaustive list of 106 candidate genes involved in heterochromatin functions and investigated their evolution over both short and long evolutionary time scales in Drosophila. Our analyses found that these genes exhibit significantly more frequent evolutionary changes, both in the forms of amino acid substitutions and gene copy number variation, when compared to genes involved in Polycomb-based repressive chromatin. While positive selection drives amino acid changes within both structured domains with diverse functions and intrinsically disordered regions (IDRs), purifying selection may have maintained the proportions of IDRs. Together with the observed negative associations between rates of protein evolution of these genes and genomic TE abundance, we propose an evolutionary model where the fast evolution of genes involved in heterochromatin functions is an inevitable outcome of the unique molecular features of the heterochromatin environment, while the rapid evolution of TEs may be an effect rather than cause. Our study provides an important global view of the evolution of genes involved in this critical cellular domain and provides insights into the factors driving the distinctive evolution of heterochromatin.
Affiliation(s)
- Leila Lin
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
- Yuheng Huang
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
- Jennifer McIntyre
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
- Serafin Colmenares
  - Department of Cell and Molecular Biology, University of California, Berkeley
- Yuh Chwen G. Lee
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
4
Kouros CE, Makri V, Ouzounis CA, Chasapi A. Disease association and comparative genomics of compositional bias in human proteins. F1000Res 2023; 12:198. [PMID: 37082000 PMCID: PMC10111144 DOI: 10.12688/f1000research.129929.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2023] [Indexed: 02/22/2023]
Abstract
Background: The evolutionary rate of disordered proteins varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of intrinsically disordered regions (IDRs) across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The Uniprot Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the Human Genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease-association in the human genome reveals a significant bias towards low complexity regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated, low complexity proteins across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of low complexity, disease-association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
Affiliation(s)
- Christos E. Kouros
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vasiliki Makri
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Christos A. Ouzounis
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
  - BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
- Anastasia Chasapi
  - BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
5
Kouros CE, Makri V, Ouzounis CA, Chasapi A. Disease association and comparative genomics of compositional bias in human proteins. F1000Res 2023; 12:198. [PMID: 37082000 PMCID: PMC10111144.2 DOI: 10.12688/f1000research.129929.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/12/2023] [Indexed: 04/25/2023]
Abstract
Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The Uniprot Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the Human Genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease-association in the human genome reveals a significant bias towards biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease-association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
Affiliation(s)
- Christos E. Kouros
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vasiliki Makri
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Christos A. Ouzounis
  - BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
  - BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
- Anastasia Chasapi
  - BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
6
Sallah SR, Sergouniotis PI, Hardcastle C, Ramsden S, Lotery AJ, Lench N, Lovell SC, Black GCM. Assessing the Pathogenicity of In-Frame CACNA1F Indel Variants Using Structural Modeling. J Mol Diagn 2022; 24:1232-1239. [PMID: 36191840 DOI: 10.1016/j.jmoldx.2022.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Revised: 08/20/2022] [Accepted: 09/09/2022] [Indexed: 01/13/2023]
Abstract
Small in-frame insertion-deletion (indel) variants are a common form of genomic variation whose impact on rare disease phenotypes has been understudied. The prediction of the pathogenicity of such variants remains challenging. X-linked incomplete congenital stationary night blindness type 2 (CSNB2) is a nonprogressive, inherited retinal disorder caused by variants in CACNA1F, encoding the Cav1.4α1 channel protein. Here, structural analysis was used through homology modeling to interpret 10 disease-correlated and 10 putatively benign CACNA1F in-frame indel variants. CSNB2-correlated changes were found to be more highly conserved compared with putative benign variants. Notably, all 10 disease-correlated variants but none of the benign changes were within modeled regions of the protein. Structural analysis revealed that disease-correlated variants are predicted to destabilize the structure and function of the Cav1.4α1 channel protein. Overall, the use of structural information to interpret the consequences of in-frame indel variants provides an important adjunct that can improve the diagnosis for individuals with CSNB2.
Affiliation(s)
- Shalaw R Sallah
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, St. Mary's Hospital, Manchester, United Kingdom
- Panagiotis I Sergouniotis
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, St. Mary's Hospital, Manchester, United Kingdom
- Claire Hardcastle
  - Manchester Centre for Genomic Medicine, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, St. Mary's Hospital, Manchester, United Kingdom
- Simon Ramsden
  - Manchester Centre for Genomic Medicine, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, St. Mary's Hospital, Manchester, United Kingdom
- Andrew J Lotery
  - Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Nick Lench
  - Congenica Ltd., BioData Innovation Centre, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Simon C Lovell
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- Graeme C M Black
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
  - Manchester Centre for Genomic Medicine, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, St. Mary's Hospital, Manchester, United Kingdom
7
Sangster AG, Zarin T, Moses AM. Evolution of short linear motifs and disordered proteins Topic: yeast as model system to study evolution. Curr Opin Genet Dev 2022; 76:101964. [PMID: 35939968 DOI: 10.1016/j.gde.2022.101964] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/04/2022] [Revised: 06/29/2022] [Accepted: 07/08/2022] [Indexed: 11/26/2022]
Abstract
Evolutionary preservation of protein structure had a major influence on the field of molecular evolution: changes in individual amino acids that did not disrupt protein folding would either have no effect or subtly change the 'lock' so that it could fit a new 'key'. Homology of individual amino acids could be confidently assigned through sequence alignments, and models of evolution could be tested. This view of molecular evolution excluded large regions of proteins that could not be confidently aligned, such as intrinsically disordered regions (IDRs) that do not fold into stable structures. In the last decade, major progress has been made in understanding the evolution of IDRs, much of it facilitated by new experimental and computational approaches in yeast. Here, we review this progress as well as several still outstanding questions.
Affiliation(s)
- Ami G Sangster
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada
- Taraneh Zarin
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada
  - https://twitter.com/@taraneh_z
- Alan M Moses
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada
8
Ahmed SS, Rifat ZT, Lohia R, Campbell AJ, Dunker AK, Rahman MS, Iqbal S. Characterization of intrinsically disordered regions in proteins informed by human genetic diversity. PLoS Comput Biol 2022; 18:e1009911. [PMID: 35275927 PMCID: PMC8942211 DOI: 10.1371/journal.pcbi.1009911] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/21/2021] [Revised: 03/23/2022] [Accepted: 02/10/2022] [Indexed: 01/21/2023]
Abstract
All proteomes contain both proteins and polypeptide segments that don’t form a defined three-dimensional structure yet are biologically active, called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase (“UniProt features”: active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected the genetic variant data from the general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
Affiliation(s)
- Shehab S. Ahmed
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- Zaara T. Rifat
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- Ruchi Lohia
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Arthur J. Campbell
  - Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- A. Keith Dunker
  - Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
- M. Sohel Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- Sumaiya Iqbal
  - Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
9
Pajkos M, Dosztányi Z. Functions of intrinsically disordered proteins through evolutionary lenses. Prog Mol Biol Transl Sci 2021; 183:45-74. [PMID: 34656334 DOI: 10.1016/bs.pmbts.2021.06.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Abstract
Protein sequences are the result of an evolutionary process that involves the balancing act of experimenting with novel mutations and selecting out those that have an undesirable functional outcome. In the case of globular proteins, the function relies on a well-defined conformation, therefore, there is a strong evolutionary pressure to preserve the structure. However, different evolutionary rules might apply for the group of intrinsically disordered regions and proteins (IDR/IDPs) that exist as an ensemble of fluctuating conformations. The function of IDRs can directly originate from their disordered state or arise through different types of molecular recognition processes. There is an amazing variety of ways IDRs can carry out their functions, and this is also reflected in their evolutionary properties. In this chapter we give an overview of the different types of evolutionary behavior of disordered proteins and associated functions in normal and disease settings.
Affiliation(s)
- Mátyás Pajkos
  - Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- Zsuzsanna Dosztányi
  - Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary
10
Exploring Potential Signals of Selection for Disordered Residues in Prokaryotic and Eukaryotic Proteins. Genomics Proteomics Bioinformatics 2020; 18:549-564. [PMID: 33346088 PMCID: PMC8377245 DOI: 10.1016/j.gpb.2020.06.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2019] [Revised: 03/29/2020] [Accepted: 06/10/2020] [Indexed: 11/22/2022]
Abstract
Intrinsically disordered proteins (IDPs) are a functionally important class of proteins in all domains of life. However, how nature has shaped the disorder potential of prokaryotic and eukaryotic proteins is still not clearly known. Randomly generated sequences are free of any selective constraints, and are therefore commonly used as null models. Considering different types of random protein models, here we seek to understand how the disorder potential of natural eukaryotic and prokaryotic proteins differs from random sequences. Comparing proteome-wide disorder content between real and random sequences of 12 model organisms, we noticed that eukaryotic proteins are enriched in disordered regions compared to random sequences, but in prokaryotes such regions are depleted. By analyzing the position-wise disorder profile, we show that there is generally higher disorder near the N- and C-terminal regions of eukaryotic proteins as compared to the random models; however, either no such trend or only a weak one was found in prokaryotic proteins. Moreover, here we show that this preference is not caused by the amino acid or nucleotide composition at the respective sites. Instead, these regions were found to be endowed with a higher fraction of protein–protein binding sites, suggesting their functional importance. We discuss several possible explanations for this pattern, such as improving the efficiency of protein–protein interaction, ribosome movement during translation, and post-translational modification. However, further studies are needed to clearly understand the biophysical mechanisms causing the trend.
11
Protein-Protein Interactions Mediated by Intrinsically Disordered Protein Regions Are Enriched in Missense Mutations. Biomolecules 2020; 10:biom10081097. [PMID: 32722039 PMCID: PMC7463635 DOI: 10.3390/biom10081097] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/13/2020] [Revised: 07/15/2020] [Accepted: 07/20/2020] [Indexed: 12/27/2022]
Abstract
Because proteins are fundamental to most biological processes, many genetic diseases can be traced back to single nucleotide variants (SNVs) that cause changes in protein sequences. However, not all SNVs that result in amino acid substitutions cause disease, as each residue is under different structural and functional constraints. Influential studies have shown that protein–protein interaction interfaces are enriched in disease-associated SNVs and depleted in SNVs that are common in the general population. These studies focus primarily on folded (globular) protein domains and overlook the prevalent class of protein interactions mediated by intrinsically disordered regions (IDRs). Therefore, we investigated the enrichment patterns of missense mutation-causing SNVs that are associated with disease and cancer, as well as those present in the healthy population, in structures of IDR-mediated interactions with comparisons to classical globular interactions. When comparing the different categories of interaction interfaces, division of the interface regions into solvent-exposed rim residues and buried core residues reveals distinctive enrichment patterns for the various types of missense mutations. Most notably, we demonstrate a strong enrichment at the interface core of interacting IDRs in disease mutations and its depletion in neutral ones, which supports the view that the disruption of IDR interactions is a mechanism underlying many diseases. Intriguingly, we also found an asymmetry across the IDR interaction interface in the enrichment of certain missense mutation types, which may hint at an increased variant tolerance and urges further investigations of IDR interactions.
12
Zarin T, Strome B, Nguyen Ba AN, Alberti S, Forman-Kay JD, Moses AM. Proteome-wide signatures of function in highly diverged intrinsically disordered regions. eLife 2019; 8:46883. [PMID: 31264965 PMCID: PMC6634968 DOI: 10.7554/elife.46883] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 03/15/2019] [Accepted: 07/01/2019] [Indexed: 12/24/2022]
Abstract
Intrinsically disordered regions make up a large part of the proteome, but the sequence-to-function relationship in these regions is poorly understood, in part because the primary amino acid sequences of these regions are poorly conserved in alignments. Here we use an evolutionary approach to detect molecular features that are preserved in the amino acid sequences of orthologous intrinsically disordered regions. We find that most disordered regions contain multiple molecular features that are preserved, and we define these as ‘evolutionary signatures’ of disordered regions. We demonstrate that intrinsically disordered regions with similar evolutionary signatures can rescue function in vivo, and that groups of intrinsically disordered regions with similar evolutionary signatures are strongly enriched for functional annotations and phenotypes. We propose that evolutionary signatures can be used to predict function for many disordered regions from their amino acid sequences.
Affiliation(s)
- Taraneh Zarin
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
- Bob Strome
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
- Alex N Nguyen Ba
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, United States
- Simon Alberti
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Molecular and Cellular Bioengineering, Biotechnology Center, Technische Universität Dresden, Dresden, Germany
- Julie D Forman-Kay
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Canada
- Alan M Moses
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
  - Department of Computer Science, University of Toronto, Toronto, Canada
13
Pagel KA, Antaki D, Lian A, Mort M, Cooper DN, Sebat J, Iakoucheva LM, Mooney SD, Radivojac P. Pathogenicity and functional impact of non-frameshifting insertion/deletion variation in the human genome. PLoS Comput Biol 2019; 15:e1007112. [PMID: 31199787 PMCID: PMC6594643 DOI: 10.1371/journal.pcbi.1007112] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 11/16/2018] [Revised: 06/26/2019] [Accepted: 05/17/2019] [Indexed: 11/19/2022]
Abstract
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
Affiliation(s)
- Kymberleigh A. Pagel
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
- Danny Antaki
  - Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
- AoJie Lian
  - Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
  - Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Matthew Mort
  - Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
- David N. Cooper
  - Institute of Medical Genetics, Cardiff University, Cardiff, United Kingdom
- Jonathan Sebat
  - Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
- Lilia M. Iakoucheva
  - Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
- Sean D. Mooney
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, United States of America
- Predrag Radivojac
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, United States of America
  - Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts, United States of America
|
14
|
Afanasyeva A, Bockwoldt M, Cooney CR, Heiland I, Gossmann TI. Human long intrinsically disordered protein regions are frequent targets of positive selection. Genome Res 2018; 28:975-982. [PMID: 29858274 PMCID: PMC6028134 DOI: 10.1101/gr.232645.117] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 11/21/2017] [Accepted: 06/01/2018] [Indexed: 12/20/2022]
Abstract
Intrinsically disordered regions occur frequently in proteins and are characterized by a lack of a well-defined three-dimensional structure. Although these regions do not show a higher order of structural organization, they are known to be functionally important. Disordered regions are rapidly evolving, largely attributed to relaxed purifying selection and an increased role of genetic drift. It has also been suggested that positive selection might contribute to their rapid diversification. However, for our own species, it is currently unknown whether positive selection has played a role during the evolution of these protein regions. Here, we address this question by investigating the evolutionary pattern of more than 6600 human proteins with intrinsically disordered regions and their ordered counterparts. Our comparative approach with data from more than 90 mammalian genomes uses a priori knowledge of disordered protein regions, and we show that this increases the power to detect positive selection by an order of magnitude. We can confirm that human intrinsically disordered regions evolve more rapidly, not only within humans but also across the entire mammalian phylogeny. They have, however, experienced substantial evolutionary constraint, hinting at their fundamental functional importance. We find compelling evidence that disordered protein regions are frequent targets of positive selection and estimate that the relative rate of adaptive substitutions differs fourfold between disordered and ordered protein regions in humans. Our results suggest that disordered protein regions are important targets of genetic innovation and that the contribution of positive selection in these regions is more pronounced than in other protein parts.
Affiliation(s)
- Arina Afanasyeva
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom
  - Institute of Nanobiotechnologies, Peter the Great St. Petersburg Polytechnic University, Saint-Petersburg 195251, Russia
  - Petersburg Nuclear Physics Institute, B.P. Konstantinov NRC Kurchatov Institute, Gatchina, Leningrad District 188300, Russia
  - National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki City, Osaka 567-0085, Japan
- Mathias Bockwoldt
  - Department of Arctic and Marine Biology, UiT The Arctic University of Norway, 9037 Tromsø, Norway
- Christopher R Cooney
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom
- Ines Heiland
  - Department of Arctic and Marine Biology, UiT The Arctic University of Norway, 9037 Tromsø, Norway
- Toni I Gossmann
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S102TN, United Kingdom
|
15
|
Salamanova E, Costeira-Paulo J, Han KH, Kim DH, Nilsson L, Wright APH. A subset of functional adaptation mutations alter propensity for α-helical conformation in the intrinsically disordered glucocorticoid receptor tau1core activation domain. Biochim Biophys Acta Gen Subj 2018; 1862:1452-1461. [PMID: 29550429 DOI: 10.1016/j.bbagen.2018.03.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/31/2017] [Revised: 03/07/2018] [Accepted: 03/10/2018] [Indexed: 01/22/2023]
Abstract
BACKGROUND: Adaptive mutations that alter protein functionality are enriched within intrinsically disordered protein regions (IDRs), thus conformational flexibility correlates with evolvability. Pre-structured motifs (PreSMos) with transient propensity for secondary structure conformation are believed to be important for IDR function. The glucocorticoid receptor tau1core transcriptional activation domain (GR tau1core) contains three α-helical PreSMos in physiological buffer conditions. METHODS: Sixty change-of-function mutants affecting the intrinsically disordered 58-residue GR tau1core were studied using disorder prediction and molecular dynamics simulations. RESULTS: Change-of-function mutations were partitioned into seven clusters based on their effect on IDR predictions and gene activation activity. Some mutations selected from clusters characterized by mutations altering the IDR prediction score altered the apparent stability of the α-helical form of one of the PreSMos in molecular dynamics simulations, suggesting PreSMo stabilization or destabilization as strategies for functional adaptation. Indeed, all tested gain-of-function mutations affecting this PreSMo were associated with increased stability of the α-helical PreSMo conformation, suggesting that PreSMo stabilization may be the main mechanism by which adaptive mutations can increase the activity of this IDR type. Some mutations did not appear to affect PreSMo stability. CONCLUSIONS: Changes in PreSMo stability account for the effects of a subset of change-of-function mutants affecting the GR tau1core IDR. GENERAL SIGNIFICANCE: Long IDRs occur in about 50% of human proteins. They are poorly characterized despite much recent attention. Our results suggest the importance of a subtle balance between PreSMo stability and IDR activity, which may provide a novel target for future pharmaceutical intervention.
Affiliation(s)
- Evdokiya Salamanova
  - Department of Biosciences and Nutrition, Karolinska Institutet, Neo, TTI, SE-141 83 Huddinge, Sweden
- Joana Costeira-Paulo
  - Department of Biosciences and Nutrition, Karolinska Institutet, Neo, TTI, SE-141 83 Huddinge, Sweden
- Kyou-Hoon Han
  - Genome Editing Research Center, Future Biotechnology Research Division, Korea Research Institute of Bioscience and Biotechnology, 125 Gwahak-ro, Yuseong-gu, Daejeon 305-806, Republic of Korea
  - Department of Nano and Bioinformatics, University of Science and Technology, 113 Gwahak-ro, Yuseong-gu, Daejeon 305-333, Republic of Korea
- Do-Hyoung Kim
  - Genome Editing Research Center, Future Biotechnology Research Division, Korea Research Institute of Bioscience and Biotechnology, 125 Gwahak-ro, Yuseong-gu, Daejeon 305-806, Republic of Korea
- Lennart Nilsson
  - Department of Biosciences and Nutrition, Karolinska Institutet, Neo, TTI, SE-141 83 Huddinge, Sweden
- Anthony P H Wright
  - Clinical Research Center, Department of Laboratory Medicine, Karolinska Institutet, NOVUM Level 5, Hälsovägen 7, SE-141 57 Huddinge, Sweden
|
16
|
Yruela I, Oldfield CJ, Niklas KJ, Dunker AK. Evidence for a Strong Correlation Between Transcription Factor Protein Disorder and Organismic Complexity. Genome Biol Evol 2017; 9:1248-1265. [PMID: 28430951 PMCID: PMC5434936 DOI: 10.1093/gbe/evx073] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Accepted: 04/17/2017] [Indexed: 12/11/2022] Open
Abstract
Studies of diverse phylogenetic lineages reveal that protein disorder increases in concert with organismic complexity but that differences nevertheless exist among lineages. To gain insight into this phenomenology, we analyzed all of the transcription factor (TF) families for which sequences are known for 17 species spanning bacteria, yeast, algae, land plants, and animals and for which the number of different cell types has been reported in the primary literature. Although the fraction of disordered residues in TF sequences is often moderately or poorly correlated with organismic complexity as gauged by cell-type number (r² < 0.5), an unbiased and phylogenetically broad analysis shows that organismic complexity is positively and strongly correlated with the total number of TFs, the number of their spliced variants and their total disordered residues content (r² > 0.8). Furthermore, the correlation between the fraction of disordered residues and cell-type number becomes stronger when confined to the TF families participating in cell cycle, cell size, cell division, cell differentiation, or cell proliferation, and other important developmental processes. The data also indicate that evolutionarily simpler organisms allow for the detection of subtle differences in the conserved IDRs of TFs as well as changes in variable IDRs, which can influence the DNA recognition and multifunctionality of TFs through direct or indirect mechanisms. Although strong correlations cannot be taken as evidence for cause-and-effect relationships, we interpret our data to indicate that increasing TF disorder likely was an important factor contributing to the evolution of organismic complexity and not merely a concurrent unrelated effect of increasing organismic complexity.
Affiliation(s)
- Inmaculada Yruela
  - Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas (EEAD-CSIC), Zaragoza, Spain
  - Grupo de Bioquímica, Biofísica y Biología Computacional (BIFI, UNIZAR), Unidad Asociada al CSIC, Zaragoza, Spain
- Christopher J Oldfield
  - Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN
- Karl J Niklas
  - School of Integrative Plant Science, Cornell University, Ithaca, NY
- A Keith Dunker
  - Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN
|
17
|
Dinan AM, Atkins JF, Firth AE. ASXL gain-of-function truncation mutants: defective and dysregulated forms of a natural ribosomal frameshifting product? Biol Direct 2017; 12:24. [PMID: 29037253 PMCID: PMC5644247 DOI: 10.1186/s13062-017-0195-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/14/2017] [Accepted: 10/04/2017] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: Programmed ribosomal frameshifting (PRF) is a gene expression mechanism which enables the translation of two N-terminally coincident, C-terminally distinct protein products from a single mRNA. Many viruses utilize PRF to control or regulate gene expression, but very few phylogenetically conserved examples are known in vertebrate genes. Additional sex combs-like (ASXL) genes 1 and 2 encode important epigenetic and transcriptional regulatory proteins that control the expression of homeotic genes during key developmental stages. Here we describe an ~150-codon overlapping ORF (termed TF) in ASXL1 and ASXL2 that, with few exceptions, is conserved throughout vertebrates. RESULTS: Conservation of the TF ORF, strong suppression of synonymous site variation in the overlap region, and the completely conserved presence of an EH[N/S]Y motif (a known binding site for Host Cell Factor-1, HCF-1, an epigenetic regulatory factor), all indicate that TF is a protein-coding sequence. A highly conserved UCC_UUU_CGU sequence (identical to the known site of +1 ribosomal frameshifting for influenza virus PA-X expression) occurs at the 5' end of the region of enhanced synonymous site conservation in ASXL1. Similarly, a highly conserved RG_GUC_UCU sequence (identical to a known site of -2 ribosomal frameshifting for arterivirus nsp2TF expression) occurs at the 5' end of the region of enhanced synonymous site conservation in ASXL2. CONCLUSIONS: Due to a lack of appropriate splice forms, or initiation sites, the most plausible mechanism for translation of the ASXL1 and 2 TF regions is ribosomal frameshifting, resulting in a transframe fusion of the N-terminal half of ASXL1 or 2 to the TF product, termed ASXL-TF. Truncation or frameshift mutants of ASXL are linked to myeloid malignancies and genetic diseases, such as Bohring-Opitz syndrome, likely at least in part as a result of gain-of-function or dominant-negative effects. Our hypothesis now indicates that these disease-associated mutant forms represent overexpressed defective versions of ASXL-TF. REVIEWERS: This article was reviewed by Laurence Hurst and Eugene Koonin.
Affiliation(s)
- Adam M Dinan
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, CB2 1QP, UK
- John F Atkins
  - School of Biochemistry and Cell Biology, University College Cork, T12 YT57, Cork, Ireland
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, 84112, USA
- Andrew E Firth
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, CB2 1QP, UK
|
18
|
Uversky VN. Intrinsic Disorder, Protein-Protein Interactions, and Disease. Adv Protein Chem Struct Biol 2017; 110:85-121. [PMID: 29413001 DOI: 10.1016/bs.apcsb.2017.06.005] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Indexed: 01/12/2023]
Abstract
It is recognized now that biologically active proteins without stable tertiary structure (known as intrinsically disordered proteins, IDPs) and hybrid proteins containing ordered domains and intrinsically disordered protein regions (IDPRs) are important players found in any given proteome. These IDPs/IDPRs possess functions that complement functional repertoire of their ordered counterparts, being commonly related to recognition, as well as control and regulation of various signaling pathways. They are interaction masters, being able to utilize a wide spectrum of interaction mechanisms, ranging from induced folding to formation of fuzzy complexes where significant levels of disorder are preserved, to polyvalent stochastic interactions playing crucial roles in the liquid-liquid phase transitions leading to the formation of proteinaceous membrane-less organelles. IDPs/IDPRs are tightly controlled themselves via various means, including alternative splicing, precisely controlled expression and degradation, binding to specific partners, and posttranslational modifications. Distortions in the regulation and control of IDPs/IDPRs, as well as their aberrant interactivity are commonly associated with various human diseases. This review presents some aspects of the intrinsic disorder-based functionality and dysfunctionality, paying special attention to the normal and pathological protein-protein interactions.
Affiliation(s)
- Vladimir N Uversky
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, United States
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
|
19
|
Johnson KL, Cassin AM, Lonsdale A, Bacic A, Doblin MS, Schultz CJ. Pipeline to Identify Hydroxyproline-Rich Glycoproteins. Plant Physiol 2017; 174:886-903. [PMID: 28446635 PMCID: PMC5462032 DOI: 10.1104/pp.17.00294] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 03/06/2017] [Accepted: 04/21/2017] [Indexed: 05/14/2023]
Abstract
Intrinsically disordered proteins (IDPs) are functional proteins that lack a well-defined three-dimensional structure. The study of IDPs is a rapidly growing area as the crucial biological functions of more of these proteins are uncovered. In plants, IDPs are implicated in plant stress responses, signaling, and regulatory processes. A superfamily of cell wall proteins, the hydroxyproline-rich glycoproteins (HRGPs), have characteristic features of IDPs. Their protein backbones are rich in the disordering amino acid proline, they contain repeated sequence motifs and extensive posttranslational modifications (glycosylation), and they have been implicated in many biological functions. HRGPs are evolutionarily ancient, having been isolated from the protein-rich walls of chlorophyte algae to the cellulose-rich walls of embryophytes. Examination of HRGPs in a range of plant species should provide valuable insights into how they have evolved. Commonly divided into the arabinogalactan proteins, extensins, and proline-rich proteins, in reality, a continuum of structures exists within this diverse and heterogenous superfamily. An inability to accurately classify HRGPs leads to inconsistent gene ontologies limiting the identification of HRGP classes in existing and emerging omics data sets. We present a novel and robust motif and amino acid bias (MAAB) bioinformatics pipeline to classify HRGPs into 23 descriptive subclasses. Validation of MAAB was achieved using available genomic resources and then applied to the 1000 Plants transcriptome project (www.onekp.com) data set. Significant improvement in the detection of HRGPs using multiple-k-mer transcriptome assembly methodology was observed. The MAAB pipeline is readily adaptable and can be modified to optimize the recovery of IDPs from other organisms.
Affiliation(s)
- Kim L Johnson
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
- Andrew M Cassin
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
- Andrew Lonsdale
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
- Antony Bacic
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
- Monika S Doblin
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
- Carolyn J Schultz
  - Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
  - School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
|
20
|
Selection maintains signaling function of a highly diverged intrinsically disordered region. Proc Natl Acad Sci U S A 2017; 114:E1450-E1459. [PMID: 28167781 DOI: 10.1073/pnas.1614787114] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 01/05/2023] Open
Abstract
Intrinsically disordered regions (IDRs) are characterized by their lack of stable secondary or tertiary structure and comprise a large part of the eukaryotic proteome. Although these regions play a variety of signaling and regulatory roles, they appear to be rapidly evolving at the primary sequence level. To understand the functional implications of this rapid evolution, we focused on a highly diverged IDR in Saccharomyces cerevisiae that is involved in regulating multiple conserved MAPK pathways. We hypothesized that under stabilizing selection, the functional output of orthologous IDRs could be maintained, such that diverse genotypes could lead to similar function and fitness. Consistent with the stabilizing selection hypothesis, we find that diverged, orthologous IDRs can mostly recapitulate wild-type function and fitness in S. cerevisiae. We also find that the electrostatic charge of the IDR is correlated with signaling output and, using phylogenetic comparative methods, find evidence for selection maintaining this quantitative molecular trait despite underlying genotypic divergence.
|
21
|
Trujillo JT, Beilstein MA, Mosher RA. The Argonaute-binding platform of NRPE1 evolves through modulation of intrinsically disordered repeats. New Phytol 2016; 212:1094-1105. [PMID: 27431917 PMCID: PMC5125548 DOI: 10.1111/nph.14089] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/19/2016] [Accepted: 06/04/2016] [Indexed: 05/26/2023]
Abstract
Argonaute (Ago) proteins are important effectors in RNA silencing pathways, but they must interact with other machinery to trigger silencing. Ago hooks have emerged as a conserved motif responsible for interaction with Ago proteins, but little is known about the sequence surrounding Ago hooks that must restrict or enable interaction with specific Argonautes. Here we investigated the evolutionary dynamics of an Ago-binding platform in NRPE1, the largest subunit of RNA polymerase V. We compared NRPE1 sequences from > 50 species, including dense sampling of two plant lineages. This study demonstrates that the Ago-binding platform of NRPE1 retains Ago hooks, intrinsic disorder, and repetitive character while being highly labile at the sequence level. We reveal that loss of sequence conservation is the result of relaxed selection and frequent expansions and contractions of tandem repeat arrays. These factors allow a complete restructuring of the Ago-binding platform over 50-60 million yr. This evolutionary pattern is also detected in a second Ago-binding platform, suggesting it is a general mechanism. The presence of labile repeat arrays in all analyzed NRPE1 Ago-binding platforms indicates that selection maintains repetitive character, potentially to retain the ability to rapidly restructure the Ago-binding platform.
Affiliation(s)
- Joshua T Trujillo
  - The School of Plant Sciences, The University of Arizona, Tucson, AZ, 85721-0036, USA
- Mark A Beilstein
  - The School of Plant Sciences, The University of Arizona, Tucson, AZ, 85721-0036, USA
- Rebecca A Mosher
  - The School of Plant Sciences, The University of Arizona, Tucson, AZ, 85721-0036, USA
|
22
|
DeForte S, Uversky VN. Order, Disorder, and Everything in Between. Molecules 2016; 21:molecules21081090. [PMID: 27548131 PMCID: PMC6274243 DOI: 10.3390/molecules21081090] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 07/21/2016] [Revised: 08/10/2016] [Accepted: 08/11/2016] [Indexed: 02/04/2023] Open
Abstract
In addition to the “traditional” proteins characterized by the unique crystal-like structures needed for unique functions, it is increasingly recognized that many proteins or protein regions (collectively known as intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs)), being biologically active, do not have a specific 3D-structure in their unbound states under physiological conditions. There are also subtler categories of disorder, such as conditional (or dormant) disorder and partial disorder. Both the ability of a protein/region to fold into a well-ordered functional unit or to stay intrinsically disordered but functional are encoded in the amino acid sequence. Structurally, IDPs/IDPRs are characterized by high spatiotemporal heterogeneity and exist as dynamic structural ensembles. It is important to remember, however, that although structure and disorder are often treated as binary states, they actually sit on a structural continuum.
Affiliation(s)
- Shelly DeForte
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
  - USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
  - Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
|
23
|
Brown CJ, Quates CJ, Mirabzadeh CA, Miller CR, Wichman HA, Miura TA, Ytreberg FM. New Perspectives on Ebola Virus Evolution. PLoS One 2016; 11:e0160410. [PMID: 27479005 PMCID: PMC4968807 DOI: 10.1371/journal.pone.0160410] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/11/2016] [Accepted: 07/19/2016] [Indexed: 12/01/2022] Open
Abstract
Since the recent devastating outbreak of Ebola virus disease in western Africa, there has been significant effort to understand the evolution of the deadly virus that caused the outbreak. There has been a considerable investment in sequencing Ebola virus (EBOV) isolates, and the results paint an important picture of how the virus has spread in western Africa. EBOV evolution cannot be understood outside the context of previous outbreaks, however. We have focused this study on the evolution of the EBOV glycoprotein gene (GP) because one of its products, the spike glycoprotein (GP1,2), is central to the host immune response and because it contains a large amount of the phylogenetic signal for this virus. We inferred the maximum likelihood phylogeny of 96 nonredundant GP gene sequences representing each of the outbreaks since 1976 up to the end of 2014. We tested for positive selection and considered the placement of adaptive amino acid substitutions along the phylogeny and within the protein structure of GP1,2. We conclude that: 1) the common practice of rooting the phylogeny of EBOV between the first known outbreak in 1976 and the next outbreak in 1995 provides a misleading view of EBOV evolution that ignores the fact that there is a non-human EBOV host between outbreaks; 2) the N-terminus of GP1 may be constrained from evolving in response to the host immune system by the highly expressed, secreted glycoprotein, which is encoded by the same region of the GP gene; 3) although the mucin-like domain of GP1 is essential for EBOV in vivo, it evolves rapidly without losing its twin functions: providing O-linked glycosylation sites and a flexible surface.
Collapse
Affiliation(s)
- Celeste J Brown
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America; Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America; Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America
- Caleb J Quates
- Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America; Department of Physics, University of Idaho, Moscow, Idaho, United States of America
- Christopher A Mirabzadeh
- Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America; Department of Physics, University of Idaho, Moscow, Idaho, United States of America
- Craig R Miller
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America; Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America; Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America
- Holly A Wichman
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America; Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America; Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America
- Tanya A Miura
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America; Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America
- F Marty Ytreberg
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho, United States of America; Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho, United States of America; Department of Physics, University of Idaho, Moscow, Idaho, United States of America
|
24
|
Fadda E. Role of the XPA protein in the NER pathway: A perspective on the function of structural disorder in macromolecular assembly. Comput Struct Biotechnol J 2015; 14:78-85. [PMID: 26865925 PMCID: PMC4710682 DOI: 10.1016/j.csbj.2015.11.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 09/30/2015] [Revised: 11/25/2015] [Accepted: 11/26/2015] [Indexed: 12/23/2022]
Abstract
Lack of structure is often an essential functional feature of protein domains. The coordination of macromolecular assemblies in DNA repair pathways is yet another task disordered protein regions are highly implicated in. Here I review the available experimental and computational data and within this context discuss the functional role of structure and disorder in one of the essential scaffolding proteins in the nucleotide excision repair (NER) pathway, namely Xeroderma pigmentosum complementation group A (XPA). From the analysis of the current knowledge, in addition to protein–protein docking and secondary structure prediction results presented for the first time herein, a mechanistic framework emerges, where XPA builds the NER pre-incision complex in a modular fashion, as “beads on a string”, where the protein–protein interaction “beads”, or modules, are interconnected by disordered link regions. This architecture is ideal to avoid the expected steric hindrance constraints of the DNA expanded bubble. Finally, the role of the XPA structural disorder in binding affinity modulation and in the sequential binding of NER core factors in the pre-incision complex is also discussed.
Affiliation(s)
- Elisa Fadda
- Department of Chemistry, Maynooth University, Maynooth, Kildare, Ireland
|
25
|
Guy AJ, Irani V, MacRaild CA, Anders RF, Norton RS, Beeson JG, Richards JS, Ramsland PA. Insights into the Immunological Properties of Intrinsically Disordered Malaria Proteins Using Proteome Scale Predictions. PLoS One 2015; 10:e0141729. [PMID: 26513658 PMCID: PMC4626106 DOI: 10.1371/journal.pone.0141729] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 08/27/2015] [Accepted: 10/12/2015] [Indexed: 12/31/2022]
Abstract
Malaria remains a significant global health burden. The development of an effective malaria vaccine, which has the potential to significantly reduce morbidity and mortality, remains a major challenge. While Plasmodium spp. have been shown to contain a large number of intrinsically disordered proteins (IDPs) or disordered protein regions, the relationship of protein structure to subcellular localisation and adaptive immune responses remains unclear. In this study, we employed several computational prediction algorithms to identify IDPs at the proteome level of six Plasmodium spp. and to investigate the potential impact of protein disorder on adaptive immunity against P. falciparum parasites. IDPs were shown to be particularly enriched within nuclear proteins, apical proteins, exported proteins and proteins localised to the parasitophorous vacuole. Furthermore, several leading vaccine candidates, and proteins with known roles in host-cell invasion, have extensive regions of disorder. Presentation of peptides by MHC molecules plays an important role in adaptive immune responses, and we show that IDP regions are predicted to contain relatively few MHC class I and II binding peptides owing to inherent differences in amino acid composition compared to structured domains. In contrast, linear B-cell epitopes were predicted to be enriched in IDPs. Tandem repeat regions and non-synonymous single nucleotide polymorphisms were found to be strongly associated with regions of disorder. In summary, immune responses against IDPs appear to have characteristics distinct from those against structured protein domains, with increased antibody recognition of linear epitopes but some constraints for MHC presentation and issues of polymorphisms. These findings have major implications for vaccine design, and understanding immunity to malaria.
Affiliation(s)
- Andrew J. Guy
- Centre for Biomedical Research, Burnet Institute, Melbourne, Australia
- Department of Immunology, Monash University, Melbourne, Australia
- Vashti Irani
- Centre for Biomedical Research, Burnet Institute, Melbourne, Australia
- Department of Medicine, University of Melbourne, Melbourne, Australia
- Christopher A. MacRaild
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Australia
- Robin F. Anders
- Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Australia
- Raymond S. Norton
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Australia
- James G. Beeson
- Centre for Biomedical Research, Burnet Institute, Melbourne, Australia
- Department of Medicine, University of Melbourne, Melbourne, Australia
- Department of Microbiology, Monash University, Melbourne, Australia
- Jack S. Richards
- Centre for Biomedical Research, Burnet Institute, Melbourne, Australia
- Department of Medicine, University of Melbourne, Melbourne, Australia
- Department of Microbiology, Monash University, Melbourne, Australia
- Victorian Infectious Diseases Service, Royal Melbourne Hospital, Melbourne, Australia
- * E-mail: (JSR); (PAR)
- Paul A. Ramsland
- Centre for Biomedical Research, Burnet Institute, Melbourne, Australia
- Department of Immunology, Monash University, Melbourne, Australia
- Department of Surgery Austin Health, University of Melbourne, Heidelberg, Australia
- School of Biomedical Sciences, CHIRI Biosciences, Faculty of Health Sciences, Curtin University, Perth, Australia
- * E-mail: (JSR); (PAR)
|