1
Buggiani J, Meinnel T, Giglione C, Frottin F. Advances in nuclear proteostasis of metazoans. Biochimie 2024:S0300-9084(24)00081-6. [PMID: 38642824 DOI: 10.1016/j.biochi.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/12/2024] [Accepted: 04/16/2024] [Indexed: 04/22/2024]
Abstract
The proteostasis network and associated protein quality control (PQC) mechanisms ensure proteome functionality and are essential for cell survival. A distinctive feature of eukaryotic cells is their high degree of compartmentalization, requiring specific and adapted proteostasis networks for each compartment. The nucleus, essential for maintaining the integrity of genetic information and gene transcription, is one such compartment. While PQC mechanisms have been investigated for decades in the cytoplasm and the endoplasmic reticulum, our knowledge of nuclear PQC pathways is only emerging. Recent developments in the field have underscored the importance of spatially managing aberrant proteins within the nucleus. Upon proteotoxic stress, misfolded proteins and PQC effectors accumulate in various nuclear membrane-less organelles. Beyond bringing together effectors and substrates, the biophysical properties of these organelles allow novel PQC functions. In this review, we explore the specificity of the nuclear compartment, the effectors of the nuclear proteostasis network, and the PQC roles of nuclear membrane-less organelles in metazoans.
Affiliation(s)
- Julia Buggiani
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
- Thierry Meinnel
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
- Carmela Giglione
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
- Frédéric Frottin
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS, Gif-sur-Yvette, France
2
Uversky VN. Functional unfoldomics: Roles of intrinsic disorder in protein (multi)functionality. Advances in Protein Chemistry and Structural Biology 2023; 138:179-210. [PMID: 38220424 DOI: 10.1016/bs.apcsb.2023.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Intrinsically disordered proteins (IDPs), which are functional proteins without stable tertiary structure, and hybrid proteins containing ordered domains and intrinsically disordered regions (IDRs) constitute prominent parts of all proteomes collectively known as unfoldomes. IDPs/IDRs exist as highly dynamic structural ensembles of rapidly interconverting conformations and are characterized by the exceptional structural heterogeneity, where their different parts are (dis)ordered to different degree, and their overall structure represents a complex mosaic of foldons, inducible foldons, inducible morphing foldons, non-foldons, semifoldons, and even unfoldons. Despite their lack of unique 3D structures, IDPs/IDRs play crucial roles in the control of various biological processes and the regulation of different cellular pathways and are commonly involved in recognition and signaling, indicating that the disorder-based functional repertoire is complementary to the functions of ordered proteins. Furthermore, IDPs/IDRs are frequently multifunctional, and this multifunctionality is defined by their structural flexibility and heterogeneity. Intrinsic disorder phenomenon is at the roots of the structure-function continuum model, where the structure continuum is defined by the presence of differently (dis)ordered regions, and the function continuum arises from the ability of all these differently (dis)ordered parts to have different functions. In their everyday life, IDPs/IDRs utilize a broad spectrum of interaction mechanisms thereby acting as interaction specialists. They are crucial for the biogenesis of numerous proteinaceous membrane-less organelles driven by the liquid-liquid phase separation. This review introduces functional unfoldomics by representing some aspects of the intrinsic disorder-based functionality.
Affiliation(s)
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, United States
3
Dupouy G, Cashell R, Brychkova G, Tuteja R, McKeown PC, Spillane C. PICKLE RELATED 2 is a Neofunctionalized Gene Duplicate Under Positive Selection With Antagonistic Effects to the Ancestral PICKLE Gene on the Seed Transcriptome. Genome Biol Evol 2023; 15:evad191. [PMID: 37931037 PMCID: PMC10630071 DOI: 10.1093/gbe/evad191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/16/2023] [Indexed: 11/08/2023] Open
Abstract
The evolution and diversification of proteins capable of remodeling domains has been critical for transcriptional reprogramming during cell fate determination in multicellular eukaryotes. Chromatin remodeling proteins of the CHD3 family have been shown to have important and antagonistic impacts on seed development in the model plant, Arabidopsis thaliana, yet the basis of this functional divergence remains unknown. In this study, we demonstrate that genes encoding the CHD3 proteins PICKLE (PKL) and PICKLE-RELATED 2 (PKR2) originated from a duplication event during the diversification of crown Brassicaceae, and that these homologs have undergone distinct evolutionary trajectories since this duplication, with PKR2 fast evolving under positive selection, while PKL is subject to purifying selection. We find that the rapid evolution of PKR2 under positive selection reduces the encoded protein's intrinsic disorder, possibly suggesting a tertiary structure configuration which differs from that of PKL. Our whole genome transcriptome analysis in seeds of pkr2 and pkl mutants reveals that they act antagonistically on the expression of specific sets of genes, providing a basis for their differing roles in seed development. Our results provide insights into how gene duplication and neofunctionalization can lead to differing and antagonistic selective pressures on transcriptomes during plant reproduction, as well as on the evolutionary diversification of the CHD3 family within seed plants.
Affiliation(s)
- Gilles Dupouy
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
- Ronan Cashell
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
- Galina Brychkova
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
- Reetu Tuteja
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
- Peter C McKeown
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
- Charles Spillane
  - Genetics and Biotechnology Lab, Agriculture & Bioeconomy Research Centre, Ryan Institute, University of Galway, Galway H91 REW4, Ireland
4
Manyilov VD, Ilyinsky NS, Nesterov SV, Saqr BMGA, Dayhoff GW, Zinovev EV, Matrenok SS, Fonin AV, Kuznetsova IM, Turoverov KK, Ivanovich V, Uversky VN. Chaotic aging: intrinsically disordered proteins in aging-related processes. Cell Mol Life Sci 2023; 80:269. [PMID: 37634152 PMCID: PMC11073068 DOI: 10.1007/s00018-023-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/16/2023] [Revised: 07/03/2023] [Accepted: 07/24/2023] [Indexed: 08/29/2023]
Abstract
The development of aging is associated with the disruption of key cellular processes manifested as well-established hallmarks of aging. Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) have no stable tertiary structure that provide them a power to be configurable hubs in signaling cascades and regulate many processes, potentially including those related to aging. There is a need to clarify the roles of IDPs/IDRs in aging. The dataset of 1702 aging-related proteins was collected from established aging databases and experimental studies. There is a noticeable presence of IDPs/IDRs, accounting for about 36% of the aging-related dataset, which is however less than the disorder content of the whole human proteome (about 40%). A Gene Ontology analysis of the used here aging proteome reveals an abundance of IDPs/IDRs in one-third of aging-associated processes, especially in genome regulation. Signaling pathways associated with aging also contain IDPs/IDRs on different hierarchical levels, revealing the importance of "structure-function continuum" in aging. Protein-protein interaction network analysis showed that IDPs present in different clusters associated with different aging hallmarks. Protein cluster with IDPs enrichment has simultaneously high liquid-liquid phase separation (LLPS) probability, "nuclear" localization and DNA-associated functions, related to aging hallmarks: genomic instability, telomere attrition, epigenetic alterations, and stem cells exhaustion. Intrinsic disorder, LLPS, and aggregation propensity should be considered as features that could be markers of pathogenic proteins. Overall, our analyses indicate that IDPs/IDRs play significant roles in aging-associated processes, particularly in the regulation of DNA functioning. IDP aggregation, which can lead to loss of function and toxicity, could be critically harmful to the cell. 
A structure-based analysis of aging and the identification of proteins that are particularly susceptible to disturbances can enhance our understanding of the molecular mechanisms of aging and open up new avenues for slowing it down.
Affiliation(s)
- Vladimir D Manyilov
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Nikolay S Ilyinsky
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Semen V Nesterov
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
  - Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Baraa M G A Saqr
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Guy W Dayhoff
  - Department of Chemistry, University of South Florida, Tampa, FL, USA
- Egor V Zinovev
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Simon S Matrenok
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Alexander V Fonin
  - Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Irina M Kuznetsova
  - Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Valentin Ivanovich
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Vladimir N Uversky
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd., MDC07, Tampa, FL, 33612, USA
5
Kusova AM, Rakipov IT, Zuev YF. Effects of Homogeneous and Heterogeneous Crowding on Translational Diffusion of Rigid Bovine Serum Albumin and Disordered Alfa-Casein. Int J Mol Sci 2023; 24:11148. [PMID: 37446325 DOI: 10.3390/ijms241311148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 06/30/2023] [Accepted: 07/04/2023] [Indexed: 07/15/2023] Open
Abstract
Intracellular environment includes proteins, sugars, and nucleic acids interacting in restricted media. In the cytoplasm, the excluded volume effect takes up to 40% of the volume available for occupation by macromolecules. In this work, we tested several approaches modeling crowded solutions for protein diffusion. We experimentally showed how the protein diffusion deviates from conventional Brownian motion in artificial conditions modeling the alteration of medium viscosity and rigid spatial obstacles. The studied tracer proteins were globular bovine serum albumin and intrinsically disordered α-casein. Using the pulsed field gradient NMR, we investigated the translational diffusion of protein probes of different structures in homogeneous (glycerol) and heterogeneous (PEG 300/PEG 6000/PEG 40,000) solutions as a function of crowder concentration. Our results showed fundamentally different effects of homogeneous and heterogeneous crowded environments on protein self-diffusion. In addition, the applied "tracer on lattice" model showed that smaller crowding obstacles (PEG 300 and PEG 6000) create a dense net of restrictions noticeably hindering diffusing protein probes, whereas the large-sized PEG 40,000 creates a "less restricted" environment for the diffusive motion of protein molecules.
Affiliation(s)
- Aleksandra M Kusova
  - Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center, Russian Academy of Sciences, Lobachevsky Str. 2/31, Kazan 420111, Russia
- Ilnaz T Rakipov
  - Institute of Chemistry, Kazan Federal University, Kremlevskaya Str. 18, Kazan 420008, Russia
- Yuriy F Zuev
  - Kazan Institute of Biochemistry and Biophysics, FRC Kazan Scientific Center, Russian Academy of Sciences, Lobachevsky Str. 2/31, Kazan 420111, Russia
  - Institute of Chemistry, Kazan Federal University, Kremlevskaya Str. 18, Kazan 420008, Russia
6
Zhao B, Ghadermarzi S, Kurgan L. Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J 2023; 21:3248-3258. [PMID: 38213902 PMCID: PMC10782001 DOI: 10.1016/j.csbj.2023.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 01/13/2024] Open
Abstract
We expand studies of AlphaFold2 (AF2) in the context of intrinsic disorder prediction by comparing it against a broad selection of 20 accurate, popular and recently released disorder predictors. We use 25% larger benchmark dataset with 646 proteins and cover protein-level predictions of disorder content and fully disordered proteins. AF2-based disorder predictions secure a relatively high Area Under receiver operating characteristic Curve (AUC) of 0.77 and are statistically outperformed by several modern disorder predictors that secure AUCs around 0.8 with median runtime of about 20 s compared to 1200 s for AF2. Moreover, AF2 provides modestly accurate predictions of fully disordered proteins (F1 = 0.59 vs. 0.91 for the best disorder predictor) and disorder content (mean absolute error of 0.21 vs. 0.15). AF2 also generates statistically more accurate disorder predictions for about 20% of proteins that have relatively short sequences and a few disordered regions that tend to be located at the sequence termini, and which are absent of disordered protein-binding regions. Interestingly, AF2 and the most accurate disorder predictors rely on deep neural networks, suggesting that these models are useful for protein structure and disorder predictions.
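The three kinds of metric reported in this abstract (per-residue AUC, F1 for fully disordered proteins, and mean absolute error of disorder content) can be illustrated with a minimal Python sketch. The function names and toy inputs below are hypothetical and are not taken from the paper; this is not the authors' benchmark or evaluation code.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: chance that a disordered residue (label 1)
    receives a higher score than an ordered one (label 0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(true: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for a binary protein-level call (1 = fully disordered)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(true, pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(true, pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(true, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def content_mae(true: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error between native and predicted disorder content,
    where content is the fraction of disordered residues per protein."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


# Toy data (hypothetical): four residues, four proteins, two proteins.
residue_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
protein_f1 = f1([1, 1, 0, 0], [1, 0, 1, 0])
mae = content_mae([0.5, 0.2], [0.3, 0.3])
```

AUC is threshold-free and computed over residues, whereas F1 and content error are computed per protein, which is one reason a method can rank well on one metric and poorly on another.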
Affiliation(s)
- Bi Zhao
  - Genomics program, College of Public Health, University of South Florida, Tampa, FL, United States
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
7
Peng Z, Li Z, Meng Q, Zhao B, Kurgan L. CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information. Brief Bioinform 2023; 24:6858950. [PMID: 36458437 DOI: 10.1093/bib/bbac502] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/14/2022] [Revised: 09/30/2022] [Accepted: 10/24/2022] [Indexed: 12/04/2022] Open
Abstract
One of key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acids interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. Vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures area under receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.qd.sdu.edu.cn/download/CLIP/.
Affiliation(s)
- Zhenling Peng
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
  - Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
- Zixia Li
  - Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Qiaozhen Meng
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
8
Kastano K, Mier P, Dosztányi Z, Promponas VJ, Andrade-Navarro MA. Functional Tuning of Intrinsically Disordered Regions in Human Proteins by Composition Bias. Biomolecules 2022; 12:biom12101486. [PMID: 36291695 PMCID: PMC9599065 DOI: 10.3390/biom12101486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 09/30/2022] [Accepted: 10/11/2022] [Indexed: 11/16/2022] Open
Abstract
Intrinsically disordered regions (IDRs) in protein sequences are flexible, have low structural constraints and as a result have faster rates of evolution. This lack of evolutionary conservation greatly limits the use of sequence homology for the classification and functional assessment of IDRs, as opposed to globular domains. The study of IDRs requires other properties for their classification and functional prediction. While composition bias is not a necessary property of IDRs, compositionally biased regions (CBRs) have been noted as frequent part of IDRs. We hypothesized that to characterize IDRs, it could be helpful to study their overlap with particular types of CBRs. Here, we evaluate this overlap in the human proteome. A total of 2/3 of residues in IDRs overlap CBRs. Considering CBRs enriched in one type of amino acid, we can distinguish CBRs that tend to be fully included within long IDRs (R, H, N, D, P, G), from those that partially overlap shorter IDRs (S, E, K, T), and others that tend to overlap IDR terminals (Q, A). CBRs overlap more often IDRs in nuclear proteins and in proteins involved in liquid-liquid phase separation (LLPS). Study of protein interaction networks reveals the enrichment of CBRs in IDRs by tandem repetition of short linear motifs (rich in S or P), and the existence of E-rich polar regions that could support specific protein interactions with non-specific interactions. Our results open ways to pin down the function of IDRs from their partial compositional biases.
Affiliation(s)
- Kristina Kastano
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Biozentrum I, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Biozentrum I, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Zsuzsanna Dosztányi
  - Department of Biochemistry, ELTE Eötvös Loránd University, Pázmány Péter stny 1/c, H-1117 Budapest, Hungary
- Vasilis J. Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, 1678 Nicosia, Cyprus
- Miguel A. Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University, Biozentrum I, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
9
Uversky VN. State without borders: Membrane-less organelles and liquid-liquid phase transitions. Biochimica et Biophysica Acta. Molecular Cell Research 2022; 1869:119251. [PMID: 35245612 DOI: 10.1016/j.bbamcr.2022.119251] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Affiliation(s)
- Vladimir N Uversky
  - Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33620, United States
10
Compositional Bias of Intrinsically Disordered Proteins and Regions and Their Predictions. Biomolecules 2022; 12:biom12070888. [PMID: 35883444 PMCID: PMC9313023 DOI: 10.3390/biom12070888] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/27/2022] [Revised: 06/10/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022] Open
Abstract
Intrinsically disordered regions (IDRs) carry out many cellular functions and vary in length and placement in protein sequences. This diversity leads to variations in the underlying compositional biases, which were demonstrated for the short vs. long IDRs. We analyze compositional biases across four classes of disorder: fully disordered proteins; short IDRs; long IDRs; and binding IDRs. We identify three distinct biases: for the fully disordered proteins, the short IDRs and the long and binding IDRs combined. We also investigate compositional bias for putative disorder produced by leading disorder predictors and find that it is similar to the bias of the native disorder. Interestingly, the accuracy of disorder predictions across different methods is correlated with the correctness of the compositional bias of their predictions highlighting the importance of the compositional bias. The predictive quality is relatively low for the disorder classes with compositional bias that is the most different from the “generic” disorder bias, while being much higher for the classes with the most similar bias. We discover that different predictors perform best across different classes of disorder. This suggests that no single predictor is universally best and motivates the development of new architectures that combine models that target specific disorder classes.
11
Sun H, Dong Z, Zhang Q, Liu B, Yan S, Wang Y, Yin D, Wang Y, Ren P, Wu N, Chang L. Companion-Probe & Race platform for interrogating nuclear protein and migration of living cells. Biosens Bioelectron 2022; 210:114281. [DOI: 10.1016/j.bios.2022.114281] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/02/2022] [Revised: 04/01/2022] [Accepted: 04/09/2022] [Indexed: 01/15/2023]
12
Zhao B, Kurgan L. Deep Learning in Prediction of Intrinsic Disorder in Proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022] Open
13
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022] Open
14
Abstract
INTRODUCTION: Intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences and constructs and disseminates databases of these predictions. Over 40 years of research resulted in the release of numerous resources. AREAS COVERED: We identify and briefly summarize the most comprehensive to date collection of over 100 disorder predictors. We focus on their predictive models, availability and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION: We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods has shifted to meta-predictors in early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in foreseeable future given recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
15
Katuwawala A, Zhao B, Kurgan L. DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning. Bioinformatics 2021; 38:115-124. [PMID: 34487138 DOI: 10.1093/bioinformatics/btab640] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 05/31/2021] [Revised: 08/05/2021] [Accepted: 09/02/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Intrinsically disordered protein regions interact with proteins, nucleic acids and lipids. Regions that bind lipids are implicated in a wide spectrum of cellular functions and several human diseases. Motivated by the growing amount of experimental data for these interactions and lack of tools that can predict them from the protein sequence, we develop DisoLipPred, the first predictor of the disordered lipid-binding residues (DLBRs). RESULTS: DisoLipPred relies on a deep bidirectional recurrent network that implements three innovative features: transfer learning, bypass module that sidesteps predictions for putative structured residues, and expanded inputs that cover physiochemical properties associated with the protein-lipid interactions. Ablation analysis shows that these features drive predictive quality of DisoLipPred. Tests on an independent test dataset and the yeast proteome reveal that DisoLipPred generates accurate results and that none of the related existing tools can be used to indirectly identify DLBR. We also show that DisoLipPred's predictions complement the results generated by predictors of the transmembrane regions. Altogether, we conclude that DisoLipPred provides high-quality predictions of DLBRs that complement the currently available methods. AVAILABILITY AND IMPLEMENTATION: DisoLipPred's webserver is available at http://biomine.cs.vcu.edu/servers/DisoLipPred/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
16
Zhang F, Zhao B, Shi W, Li M, Kurgan L. DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning. Brief Bioinform 2021; 23:6461158. [PMID: 34905768 DOI: 10.1093/bib/bbab521] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/02/2021] [Revised: 10/30/2021] [Accepted: 11/14/2021] [Indexed: 12/14/2022] Open
Abstract
Proteins with intrinsically disordered regions (IDRs) are common among eukaryotes. Many IDRs interact with nucleic acids and proteins. Annotation of these interactions is supported by computational predictors, but to date, only one tool that predicts interactions with nucleic acids was released, and recent assessments demonstrate that current predictors offer modest levels of accuracy. We have developed DeepDISOBind, an innovative deep multi-task architecture that accurately predicts deoxyribonucleic acid (DNA)-, ribonucleic acid (RNA)- and protein-binding IDRs from protein sequences. DeepDISOBind relies on an information-rich sequence profile that is processed by an innovative multi-task deep neural network, where subsequent layers are gradually specialized to predict interactions with specific partner types. The common input layer links to a layer that differentiates protein- and nucleic acid-binding, which further links to layers that discriminate between DNA and RNA interactions. Empirical tests show that this multi-task design provides statistically significant gains in predictive quality across the three partner types when compared to a single-task design and a representative selection of the existing methods that cover both disorder- and structure-trained tools. Analysis of the predictions on the human proteome reveals that DeepDISOBind predictions can be encoded into protein-level propensities that accurately predict DNA- and RNA-binding proteins and protein hubs. DeepDISOBind is available at https://www.csuligroup.com/DeepDISOBind/.
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Wenbo Shi
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| |
|
17
|
Ulianov SV, Velichko A, Magnitov MD, Luzhin A, Golov AK, Ovsyannikova N, Kireev II, Gavrikov A, Mishin A, Garaev AK, Tyakht AV, Gavrilov A, Kantidze OL, Razin SV. Suppression of liquid-liquid phase separation by 1,6-hexanediol partially compromises the 3D genome organization in living cells. Nucleic Acids Res 2021; 49:10524-10541. [PMID: 33836078 PMCID: PMC8501969 DOI: 10.1093/nar/gkab249] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 02/26/2021] [Revised: 03/22/2021] [Accepted: 03/25/2021] [Indexed: 12/12/2022]
Abstract
Liquid-liquid phase separation (LLPS) contributes to the spatial and functional segregation of molecular processes within the cell nucleus. However, the role played by LLPS in chromatin folding in living cells remains unclear. Here, using stochastic optical reconstruction microscopy (STORM) and Hi-C techniques, we studied the effects of 1,6-hexanediol (1,6-HD)-mediated LLPS disruption/modulation on higher-order chromatin organization in living cells. We found that 1,6-HD treatment caused the enlargement of nucleosome clutches and their more uniform distribution in the nuclear space. At a megabase-scale, chromatin underwent moderate but irreversible perturbations that resulted in the partial mixing of A and B compartments. The removal of 1,6-HD from the culture medium did not allow chromatin to acquire initial configurations, and resulted in more compact repressed chromatin than in untreated cells. 1,6-HD treatment also weakened enhancer-promoter interactions and TAD insulation but did not considerably affect CTCF-dependent loops. Our results suggest that 1,6-HD-sensitive LLPS plays a limited role in chromatin spatial organization by constraining its folding patterns and facilitating compartmentalization at different levels.
Affiliation(s)
- Sergey V Ulianov
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Faculty of Biology, Lomonosov Moscow State University, 119992 Moscow, Russia
| | - Artem K Velichko
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
- Institute for Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, 119991 Moscow, Russia
| | - Mikhail D Magnitov
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
- Department of Biological and Medical Physics, Moscow Institute of Physics and Technology (National Research University), 141701 Dolgoprudny, Russia
| | - Artem V Luzhin
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
| | - Arkadiy K Golov
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
| | - Natalia Ovsyannikova
- A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia
| | - Igor I Kireev
- A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119992 Moscow, Russia
- V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology, and Perinatology, 117997 Moscow, Russia
| | - Alexey S Gavrikov
- Shemyakin−Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
| | - Alexander S Mishin
- Shemyakin−Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
| | - Azat K Garaev
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
| | - Alexander V Tyakht
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
| | - Alexey A Gavrilov
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
| | - Omar L Kantidze
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
| | - Sergey V Razin
- Institute of Gene Biology Russian Academy of Science, 119334 Moscow, Russia
| |
|
18
|
Gutierrez‐Beltran E, Elander PH, Dalman K, Dayhoff GW, Moschou PN, Uversky VN, Crespo JL, Bozhkov PV. Tudor staphylococcal nuclease is a docking platform for stress granule components and is essential for SnRK1 activation in Arabidopsis. EMBO J 2021; 40:e105043. [PMID: 34287990 PMCID: PMC8447601 DOI: 10.15252/embj.2020105043] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 03/18/2020] [Revised: 06/23/2021] [Accepted: 07/01/2021] [Indexed: 12/19/2022]
Abstract
Tudor staphylococcal nuclease (TSN; also known as Tudor-SN, p100, or SND1) is a multifunctional, evolutionarily conserved regulator of gene expression, exhibiting cytoprotective activity in animals and plants and oncogenic activity in mammals. During stress, TSN stably associates with stress granules (SGs), in a poorly understood process. Here, we show that in the model plant Arabidopsis thaliana, TSN is an intrinsically disordered protein (IDP) acting as a scaffold for a large pool of other IDPs, enriched for conserved stress granule components as well as novel or plant-specific SG-localized proteins. While approximately 30% of TSN interactors are recruited to stress granules de novo upon stress perception, 70% form a protein-protein interaction network present before the onset of stress. Finally, we demonstrate that TSN and stress granule formation promote heat-induced activation of the evolutionarily conserved energy-sensing SNF1-related protein kinase 1 (SnRK1), the plant orthologue of mammalian AMP-activated protein kinase (AMPK). Our results establish TSN as a docking platform for stress granule proteins, with an important role in stress signalling.
Affiliation(s)
- Emilio Gutierrez‐Beltran
- Instituto de Bioquímica Vegetal y Fotosíntesis, Consejo Superior de Investigaciones Científicas (CSIC)‐Universidad de Sevilla, Sevilla, Spain
- Departamento de Bioquímica Vegetal y Biología Molecular, Facultad de Biología, Universidad de Sevilla, Sevilla, Spain
| | - Pernilla H Elander
- Department of Molecular Sciences, Uppsala BioCenter, Swedish University of Agricultural Sciences and Linnean Center for Plant Biology, Uppsala, Sweden
| | - Kerstin Dalman
- Department of Molecular Sciences, Uppsala BioCenter, Swedish University of Agricultural Sciences and Linnean Center for Plant Biology, Uppsala, Sweden
| | - Guy W Dayhoff
- Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, FL, USA
| | - Panagiotis N Moschou
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology ‐ Hellas, Heraklion, Greece
- Department of Plant Biology, Uppsala BioCenter, Swedish University of Agricultural Sciences and Linnean Center for Plant Biology, Uppsala, Sweden
- Department of Biology, University of Crete, Heraklion, Greece
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, Pushchino, Russia
| | - Jose L Crespo
- Instituto de Bioquímica Vegetal y Fotosíntesis, Consejo Superior de Investigaciones Científicas (CSIC)‐Universidad de Sevilla, Sevilla, Spain
| | - Peter V Bozhkov
- Department of Molecular Sciences, Uppsala BioCenter, Swedish University of Agricultural Sciences and Linnean Center for Plant Biology, Uppsala, Sweden
| |
|
19
|
Marzullo L, Turco MC, Uversky VN. What's in the BAGs? Intrinsic disorder angle of the multifunctionality of the members of a family of chaperone regulators. J Cell Biochem 2021; 123:22-42. [PMID: 34339540 DOI: 10.1002/jcb.30123] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/30/2021] [Revised: 06/28/2021] [Accepted: 07/22/2021] [Indexed: 01/22/2023]
Abstract
In humans, the family of Bcl-2 associated athanogene (BAG) proteins includes six members characterized by exceptional multifunctionality and engagement in the pathogenesis of various diseases. All of them are capable of interacting with a multitude of often unrelated binding partners. Such binding promiscuity and related functional and pathological multifacetedness cannot be explained or understood within the frames of the classical "one protein-one structure-one function" model, which also fails to explain the presence of multiple isoforms generated for BAG proteins by alternative splicing or alternative translation initiation and their extensive posttranslational modifications. However, all these mysteries can be solved by taking into account the intrinsic disorder phenomenon. In fact, high binding promiscuity and potential to participate in a broad spectrum of interactions with multiple binding partners, as well as a capability to be multifunctional and multipathogenic, are some of the characteristic features of intrinsically disordered proteins and intrinsically disordered protein regions. Such functional proteins or protein regions lacking unique tertiary structures constitute a cornerstone of the protein structure-function continuum concept. The aim of this paper is to provide an overview of the functional roles of human BAG proteins from the perspective of protein intrinsic disorder which will provide a means for understanding their binding promiscuity, multifunctionality, and relation to the pathogenesis of various diseases.
Affiliation(s)
- Liberato Marzullo
- Department of Medicine, Surgery and Dentistry Schola Medica Salernitana, University of Salerno, Baronissi, Italy
- Research and Development Division, BIOUNIVERSA s.r.l., Baronissi, Italy
| | - Maria C Turco
- Department of Medicine, Surgery and Dentistry Schola Medica Salernitana, University of Salerno, Baronissi, Italy
- Research and Development Division, BIOUNIVERSA s.r.l., Baronissi, Italy
| | - Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
| |
|
20
|
Peng Z, Xing Q, Kurgan L. APOD: accurate sequence-based predictor of disordered flexible linkers. Bioinformatics 2021; 36:i754-i761. [PMID: 33381830 PMCID: PMC7773485 DOI: 10.1093/bioinformatics/btaa808] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Accepted: 09/07/2020] [Indexed: 12/21/2022]
Abstract
Motivation: Disordered flexible linkers (DFLs) are abundant and functionally important intrinsically disordered regions that connect protein domains and structural elements within domains and that facilitate disorder-based allosteric regulation. Although computational estimates suggest that thousands of proteins have DFLs, they have been annotated experimentally in <200 proteins. This substantial annotation gap can be reduced with the help of accurate computational predictors. The sole predictor of DFLs, DFLpred, trades off accuracy for shorter runtime by excluding relevant but computationally costly predictive inputs. Moreover, it relies on local/window-based information and does not consider useful protein-level characteristics. Results: We conceptualize, design and test APOD (Accurate Predictor Of DFLs), the first highly accurate predictor that utilizes both local- and protein-level inputs that quantify propensity for disorder, sequence composition, sequence conservation and selected putative structural properties. Consequently, APOD offers significantly more accurate predictions when compared with its faster predecessor, DFLpred, and several other alternative ways to predict DFLs. These improvements stem from the use of a more comprehensive set of inputs that cover the protein-level information and the application of a more sophisticated predictive model, a well-parametrized support vector machine. APOD achieves area under the curve = 0.82 (28% improvement over DFLpred) and Matthews correlation coefficient = 0.42 (180% increase over DFLpred) when tested on an independent/low-similarity test dataset. Consequently, APOD is a suitable choice for accurate and small-scale prediction of DFLs. Availability and implementation: https://yanglab.nankai.edu.cn/APOD/.
Affiliation(s)
- Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
- School of Statistics and Data Science, Nankai University, Tianjin 300074, China
| | - Qian Xing
- Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
21
|
Zhang J, Ghadermarzi S, Kurgan L. Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins. Bioinformatics 2021; 36:4729-4738. [PMID: 32860044 DOI: 10.1093/bioinformatics/btaa573] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/07/2020] [Revised: 05/22/2020] [Accepted: 06/10/2020] [Indexed: 01/08/2023]
Abstract
MOTIVATION: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions). RESULTS: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over structure- and disorder-annotated proteins and produces a relatively low amount of cross-predictions, offering an accurate alternative to predict PBRs. AVAILABILITY AND IMPLEMENTATION: HybridPBRpred webserver, benchmark dataset and supplementary information are available at http://biomine.cs.vcu.edu/servers/hybridPBRpred/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
22
|
Zhao B, Katuwawala A, Uversky VN, Kurgan L. IDPology of the living cell: intrinsic disorder in the subcellular compartments of the human cell. Cell Mol Life Sci 2021; 78:2371-2385. [PMID: 32997198 PMCID: PMC11071772 DOI: 10.1007/s00018-020-03654-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/08/2020] [Revised: 09/09/2020] [Accepted: 09/22/2020] [Indexed: 12/11/2022]
Abstract
Intrinsic disorder can be found in all proteomes of all kingdoms of life and in viruses, being particularly prevalent in the eukaryotes. We conduct a comprehensive analysis of the intrinsic disorder in human proteins while mapping them into 24 compartments of the human cell. In agreement with previous studies, we show that human proteins are significantly enriched in disorder relative to a generic protein set that represents the protein universe. In fact, the fraction of proteins with long disordered regions and the average protein-level disorder content in the human proteome are about 3 times higher than in the protein universe. Furthermore, levels of intrinsic disorder in the majority of human subcellular compartments significantly exceed the average disorder content in the protein universe. Relative to the overall amount of disorder in the human proteome, proteins localized in the nucleus and cytoskeleton have significantly increased amounts of disorder, measured by both high disorder content and presence of multiple long intrinsically disordered regions. We empirically demonstrate that, on average, human proteins are assigned to 2.3 subcellular compartments, with proteins localized to few subcellular compartments being more disordered than the proteins that are localized to many compartments. Functionally, the disordered proteins localized in the most disorder-enriched subcellular compartments are primarily responsible for interactions with nucleic acids and protein partners. This is the first time that disorder has been comprehensively mapped into the human cell. Our observations add a missing piece to the puzzle of functional disorder and its organization inside the cell.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, 33612, USA.
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Russia.
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA.
| |
|
23
|
Layalle S, They L, Ourghani S, Raoul C, Soustelle L. Amyotrophic Lateral Sclerosis Genes in Drosophila melanogaster. Int J Mol Sci 2021; 22:ijms22020904. [PMID: 33477509 PMCID: PMC7831090 DOI: 10.3390/ijms22020904] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/21/2020] [Revised: 01/13/2021] [Accepted: 01/14/2021] [Indexed: 12/11/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a devastating adult-onset neurodegenerative disease characterized by the progressive degeneration of upper and lower motoneurons. Most ALS cases are sporadic but approximately 10% of ALS cases are due to inherited mutations in identified genes. ALS-causing mutations were identified in over 30 genes with superoxide dismutase-1 (SOD1), chromosome 9 open reading frame 72 (C9orf72), fused in sarcoma (FUS), and TAR DNA-binding protein (TARDBP, encoding TDP-43) being the most frequent. In the last few decades, Drosophila melanogaster emerged as a versatile model for studying neurodegenerative diseases, including ALS. In this review, we describe the different Drosophila ALS models that have been successfully used to decipher the cellular and molecular pathways associated with SOD1, C9orf72, FUS, and TDP-43. The study of the known fruit fly orthologs of these ALS-related genes yielded significant insights into cellular mechanisms and physiological functions. Moreover, genetic screening in tissue-specific gain-of-function mutants that mimic ALS-associated phenotypes identified disease-modifying genes. Here, we propose a comprehensive review on the Drosophila research focused on four ALS-linked genes that has revealed novel pathogenic mechanisms and identified potential therapeutic targets for future therapy.
Affiliation(s)
- Sophie Layalle
- The Neuroscience Institute of Montpellier, INSERM, University of Montpellier, 34091 Montpellier, France
| | - Laetitia They
- The Neuroscience Institute of Montpellier, INSERM, University of Montpellier, 34091 Montpellier, France
| | - Sarah Ourghani
- The Neuroscience Institute of Montpellier, INSERM, University of Montpellier, 34091 Montpellier, France
| | - Cédric Raoul
- The Neuroscience Institute of Montpellier, INSERM, University of Montpellier, 34091 Montpellier, France
- Laboratory of Neurobiology, Kazan Federal University, 420008 Kazan, Russia
- Correspondence: (C.R.); (L.S.)
| | - Laurent Soustelle
- The Neuroscience Institute of Montpellier, INSERM, University of Montpellier, 34091 Montpellier, France
- Correspondence: (C.R.); (L.S.)
| |
|
24
|
Sharma A, Colonna G. System-Wide Pollution of Biomedical Data: Consequence of the Search for Hub Genes of Hepatocellular Carcinoma Without Spatiotemporal Consideration. Mol Diagn Ther 2021; 25:9-27. [PMID: 33475988 PMCID: PMC7847983 DOI: 10.1007/s40291-020-00505-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 12/01/2020] [Indexed: 12/17/2022]
Abstract
Biomedical institutions rely on data evaluation and are turning into data factories. Big-data storage centers, supercomputing systems, and increased algorithmic efficiency allow us to analyze the ever-increasing amount of data generated every day in biomedical research centers. In network science, the principal intrinsic problem is how to integrate the data and information from different experiments on genes or proteins. Data curation is an essential process in annotating new functional data to known genes or proteins, undertaken by a biobank curator, which is then reflected in the calculated networks. We provide an example of how protein-protein networks today have space-time limits. The next step is the integration of data and information from different biobanks. Omics data and networks are essential parts of this step but also have flawed protocols and errors. Consider data from patients with cancer: from biopsy procedures to experimental tests, to archiving methods and computational algorithms, these are continuously handled so require critical and continuous "updates" to obtain reproducible, reliable, and correct results. We show, as a second example, how all this distorts studies in cellular hepatocellular carcinoma. It is not unlikely that these flawed data have been polluting biobanks for some time before stringent conditions for the veracity of data were implemented in Big data. Therefore, all this could contribute to errors in future medical decisions.
Affiliation(s)
- Ankush Sharma
- Department of Biosciences, University of Oslo, Oslo, Norway.
- Department of Informatics, University of Oslo, Oslo, Norway.
- Institute of Cancer Research, Institute of Clinical medicine, University of Oslo, Oslo, Norway.
| | - Giovanni Colonna
- Medical Informatics, AOU-Vanvitelli, Università della Campania, Naples, Italy
| |
|
25
|
Kurgan L, Li M, Li Y. The Methods and Tools for Intrinsic Disorder Prediction and their Application to Systems Medicine. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11320-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
26
|
Peng Z, Xing Q, Kurgan L. APOD: accurate sequence-based predictor of disordered flexible linkers. BIOINFORMATICS (OXFORD, ENGLAND) 2020; 36:i754-i761. [PMID: 33381830 DOI: 10.1101/2020.12.03.409755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/07/2020] [Indexed: 05/28/2023]
Abstract
MOTIVATION: Disordered flexible linkers (DFLs) are abundant and functionally important intrinsically disordered regions that connect protein domains and structural elements within domains and that facilitate disorder-based allosteric regulation. Although computational estimates suggest that thousands of proteins have DFLs, they have been annotated experimentally in <200 proteins. This substantial annotation gap can be reduced with the help of accurate computational predictors. The sole predictor of DFLs, DFLpred, trades off accuracy for shorter runtime by excluding relevant but computationally costly predictive inputs. Moreover, it relies on local/window-based information and does not consider useful protein-level characteristics. RESULTS: We conceptualize, design and test APOD (Accurate Predictor Of DFLs), the first highly accurate predictor that utilizes both local- and protein-level inputs that quantify propensity for disorder, sequence composition, sequence conservation and selected putative structural properties. Consequently, APOD offers significantly more accurate predictions when compared with its faster predecessor, DFLpred, and several other alternative ways to predict DFLs. These improvements stem from the use of a more comprehensive set of inputs that cover the protein-level information and the application of a more sophisticated predictive model, a well-parametrized support vector machine. APOD achieves area under the curve = 0.82 (28% improvement over DFLpred) and Matthews correlation coefficient = 0.42 (180% increase over DFLpred) when tested on an independent/low-similarity test dataset. Consequently, APOD is a suitable choice for accurate and small-scale prediction of DFLs. AVAILABILITY AND IMPLEMENTATION: https://yanglab.nankai.edu.cn/APOD/.
Affiliation(s)
- Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
- School of Statistics and Data Science, Nankai University, Tianjin 300074, China
| | - Qian Xing
- Center for Applied Mathematics, Tianjin University, Tianjin 300072, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
27
|
Comparative Assessment of Intrinsic Disorder Predictions with a Focus on Protein and Nucleic Acid-Binding Proteins. Biomolecules 2020; 10:biom10121636. [PMID: 33291838 PMCID: PMC7762010 DOI: 10.3390/biom10121636] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/17/2020] [Revised: 11/26/2020] [Accepted: 12/03/2020] [Indexed: 01/18/2023]
Abstract
With over 60 disorder predictors, users need help navigating the predictor selection task. We review 28 surveys of disorder predictors, showing that only 11 include assessment of predictive performance. We identify and address a few drawbacks of these past surveys. To this end, we release a novel benchmark dataset with reduced similarity to the training sets of the considered predictors. We use this dataset to perform a first-of-its-kind comparative analysis that targets two large functional families of disordered proteins that interact with proteins and with nucleic acids. We show that limiting sequence similarity between the benchmark and the training datasets has a substantial impact on predictive performance. We also demonstrate that predictive quality is sensitive to the use of the well-annotated order and inclusion of the fully structured proteins in the benchmark datasets, both of which should be considered in future assessments. We identify three predictors that provide favorable results using the new benchmark set. While we find that VSL2B offers the most accurate and robust results overall, ESpritz-DisProt and SPOT-Disorder perform particularly well for disordered proteins. Moreover, we find that predictions for the disordered protein-binding proteins suffer low predictive quality compared to generic disordered proteins and the disordered nucleic acids-binding proteins. This can be explained by the high disorder content of the disordered protein-binding proteins, which makes it difficult for the current methods to accurately identify ordered regions in these proteins. This finding motivates the development of a new generation of methods that would target these difficult-to-predict disordered proteins. We also discuss resources that support users in collecting and identifying high-quality disorder predictions.
|
28
|
Uversky VN. Functions of short lifetime biological structures at large: the case of intrinsically disordered proteins. Brief Funct Genomics 2020; 19:60-68. [PMID: 29982297 DOI: 10.1093/bfgp/ely023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022]
Abstract
Although for more than a century a protein function was intimately associated with the presence of unique structure in a protein molecule, recent years witnessed a skyrocketing rise in the appreciation of the protein intrinsic disorder concept, which emphasizes the importance of biologically active proteins without ordered structures. In different proteins, the depth and breadth of disorder penetrance are different, generating an amusing spatiotemporal heterogeneity of intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs), which are typically described as highly dynamic ensembles of rapidly interconverting conformations (or a multitude of short lifetime structures). IDPs/IDPRs constitute a substantial part of the protein kingdom and have unique functions complementary to the functional repertoires of ordered proteins. They are recognized as interaction specialists and global controllers that play crucial roles in regulation of functions of their binding partners and in controlling large biological networks. IDPs/IDPRs are characterized by immense binding promiscuity and are able to use a broad spectrum of binding modes, often resulting in the formation of short lifetime complexes. In their turn, functions of IDPs and IDPRs are controlled by various means, such as numerous posttranslational modifications and alternative splicing. Some of the functions of IDPs/IDPRs are briefly considered in this review to shed some light on the biological roles of short-lived structures at large.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA and Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
| |
|
29
|
Razin SV, Ulianov SV. Divide and Rule: Phase Separation in Eukaryotic Genome Functioning. Cells 2020; 9:cells9112480. [PMID: 33203115 PMCID: PMC7696541 DOI: 10.3390/cells9112480] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/29/2020] [Revised: 11/12/2020] [Accepted: 11/13/2020] [Indexed: 12/13/2022]
Abstract
The functioning of a cell at various organizational levels is determined by the interactions between macromolecules that promote cellular organelle formation and orchestrate metabolic pathways via the control of enzymatic activities. Although highly specific and relatively stable protein-protein, protein-DNA, and protein-RNA interactions are traditionally suggested as the drivers for cellular function realization, recent advances in the discovery of weak multivalent interactions have uncovered the role of so-called macromolecule condensates. These structures, which are highly divergent in size, composition, function, and cellular localization are predominantly formed by liquid-liquid phase separation (LLPS): a physical-chemical process where an initially homogenous solution turns into two distinct phases, one of which contains the major portion of the dissolved macromolecules and the other one containing the solvent. In a living cell, LLPS drives the formation of membrane-less organelles such as the nucleolus, nuclear bodies, and viral replication factories and facilitates the assembly of complex macromolecule aggregates possessing regulatory, structural, and enzymatic functions. Here, we discuss the role of LLPS in the spatial organization of eukaryotic chromatin and regulation of gene expression in normal and pathological conditions.
Affiliation(s)
- Sergey V. Razin
- Institute of Gene Biology, Russian Academy of Sciences, 119017 Moscow, Russia;
- Faculty of Biology, M.V. Lomonosov Moscow State University, 119017 Moscow, Russia
| | - Sergey V. Ulianov
- Institute of Gene Biology, Russian Academy of Sciences, 119017 Moscow, Russia;
- Faculty of Biology, M.V. Lomonosov Moscow State University, 119017 Moscow, Russia
- Correspondence: ; Tel.: +7-499-135-9787
| |
|
30
|
Kantidze OL, Razin SV. Weak interactions in higher-order chromatin organization. Nucleic Acids Res 2020; 48:4614-4626. [PMID: 32313950 PMCID: PMC7229822 DOI: 10.1093/nar/gkaa261] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 02/14/2020] [Revised: 03/30/2020] [Accepted: 04/03/2020] [Indexed: 12/20/2022]
Abstract
The detailed principles of the hierarchical folding of eukaryotic chromosomes have been revealed during the last two decades. Along with structures composing three-dimensional (3D) genome organization (chromatin compartments, topologically associating domains, chromatin loops, etc.), the molecular mechanisms that are involved in their establishment and maintenance have been characterized. Generally, protein-protein and protein-DNA interactions underlie the spatial genome organization in eukaryotes. However, it is becoming increasingly evident that weak interactions, which exist in biological systems, also contribute to the 3D genome. Here, we provide a snapshot of our current understanding of the role of the weak interactions in the establishment and maintenance of the 3D genome organization. We discuss how weak biological forces, such as entropic forces operating in crowded solutions, electrostatic interactions of the biomolecules, liquid-liquid phase separation, DNA supercoiling, and RNA environment participate in chromosome segregation into structural and functional units and drive intranuclear functional compartmentalization.
Affiliation(s)
- Omar L Kantidze
- Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
| | - Sergey V Razin
- Institute of Gene Biology Russian Academy of Sciences, 119334 Moscow, Russia
| |
|
31
|
Stenström L, Mahdessian D, Gnann C, Cesnik AJ, Ouyang W, Leonetti MD, Uhlén M, Cuylen‐Haering S, Thul PJ, Lundberg E. Mapping the nucleolar proteome reveals a spatiotemporal organization related to intrinsic protein disorder. Mol Syst Biol 2020; 16:e9469. [PMID: 32744794 PMCID: PMC7397901 DOI: 10.15252/msb.20209469] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 01/23/2020] [Revised: 07/02/2020] [Accepted: 07/03/2020] [Indexed: 01/01/2023]
Abstract
The nucleolus is essential for ribosome biogenesis and is involved in many other cellular functions. We performed a systematic spatiotemporal dissection of the human nucleolar proteome using confocal microscopy. In total, 1,318 nucleolar proteins were identified; 287 were localized to fibrillar components, and 157 were enriched along the nucleoplasmic border, indicating a potential fourth nucleolar subcompartment: the nucleoli rim. We found 65 nucleolar proteins (36 uncharacterized) to relocate to the chromosomal periphery during mitosis. Interestingly, we observed temporal partitioning into two recruitment phenotypes: early (prometaphase) and late (after metaphase), suggesting phase-specific functions. We further show that the expression of MKI67 is critical for this temporal partitioning. We provide the first proteome-wide analysis of intrinsic protein disorder for the human nucleolus and show that nucleolar proteins in general, and mitotic chromosome proteins in particular, have significantly higher intrinsic disorder level compared to cytosolic proteins. In summary, this study provides a comprehensive and essential resource of spatiotemporal expression data for the nucleolar proteome as part of the Human Protein Atlas.
Affiliation(s)
- Lovisa Stenström
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Diana Mahdessian
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Christian Gnann
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Anthony J Cesnik
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Wei Ouyang
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
| | | | - Mathias Uhlén
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Sara Cuylen‐Haering
- Cell Biology and Biophysics Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Peter J Thul
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Emma Lundberg
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| |
|
32
|
Zhang J, Kurgan L. SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences. Bioinformatics 2020; 35:i343-i353. [PMID: 31510679 PMCID: PMC6612887 DOI: 10.1093/bioinformatics/btz324] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Indexed: 01/11/2023]
Abstract
Motivation: Accurate predictions of protein-binding residues (PBRs) enhance understanding of molecular-level rules governing protein–protein interactions, help protein–protein docking and facilitate annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of protein partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results: We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation: SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang, China; Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| |
|
33
|
Drake JA, Pettitt BM. Physical Chemistry of the Protein Backbone: Enabling the Mechanisms of Intrinsic Protein Disorder. J Phys Chem B 2020; 124:4379-4390. [PMID: 32349480 PMCID: PMC7384255 DOI: 10.1021/acs.jpcb.0c02489] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
Over the last two decades it has become clear that well-defined structure is not a requisite for proteins to properly function. Rather, spectra of functionally competent, structurally disordered states have been uncovered requiring canonical paradigms in molecular biology to be revisited or reimagined. It is enticing and oftentimes practical to divide the proteome into structured and unstructured, or disordered, proteins. While function, composition, and structural properties largely differ, these two classes of protein are built upon the same scaffold, namely, the protein backbone. The versatile physicochemical properties of the protein backbone must accommodate structural disorder, order, and transitions between these states. In this review, we survey these properties through the conceptual lenses of solubility and conformational populations and in the context of protein-disorder mediated phenomena (e.g., phase separation, order-disorder transitions, allostery). Particular attention is paid to the results of computational studies, which, through thermodynamic decomposition and dissection of molecular interactions, can provide valuable mechanistic insight and testable hypotheses to guide further solution experiments. Lastly, we discuss changes in the dynamics of side chains and order-disorder transitions of the protein backbone as two modes or realizations of "entropic reservoirs" capable of tuning coupled thermodynamic processes.
Affiliation(s)
- Justin A Drake
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston 77555, Texas, United States
- Texas Advanced Computing Center, University of Texas at Austin, Austin 78712, Texas, United States
| | - B Montgomery Pettitt
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston 77555, Texas, United States
| |
|
34
|
Razin SV, Gavrilov AA. The Role of Liquid–Liquid Phase Separation in the Compartmentalization of Cell Nucleus and Spatial Genome Organization. Biochemistry (Moscow) 2020; 85:643-650. [DOI: 10.1134/s0006297920060012] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/26/2022]
|
35
|
Yan J, Cheng J, Kurgan L, Uversky VN. Structural and functional analysis of "non-smelly" proteins. Cell Mol Life Sci 2020; 77:2423-2440. [PMID: 31486849 PMCID: PMC11105052 DOI: 10.1007/s00018-019-03292-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/29/2019] [Revised: 08/21/2019] [Accepted: 08/28/2019] [Indexed: 01/09/2023]
Abstract
Cysteine and aromatic residues are major structure-promoting residues. We assessed the abundance, structural coverage, and functional characteristics of the "non-smelly" proteins, i.e., proteins that do not contain cysteine residues (C-depleted) or cysteine and aromatic residues (CFYWH-depleted), across 817 proteomes from all domains of life. The analysis revealed that although these proteomes contained significant levels of the C-depleted proteins, with prokaryotes being significantly more enriched in such proteins than eukaryotes, the CFYWH-depleted proteins were relatively rare, accounting for about 0.05% of proteomes. Furthermore, CFYWH-depleted proteins were virtually never found in PDB. Depletion in cysteine and in aromatic residues was associated with the substantially increased intrinsic disorder levels across all domains of life. Archaeal and eukaryotic organisms with higher levels of the C-depleted proteins were shown to have higher levels of the intrinsic disorder and lower levels of structural coverage. We also showed that the "non-smelly" proteins typically did not independently fold into monomeric structures, and instead, they fold by interacting with nucleic acids as constituents of the ribosome and nucleosome complexes. They were shown to be involved in translation, transcription, nucleosome assembly, transmembrane transport, and protein folding functions, all of which are known to be associated with the intrinsic disorder. Our data suggested that, in general, structure of monomeric proteins is crucially dependent on the presence of cysteine and aromatic residues.
Affiliation(s)
- Jing Yan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA.
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd., MDC07, Tampa, FL, 33612, USA.
- Protein Research Group, Institute for Biological Instrumentation of the Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia.
| |
|
36
|
Hu G, Wu Z, Oldfield CJ, Wang C, Kurgan L. Quality assessment for the putative intrinsic disorder in proteins. Bioinformatics 2020; 35:1692-1700. [PMID: 30329008 DOI: 10.1093/bioinformatics/bty881] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/07/2018] [Revised: 09/19/2018] [Accepted: 10/15/2018] [Indexed: 11/15/2022]
Abstract
MOTIVATION: While putative intrinsic disorder is widely used, none of the predictors provides quality assessment (QA) scores. QA scores estimate the likelihood that predictions are correct at a residue level and have been applied in other bioinformatics areas. We recently reported that QA scores derived from putative disorder propensities perform relatively poorly for native disordered residues. Here we design and validate a general approach to construct QA predictors for disorder predictions. RESULTS: The QUARTER (QUality Assessment for pRotein inTrinsic disordEr pRedictions) toolbox of methods accommodates a diverse set of ten disorder predictors. It builds upon several innovative design elements including use and scaling of selected physicochemical properties of the input sequence, post-processing of disorder propensity scores, and a feature selection that optimizes the predictive models to a specific disorder predictor. We empirically establish that each one of these elements contributes to the overall predictive performance of our tool and that QUARTER's outputs significantly outperform QA scores derived from the outputs generated by the disorder predictors. The best performing QA scores for a single disorder predictor identify 13% of residues that are predicted with 98% precision. QA scores computed by combining results of the ten disorder predictors cover 40% of residues with 95% precision. Case studies are used to show how to interpret the QA scores. QA scores based on the high precision combined predictions are applied to analyze disorder in the human proteome. AVAILABILITY AND IMPLEMENTATION: http://biomine.cs.vcu.edu/servers/QUARTER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
| | | | - Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| |
|
37
|
Oldfield CJ, Fan X, Wang C, Dunker AK, Kurgan L. Computational Prediction of Intrinsic Disorder in Protein Sequences with the disCoP Meta-predictor. Methods Mol Biol 2020; 2141:21-35. [PMID: 32696351 DOI: 10.1007/978-1-0716-0524-0_2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins are either entirely disordered or contain disordered regions in their native state. These proteins and regions function without the prerequisite of a stable structure and were found to be abundant across all kingdoms of life. Experimental annotation of disorder lags behind the rapidly growing number of sequenced proteins, motivating the development of computational methods that predict disorder in protein sequences. DisCoP is a user-friendly webserver that provides accurate sequence-based prediction of protein disorder. It relies on meta-architecture in which the outputs generated by multiple disorder predictors are combined together to improve predictive performance. The architecture of disCoP is presented, and its accuracy relative to several other disorder predictors is briefly discussed. We describe usage of the web interface and explain how to access and read results generated by this computational tool. We also provide an example of prediction results and interpretation. The disCoP's webserver is publicly available at http://biomine.cs.vcu.edu/servers/disCoP/ .
Affiliation(s)
| | - Xiao Fan
- Department of Pediatrics, Columbia University, New York, NY, USA
| | - Chen Wang
- Department of Medicine, Columbia University, New York, NY, USA
| | - A Keith Dunker
- Department of Biochemistry and Molecular Biology, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
| |
|
38
|
Oldfield CJ, Peng Z, Uversky VN, Kurgan L. Codon selection reduces GC content bias in nucleic acids encoding for intrinsically disordered proteins. Cell Mol Life Sci 2020; 77:149-160. [PMID: 31175370 PMCID: PMC11104855 DOI: 10.1007/s00018-019-03166-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/03/2019] [Revised: 05/14/2019] [Accepted: 05/28/2019] [Indexed: 02/06/2023]
Abstract
Protein-coding nucleic acids exhibit composition and codon biases between sequences coding for intrinsically disordered regions (IDRs) and those coding for structured regions. IDRs are regions of proteins that are folding self-insufficient and which function without the prerequisite of folded structure. Several authors have investigated composition bias or codon selection in regions encoding for IDRs, primarily in Eukaryota, and concluded that elevated GC content is the result of the biased amino acid composition of IDRs. We substantively extend previous work by examining GC content in regions encoding IDRs, from 44 species in Eukaryota, Archaea, and Bacteria, spanning a wide range of GC content. We confirm that regions coding for IDRs show a significantly elevated GC content, even across all domains of life. Although this is largely attributable to the amino acid composition bias of IDRs, we show that this bias is independent of the overall GC content and, most importantly, we are the first to observe that GC content bias in IDRs is significantly different than expected from IDR amino acid composition alone. We empirically find compensatory codon selection that reduces the observed GC content bias in IDRs. This selection is dependent on the overall GC content of the organism. The codon selection bias manifests as use of infrequent, AT-rich codons in encoding IDRs. Further, we find these relationships to be independent of the intrinsic disorder prediction method used, and independent of estimated translation efficiency. These observations are consistent with the previous work, and we speculate on whether the observed biases are causal or symptomatic of other driving forces.
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA.
| | - Zhenling Peng
- Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA.
| |
|
39
|
Abstract
Intrinsically disordered regions (IDRs) are estimated to be highly abundant in nature. While only several thousand proteins are annotated with experimentally derived IDRs, computational methods can be used to predict IDRs for the millions of currently uncharacterized protein chains. Several dozen disorder predictors were developed over the last few decades. While some of these methods provide accurate predictions, unavoidably they also make some mistakes. Consequently, one of the challenges facing users of these methods is how to decide which predictions can be trusted and which are likely incorrect. This practical problem can be solved using quality assessment (QA) scores that predict correctness of the underlying (disorder) predictions at a residue level. We motivate and describe a first-of-its-kind toolbox of QA methods, QUARTER (QUality Assessment for pRotein inTrinsic disordEr pRedictions), which provides the scores for a diverse set of ten disorder predictors. QUARTER is available to the end users as a free and convenient webserver at http://biomine.cs.vcu.edu/servers/QUARTER/ . We briefly describe the predictive architecture of QUARTER and provide detailed instructions on how to use the webserver. We also explain how to interpret results produced by QUARTER with the help of a case study.
|
40
|
Ghadermarzi S, Li X, Li M, Kurgan L. Sequence-Derived Markers of Drug Targets and Potentially Druggable Human Proteins. Front Genet 2019; 10:1075. [PMID: 31803227 PMCID: PMC6872670 DOI: 10.3389/fgene.2019.01075] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/30/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022]
Abstract
Recent research shows that the majority of the druggable human proteome is yet to be annotated and explored. Accurate identification of these unexplored druggable proteins would facilitate development, screening, repurposing, and repositioning of drugs, as well as prediction of new drug–protein interactions. We contrast the current drug targets against the datasets of non-druggable and possibly druggable proteins to formulate markers that could be used to identify druggable proteins. We focus on the markers that can be extracted from protein sequences or names/identifiers to ensure that they can be applied across the entire human proteome. These markers quantify key features covered in past works (topological features of PPIs, cellular functions, and subcellular locations) and several novel factors (intrinsic disorder, residue-level conservation, alternative splicing isoforms, domains, and sequence-derived solvent accessibility). We find that the possibly druggable proteins have a significantly higher abundance of alternative splicing isoforms, a relatively large number of domains, a higher degree of centrality in the protein-protein interaction networks, and lower numbers of conserved and surface residues, when compared with the non-druggable proteins. We show that the current drug targets and possibly druggable proteins share involvement in the catalytic and signaling functions. However, unlike the drug targets, the possibly druggable proteins participate in the metabolic and biosynthesis processes, are enriched in intrinsic disorder, interact with proteins and nucleic acids, and are localized across the cell. To sum up, we formulate several markers that can help with finding novel druggable human proteins and provide interesting insights into the cellular functions and subcellular locations of the current drug targets and potentially druggable proteins.
Affiliation(s)
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Xingyi Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| |
|
41
|
Peng A, Weber SC. Evidence for and against Liquid-Liquid Phase Separation in the Nucleus. Noncoding RNA 2019; 5:E50. [PMID: 31683819 PMCID: PMC6958436 DOI: 10.3390/ncrna5040050] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Received: 10/03/2019] [Revised: 10/23/2019] [Accepted: 10/29/2019] [Indexed: 12/11/2022]
Abstract
Enclosed by two membranes, the nucleus itself is comprised of various membraneless compartments, including nuclear bodies and chromatin domains. These compartments play an important though still poorly understood role in gene regulation. Significant progress has been made in characterizing the dynamic behavior of nuclear compartments and liquid-liquid phase separation (LLPS) has emerged as a prominent mechanism governing their assembly. However, recent work reveals that certain nuclear structures violate key predictions of LLPS, suggesting that alternative mechanisms likely contribute to nuclear organization. Here, we review the evidence for and against LLPS for several nuclear compartments and discuss experimental strategies to identify the mechanism(s) underlying their assembly. We propose that LLPS, together with multiple modes of protein-nucleic acid binding, drive spatiotemporal organization of the nucleus and facilitate functional diversity among nuclear compartments.
Affiliation(s)
- Peng A
- Department of Biology, McGill University, Montreal, QC H3A 1B1, Canada.
| | - Stephanie C Weber
- Department of Biology, McGill University, Montreal, QC H3A 1B1, Canada.
- Department of Physics, McGill University, Montreal, QC H3A 2T8, Canada.
| |
|
42
|
Katuwawala A, Oldfield CJ, Kurgan L. Accuracy of protein-level disorder predictions. Brief Bioinform 2019; 21:1509-1522. [DOI: 10.1093/bib/bbz100] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 05/06/2019] [Revised: 06/22/2019] [Accepted: 07/15/2019] [Indexed: 01/15/2023]
Abstract
Experimental annotations of intrinsic disorder are available for 0.1% of the 147 000 000 currently sequenced proteins. Over 60 sequence-based disorder predictors were developed to help bridge this gap. Current benchmarks of these methods assess predictive performance on datasets of proteins; however, predictions are often interpreted for individual proteins. We demonstrate that the protein-level predictive performance varies substantially from the dataset-level benchmarks. Thus, we perform a first-of-its-kind protein-level assessment for 13 popular disorder predictors using 6200 disorder-annotated proteins. We show that the protein-level distributions are substantially skewed toward high predictive quality while having long tails of poor predictions. Consequently, between 57% and 75% of proteins secure higher predictive performance than the currently used dataset-level assessment suggests, but as many as 30% of proteins, those located in the long tails, suffer low predictive performance. These proteins typically have relatively high amounts of disorder, in contrast to the mostly structured proteins that are predicted accurately by all 13 methods. Interestingly, each predictor provides the most accurate results for some number of proteins, while the method that performs best at the dataset level is in fact the best for only about 30% of proteins. Moreover, the majority of proteins are predicted more accurately than the dataset-level performance of the most accurate tool by at least four disorder predictors. While these results suggest that disorder predictors outperform their current benchmark performance for the majority of proteins and that they complement each other, novel tools that accurately identify the hard-to-predict proteins and that make accurate predictions for these proteins are needed.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, USA
| | - Christopher J Oldfield
- Department of Computer Science, Virginia Commonwealth University, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, USA
| |
|
43
|
Supramolecular Fuzziness of Intracellular Liquid Droplets: Liquid-Liquid Phase Transitions, Membrane-Less Organelles, and Intrinsic Disorder. Molecules 2019; 24:molecules24183265. [PMID: 31500307 PMCID: PMC6767272 DOI: 10.3390/molecules24183265] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 08/05/2019] [Revised: 08/29/2019] [Accepted: 09/06/2019] [Indexed: 12/14/2022]
Abstract
Cells are inhomogeneously crowded, possessing a wide range of intracellular liquid droplets abundantly present in the cytoplasm of eukaryotic and bacterial cells, in the mitochondrial matrix and nucleoplasm of eukaryotes, and in the chloroplast’s stroma of plant cells. These proteinaceous membrane-less organelles (PMLOs) not only represent a natural method of intracellular compartmentalization, which is crucial for successful execution of various biological functions, but also serve as important means for the processing of local information and rapid response to the fluctuations in environmental conditions. Since PMLOs, being complex macromolecular assemblages, possess many characteristic features of liquids, they represent highly dynamic (or fuzzy) protein–protein and/or protein–nucleic acid complexes. The biogenesis of PMLOs is controlled by specific intrinsically disordered proteins (IDPs) and hybrid proteins with ordered domains and intrinsically disordered protein regions (IDPRs), which, due to their highly dynamic structures and ability to facilitate multivalent interactions, serve as indispensable drivers of the biological liquid–liquid phase transitions (LLPTs) giving rise to PMLOs. In this article, the importance of the disorder-based supramolecular fuzziness for LLPTs and PMLO biogenesis is discussed.
|
44
|
Liu Y, Wang X, Liu B. A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Brief Bioinform 2019; 20:330-346. [PMID: 30657889 DOI: 10.1093/bib/bbx126] [Citation(s) in RCA: 94] [Impact Index Per Article: 18.8] [Received: 07/19/2017] [Indexed: 01/06/2023]
Abstract
Intrinsically disordered proteins and regions are widely distributed in proteins, which are associated with many biological processes and diseases. Accurate prediction of intrinsically disordered proteins and regions is critical for both basic research (such as protein structure and function prediction) and practical applications (such as drug development). During the past decades, many computational approaches have been proposed, which have greatly facilitated the development of this important field. Therefore, a comprehensive and updated review is highly required. In this regard, we give a review on the computational methods for intrinsically disordered protein and region prediction, especially focusing on the recent development in this field. These computational approaches are divided into four categories based on their methodologies, including physicochemical-based method, machine-learning-based method, template-based method and meta method. Furthermore, their advantages and disadvantages are also discussed. The performance of 40 state-of-the-art predictors is directly compared on the target proteins in the task of disordered region prediction in the 10th Critical Assessment of protein Structure Prediction. A more comprehensive performance comparison of 45 different predictors is conducted based on seven widely used benchmark data sets. Finally, some open problems and perspectives are discussed.
Affiliation(s)
- Yumeng Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| | - Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
| |
|
45
|
N-terminal sequences in matrin 3 mediate phase separation into droplet-like structures that recruit TDP43 variants lacking RNA binding elements. J Transl Med 2019; 99:1030-1040. [PMID: 31019288 PMCID: PMC6857798 DOI: 10.1038/s41374-019-0260-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/15/2018] [Revised: 03/29/2019] [Accepted: 04/09/2019] [Indexed: 12/12/2022]
Abstract
RNA binding proteins associated with amyotrophic lateral sclerosis (ALS) and muscle myopathy possess sequence elements that are low in complexity, or bear resemblance to yeast prion domains. These sequence elements appear to mediate phase separation into liquid-like membraneless organelles. Using fusion proteins of matrin 3 (MATR3) to yellow fluorescent protein (YFP), we recently observed that deletion of the second RNA recognition motif (RRM2) caused the protein to phase separate and form intranuclear liquid-like droplets. Here, we use fusion constructs of MATR3, TARDBP43 (TDP43) and FUS with YFP or mCherry to examine phase separation and protein colocalization in mouse C2C12 myoblast cells. We observed that the N-terminal 397 amino acids of MATR3 (tagged with a nuclear localization signal and expressed as a fusion protein with YFP) formed droplet-like structures within nuclei. Introduction of the myopathic S85C mutation into NLS-N397 MATR3:YFP, but not ALS mutations F115C or P154S, inhibited droplet formation. Further, we analyzed interactions between variants of MATR3 lacking RRM2 (ΔRRM2) and variants of TDP43 with disabling mutations in its RRM1 domain (deletion or mutation). We observed that MATR3:YFP ΔRRM2 formed droplets that appeared to recruit the TDP43 RRM1 mutants. Further, coexpression of the NLS-N397 MATR3:YFP construct with a construct that encodes the prion-like domain of TDP43 produced intranuclear droplet-like structures containing both proteins. Collectively, our studies show that N-terminal sequences in MATR3 can mediate phase separation into intranuclear droplet-like structures that can recruit TDP43 under conditions of low RNA binding.
|
46
|
Palumbo E, Zhao B, Xue B, Uversky VN, Davé V. Analyzing aggregation propensities of clinically relevant PTEN mutants: a new culprit in pathogenesis of cancer and other PTENopathies. J Biomol Struct Dyn 2019; 38:2253-2266. [PMID: 31232187 DOI: 10.1080/07391102.2019.1630005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
Abstract
While studies on pathological protein aggregation are largely limited to neurodegenerative disease, emerging evidence suggests that other diseases are also associated with pathogenic protein aggregation. For example, tumor suppressor protein p53, and its mutant conformers, undergo protein aggregation, exacerbating the cancer phenotype. These findings raise the possibility that inactivation of tumor suppressors via protein aggregation may participate in cancer and other disease pathologies. Since tumor suppressor protein PTEN has similar functions to p53, and is mutated in multiple diseases, we examined the aggregation propensity of PTEN wild-type and 1523 clinically relevant PTEN mutants. Applying computational tools to PTEN mutation databases revealed that PTEN wild-type protein can aggregate under physiological conditions, and 274 distinct PTEN mutants had increased aggregation propensity. To understand the mechanism underlying PTEN conformer aggregation, we analyzed the physicochemical properties of these 274 PTEN mutants and defined their aggregation potential. We conclude that increased aggregation propensity of select PTEN mutants may contribute to disease phenotypes. Our studies have built the foundation for interrogating the aggregation potential of these select mutants in cancers and in PTENopathies. Elucidating the pathogenic mechanisms associated with aggregation-prone PTEN conformers will aid in developing therapies that target PTEN-aggregates in multiple diseases. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Emily Palumbo
- Department of Pathology and Cell Biology, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Bi Zhao
- Department of Cell Biology, Microbiology and Molecular Biology, University of South Florida, Tampa, FL, USA
| | - Bin Xue
- Department of Cell Biology, Microbiology and Molecular Biology, University of South Florida, Tampa, FL, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, Byrd Alzheimer's Institute, University of South Florida, Tampa, FL, USA; Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Vrushank Davé
- Department of Pathology and Cell Biology, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| |
|
47
|
Iarovaia OV, Minina EP, Sheval EV, Onichtchouk D, Dokudovskaya S, Razin SV, Vassetzky YS. Nucleolus: A Central Hub for Nuclear Functions. Trends Cell Biol 2019; 29:647-659. [PMID: 31176528 DOI: 10.1016/j.tcb.2019.04.003] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 12/17/2018] [Revised: 04/24/2019] [Accepted: 04/26/2019] [Indexed: 12/19/2022]
Abstract
The nucleolus is the largest and most studied nuclear body, but its role in nuclear function is far from being comprehensively understood. Much work on the nucleolus has focused on its role in regulating RNA polymerase I (RNA Pol I) transcription and ribosome biogenesis; however, emerging evidence points to the nucleolus as an organizing hub for many nuclear functions, accomplished via the shuttling of proteins and nucleic acids between the nucleolus and nucleoplasm. Here, we discuss the cellular mechanisms affected by shuttling of nucleolar components, including the 3D organization of the genome, stress response, DNA repair and recombination, transcription regulation, telomere maintenance, and other essential cellular functions.
Affiliation(s)
- Olga V Iarovaia
- Institute of Gene Biology of the Russian Academy of Sciences, 119334 Moscow, Russia; LIA 1066 LFR2O French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France
| | - Elizaveta P Minina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119991 Moscow, Russia
| | - Eugene V Sheval
- LIA 1066 LFR2O French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119991 Moscow, Russia; Faculty of Biology, Lomonosov Moscow State University, 119991 Moscow, Russia
| | - Daria Onichtchouk
- Developmental Biology Unit, Department of Biology I, University of Freiburg, Hauptstrasse 1, D-79104 Freiburg, Germany
| | - Svetlana Dokudovskaya
- LIA 1066 LFR2O French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; UMR8126, Université Paris-Sud, CNRS, Institut Gustave Roussy, 94805 Villejuif, France
| | - Sergey V Razin
- Institute of Gene Biology of the Russian Academy of Sciences, 119334 Moscow, Russia; LIA 1066 LFR2O French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; Faculty of Biology, Lomonosov Moscow State University, 119991 Moscow, Russia
| | - Yegor S Vassetzky
- LIA 1066 LFR2O French-Russian Joint Cancer Research Laboratory, 94805 Villejuif, France; Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, 119334 Moscow, Russia; UMR8126, Université Paris-Sud, CNRS, Institut Gustave Roussy, 94805 Villejuif, France.
| |
|
48
|
Intrinsic Disorder-Based Emergence in Cellular Biology: Physiological and Pathological Liquid-Liquid Phase Transitions in Cells. Polymers (Basel) 2019; 11:polym11060990. [PMID: 31167414 PMCID: PMC6631845 DOI: 10.3390/polym11060990] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 05/13/2019] [Revised: 05/29/2019] [Accepted: 05/31/2019] [Indexed: 12/14/2022]
Abstract
The visible outcome of liquid-liquid phase transitions (LLPTs) in cells is the formation and disintegration of various proteinaceous membrane-less organelles (PMLOs). Although LLPTs and related PMLOs have been observed in living cells for over 200 years, the physiological functions of these transitions (also known as liquid-liquid phase separation, LLPS) are just starting to be understood. While unveiling the functionality of these transitions is important, they have come to light more recently due to the association of abnormal LLPTs with various pathological conditions. In fact, several maladies, such as various cancers, different neurodegenerative diseases, and cardiovascular diseases, are known to be associated with either aberrant LLPTs or some pathological transformations within the resultant PMLOs. Here, we will highlight both the physiological functions of cellular liquid-liquid phase transitions and the pathological consequences produced through both dysregulated biogenesis of PMLOs and the loss of their dynamics. We will also discuss the potential downstream toxic effects of proteins that are involved in pathological formations.
|
49
|
Katuwawala A, Ghadermarzi S, Kurgan L. Computational prediction of functions of intrinsically disordered regions. Progress in Molecular Biology and Translational Science 2019; 166:341-369. [PMID: 31521235 DOI: 10.1016/bs.pmbts.2019.04.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/20/2022]
Abstract
Intrinsically disordered regions (IDRs) are abundant in nature, particularly among Eukaryotes. While they facilitate a wide spectrum of cellular functions including signaling, molecular assembly and recognition, translation, transcription and regulation, only several hundred IDRs are annotated functionally. This annotation gap motivates the development of fast and accurate computational methods that predict IDR functions directly from protein sequences. We introduce and describe a comprehensive collection of 25 methods that provide accurate predictions of IDRs that interact with proteins and nucleic acids, that function as flexible linkers and that moonlight multiple functions. Virtually all of these predictors can be accessed online and many were developed in the last few years. They utilize a wide range of predictive architectures and take advantage of modern machine learning algorithms. Our empirical analysis shows that predictors that are available as webservers enjoy high rates of citations, attesting to their practical value and popularity. The most cited methods include DISOPRED3, ANCHOR, alpha-MoRFpred, MoRFpred, fMoRFpred and MoRFCHiBi. We present two case studies to demonstrate that predictions produced by these computational tools are relatively easy to interpret and that they deliver valuable functional clues. However, the current computational tools cover a relatively narrow range of disorder functions. Further development efforts that would cover a broader range of functions should be pursued. We demonstrate that a sufficient number of functionally annotated IDRs that are associated with several other disorder functions is already available and can be used to design and validate novel predictors.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States.
| |
|
50
|
Bentley EP, Frey BB, Deniz AA. Physical Chemistry of Cellular Liquid-Phase Separation. Chemistry 2019; 25:5600-5610. [PMID: 30589142 PMCID: PMC6551525 DOI: 10.1002/chem.201805093] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 10/09/2018] [Revised: 12/11/2018] [Indexed: 01/05/2023]
Abstract
Compartmentalization of biochemical processes is essential for cell function. Although membrane-bound organelles are well studied in this context, recent work has shown that phase separation is a key contributor to cellular compartmentalization through the formation of liquid-like membraneless organelles (MLOs). In this Minireview, the key mechanistic concepts that underlie MLO dynamics and function are first briefly discussed, including the relevant noncovalent interaction chemistry and polymer physical chemistry. Next, a few examples of MLOs and relevant proteins are given, along with their functions, which highlight the relevance of the above concepts. The developing area of active matter and non-equilibrium systems, which can give rise to unexpected effects in fluctuating cellular conditions, is also discussed. Finally, our thoughts on emerging and future directions in the field are discussed, including in vitro and in vivo studies of MLO physical chemistry and function.
Affiliation(s)
- Emily P Bentley
- The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA, 92037, USA
| | - Benjamin B Frey
- The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA, 92037, USA
| | - Ashok A Deniz
- The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA, 92037, USA
| |
|