1
Mier P, Paladin L, Tamana S, Petrosian S, Hajdu-Soltész B, Urbanek A, Gruca A, Plewczynski D, Grynberg M, Bernadó P, Gáspári Z, Ouzounis CA, Promponas VJ, Kajava AV, Hancock JM, Tosatto SCE, Dosztanyi Z, Andrade-Navarro MA. Disentangling the complexity of low complexity proteins. Brief Bioinform 2021; 21:458-472. [PMID: 30698641 PMCID: PMC7299295 DOI: 10.1093/bib/bbz007] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 11/12/2018] [Revised: 12/19/2018] [Accepted: 01/07/2019] [Indexed: 12/31/2022]
Abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
Short abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
Affiliation(s)
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
- Lisanna Paladin
  - Department of Biomedical Science, University of Padova, Padova, Italy
- Stella Tamana
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Sophia Petrosian
  - Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
- Borbála Hajdu-Soltész
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Annika Urbanek
  - Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
- Aleksandra Gruca
  - Institute of Informatics, Silesian University of Technology, Gliwice, Poland
- Dariusz Plewczynski
  - Center of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Pau Bernadó
  - Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
- Zoltán Gáspári
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Christos A Ouzounis
  - Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
- Vasilis J Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Andrey V Kajava
  - Centre de Recherche en Biologie Cellulaire de Montpellier, CNRS-UMR, Institut de Biologie Computationnelle, Université de Montpellier, Montpellier, France
  - Institute of Bioengineering, University ITMO, St. Petersburg, Russia
- John M Hancock
  - Earlham Institute, Norwich, UK
  - ELIXIR Hub, Wellcome Genome Campus, Hinxton, UK
- Silvio C E Tosatto
  - Department of Biomedical Science, University of Padova, Padova, Italy
  - CNR Institute of Neuroscience, Padova, Italy
- Zsuzsanna Dosztanyi
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
2
Kiss-Tóth A, Dobson L, Péterfia B, Ángyán AF, Ligeti B, Lukács G, Gáspári Z. Occurrence of Ordered and Disordered Structural Elements in Postsynaptic Proteins Supports Optimization for Interaction Diversity. Entropy (Basel) 2019; 21:E761. [PMID: 33267475 PMCID: PMC7515291 DOI: 10.3390/e21080761] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/10/2019] [Revised: 07/30/2019] [Accepted: 08/02/2019] [Indexed: 12/15/2022]
Abstract
The human postsynaptic density is an elaborate network comprising thousands of proteins, playing a vital role in the molecular events of learning and the formation of memory. Despite our growing knowledge of specific proteins and their interactions, atomic-level details of their full three-dimensional structure and their rearrangements are mostly elusive. Advancements in structural bioinformatics enabled us to depict the characteristic features of proteins involved in different processes aiding neurotransmission. We show that postsynaptic protein-protein interactions are mediated through the delicate balance of intrinsically disordered regions and folded domains, and this duality is also imprinted in the amino acid sequence. We introduce Diversity of Potential Interactions (DPI), a structure and regulation based descriptor to assess the diversity of interactions. Our approach reveals that the postsynaptic proteome has its own characteristic features and these properties reliably discriminate them from other proteins of the human proteome. Our results suggest that postsynaptic proteins are especially susceptible to forming diverse interactions with each other, which might be key in the reorganization of the postsynaptic density (PSD) in molecular processes related to learning and memory.
Affiliation(s)
- Annamária Kiss-Tóth
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
  - 3in-PPCU Research Group, 2500 Esztergom, Hungary
- Laszlo Dobson
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
- Bálint Péterfia
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
- Annamária F. Ángyán
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
- Balázs Ligeti
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
- Gergely Lukács
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
- Zoltán Gáspári
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50A, 1083 Budapest, Hungary
3
A Comprehensive Survey of the Roles of Highly Disordered Proteins in Type 2 Diabetes. Int J Mol Sci 2017; 18:ijms18102010. [PMID: 28934129 PMCID: PMC5666700 DOI: 10.3390/ijms18102010] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 07/14/2017] [Revised: 09/04/2017] [Accepted: 09/12/2017] [Indexed: 01/03/2023]
Abstract
Type 2 diabetes mellitus (T2DM) is a chronic and progressive disease that is strongly associated with hyperglycemia (high blood sugar) related to either insulin resistance or insufficient insulin production. Among the various molecular events and players implicated in the manifestation and development of diabetes mellitus, proteins play several important roles. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database has information on 34 human proteins experimentally shown to be related to the T2DM pathogenesis. It is known that many proteins associated with different human maladies are intrinsically disordered as a whole, or contain intrinsically disordered regions. The presented study shows that T2DM is not an exception to this rule, and many proteins known to be associated with pathogenesis of this malady are intrinsically disordered. The multiparametric bioinformatics analysis utilizing several computational tools for the intrinsic disorder characterization revealed that IRS1, IRS2, IRS4, MAFA, PDX1, ADIPO, PIK3R2, PIK3R5, SoCS1, and SoCS3 are expected to be highly disordered, whereas VDCC, SoCS2, SoCS4, JNK9, PRKCZ, PRKCE, insulin, GCK, JNK8, JNK10, PYK, INSR, TNF-α, MAPK3, and Kir6.2 are classified as moderately disordered proteins, and GLUT2, GLUT4, mTOR, SUR1, MAPK1, IKKA, PRKCD, PIK3CB, and PIK3CA are predicted as mostly ordered. More focused computational analyses and intensive literature mining were conducted for a set of highly disordered proteins related to T2DM. The resulting work represents a comprehensive survey describing the major biological functions of these proteins and functional roles of their intrinsically disordered regions, which are frequently engaged in protein–protein interactions, and contain sites of various posttranslational modifications (PTMs). It is also shown that intrinsic disorder-associated PTMs may play important roles in controlling the functions of these proteins. Consideration of the T2DM proteins from the perspective of intrinsic disorder provides useful information that can potentially lead to future experimental studies that may uncover latent and novel pathways associated with the disease.
4
Vajda T, Perczel A. The clear and dark sides of water: influence on the coiled coil folding domain. Biomol Concepts 2016; 7:189-95. [PMID: 27180359 DOI: 10.1515/bmc-2016-0005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/16/2016] [Accepted: 03/29/2016] [Indexed: 11/15/2022]
Abstract
The essential role of water in extra- and intracellular coiled coil structures of proteins is critically evaluated, and the different protein types incorporating coiled coil units are overviewed. The following subjects are discussed: (i) the influence of water on the formation and degradation of the coiled coil domain, together with the stability of this conformer type; (ii) the water's paradox; (iii) the design of coiled coil motifs; and (iv) expert opinion and outlook. The clear and dark sides refer to the positive and negative aspects of the water molecule, as it may enhance or inhibit a given folding event. This duplicity can be symbolized by the Roman 'Janus face': water may facilitate and stimulate coiled coil structure formation, yet it may also contribute to the fatal processes of oligomerization and amyloidosis of the very same polypeptide chain.