1
Majila K, Viswanath S. StrIDR: a database of intrinsically disordered regions of proteins with experimentally resolved structures. bioRxiv 2024:2024.08.22.609111. [PMID: 39253485 PMCID: PMC11382991 DOI: 10.1101/2024.08.22.609111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Motivation: Intrinsically disordered regions (IDRs) of proteins exist as an ensemble of conformations, and not as a single structure. Existing databases contain extensive, experimentally derived annotations of intrinsic disorder for millions of proteins at the sequence level. However, only a tiny fraction of these IDRs are associated with an experimentally determined protein structure. Moreover, even if a structure exists, parts of the disordered regions may still be unresolved. Results: Here we organize Structures of Intrinsically Disordered Regions (StrIDR), a database of IDRs confirmed via experimental or homology-based evidence, resolved in experimentally determined structures. The database can provide useful insights into the dynamics, folding, and interactions of IDRs. It can also facilitate computational studies on IDRs, such as those using molecular dynamics simulations and/or machine learning. Availability: StrIDR is available at https://isblab.ncbs.res.in/stridr. The web UI allows for downloading PDB structures and SIFTS mappings of individual entries. Additionally, the entire database can be downloaded in a JSON format. The source code for creating and updating the database is available at https://github.com/isblab/stridr.
Affiliation(s)
- Kartik Majila
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
- Shruthi Viswanath
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
2
Agarwal A, Bahadur RP. Modular architecture and functional annotation of human RNA-binding proteins containing RNA recognition motif. Biochimie 2023; 209:116-130. [PMID: 36716848 DOI: 10.1016/j.biochi.2023.01.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/30/2022] [Revised: 01/09/2023] [Accepted: 01/23/2023] [Indexed: 01/28/2023]
Abstract
RNA-binding proteins (RBPs) are structurally and functionally diverse macromolecules with significant involvement in several post-transcriptional gene regulatory processes and human diseases. RNA recognition motif (RRM) is one of the most abundant RNA-binding domains in human RBPs. The unique modular architecture of each RBP containing RRM is crucial for its diverse target recognition and function. Genome-wide study of these structurally conserved and functionally diverse domains can enhance our understanding of their functional implications. In this study, modular architecture of RRM containing RBPs in human proteome is identified and systematically analysed. We observe that 30% of human RBPs with RNA-binding function contain RRM in single or multiple repeats or with other domains with maximum of six repeats. Zinc-fingers are the most frequently co-occurring domain partner of RRMs. Human RRM containing RBPs mostly belong to RNA metabolism class of proteins and are significantly enriched in two functional pathways including spliceosome and mRNA surveillance. Various human diseases are associated with 18% of the RRM containing RBPs. Single RRM containing RBPs are highly enriched in disorder regions. Gene ontology (GO) molecular functions including poly(A), poly(U) and miRNA binding are highly depleted in RBPs with single RRM, indicating the significance of modular nature of RRMs in specific function. The current study reports all the possible domain architectures of RRM containing human RBPs and their functional enrichment. The idea of domain architecture, and how they confer specificity and new functionalities to RBPs, can help in re-designing of modular RRM containing RBPs with re-engineered function.
Affiliation(s)
- Ankita Agarwal
- School of Bio Science, Indian Institute of Technology Kharagpur, Kharagpur 721302, India; Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
3
Anbo H, Sakuma K, Fukuchi S, Ota M. How AlphaFold2 Predicts Conditionally Folding Regions Annotated in an Intrinsically Disordered Protein Database, IDEAL. Biology 2023; 12:182. [PMID: 36829461 PMCID: PMC9952413 DOI: 10.3390/biology12020182] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/26/2022] [Revised: 01/19/2023] [Accepted: 01/21/2023] [Indexed: 01/27/2023]
Abstract
AlphaFold2 (AF2) is a protein structure prediction program which provides accurate models. In addition to predicting structural domains, AF2 assigns intrinsically disordered regions (IDRs) by identifying regions with low prediction reliability (pLDDT). Some regions in IDRs undergo disorder-to-order transition upon binding the interaction partner. Here we assessed model structures of AF2 based on the annotations in IDEAL, in which segments with disorder-to-order transition have been collected as Protean Segments (ProSs). We non-redundantly selected ProSs from IDEAL and classified them based on the root mean square deviation to the corresponding region of AF2 models. Statistical analysis identified 11 structural and sequential features, possibly contributing toward the prediction of ProS structures. These features were categorized into two groups: one that contained pLDDT and the other that contained normalized radius of gyration. The typical ProS structures in the former group comprise a long α helix or a whole or part of the structural domain and those in the latter group comprise a short α helix with terminal loops.
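As a rough illustration of the pLDDT-based IDR assignment mentioned in this abstract, the following minimal Python sketch flags contiguous low-confidence stretches in a list of per-residue pLDDT scores. The function name `low_plddt_segments`, the cutoff of 70, and the minimum segment length of 5 are assumptions chosen for illustration; the abstract does not state the thresholds the authors used.

```python
from typing import List, Sequence, Tuple


def low_plddt_segments(
    plddt: Sequence[float],
    threshold: float = 70.0,  # assumed cutoff, not taken from the paper
    min_len: int = 5,         # assumed minimum segment length
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) residue ranges whose
    pLDDT stays below `threshold` for at least `min_len` residues."""
    segments: List[Tuple[int, int]] = []
    start = None  # 0-based index where the current low-confidence run began
    for i, score in enumerate(plddt):
        if score < threshold:
            if start is None:
                start = i
        else:
            # A confident residue closes any open run.
            if start is not None and i - start >= min_len:
                segments.append((start + 1, i))
            start = None
    # Close a run that extends to the C-terminus.
    if start is not None and len(plddt) - start >= min_len:
        segments.append((start + 1, len(plddt)))
    return segments


if __name__ == "__main__":
    scores = [90.0] * 10 + [40.0] * 6 + [85.0] * 4 + [30.0] * 3
    print(low_plddt_segments(scores))  # [(11, 16)]
```

The trailing three low-scoring residues in the example are not reported because they fall under the assumed minimum length.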
Affiliation(s)
- Hiroto Anbo
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi 371-0816, Japan
- Koya Sakuma
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- Satoshi Fukuchi
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi 371-0816, Japan
- Motonori Ota
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- Institute for Glyco-core Research, Nagoya University, Nagoya 464-8601, Japan
4
Magyar C, Németh BZ, Cserző M, Simon I. Molecular Dynamics Simulation as a Tool to Identify Mutual Synergistic Folding Proteins. Int J Mol Sci 2023; 24:ijms24021790. [PMID: 36675304 PMCID: PMC9861041 DOI: 10.3390/ijms24021790] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Revised: 01/12/2023] [Accepted: 01/13/2023] [Indexed: 01/18/2023] Open
Abstract
Mutual synergistic folding (MSF) proteins belong to a recently emerged subclass of disordered proteins, which are disordered in their monomeric forms but become ordered in their oligomeric forms. They can be identified by experimental methods following their unfolding, which happens in a single-step cooperative process, without the presence of stable monomeric intermediates. Only a limited number of experimentally validated MSF proteins are accessible. The amino acid composition of MSF proteins shows high similarity to globular ordered proteins, rather than to disordered ones. However, they have some special structural features, which makes it possible to distinguish them from globular proteins. Even in the possession of their oligomeric three-dimensional structure, classification can only be performed based on unfolding experiments, which are frequently absent. In this work, we demonstrate a simple protocol using molecular dynamics simulations, which is able to indicate that a protein structure belongs to the MSF subclass. The presumption of the known atomic resolution quaternary structure is an obvious limitation of the method, and because of its high computational time requirements, it is not suitable for screening large databases; still, it is a valuable in silico tool for identification of MSF proteins.
Affiliation(s)
- Csaba Magyar
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Bálint Zoltán Németh
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Miklós Cserző
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Department of Physiology, Faculty of Medicine, Semmelweis University, 1094 Budapest, Hungary
- István Simon
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
5
Sun B, Kekenes-Huskey PM. Myofilament-associated proteins with intrinsic disorder (MAPIDs) and their resolution by computational modeling. Q Rev Biophys 2023; 56:e2. [PMID: 36628457 PMCID: PMC11070111 DOI: 10.1017/s003358352300001x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/12/2023]
Abstract
The cardiac sarcomere is a cellular structure in the heart that enables muscle cells to contract. Dozens of proteins belong to the cardiac sarcomere, which work in tandem to generate force and adapt to demands on cardiac output. Intriguingly, the majority of these proteins have significant intrinsic disorder that contributes to their functions, yet the biophysics of these intrinsically disordered regions (IDRs) have been characterized in limited detail. In this review, we first enumerate these myofilament-associated proteins with intrinsic disorder (MAPIDs) and recent biophysical studies to characterize their IDRs. We secondly summarize the biophysics governing IDR properties and the state-of-the-art in computational tools toward MAPID identification and characterization of their conformation ensembles. We conclude with an overview of future computational approaches toward broadening the understanding of intrinsic disorder in the cardiac sarcomere.
Affiliation(s)
- Bin Sun
- Research Center for Pharmacoinformatics (The State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Department of Medicinal Chemistry and Natural Medicine Chemistry, College of Pharmacy, Harbin Medical University, Harbin 150081, China
6
Anbo H, Ota M, Fukuchi S. Computational Methods to Predict Intrinsically Disordered Regions and Functional Regions in Them. Methods Mol Biol 2023; 2627:231-245. [PMID: 36959451 DOI: 10.1007/978-1-0716-2974-1_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Intrinsically disordered regions (IDRs) are protein regions that do not adopt fixed tertiary structures. Since these regions lack ordered three-dimensional structures, they should be excluded from the target portions of homology modeling. IDRs can be predicted from the amino acid sequences, because their amino acid compositions are different from that of the structured domains. This chapter provides a review of the prediction methods of IDRs and a case study of IDR prediction.
Affiliation(s)
- Hiroto Anbo
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi, Japan
- Motonori Ota
- Graduate School of Information Sciences, Nagoya University, Nagoya, Japan
- Satoshi Fukuchi
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi, Japan
7
Sun C, Feng Y, Fan G. IDPsBind: a repository of binding sites for intrinsically disordered proteins complexes with known 3D structures. BMC Mol Cell Biol 2022; 23:33. [PMID: 35883018 PMCID: PMC9327236 DOI: 10.1186/s12860-022-00434-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/28/2021] [Accepted: 07/14/2022] [Indexed: 11/10/2022] Open
Abstract
Background
Intrinsically disordered proteins (IDPs) lack a stable three-dimensional structure under physiological conditions but play crucial roles in many biological processes. Intrinsically disordered proteins perform various biological functions by interacting with other ligands.
Results
Here, we present a database, IDPsBind, which displays interacting sites between IDPs and interacting ligands by using the distance threshold method in known 3D structure IDPs complexes from the PDB database. IDPsBind contains 9626 IDPs complexes and 880 intrinsically disordered proteins verified by experiments. The current release of the IDPsBind database is defined as version 1.0. IDPsBind is freely accessible at http://www.s-bioinformatics.cn/idpsbind/home/.
Conclusions
IDPsBind provides more comprehensive interaction sites for IDPs complexes of known 3D structures. It can not only help the subsequent studies of the interaction mechanism of intrinsically disordered proteins but also provides a suitable background for developing the algorithms for predicting the interaction sites of intrinsically disordered proteins.
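The distance threshold method named in the Results can be sketched as follows: a residue counts as an interaction site when any of its atoms lies within a cutoff distance of any ligand atom. This is a minimal sketch under stated assumptions, not the IDPsBind implementation. The function name `binding_site_residues`, the input layout, and the 4.0 Å cutoff are hypothetical; the abstract does not give the cutoff the authors used.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def binding_site_residues(
    idp_atoms: Iterable[Tuple[int, Coord]],  # (residue number, atom coordinate)
    ligand_atoms: Sequence[Coord],           # ligand atom coordinates
    cutoff: float = 4.0,                     # assumed cutoff in angstroms
) -> List[int]:
    """Return sorted residue numbers having at least one atom
    within `cutoff` of any ligand atom."""
    sites = set()
    for res_id, coord in idp_atoms:
        if res_id in sites:
            continue  # residue already marked by an earlier atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            sites.add(res_id)
    return sorted(sites)


if __name__ == "__main__":
    idp = [
        (1, (0.0, 0.0, 0.0)),
        (1, (1.0, 0.0, 0.0)),
        (2, (10.0, 0.0, 0.0)),
        (3, (5.0, 0.0, 0.0)),
    ]
    ligand = [(3.0, 0.0, 0.0)]
    print(binding_site_residues(idp, ligand))  # [1, 3]
```

The all-pairs loop is quadratic in the number of atoms; for large complexes a spatial index such as a k-d tree would be the usual replacement.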
8
Okuda M, Tsunaka Y, Nishimura Y. Dynamic structures of intrinsically disordered proteins related to the general transcription factor TFIIH, nucleosomes, and histone chaperones. Biophys Rev 2022; 14:1449-1472. [PMID: 36659983 PMCID: PMC9842849 DOI: 10.1007/s12551-022-01014-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/13/2022] [Accepted: 11/06/2022] [Indexed: 11/19/2022] Open
Abstract
Advances in structural analysis by cryogenic electron microscopy (cryo-EM) and X-ray crystallography have revealed the tertiary structures of various chromatin-related proteins, including transcription factors, RNA polymerases, nucleosomes, and histone chaperones; however, the dynamic structures of intrinsically disordered regions (IDRs) in these proteins remain elusive. Recent studies using nuclear magnetic resonance (NMR), together with molecular dynamics (MD) simulations, are beginning to reveal dynamic structures of the general transcription factor TFIIH complexed with target proteins including the general transcription factor TFIIE, the tumor suppressor p53, the cell cycle protein DP1, the DNA repair factors XPC and UVSSA, and three RNA polymerases, in addition to the dynamics of histone tails in nucleosomes and histone chaperones. In complexes of TFIIH, the PH domain of the p62 subunit binds to an acidic string formed by the IDR in TFIIE, p53, XPC, UVSSA, DP1, and the RPB6 subunit of three RNA polymerases by a common interaction mode, namely extended string-like binding of the IDR on the positively charged surface of the PH domain. In the nucleosome, the dynamic conformations of the N-tails of histones H2A and H2B are correlated, while the dynamic conformations of the N-tails of H3 and H4 form a histone tail network dependent on their modifications and linker DNA. The acidic IDRs of the histone chaperones of FACT and NAP1 play important roles in regulating the accessibility to histone proteins in the nucleosome.
Affiliation(s)
- Masahiko Okuda
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045 Japan
- Yasuo Tsunaka
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045 Japan
- Yoshifumi Nishimura
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-Cho, Tsurumi-Ku, Yokohama, 230-0045 Japan
- Graduate School of Integrated Sciences for Life, Hiroshima University, 1-4-4 Kagamiyama, Higashi-Hiroshima, 739-8528 Japan
9
Intrinsically Disordered Proteins: An Overview. Int J Mol Sci 2022; 23:ijms232214050. [PMID: 36430530 PMCID: PMC9693201 DOI: 10.3390/ijms232214050] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 09/04/2022] [Revised: 11/07/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022] Open
Abstract
Many proteins and protein segments cannot attain a single stable three-dimensional structure under physiological conditions; instead, they adopt multiple interconverting conformational states. Such intrinsically disordered proteins or protein segments are highly abundant across proteomes, and are involved in various effector functions. This review focuses on different aspects of disordered proteins and disordered protein regions, which form the basis of the so-called "Disorder-function paradigm" of proteins. Additionally, various experimental approaches and computational tools used for characterizing disordered regions in proteins are discussed. Finally, the role of disordered proteins in diseases and their utility as potential drug targets are explored.
10
Roterman I, Stapor K, Fabian P, Konieczny L. New insights into disordered proteins and regions according to the FOD-M model. PLoS One 2022; 17:e0275300. [PMID: 36215254 PMCID: PMC9550084 DOI: 10.1371/journal.pone.0275300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 09/13/2022] [Indexed: 11/18/2022] Open
Abstract
A collection of intrinsically disordered proteins (IDPs) having regions with the status of intrinsically disordered (IDR) according to the DisProt database was analyzed from the point of view of the structure of the hydrophobic core in the structural unit (chain / domain). The analysis includes all the Homo sapiens as well as Mus musculus proteins present in the DisProt database for which the structure is available. In the analysis, the fuzzy oil drop modified model (FOD-M) was used, taking into account the external force field, modified by the presence of other factors apart from polar water, influencing protein structuring. The paper presents an alternative to secondary-structure-based classification of intrinsically disordered regions (IDR). The basis of our classification is the ordering of the hydrophobic core as calculated by the FOD-M model, resulting in FOD-ordered or FOD-unordered IDRs.
Affiliation(s)
- Irena Roterman
- Department of Bioinformatics and Telemedicine, Jagiellonian University, Medical College, Kraków, Poland
- Katarzyna Stapor
- Faculty of Automatic, Department of Applied Informatics, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Piotr Fabian
- Faculty of Automatic, Electronics and Computer Science, Department of Algorithmics and Software, Silesian University of Technology, Gliwice, Poland
- Leszek Konieczny
- Chair of Medical Biochemistry, Jagiellonian University, Medical College, Kraków, Poland
11
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is thus crucial to identify the IDP binding sites for functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein binding sites of IDP (IDP-PPIS) based on protein sequence information. Moreover, there are new IDP-PPIS predictors developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of these methods in this field. In this paper, we collected 30 representative predictors published recently and summarized the databases, features and algorithms. We described the procedure how the features were generated based on public data and used for the prediction of IDP-PPIS, along with the methods to generate the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of algorithms and their performances. Hopefully, our manuscript will not only provide a full picture of the status quo of IDP binding prediction, but also a guide for selecting different methods. More importantly, it will shed light on the inspirations for future development trends and principles.
Affiliation(s)
- Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xinlu Li
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Yaqing Yang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xixi Song
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Dongdong Qiao
- Shandong Mental Health Center, Shandong University, Jinan, China
12
Yang J, Cheng WX, Zhao XF, Wu G, Sheng ST, Hu Q, Ge H, Qin Q, Jin X, Zhang L, Zhang P. Comprehensive folding variations for protein folding. Proteins 2022; 90:1851-1872. [DOI: 10.1002/prot.26381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 04/12/2022] [Accepted: 04/22/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Jiaan Yang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Micro Biotech, Ltd., Shanghai, China
- Wen Xiang Cheng
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Gang Wu
- School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Shi Tong Sheng
- Shenzhen Hua Ying Kang Gene Technology Co., Ltd, Shenzhen, Guangdong, China
- Qiyue Hu
- Shanghai Hengrui Pharmaceutical Co. Ltd., Shanghai, China
- Hu Ge
- Shanghai Hengrui Pharmaceutical Co. Ltd., Shanghai, China
- Qianshan Qin
- Shanghai Hengrui Pharmaceutical Co. Ltd., Shanghai, China
- Xinshen Jin
- Shanghai Hengrui Pharmaceutical Co. Ltd., Shanghai, China
- Peng Zhang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
13
Carugo O. Uses and Abuses of the Atomic Displacement Parameters in Structural Biology. Methods Mol Biol 2022; 2449:281-298. [PMID: 35507268 DOI: 10.1007/978-1-0716-2095-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
B-factors determined with X-ray crystallographic analyses are commonly used to estimate the flexibility degree of atoms, residues, and molecular moieties in biological macromolecules. In this chapter, the most recent studies and applications of B-factors in protein engineering and structural biology are briefly summarized. Particular emphasis is given to the limitations in using B-factors, in order to prevent inappropriate applications. It is eventually predicted that future applications will involve anisotropically refined B-factors, deep learning, and data produced by cryo-EM.
14
Origin of Increased Solvent Accessibility of Peptide Bonds in Mutual Synergetic Folding Proteins. Int J Mol Sci 2021; 22:ijms222413404. [PMID: 34948202 PMCID: PMC8704591 DOI: 10.3390/ijms222413404] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/14/2021] [Revised: 12/10/2021] [Accepted: 12/11/2021] [Indexed: 11/16/2022] Open
Abstract
Mutual Synergetic Folding (MSF) proteins belong to a recently discovered class of proteins. These proteins are disordered in their monomeric but ordered in their oligomeric forms. Their amino acid composition is more similar to globular proteins than to disordered ones. Our preceding work shed light on important structural aspects of the structural organization of these proteins, but the background of this behavior is still unknown. We suggest that solvent accessibility is an important factor, especially solvent accessibility of the peptide bonds can be accounted for this phenomenon. The side chains of the amino acids which form a peptide bond have a high local contribution to the shielding of the peptide bond from the solvent. During the oligomerization step, other non-local residues contribute to the shielding. We investigated these local and non-local effects of shielding based on Shannon information entropy calculations. We found that MSF and globular homodimeric proteins have different local contributions resulting from different amino acid pair frequencies. Their non-local distribution is also different because of distinctive inter-subunit contacts.
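To make the Shannon information entropy calculation on amino acid pair frequencies more concrete, here is a minimal Python sketch that computes the entropy of adjacent residue pairs in a single sequence. It only illustrates the general formula H = -Σ p·log2(p). The function name `pair_entropy` and the choice of overlapping adjacent pairs are assumptions, and the authors' actual procedure (which also involves solvent accessibility and non-local contacts) is not reproduced here.

```python
import math
from collections import Counter


def pair_entropy(sequence: str) -> float:
    """Shannon entropy (in bits) of the frequency distribution of
    overlapping adjacent residue pairs in `sequence`."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    total = len(pairs)
    if total == 0:
        return 0.0  # fewer than two residues: no pairs to count
    counts = Counter(pairs)
    # H = -sum(p * log2(p)) over the observed pair types
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    for seq in ("AAAA", "ABAB", "MKVLAAGIVGLL"):
        print(seq, round(pair_entropy(seq), 3))
```

A sequence with a single repeated pair type gives zero entropy, while a more varied pair composition gives a higher value, which is the kind of difference in pair frequencies the abstract refers to.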
15
Vihinen M. Measuring and interpreting pervasive heterogeneity, poikilosis. FASEB Bioadv 2021; 3:611-625. [PMID: 34377957 PMCID: PMC8332472 DOI: 10.1096/fba.2021-00015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/01/2021] [Revised: 03/09/2021] [Accepted: 03/12/2021] [Indexed: 11/11/2022] Open
Abstract
Measurements are widely used in science, engineering, industry, and trade. They form the basis for experimental scientific research, approach, and progress; however, their foundations are seldom thought or questioned. Recently poikilosis, pervasive heterogeneity ranging from subatomic level to biosphere, was introduced. Poikilosis makes single point measurements and estimates obsolete and irrelevant as measurands display intervals of magnitudes. Consideration of poikilosis requires new lines of thinking in experimental design, conduction of studies, data analysis and interpretation. Measurements of poikilosis must consider lagom, normal, variation extent. Measurements, measures, and measurands as well as the measuring systems and uncertainties are discussed from the perspective of poikilosis. New systematics is introduced for description of uncertainty in measurements and for types of experimental designs. Poikilosis-aware experimenting, data analysis and interpretation are discussed. Instructions are provided for how to measure lagom and non-lagom effects of poikilosis. Consideration of poikilosis can solve scientific controversies and enigmas and can allow novel insight into systems, processes, mechanisms, and reactions and their interpretation, understanding, and manipulation. Furthermore, it will increase reproducibility of measurements and studies.
Affiliation(s)
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
16
Anbo H, Amagai H, Fukuchi S. NeProc predicts binding segments in intrinsically disordered regions without learning binding region sequences. Biophys Physicobiol 2020; 17:147-154. [PMID: 33304713 PMCID: PMC7692026 DOI: 10.2142/biophysico.bsj-2020026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2020] [Accepted: 10/29/2020] [Indexed: 12/01/2022] Open
Abstract
Intrinsically disordered proteins are those proteins with intrinsically disordered regions. One of the unique characteristics of intrinsically disordered proteins is the existence of functional segments in intrinsically disordered regions. These segments are involved in binding to partner molecules, such as protein and DNA, and play important roles in signaling pathways and/or transcriptional regulation. Although there are databases that gather information on such disordered binding regions, data remain limited. Therefore, it is desirable to develop programs to predict the disordered binding regions without using data for the binding regions. We developed a program, NeProc, to predict the disordered binding regions, which can be regarded as intrinsically disordered regions with a structural propensity. We only used data for the structural domains and intrinsically disordered regions to detect such regions. NeProc accepts a query amino acid sequence converted into a position specific score matrix, and uses two neural networks that employ different window sizes, a neural network of short windows, and a neural network of long windows. The performance of NeProc was comparable to that of existing programs of the disordered binding region prediction. This result presents the possibility to overcome the shortage of the disordered binding region data in the development of the prediction programs for these binding regions. NeProc is available at http://flab.neproc.org/neproc/index.html.
Affiliation(s)
- Hiroto Anbo
- Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, Maebashi, Gunma 371-0816, Japan
- Hiroki Amagai
- Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, Maebashi, Gunma 371-0816, Japan
- Satoshi Fukuchi
- Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, Maebashi, Gunma 371-0816, Japan
17
Presence of intrinsically disordered proteins can inhibit the nucleation phase of amyloid fibril formation of Aβ(1-42) in amino acid sequence independent manner. Sci Rep 2020; 10:12334. [PMID: 32703978 PMCID: PMC7378830 DOI: 10.1038/s41598-020-69129-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/28/2020] [Accepted: 06/19/2020] [Indexed: 11/27/2022] Open
Abstract
The molecular shield effect was studied for intrinsically disordered proteins (IDPs) that do not adopt compact and stable protein folds. IDPs are found among many stress-responsive gene products and cryoprotective- and drought-protective proteins. We recently reported that some fragments of human genome-derived IDPs are cryoprotective for cellular enzymes, despite a lack of relevant amino acid sequence motifs. This sequence-independent IDP function may reflect their molecular shield effect. This study examined the inhibitory activity of IDPs against fibril formation in an amyloid beta peptide (Aβ(1–42)) model system. Four of five human genome-derived IDPs (size range 20 to 44 amino acids) showed concentration-dependent inhibition of amyloid formation (IC50 range between 60 and 130 μM against 20 μM Aβ(1–42)). The IC50 value was two orders of magnitude lower than that of polyethylene-glycol and dextran, used as neutral hydrophilic polymer controls. Nuclear magnetic resonance with 15N-labeled Aβ(1–42) revealed no relevant molecular interactions between Aβ(1–42) and IDPs. The inhibitory activities were abolished by adding external amyloid-formation seeds. Therefore, IDPs seemed to act only at the amyloid nucleation phase but not at the elongation phase. These results suggest that IDPs (0.1 mM or less) have a molecular shield effect that prevents aggregation of susceptible molecules.
|
18
|
Basu S, Bahadur RP. Do sequence neighbours of intrinsically disordered regions promote structural flexibility in intrinsically disordered proteins? J Struct Biol 2020; 209:107428. [DOI: 10.1016/j.jsb.2019.107428] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/13/2019] [Revised: 11/14/2019] [Accepted: 11/17/2019] [Indexed: 10/25/2022]
|
19
|
Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci 2020. [DOI: 10.1007/s12038-020-0010-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]
|
20
|
Bhattarai A, Emerson IA. Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci 2020; 45:29. [PMID: 32020911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Intrinsically disordered proteins (IDPs) are highly flexible and undergo a disorder-to-order transition upon binding. They are highly abundant in human proteomes and play critical roles in cell signaling and regulatory processes. This review mainly focuses on the dynamics of disordered proteins, including their conformational heterogeneity, protein-protein interactions, and the phase transition of biomolecular condensates that are central to various biological functions. In addition, the role of RNA-mediated chaperones in protein folding and the stability of IDPs was also discussed. Finally, we explored the dynamic binding interface of IDPs as novel therapeutic targets and the effect of small molecules on their interactions.
Affiliation(s)
- Anil Bhattarai
  - Bioinformatics Programming Laboratory, Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632 014, India
|
21
|
Koike R, Amano M, Kaibuchi K, Ota M. Protein kinases phosphorylate long disordered regions in intrinsically disordered proteins. Protein Sci 2019; 29:564-571. [PMID: 31724233 DOI: 10.1002/pro.3789] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/13/2019] [Revised: 11/09/2019] [Accepted: 11/11/2019] [Indexed: 12/12/2022]
Abstract
Phosphorylation is a major post-translational modification that plays a central role in signaling pathways. Protein kinases phosphorylate substrates (phosphoproteins) by adding phosphate at Ser/Thr or Tyr residues (phosphosites). A large amount of data identifying and describing phosphosites in phosphoproteins has been reported but the specificity of phosphorylation is not fully resolved. In this report, data of kinase-substrate pairs identified by the Kinase-Interacting Substrate Screening (KISS) method were used to analyze phosphosites in intrinsically disordered regions (IDRs) of intrinsically disordered proteins. We compared phosphorylated and nonphosphorylated IDRs and found that the phosphorylated IDRs were significantly longer than nonphosphorylated IDRs. The phosphorylated IDR is often the longest IDR (71%) in a phosphoprotein when only a single phosphosite exists in the IDR, and when the phosphoprotein has multiple phosphosites in an IDR(s), the phosphosites are primarily localized in a single IDR (78%) and this IDR is usually the longest one (81%). We constructed a stochastic model of phosphorylation to estimate the effect of IDR length. The model that accounted for IDR length produced more realistic results when compared with a model that excluded the IDR length. We propose that the IDR length is a significant determinant for locating kinase phosphorylation sites in phosphoproteins.
Affiliation(s)
- Ryotaro Koike
  - Graduate School of Informatics, Nagoya University, Nagoya, Japan
- Mutsuki Amano
  - Graduate School of Medicine, Nagoya University, Nagoya, Japan
- Kozo Kaibuchi
  - Graduate School of Medicine, Nagoya University, Nagoya, Japan
- Motonori Ota
  - Graduate School of Informatics, Nagoya University, Nagoya, Japan
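The effect of IDR length in a stochastic phosphorylation model can be illustrated with a small sketch. It assumes, purely for illustration, that a single phosphosite lands in an IDR with probability proportional to that IDR's length, and compares this with a model that ignores length. The IDR lengths, the function names, and the assumption of a unique longest IDR are hypothetical and are not taken from the paper.

```python
import random
from typing import List


def p_longest_length_weighted(idr_lengths: List[int]) -> float:
    """Probability that one site falls in the longest IDR when each IDR is
    chosen with probability proportional to its length."""
    return max(idr_lengths) / sum(idr_lengths)


def p_longest_uniform(idr_lengths: List[int]) -> float:
    """Probability that one site falls in the longest IDR when every IDR is
    equally likely, regardless of its length."""
    return 1.0 / len(idr_lengths)


def simulate_longest_fraction(idr_lengths: List[int], trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the length-weighted probability."""
    rng = random.Random(seed)
    indices = list(range(len(idr_lengths)))
    longest = max(indices, key=lambda i: idr_lengths[i])
    picks = rng.choices(indices, weights=idr_lengths, k=trials)
    return sum(1 for i in picks if i == longest) / trials


# Hypothetical protein with four IDRs (lengths in residues).
example_lengths = [120, 40, 25, 15]

weighted = p_longest_length_weighted(example_lengths)
uniform = p_longest_uniform(example_lengths)
simulated = simulate_longest_fraction(example_lengths, trials=20000)
```

Under these assumptions the length-weighted model places the site in the longest IDR far more often than the length-blind model, which is the qualitative behaviour the abstract attributes to accounting for IDR length.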
|
22
|
Sequence and Structure Properties Uncover the Natural Classification of Protein Complexes Formed by Intrinsically Disordered Proteins via Mutual Synergistic Folding. Int J Mol Sci 2019; 20:ijms20215460. [PMID: 31683980 PMCID: PMC6862064 DOI: 10.3390/ijms20215460] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/09/2019] [Revised: 10/28/2019] [Accepted: 10/30/2019] [Indexed: 12/17/2022] Open
Abstract
Intrinsically disordered proteins mediate crucial biological functions through their interactions with other proteins. Mutual synergistic folding (MSF) occurs when all interacting proteins are disordered, folding into a stable structure in the course of the complex formation. In these cases, the folding and binding processes occur in parallel, lending the resulting structures uniquely heterogeneous features. Currently there are no dedicated classification approaches that take into account the particular biological and biophysical properties of MSF complexes. Here, we present a scalable clustering-based classification scheme, built on redundancy-filtered features that describe the sequence and structure properties of the complexes and the role of the interaction, which is directly responsible for structure formation. Using this approach, we define six major types of MSF complexes, corresponding to biologically meaningful groups. Hence, the presented method also shows that differences in binding strength, subcellular localization, and regulation are encoded in the sequence and structural properties of proteins. While current protein structure classification methods can also handle complex structures, we show that the developed scheme is fundamentally different, and since it takes into account defining features of MSF complexes, it serves as a better representation of structures arising through this specific interaction mode.
|
23
|
Liu Y, Wang X, Liu B. A comprehensive review and comparison of existing computational methods for intrinsically disordered protein and region prediction. Brief Bioinform 2019; 20:330-346. [PMID: 30657889 DOI: 10.1093/bib/bbx126] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 07/19/2017] [Indexed: 01/06/2023] Open
Abstract
Intrinsically disordered proteins and regions are widely distributed and are associated with many biological processes and diseases. Accurate prediction of intrinsically disordered proteins and regions is critical for both basic research (such as protein structure and function prediction) and practical applications (such as drug development). During the past decades, many computational approaches have been proposed, which have greatly facilitated the development of this important field. Therefore, a comprehensive and updated review is needed. Here, we review the computational methods for intrinsically disordered protein and region prediction, focusing especially on recent developments in this field. These computational approaches are divided into four categories based on their methodologies: physicochemical-based methods, machine-learning-based methods, template-based methods and meta methods. Their advantages and disadvantages are also discussed. The performance of 40 state-of-the-art predictors is directly compared on the target proteins in the task of disordered region prediction in the 10th Critical Assessment of protein Structure Prediction. A more comprehensive performance comparison of 45 different predictors is conducted based on seven widely used benchmark data sets. Finally, some open problems and perspectives are discussed.
Affiliation(s)
- Yumeng Liu
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
- Xiaolong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
- Bin Liu
  - School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, China
|
24
|
Shamilov R, Aneskievich BJ. Intrinsic Disorder in Nuclear Receptor Amino Termini: From Investigational Challenge to Therapeutic Opportunity. Nuclear Receptor Research 2019. [DOI: 10.32527/2019/101417] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022] Open
Affiliation(s)
- Rambon Shamilov
  - Graduate Program in Pharmacology & Toxicology, University of Connecticut, Storrs, CT 06269-3092, USA
- Brian J. Aneskievich
  - Department of Pharmaceutical Sciences, University of Connecticut, Storrs, CT 06269-3092, USA
|
25
|
Wong ETC, Gsponer J. Predicting Protein-Protein Interfaces that Bind Intrinsically Disordered Protein Regions. J Mol Biol 2019; 431:3157-3178. [PMID: 31207240 DOI: 10.1016/j.jmb.2019.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/08/2019] [Revised: 06/01/2019] [Accepted: 06/04/2019] [Indexed: 12/18/2022]
Abstract
A long-standing goal in biology is the complete annotation of function and structure on all protein-protein interactions, a large fraction of which is mediated by intrinsically disordered protein regions (IDRs). However, knowledge derived from experimental structures of such protein complexes is disproportionately small due, in part, to challenges in studying interactions of IDRs. Here, we introduce IDRBind, a computational method that by combining gradient boosted trees and conditional random field models predicts binding sites of IDRs with performance approaching state-of-the-art globular interface predictions, making it suitable for proteome-wide applications. Although designed and trained with a focus on molecular recognition features, which are long interaction-mediating-elements in IDRs, IDRBind also predicts the binding sites of short peptides more accurately than existing specialized predictors. Consistent with IDRBind's specificity, a comparison of protein interface categories uncovered uniform trends in multiple physicochemical properties, positioning molecular recognition feature interfaces between peptide and globular interfaces.
Affiliation(s)
- Eric T C Wong
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
- Jörg Gsponer
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
|
26
|
Anbo H, Sato M, Okoshi A, Fukuchi S. Functional Segments on Intrinsically Disordered Regions in Disease-Related Proteins. Biomolecules 2019; 9:biom9030088. [PMID: 30841624 PMCID: PMC6468909 DOI: 10.3390/biom9030088] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/22/2019] [Revised: 02/19/2019] [Accepted: 02/25/2019] [Indexed: 01/05/2023] Open
Abstract
One of the unique characteristics of intrinsically disordered proteins (IDPs) is the existence of functional segments in intrinsically disordered regions (IDRs). A typical function of these segments is binding to partner molecules, such as proteins and DNA. These segments play important roles in signaling pathways and transcriptional regulation. We conducted bioinformatics analysis to search for these functional segments based on IDR predictions and database annotations. We found more than a thousand potential functional IDR segments in disease-related proteins. Large fractions of proteins related to cancers, congenital disorders, digestive system diseases, and reproductive system diseases have these functional IDRs. Some proteins in nervous system diseases have long functional segments in IDRs. The detailed analysis of some of these regions showed that the functional segments are located on experimentally verified IDRs. The proteins with functional IDR segments generally tend to shuttle between the cytoplasm and the nucleus. Proteins involved in multiple diseases tend to have more protein-protein interactors, suggesting that hub proteins in the protein-protein interaction networks can have multiple impacts on human diseases.
Affiliation(s)
- Hiroto Anbo
  - Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, 460-1, Kamisadori, Maebashi, Gunma 371-0816, Japan
- Masaya Sato
  - Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, 460-1, Kamisadori, Maebashi, Gunma 371-0816, Japan
- Atsushi Okoshi
  - Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, 460-1, Kamisadori, Maebashi, Gunma 371-0816, Japan
- Satoshi Fukuchi
  - Department of Life Science and Informatics, Faculty of Engineering, Maebashi Institute of Technology, 460-1, Kamisadori, Maebashi, Gunma 371-0816, Japan
|
27
|
Vincent M, Uversky VN, Schnell S. On the Need to Develop Guidelines for Characterizing and Reporting Intrinsic Disorder in Proteins. Proteomics 2019; 19:e1800415. [PMID: 30793871 PMCID: PMC6571172 DOI: 10.1002/pmic.201800415] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/27/2018] [Revised: 02/05/2019] [Indexed: 01/02/2023]
Abstract
Since the early 2000s, numerous computational tools have been created and used to predict intrinsic disorder in proteins. At present, the output from these algorithms is difficult to interpret in the absence of standards or references for comparison. There are many reasons to establish a set of standard-based guidelines to evaluate computational protein disorder predictions. This viewpoint explores a handful of these reasons, including standardizing nomenclature to improve communication, rigor and reproducibility, and making it easier for newcomers to enter the field. An approach for reporting predicted disorder in single proteins with respect to whole proteomes is discussed. The suggestions are not intended to be formulaic; they should be viewed as a starting point to establish guidelines for interpreting and reporting computational protein disorder predictions.
Affiliation(s)
- Michael Vincent
  - Interdisciplinary Biological Sciences, Northwestern University, Evanston, Illinois 60208, USA
- Vladimir N. Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino 142290, Moscow region, Russia
- Santiago Schnell
  - Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine & Bioinformatics, University of Michigan Medical School, Michigan 48109, USA
|
28
|
Shimomura T, Nishijima K, Kikuchi T. A new technique for predicting intrinsically disordered regions based on average distance map constructed with inter-residue average distance statistics. BMC Structural Biology 2019; 19:3. [PMID: 30727987 PMCID: PMC6366092 DOI: 10.1186/s12900-019-0101-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/08/2018] [Accepted: 01/23/2019] [Indexed: 01/03/2023]
Abstract
Background: It had long been thought that a protein exhibits its specific function through its own specific 3D-structure under physiological conditions. However, subsequent research has shown that there are many proteins without specific 3D-structures under physiological conditions, so-called intrinsically disordered proteins (IDPs). This study presents a new technique for predicting intrinsically disordered regions in a protein, based on our average distance map (ADM) technique. The ADM technique was developed to predict compact regions or structural domains in a protein. In a protein containing partially disordered regions, a domain region is likely to be ordered, thus it is unlikely that a disordered region would be part of any domain. Therefore, the ADM technique is expected to also predict a disordered region between domains. Results: The results of our new technique are comparable to the top three performing techniques in the community-wide CASP10 experiment. We further discuss the case of p53, a tumor-suppressor protein, which is the most significant protein among cell cycle regulatory proteins. This protein exhibits a disordered character as a monomer but an ordered character when two p53s form a dimer. Conclusion: Our technique can predict the location of an intrinsically disordered region in a protein with an accuracy comparable to the best techniques proposed so far. Furthermore, it can also predict a core region of IDPs forming definite 3D structures through interactions, such as dimerization. The technique in our study may also serve as a means of predicting a disordered region which would become an ordered structure when binding to another protein.
Affiliation(s)
- Takumi Shimomura
  - Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan
- Kohki Nishijima
  - Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan
- Takeshi Kikuchi
  - Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan
|
29
|
In Silico and In Vitro Considerations of Keratinocyte Nuclear Receptor Protein Structural Order for Improving Experimental Analysis. Methods Mol Biol 2019; 2109:93-111. [PMID: 31124000 DOI: 10.1007/7651_2019_240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/22/2022]
Abstract
Nuclear receptors (NR) regulate gene expression critical in keratinocyte replication and differentiation. In addition to a ligand-binding domain, NR like other transcription factor families have a DNA-binding domain that must attain a particular conformation for effective interaction with the three-dimensional structure in promoters of target genes for control of their expression. Such protein-DNA assemblies extend the classic "lock and key" idea typified by protein-protein interactions. However, it is becoming increasingly clear that multi-subdomain transcription factors like NR frequently range along the length of the protein from structured, ordered regions expected for interaction with a preset partner to more flexible, intrinsically disordered regions which are more available for diverse posttranslational modifications and/or interaction with differing partners. The extended amino terminus of NR (the A/B subdomain) is one such intrinsically disordered region. Here we provide a primer on in silico-based recognition of amino acid composition and order associated with such conformational flexibility along with adaptations of readily accessible laboratory techniques (e.g., considerations for recombinant expression, sensitivity to protease and proteasome digestion) to facilitate initial prediction and testing for intrinsic disorder in various proteins of interest to keratinocyte biologists, like NR and other transcription factors.
|
30
|
|
31
|
Large-Scale Analyses of Site-Specific Evolutionary Rates across Eukaryote Proteomes Reveal Confounding Interactions between Intrinsic Disorder, Secondary Structure, and Functional Domains. Genes (Basel) 2018; 9:genes9110553. [PMID: 30441862 PMCID: PMC6265720 DOI: 10.3390/genes9110553] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/01/2018] [Revised: 11/09/2018] [Accepted: 11/09/2018] [Indexed: 12/31/2022] Open
Abstract
Various structural and functional constraints govern the evolution of protein sequences. As a result, the relative rates of amino acid replacement among sites within a protein can vary significantly. Previous large-scale work on Metazoan (Animal) protein sequence alignments indicated that amino acid replacement rates are partially driven by a complex interaction among three factors: intrinsic disorder propensity; secondary structure; and functional domain involvement. Here, we use sequence-based predictors to evaluate the effects of these factors on site-specific sequence evolutionary rates within four eukaryotic lineages: Metazoans; Plants; Saccharomycete Fungi; and Alveolate Protists. Our results show broad, consistent trends across all four Eukaryote groups. In all four lineages, there is a significant increase in amino acid replacement rates when comparing: (i) disordered vs. ordered sites; (ii) random coil sites vs. sites in secondary structures; and (iii) inter-domain linker sites vs. sites in functional domains. Additionally, within Metazoans, Plants, and Saccharomycetes, there is a strong confounding interaction between intrinsic disorder and secondary structure: alignment sites exhibiting both high disorder propensity and involvement in secondary structures have very low average rates of sequence evolution. Analysis of gene ontology (GO) terms revealed that in all four lineages, a high fraction of sequences containing these conserved, disordered-structured sites are involved in nucleic acid binding. We also observe notable differences in the statistical trends of Alveolates, where intrinsically disordered sites are more variable than in other Eukaryotes and the statistical interactions between disorder and other factors are less pronounced.
|
32
|
Schad E, Fichó E, Pancsa R, Simon I, Dosztányi Z, Mészáros B. DIBS: a repository of disordered binding sites mediating interactions with ordered proteins. Bioinformatics 2018; 34:535-537. [PMID: 29385418 PMCID: PMC5860366 DOI: 10.1093/bioinformatics/btx640] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 08/03/2017] [Accepted: 10/06/2017] [Indexed: 12/14/2022] Open
Abstract
Motivation: Intrinsically Disordered Proteins (IDPs) mediate crucial protein–protein interactions, most notably in signaling and regulation. As their importance is increasingly recognized, the detailed analyses of specific IDP interactions have opened up new opportunities for therapeutic targeting. Yet, large-scale information about the structural and functional details of IDP-mediated interactions is lacking, hindering the understanding of the mechanisms underlying this distinct binding mode. Results: Here, we present DIBS, the first comprehensive, curated collection of complexes between IDPs and ordered proteins. DIBS not only describes by far the highest number of cases, it also provides the dissociation constants of their interactions, as well as the description of potential post-translational modifications modulating the binding strength and linear motifs involved in the binding. Together with the wide range of structural and functional annotations, DIBS will provide the cornerstone for structural and functional studies of IDP complexes. Availability and implementation: DIBS is freely accessible at http://dibs.enzim.ttk.mta.hu/. The DIBS application is hosted by Apache web server and was implemented in PHP. To enrich querying features and to enhance backend performance a MySQL database was also created. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Eva Schad
  - Research Centre for Natural Sciences, Institute of Enzymology, Hungarian Academy of Sciences, Budapest H-1117, Hungary
- Erzsébet Fichó
  - Research Centre for Natural Sciences, Institute of Enzymology, Hungarian Academy of Sciences, Budapest H-1117, Hungary
- Rita Pancsa
  - MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
- István Simon
  - Research Centre for Natural Sciences, Institute of Enzymology, Hungarian Academy of Sciences, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
- Bálint Mészáros
  - Research Centre for Natural Sciences, Institute of Enzymology, Hungarian Academy of Sciences, Budapest H-1117, Hungary
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
|
33
|
Zhao B, Xue B. Decision-Tree Based Meta-Strategy Improved Accuracy of Disorder Prediction and Identified Novel Disordered Residues Inside Binding Motifs. Int J Mol Sci 2018; 19:E3052. [PMID: 30301243 PMCID: PMC6213717 DOI: 10.3390/ijms19103052] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/20/2018] [Revised: 09/24/2018] [Accepted: 10/04/2018] [Indexed: 02/06/2023] Open
Abstract
Using computational techniques to identify intrinsically disordered residues is practical and effective in biological studies. Therefore, designing novel high-accuracy strategies is always preferable when existing strategies have a lot of room for improvement. Among many possibilities, a meta-strategy that integrates the results of multiple individual predictors has been broadly used to improve the overall performance of predictors. Nonetheless, a simple and direct integration of individual predictors may not effectively improve the performance. In this project, dual-threshold two-step significance voting and neural networks were used to integrate the predictive results of four individual predictors, including: DisEMBL, IUPred, VSL2, and ESpritz. The new meta-strategy has improved the prediction performance of intrinsically disordered residues significantly, compared to all four individual predictors and another four recently-designed predictors. The improvement was validated using five-fold cross-validation and in independent test datasets.
Affiliation(s)
- Bi Zhao
  - Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA
- Bin Xue
  - Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, Tampa, FL 33620, USA
|
34
|
Okazaki H, Matsuo N, Tenno T, Goda N, Shigemitsu Y, Ota M, Hiroaki H. Using 1HN amide temperature coefficients to define intrinsically disordered regions: An alternative NMR method. Protein Sci 2018; 27:1821-1830. [PMID: 30098073 DOI: 10.1002/pro.3485] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/13/2018] [Revised: 07/20/2018] [Accepted: 07/21/2018] [Indexed: 02/02/2023]
Abstract
This report describes a cost-effective experimental method for determining an intrinsically disordered protein (IDP) region in a given protein sample. In this area, the most popular (and conventional) means is using the amide (1HN) NMR signal chemical shift distributed in the range of 7.5-8.5 ppm. For this study, we applied an additional step: analysis of 1HN chemical shift temperature coefficients (1HN-CSTCs) of the signals. We measured 1H-15N two-dimensional NMR spectra of model IDP samples and ordered samples at four temperatures (288, 293, 298, and 303 K). We derived the 1HN-CSTC threshold deviation, which gives the best correlation of ordered and disordered regions among the proteins examined (below -3.6 ppb/K). By combining these criteria with the newly optimized chemical shift range (7.8-8.5 ppm), the ratios of both true positive and true negative were improved by approximately 19 percentage points (from 62% to 81%) compared with the conventional "chemical shift-only" method.
Collapse
Affiliation(s)
- Hiroki Okazaki
- Department of Complex Systems Science, Graduate School of Information Sciences, Nagoya University, Nagoya, 464-8601, Japan
| | - Naoki Matsuo
- Laboratory of Structural Molecular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Japan
| | - Takeshi Tenno
- Laboratory of Structural Molecular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Japan.,BeCellBar LLC, Business Incubation Center, Nagoya University, Nagoya, 464-8601, Aichi, Japan
| | - Natsuko Goda
- Laboratory of Structural Molecular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Japan
| | - Yoshiki Shigemitsu
- Laboratory of Structural Molecular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Japan
| | - Motonori Ota
- Department of Complex Systems Science, Graduate School of Information Sciences, Nagoya University, Nagoya, 464-8601, Japan
| | - Hidekazu Hiroaki
- Laboratory of Structural Molecular Pharmacology, Graduate School of Pharmaceutical Sciences, Nagoya University, Nagoya, 464-8601, Japan; BeCellBar LLC, Business Incubation Center, Nagoya University, Nagoya, 464-8601, Aichi, Japan; The Structural Biology Research Center and Division of Biological Science, Graduate School of Science, Nagoya University, Nagoya, Japan
| |
|
35
|
Narasumani M, Harrison PM. Discerning evolutionary trends in post-translational modification and the effect of intrinsic disorder: Analysis of methylation, acetylation and ubiquitination sites in human proteins. PLoS Comput Biol 2018; 14:e1006349. [PMID: 30096183 PMCID: PMC6105011 DOI: 10.1371/journal.pcbi.1006349] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 02/26/2018] [Revised: 08/22/2018] [Accepted: 07/07/2018] [Indexed: 11/18/2022]
Abstract
Intrinsically disordered regions (IDRs) of proteins play significant biological functional roles despite lacking a well-defined 3D structure. For example, IDRs provide efficient housing for large numbers of post-translational modification (PTM) sites in eukaryotic proteins. Here, we study the distribution of more than 15,000 experimentally determined human methylation, acetylation and ubiquitination sites (collectively termed 'MAU' sites) in ordered and disordered regions, and analyse their conservation across 380 eukaryotic species. Conservation signals for the maintenance and novel emergence of MAU sites are examined at 11 evolutionary levels from the whole eukaryotic domain down to the ape superfamily, in both ordered and disordered regions. We discover that MAU PTM is a major driver of conservation for arginines and lysines in both ordered and disordered regions, across the 11 levels, most significantly across the mammalian clade. Conservation of human methylatable arginines is very strongly favoured for ordered regions rather than for disordered, whereas methylatable lysines are conserved in either set of regions, and conservation of acetylatable and ubiquitinatable lysines is favoured in disordered over ordered. Notably, we find evidence for the emergence of new lysine MAU sites in disordered regions of proteins in deuterostomes and mammals, and in ordered regions after the dawn of eutherians. For histones specifically, MAU sites demonstrate an idiosyncratic significant conservation pattern that is evident since the last common ancestor of mammals. Similarly, folding-on-binding (FB) regions are highly enriched for MAU sites relative to either ordered or disordered regions, with ubiquitination sites in FBs being highly conserved at all evolutionary levels back as far as mammals. 
This investigation clearly demonstrates the complex patterns of PTM evolution across the human proteome and that it is necessary to consider conservation of sequence features at multiple evolutionary levels in order not to get an incomplete or misleading picture.
|
36
|
Okuda M, Nakazawa Y, Guo C, Ogi T, Nishimura Y. Common TFIIH recruitment mechanism in global genome and transcription-coupled repair subpathways. Nucleic Acids Res 2017; 45:13043-13055. [PMID: 29069470 PMCID: PMC5727438 DOI: 10.1093/nar/gkx970] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 07/05/2017] [Revised: 10/03/2017] [Accepted: 10/10/2017] [Indexed: 12/17/2022]
Abstract
Nucleotide excision repair is initiated by two different damage recognition subpathways, global genome repair (GGR) and transcription-coupled repair (TCR). In GGR, XPC detects DNA lesions and recruits TFIIH via interaction with the pleckstrin homology (PH) domain of TFIIH subunit p62. In TCR, an elongating form of RNA Polymerase II detects a lesion on the transcribed strand and recruits TFIIH by an unknown mechanism. Here, we found that the TCR initiation factor UVSSA forms a stable complex with the PH domain of p62 via a short acidic string in the central region of UVSSA, and determined the complex structure by NMR. The acidic string of UVSSA binds strongly to the basic groove of the PH domain by inserting Phe408 and Val411 into two pockets, highly resembling the interaction mechanism of XPC with p62. Mutational binding analysis validated the structure and identified residues crucial for binding. TCR activity was markedly diminished in UVSSA-deficient cells expressing UVSSA mutated at Phe408 or Val411. Thus, a common TFIIH recruitment mechanism is shared by UVSSA in TCR and XPC in GGR.
Affiliation(s)
- Masahiko Okuda
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
| | - Yuka Nakazawa
- Department of Genome Repair, Atomic Bomb Disease Institute, Nagasaki University, 1-12-4, Sakamoto, Nagasaki 852-8523, Japan
- Department of Genetics, Research Institute of Environmental Medicine (RIeM), Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
| | - Chaowan Guo
- Department of Genetics, Research Institute of Environmental Medicine (RIeM), Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
| | - Tomoo Ogi
- Department of Genetics, Research Institute of Environmental Medicine (RIeM), Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
| | - Yoshifumi Nishimura
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
| |
|
37
|
Sequence conservation of protein binding segments in intrinsically disordered regions. Biochem Biophys Res Commun 2017; 494:602-607. [DOI: 10.1016/j.bbrc.2017.10.099] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/12/2017] [Accepted: 10/18/2017] [Indexed: 12/11/2022]
|
38
|
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight the newest methods and the predictors that offer strong predictive performance in recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to end users via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict the many not-yet-addressed functions of intrinsic disorder remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA.
| |
|
39
|
Salt-bridge networks within globular and disordered proteins: characterizing trends for designable interactions. J Mol Model 2017. [PMID: 28626846 DOI: 10.1007/s00894-017-3376-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 10/19/2022]
Abstract
There has been considerable debate about the contribution of salt bridges to the stabilization of protein folds, in spite of their participation in crucial protein functions. Salt bridges appear to contribute to the activity-stability trade-off within proteins by bringing high-entropy charged amino acids into close contacts during the course of their functions. The current study analyzes the modes of association of salt bridges (in terms of networks) within globular proteins and at protein-protein interfaces. While the most common and trivial type of salt bridge is the isolated salt bridge, bifurcated salt bridge appears to be a distinct salt-bridge motif having a special topology and geometry. Bifurcated salt bridges are found ubiquitously in proteins and interprotein complexes. Interesting and attractive examples presenting different modes of interaction are highlighted. Bifurcated salt bridges appear to function as molecular clips that are used to stitch together large surface contours at interacting protein interfaces. The present work also emphasizes the key role of salt-bridge-mediated interactions in the partial folding of proteins containing long stretches of disordered regions. Salt-bridge-mediated interactions seem to be pivotal to the promotion of "disorder-to-order" transitions in small disordered protein fragments and their stabilization upon binding. The results obtained in this work should help to guide efforts to elucidate the modus operandi of these partially disordered proteins, and to conceptualize how these proteins manage to maintain the required amount of disorder even in their bound forms. This work could also potentially facilitate explorations of geometrically specific designable salt bridges through the characterization of composite salt-bridge networks.
|
40
|
DisBind: A database of classified functional binding sites in disordered and structured regions of intrinsically disordered proteins. BMC Bioinformatics 2017; 18:206. [PMID: 28381244 PMCID: PMC5382478 DOI: 10.1186/s12859-017-1620-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/25/2016] [Accepted: 03/31/2017] [Indexed: 01/01/2023]
Abstract
Background Intrinsically unstructured or disordered proteins function via interacting with other molecules. Annotation of these binding sites is the first step for mapping functional impact of genetic variants in coding regions of human and other genomes, considering that a significant portion of eukaryotic genomes code for intrinsically disordered regions in proteins. Results DisBind (available at http://biophy.dzu.edu.cn/DisBind) is a collection of experimentally supported binding sites in intrinsically disordered proteins and proteins with both structured and disordered regions. There are a total of 226 IDPs with functional site annotations. These IDPs contain 465 structured regions (ORs) and 428 IDRs according to annotation by DisProt. The database contains a total of 4232 binding residues (from UniProt and PDB structures) in which 2836 residues are in ORs and 1396 in IDRs. These binding sites are classified according to their interacting partners including proteins, RNA, DNA, metal ions and others with 2984, 258, 383, 350, and 262 annotated binding sites, respectively. Each entry contains site-specific annotations (structured regions, intrinsically disordered regions, and functional binding regions) that are experimentally supported according to PDB structures or annotations from UniProt. Conclusion The searchable DisBind provides a reliable data resource for functional classification of intrinsically disordered proteins at the residue level.
|
41
|
Meng F, Uversky V, Kurgan L. Computational Prediction of Intrinsic Disorder in Proteins. Curr Protoc Protein Sci 2017; 88:2.16.1-2.16.14. [DOI: 10.1002/cpps.28] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Indexed: 12/31/2022]
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - Vladimir Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA
| |
|
42
|
Basu S, Söderquist F, Wallner B. Proteus: a random forest classifier to predict disorder-to-order transitioning binding regions in intrinsically disordered proteins. J Comput Aided Mol Des 2017; 31:453-466. [PMID: 28365882 PMCID: PMC5429364 DOI: 10.1007/s10822-017-0020-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 10/18/2016] [Accepted: 03/24/2017] [Indexed: 12/03/2022]
Abstract
The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs/IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues and are generally found in short contiguous stretches; the first step in understanding the modus operandi of an IDP/IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents ‘Proteus’, a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone. 
Proteus compares favorably with existing methods predicting twice as many true positives as the second best method (55 vs. 27%) with a much higher precision on an independent data set. The current study also sheds some light on a possible ‘disorder-to-order’ transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with a real-life structural modeling involving an IDPR using Proteus.
Affiliation(s)
- Sankar Basu
- Bioinformatics Division, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden; Department of Biochemistry, University of Calcutta, Kolkata, 700019, India
| | - Fredrik Söderquist
- Bioinformatics Division, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Björn Wallner
- Bioinformatics Division, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden; Swedish e-Science Research Center, Linköping University, Linköping, Sweden
| |
|
43
|
Assessment of virulence potential of uncharacterized Enterococcus faecalis strains using pan genomic approach - Identification of pathogen-specific and habitat-specific genes. Sci Rep 2016; 6:38648. [PMID: 27924951 PMCID: PMC5141418 DOI: 10.1038/srep38648] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 07/27/2016] [Accepted: 11/10/2016] [Indexed: 12/13/2022]
Abstract
Enterococcus faecalis, a leading nosocomial pathogen and yet a prominent member of gut microbiome, lacks clear demarcation between pathogenic and non-pathogenic strains at genome level. Here we present the comparative genome analysis of 36 E. faecalis strains with different pathogenic features and from different body-habitats. This study begins by addressing the genome dynamics, which shows that the pan-genome of E. faecalis is still open, though the core genome is nearly saturated. We identified eight uncharacterized strains as potential pathogens on the basis of their co-segregation with reported pathogens in gene presence-absence matrix and Pathogenicity Island (PAI) distribution. A ~7.4 kb genomic-cassette, which is itself a part of PAI, is found to exist in all reported and potential pathogens, but not in commensals and other uncharacterized strains. This region encodes four genes and among them, products of two hypothetical genes are predicted to be intrinsically disordered that may serve as novel targets for therapeutic measures. Exclusive existence of 215, 129, 4 and 1 genes in the blood, gastrointestinal tract, urogenital tract, oral cavity and lymph node derived E. faecalis genomes respectively suggests possible employment of distinct habitat-specific genetic strategies in the adaptation of E. faecalis in human host.
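The pan-genome, core genome, and group-specific genes mentioned in this abstract can be derived from a gene presence-absence table with plain set operations. The sketch below is a minimal illustration, not the study's pipeline: the strain and gene names are invented, the function names are hypothetical, and "group-specific" is assumed to mean present in every strain of the group and absent from every other strain.

```python
def pan_and_core(presence):
    """Return (pan-genome, core genome) from a {strain: set_of_genes} mapping.

    The pan-genome is the union of all gene sets; the core genome is their
    intersection. The mapping must contain at least one strain.
    """
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def group_specific_genes(presence, group):
    """Genes found in every strain of `group` and in no strain outside it."""
    inside = [genes for strain, genes in presence.items() if strain in group]
    outside = [genes for strain, genes in presence.items() if strain not in group]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    elsewhere = set().union(*outside)
    return shared - elsewhere


# Hypothetical presence-absence data (strain -> genes detected).
presence = {
    "pathogen_A": {"g1", "g2", "g3"},
    "pathogen_B": {"g1", "g2", "g4"},
    "commensal_C": {"g1", "g5"},
}
pan, core = pan_and_core(presence)
pathogen_specific = group_specific_genes(presence, {"pathogen_A", "pathogen_B"})
print(sorted(pan), sorted(core), sorted(pathogen_specific))
```

Passing a habitat group (for example, all blood-derived strains) instead of the pathogen group would give habitat-specific genes under the same assumed definition.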
|
44
|
Piovesan D, Tabaro F, Mičetić I, Necci M, Quaglia F, Oldfield CJ, Aspromonte MC, Davey NE, Davidović R, Dosztányi Z, Elofsson A, Gasparini A, Hatos A, Kajava AV, Kalmar L, Leonardi E, Lazar T, Macedo-Ribeiro S, Macossay-Castillo M, Meszaros A, Minervini G, Murvai N, Pujols J, Roche DB, Salladini E, Schad E, Schramm A, Szabo B, Tantos A, Tonello F, Tsirigos KD, Veljković N, Ventura S, Vranken W, Warholm P, Uversky VN, Dunker AK, Longhi S, Tompa P, Tosatto SCE. DisProt 7.0: a major update of the database of disordered proteins. Nucleic Acids Res 2016; 45:D219-D227. [PMID: 27899601 PMCID: PMC5210544 DOI: 10.1093/nar/gkw1056] [Citation(s) in RCA: 201] [Impact Index Per Article: 25.1] [Received: 09/27/2016] [Revised: 10/19/2016] [Accepted: 10/21/2016] [Indexed: 01/16/2023]
Abstract
The Database of Protein Disorder (DisProt, URL: www.disprot.org) has been significantly updated and upgraded since its last major renewal in 2007. The current release holds information on more than 800 entries of IDPs/IDRs, i.e. intrinsically disordered proteins or regions that exist and function without a well-defined three-dimensional structure. We have re-curated previous entries to purge DisProt of conflicting cases, and also upgraded the functional classification scheme to reflect continuous advance in the field in the past 10 years or so. We define IDPs as proteins that are disordered along their entire sequence, i.e. entirely lack structural elements, and IDRs as regions that are at least five consecutive residues without well-defined structure. We base our assessment of disorder strictly on experimental evidence, such as X-ray crystallography and nuclear magnetic resonance (primary techniques) and a broad range of other experimental approaches (secondary techniques). Confident and ambiguous annotations are highlighted separately. DisProt 7.0 presents classified knowledge regarding the experimental characterization and functional annotations of IDPs/IDRs, and is intended to provide an invaluable resource for the research community for a better understanding of structural disorder and for developing better computational tools for studying disordered proteins.
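The IDP and IDR definitions in this abstract translate directly into a run-length check over per-residue disorder flags. The following is a hedged sketch rather than DisProt code: the function names and the boolean-list input format are assumptions made for illustration, and only the five-residue minimum comes from the abstract.

```python
from itertools import groupby

MIN_IDR_LENGTH = 5  # "at least five consecutive residues" per the abstract


def find_idrs(disorder_flags, min_length=MIN_IDR_LENGTH):
    """Return 1-based inclusive (start, end) pairs for disordered runs.

    `disorder_flags` holds one boolean per residue (True = no well-defined
    structure). Runs shorter than `min_length` are ignored.
    """
    regions = []
    position = 1
    for flag, run in groupby(disorder_flags):
        length = len(list(run))
        if flag and length >= min_length:
            regions.append((position, position + length - 1))
        position += length
    return regions


def is_idp(disorder_flags):
    """True if every residue is disordered, i.e. the whole protein is an IDP."""
    return len(disorder_flags) > 0 and all(disorder_flags)


# Hypothetical 20-residue protein: 6 disordered, 3 ordered, 4 disordered,
# 2 ordered, 5 disordered.
flags = [True] * 6 + [False] * 3 + [True] * 4 + [False] * 2 + [True] * 5
print(find_idrs(flags), is_idp(flags))
```

In this made-up example the four-residue disordered stretch is dropped because it falls below the five-residue minimum.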
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy
| | - Francesco Tabaro
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy; Institute of Biosciences and Medical Technology, University of Tampere, Finland
| | - Ivan Mičetić
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy
| | - Marco Necci
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy
| | - Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202 Indianapolis, IN, USA
| | | | - Norman E Davey
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland; Ireland UCD School of Medicine & Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Radoslav Davidović
- Centre for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, 11001 Belgrade, Serbia
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, 1/c Pázmány Péter sétány, 1117 Budapest, Hungary; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Alessandra Gasparini
- Department of Woman and Child Health, University of Padova, I-35128 Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, Montpellier 34293, France; Institut de Biologie Computationnelle (IBC), Montpellier 34095, France; University ITMO, Institute of Bioengineering, St. Petersburg 197101, Russia
| | - Lajos Kalmar
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary; Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, I-35128 Padova, Italy
| | - Tamas Lazar
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Sandra Macedo-Ribeiro
- Biomolecular Structure and Function Group, Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
| | - Mauricio Macossay-Castillo
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Attila Meszaros
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Giovanni Minervini
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy
| | - Nikoletta Murvai
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Jordi Pujols
- Departament de Bioquimica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Daniel B Roche
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, Montpellier 34293, France; Institut de Biologie Computationnelle (IBC), Montpellier 34095, France
| | | | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | | | - Beata Szabo
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Agnes Tantos
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Fiorella Tonello
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy; CNR Institute of Neuroscience, I-35121 Padova, Italy
| | - Konstantinos D Tsirigos
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Nevena Veljković
- Centre for Multidisciplinary Research, Institute of Nuclear Sciences Vinca, University of Belgrade, 11001 Belgrade, Serbia
| | - Salvador Ventura
- Departament de Bioquimica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Wim Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB2), ULB-VUB, Brussels 1050, Belgium
| | - Per Warholm
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Vladimir N Uversky
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, 194064 St. Petersburg, Russia; Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
| | - A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202 Indianapolis, IN, USA
| | - Sonia Longhi
- Aix-Marseille Univ, CNRS, AFMB, UMR 7257, Marseille, France
| | - Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary; Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (SBRC), Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, I-35121 Padova, Italy; CNR Institute of Neuroscience, I-35121 Padova, Italy
| |
|
45
|
Shaji D. The relationship between relative solvent accessible surface area (rASA) and irregular structures in protean segments (ProSs). Bioinformation 2016; 12:381-387. [PMID: 28250616 PMCID: PMC5314839 DOI: 10.6026/97320630012381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2016] [Accepted: 11/18/2016] [Indexed: 11/23/2022]
Abstract
Intrinsically Disordered Proteins (IDPs) lack a stable, three-dimensional structure under physiological conditions, yet they exhibit numerous biological activities. Protean segments (ProSs) are the functional regions of intrinsically disordered proteins that undergo disorder-to-order transitions upon binding to their partners. Example ProSs were collected from the intrinsically disordered proteins with extensive annotations and literature (IDEAL) database. The interface of each ProS was classified into core, rim, and support, and its secondary structure elements (SSEs) were analyzed based on the relative solvent accessible surface area (rASA). The amino acid compositions and the rASAs of ProS SSEs at the interface, core and rim were compared to those of heterodimers. The average number of contacts of alpha helices and irregular residues was calculated for each ProS and heterodimer. Furthermore, the ProSs were classified as high or low efficiency based on their average number of contacts at the interface. The results indicate that the irregular structures of ProSs and heterodimers are significantly different. The rASA of irregular structures in the monomeric state (rASAm) is large, which leads to a larger ΔrASA and many contacts in ProSs.
Affiliation(s)
- Divya Shaji
- Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
| |
|
46
|
Necci M, Piovesan D, Tosatto SCE. Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe. Protein Sci 2016; 25:2164-2174. [PMID: 27636733 DOI: 10.1002/pro.3041] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 08/17/2016] [Revised: 09/12/2016] [Accepted: 09/12/2016] [Indexed: 12/22/2022]
Abstract
Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures.
Affiliation(s)
- Marco Necci
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy
- Damiano Piovesan
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy
- Silvio C E Tosatto
- Department of Biomedical Sciences and CRIBI Biotech Center, University of Padua, Padua, Italy; CNR Institute of Neuroscience, Padua, Italy
|
47
|
Dos Santos HG, Siltberg-Liberles J. Paralog-Specific Patterns of Structural Disorder and Phosphorylation in the Vertebrate SH3-SH2-Tyrosine Kinase Protein Family. Genome Biol Evol 2016; 8:2806-25. [PMID: 27519537 PMCID: PMC5630953 DOI: 10.1093/gbe/evw194] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Accepted: 08/06/2016] [Indexed: 12/21/2022] Open
Abstract
One of the largest multigene families in Metazoa are the tyrosine kinases (TKs). These are important multifunctional proteins that have evolved as dynamic switches that perform tyrosine phosphorylation and other noncatalytic activities regulated by various allosteric mechanisms. TKs interact with each other and with other molecules, ultimately activating and inhibiting different signaling pathways. TKs are implicated in cancer and almost 30 FDA-approved TK inhibitors are available. However, specific binding is a challenge when targeting an active site that has been conserved in multiple protein paralogs for millions of years. A cassette domain (CD) containing SH3-SH2-Tyrosine Kinase domains reoccurs in vertebrate nonreceptor TKs. Although part of the CD function is shared between TKs, it also presents TK specific features. Here, the evolutionary dynamics of sequence, structure, and phosphorylation across the CD in 17 TK paralogs have been investigated in a large-scale study. We establish that TKs often have ortholog-specific structural disorder and phosphorylation patterns, while secondary structure elements, as expected, are highly conserved. Further, domain-specific differences are at play. Notably, we found the catalytic domain to fluctuate more in certain secondary structure elements than the regulatory domains. By elucidating how different properties evolve after gene duplications and which properties are specifically conserved within orthologs, the mechanistic understanding of protein evolution is enriched and regions supposedly critical for functional divergence across paralogs are highlighted.
Affiliation(s)
- Helena G Dos Santos
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University
- Jessica Siltberg-Liberles
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University
|
48
|
Shaji D, Amemiya T, Koike R, Ota M. Interface property responsible for effective interactions of protean segments: Intrinsically disordered regions that undergo disorder-to-order transitions upon binding. Biochem Biophys Res Commun 2016; 478:123-127. [DOI: 10.1016/j.bbrc.2016.07.082] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/13/2016] [Accepted: 07/19/2016] [Indexed: 12/16/2022]
|
49
|
Alawad A, Alharbi S, Alhazzaa O, Alagrafi F, Alkhrayef M, Alhamdan Z, Alenazi A, Al-Johi H, Alanazi IO, Hammad M. Phylogenetic and Structural Analysis of the Pluripotency Factor Sex-Determining Region Y box2 Gene of Camelus dromedarius (cSox2). Bioinform Biol Insights 2016; 10:111-20. [PMID: 27486314 PMCID: PMC4962958 DOI: 10.4137/bbi.s39047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/09/2016] [Revised: 05/15/2016] [Accepted: 05/21/2016] [Indexed: 12/18/2022] Open
Abstract
Although sequence information for Sox2 cDNA is available for many mammals, the Sox2 cDNA of Camelus dromedarius has not yet been characterized. The objective of this study was to sequence and characterize Sox2 cDNA from the brain of C. dromedarius (also known as the Arabian camel). For the first time, a full coding sequence of the Sox2 gene from the brain of C. dromedarius was amplified by reverse transcription PCR and then sequenced using the 3730XL series platform Sequencer (Applied Biosystems). The cDNA sequence displayed an open reading frame of 822 nucleotides, encoding a protein of 273 amino acids. The molecular weight and the isoelectric point of the translated protein were calculated as 29.825 kDa and 10.11, respectively, using bioinformatics analysis. The predicted cSox2 protein sequence exhibited high identity: 99% with Homo sapiens, Mus musculus, Bos taurus, and Vicugna pacos; 98% with Sus scrofa; and 93% with Camelus ferus. A 3D structure was built based on the available crystal structure of the HMG-box domain of human stem cell transcription factor Sox2 (PDB: 2LE4) with 81 residues, and on bioinformatics prediction software for all 273 amino acid residues. The comparison confirms the presence of the HMG-box domain in the cSox2 protein. The orthologous phylogenetic analysis showed that the Sox2 isoform from C. dromedarius was grouped with humans, alpacas, cattle, and pigs. We believe that this genetic and structural information will be a helpful source for annotation. Furthermore, Sox2 is one of the transcription factors that contribute to the generation of induced pluripotent stem cells (iPSCs), which in turn will probably help generate camel induced pluripotent stem cells (CiPSCs).
Affiliation(s)
- Abdullah Alawad
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Sultan Alharbi
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Othman Alhazzaa
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Faisal Alagrafi
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Mohammed Alkhrayef
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Ziyad Alhamdan
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Abdullah Alenazi
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Hasan Al-Johi
- National Center for Genomic Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Ibrahim O Alanazi
- National Center for Genomic Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA
- Mohamed Hammad
- National Center for Stem Cell Technology, King Abdulaziz City for Science and Technology, Riyadh, KSA; SAAD Research and Development Center, Clinical Research Laboratory and Radiation Oncology, SAAD Specialist Hospital, Al Khobar, KSA
|
50
|
Vincent M, Schnell S. A collection of intrinsic disorder characterizations from eukaryotic proteomes. Sci Data 2016; 3:160045. [PMID: 27326998 PMCID: PMC4915274 DOI: 10.1038/sdata.2016.45] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/06/2016] [Accepted: 05/04/2016] [Indexed: 12/17/2022] Open
Abstract
Intrinsically disordered proteins and protein regions lack a stable three-dimensional structure under physiological conditions. Several proteomic investigations of intrinsic disorder have been performed to date and have found disorder to be prevalent in eukaryotic proteomes. Here we present descriptive statistics of intrinsic disorder features for ten model eukaryotic proteomes that have been calculated from computational disorder prediction algorithms. The data descriptor also provides consensus disorder annotations as well as additional physical parameters relevant to protein disorder, and further provides protein existence information for all proteins included in our analysis. The complete datasets can be downloaded freely, and it is envisaged that they will be updated periodically with new proteomes and protein disorder prediction algorithms. These datasets will be especially useful for assessing protein disorder, and conducting novel analyses that advance our understanding of intrinsic disorder and protein structure.
Affiliation(s)
- Michael Vincent
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan 48109-0622, USA
- Santiago Schnell
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan 48109-0622, USA; Department of Computational Medicine & Bioinformatics, University of Michigan Medical School, Michigan 48109-2218, USA; Brehm Center for Diabetes Research, University of Michigan Medical School, Ann Arbor, Michigan 48105-1912, USA
|