1
Crawshaw S, Murphy AM, Rowling PJE, Nietlispach D, Itzhaki LS, Carr JP. Investigating the Interactions of the Cucumber Mosaic Virus 2b Protein with the Viral 1a Replicase Component and the Cellular RNA Silencing Factor Argonaute 1. Viruses 2024; 16:676. [PMID: 38793558; PMCID: PMC11125589; DOI: 10.3390/v16050676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Revised: 04/13/2024] [Accepted: 04/16/2024] [Indexed: 05/26/2024]
Abstract
The cucumber mosaic virus (CMV) 2b protein is a suppressor of plant defenses and a pathogenicity determinant. Amongst the 2b protein's host targets is the RNA silencing factor Argonaute 1 (AGO1), which it binds to and inhibits. In Arabidopsis thaliana, if 2b-induced inhibition of AGO1 is too efficient, it induces reinforcement of antiviral silencing by AGO2 and triggers increased resistance against aphids, CMV's insect vectors. These effects would be deleterious to CMV replication and transmission, respectively, but are moderated by the CMV 1a protein, which sequesters sufficient 2b protein molecules into P-bodies to prevent excessive inhibition of AGO1. Mutant 2b protein variants were generated, and red and green fluorescent protein fusions were used to investigate subcellular colocalization with AGO1 and the 1a protein. The effects of mutations on complex formation with the 1a protein and AGO1 were investigated using bimolecular fluorescence complementation and co-immunoprecipitation assays. Although we found that residues 56-60 influenced the 2b protein's interactions with the 1a protein and AGO1, it appears unlikely that any single residue or sequence domain is solely responsible. In silico predictions of intrinsic disorder within the 2b protein secondary structure were supported by circular dichroism (CD) but not by nuclear magnetic resonance (NMR) spectroscopy. Intrinsic disorder provides a plausible model to explain the 2b protein's ability to interact with AGO1, the 1a protein, and other factors. However, the reasons for the conflicting conclusions provided by CD and NMR must first be resolved.
Affiliation(s)
- Sam Crawshaw
  - Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EA, UK
- Alex M. Murphy
  - Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EA, UK
- Pamela J. E. Rowling
  - Department of Pharmacology, University of Cambridge, Tennis Court Rd., Cambridge CB2 1PD, UK
- Daniel Nietlispach
  - Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Rd., Cambridge CB2 1GA, UK
- Laura S. Itzhaki
  - Department of Pharmacology, University of Cambridge, Tennis Court Rd., Cambridge CB2 1PD, UK
- John P. Carr
  - Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EA, UK
2
Shahrajabian MH, Sun W. Characterization of Intrinsically Disordered Proteins in Healthy and Diseased States by Nuclear Magnetic Resonance. Rev Recent Clin Trials 2024; 19:176-188. [PMID: 38409704; DOI: 10.2174/0115748871271420240213064251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/10/2023] [Accepted: 12/13/2023] [Indexed: 02/28/2024]
Abstract
INTRODUCTION Intrinsically Disordered Proteins (IDPs) are active in different cellular procedures like ordered assembly of chromatin and ribosomes, interaction with membrane, protein, and ligand binding, molecular recognition, binding, and transportation via nuclear pores, microfilaments and microtubules process and disassembly, protein functions, RNA chaperone, and nucleic acid binding, modulation of the central dogma, cell cycle, and other cellular activities, post-translational qualification and substitute splicing, and flexible entropic linker and management of signaling pathways. METHODS The intrinsic disorder is a precise structural characteristic that permits IDPs/IDPRs to be involved in both one-to-many and many-to-one signaling. IDPs/IDPRs also exert some dynamical and structural ordering, being much less constrained in their activities than folded proteins. Nuclear magnetic resonance (NMR) spectroscopy is a major technique for the characterization of IDPs, and it can be used for dynamic and structural studies of IDPs. RESULTS AND CONCLUSION This review was carried out to discuss intrinsically disordered proteins and their different goals, as well as the importance and effectiveness of NMR in characterizing intrinsically disordered proteins in healthy and diseased states.
Affiliation(s)
- Mohamad Hesam Shahrajabian
  - National Key Laboratory of Agricultural Microbiology, Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Wenli Sun
  - National Key Laboratory of Agricultural Microbiology, Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
3
Zhao B, Ghadermarzi S, Kurgan L. Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J 2023; 21:3248-3258. [PMID: 38213902; PMCID: PMC10782001; DOI: 10.1016/j.csbj.2023.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 01/13/2024]
Abstract
We expand studies of AlphaFold2 (AF2) in the context of intrinsic disorder prediction by comparing it against a broad selection of 20 accurate, popular and recently released disorder predictors. We use a 25% larger benchmark dataset with 646 proteins and cover protein-level predictions of disorder content and fully disordered proteins. AF2-based disorder predictions secure a relatively high Area Under receiver operating characteristic Curve (AUC) of 0.77 and are statistically outperformed by several modern disorder predictors that secure AUCs around 0.8 with a median runtime of about 20 s compared to 1200 s for AF2. Moreover, AF2 provides modestly accurate predictions of fully disordered proteins (F1 = 0.59 vs. 0.91 for the best disorder predictor) and disorder content (mean absolute error of 0.21 vs. 0.15). AF2 also generates statistically more accurate disorder predictions for about 20% of proteins that have relatively short sequences and a few disordered regions that tend to be located at the sequence termini, and that lack disordered protein-binding regions. Interestingly, AF2 and the most accurate disorder predictors rely on deep neural networks, suggesting that these models are useful for protein structure and disorder predictions.
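The two protein-level measures quoted in this abstract, the mean absolute error of disorder content and the F1 score for fully disordered proteins, can be illustrated with a minimal sketch. The function names, the toy annotations, and the 0.95 content cutoff for calling a protein "fully disordered" are assumptions made here for illustration; the abstract does not state how the authors define or compute these quantities.

```python
from typing import Sequence


def disorder_content(labels: Sequence[int]) -> float:
    """Fraction of residues labelled disordered (1) in one protein."""
    return sum(labels) / len(labels)


def content_mae(true_labels, pred_labels) -> float:
    """Mean absolute error of disorder content, averaged over proteins."""
    errors = [
        abs(disorder_content(t) - disorder_content(p))
        for t, p in zip(true_labels, pred_labels)
    ]
    return sum(errors) / len(errors)


def fully_disordered_f1(true_labels, pred_labels, cutoff: float = 0.95) -> float:
    """F1 for the protein-level 'fully disordered' class.

    `cutoff` is a hypothetical threshold on disorder content,
    not a value taken from the paper.
    """
    truth = [disorder_content(t) >= cutoff for t in true_labels]
    pred = [disorder_content(p) >= cutoff for p in pred_labels]
    tp = sum(t and p for t, p in zip(truth, pred))
    fp = sum(p and not t for t, p in zip(truth, pred))
    fn = sum(t and not p for t, p in zip(truth, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy binary annotations (1 = disordered), invented for illustration only.
true_labels = [[1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
pred_labels = [[1, 1, 1, 1], [0, 1, 1, 1], [1, 1, 1, 1]]

mae = content_mae(true_labels, pred_labels)
f1 = fully_disordered_f1(true_labels, pred_labels)
print(f"disorder-content MAE = {mae:.3f}, fully-disordered F1 = {f1:.3f}")
```

In this toy case the third protein is wrongly predicted as fully disordered, which raises the content error and costs one false positive in the F1 calculation.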
Affiliation(s)
- Bi Zhao
  - Genomics program, College of Public Health, University of South Florida, Tampa, FL, United States
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
4
Choudhary P, Anyango S, Berrisford J, Tolchard J, Varadi M, Velankar S. Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data. Sci Data 2023; 10:204. [PMID: 37045837; PMCID: PMC10097656; DOI: 10.1038/s41597-023-02101-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/23/2023] [Indexed: 04/14/2023]
Abstract
More than 61,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. SIFTS incorporates residue-level annotations from many other biological resources. SIFTS data are available in several formats (XML, CSV and TSV) and via the PDBe REST API, but have always been maintained separately from the structure data (PDBx/mmCIF files) in the PDB archive. Here, we extended the wwPDB PDBx/mmCIF data dictionary with additional categories to accommodate SIFTS data and added the UniProtKB, Pfam, SCOP2, and CATH residue-level annotations directly into the PDBx/mmCIF files from the PDB archive. With the integrated UniProtKB annotations, these files now provide consistent numbering of residues in different PDB entries, allowing easy comparison of structure models. The extended dictionary yields a more consistent, standardised metadata description without altering the core PDB information. This development enables up-to-date cross-reference information at the residue level, resulting in better data interoperability and supporting improved data analysis and visualisation.
Grants
- BB/V004247/1, PI: Sameer Velankar, RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
- DBI-2019297, PI: S.K. Burley, National Science Foundation (NSF)
- DBI-2019297, PI: S.K. Burley, NSF | National Science Board (NSB)
Affiliation(s)
- Preeti Choudhary
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Stephen Anyango
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- John Berrisford
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - AstraZeneca, Biomedical Campus, 1 Francis Crick Ave, Trumpington, Cambridge, CB2 0AA, UK
- James Tolchard
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Claude Bernard University, Villeurbanne, Lyon, 69100, France
- Mihaly Varadi
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
5
Dayhoff GW, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci 2022; 31:e4496. [PMID: 36334049; PMCID: PMC9679974; DOI: 10.1002/pro.4496] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 08/15/2022] [Revised: 10/28/2022] [Accepted: 11/02/2022] [Indexed: 11/07/2022]
Abstract
Protein intrinsic disorder is found in all kingdoms of life and is known to underpin numerous physiological and pathological processes. Computational methods play an important role in characterizing and identifying intrinsically disordered proteins and protein regions. Herein, we present a new high-efficiency web-based disorder predictor named Rapid Intrinsic Disorder Analysis Online (RIDAO) that is designed to facilitate the application of protein intrinsic disorder analysis in genome-scale structural bioinformatics and comparative genomics/proteomics. RIDAO integrates six established disorder predictors into a single, unified platform that reproduces the results of individual predictors with near-perfect fidelity. To demonstrate the potential applications, we construct a test set containing more than one million sequences from one hundred organisms comprising over 420 million residues. Using this test set, we compare the efficiency and accessibility (i.e., ease of use) of RIDAO to five well-known and popular disorder predictors, namely: AUCpreD, IUPred3, metapredict V2, flDPnn, and SPOT-Disorder2. We show that RIDAO yields per-residue predictions at a rate two to six orders of magnitude greater than the other predictors and completely processes the test set in under an hour. RIDAO can be accessed free of charge at https://ridao.app.
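As a rough reading of the throughput claim, the figures in the abstract imply a lower bound on processing rate. The short calculation below uses only the stated bounds (more than one million sequences, over 420 million residues, under an hour); the variable names are illustrative, and the true rates reported in the paper may be higher.

```python
# Figures quoted in the abstract, taken at their stated bounds.
n_sequences = 1_000_000      # "more than one million sequences"
n_residues = 420_000_000     # "over 420 million residues"
max_seconds = 60 * 60        # "under an hour"

# Dividing a lower bound on work by an upper bound on time gives a lower bound on rate.
min_residues_per_s = n_residues / max_seconds
min_sequences_per_s = n_sequences / max_seconds
print(
    f"at least {min_residues_per_s:,.0f} residues/s "
    f"and {min_sequences_per_s:,.0f} sequences/s"
)
```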
Affiliation(s)
- Guy W. Dayhoff
  - Department of Chemistry, University of South Florida, Tampa, Florida, USA
- Vladimir N. Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, University of South Florida, Tampa, Florida, USA
6
Polanco C, Uversky VN, Huberman A, Vargas-Alarcón G, Castañón González JA, Buhse T, Hernández Lemus E, Rios Castro M, López Oliva EJ, Solís Nájera SE. Bioinformatics-based Characterization of the Sequence Variability of Zika Virus Polyprotein and Envelope Protein (E). Evol Bioinform Online 2022; 18:11769343221130730. [PMCID: PMC9623037; DOI: 10.1177/11769343221130730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Accepted: 09/12/2022] [Indexed: 11/17/2022]
Abstract
Background: Zika virus, which is widely spread and infects humans through the bites of Aedes albopictus and Aedes aegypti female mosquitoes, represents a serious global health issue. Objective: The objective of the present study is to computationally characterize Zika virus polyproteins (UniProt Name: PRO_0000443018 [residues 1-3423], PRO_0000445659 [residues 1-3423] and PRO_0000435828 [residues 1-3419]) and their envelope proteins using their physico-chemical properties. Methods: To achieve this, the Polarity Index Method (PIM) profile and the Protein Intrinsic Disorder Predisposition (PIDP) profile of 3 main groups of proteins were evaluated: structural proteins extracted from specific Databases, Zika virus polyproteins, and their envelope proteins (E) extracted from UniProt Database. Once the PIM profile of the Zika virus envelope proteins (E) was obtained and since the Zika virus polyproteins were also identified with this profile, the proteins defined as “reviewed proteins” extracted from the UniProt Database were searched for the similar PIM profile. Finally, the difference between the PIM profiles of the Zika virus polyproteins and their envelope proteins (E) was tested using 2 non-parametric statistical tests. Results: It was found and tested that the PIM profile is an efficient discriminant that allows obtaining a “computational fingerprint” of each Zika virus polyprotein from its envelope protein (E). Conclusion: PIM profile represents a computational tool, which can be used to effectively discover Zika virus polyproteins from Databases, from their envelope proteins (E) sequences.
Affiliation(s)
- Carlos Polanco
  - Department of Electromechanical Instrumentation, Instituto Nacional de Cardiología “Ignacio Chávez,” México City, México
  - Department of Mathematics, Faculty of Sciences, Universidad Nacional Autónoma de México, México City, México
  - Carlos Polanco, Department of Electromechanical Instrumentation, Instituto Nacional de Cardiología “Ignacio Chávez,” Juan Badiano 1 Tlalpan, México City 14800, México
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Protein Research Group, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences,” Pushchino, Moscow Region, Russia
- Alberto Huberman
  - Department of Biochemistry, Instituto Nacional de Ciencias Médicas y Nutrición “Salvador Zubirán”, México City, México
- Thomas Buhse
  - Chemical Research Center, Universidad Autónoma del Estado de Morelos, Cuernavaca, Morelos, México
- Enrique Hernández Lemus
  - Department of Computational Genomics, Instituto Nacional de Medicina Genómica, México City, México
- Martha Rios Castro
  - Department of Electromechanical Instrumentation, Instituto Nacional de Cardiología “Ignacio Chávez,” México City, México
- Erika Jeannette López Oliva
  - Department of Electromechanical Instrumentation, Instituto Nacional de Cardiología “Ignacio Chávez,” México City, México
7
Ilzhöfer D, Heinzinger M, Rost B. SETH predicts nuances of residue disorder from protein embeddings. Frontiers in Bioinformatics 2022; 2:1019597. [PMID: 36304335; PMCID: PMC9580958; DOI: 10.3389/fbinf.2022.1019597] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/15/2022] [Accepted: 09/20/2022] [Indexed: 11/07/2022]
Abstract
Predictions for millions of protein three-dimensional structures are only a few clicks away since the release of AlphaFold2 results for UniProt. However, many proteins have so-called intrinsically disordered regions (IDRs) that do not adopt unique structures in isolation. These IDRs are associated with several diseases, including Alzheimer's Disease. We showed that three recent disorder measures of AlphaFold2 predictions (pLDDT, "experimentally resolved" prediction and "relative solvent accessibility") correlated to some extent with IDRs. However, expert methods predict IDRs more reliably by combining complex machine learning models with expert-crafted input features and evolutionary information from multiple sequence alignments (MSAs). MSAs are not always available, especially for IDRs, and are computationally expensive to generate, limiting the scalability of the associated tools. Here, we present the novel method SETH that predicts residue disorder from embeddings generated by the protein Language Model ProtT5, which explicitly only uses single sequences as input. Thereby, our method, relying on a relatively shallow convolutional neural network, outperformed much more complex solutions while being much faster, allowing predictions for the human proteome to be created in about 1 hour on a consumer-grade PC with one NVIDIA GeForce RTX 3060. Trained on a continuous disorder scale (CheZOD scores), our method captured subtle variations in disorder, thereby providing important information beyond the binary classification of most methods. High performance paired with speed revealed that SETH's nuanced disorder predictions for entire proteomes capture aspects of the evolution of organisms. Additionally, SETH could also be used to filter out regions or proteins with probable low-quality AlphaFold2 3D structures to prioritize running the compute-intensive predictions for large data sets. SETH is freely publicly available at: https://github.com/Rostlab/SETH.
Affiliation(s)
- Dagmar Ilzhöfer
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Michael Heinzinger
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
  - Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), TUM Graduate School, Garching, Germany
- Burkhard Rost
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
  - Institute for Advanced Study (TUM-IAS), TUM (Technical University of Munich), Garching, Germany
  - TUM School of Life Sciences Weihenstephan (WZW), TUM (Technical University of Munich), Freising, Germany
8
Elkhaligy H, Balbin CA, Siltberg-Liberles J. Comparative Analysis of Structural Features in SLiMs from Eukaryotes, Bacteria, and Viruses with Importance for Host-Pathogen Interactions. Pathogens 2022; 11:583. [PMID: 35631103; PMCID: PMC9147284; DOI: 10.3390/pathogens11050583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 05/06/2022] [Accepted: 05/11/2022] [Indexed: 11/19/2022]
Abstract
Protein-protein interactions drive functions in eukaryotes that can be described by short linear motifs (SLiMs). Conservation of SLiMs helps illuminate functional SLiMs in eukaryotic protein families. However, the simplicity of eukaryotic SLiMs makes them appear by chance due to mutational processes not only in eukaryotes but also in pathogenic bacteria and viruses. Further, functional eukaryotic SLiMs are often found in disordered regions. Although proteomes from pathogenic bacteria and viruses have less disorder than eukaryotic proteomes, their proteins can successfully mimic eukaryotic SLiMs and disrupt host cellular function. Identifying important SLiMs in pathogens is difficult but essential for understanding potential host-pathogen interactions. We performed a comparative analysis of structural features for experimentally verified SLiMs from the Eukaryotic Linear Motif (ELM) database across viruses, bacteria, and eukaryotes. Our results revealed that many viral SLiMs and specific motifs found across viruses and eukaryotes, such as some glycosylation motifs, have less disorder. Analyzing the disorder and coil properties of equivalent SLiMs from pathogens and eukaryotes revealed that some motifs are more structured in pathogens than their eukaryotic counterparts and vice versa. These results support a varying mechanism of interaction between pathogens and their eukaryotic hosts for some of the same motifs.
9
Mandel C, Yang H, Buchko GW, Abendroth J, Grieshaber N, Chiarelli T, Grieshaber S, Omsland A. Expression and structure of the Chlamydia trachomatis DksA ortholog. Pathog Dis 2022; 80:6564600. [PMID: 35388904; PMCID: PMC9126822; DOI: 10.1093/femspd/ftac007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Revised: 02/15/2022] [Accepted: 04/04/2022] [Indexed: 11/14/2022]
Abstract
Chlamydia trachomatis is a bacterial obligate intracellular parasite and a significant cause of human disease, including sexually transmitted infections and trachoma. The bacterial RNA polymerase-binding protein DksA is a transcription factor integral to the multicomponent bacterial stress response pathway known as the stringent response. The genome of C. trachomatis encodes a DksA ortholog (DksACt) that is maximally expressed at 15–20 h post infection, a time frame correlating with the onset of transition between the replicative reticulate body (RB) and infectious elementary body (EB) forms of the pathogen. Ectopic overexpression of DksACt in C. trachomatis prior to RB–EB transitions during infection of HeLa cells resulted in a 39.3% reduction in overall replication (yield) and a 49.6% reduction in recovered EBs. While the overall domain organization of DksACt is similar to the DksA ortholog of Escherichia coli (DksAEc), DksACt did not functionally complement DksAEc. Transcription of dksACt is regulated by tandem promoters, one of which also controls expression of nrdR, encoding a negative regulator of deoxyribonucleotide biosynthesis. The phenotype resulting from ectopic expression of DksACt and the correlation between dksACt and nrdR expression is consistent with a role for DksACt in the C. trachomatis developmental cycle.
Affiliation(s)
- Cameron Mandel
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA 99164, USA
- Hong Yang
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA 99164, USA
- Garry W Buchko
  - School of Molecular Biosciences, Washington State University, Pullman WA 99164, USA
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99354, USA
  - Seattle Structural Genomics Center for Infectious Disease, WA, USA
- Jan Abendroth
  - Seattle Structural Genomics Center for Infectious Disease, WA, USA
  - UCB, Bainbridge Island, WA 98110, USA
- Nicole Grieshaber
  - Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
- Travis Chiarelli
  - Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
- Scott Grieshaber
  - Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
- Anders Omsland
  - Paul G. Allen School for Global Health, Washington State University, Pullman, WA 99164, USA
10
Mitra D, Pal AK, Das Mohapatra PK. Intra-protein interactions of SARS-CoV-2 and SARS: a bioinformatic analysis for plausible explanation regarding stability, divergency, and severity. Systems Microbiology and Biomanufacturing 2022; 2:653-664. [PMID: 38624777; PMCID: PMC8935616; DOI: 10.1007/s43393-022-00091-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/13/2022] [Revised: 02/27/2022] [Accepted: 02/28/2022] [Indexed: 11/16/2022]
Abstract
The current nightmare for the whole world is COVID-19. The occurrence of concentrated pneumonia cases in Wuhan city, Hubei province of China, was first reported on December 30, 2019. SARS-CoV was first disclosed in 2002 but did not spread worldwide. After 18 years, in 2020, it reemerged and spread worldwide as SARS-CoV-2 (COVID-19), the most dangerous disease-causing virus in the world. Is such favorable evolution possible within so short a time (18 years)? If so, which properties or factors changed in SARS-CoV-2 to make it so hard to defeat? What are the fundamental differences between SARS-CoV-2 and SARS? This study is one initiative to answer those questions. Here, four types of protein sequences from SARS-CoV-2 and SARS were retrieved from the database to study their physicochemical and structural properties. Results showed that charged residues play a pivotal role in SARS-CoV-2 evolution and contribute to helix stabilization. The formation of cyclic salt bridges and other intra-protein interactions, especially networked aromatic-aromatic interactions, also plays a crucial role in SARS-CoV-2. This comparative study will help in understanding the evolution from SARS to SARS-CoV-2 and will be helpful in protein engineering.
Affiliation(s)
- Debanjan Mitra
  - Department of Microbiology, Raiganj University, Raiganj, WB India
- Aditya K. Pal
  - Department of Microbiology, Raiganj University, Raiganj, WB India
11
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546; PMCID: PMC8927795; DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022]
Abstract
Intrinsic disorder prediction is an active area that has developed over 100 predictors. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners account for the majority of new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
12
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022]
13
Afanasyeva TAV, Schnellbach YT, Gibson TJ, Roepman R, Collin RWJ. OUP accepted manuscript. Hum Mol Genet 2022; 31:2560-2570. [PMID: 35253837; PMCID: PMC9396937; DOI: 10.1093/hmg/ddac057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/03/2022] [Revised: 02/17/2022] [Accepted: 03/03/2022] [Indexed: 11/14/2022]
Abstract
Retinitis pigmentosa (RP) is a genetically heterogeneous form of inherited retinal disease that leads to progressive visual impairment. One genetic subtype of RP, RP54, has been linked to mutations in PCARE (photoreceptor cilium actin regulator). We have recently shown that PCARE recruits WASF3 to the tip of a primary cilium, and thereby activates an Arp2/3 complex, which results in the remodeling of actin filaments that drives the expansion of the ciliary tip membrane. On the basis of these findings, and the lack of proper photoreceptor development in mice lacking Pcare, we postulated that PCARE plays an important role in photoreceptor outer segment disk formation. In this study, we aimed to decipher the relationship between predicted structural and functional amino acid motifs within PCARE and its function. Our results show that PCARE contains a predicted helical coiled coil domain together with evolutionarily conserved binding sites for photoreceptor kinase MAK (type RP62), as well as EVH1 domain-binding linear motifs. Upon deletion of the helical domain, PCARE failed to localize to the cilia. Furthermore, upon deletion of the EVH1 domain-binding motifs separately or together, co-expression of mutant protein with WASF3 resulted in smaller ciliary tip membrane expansions. Finally, inactivation of the lipid modification on the cysteine residue at amino acid position 3 also caused a moderate decrease in the sizes of ciliary tip expansions. Taken together, our data illustrate the importance of amino acid motifs and domains within PCARE in fulfilling its physiological function.
Affiliation(s)
- Tess A V Afanasyeva
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
  - Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Yan-Ting Schnellbach
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Toby J Gibson
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Ronald Roepman
  - Department of Human Genetics, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
  - Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Rob W J Collin
  - To whom correspondence should be addressed at: Donders Institute for Brain, Cognition and Behaviour, Radboud university medical center, Geert Grooteplein 10, 6525 GA Nijmegen, The Netherlands. Tel: +31 243613750; Fax: +31 243668752;
|
14
|
Abstract
INTRODUCTION The intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences and constructs and disseminates databases of these predictions. Over 40 years of research have resulted in the release of numerous resources. AREAS COVERED We identify and briefly summarize the most comprehensive collection to date of over 100 disorder predictors. We focus on their predictive models, availability, and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods shifted to meta-predictors in the early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in the foreseeable future given the recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
|
15
|
Perera BLA, Colina CM. Cluster formation of initiators as a tool to impose conformational stability to unstructured regions of a protein. Mol Phys 2021. [DOI: 10.1080/00268976.2021.1963000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- B. Lakshitha A. Perera
  - Department of Chemistry, University of Florida, Gainesville, FL, USA
  - George and Josephine Butler Polymer Research Laboratory, University of Florida, Gainesville, FL, USA
  - Center for Macromolecular Science and Engineering, University of Florida, Gainesville, FL, USA
- Coray M. Colina
  - Department of Chemistry, University of Florida, Gainesville, FL, USA
  - George and Josephine Butler Polymer Research Laboratory, University of Florida, Gainesville, FL, USA
  - Center for Macromolecular Science and Engineering, University of Florida, Gainesville, FL, USA
  - Department of Material Science and Engineering, University of Florida, Gainesville, FL, USA
|
16
|
Bondos SE, Dunker AK, Uversky VN. On the roles of intrinsically disordered proteins and regions in cell communication and signaling. Cell Commun Signal 2021; 19:88. [PMID: 34461937 PMCID: PMC8404256 DOI: 10.1186/s12964-021-00774-3] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 12/20/2022] Open
Abstract
For proteins, the sequence → structure → function paradigm applies primarily to enzymes, transmembrane proteins, and signaling domains. This paradigm is not universal, but rather, in addition to structured proteins, intrinsically disordered proteins and regions (IDPs and IDRs) also carry out crucial biological functions. For these proteins, the sequence → IDP/IDR ensemble → function paradigm applies primarily to signaling and regulatory proteins and regions. Often, in order to carry out function, IDPs or IDRs cooperatively interact, either intra- or inter-molecularly, with structured proteins or other IDPs or intermolecularly with nucleic acids. In this IDP/IDR thematic collection published in Cell Communication and Signaling, thirteen articles are presented that describe IDP/IDR signaling molecules from a variety of organisms from humans to fruit flies and tardigrades ("water bears") and that describe how these proteins and regions contribute to the function and regulation of cell signaling. Collectively, these papers exhibit the diverse roles of disorder in responding to a wide range of signals so as to orchestrate an array of organismal processes. They also show that disorder contributes to signaling in a broad spectrum of species, ranging from micro-organisms to plants and animals.
Affiliation(s)
- Sarah E Bondos
  - Department of Molecular and Cellular Medicine, Texas A&M Health Science Center, College Station, TX, 77843, USA.
- A Keith Dunker
  - Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, 46202, USA.
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Russia.
|
17
|
Hu G, Katuwawala A, Wang K, Wu Z, Ghadermarzi S, Gao J, Kurgan L. flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions. Nat Commun 2021; 12:4438. [PMID: 34290238 PMCID: PMC8295265 DOI: 10.1038/s41467-021-24773-7] [Citation(s) in RCA: 140] [Impact Index Per Article: 46.7] [Received: 04/07/2021] [Accepted: 07/06/2021] [Indexed: 01/05/2023] Open
Abstract
Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn's webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/.
Affiliation(s)
- Gang Hu
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Kui Wang
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Zhonghua Wu
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Jianzhao Gao
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
|
18
|
Ghadermarzi S, Krawczyk B, Song J, Kurgan L. XRRpred: Accurate Predictor of Crystal Structure Quality from Protein Sequence. Bioinformatics 2021; 37:4366-4374. [PMID: 34247234 DOI: 10.1093/bioinformatics/btab509] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/26/2021] [Revised: 06/10/2021] [Accepted: 07/06/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION X-ray crystallography was used to produce nearly 90% of protein structures. These efforts were supported by numerous sequence-based tools that accurately predict crystallizable proteins. However, protein structures vary widely in their quality, typically measured with resolution and R-free. This impacts the ability to use these structures for some applications, including rational drug design and molecular docking, and motivates development of methods that accurately predict structure quality. RESULTS We introduce XRRpred, the first predictor of the resolution and R-free values from protein sequences. XRRpred relies on original sequence profiles, hand-crafted features, empirically selected and parametrized regressors, and modern resampling techniques. Using an independent test dataset, we show that XRRpred provides accurate predictions of resolution and R-free. We demonstrate that XRRpred's predictions correctly model the relationship between the resolution and R-free and reproduce structure quality relations between structural classes of proteins. We also show that XRRpred significantly outperforms indirect alternative ways to predict the structure quality that include predictors of crystallization propensity and an alignment-based approach. XRRpred is available as a convenient webserver that allows batch predictions and offers informative visualization of the results. AVAILABILITY http://biomine.cs.vcu.edu/servers/XRRPred/.
Affiliation(s)
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Bartosz Krawczyk
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
|
19
|
The Anti-Inflammatory Protein TNIP1 Is Intrinsically Disordered with Structural Flexibility Contributed by Its AHD1-UBAN Domain. Biomolecules 2020; 10:1531. [PMID: 33182596 PMCID: PMC7697625 DOI: 10.3390/biom10111531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/06/2020] [Revised: 11/04/2020] [Accepted: 11/05/2020] [Indexed: 01/02/2023] Open
Abstract
TNFAIP3 interacting protein 1 (TNIP1) interacts with numerous non-related cellular, viral, and bacterial proteins. TNIP1 is also linked with multiple chronic inflammatory disorders on the gene and protein levels, through numerous single-nucleotide polymorphisms and reduced protein amounts. Despite the importance of TNIP1 function, there is limited investigation as to how its conformation may impact its apparent multiple roles. Hub proteins like TNIP1 are often intrinsically disordered proteins. Our initial in silico assessments suggested TNIP1 is natively unstructured, featuring numerous potential intrinsically disordered regions, including the ABIN homology domain 1-ubiquitin binding domain in ABIN proteins and NEMO (AHD1-UBAN) domain associated with its anti-inflammatory function. Using multiple biophysical approaches, we demonstrate the structural flexibility of full-length TNIP1 and the AHD1-UBAN domain. We present evidence that the AHD1-UBAN domain exists primarily as a pre-molten globule with limited secondary structure in solution. Data presented here suggest the previously described coiled-coil conformation of the crystallized UBAN-only region may represent just one of possibly multiple states for the AHD1-UBAN domain in solution. These data also characterize the AHD1-UBAN domain in solution as mostly monomeric with potential to undergo oligomerization under specific environmental conditions (e.g., binding partner availability, pH-dependence). This proposed intrinsic disorder across TNIP1 and within the AHD1-UBAN region is likely to impact TNIP1 function and interaction with its multiple partners.
|
20
|
Dark Proteome Database: Studies on Disorder. High Throughput 2020; 9:15. [PMID: 32629790 PMCID: PMC7563470 DOI: 10.3390/ht9030015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/27/2020] [Revised: 06/17/2020] [Accepted: 06/18/2020] [Indexed: 12/17/2022] Open
Abstract
There is a misconception that intrinsic disorder in proteins is equivalent to darkness. The present study aims to establish, in the scope of the Swiss-Prot and Dark Proteome databases, the relationship between disorder and darkness. Three distinct predictors were used to calculate the disorder of Swiss-Prot proteins. The analysis of the results obtained with the used predictors and visualization paradigms resulted in the same conclusion that was reached before: disorder is mostly unrelated to darkness.
|
21
|
Monzon AM, Necci M, Quaglia F, Walsh I, Zanotti G, Piovesan D, Tosatto SCE. Experimentally Determined Long Intrinsically Disordered Protein Regions Are Now Abundant in the Protein Data Bank. Int J Mol Sci 2020; 21:4496. [PMID: 32599863 PMCID: PMC7349999 DOI: 10.3390/ijms21124496] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/25/2020] [Revised: 06/18/2020] [Accepted: 06/19/2020] [Indexed: 01/12/2023] Open
Abstract
Intrinsically disordered protein regions are commonly defined from missing electron density in X-ray structures. Experimental evidence for long disorder regions (LDRs) of at least 30 residues was so far limited to manually curated proteins. Here, we describe a comprehensive and large-scale analysis of experimental LDRs for 3133 unique proteins, demonstrating an increasing coverage of intrinsic disorder in the Protein Data Bank (PDB) in the last decade. The results suggest that long missing residue regions are a good quality source to annotate intrinsically disordered regions and perform functional analysis in large data sets. The consensus approach used to define LDRs allows evaluation of context-dependent disorder and provides a common definition at the protein level.
Affiliation(s)
- Alexander Miguel Monzon
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Marco Necci
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Federica Quaglia
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Ian Walsh
  - Bioprocessing Technology Institute, A*STAR, Singapore 138668, Singapore
- Giuseppe Zanotti
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
  - Correspondence: (D.P.); (S.C.E.T.)
- Silvio C. E. Tosatto
  - Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
  - Correspondence: (D.P.); (S.C.E.T.)
|
22
|
Yan J, Cheng J, Kurgan L, Uversky VN. Structural and functional analysis of "non-smelly" proteins. Cell Mol Life Sci 2020; 77:2423-2440. [PMID: 31486849 PMCID: PMC11105052 DOI: 10.1007/s00018-019-03292-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/29/2019] [Revised: 08/21/2019] [Accepted: 08/28/2019] [Indexed: 01/09/2023]
Abstract
Cysteine and aromatic residues are major structure-promoting residues. We assessed the abundance, structural coverage, and functional characteristics of the "non-smelly" proteins, i.e., proteins that do not contain cysteine residues (C-depleted) or cysteine and aromatic residues (CFYWH-depleted), across 817 proteomes from all domains of life. The analysis revealed that although these proteomes contained significant levels of the C-depleted proteins, with prokaryotes being significantly more enriched in such proteins than eukaryotes, the CFYWH-depleted proteins were relatively rare, accounting for about 0.05% of proteomes. Furthermore, CFYWH-depleted proteins were virtually never found in PDB. Depletion in cysteine and in aromatic residues was associated with the substantially increased intrinsic disorder levels across all domains of life. Archaeal and eukaryotic organisms with higher levels of the C-depleted proteins were shown to have higher levels of the intrinsic disorder and lower levels of structural coverage. We also showed that the "non-smelly" proteins typically did not independently fold into monomeric structures, and instead, they fold by interacting with nucleic acids as constituents of the ribosome and nucleosome complexes. They were shown to be involved in translation, transcription, nucleosome assembly, transmembrane transport, and protein folding functions, all of which are known to be associated with the intrinsic disorder. Our data suggested that, in general, structure of monomeric proteins is crucially dependent on the presence of cysteine and aromatic residues.
Affiliation(s)
- Jing Yan
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA.
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd., MDC07, Tampa, FL, 33612, USA.
  - Protein Research Group, Institute for Biological Instrumentation of the Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia.
|
23
|
Hu G, Wu Z, Oldfield CJ, Wang C, Kurgan L. Quality assessment for the putative intrinsic disorder in proteins. Bioinformatics 2020; 35:1692-1700. [PMID: 30329008 DOI: 10.1093/bioinformatics/bty881] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/07/2018] [Revised: 09/19/2018] [Accepted: 10/15/2018] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION While putative intrinsic disorder is widely used, none of the predictors provides quality assessment (QA) scores. QA scores estimate the likelihood that predictions are correct at a residue level and have been applied in other bioinformatics areas. We recently reported that QA scores derived from putative disorder propensities perform relatively poorly for native disordered residues. Here we design and validate a general approach to construct QA predictors for disorder predictions. RESULTS The QUARTER (QUality Assessment for pRotein inTrinsic disordEr pRedictions) toolbox of methods accommodates a diverse set of ten disorder predictors. It builds upon several innovative design elements including use and scaling of selected physicochemical properties of the input sequence, post-processing of disorder propensity scores, and a feature selection that optimizes the predictive models to a specific disorder predictor. We empirically establish that each one of these elements contributes to the overall predictive performance of our tool and that QUARTER's outputs significantly outperform QA scores derived from the outputs generated by the disorder predictors. The best performing QA scores for a single disorder predictor identify 13% of residues that are predicted with 98% precision. QA scores computed by combining results of the ten disorder predictors cover 40% of residues with 95% precision. Case studies are used to show how to interpret the QA scores. QA scores based on the high precision combined predictions are applied to analyze disorder in the human proteome. AVAILABILITY AND IMPLEMENTATION http://biomine.cs.vcu.edu/servers/QUARTER/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gang Hu
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
- Zhonghua Wu
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
- Chen Wang
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
24
|
Novel molecular aspects of the CRISPR backbone protein ‘Cas7’ from cyanobacteria. Biochem J 2020; 477:971-983. [DOI: 10.1042/bcj20200026] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/13/2020] [Revised: 02/11/2020] [Accepted: 02/13/2020] [Indexed: 01/16/2023]
Abstract
The cyanobacterium Anabaena PCC 7120 shows the presence of a Type I-D CRISPR system that can potentially confer adaptive immunity. The Cas7 protein (Alr1562), which forms the backbone of the type I-D surveillance complex, was characterized from Anabaena. Alr1562 showed the presence of the non-canonical RNA recognition motif and two intrinsically disordered regions (IDRs). When overexpressed in E. coli, the Alr1562 protein was soluble and could be purified by affinity chromatography; however, deletion of IDRs rendered Alr1562 completely insoluble. The purified Alr1562 was present in the dimeric or an RNA-associated higher oligomeric form, which appeared as spiral structures under the electron microscope. With RNaseA and NaCl treatment, the higher oligomeric form converted to the lower oligomeric form, indicating that oligomerization occurred due to the association of Alr1562 with RNA. The secondary structure of both these forms was largely similar, resembling that of a partially folded protein. The dimeric Alr1562 was more prone to temperature-dependent aggregation than the higher oligomeric form. In vitro, Alr1562 bound more specifically to a minimal CRISPR unit than to non-specific RNA. Residues required for binding of Alr1562 to RNA, identified by protein modeling-based approaches, were mutated for functional validation. Interestingly, these mutant proteins, showing reduced ability to bind RNA, were predominantly present in dimeric form. Alr1562 was detected with specific antiserum in Anabaena, suggesting that the type I-D system is expressed and may be functional in vivo. This is the first report that describes the characterization of a Cas protein from any photosynthetic organism.
|
25
|
Seckfort D, Lynch GC, Pettitt BM. The lac repressor hinge helix in context: The effect of the DNA binding domain and symmetry. Biochim Biophys Acta Gen Subj 2020; 1864:129538. [PMID: 31958546 DOI: 10.1016/j.bbagen.2020.129538] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/14/2019] [Revised: 01/13/2020] [Accepted: 01/14/2020] [Indexed: 01/04/2023]
Abstract
The Lac system of genes has been an important model system in understanding gene regulation. When the dimer lac repressor protein binds to the correct DNA sequence, the hinge region of the protein goes through a disorder to order transition. The hinge region is disordered when binding to nonoperator sequences. This region of the protein must pay a conformational entropic penalty to order when it is bound to operator DNA. Structural studies show that this region is flexible. Previous simulations showed that this region is disordered when free in solution without the DNA binding domain present. Our simulations corroborate that this region is extremely flexible in solution, but we find that the presence of the DNA binding domain proximal to the hinge helix and salt make the ordered conformation more favorable even without DNA present.
Affiliation(s)
- Danielle Seckfort
  - Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston 77030, TX, USA; Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston 77555, TX, USA
- Gillian C Lynch
  - Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston 77555, TX, USA
- B Montgomery Pettitt
  - Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston 77030, TX, USA; Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston 77555, TX, USA.
|
26
|
Vincent M, Uversky VN, Schnell S. On the Need to Develop Guidelines for Characterizing and Reporting Intrinsic Disorder in Proteins. Proteomics 2019; 19:e1800415. [PMID: 30793871 PMCID: PMC6571172 DOI: 10.1002/pmic.201800415] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/27/2018] [Revised: 02/05/2019] [Indexed: 01/02/2023]
Abstract
Since the early 2000s, numerous computational tools have been created and used to predict intrinsic disorder in proteins. At present, the output from these algorithms is difficult to interpret in the absence of standards or references for comparison. There are many reasons to establish a set of standard-based guidelines to evaluate computational protein disorder predictions. This viewpoint explores a handful of these reasons, including standardizing nomenclature to improve communication, rigor and reproducibility, and making it easier for newcomers to enter the field. An approach for reporting predicted disorder in single proteins with respect to whole proteomes is discussed. The suggestions are not intended to be formulaic; they should be viewed as a starting point to establish guidelines for interpreting and reporting computational protein disorder predictions.
Affiliation(s)
- Michael Vincent
  - Interdisciplinary Biological Sciences, Northwestern University, Evanston, Illinois 60208, USA
- Vladimir N. Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA
  - Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino 142290, Moscow region, Russia
- Santiago Schnell
  - Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine & Bioinformatics, University of Michigan Medical School, Michigan 48109, USA
|
27
|
Redwan EM, AlJaddawi AA, Uversky VN. Structural disorder in the proteome and interactome of Alkhurma virus (ALKV). Cell Mol Life Sci 2019; 76:577-608. [PMID: 30443749 PMCID: PMC7079808 DOI: 10.1007/s00018-018-2968-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/20/2018] [Revised: 10/30/2018] [Accepted: 11/05/2018] [Indexed: 12/13/2022]
Abstract
Infection by the Alkhurma virus (ALKV) leading to the Alkhurma hemorrhagic fever is a common threat in Saudi Arabia, with no efficient treatment or prevention available as of yet. Although rational drug design traditionally uses information on known 3D structures of viral proteins, intrinsically disordered proteins (i.e., functional proteins that do not possess unique 3D structures), with their multitude of disorder-dependent functions, are crucial for the biology of viruses. Here, viruses utilize disordered regions in their invasion of the host organisms and in hijacking and repurposing of different host systems. Furthermore, the ability of viruses to efficiently adjust and accommodate to their hostile habitats is also intrinsic disorder-dependent. However, little is currently known on the level of penetrance and functional utilization of intrinsic disorder in the ALKV proteome. To fill this gap, we used multiple computational tools to evaluate the abundance of intrinsic disorder in the ALKV genome polyprotein. We also analyzed the peculiarities of intrinsic disorder predisposition of the individual viral proteins, as well as human proteins known to be engaged in interaction with the ALKV proteins. Special attention was paid to finding a correlation between protein functionality and structural disorder. To the best of our knowledge, this work represents the first systematic study of the intrinsic disorder status of the ALKV proteome and interactome.
Affiliation(s)
- Elrashdy M Redwan
  - Department of Biological Sciences, Faculty of Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah, Saudi Arabia.
- Abdullah A AlJaddawi
  - Department of Biological Sciences, Faculty of Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah, Saudi Arabia
- Vladimir N Uversky
  - Department of Biological Sciences, Faculty of Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah, Saudi Arabia.
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
  - Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, 142290, Moscow Region, Russia.
|
28
|
Seckfort D, Montgomery Pettitt B. Price of disorder in the lac repressor hinge helix. Biopolymers 2019; 110:e23239. [PMID: 30485404 PMCID: PMC6335174 DOI: 10.1002/bip.23239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/11/2018] [Revised: 09/12/2018] [Accepted: 10/04/2018] [Indexed: 12/26/2022]
Abstract
The Lac system of genes has been pivotal in understanding gene regulation. When the lac repressor protein binds to the correct DNA sequence, the hinge region of the protein goes through a disorder to order transition. The structure of this region of the protein is well understood when it is in this bound conformation, but less so when it is not. Structural studies show that this region is flexible. Our simulations show this region is extremely flexible in solution; however, a high concentration of salt can help kinetically trap the hinge helix. Thermodynamically, disorder is more favorable without the DNA present.
Affiliation(s)
- Danielle Seckfort
- Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas
| | - B Montgomery Pettitt
- Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology, University of Texas Medical Branch, Galveston, Texas
|
29
|
Hu G, Wang K, Song J, Uversky VN, Kurgan L. Taxonomic Landscape of the Dark Proteomes: Whole-Proteome Scale Interplay Between Structural Darkness, Intrinsic Disorder, and Crystallization Propensity. Proteomics 2018; 18:e1800243. [PMID: 30198635 DOI: 10.1002/pmic.201800243] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/04/2018] [Revised: 08/30/2018] [Indexed: 12/14/2022]
Abstract
Growth rate of the protein sequence universe dramatically exceeds the speed of expansion for the protein structure universe, generating an immense dark proteome that includes proteins with unknown structure. A whole-proteome scale analysis of 5.4 million proteins from 987 proteomes in the three domains of life and viruses to systematically dissect an interplay between structural coverage, degree of putative intrinsic disorder, and predicted propensity for structure determination is performed. It has been found that Archaean and Bacterial proteomes have relatively high structural coverage and low amounts of disorder, whereas Eukaryotic and Viral proteomes are characterized by a broad spread of structural coverage and higher disorder levels. The analysis reveals that dark proteomes (i.e., proteomes containing high fractions of proteins with unknown structure) have significantly elevated amounts of intrinsic disorder and are predicted to be difficult to solve structurally. Although the majority of dark proteomes are of viral origin, many dark viral proteomes have at least modest crystallization propensity and only a handful of them are enriched in the intrinsic disorder. The disorder, structural coverage, and propensity for structural determination are mapped onto a novel proteome-level sequence similarity network to analyze the interplay of these characteristics in the taxonomic landscape.
Affiliation(s)
- Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, P. R. China
| | - Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, P. R. China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, 33612, USA; Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, 142290, Russia
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
30
|
Kamalesh D, Sudandiradoss C. Comparative analysis of human and mouse transcriptional cofactors (TcoFs) with special emphasis on intrinsically disordered regions and their associated regulating post‐translational modifications. J Cell Biochem 2018; 119:8531-8546. [DOI: 10.1002/jcb.27083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/09/2017] [Accepted: 04/26/2018] [Indexed: 12/14/2022]
Affiliation(s)
- Kamalesh D
- Department of Integrative Biology, School of Bioscience and Technology, VIT University, Vellore, India
| | - Sudandiradoss C
- Department of Biotechnology, School of Bioscience and Technology, VIT University, Vellore, India
|
31
|
Meng F, Murray GF, Kurgan L, Donahue HJ. Functional and structural characterization of osteocytic MLO-Y4 cell proteins encoded by genes differentially expressed in response to mechanical signals in vitro. Sci Rep 2018; 8:6716. [PMID: 29712973 PMCID: PMC5928037 DOI: 10.1038/s41598-018-25113-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/15/2017] [Accepted: 04/09/2018] [Indexed: 12/29/2022] Open
Abstract
The anabolic response of bone to mechanical load is partially the result of osteocyte response to fluid flow-induced shear stress. Understanding signaling pathways activated in osteocytes exposed to fluid flow could identify novel signaling pathways involved in the response of bone to mechanical load. Bioinformatics allows for a unique perspective and provides key first steps in understanding these signaling pathways. We examined proteins encoded by genes differentially expressed in response to fluid flow in murine osteocytic MLO-Y4 cells. We considered structural and functional characteristics including putative intrinsic disorder, evolutionary conservation, interconnectedness in protein-protein interaction networks, and cellular localization. Our analysis suggests that proteins encoded by fluid flow activated genes have lower than expected conservation, are depleted in intrinsic disorder, maintain typical levels of connectivity for the murine proteome, and are found in the cytoplasm and extracellular space. Pathway analyses reveal that these proteins are associated with cellular response to stress, chemokine and cytokine activity, enzyme binding, and osteoclast differentiation. The lower than expected disorder of proteins encoded by flow activated genes suggests they are relatively specialized.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - Graeme F Murray
- Bone Engineering, Science and Technology (BEST) Laboratory, Department of Biomedical Engineering, Virginia Commonwealth University, Richmond, Virginia, United States of America
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, United States of America.
| | - Henry J Donahue
- Bone Engineering, Science and Technology (BEST) Laboratory, Department of Biomedical Engineering, Virginia Commonwealth University, Richmond, Virginia, United States of America.
|
32
|
Meng F, Wang C, Kurgan L. fDETECT webserver: fast predictor of propensity for protein production, purification, and crystallization. BMC Bioinformatics 2018; 18:580. [PMID: 29295714 PMCID: PMC6389161 DOI: 10.1186/s12859-017-1995-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/19/2017] [Accepted: 12/06/2017] [Indexed: 02/26/2023] Open
Abstract
Background: Development of predictors of propensity of protein sequences for successful crystallization has been actively pursued for over a decade. A few novel methods that expanded the scope of these predictions to address additional steps of protein production and structure determination pipelines were released in recent years. The predictive performance of the current methods is modest. This is because the only input that they use is the protein sequence and since the experimental annotations of these data might be inconsistent given that they were collected across many laboratories and centers. However, even these modest levels of predictive quality are still practical compared to the reported low success rates of crystallization, which are below 10%. We focus on another important aspect related to a high computational cost of running the predictors that offer the expanded scope. Results: We introduce a novel fDETECT webserver that provides very fast and modestly accurate predictions of the success of protein production, purification, crystallization, and structure determination. Empirical tests on two datasets demonstrate that fDETECT is more accurate than the only other similarly fast method, and similarly accurate and three orders of magnitude faster than the currently most accurate predictors. Our method predicts a single protein in about 120 milliseconds and needs less than an hour to generate the four predictions for an entire human proteome. Moreover, we empirically show that fDETECT secures similar levels of predictive performance when compared with four representative methods that only predict success of crystallization, while it also provides the other three predictions. A webserver that implements fDETECT is available at http://biomine.cs.vcu.edu/servers/fDETECT/. Conclusions: fDETECT is a computational tool that supports target selection for protein production and X-ray crystallography-based structure determination. It offers predictive quality that matches or exceeds other state-of-the-art tools and is especially suitable for the analysis of large protein sets.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
| | - Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
|
33
|
Gao J, Wu Z, Hu G, Wang K, Song J, Joachimiak A, Kurgan L. Survey of Predictors of Propensity for Protein Production and Crystallization with Application to Predict Resolution of Crystal Structures. Curr Protein Pept Sci 2018; 19:200-210. [PMID: 28933304 PMCID: PMC7001581 DOI: 10.2174/1389203718666170921114437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/23/2017] [Revised: 09/14/2017] [Accepted: 09/14/2017] [Indexed: 11/22/2022]
Abstract
Selection of proper targets for the X-ray crystallography will benefit biological research community immensely. Several computational models were proposed to predict propensity of successful protein production and diffraction quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with the low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.
Affiliation(s)
- Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Jiangning Song
- Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, Australia
| | - Andrzej Joachimiak
- Midwest Center for Structural Genomics, Argonne, USA
- Structural Biology Center, Biosciences, Argonne National Laboratory, Argonne, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA
|
34
|
Dosztányi Z. Prediction of protein disorder based on IUPred. Protein Sci 2017; 27:331-340. [PMID: 29076577 DOI: 10.1002/pro.3334] [Citation(s) in RCA: 119] [Impact Index Per Article: 17.0] [Received: 08/25/2017] [Revised: 10/25/2017] [Accepted: 10/25/2017] [Indexed: 12/19/2022]
Abstract
Many proteins contain intrinsically disordered regions (IDRs), functional polypeptide segments that in isolation adopt a highly flexible conformational ensemble instead of a single, well-defined structure. Disorder prediction methods, which can discriminate ordered and disordered regions from the amino acid sequence, have contributed significantly to our current understanding of the distinct properties of intrinsically disordered proteins by enabling the characterization of individual examples as well as large-scale analyses of these protein regions. One popular method, IUPred, provides a robust prediction of protein disorder based on an energy estimation approach that captures the fundamental difference between the biophysical properties of ordered and disordered regions. This paper reviews the energy estimation method underlying IUPred and the basic properties of the web server. Through an example, it also illustrates how the prediction output can be interpreted in a more complex case by taking into account the heterogeneous nature of IDRs. Various applications that benefited from IUPred to provide improved disorder predictions, complementing domain annotations and aiding the identification of functional short linear motifs, are also described here. IUPred is freely available for noncommercial users through the web server (http://iupred.enzim.hu and http://iupred.elte.hu). The program can also be downloaded and installed locally for large-scale analyses.
Affiliation(s)
- Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
|
35
|
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight newest methods and predictors that offer strong predictive performance measured based on recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to the end users, via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict many not yet addressed functions of intrinsic disorder remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, USA.
|
36
|
Rayasam A, Hsu M, Hernández G, Kijak J, Lindstedt A, Gerhart C, Sandor M, Fabry Z. Contrasting roles of immune cells in tissue injury and repair in stroke: The dark and bright side of immunity in the brain. Neurochem Int 2017; 107:104-116. [PMID: 28245997 DOI: 10.1016/j.neuint.2017.02.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/11/2016] [Revised: 02/14/2017] [Accepted: 02/16/2017] [Indexed: 01/09/2023]
Abstract
Despite considerable efforts in research and clinical studies, stroke is still one of the leading causes of death and disability worldwide. Originally, stroke was considered a vascular thrombotic disease without significant immune involvement. However, over the last few decades it has become increasingly obvious that the immune responses can significantly contribute to both tissue injury and protection following stroke. Recently, much research has been focused on the immune system's role in stroke pathology and trying to elucidate the mechanism used by immune cells in tissue injury and protection. Since the discovery of tissue plasminogen activator therapy in 1996, there have been no new treatments for stroke. For this reason, research into understanding how the immune system contributes to stroke pathology may lead to better therapies or enhance the efficacy of current treatments. Here, we discuss the contrasting roles of immune cells to stroke pathology while emphasizing myeloid cells and T cells. We propose that focusing future research on balancing the beneficial-versus-detrimental roles of immunity may lead to the discovery of better and novel stroke therapies.
Affiliation(s)
- Aditya Rayasam
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA; Neuroscience Training Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Martin Hsu
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA; Neuroscience Training Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Gianna Hernández
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA; Cellular and Molecular Pathology Graduate Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Julie Kijak
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA
| | - Anders Lindstedt
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA
| | - Christian Gerhart
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA
| | - Matyas Sandor
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA; Cellular and Molecular Pathology Graduate Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Zsuzsanna Fabry
- Department of Pathology and Laboratory Medicine, University of Wisconsin-Madison, Madison, WI, USA; Cellular and Molecular Pathology Graduate Program, University of Wisconsin-Madison, Madison, WI, USA.
|
37
|
Kamalesh D, Ramireddy S, Raguraman P, Sudandiradoss C. Expediting dynamics approach to understand the influence of 14-3-3ζ causing metastatic cancer through the interaction of YAP1 and β-TRCP. Molecular BioSystems 2017; 13:1981-1992. [DOI: 10.1039/c7mb00271h] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/08/2023]
Abstract
The 14-3-3ζ protein acts as a molecular switch in regulating the TGF-β pathway, which alters from a tumor suppressor in the early stage of breast cancer to a promoter of metastasis in the late stage.
Affiliation(s)
- Kamalesh D.
- Department of Integrative Biology, School of Biosciences and Technology, VIT University, Vellore, India
| | - Sriroopreddy Ramireddy
- Department of Biotechnology, School of Biosciences and Technology, VIT University, Vellore, India
| | - Raguraman P.
- Department of Biotechnology, School of Biosciences and Technology, VIT University, Vellore, India
| | - Sudandiradoss C.
- Department of Biotechnology, School of Biosciences and Technology, VIT University, Vellore, India
|
38
|
Wu Z, Hu G, Wang K, Kurgan L. Exploratory Analysis of Quality Assessment of Putative Intrinsic Disorder in Proteins. Artificial Intelligence and Soft Computing 2017. [DOI: 10.1007/978-3-319-59063-9_65] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022]
|
39
|
Alowolodu O, Johnson G, Alashwal L, Addou I, Zhdanova IV, Uversky VN. Intrinsic disorder in spondins and some of their interacting partners. Intrinsically Disordered Proteins 2016; 4:e1255295. [PMID: 28232900 DOI: 10.1080/21690707.2016.1255295] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/10/2016] [Revised: 10/22/2016] [Accepted: 10/27/2016] [Indexed: 12/28/2022]
Abstract
Spondins, which are proteins that inhibit and promote adherence of embryonic cells so as to aid axonal growth, are part of the thrombospondin-1 family. Spondins function in several important biological processes, such as apoptosis and angiogenesis. Spondins constitute a thrombospondin subfamily that includes F-spondin, a protein that interacts with Aβ precursor protein and inhibits its proteolytic processing; R-spondin, a 4-membered group of proteins that regulate the Wnt pathway and have other functions, such as regulation of kidney proliferation, induction of epithelial proliferation, and tumor suppression; M-spondin, which mediates mechanical linkage between the muscles and apodemes; and SCO-spondin, a protein important for neuronal development. In this study, we investigated the intrinsic disorder status of human spondins and their interacting partners, such as members of the LRP family, LGR family, Frizzled family, and several other binding partners. To establish the existence and importance of disordered regions in these proteins, we conducted a detailed analysis of their sequences, identified disordered regions, and sought to establish a correlation between their structure and biological functions.
Affiliation(s)
- Oluwole Alowolodu
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Gbemisola Johnson
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Lamis Alashwal
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Iqbal Addou
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Irina V Zhdanova
- Department of Anatomy & Neurobiology, Boston University School of Medicine, Boston, MA, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; USF Health Byrd Alzheimer Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
|
40
|
DeForte S, Uversky VN. Resolving the ambiguity: Making sense of intrinsic disorder when PDB structures disagree. Protein Sci 2016; 25:676-88. [PMID: 26683124 DOI: 10.1002/pro.2864] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 10/30/2015] [Revised: 12/14/2015] [Accepted: 12/15/2015] [Indexed: 12/25/2022]
Abstract
Missing regions in X-ray crystal structures in the Protein Data Bank (PDB) have played a foundational role in the study of intrinsically disordered protein regions (IDPRs), especially in the development of in silico predictors of intrinsic disorder. However, a missing region is only a weak indication of intrinsic disorder, and this uncertainty is compounded by the presence of ambiguous regions, where more than one structure of the same protein sequence "disagrees" in terms of the presence or absence of missing residues. The question is this: are these ambiguous regions intrinsically disordered, or are they the result of static disorder that arises from experimental conditions, ensembles of structures, or domain wobbling? A novel way of looking at ambiguous regions in terms of the pattern between multiple PDB structures has been demonstrated. It was found that the propensity for intrinsic disorder increases as the level of ambiguity decreases. However, it is also shown that ambiguity is more likely to occur as the protein region is placed within different environmental conditions, and even the most ambiguous regions as a set display compositional bias that suggests flexibility. The results suggested that ambiguity is a natural result for many IDPRs crystallized under different conditions and that static disorder and wobbling domains are relatively rare. Instead, it is more likely that ambiguity arises because many of these regions were conditionally or partially disordered.
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, 33612; Department of Biological Science, Faculty of Science, King Abdulaziz University, PO Box 80203, Jeddah, Jeddah 21589, Saudi Arabia; Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russian Federation; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation
|
41
|
Abstract
We surveyed the "dark" proteome, that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44-54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology.
|
42
|
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) do not adopt a well-defined folded structure under physiological conditions. Instead, these proteins exist as heterogeneous and dynamical conformational ensembles. IDPs are widespread in eukaryotic proteomes and are involved in fundamental biological processes, mostly related to regulation and signaling. At the same time, disordered regions often pose significant challenges to the structure determination process, which generally requires highly homogeneous proteins samples. In this book chapter, we provide a brief overview of protein disorder, describe various bioinformatics resources that have been developed in recent years for their characterization, and give a general outline of their applications in various types of structural genomics projects. Traditionally, disordered segments were filtered out to optimize the yield of structure determination pipelines. However, it is becoming increasingly clear that the structural characterization of proteins cannot be complete without the incorporation of intrinsically disordered regions.
Affiliation(s)
- Marco Punta
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
43
|
Dunker AK, Oldfield CJ. Back to the Future: Nuclear Magnetic Resonance and Bioinformatics Studies on Intrinsically Disordered Proteins. Advances in Experimental Medicine and Biology 2015; 870:1-34. [PMID: 26387098 DOI: 10.1007/978-3-319-20164-1_1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 01/08/2023]
Abstract
From the 1970s to the present, regions of missing electron density in protein structures determined by X-ray diffraction and the characterization of the functions of these regions have suggested that not all protein regions depend on prior 3D structure to carry out function. Motivated by these observations, in early 1996 we began to use bioinformatics approaches to study these intrinsically disordered proteins (IDPs) and IDP regions. At just about the same time, several laboratory groups began to study a collection of IDPs and IDP regions using nuclear magnetic resonance. The temporal overlap of the bioinformatics and NMR studies played a significant role in the development of our understanding of IDPs. Here the goal is to recount some of this history and to project from this experience possible directions for future work.
Affiliation(s)
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202, Indianapolis, IN, USA.
- Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 46202, Indianapolis, IN, USA.
44
Kurotani A, Yamada Y, Shinozaki K, Kuroda Y, Sakurai T. Plant-PrAS: a database of physicochemical and structural properties and novel functional regions in plant proteomes. Plant & Cell Physiology 2015; 56:e11. [PMID: 25435546 PMCID: PMC4301743 DOI: 10.1093/pcp/pcu176] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 08/28/2014] [Accepted: 10/31/2014] [Indexed: 05/21/2023]
Abstract
Arabidopsis thaliana is an important model species for studies of plant gene functions. Research on Arabidopsis has resulted in the generation of high-quality genome sequences, annotations and related post-genomic studies. The amount of annotation, such as gene-coding regions and structures, is steadily growing in the field of plant research. In contrast to the genomics resource of animals and microorganisms, there are still some difficulties with characterization of some gene functions in plant genomics studies. The acquisition of information on protein structure can help elucidate the corresponding gene function because proteins encoded in the genome possess highly specific structures and functions. In this study, we calculated multiple physicochemical and secondary structural parameters of protein sequences, including length, hydrophobicity, the amount of secondary structure, the number of intrinsically disordered regions (IDRs) and the predicted presence of transmembrane helices and signal peptides, using a total of 208,333 protein sequences from the genomes of six representative plant species, Arabidopsis thaliana, Glycine max (soybean), Populus trichocarpa (poplar), Oryza sativa (rice), Physcomitrella patens (moss) and Cyanidioschyzon merolae (alga). Using the PASS tool and the Rosetta Stone method, we annotated the presence of novel functional regions in 1,732 protein sequences that included unannotated sequences from the Arabidopsis and rice proteomes. These results were organized into the Plant Protein Annotation Suite database (Plant-PrAS), which can be freely accessed online at http://plant-pras.riken.jp/.
Affiliation(s)
- Atsushi Kurotani
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045 Japan; Department of Biotechnology and Life Sciences, Faculty of Technology, Tokyo University of Agriculture and Technology, Koganei, Tokyo, 184-8588 Japan
- Yutaka Yamada
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045 Japan
- Kazuo Shinozaki
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045 Japan
- Yutaka Kuroda
- Department of Biotechnology and Life Sciences, Faculty of Technology, Tokyo University of Agriculture and Technology, Koganei, Tokyo, 184-8588 Japan
- Tetsuya Sakurai
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045 Japan
45
A critical evaluation of in silico methods for detection of membrane protein intrinsic disorder. Biophys J 2014; 106:1638-49. [PMID: 24739163 DOI: 10.1016/j.bpj.2014.02.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 08/29/2013] [Revised: 02/03/2014] [Accepted: 02/25/2014] [Indexed: 11/23/2022] Open
Abstract
Intrinsically disordered regions in proteins possess important biological roles including transcriptional regulation, molecular recognition, and provision of sites for posttranslational modification. In three-dimensional crystallization of both soluble and membrane proteins, identification and removal of disordered regions is often necessary for obtaining crystals possessing sufficient long-range order for structure determination. Disordered regions can be identified experimentally, with techniques such as limited proteolysis coupled with mass spectrometry, or computationally, by using disorder prediction programs, of which many are available. Although these programs use various methods to predict disorder from a protein's primary sequence, they all were developed using information derived from soluble protein structures. Therefore, their performance and accuracy when applied to integral membrane proteins remained an open question. We evaluated the performance of 13 disorder prediction programs on a dataset containing 343 membrane proteins, and upon subdatasets containing only α-helical or β-barrel proteins. These programs were ranked using multiple metrics, including metrics specifically created for membrane proteins. Analysis of these data shows a clear distinction between programs that accurately predict disordered regions in membrane proteins and programs which perform poorly, and allows for the robust integration of in silico disorder prediction into our PSI:Biology membrane protein structural genomics pipeline.
46
Mizianty MJ, Fan X, Yan J, Chalmers E, Woloschuk C, Joachimiak A, Kurgan L. Covering complete proteomes with X-ray structures: a current snapshot. Acta Crystallographica. Section D, Biological Crystallography 2014; 70:2781-93. [PMID: 25372670 PMCID: PMC4220968 DOI: 10.1107/s1399004714019427] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 04/23/2014] [Accepted: 08/27/2014] [Indexed: 12/23/2022]
Abstract
Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
Affiliation(s)
- Marcin J. Mizianty
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Xiao Fan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Jing Yan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Eric Chalmers
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Christopher Woloschuk
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Andrzej Joachimiak
- Midwest Center for Structural Genomics, Argonne National Laboratory, Argonne, IL 60439, USA
- Lukasz Kurgan
- Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
47
Dunker AK, Bondos SE, Huang F, Oldfield CJ. Intrinsically disordered proteins and multicellular organisms. Semin Cell Dev Biol 2014; 37:44-55. [PMID: 25307499 DOI: 10.1016/j.semcdb.2014.09.025] [Citation(s) in RCA: 92] [Impact Index Per Article: 9.2] [Received: 08/08/2014] [Revised: 09/15/2014] [Accepted: 09/30/2014] [Indexed: 12/12/2022]
Abstract
Intrinsically disordered proteins (IDPs) and IDP regions lack stable tertiary structure yet carry out numerous biological functions, especially those associated with signaling, transcription regulation, DNA condensation, cell division, and cellular differentiation. Both post-translational modifications (PTMs) and alternative splicing (AS) expand the functional repertoire of IDPs. Here we propose that an "IDP-based developmental toolkit," which is comprised of IDP regions, PTMs, especially multiple PTMs, within these IDP regions, and AS events within segments of pre-mRNA that code for these same IDP regions, allows functional diversification and environmental responsiveness for molecules that direct the development of complex metazoans.
Affiliation(s)
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University Schools of Medicine and Informatics, Indianapolis, IN 46202, United States.
- Sarah E Bondos
- Department of Molecular and Cellular Medicine, Texas A&M Health Science Center, College Station, TX 77843, United States.
- Fei Huang
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University Schools of Medicine and Informatics, Indianapolis, IN 46202, United States.
- Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University Schools of Medicine and Informatics, Indianapolis, IN 46202, United States.
48
Li M, Cho SB, Ryu KH. A novel approach for predicting disordered regions in a protein sequence. Osong Public Health Res Perspect 2014; 5:211-8. [PMID: 25379372 PMCID: PMC4215001 DOI: 10.1016/j.phrp.2014.06.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/19/2014] [Revised: 06/24/2014] [Accepted: 06/24/2014] [Indexed: 12/01/2022] Open
Abstract
OBJECTIVES: A number of published predictors are based on various algorithms and disordered protein sequence properties. Although many predictors have been published, the study of protein disordered region prediction is ongoing because different prediction methods can find different disordered regions in a protein sequence. METHODS: We therefore used a new approach to find the more varying disordered regions for more efficient and accurate prediction of protein structures. In this study, we propose a novel approach called "emerging subsequence (ES) mining" that does not use the characteristics of the disordered protein. We first adapted the approach to generate emerging protein subsequences on public protein sequence data. Second, the disordered and ordered regions in a protein sequence were predicted by searching for the generated emerging protein subsequences with a sliding window, so that matched regions tend to overlap. Third, the scores of the overlapping regions were calculated based on support and growth-rate values in both classes. Finally, the scores of predicted regions in the target class were compared with the scores of the source class, and the class having the higher score was selected. RESULTS: In this experiment, disordered sequence data and ordered sequence data were extracted from DisProt 6.02 and PDB, respectively, and used as training data. The test data come from CASP 9 and CASP 10, where disordered and ordered regions are known. CONCLUSION: Compared with several published predictors, the experimental results show higher accuracy rates than those of other existing methods.
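The four steps in the METHODS summary above can be illustrated with a small sketch. This is not the authors' implementation: the fixed subsequence length `k`, the `min_growth` threshold, the scoring weight `growth / (growth + 1)`, and the function names `kmer_support`, `emerging`, and `predict` are all assumptions made for illustration. The abstract states only that scores are derived from support and growth-rate values in both classes.

```python
from collections import Counter


def kmer_support(seqs, k):
    """Fraction of sequences in `seqs` that contain each length-k subsequence."""
    counts = Counter()
    for s in seqs:
        # A set, so each subsequence is counted at most once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    n = len(seqs)
    return {kmer: c / n for kmer, c in counts.items()}


def emerging(target, source, k, min_growth):
    """Subsequences whose support grows by >= min_growth from source to target.

    Returns a dict mapping each emerging subsequence to an assumed score:
    target support weighted by growth / (growth + 1).
    """
    t_sup = kmer_support(target, k)
    s_sup = kmer_support(source, k)
    scores = {}
    for kmer, sup in t_sup.items():
        base = s_sup.get(kmer, 0.0)
        if base == 0.0:
            weight = 1.0  # absent from the source class: infinite growth rate
        else:
            growth = sup / base
            if growth < min_growth:
                continue
            weight = growth / (growth + 1.0)
        scores[kmer] = sup * weight
    return scores


def predict(seq, dis_es, ord_es, k):
    """Label each residue 'D' (disordered) or 'O' (ordered).

    Every sliding window adds its class scores to all residues it covers,
    so overlapping windows accumulate; the higher total wins per residue.
    Ties are labeled 'O'.
    """
    d = [0.0] * len(seq)
    o = [0.0] * len(seq)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for j in range(i, i + k):
            d[j] += dis_es.get(window, 0.0)
            o[j] += ord_es.get(window, 0.0)
    return "".join("D" if a > b else "O" for a, b in zip(d, o))
```

As a toy usage example, with hypothetical training sets `["SSSS", "SSPS"]` (disordered) and `["LLLL", "LLVL"]` (ordered), `k = 2` and `min_growth = 2.0`, calling `predict("SSSLLL", dis_es, ord_es, 2)` labels the first three residues as disordered and the last three as ordered. Real training data would come from resources such as DisProt and PDB, as the abstract describes.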
Affiliation(s)
- Meijing Li
- Database/Bioinformatics Laboratory, Chungbuk National University, Cheongju, Korea
- Seong Beom Cho
- Division of Bio-Medical Informatics, Center for Genome Science, Korea National Institute of Health, Cheongju, Korea
- Keun Ho Ryu
- Database/Bioinformatics Laboratory, Chungbuk National University, Cheongju, Korea
49
Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life. Cellular and Molecular Life Sciences: CMLS 2014. [PMID: 24939692 DOI: 10.1007/s00018-014-1661-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/25/2023]
Abstract
Recent years have witnessed increased interest in intrinsically disordered proteins and regions. These proteins and regions are abundant and possess unique structural features and a broad functional repertoire that complements ordered proteins. However, modern studies on the abundance and functions of intrinsically disordered proteins and regions are relatively limited in the size and scope of their analysis. To fill this gap, we performed a broad and detailed computational analysis of over 6 million proteins from 59 archaeal, 471 bacterial, 110 eukaryotic and 325 viral proteomes. We used arguably more accurate consensus-based disorder predictions, and for the first time comprehensively characterized intrinsic disorder at proteomic and protein levels from all significant perspectives, including abundance, cellular localization, functional roles, evolution, and impact on structural coverage. We show that intrinsic disorder is more abundant and has a unique profile in eukaryotes. We map disorder into archaeal, bacterial and eukaryotic cells, and demonstrate that it is preferentially located in some cellular compartments. Functional analysis that considers over 1,200 annotations shows that certain functions are exclusively implemented by intrinsically disordered proteins and regions, and that some of them are specific to certain domains of life. We reveal that disordered regions are often targets for various post-translational modifications, but primarily in eukaryotes and viruses. Using a phylogenetic tree for 14 eukaryotic and 112 bacterial species, we analyzed relations between disorder, sequence conservation and evolutionary speed. We provide a complete analysis that clearly shows that intrinsic disorder is exceptionally and uniquely abundant in each domain of life.
50
Abstract
Intrinsically disordered proteins (IDPs) and IDP regions fail to form a stable structure, yet they exhibit biological activities. Their mobile flexibility and structural instability are encoded by their amino acid sequences. They recognize proteins, nucleic acids, and other types of partners; they accelerate interactions and chemical reactions between bound partners; and they help accommodate posttranslational modifications, alternative splicing, protein fusions, and insertions or deletions. Overall, IDP-associated biological activities complement those of structured proteins. Recently, there has been an explosion of studies on IDP regions and their functions, yet the discovery and investigation of these proteins have a long, mostly ignored history. Along with recent discoveries, we present several early examples and the mechanisms by which IDPs contribute to function, which we hope will encourage comprehensive discussion of IDPs and IDP regions in biochemistry textbooks. Finally, we propose future directions for IDP research.
Affiliation(s)
- Christopher J Oldfield
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202