51
|
Improved prediction of residue flexibility by embedding optimized amino acid grouping into RSA-based linear models. Amino Acids 2014; 46:2665-80. [DOI: 10.1007/s00726-014-1817-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/16/2014] [Accepted: 07/21/2014] [Indexed: 11/26/2022]
|
52
|
van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, Fuxreiter M, Gough J, Gsponer J, Jones D, Kim PM, Kriwacki R, Oldfield CJ, Pappu RV, Tompa P, Uversky VN, Wright P, Babu MM. Classification of intrinsically disordered regions and proteins. Chem Rev 2014; 114:6589-631. [PMID: 24773235 PMCID: PMC4095912 DOI: 10.1021/cr400525m] [Citation(s) in RCA: 1440] [Impact Index Per Article: 144.0] [Received: 09/23/2013] [Indexed: 12/11/2022]
Affiliation(s)
- Robin van der Lee
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre, 6500 HB Nijmegen, The Netherlands
| | - Marija Buljan
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Benjamin Lang
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Robert J. Weatheritt
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Gary W. Daughdrill
- Department of Cell Biology, Microbiology, and Molecular Biology, University of South Florida, 3720 Spectrum Boulevard, Suite 321, Tampa, Florida 33612, United States
| | - A. Keith Dunker
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Monika Fuxreiter
- MTA-DE Momentum Laboratory of Protein Dynamics, Department of Biochemistry and Molecular Biology, University of Debrecen, H-4032 Debrecen, Nagyerdei krt 98, Hungary
| | - Julian Gough
- Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, United Kingdom
| | - Joerg Gsponer
- Department of Biochemistry and Molecular Biology, Centre for High-Throughput Biology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - David T. Jones
- Bioinformatics Group, Department of Computer Science, University College London, London, WC1E 6BT, United Kingdom
| | - Philip M. Kim
- Terrence Donnelly Centre for Cellular and Biomolecular Research, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Richard W. Kriwacki
- Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, Tennessee 38105, United States
| | - Christopher J. Oldfield
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Rohit V. Pappu
- Department of Biomedical Engineering and Center for Biological Systems Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, United States
| | - Peter Tompa
- VIB Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, United States
- Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russia
| | - Peter E. Wright
- Department of Integrative Structural and Computational Biology and Skaggs Institute of Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
| | - M. Madan Babu
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
|
53
|
Macossay-Castillo M, Kosol S, Tompa P, Pancsa R. Synonymous constraint elements show a tendency to encode intrinsically disordered protein segments. PLoS Comput Biol 2014; 10:e1003607. [PMID: 24809503 PMCID: PMC4014394 DOI: 10.1371/journal.pcbi.1003607] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 01/14/2014] [Accepted: 03/17/2014] [Indexed: 01/22/2023] Open
Abstract
Synonymous constraint elements (SCEs) are protein-coding genomic regions with very low synonymous mutation rates believed to carry additional, overlapping functions. Thousands of such potentially multi-functional elements were recently discovered by analyzing the levels and patterns of evolutionary conservation in human coding exons. These elements provide a good opportunity to improve our understanding of how the redundant nature of the genetic code is exploited in the cell. Our premise is that the protein segments encoded by such elements might better comply with the increased functional demands if they are structurally less constrained (i.e. intrinsically disordered). To test this idea, we investigated the protein segments encoded by SCEs with computational tools to describe the underlying structural properties. In addition to SCEs, we examined the level of disorder, secondary structure, and sequence complexity of protein regions overlapping with experimentally validated splice regulatory sites. We show that multi-functional gene regions translate into protein segments that are significantly enriched in structural disorder and compositional bias, while they are depleted in secondary structure and domain annotations compared to reference segments of similar lengths. This tendency suggests that relaxed protein structural constraints provide an advantage when accommodating multiple overlapping functions in coding regions.
Affiliation(s)
- Mauricio Macossay-Castillo
- Vlaams Instituut voor Biotechnologie (VIB) Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
| | - Simone Kosol
- Vlaams Instituut voor Biotechnologie (VIB) Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
| | - Peter Tompa
- Vlaams Instituut voor Biotechnologie (VIB) Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
| | - Rita Pancsa
- Vlaams Instituut voor Biotechnologie (VIB) Department of Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium
|
54
|
Pavlović MD, Jandrlić DR, Mitić NS. Epitope distribution in ordered and disordered protein regions. Part B - Ordered regions and disordered binding sites are targets of T- and B-cell immunity. J Immunol Methods 2014; 407:90-107. [DOI: 10.1016/j.jim.2014.03.027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/24/2013] [Revised: 03/31/2014] [Accepted: 03/31/2014] [Indexed: 01/04/2023]
|
55
|
Mitić NS, Pavlović MD, Jandrlić DR. Epitope distribution in ordered and disordered protein regions - part A. T-cell epitope frequency, affinity and hydropathy. J Immunol Methods 2014; 406:83-103. [PMID: 24614036 DOI: 10.1016/j.jim.2014.02.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/24/2013] [Revised: 02/27/2014] [Accepted: 02/27/2014] [Indexed: 02/08/2023]
Abstract
Highly disordered protein regions are prevalently hydrophilic, extremely sensitive to proteolysis in vitro, and are expected to be under-represented as T-cell epitopes. The aim of this research was to find out whether disorder and hydropathy prediction methods could help in understanding epitope processing and presentation. According to the pan-specific T-cell epitope predictors NetMHCpan and NetMHCIIpan and 9 publicly available disorder predictors, frequency of epitopes presented by human leukocyte antigens (HLA) class-I or -II was found to be more than 2.5 times higher in ordered than in disordered protein regions (depending on the disorder predictor). Both HLA class-I and HLA class-II binding epitopes are prevalently hydrophilic in disordered and prevalently hydrophobic in ordered protein regions, whereas epitopes recognized by HLA class-II alleles are more hydrophobic than those recognized by HLA class-I. As regards both classes of HLA molecules, high-affinity binding epitopes display more hydrophobicity than low affinity-binding epitopes (in both ordered and disordered regions). Epitopes belonging to disordered protein regions were not predicted to have poor affinity to HLA class-II molecules, as expected from disorder intrinsic proteolytic instability. The relation of epitope hydrophobicity and order/disorder location was also valid if alleles were grouped according to the HLA class-I and HLA class-II supertypes, except for the class-I supertype A3 in which the main part of recognized epitopes was prevalently hydrophilic. Regarding specific supertypes, the affinity of epitopes belonging to ordered regions varies only slightly (depending on the disorder predictor) compared to the affinity of epitopes in corresponding disordered regions. The distribution of epitopes in ordered and disordered protein regions has revealed that the curves of order-epitope distribution were convex-like while the curves of disorder-epitope distribution were concave-like. 
The percentage of prevalently hydrophobic epitopes increases with increasing epitope promiscuity and on moving from disordered to ordered regions. These data suggest that reverse vaccinology, oriented towards promiscuous and high-affinity epitopes, is also oriented towards prevalently hydrophobic, ordered regions. The analysis of predicted and experimentally evaluated epitopes of the cancer-testis antigen MAGE-A3 has confirmed that the majority of T-cell epitopes, particularly those that are promiscuous or naturally processed, were located in ordered and disorder/order boundary protein regions overlapping hydrophobic regions.
Affiliation(s)
- Nenad S Mitić
- University of Belgrade, Faculty of Mathematics, P.O.B. 550, Studentski trg 16, Belgrade, Serbia.
| | - Mirjana D Pavlović
- University of Belgrade, Institute of General and Physical Chemistry, Studentski trg 12/V, Belgrade, Serbia.
| | - Davorka R Jandrlić
- University of Belgrade, Faculty of Mechanical Engineering, Kraljice Marije 16, Belgrade, Serbia.
|
56
|
Mannige RV. Dynamic New World: Refining Our View of Protein Structure, Function and Evolution. Proteomes 2014; 2:128-153. [PMID: 28250374 PMCID: PMC5302727 DOI: 10.3390/proteomes2010128] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/17/2013] [Revised: 02/12/2014] [Accepted: 02/20/2014] [Indexed: 01/06/2023] Open
Abstract
Proteins are crucial to the functioning of all lifeforms. Traditional understanding posits that a single protein occupies a single structure ("fold"), which performs a single function. This view is radically challenged by the recognition that high structural dynamism (the capacity to be extra "floppy") is more prevalent in functional proteins than previously assumed. As reviewed here, this dynamic take on proteins affects our understanding of protein "structure", function, and evolution, and even gives us a glimpse into protein origination. Specifically, this review will discuss historical developments concerning protein structure, and important new relationships between dynamism and aspects of protein sequence, structure, binding modes, binding promiscuity, evolvability, and origination. Along the way, suggestions will be provided for how key parts of textbook definitions, which have so far excluded intrinsically disordered proteins (IDPs), could be modified to accommodate our more dynamic understanding of proteins.
Affiliation(s)
- Ranjan V Mannige
- Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA.
|
57
|
Das S, Pal U, Das S, Bagga K, Roy A, Mrigwani A, Maiti NC. Sequence complexity of amyloidogenic regions in intrinsically disordered human proteins. PLoS One 2014; 9:e89781. [PMID: 24594841 PMCID: PMC3940659 DOI: 10.1371/journal.pone.0089781] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/17/2013] [Accepted: 01/26/2014] [Indexed: 01/03/2023] Open
Abstract
An amyloidogenic region (AR) in a protein sequence plays a significant role in protein aggregation and amyloid formation. We have investigated the sequence complexity of ARs present in intrinsically disordered human proteins. More than 80% of human proteins in the disordered protein databases (DisProt+IDEAL) contained one or more ARs. AR content in the protein sequence decreased as protein disorder decreased. A probability density distribution analysis and discrete analysis of AR sequences showed that ∼8% of the residues in a protein sequence were in ARs and that these regions were on average 8 residues long. The residues in the ARs were high in sequence complexity and seldom overlapped with low complexity regions (LCRs), which were abundant in disordered proteins. The sequences in the ARs showed mixed conformational adaptability towards α-helix, β-sheet/strand and coil conformations.
Affiliation(s)
- Swagata Das
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Uttam Pal
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Supriya Das
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Khyati Bagga
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Anupam Roy
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Arpita Mrigwani
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
| | - Nakul C. Maiti
- Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research (CSIR)-Indian Institute of Chemical Biology (IICB), Kolkata, India
|
58
|
Monastyrskyy B, Kryshtafovych A, Moult J, Tramontano A, Fidelis K. Assessment of protein disorder region predictions in CASP10. Proteins 2013; 82 Suppl 2:127-37. [PMID: 23946100 DOI: 10.1002/prot.24391] [Citation(s) in RCA: 124] [Impact Index Per Article: 11.3] [Received: 05/23/2013] [Revised: 06/14/2013] [Accepted: 06/18/2013] [Indexed: 12/12/2022]
Abstract
The article presents the assessment of disorder region predictions submitted to CASP10. The evaluation is based on the three measures tested in previous CASPs: (i) balanced accuracy, (ii) the Matthews correlation coefficient for the binary predictions, and (iii) the area under the curve in the receiver operating characteristic (ROC) analysis of predictions using probability annotation. We also performed new analyses such as comparison of the submitted predictions with those obtained with a Naïve disorder prediction method and with predictions from the disorder prediction databases D2P2 and MobiDB. On average, the methods participating in CASP10 demonstrated slightly better performance than those in CASP9.
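The three measures named in this abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of balanced accuracy, the Matthews correlation coefficient, and ROC AUC; the exact CASP10 implementation is not given in the abstract, so the function names, the toy labels, and the 0.5 threshold below are all assumptions made for illustration.

```python
from math import sqrt

def confusion(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = disordered, 0 = ordered)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (needs both classes present)."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a random disordered residue
    scores higher than a random ordered one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-residue annotation and predicted disorder probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(balanced_accuracy(y_true, y_pred), mcc(y_true, y_pred),
      roc_auc(y_true, scores))
```

Balanced accuracy and MCC score the binary calls, while the AUC uses the probability annotation directly and so does not depend on the chosen threshold.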
|
59
|
Kale A, Hire RS, Hadapad AB, D'Souza SF, Kumar V. Interaction between mosquito-larvicidal Lysinibacillus sphaericus binary toxin components: analysis of complex formation. Insect Biochemistry and Molecular Biology 2013; 43:1045-1054. [PMID: 23974012 DOI: 10.1016/j.ibmb.2013.07.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/06/2013] [Revised: 07/19/2013] [Accepted: 07/29/2013] [Indexed: 06/02/2023]
Abstract
The two components (BinA and BinB) of Lysinibacillus sphaericus binary toxin together are highly toxic to Culex and Anopheles mosquito larvae, and have been employed worldwide to control mosquito-borne diseases. Upon binding to the membrane receptor, an oligomeric form (BinA2.BinB2) of the binary toxin is expected to play a role in pore formation. It is not clear whether these two proteins also interact in solution, in the absence of receptor. The interactions between active forms of BinA and BinB polypeptides were probed in solution using size-exclusion chromatography, pull-down assay, surface plasmon resonance, circular dichroism, and by chemically crosslinking BinA and BinB components. We demonstrate that the two proteins interact weakly, with first association and dissociation rate constants of 4.5×10³ M⁻¹ s⁻¹ and 0.8 s⁻¹, resulting in a conformational change, most likely in the toxic BinA protein, that could kinetically favor membrane translocation of the active oligomer. The weak interactions between the two toxin components could be stabilized by glutaraldehyde crosslinking. The cross-linked complex, interestingly, showed the maximal Culex larvicidal activity (LC50 value of 1.59 ng mL⁻¹) reported so far for a combination of BinA/BinB components, and thus is an attractive option for development of new bio-pesticides for control of mosquito-borne vector diseases.
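To put the two rate constants in context, the following Python sketch converts them into an equilibrium dissociation constant using the usual relation K_d = k_off / k_on. This assumes a simple one-step 1:1 binding equilibrium for the first binding event only; the abstract does not state the full kinetic model, so the result is an illustration rather than a value reported by the authors. The function name is hypothetical.

```python
# Rate constants quoted in the abstract for the first binding step.
K_ON = 4.5e3   # association rate constant, M^-1 s^-1
K_OFF = 0.8    # dissociation rate constant, s^-1

def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant (in M) for a one-step
    1:1 binding model: K_d = k_off / k_on (assumed model)."""
    return k_off / k_on

kd_molar = dissociation_constant(K_ON, K_OFF)
kd_micromolar = kd_molar * 1e6  # convert M to micromolar

print(f"K_d is roughly {kd_micromolar:.0f} micromolar")
```

Under that assumption the constant comes out in the high micromolar range, which is consistent with the abstract's description of the interaction as weak.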
Affiliation(s)
- Avinash Kale
- High Pressure & Synchrotron Radiation Physics Division, Bhabha Atomic Research Centre, Mumbai 400085, India
|
60
|
Peng Z, Mizianty MJ, Kurgan L. Genome-scale prediction of proteins with long intrinsically disordered regions. Proteins 2013; 82:145-58. [PMID: 23798504 DOI: 10.1002/prot.24348] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 05/02/2013] [Accepted: 06/06/2013] [Indexed: 12/24/2022]
Abstract
Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster than the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized the abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/.
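The LDR definition used in this abstract (30 or more consecutive disordered residues) can be checked mechanically once a per-residue disorder annotation is available. The Python sketch below illustrates only that definition, which is the indirect route the abstract contrasts SLIDER with; it is not SLIDER's logistic-regression method. The 'D'/'O' label encoding and the function names are assumptions made for this example.

```python
from itertools import groupby

LDR_MIN_LENGTH = 30  # threshold stated in the abstract

def disordered_run_lengths(labels):
    """Lengths of maximal runs of 'D' (disordered) in a per-residue
    label string such as 'OOODDDDOO' (hypothetical encoding)."""
    return [sum(1 for _ in run)
            for is_disordered, run in groupby(labels, key=lambda c: c == "D")
            if is_disordered]

def has_ldr(labels, min_length=LDR_MIN_LENGTH):
    """True if any run of consecutive disordered residues reaches min_length."""
    return any(n >= min_length for n in disordered_run_lengths(labels))

# Hypothetical annotations: one protein with a 30-residue disordered run,
# one with two 29-residue runs separated by a single ordered residue.
with_ldr = "O" * 10 + "D" * 30 + "O" * 5
without_ldr = "D" * 29 + "O" + "D" * 29

print(has_ldr(with_ldr), has_ldr(without_ldr))
```

The second example shows why the run must be consecutive: a protein can be mostly disordered overall and still contain no LDR under this definition.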
Affiliation(s)
- Zhenling Peng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
|
61
|
Light S, Sagit R, Sachenkova O, Ekman D, Elofsson A. Protein Expansion Is Primarily due to Indels in Intrinsically Disordered Regions. Mol Biol Evol 2013; 30:2645-53. [DOI: 10.1093/molbev/mst157] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 12/16/2022] Open
|
62
|
Kahali B, Ghosh TC. Disorderness in Escherichia coli proteome: perception of folding fidelity and protein–protein interactions. J Biomol Struct Dyn 2013; 31:472-6. [DOI: 10.1080/07391102.2012.706071] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
|
63
|
Fan X, Kurgan L. Accurate prediction of disorder in protein chains with a comprehensive and empirically designed consensus. J Biomol Struct Dyn 2013; 32:448-64. [DOI: 10.1080/07391102.2013.775969] [Citation(s) in RCA: 113] [Impact Index Per Article: 10.3] [Indexed: 10/27/2022]
|
64
|
The N-terminal intrinsically disordered domain of Mgm101p is localized to the mitochondrial nucleoid. PLoS One 2013; 8:e56465. [PMID: 23418572 PMCID: PMC3572067 DOI: 10.1371/journal.pone.0056465] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/02/2012] [Accepted: 01/14/2013] [Indexed: 01/22/2023] Open
Abstract
The mitochondrial genome maintenance gene, MGM101, is essential for yeasts that depend on mitochondrial DNA replication. Previously, in Saccharomyces cerevisiae, it has been found that the carboxy-terminal two-thirds of Mgm101p has a functional core. Furthermore, there is a high level of amino acid sequence conservation in this region from widely diverse species. By contrast, the amino-terminal region, which is also essential for function, does not have recognizable conservation. Using a bioinformatic approach we find that the functional core from yeast and a corresponding region of Mgm101p from the coral Acropora millepora have an ordered structure, while the N-terminal domains of sequences from yeast and coral are predicted to be disordered. To examine whether ordered and disordered domains of Mgm101p have specific or general functions we made chimeric proteins from yeast and coral by swapping the two regions. We find, by an in vivo assay in S. cerevisiae, that the ordered domain of A. millepora can functionally replace the yeast core region but the disordered domain of the coral protein cannot substitute for its yeast counterpart. Mgm101p is found in the mitochondrial nucleoid along with enzymes and proteins involved in mtDNA replication. By attaching green fluorescent protein to the N-terminal disordered domain of yeast Mgm101p we find that GFP is still directed to the mitochondrial nucleoid where full-length Mgm101p-GFP is targeted.
|
65
|
Long indels are disordered: a study of disorder and indels in homologous eukaryotic proteins. Biochimica et Biophysica Acta - Proteins and Proteomics 2013; 1834:890-7. [PMID: 23333420 DOI: 10.1016/j.bbapap.2013.01.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/01/2012] [Revised: 12/30/2012] [Accepted: 01/03/2013] [Indexed: 11/21/2022]
Abstract
Proteins evolve through point mutations as well as by insertions and deletions (indels). During the last decade it has become apparent that protein regions that do not fold into three-dimensional structures, i.e. intrinsically disordered regions, are quite common. Here, we have studied the relationship between protein disorder and indels using HMM-HMM pairwise alignments in two sets of orthologous eukaryotic protein pairs. First, we show that disordered residues are much more frequent among indel residues than among aligned residues and are also more prevalent among indels than in coils. Second, we observed that disordered residues are particularly common in longer indels. Disordered indels of short-to-medium size are prevalent in the non-terminal regions of proteins, while the longest indels, ordered and disordered alike, occur toward the termini of the proteins, where new structural units are comparatively well tolerated. Finally, while disordered regions often evolve faster than ordered regions and disorder is common in indels, there are some previously recognized protein families where the disordered region is more conserved than the ordered region. We find that these rare proteins are often involved in information processes, such as RNA processing and translation. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.
|
66
|
Abstract
The introduction of the term ‘Tubulin Polymerization Promoting Protein (TPPP)-like proteins’ is suggested. They constitute a eukaryotic protein superfamily, characterized by the presence of the p25alpha domain (Pfam05517, IPR008907), and named after the first identified member, TPPP/p25, which exhibits a microtubule-stabilizing function. TPPP-like proteins can be grouped on the basis of two characteristics: the length of their p25alpha domain, which can be long, short, truncated or partial, and the presence or absence of additional domain(s). TPPPs, in the strict sense, contain no domain other than a single long or short p25alpha domain (long- and short-type TPPPs, respectively). Proteins possessing a truncated p25alpha domain are described for the first time in this paper. They evolved from the long-type TPPPs and can be considered arthropod-specific paralogs of long-type TPPPs. Phylogenetic analysis shows that the two groups (long-type and truncated TPPPs) split in the common ancestor of arthropods. Incomplete p25alpha domains can be found in multidomain TPPP-like proteins as well. The various subfamilies occur with a characteristic phyletic distribution: e.g., animal genomes/proteomes contain almost without exception long-type TPPPs; the multidomain apicortins occur almost exclusively in apicomplexan parasites. There are no data about the physiological function of these proteins except for two human long-type TPPP paralogs, which are involved in developmental processes of the brain and the musculoskeletal system, respectively. I predict that the superfamily members containing a long or partial p25alpha domain are often intrinsically disordered proteins, while those with short or truncated domain(s) are structurally ordered. Interestingly, members of this superfamily that are connected, or possibly connected, to diseases are intrinsically disordered proteins.
Affiliation(s)
- Ferenc Orosz
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary.
|
67
|
Bardwell JCA, Jakob U. Conditional disorder in chaperone action. Trends Biochem Sci 2012; 37:517-25. [PMID: 23018052 DOI: 10.1016/j.tibs.2012.08.006] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 05/14/2012] [Revised: 08/17/2012] [Accepted: 08/29/2012] [Indexed: 11/18/2022]
Abstract
Protein disorder remains an intrinsically fuzzy concept. Its role in protein function is difficult to conceptualize and its experimental study is challenging. Although a wide variety of roles for protein disorder have been proposed, establishing that disorder is functionally important, particularly in vivo, is not a trivial task. Several molecular chaperones have now been identified as conditionally disordered proteins; fully folded and chaperone-inactive under non-stress conditions, they adopt a partially disordered conformation upon exposure to distinct stress conditions. This disorder appears to be vital for their ability to bind multiple aggregation-sensitive client proteins and to protect cells against the stressors. The study of these conditionally disordered chaperones should prove useful in understanding the functional role for protein disorder in molecular recognition.
Affiliation(s)
- James C A Bardwell
- Howard Hughes Medical Institute, University of Michigan, Ann Arbor, MI 48109, USA
|
68
|
Yruela I, Contreras-Moreira B. Protein disorder in plants: a view from the chloroplast. BMC Plant Biology 2012; 12:165. [PMID: 22970728 PMCID: PMC3460767 DOI: 10.1186/1471-2229-12-165] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 05/02/2012] [Accepted: 09/10/2012] [Indexed: 05/08/2023]
Abstract
BACKGROUND: The intrinsically unstructured state of some proteins, observed in all living organisms, is essential for basic cellular functions. In this field the available information from plants is limited, but a point has been reached where these proteins can be comprehensively classified on the basis of disorder, function and evolution. RESULTS: Our analysis of plant genomes confirms that nuclear-encoded proteins follow the same trend as those of other multi-cellular eukaryotes; however, chloroplast- and mitochondria-encoded proteins conserve the patterns of Archaea and Bacteria, in agreement with their phylogenetic origin. Based on current knowledge about gene transfer from the chloroplast to the nucleus, we report a strong correlation between the rate of disorder of transferred and nuclear-encoded proteins, even for polypeptides that play functional roles back in the chloroplast. We further investigate this trend by reviewing the set of chloroplast ribosomal proteins, one of the most representative transferred gene clusters, finding that the ribosomal large subunit, assembled from a majority of nuclear-encoded proteins, is clearly more unstructured than the small one, which integrates mostly plastid-encoded proteins. CONCLUSIONS: Our observations suggest that the evolutionary dynamics of the plant nucleus adds disordered segments to genes alike, regardless of their origin, with the notable exception of proteins currently encoded in both genomes, probably due to functional constraints.
Affiliation(s)
- Inmaculada Yruela
- Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas (EEAD-CSIC), Avda. Montañana, 1005, Zaragoza, 50059, Spain
| | - Bruno Contreras-Moreira
- Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas (EEAD-CSIC), Avda. Montañana, 1005, Zaragoza, 50059, Spain
- Institute of Biocomputation and Physics of Complex Systems (BIFI), Universidad de Zaragoza, Mariano Esquillor, Edificio I + D, Zaragoza, 50018, Spain
- Fundación ARAID, Zaragoza, Spain
| |
Collapse
|
69
|
Lobanov MY, Sokolovskiy IV, Galzitskaya OV. IsUnstruct: prediction of the residue status to be ordered or disordered in the protein chain by a method based on the Ising model. J Biomol Struct Dyn 2012; 31:1034-43. [PMID: 22963167 DOI: 10.1080/07391102.2012.718529] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 01/05/2023]
Affiliation(s)
- Michail Yu Lobanov
  - Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
|
70
|
Seeger MA, Zhang Y, Rice SE. Kinesin tail domains are intrinsically disordered. Proteins 2012; 80:2437-46. [PMID: 22674872 DOI: 10.1002/prot.24128] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 04/05/2012] [Revised: 05/22/2012] [Accepted: 05/25/2012] [Indexed: 12/11/2022]
Abstract
Kinesin motor proteins transport a wide variety of molecular cargoes in a spatially and temporally regulated manner. Kinesin motor domains, which hydrolyze ATP to produce a directed mechanical force along a microtubule, are well conserved throughout the entire superfamily. Outside of the motor domains, kinesin sequences diverge along with their transport functions. The nonmotor regions, particularly the tails, respond to a wide variety of structural and molecular cues that enable kinesins to carry specific cargoes in response to particular cellular signals. Here, we demonstrate that intrinsic disorder is a common structural feature of kinesins. A bioinformatics survey of the full-length sequences of all 43 human kinesins predicts that significant regions of intrinsically disordered residues are present in all kinesins. These regions are concentrated in the nonmotor domains, particularly in the tails and near sites for ligand binding or post-translational modifications. In order to experimentally verify these predictions, we expressed and purified the tail domains of kinesins representing three different families (Kif5B, Kif10, and KifC3). Circular dichroism and NMR spectroscopy experiments demonstrate that the isolated tails are disordered in vitro, yet they retain their functional microtubule-binding activity. On the basis of these results, we propose that intrinsic disorder is a common structural feature that confers functional specificity to kinesins.
Affiliation(s)
- Mark A Seeger
  - Department of Cell and Molecular Biology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
|
71
|
Kovačević JJ. Computational analysis of position-dependent disorder content in DisProt database. GENOMICS PROTEOMICS & BIOINFORMATICS 2012; 10:158-65. [PMID: 22917189 PMCID: PMC5056116 DOI: 10.1016/j.gpb.2012.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/20/2011] [Revised: 01/27/2012] [Accepted: 01/31/2012] [Indexed: 11/27/2022]
Abstract
A bioinformatics analysis of the disorder content of proteins from the DisProt database has been performed with respect to the position of disordered residues. Each protein chain was divided into three parts: N- and C-terminal parts, each containing 30 amino acid (AA) residues, and the middle region containing the remaining AA residues. The results show that in the terminal parts, the percentage of disordered AA residues is higher than that of all AA residues (17% of disordered AA residues and 11% of all). We analyzed the percentage of disorder for each of the 20 AA residues in the three parts of proteins with respect to their hydropathy and molecular weight. For each AA, the percentage of disorder in the middle part is lower than that in the terminal parts, which is comparable at the two termini. A new scale of AAs has been introduced according to their disorder content in the middle part of proteins: CIFWMLYHRNVTAGQDSKEP. All big hydrophobic AAs are less frequently disordered, while almost all small hydrophilic AAs are more frequently disordered. The results obtained may be useful for constructing and improving predictors of protein disorder.
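The three-part split described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names, the boolean per-residue disorder mask, and the handling of chains shorter than 60 residues are hypothetical and are not taken from the paper.

```python
def split_three_parts(length, terminal=30):
    """Return index ranges (N-terminal, middle, C-terminal) for a chain.

    Assumption: chains shorter than 2 * terminal residues are rejected;
    the abstract does not say how such chains were handled.
    """
    if length < 2 * terminal:
        raise ValueError("chain too short to hold two terminal parts")
    n_term = range(0, terminal)
    middle = range(terminal, length - terminal)
    c_term = range(length - terminal, length)
    return n_term, middle, c_term


def position_disorder(mask, terminal=30):
    """Percentage of disordered residues in each of the three parts.

    `mask` is a hypothetical per-residue annotation: True means disordered.
    """
    names = ("n_term", "middle", "c_term")
    result = {}
    for name, part in zip(names, split_three_parts(len(mask), terminal)):
        # An empty middle part (chain of exactly 60 residues) reports 0.0.
        count = sum(1 for i in part if mask[i])
        result[name] = 100.0 * count / len(part) if len(part) else 0.0
    return result


# Toy chain of 100 residues: 15 disordered at the start, 30 at the end.
toy_mask = [True] * 15 + [False] * 55 + [True] * 30
toy_result = position_disorder(toy_mask)
```

Summing such counts over all chains in a dataset would give per-part totals comparable to the percentages quoted in the abstract.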
|
72
|
Rawat N, Biswas P. Hydrophobic moments, shape, and packing in disordered proteins. J Phys Chem B 2012; 116:6326-35. [PMID: 22582807 DOI: 10.1021/jp3016529] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
Abstract
Disordered proteins play a significant role in many biological processes and provide an attractive target for biophysical studies under physiological conditions. Disordered proteins may be classified as (a) proteins with overall well-defined secondary structures, interspersed with regions of missing residues, or (b) natively unstructured proteins which lack definite secondary structure. The spatial profile of the second-order hydrophobic moment for disordered proteins depicts the distribution of hydrophobic residues from the interior to the surface of the protein and indicates the lack of a well-formed hydrophobic core, unlike that of globular proteins. This trend is independent of the size or position of the disordered region in the sequence. The hydrophobicity profile of the ordered regions of the disordered proteins differs considerably from that of globular proteins, implying a role for the disordered parts and the significance of hydrophobic interactions in the folding of proteins. The shape asymmetry of the two classes of disordered proteins is determined by calculating the asphericity and shape parameters, derived from the Cartesian components of the radius of gyration tensor. Disordered proteins of group (a) are more spherical than the natively unstructured proteins of group (b), which are more prolate. Both groups of proteins exhibit types of side-chain backbone contacts similar to those of globular proteins. While disordered proteins contain few hydrophobic residues, natively unstructured proteins are characterized by residues of low mean hydrophobicity and high mean net charge.
Affiliation(s)
- Nidhi Rawat
  - Department of Chemistry, University of Delhi, Delhi-110007, India
|
73
|
Karlin D, Belshaw R. Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins. PLoS One 2012; 7:e31719. [PMID: 22403617 PMCID: PMC3293882 DOI: 10.1371/journal.pone.0031719] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 11/28/2011] [Accepted: 01/18/2012] [Indexed: 11/19/2022]
Abstract
Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P) plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11-16aa), several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains) that could be detected simply by comparing orthologous proteins.
Affiliation(s)
- David Karlin
  - Department of Zoology, University of Oxford, Oxford, United Kingdom.
|
74
|
Lobanov MY, Galzitskaya OV. Occurrence of disordered patterns and homorepeats in eukaryotic and bacterial proteomes. MOLECULAR BIOSYSTEMS 2012; 8:327-37. [DOI: 10.1039/c1mb05318c] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 01/22/2023]
|
75
|
Peng Z, Mizianty MJ, Xue B, Kurgan L, Uversky VN. More than just tails: intrinsic disorder in histone proteins. MOLECULAR BIOSYSTEMS 2012; 8:1886-901. [DOI: 10.1039/c2mb25102g] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.1] [Indexed: 11/21/2022]
|
76
|
Ghalwash MF, Dunker AK, Obradović Z. Uncertainty analysis in protein disorder prediction. MOLECULAR BIOSYSTEMS 2011; 8:381-91. [PMID: 22101336 DOI: 10.1039/c1mb05373f] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
A grand challenge in the proteomics and structural genomics era is the prediction of protein structure, including identification of those proteins that are partially or wholly unstructured. A number of predictors for identification of intrinsically disordered proteins (IDPs) have been developed over the last decade, but none can be taken as fully reliable on its own. Using a single model for prediction is typically inadequate because prediction based on only the most accurate model ignores model uncertainty. In this paper, we present an empirical method to specify and measure uncertainty associated with disorder predictions. In particular, we analyze the uncertainty in the reference model itself and the uncertainty in data. This is achieved by training a set of models and developing several meta-predictors on top of them. The best meta-predictor achieved comparable or better results than any single model, suggesting that incorporating different aspects of protein disorder prediction is important for the disorder prediction task. In addition, the best meta-predictor had more balanced sensitivity and specificity than any individual model. We also assessed the effects of changes in disorder prediction as a function of changes in the protein sequence. For collections of homologous sequences, we found that mutations caused many of the predicted disordered residues to be flipped to predicted ordered residues, while the reverse was observed much less frequently. These results suggest that disorder tendencies are more sensitive to allowed mutations than structure tendencies and that the conservation of disorder is indeed less stable than the conservation of structure. AVAILABILITY: Five meta-predictors and four single models developed for this study will be publicly and freely accessible for non-commercial use.
Affiliation(s)
- Mohamed F Ghalwash
  - Center for Data Analytics and Biomedical Informatics, Computer and Information Sciences Department, College of Science and Technology, Temple University, Philadelphia, PA 19122, USA.
|
77
|
Lobanov MY, Galzitskaya OV. Disordered patterns in clustered Protein Data Bank and in eukaryotic and bacterial proteomes. PLoS One 2011; 6:e27142. [PMID: 22073276 PMCID: PMC3208572 DOI: 10.1371/journal.pone.0027142] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 06/28/2011] [Accepted: 10/11/2011] [Indexed: 11/18/2022]
Abstract
We have constructed the clustered Protein Data Bank and obtained clusters of chains of different identity inside each cluster, http://bioinfo.protres.ru/st_pdb/. We have compiled the largest database of disordered patterns (141) from the clustered PDB where identity between chains inside a cluster is greater than or equal to 75% (version of 28 June 2010) by using simple rules of selection. The results of these analyses would help to further our understanding of the physicochemical and structural determinants of intrinsically disordered regions that serve as molecular recognition elements. We have analyzed the occurrence of the selected patterns in 97 eukaryotic and in 26 bacterial proteomes. The disordered patterns appear more often in eukaryotic than in bacterial proteomes. The matrix of correlation coefficients between numbers of proteins where a disordered pattern from the library of 141 disordered patterns appears at least once in 9 kingdoms of eukaryota and 5 phyla of bacteria has been calculated. As a rule, the correlation coefficients are higher inside the considered kingdom than between kingdoms. The patterns that occur frequently in proteomes have low complexity (PPPPP, GGGGG, EEEED, HHHH, KKKKK, SSTSS, QQQQQP), and the types of patterns vary across different proteomes, http://bioinfo.protres.ru/fp/search_new_pattern.html.
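The quantity counted in this abstract, the number of proteins in which a pattern appears at least once, can be illustrated with a short sketch. The toy proteome, its protein names, and the function name are invented for illustration; only the pattern strings come from the abstract, and exact substring matching is an assumption about how occurrences are defined.

```python
def proteins_with_pattern(proteome, patterns):
    """For each pattern, count proteins whose sequence contains it at least once.

    `proteome` maps a protein identifier to its one-letter sequence.
    Matching is exact substring search (an assumption, see lead-in).
    """
    return {
        pattern: sum(1 for seq in proteome.values() if pattern in seq)
        for pattern in patterns
    }


# Hypothetical toy proteome; real input would be parsed from a FASTA file.
toy_proteome = {
    "prot1": "MKPPPPPAGGGGGS",
    "prot2": "MSSTSSLKKKKKQ",
    "prot3": "MAPPPPPPLV",
}
toy_counts = proteins_with_pattern(
    toy_proteome, ["PPPPP", "GGGGG", "KKKKK", "HHHH"]
)
```

Vectors of such counts, one per proteome, are the kind of input from which a matrix of correlation coefficients between taxa could then be computed.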
Affiliation(s)
- Michail Yu. Lobanov
  - Group of Bioinformatics, Institute of Protein Research Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Oxana V. Galzitskaya
  - Group of Bioinformatics, Institute of Protein Research Russian Academy of Sciences, Pushchino, Moscow Region, Russia
|
78
|
Monastyrskyy B, Fidelis K, Moult J, Tramontano A, Kryshtafovych A. Evaluation of disorder predictions in CASP9. Proteins 2011; 79 Suppl 10:107-18. [PMID: 21928402 PMCID: PMC3212657 DOI: 10.1002/prot.23161] [Citation(s) in RCA: 105] [Impact Index Per Article: 8.1] [Received: 04/22/2011] [Revised: 07/11/2011] [Accepted: 07/15/2011] [Indexed: 11/10/2022]
Abstract
Lack of stable three-dimensional structure, or intrinsic disorder, is a common phenomenon in proteins. Unstructured regions have been shown to be essential for the function of many proteins, and therefore identification of such regions is an important issue. CASP has been assessing the state of the art in predicting disordered regions from amino acid sequence since 2002. Here, we present the results of the evaluation of the disorder predictions submitted to CASP9. The assessment is based on the evaluation measures and procedures used in previous CASPs. The balanced accuracy and the Matthews correlation coefficient were chosen as basic measures for evaluating the correctness of binary classifications. The area under the receiver operating characteristic curve was the measure of choice for evaluating probability-based predictions of disorder. The CASP9 methods are shown to perform slightly better than the CASP7 methods but not better than the methods in CASP8. It was also shown that the capability of most CASP9 methods to predict disorder decreases with increasing minimum disorder segment length.
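The two binary-classification measures named in this abstract have standard textbook definitions, sketched below. The per-residue label lists and function names are hypothetical, and CASP-specific details such as how unobserved residues are masked are not modeled here.

```python
import math


def confusion_counts(truth, pred):
    """Return (tp, tn, fp, fn) for per-residue labels, where 1 means disordered."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(truth, pred):
    """Mean of sensitivity and specificity; both classes must be present."""
    tp, tn, fp, fn = confusion_counts(truth, pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def matthews_cc(truth, pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(truth, pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example: 3 disordered and 5 ordered residues.
toy_truth = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_bacc = balanced_accuracy(toy_truth, toy_pred)  # (2/3 + 4/5) / 2
toy_mcc = matthews_cc(toy_truth, toy_pred)         # 7 / 15
```

Balanced accuracy is useful here because disordered residues are typically a minority class, so plain accuracy would reward predicting everything as ordered.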
Affiliation(s)
- Bohdan Monastyrskyy
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis, CA 95616, USA
- Krzysztof Fidelis
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis, CA 95616, USA
- John Moult
  - Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Anna Tramontano
  - Department of Physics, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Andriy Kryshtafovych
  - Genome Center, University of California, Davis, 451 Health Sciences Dr., Davis, CA 95616, USA
|
79
|
Mészáros B, Tóth J, Vértessy BG, Dosztányi Z, Simon I. Proteins with complex architecture as potential targets for drug design: a case study of Mycobacterium tuberculosis. PLoS Comput Biol 2011; 7:e1002118. [PMID: 21814507 PMCID: PMC3140968 DOI: 10.1371/journal.pcbi.1002118] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 02/07/2011] [Accepted: 05/24/2011] [Indexed: 02/04/2023]
Abstract
Lengthy co-evolution of Homo sapiens and Mycobacterium tuberculosis, the main causative agent of tuberculosis, resulted in a dramatically successful pathogen species that presents a considerable challenge for modern medicine. The continuous and ever-increasing appearance of multi-drug resistant mycobacteria necessitates the identification of novel drug targets and drugs with new mechanisms of action. However, further insights are needed to establish automated protocols for target selection based on the available complete genome sequences. In the present study, we perform complete proteome-level comparisons between M. tuberculosis, mycobacteria, other prokaryotes and available eukaryotes based on protein domains, local sequence similarities and protein disorder. We show that the enrichment of certain domains in the genome can indicate an important function specific to M. tuberculosis. We identified two families, termed pkn and PE/PPE, that stand out in this respect. The common property of these two protein families is a complex domain organization that combines species-specific regions, commonly occurring domains and disordered segments. Besides highlighting promising novel drug target candidates in M. tuberculosis, the presented analysis can also be viewed as a general protocol to identify proteins involved in species-specific functions in a given organism. We conclude that target selection protocols should be extended to include proteins with complex domain architectures instead of focusing on sequentially unique and essential proteins only.
Affiliation(s)
- Bálint Mészáros
  - Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
- Judit Tóth
  - Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
- Beáta G. Vértessy
  - Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
  - Department of Applied Biotechnology, Budapest University of Technology and Economics, Budapest, Hungary
- Zsuzsanna Dosztányi
  - Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
- István Simon
  - Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
|
80
|
Orosz F. Apicomplexan apicortins possess a long disordered N-terminal extension. INFECTION GENETICS AND EVOLUTION 2011; 11:1037-44. [DOI: 10.1016/j.meegid.2011.03.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 11/03/2010] [Revised: 03/24/2011] [Accepted: 03/25/2011] [Indexed: 01/01/2023]
|
81
|
Song J, Tan H, Boyd SE, Shen H, Mahmood K, Webb GI, Akutsu T, Whisstock JC, Pike RN. Bioinformatic approaches for predicting substrates of proteases. J Bioinform Comput Biol 2011; 9:149-78. [PMID: 21328711 DOI: 10.1142/s0219720011005288] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 07/15/2010] [Revised: 10/08/2010] [Accepted: 10/09/2010] [Indexed: 11/18/2022]
Abstract
Proteases have central roles in "life and death" processes due to their important ability to catalytically hydrolyze protein substrates, usually altering the function and/or activity of the target in the process. Knowledge of the substrate specificity of a protease should, in theory, dramatically improve the ability to predict target protein substrates. However, experimental identification and characterization of protease substrates is often difficult and time-consuming. Thus solving the "substrate identification" problem is fundamental to both understanding protease biology and the development of therapeutics that target specific protease-regulated pathways. In this context, bioinformatic prediction of protease substrates may provide useful and experimentally testable information about novel potential cleavage sites in candidate substrates. In this article, we provide an overview of recent advances in developing bioinformatic approaches for predicting protease substrate cleavage sites and identifying novel putative substrates. We discuss the advantages and drawbacks of the current methods and detail how more accurate models can be built by deriving multiple sequence and structural features of substrates. We also provide some suggestions about how future studies might further improve the accuracy of protease substrate specificity prediction.
Affiliation(s)
- Jiangning Song
  - Department of Biochemistry and Molecular Biology, Monash University, Victoria 3800, Australia.
|
82
|
Mészáros B, Simon I, Dosztányi Z. The expanding view of protein-protein interactions: complexes involving intrinsically disordered proteins. Phys Biol 2011; 8:035003. [PMID: 21572179 DOI: 10.1088/1478-3975/8/3/035003] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 12/26/2022]
Abstract
A frequently neglected aspect of protein-protein interactions is flexibility. Small-scale fluctuations are present even in globular proteins, and alternative conformations can have a significant influence on the binding process. However, flexibility becomes highly prominent in complexes involving intrinsically disordered proteins. The importance of disordered regions in protein interactions has been recognized only relatively recently. In this survey we examine the basic properties of the complexes of disordered and ordered proteins from three different directions. The comparison of the interface properties shows that although disordered proteins can also adopt well-defined conformations in their bound form, their inherently dynamic nature is cast into their complexes. Furthermore, an overview of prediction methods indicates that disordered proteins as well as their binding regions can be recognized from the amino acid sequence by capturing the basic biophysical properties of these segments. Finally, we propose the generalization of the 'energy landscape model' for the description of complex formation that can help to put the various types of protein associations on a common ground.
Affiliation(s)
- Bálint Mészáros
  - Institute of Enzymology, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
|
83
|
Lobanov MY, Galzitskaya OV. The Ising model for prediction of disordered residues from protein sequence alone. Phys Biol 2011; 8:035004. [PMID: 21572175 DOI: 10.1088/1478-3975/8/3/035004] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Indexed: 11/12/2022]
Abstract
Intrinsically disordered regions serve as molecular recognition elements, which play an important role in the control of many cellular processes and signaling pathways. It is useful to be able to predict positions of disordered residues and disordered regions in protein chains using protein sequence alone. A new method (IsUnstruct) based on the Ising model for prediction of disordered residues from protein sequence alone has been developed. According to this model, each residue can be in one of two states: ordered or disordered. The model is an approximation of the Ising model in which the interaction term between neighbors has been replaced by a penalty for changing between states (the energy of the border). IsUnstruct has been compared with other available methods and found to perform well. The method correctly finds 77% of disordered residues as well as 87% of ordered residues in the CASP8 database, and 72% of disordered residues as well as 85% of ordered residues in the DisProt database.
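The two-state picture with a border penalty can be illustrated with a small dynamic-programming sketch. This is not the IsUnstruct implementation: it finds the single minimum-energy assignment of states (a Viterbi-style simplification chosen for brevity), and the per-residue energies, the penalty value, and the function name are all hypothetical.

```python
def min_energy_states(disorder_energy, border_penalty):
    """Return the lowest-energy state path (0 = ordered, 1 = disordered).

    disorder_energy[i] is a hypothetical energy of residue i being disordered,
    measured relative to the ordered state (ordered energy is taken as 0).
    Each switch between neighboring states costs `border_penalty`.
    """
    if not disorder_energy:
        return []
    # cost[s]: best total energy of the prefix that ends in state s.
    cost = [0.0, float(disorder_energy[0])]
    back = []  # back[i][s]: state of residue i that leads to state s at i + 1
    for energy in disorder_energy[1:]:
        new_cost, pointers = [], []
        for state, own in ((0, 0.0), (1, float(energy))):
            stay = cost[state]
            switch = cost[1 - state] + border_penalty
            if stay <= switch:
                new_cost.append(stay + own)
                pointers.append(state)
            else:
                new_cost.append(switch + own)
                pointers.append(1 - state)
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy energies: a disorder-favoring stretch flanked by order-favoring residues.
toy_energies = [1, 1, -2, -2, -2, 1, 1]
toy_path = min_energy_states(toy_energies, border_penalty=1.0)
```

With a small penalty the disorder-favoring stretch is labeled disordered on its own; raising the penalty suppresses switches, so the whole chain ends up in whichever single state has the lower total energy.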
Affiliation(s)
- Michail Yu Lobanov
  - Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region 142290, Russia
|
84
|
Abstract
MOTIVATION: Predictions, and to a lesser extent experiments, following the decoding of the human genome showed that a significant fraction of gene products do not have well-defined 3D structures. While the presence of structured domains traditionally suggested function, it was not clear what the absence of structure implied. These and many other findings initiated extensive theoretical and experimental research into these types of proteins, commonly known as intrinsically disordered proteins (IDPs). Crucial to understanding IDPs is the evaluation of structural predictors based on different principles and trained on various datasets, which is currently the subject of active research. The view is emerging that structural disorder can be considered as a separate structural category and not simply as the absence of secondary and/or tertiary structure. IDPs perform essential functions, and their improper functioning is responsible for human diseases such as neurodegenerative disorders.
Affiliation(s)
- Ferenc Orosz
  - Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Karolina út 29, Budapest, H-1113 Hungary.
|
85
|
Perkins JR, Diboun I, Dessailly BH, Lees JG, Orengo C. Transient protein-protein interactions: structural, functional, and network properties. Structure 2010; 18:1233-43. [PMID: 20947012 DOI: 10.1016/j.str.2010.08.007] [Citation(s) in RCA: 370] [Impact Index Per Article: 28.5] [Received: 04/08/2010] [Revised: 07/13/2010] [Accepted: 08/02/2010] [Indexed: 11/28/2022]
Abstract
Transient interactions, which involve protein interactions that are formed and broken easily, are important in many aspects of cellular function. Here we describe structural and functional properties of transient interactions between globular domains and between globular domains, short peptides, and disordered regions. The importance of posttranslational modifications in transient interactions is also considered. We review techniques used in the detection of the different types of transient protein-protein interactions. We also look at the role of transient interactions within protein-protein interaction networks and consider their contribution to different aspects of these networks.
Affiliation(s)
- James R Perkins
  - Department of Structural and Molecular Biology, University College of London, Gower Street, WC1E 6BT London, UK.
|
86
|
Pavlović-Lažetić GM, Mitić NS, Kovačević JJ, Obradović Z, Malkov SN, Beljanski MV. Bioinformatics analysis of disordered proteins in prokaryotes. BMC Bioinformatics 2011; 12:66. [PMID: 21366926 PMCID: PMC3062596 DOI: 10.1186/1471-2105-12-66] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 07/10/2010] [Accepted: 03/02/2011] [Indexed: 01/06/2023]
Abstract
BACKGROUND: A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well-defined 3D structure. It has also been shown that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to the functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs: Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. RESULTS: Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups in particular (specifically the COGs L, repair function, and N, cell motility and secretion) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc.
Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content-facultative anaerobic or aquatic or mesophilic organisms, etc. Maximum disorder in bacteria is observed for high GC content-high genome size organisms, high genome size-aerobic organisms, etc. Some of the most reliable association rules mined establish relationships between high GC content and high protein disorder, medium GC content and both medium and low protein disorder, anaerobic organisms and medium protein disorder, Gammaproteobacteria and low protein disorder, etc. A web site, the Prokaryote Disorder Database, has been designed and implemented at the address http://bioinfo.matf.bg.ac.rs/disorder, which contains complete results of the analysis of protein disorder performed for 296 completely sequenced prokaryotic genomes. CONCLUSIONS: Exhaustive disorder analysis has been performed by functional classes of proteins, for a larger dataset of prokaryotic organisms than previously done. Results obtained are well correlated to those previously published, with some extension in the range of disorder level and clear distinction between functional classes of proteins. Wide correlation and association analysis between protein disorder and genomic and ecological characteristics has been performed for the first time. The results obtained give insight into multi-relationships among the characteristics and protein disorder. Such analysis provides for better understanding of the evolutionary process and may be useful for taxon determination. The main drawback of the approach is the fact that the disorder considered has been predicted and not experimentally established.
|
87
|
Tamburstuen MV, Reseland JE, Spahr A, Brookes SJ, Kvalheim G, Slaby I, Snead ML, Lyngstadaas SP. Ameloblastin expression and putative autoregulation in mesenchymal cells suggest a role in early bone formation and repair. Bone 2011; 48:406-13. [PMID: 20854943 PMCID: PMC4469498 DOI: 10.1016/j.bone.2010.09.007] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 11/14/2009] [Revised: 08/24/2010] [Accepted: 09/07/2010] [Indexed: 10/19/2022]
Abstract
Ameloblastin is mainly known as a dental enamel protein, synthesized and secreted into the developing enamel matrix by the enamel-forming ameloblasts. The function of ameloblastin in tooth development remains unclear, but it has been suggested to be involved in processes ranging from regulating crystal growth to acting as a growth factor or taking part in cell signaling. Recent studies suggest that some enamel matrix proteins might also have important functions outside enamel formation. In this context, ameloblastin has recently been reported to induce dentin and bone repair, and to be present in the early bone and cartilage extracellular matrices during embryogenesis. However, which cells express ameloblastin in these tissues remains unclear. Thus, the expression of ameloblastin was examined in cultured primary mesenchymal cells and in vivo during healing of bone defects in a "proof of concept" animal study. Real-time RT-PCR analysis revealed human ameloblastin (AMBN) mRNA expression in human mesenchymal stem cells and in primary osteoblasts and chondrocytes. Expression of AMBN mRNA was also confirmed in human CD34-positive cells and osteoclasts. Western and dot blot analysis of cell lysates and medium confirmed the expression and secretion of ameloblastin from mesenchymal stem cells, primary human osteoblasts and chondrocytes. Expression of ameloblastin was also detected in newly formed bone in experimental bone defects in adult rats. Together, these findings suggest a role for this protein in early bone formation and repair.
Affiliation(s)
- Janne E. Reseland
  - Department of Biomaterials, Faculty of Dentistry, University of Oslo (UiO), Oslo, Norway
- Steven J. Brookes
  - Department of Oral Biology, Leeds Dental Institute, University of Leeds, Leeds, UK
- Gunnar Kvalheim
  - Department of Cellular Therapy, Radiumhospitalet, Oslo University Hospital, Oslo, Norway
- Ivan Slaby
  - Department of Biomaterials, Faculty of Dentistry, University of Oslo (UiO), Oslo, Norway
- S. Petter Lyngstadaas
  - Department of Biomaterials, Faculty of Dentistry, University of Oslo (UiO), Oslo, Norway
|
88
|
Lobanov MY, Furletova EI, Bogatyreva NS, Roytberg MA, Galzitskaya OV. Library of disordered patterns in 3D protein structures. PLoS Comput Biol 2010; 6:e1000958. [PMID: 20976197 PMCID: PMC2954861 DOI: 10.1371/journal.pcbi.1000958] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 06/01/2010] [Accepted: 09/16/2010] [Indexed: 01/11/2023]
Abstract
Intrinsically disordered regions serve as molecular recognition elements, which play an important role in the control of many cellular processes and signaling pathways. It is useful to be able to predict the positions of disordered regions in protein chains. A statistical analysis of disordered residues was carried out on 34,464 unique protein chains taken from the PDB database. In this database, 4.95% of residues are disordered (i.e. invisible in X-ray structures). The statistics were obtained separately for the N- and C-termini as well as for the central part of the protein chain. It has been shown that the frequencies of occurrence of disordered residues of the 20 amino acid types at the termini of protein chains differ from those in the middle part of the protein chain. Our systematic analysis of disordered regions in the PDB revealed 109 disordered patterns of different lengths. Each of them has disordered occurrences in at least five protein chains with identity less than 20%. The vast majority of all occurrences of each disordered pattern are disordered. This allows one to use the library of disordered patterns to predict whether a given residue of a protein is ordered or disordered. We analyzed the occurrence of the selected patterns in three eukaryotic and three bacterial proteomes.
Affiliation(s)
- Michail Yu. Lobanov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
- Eugeniya I. Furletova
  - Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Russia
  - Pushchino University, Pushchino, Russia
- Michail A. Roytberg
  - Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Pushchino, Russia
  - Pushchino University, Pushchino, Russia
|