Reference Citation Analysis

For: Liu J, Tan H, Rost B. Loopy proteins appear conserved in evolution. J Mol Biol 2002;322:53-64. [PMID: 12215414 DOI: 10.1016/s0022-2836(02)00736-2] [Citation(s) in RCA: 141] [Impact Index Per Article: 6.4] [Indexed: 10/27/2022]

Cited by Other Article(s)

Protein disorder--a breakthrough invention of evolution? Curr Opin Struct Biol 2011;21:412-8. [PMID: 21514145 DOI: 10.1016/j.sbi.2011.03.014] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.6] [Received: 02/14/2011] [Revised: 03/29/2011] [Accepted: 03/29/2011] [Indexed: 11/21/2022]

Aravind L, Abhiman S, Iyer LM. Natural history of the eukaryotic chromatin protein methylation system. Prog Mol Biol Transl Sci 2011;101:105-76. [PMID: 21507350 DOI: 10.1016/b978-0-12-387685-0.00004-4] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 12/16/2022]

Abstract

In eukaryotes, methylation of nucleosomal histones and other nuclear proteins is a central aspect of chromatin structure and dynamics. The past 15 years have seen an enormous advance in our understanding of the biochemistry of these modifications, and of their role in establishing the epigenetic code. We provide a synthetic overview, from an evolutionary perspective, of the main players in the eukaryotic chromatin protein methylation system, with an emphasis on catalytic domains. Several components of the eukaryotic protein methylation system had their origins in bacteria. In particular, the Rossmann fold protein methylases (PRMTs and DOT1), and the LSD1 and jumonji-related demethylases and oxidases, appear to have emerged in the context of bacterial peptide methylation and hydroxylation systems. These systems were originally involved in synthesis of peptide secondary metabolites, such as antibiotics, toxins, and siderophores. The peptidylarginine deiminases appear to have been acquired by animals from bacterial enzymes that modify cell-surface proteins. SET domain methylases, which display the β-clip fold, apparently first emerged in prokaryotes from the SAF superfamily of carbohydrate-binding domains. However, even in bacteria, a subset of the SET domains might have evolved a chromatin-related role in conjunction with a BAF60a/b-like SWIB domain protein and topoisomerases. By the time of the last eukaryotic common ancestor, multiple SET and PRMT methylases were already in place and are likely to have mediated methylation at the H3K4, H3K9, H3K36, and H4K20 positions, and carried out both asymmetric and symmetric arginine dimethylation. Inference of H3K27 methylation in the ancestral eukaryote appears uncertain, though it was certainly in place a little later in eukaryotic evolution. Current data suggest that unlike SET methylases, which are universally present in eukaryotes, demethylases are not. 
They appear to be absent in the earliest-branching eukaryotic lineages, and emerged later along with several other chromatin proteins, such as the Dot1-methylase, prior to divergence of the kinetoplastid-heterolobosean lineage from the remaining eukaryotes. This period also corresponds to the point of origin of DNA cytosine methylation by DNMT1. Origin of major lineages of SET domains such as the Trithorax, Su(var)3-9, Ash1, SMYD, and TTLL12 and E(Z) might have played the initial role in the establishment of multiple distinct heterochromatic and euchromatic states that are likely to have been present, in some form, through much of eukaryotic evolution. Elaboration of these chromatin states might have gone hand-in-hand with acquisition of multiple jumonji-related and LSD1-like demethylases, and functional linkages with the DNA methylation and RNAi systems. Throughout eukaryotic evolution, there were several lineage-specific expansions of SET domain proteins, which might be related to a special transcription regulation process in trypanosomes, acquisition of new meiotic recombination hotspots in animals, and methylation and associated modifications of the diatom silaffin proteins involved in silica biomineralization. The use of specific domains to "read" the methylation marks appears to have been present in the ancestral eukaryote itself. Of these the chromo-like domains appear to have been acquired from bacterial secreted proteins that might have a role in binding cell-surface peptides or peptidoglycan. Domain architectures of the primary enzymes involved in the eukaryotic protein methylation system indicate key features relating to interactions with each other and other modifications in chromatin, such as acetylation. They also emphasize the profound functional distinction between the role of demethylation and deacetylation in regulation of chromatin dynamics.

Niu S, Huang T, Feng K, Cai Y, Li Y. Prediction of Tyrosine Sulfation with mRMR Feature Selection and Analysis. J Proteome Res 2010;9:6490-7. [DOI: 10.1021/pr1007152] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]

Affiliation(s)

Shen Niu, Tao Huang, Kaiyan Feng, Yudong Cai, Yixue Li: Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China; Institute of Systems Biology, Shanghai University, Shanghai 200444, P. R. China; Shanghai Center for Bioinformation Technology, Shanghai 200235, P. R. China; and Centre for Computational Systems Biology, Fudan University, Shanghai 200433, P. R. China

Habchi J, Mamelli L, Darbon H, Longhi S. Structural disorder within Henipavirus nucleoprotein and phosphoprotein: from predictions to experimental assessment. PLoS One 2010;5:e11684. [PMID: 20657787 PMCID: PMC2908138 DOI: 10.1371/journal.pone.0011684] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Received: 05/23/2010] [Accepted: 06/21/2010] [Indexed: 12/30/2022] [Open Access]

Fong JH, Panchenko AR. Intrinsic disorder and protein multibinding in domain, terminal, and linker regions. Mol Biosyst 2010;6:1821-8. [PMID: 20544079 DOI: 10.1039/c005144f] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 02/02/2023]

Haritos VS, Niranjane A, Weisman S, Trueman HE, Sriskantha A, Sutherland TD. Harnessing disorder: onychophorans use highly unstructured proteins, not silks, for prey capture. Proc Biol Sci 2010;277:3255-63. [PMID: 20519222 DOI: 10.1098/rspb.2010.0604] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 01/23/2023] [Open Access]

Uversky VN, Dunker AK. Understanding protein non-folding. Biochim Biophys Acta 2010;1804:1231-64. [PMID: 20117254 PMCID: PMC2882790 DOI: 10.1016/j.bbapap.2010.01.017] [Citation(s) in RCA: 901] [Impact Index Per Article: 64.4] [Received: 11/21/2009] [Revised: 01/09/2010] [Accepted: 01/21/2010] [Indexed: 02/07/2023]

Xue B, Li L, Meroueh SO, Uversky VN, Dunker AK. Analysis of structured and intrinsically disordered regions of transmembrane proteins. Mol Biosyst 2010;5:1688-1702. [PMID: 19585006 DOI: 10.1039/b905913j] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Indexed: 11/21/2022]

Abstract

Integral membrane proteins display two major types of transmembrane structure, helical bundles and beta barrels. The main functional roles of transmembrane proteins are the transport of small molecules and cell signaling, and sometimes these two roles are coupled. For cytosolic, water-soluble proteins, signaling and regulatory functions are often carried out by intrinsically disordered regions. Our long range goal is to determine whether integral membrane proteins likewise use disordered regions for signaling and regulation. Here we carried out a systematic bioinformatics investigation of intrinsically disordered regions obtained from integral membrane proteins for which crystal structures have been determined, and for which the intrinsic disorder was identified as missing electron density. We found 120 disorder-containing integral membrane proteins having a total of 33675 residues, with 3209 of the residues distributed among 240 different disordered regions. These disordered regions were compared with those obtained from water-soluble proteins with regards to their amino acid compositional biases, and to the accuracies of various disorder predictors. The results of these analyses show that the disordered regions from helical bundle integral membrane proteins, those from beta barrel integral membrane proteins, and those from water soluble proteins all exhibit statistically distinct amino acid compositional biases. Despite these differences in composition, current algorithms make reasonably accurate predictions of disorder for these membrane proteins. Although the small size of the current data sets are limiting, these results suggest that developing new predictors that make use of data from disordered regions in helical bundles and beta barrels, especially as these datasets increase in size, will likely lead to significantly more accurate disorder predictions for these two classes of integral membrane proteins.

Arnot CJ, Gay NJ, Gangloff M. Molecular mechanism that induces activation of Spätzle, the ligand for the Drosophila Toll receptor. J Biol Chem 2010;285:19502-9. [PMID: 20378549 DOI: 10.1074/jbc.m109.098186] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Indexed: 11/06/2022] [Open Access]

Schaefer C, Schlessinger A, Rost B. Protein secondary structure appears to be robust under in silico evolution while protein disorder appears not to be. Bioinformatics 2010;26:625-31. [PMID: 20081223 PMCID: PMC2828120 DOI: 10.1093/bioinformatics/btq012] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]

Abstract

Motivation: The mutation of amino acids often impacts protein function and structure. Mutations without negative effect sustain evolutionary pressure. We study a particular aspect of structural robustness with respect to mutations: regular protein secondary structure and natively unstructured (intrinsically disordered) regions. Is the formation of regular secondary structure an intrinsic feature of amino acid sequences, or is it a feature that is lost upon mutation and is maintained by evolution against the odds? Similarly, is disorder an intrinsic sequence feature or is it difficult to maintain? To tackle these questions, we in silico mutated native protein sequences into random sequence-like ensembles and monitored the change in predicted secondary structure and disorder.

Results: We established that by our coarse-grained measures for change, predictions and observations were similar, suggesting that our results were not biased by prediction mistakes. Changes in secondary structure and disorder predictions were linearly proportional to the change in sequence. Surprisingly, neither the content nor the length distribution for the predicted secondary structure changed substantially. Regions with long disorder behaved differently in that significantly fewer such regions were predicted after a few mutation steps. Our findings suggest that the formation of regular secondary structure is an intrinsic feature of random amino acid sequences, while the formation of long-disordered regions is not an intrinsic feature of proteins with disordered regions. Put differently, helices and strands appear to be maintained easily by evolution, whereas maintaining disordered regions appears difficult. Neutral mutations with respect to disorder are therefore very unlikely.

Contact: schaefer@rostlab.org

Supplementary Information: Supplementary data are available at Bioinformatics online.
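
As a rough illustration of the in silico mutation idea in the abstract above, the sketch below drifts a native sequence toward a random one, one substitution per step, and records the remaining identity to the native sequence. The function names, the uniform choice of replacement residue, and the example sequence are assumptions made for illustration; the abstract does not specify the authors' mutation scheme. In the study, a secondary structure or disorder predictor would be run on each intermediate sequence, which is not shown here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def mutate_once(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a different residue."""
    pos = rng.randrange(len(seq))
    alternatives = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
    return seq[:pos] + rng.choice(alternatives) + seq[pos + 1:]


def identity(a: str, b: str) -> float:
    """Fraction of aligned positions that carry the same residue."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mutation_trajectory(native: str, steps: int, seed: int = 0):
    """Return (step, sequence, identity to native) after each mutation step."""
    rng = random.Random(seed)
    current = native
    trajectory = [(0, current, 1.0)]
    for step in range(1, steps + 1):
        current = mutate_once(current, rng)
        trajectory.append((step, current, identity(native, current)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical example sequence, not taken from the paper.
    for step, seq, ident in mutation_trajectory("MKTAYIAKQRQISFVKSHFSRQ", steps=5):
        print(step, seq, f"{ident:.2f}")
```

Plotting a predicted property (for example, the number of long disordered regions) against the recorded identity would give the kind of curve the abstract describes, under whatever predictor is plugged in.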

Pentony MM, Ward J, Jones DT. Computational resources for the prediction and analysis of native disorder in proteins. Methods Mol Biol 2010;604:369-93. [PMID: 20013384 DOI: 10.1007/978-1-60761-444-9_25] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/16/2022]

Swaminathan K, Adamczak R, Porollo A, Meller J. Enhanced Prediction of Conformational Flexibility and Phosphorylation in Proteins. Adv Exp Med Biol 2010;680:307-19. [DOI: 10.1007/978-1-4419-5913-3_35] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/21/2023]

Rossi P, Swapna GVT, Huang YJ, Aramini JM, Anklin C, Conover K, Hamilton K, Xiao R, Acton TB, Ertekin A, Everett JK, Montelione GT. A microscale protein NMR sample screening pipeline. J Biomol NMR 2010;46:11-22. [PMID: 19915800 PMCID: PMC2797623 DOI: 10.1007/s10858-009-9386-z] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Received: 10/09/2009] [Accepted: 10/14/2009] [Indexed: 05/14/2023]

Affiliation(s)

Paolo Rossi, G. V. T. Swapna, Yuanpeng J. Huang, James M. Aramini, Kenith Conover, Keith Hamilton, Rong Xiao, Thomas B. Acton, Asli Ertekin, John K. Everett: Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, 679 Hoes Lane, Piscataway, NJ 08854 USA; Northeast Structural Genomics Consortium, Piscataway, NJ USA
Clemens Anklin: Bruker Biospin Corporation, 15 Fortune Drive, Billerica, MA 01821 USA
Gaetano T. Montelione: Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, 679 Hoes Lane, Piscataway, NJ 08854 USA; Northeast Structural Genomics Consortium, Piscataway, NJ USA; Department of Biochemistry, Robert Wood Johnson Medical School, UMDNJ, Piscataway, NJ 08854 USA

Longhi S, Lieutaud P, Canard B. Conformational disorder. Methods Mol Biol 2010;609:307-325. [PMID: 20221927 DOI: 10.1007/978-1-60327-241-4_18] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/28/2023]

Lee JH, Kim HJ, Kim HD, Lee BC, Chun JS, Park CS. Modulation of the conductance-voltage relationship of the BK(Ca) channel by shortening the cytosolic loop connecting two RCK domains. Biophys J 2009;97:730-7. [PMID: 19651031 DOI: 10.1016/j.bpj.2009.04.058] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/20/2008] [Revised: 04/20/2009] [Accepted: 04/24/2009] [Indexed: 12/25/2022] [Open Access]

Gerard FCA, Ribeiro EDA, Leyrat C, Ivanov I, Blondel D, Longhi S, Ruigrok RWH, Jamin M. Modular organization of rabies virus phosphoprotein. J Mol Biol 2009;388:978-96. [PMID: 19341745 DOI: 10.1016/j.jmb.2009.03.061] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.9] [Received: 11/12/2008] [Revised: 03/23/2009] [Accepted: 03/25/2009] [Indexed: 10/20/2022]

A LIM-9 (FHL)/SCPL-1 (SCP) complex interacts with the C-terminal protein kinase regions of UNC-89 (obscurin) in Caenorhabditis elegans muscle. J Mol Biol 2009;386:976-88. [PMID: 19244614 DOI: 10.1016/j.jmb.2009.01.016] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/22/2022]

Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B. Improved disorder prediction by combination of orthogonal approaches. PLoS One 2009;4:e4433. [PMID: 19209228 PMCID: PMC2635965 DOI: 10.1371/journal.pone.0004433] [Citation(s) in RCA: 161] [Impact Index Per Article: 10.7] [Received: 08/15/2008] [Accepted: 12/15/2008] [Indexed: 12/15/2022] [Open Access]

Han P, Zhang X, Feng ZP. Predicting disordered regions in proteins using the profiles of amino acid indices. BMC Bioinformatics 2009;10 Suppl 1:S42. [PMID: 19208144 PMCID: PMC2648739 DOI: 10.1186/1471-2105-10-s1-s42] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] [Open Access]

Han P, Zhang X, Norton RS, Feng ZP. Large-scale prediction of long disordered regions in proteins using random forests. BMC Bioinformatics 2009;10:8. [PMID: 19128505 PMCID: PMC2637845 DOI: 10.1186/1471-2105-10-8] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 08/30/2008] [Accepted: 01/07/2009] [Indexed: 12/02/2022] [Open Access]

Abstract

Background

Many proteins contain disordered regions that lack fixed three-dimensional (3D) structure under physiological conditions but have important biological functions. Prediction of disordered regions in protein sequences is important for understanding protein function and in high-throughput determination of protein structures. Machine learning techniques, including neural networks and support vector machines have been widely used in such predictions. Predictors designed for long disordered regions are usually less successful in predicting short disordered regions. Combining prediction of short and long disordered regions will dramatically increase the complexity of the prediction algorithm and make the predictor unsuitable for large-scale applications. Efficient batch prediction of long disordered regions alone is of greater interest in large-scale proteome studies.

Results

A new algorithm, IUPforest-L, for predicting long disordered regions using the random forest learning model is proposed in this paper. IUPforest-L is based on the Moreau-Broto auto-correlation function of amino acid indices (AAIs) and other physicochemical features of the primary sequences. In 10-fold cross validation tests, IUPforest-L can achieve an area of 89.5% under the receiver operating characteristic (ROC) curve. Compared with existing disorder predictors, IUPforest-L has high prediction accuracy and is efficient for predicting long disordered regions in large-scale proteomes.
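
For readers unfamiliar with the Moreau-Broto auto-correlation mentioned above, a minimal sketch follows. It computes, for each lag d, the average product of an amino acid index value at positions i and i + d. The Kyte-Doolittle hydropathy scale is used only as an example index, and the normalisation by (N - d) is one common convention; neither is confirmed by the abstract as what IUPforest-L uses, and the function name is hypothetical.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy values, used here purely as an example index.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def moreau_broto(seq: str, index: Dict[str, float], max_lag: int) -> List[float]:
    """Normalised Moreau-Broto auto-correlation for lags 1..max_lag.

    AC(d) = (1 / (N - d)) * sum_{i=0}^{N-d-1} P(i) * P(i + d),
    where P(i) is the index value of the residue at position i.
    """
    values = [index[aa] for aa in seq]  # KeyError on non-standard residues
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        total = sum(values[i] * values[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    # Hypothetical fragment; each lag yields one feature for a classifier.
    print(moreau_broto("MKTAYIAKQRQISFVKSHFSRQ", KYTE_DOOLITTLE, max_lag=5))
```

Feature vectors of this kind, computed over sequence windows for several indices, are the sort of input a random forest classifier could be trained on.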

Conclusion

The random forest model based on the auto-correlation functions of the AAIs within a protein fragment and other physicochemical features could effectively detect long disordered regions in proteins. A new predictor, IUPforest-L, was developed to batch predict long disordered regions in proteins, and the server can be accessed from

Schlessinger A, Liu J, Rost B. Natively unstructured loops differ from other loops. PLoS Comput Biol 2008;3:e140. [PMID: 17658943 PMCID: PMC1924875 DOI: 10.1371/journal.pcbi.0030140] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.8] [Received: 09/27/2006] [Accepted: 06/05/2007] [Indexed: 11/24/2022] [Open Access]

Abstract

Natively unstructured or disordered protein regions may increase the functional complexity of an organism; they are particularly abundant in eukaryotes and often evade structure determination. Many computational methods predict unstructured regions by training on outliers in otherwise well-ordered structures. Here, we introduce an approach that uses a neural network in a very different and novel way. We hypothesize that very long contiguous segments with nonregular secondary structure (NORS regions) differ significantly from regular, well-structured loops, and that a method detecting such features could predict natively unstructured regions. Training our new method, NORSnet, on predicted information rather than on experimental data yielded three major advantages: it removed the overlap between testing and training, it systematically covered entire proteomes, and it explicitly focused on one particular aspect of unstructured regions with a simple structural interpretation, namely that they are loops. Our hypothesis was correct: well-structured and unstructured loops differ so substantially that NORSnet succeeded in their distinction. Benchmarks on previously used and new experimental data of unstructured regions revealed that NORSnet performed very well. Although it was not the best single prediction method, NORSnet was sufficiently accurate to flag unstructured regions in proteins that were previously not annotated. In one application, NORSnet revealed previously undetected unstructured regions in putative targets for structural genomics and may thereby contribute to increasing structural coverage of large eukaryotic families. NORSnet found unstructured regions more often in domain boundaries than expected at random. In another application, we estimated that 50%–70% of all worm proteins observed to have more than seven protein–protein interaction partners have unstructured regions. 
The comparative analysis between NORSnet and DISOPRED2 suggested that long unstructured loops are a major part of unstructured regions in molecular networks.

The details of protein structures are important for function. Regions that do not adopt any regular structure in isolation (natively unstructured or disordered regions) initially appeared as a curious exception to this structure–function paradigm. It has become increasingly clear that unstructured regions are fundamental to many roles and that they are particularly important for multicellular organisms. Structural biology is just beginning to apprehend the stunning diversity of these roles. Here, we focused on unstructured regions dominated by a particular type of loop, namely the natively unstructured one. We developed a method that succeeded in the distinction between well-structured and natively unstructured loops. For the development, we did not use any experimental data for unstructured regions; when tested on experimental data, the method performed surprisingly well. Due to its different premises, the method captured very different aspects of unstructured regions than other methods that we tested. We applied the new method to two different problems. The first was the identification of proteins that may be difficult targets for structure determination. The second was the identification of worm proteins that have many interaction partners (more than seven) and unstructured regions. Surprisingly, we found unstructured regions of the loopy type in more than 50% of all the promiscuous worm proteins.

Ribeiro EA, Favier A, Gerard FCA, Leyrat C, Brutscher B, Blondel D, Ruigrok RWH, Blackledge M, Jamin M. Solution structure of the C-terminal nucleoprotein-RNA binding domain of the vesicular stomatitis virus phosphoprotein. J Mol Biol 2008;382:525-38. [PMID: 18657547 DOI: 10.1016/j.jmb.2008.07.028] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Received: 06/10/2008] [Accepted: 07/07/2008] [Indexed: 10/21/2022]

Abstract

Beyond common features in their genome organization and replication mechanisms, the evolutionary relationships among viruses of the Rhabdoviridae family are difficult to decipher because of the great variability in the amino acid sequence of their proteins. The phosphoprotein (P) of vesicular stomatitis virus (VSV) is an essential component of the RNA transcription and replication machinery; in particular, it contains binding sites for the RNA-dependent RNA polymerase and for the nucleoprotein. Here, we devised a new method for defining boundaries of structured domains from multiple disorder prediction algorithms, and we identified an autonomous folding C-terminal domain in VSV P (P(CTD)). We show that, like the C-terminal domain of rabies virus (RV) P, VSV P(CTD) binds to the viral nucleocapsid (nucleoprotein-RNA complex). We solved the three-dimensional structure of VSV P(CTD) by NMR spectroscopy and found that the topology of its polypeptide chain resembles that of RV P(CTD). The common part of both proteins could be superimposed with a backbone RMSD from mean atomic coordinates of 2.6 Å. VSV P(CTD) has a shorter N-terminal helix (α1) than RV P(CTD); it lacks two α-helices (helices α3 and α6 of RV P), and the loop between strands β1 and β2 is longer than that in RV. Dynamical properties measured by NMR relaxation revealed the presence of fast motions (below the nanosecond timescale) in loop regions (amino acids 209-214) and slower conformational exchange in the N- and C-terminal helices. Characterization of a longer construct indicated that P(CTD) is preceded by a flexible linker. The results presented here support a modular organization of VSV P, with independent folded domains separated by flexible linkers, which is conserved among different genera of Rhabdoviridae and is similar to that proposed for the P proteins of the Paramyxoviridae.

Cortese MS, Uversky VN, Dunker AK. Intrinsic disorder in scaffold proteins: getting more from less. Prog Biophys Mol Biol 2008;98:85-106. [PMID: 18619997 DOI: 10.1016/j.pbiomolbio.2008.05.007] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.0] [Indexed: 01/23/2023]

Liu J, Zhang Y, Lei X, Zhang Z. Natural selection of protein structural and functional properties: a single nucleotide polymorphism perspective. Genome Biol 2008;9:R69. [PMID: 18397526 PMCID: PMC2643940 DOI: 10.1186/gb-2008-9-4-r69] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Received: 03/20/2008] [Revised: 03/25/2008] [Accepted: 04/08/2008] [Indexed: 09/03/2023] [Open Access]

Abstract

A large-scale survey using single nucleotide polymorphism data from dbSNP provides insights into the evolutionary selection constraints on human proteins of different structural and functional categories.

Background

The rates of molecular evolution for protein-coding genes depend on the stringency of functional or structural constraints. The Ka/Ks ratio has been commonly used as an indicator of selective constraints and is typically calculated from interspecies alignments. Recent accumulation of single nucleotide polymorphism (SNP) data has enabled the derivation of Ka/Ks ratios for polymorphism (SNP A/S ratios).

Results

Using data from the dbSNP database, we conducted the first large-scale survey of SNP A/S ratios for different structural and functional properties. We confirmed that the SNP A/S ratio is largely correlated with Ka/Ks for divergence. We observed stronger selective constraints for proteins that have high mRNA expression levels or broad expression patterns, have no paralogs, arose earlier in evolution, have natively disordered regions, are located in cytoplasm and nucleus, or are related to human diseases. On the residue level, we found higher degrees of variation for residues that are exposed to solvent, are in a loop conformation, natively disordered regions or low complexity regions, or are in the signal peptides of secreted proteins. Our analysis also revealed that histones and protein kinases are among the protein families that are under the strongest selective constraints, whereas olfactory and taste receptors are among the most variable groups.

Conclusion

Our study suggests that the SNP A/S ratio is a robust measure for selective constraints. The correlations between SNP A/S ratios and other variables provide valuable insights into the natural selection of various structural or functional properties, particularly for human-specific genes and constraints within the human lineage.

Higurashi M, Ishida T, Kinoshita K. Identification of transient hub proteins and the possible structural basis for their multiple interactions. Protein Sci 2008;17:72-8. [PMID: 18156468 DOI: 10.1110/ps.073196308] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Bannen RM, Bingman CA, Phillips GN. Effect of low-complexity regions on protein structure determination. J Struct Funct Genomics 2008;8:217-26. [PMID: 18302007 DOI: 10.1007/s10969-008-9039-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2007] [Accepted: 02/05/2008] [Indexed: 11/24/2022]

Dosztányi Z, Tompa P. Prediction of protein disorder. Methods Mol Biol 2008;426:103-115. [PMID: 18542859 DOI: 10.1007/978-1-60327-058-8_6] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]

Iyer LM, Anantharaman V, Wolf MY, Aravind L. Comparative genomics of transcription factors and chromatin proteins in parasitic protists and other eukaryotes. Int J Parasitol 2007;38:1-31. [PMID: 17949725 DOI: 10.1016/j.ijpara.2007.07.018] [Citation(s) in RCA: 192] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2007] [Revised: 07/26/2007] [Accepted: 07/30/2007] [Indexed: 11/18/2022]

Abstract

Comparative genomics of parasitic protists and their free-living relatives are profoundly impacting our understanding of the regulatory systems involved in transcription and chromatin dynamics. While some parts of these systems are highly conserved, other parts are rapidly evolving, thereby providing the molecular basis for the variety in the regulatory adaptations of eukaryotes. The gross number of specific transcription factors and chromatin proteins are positively correlated with proteome size in eukaryotes. However, the individual types of specific transcription factors show an enormous variety across different eukaryotic lineages. The dominant families of specific transcription factors even differ between sister lineages, and have been shaped by gene loss and lineage-specific expansions. Recognition of this principle has helped in identifying the hitherto unknown, major specific transcription factors of several parasites, such as apicomplexans, Entamoeba histolytica, Trichomonas vaginalis, Phytophthora and ciliates. Comparative analysis of predicted chromatin proteins from protists allows reconstruction of the early evolutionary history of histone and DNA modification, nucleosome assembly and chromatin-remodeling systems. Many key catalytic, peptide-binding and DNA-binding domains in these systems ultimately had bacterial precursors, but were put together into distinctive regulatory complexes that are unique to the eukaryotes. In the case of histone methylases, histone demethylases and SWI2/SNF2 ATPases, proliferation of paralogous families followed by acquisition of novel domain architectures, seem to have played a major role in producing a diverse set of enzymes that create and respond to an epigenetic code of modified histones. 
The diversification of histone acetylases and DNA methylases appears to have proceeded via repeated emergence of new versions, most probably via transfers from bacteria to different eukaryotic lineages, again resulting in lineage-specific diversity in epigenetic signals. Even though the key histone modifications are universal to eukaryotes, domain architectures of proteins binding post-translationally modified-histones vary considerably across eukaryotes. This indicates that the histone code might be "interpreted" differently from model organisms in parasitic protists and their relatives. The complexity of domain architectures of chromatin proteins appears to have increased during eukaryotic evolution. Thus, Trichomonas, Giardia, Naegleria and kinetoplastids have relatively simple domain architectures, whereas apicomplexans and oomycetes have more complex architectures. RNA-dependent post-transcriptional silencing systems, which interact with chromatin-level regulatory systems, show considerable variability across parasitic protists, with complete loss in many apicomplexans and partial loss in Trichomonas vaginalis. This evolutionary synthesis offers a robust scaffold for future investigation of transcription and chromatin structure in parasitic protists.

Schlessinger A, Punta M, Rost B. Natively unstructured regions in proteins identified from contact predictions. Bioinformatics 2007;23:2376-84. [PMID: 17709338 DOI: 10.1093/bioinformatics/btm349] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

John SP, Wang T, Steffen S, Longhi S, Schmaljohn CS, Jonsson CB. Ebola virus VP30 is an RNA binding protein. J Virol 2007;81:8967-76. [PMID: 17567691 PMCID: PMC1951390 DOI: 10.1128/jvi.02523-06] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023] Open

Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, Uversky VN, Dunker AK. Intrinsic disorder and functional proteomics. Biophys J 2007;92:1439-56. [PMID: 17158572 PMCID: PMC1796814 DOI: 10.1529/biophysj.106.094045] [Citation(s) in RCA: 549] [Impact Index Per Article: 32.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2006] [Accepted: 11/15/2006] [Indexed: 11/18/2022] Open

Ferrara TM, Flaherty DB, Benian GM. Titin/connectin-related proteins in C. elegans: a review and new findings. J Muscle Res Cell Motil 2007;26:435-47. [PMID: 16453163 DOI: 10.1007/s10974-005-9027-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Sitbon E, Pietrokovski S. Occurrence of protein structure elements in conserved sequence regions. BMC Struct Biol 2007;7:3. [PMID: 17210087 PMCID: PMC1781454 DOI: 10.1186/1472-6807-7-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/04/2006] [Accepted: 01/09/2007] [Indexed: 11/19/2022]

Uversky VN, Radivojac P, Iakoucheva LM, Obradovic Z, Dunker AK. Prediction of intrinsic disorder and its use in functional proteomics. Methods Mol Biol 2007;408:69-92. [PMID: 18314578 DOI: 10.1007/978-1-59745-547-3_5] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]

Singh GP, Ganapathi M, Dash D. Role of intrinsic disorder in transient interactions of hub proteins. Proteins 2006;66:761-5. [PMID: 17154416 DOI: 10.1002/prot.21281] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Han P, Zhang X, Norton RS, Feng ZP. Predicting Disordered Regions in Proteins Based on Decision Trees of Reduced Amino Acid Composition. J Comput Biol 2006;13:1723-34. [PMID: 17238841 DOI: 10.1089/cmb.2006.13.1723] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Feng ZP, Zhang X, Han P, Arora N, Anders RF, Norton RS. Abundance of intrinsically unstructured proteins in P. falciparum and other apicomplexan parasite proteomes. Mol Biochem Parasitol 2006;150:256-67. [PMID: 17010454 DOI: 10.1016/j.molbiopara.2006.08.011] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2006] [Revised: 08/28/2006] [Accepted: 08/28/2006] [Indexed: 11/21/2022]

Han P, Zhang X, Norton RS, Feng ZP. Predicting Disordered Regions in Proteins Based on Decision Trees of Reduced Amino Acid Composition. J Comput Biol 2006. [DOI: 10.1089/cmb.2006.13.1579] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Sobolevsky Y, Trifonov EN. Protein Modules Conserved Since LUCA. J Mol Evol 2006;63:622-34. [PMID: 17075700 DOI: 10.1007/s00239-005-0190-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2005] [Accepted: 12/02/2005] [Indexed: 11/28/2022]

Coronado JE, Attie O, Epstein SL, Qiu WG, Lipke PN. Composition-modified matrices improve identification of homologs of Saccharomyces cerevisiae low-complexity glycoproteins. Eukaryot Cell 2006;5:628-37. [PMID: 16607010 PMCID: PMC1459670 DOI: 10.1128/ec.5.4.628-637.2006] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Ferron F, Longhi S, Canard B, Karlin D. A practical overview of protein disorder prediction methods. Proteins 2006;65:1-14. [PMID: 16856179 DOI: 10.1002/prot.21075] [Citation(s) in RCA: 205] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Vullo A, Bortolami O, Pollastri G, Tosatto SCE. Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res 2006;34:W164-8. [PMID: 16844983 PMCID: PMC1538873 DOI: 10.1093/nar/gkl166] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open

Romov PA, Li F, Lipke PN, Epstein SL, Qiu WG. Comparative genomics reveals long, evolutionarily conserved, low-complexity islands in yeast proteins. J Mol Evol 2006;63:415-25. [PMID: 16927006 DOI: 10.1007/s00239-005-0291-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2005] [Accepted: 04/27/2006] [Indexed: 01/12/2023]

Schlessinger A, Rost B. Protein flexibility and rigidity predicted from sequence. Proteins 2006;61:115-26. [PMID: 16080156 DOI: 10.1002/prot.20587] [Citation(s) in RCA: 123] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Sharma S, Ang SL, Shaw M, Mackey DA, Gécz J, McAvoy JW, Craig JE. Nance-Horan syndrome protein, NHS, associates with epithelial cell junctions. Hum Mol Genet 2006;15:1972-83. [PMID: 16675532 DOI: 10.1093/hmg/ddl120] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Nardini M, Svergun D, Konarev PV, Spanò S, Fasano M, Bracco C, Pesce A, Donadini A, Cericola C, Secundo F, Luini A, Corda D, Bolognesi M. The C-terminal domain of the transcriptional corepressor CtBP is intrinsically unstructured. Protein Sci 2006;15:1042-50. [PMID: 16597837 PMCID: PMC2242513 DOI: 10.1110/ps.062115406] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Singh GP, Ganapathi M, Sandhu KS, Dash D. Intrinsic unstructuredness and abundance of PEST motifs in eukaryotic proteomes. Proteins 2006;62:309-15. [PMID: 16299712 DOI: 10.1002/prot.20746] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Bhalla J, Storchan GB, MacCarthy CM, Uversky VN, Tcherkasskaya O. Local flexibility in molecular function paradigm. Mol Cell Proteomics 2006;5:1212-23. [PMID: 16571897 DOI: 10.1074/mcp.m500315-mcp200] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Turlure F, Maertens G, Rahman S, Cherepanov P, Engelman A. A tripartite DNA-binding element, comprised of the nuclear localization signal and two AT-hook motifs, mediates the association of LEDGF/p75 with chromatin in vivo. Nucleic Acids Res 2006;34:1653-65. [PMID: 16549878 PMCID: PMC1405818 DOI: 10.1093/nar/gkl052] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open