1
Majila K, Viswanath S. StrIDR: a database of intrinsically disordered regions of proteins with experimentally resolved structures. bioRxiv 2024:2024.08.22.609111. [PMID: 39253485 PMCID: PMC11382991 DOI: 10.1101/2024.08.22.609111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Motivation: Intrinsically disordered regions (IDRs) of proteins exist as an ensemble of conformations, and not as a single structure. Existing databases contain extensive, experimentally derived annotations of intrinsic disorder for millions of proteins at the sequence level. However, only a tiny fraction of these IDRs are associated with an experimentally determined protein structure. Moreover, even if a structure exists, parts of the disordered regions may still be unresolved. Results: Here we organize Structures of Intrinsically Disordered Regions (StrIDR), a database of IDRs confirmed via experimental or homology-based evidence, resolved in experimentally determined structures. The database can provide useful insights into the dynamics, folding, and interactions of IDRs. It can also facilitate computational studies on IDRs, such as those using molecular dynamics simulations and/or machine learning. Availability: StrIDR is available at https://isblab.ncbs.res.in/stridr. The web UI allows for downloading PDB structures and SIFTS mappings of individual entries. Additionally, the entire database can be downloaded in a JSON format. The source code for creating and updating the database is available at https://github.com/isblab/stridr.
Affiliation(s)
- Kartik Majila
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065
- Shruthi Viswanath
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India 560065

2
Lebedenko OO, Sekhar A, Skrynnikov NR. Order/Disorder Transitions Upon Protein Binding: A Unifying Perspective. Proteins 2024. [PMID: 39158131 DOI: 10.1002/prot.26737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 07/11/2024] [Accepted: 07/30/2024] [Indexed: 08/20/2024]
Abstract
When two proteins bind to each other, this process is often accompanied by a change in their structural states (from disordered to ordered or vice versa). As it turns out, there are 10 distinct possibilities for such binding-related order/disorder transitions. Out of this number, seven scenarios have been experimentally observed, while another three remain hitherto unreported. As an example, we discuss the so-called mutual synergistic folding, whereby two disordered proteins come together to form a fully structured complex. Our bioinformatics analysis of the Protein Data Bank found potential new examples of this remarkable binding mechanism.
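The count of 10 can be reproduced with a short enumeration. This is only a sketch under an assumption that the abstract does not spell out: each partner is labeled ordered ("O") or disordered ("D") both before and after binding, and the two partners are interchangeable. Each protein then has 4 possible histories, and the number of unordered pairs of histories is 4 x 5 / 2 = 10. The labels and the function name are illustrative.

```python
from itertools import combinations_with_replacement, product

# Assumed state labels: "O" = ordered, "D" = disordered.
STATES = ("O", "D")


def binding_scenarios():
    """Enumerate unordered pairs of (before -> after) histories for two partners."""
    # Four per-protein histories: O->O, O->D, D->O, D->D.
    histories = [f"{before}->{after}" for before, after in product(STATES, repeat=2)]
    # Partners are treated as interchangeable, so (a, b) and (b, a) count once.
    return list(combinations_with_replacement(histories, 2))


scenarios = binding_scenarios()
print(len(scenarios))
for pair in scenarios:
    print(pair)
```

In this labeling, mutual synergistic folding corresponds to the pair ("D->O", "D->O"): both partners start disordered and end ordered.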
Affiliation(s)
- Olga O Lebedenko
- Laboratory of Biomolecular NMR, St. Petersburg State University, St. Petersburg, Russia
- Ashok Sekhar
- Molecular Biophysics Unit, Indian Institute of Science Bangalore, Bengaluru, India
- Nikolai R Skrynnikov
- Laboratory of Biomolecular NMR, St. Petersburg State University, St. Petersburg, Russia
- Department of Chemistry, Purdue University, West Lafayette, Indiana, USA

3
Erdős G, Dosztányi Z. AIUPred: combining energy estimation with deep learning for the enhanced prediction of protein disorder. Nucleic Acids Res 2024; 52:W176-W181. [PMID: 38747347 PMCID: PMC11223784 DOI: 10.1093/nar/gkae385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/19/2024] [Accepted: 05/07/2024] [Indexed: 07/06/2024] Open
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) carry out important biological functions without relying on a single well-defined conformation. As these proteins are a challenge to study experimentally, computational methods play important roles in their characterization. One of the commonly used tools is the IUPred web server, which provides prediction of disordered regions and their binding sites. IUPred is rooted in a simple biophysical model and uses a limited number of parameters largely derived from globular protein structures only. This enabled an incredibly fast and robust prediction method; however, its limitations have also become apparent in light of recent breakthrough methods using deep learning techniques. Here, we present AIUPred, a novel version of IUPred which incorporates deep learning techniques into the energy estimation framework. It achieves improved performance while keeping the robustness of the original method. Based on the evaluation of recent benchmark datasets, AIUPred scored amongst the top three single sequence based methods. With a new web server we offer fast and reliable visual analysis for users as well as options to analyze whole genomes in mere seconds with the downloadable package. AIUPred is available at https://aiupred.elte.hu.
Affiliation(s)
- Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary

4
Jahn LR, Marquet C, Heinzinger M, Rost B. Protein embeddings predict binding residues in disordered regions. Sci Rep 2024; 14:13566. [PMID: 38866950 PMCID: PMC11169622 DOI: 10.1038/s41598-024-64211-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 06/06/2024] [Indexed: 06/14/2024] Open
Abstract
The identification of protein binding residues helps to understand their biological processes as protein function is often defined through ligand binding, such as to other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we presented a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs. The proposed model, IDBindT5, leveraged embeddings from the protein language model (pLM) ProtT5 to reach a balanced accuracy of 57.2 ± 3.6% (95% confidence interval). Assessed on the same data set, this did not differ at the 95% CI from the state-of-the-art (SOTA) methods ANCHOR2 and DeepDISOBind that rely on expert-crafted features and evolutionary information from multiple sequence alignments (MSAs). Assessed on other data, methods such as SPOT-MoRF reached higher MCCs. IDBindT5's SOTA predictions are much faster than other methods, easily enabling full-proteome analyses. Our findings emphasize the potential of pLMs as a promising approach for exploring and predicting features of disordered proteins. The model and a comprehensive manual are publicly available at https://github.com/jahnl/binding_in_disorder.
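For readers unfamiliar with the two metrics named above, the following minimal sketch computes balanced accuracy and the Matthews correlation coefficient (MCC) from per-residue confusion counts using their standard definitions. The counts are hypothetical values chosen for illustration, not data from the study, and the function names are our own.

```python
import math


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (binding residues) and specificity (non-binding residues)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical per-residue counts, for illustration only.
counts = {"tp": 30, "fp": 40, "tn": 160, "fn": 20}
ba = balanced_accuracy(**counts)
m = mcc(**counts)
print(f"balanced accuracy = {ba:.3f}, MCC = {m:.3f}")
```

Balanced accuracy weights both classes equally, which matters here because binding residues are typically a minority of all residues in a disordered region.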
Affiliation(s)
- Laura R Jahn
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Céline Marquet
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Michael Heinzinger
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Burkhard Rost
- School of Computation, Information, and Technology (CIT), Department of Informatics, Bioinformatics and Computational Biology, TUM (Technical University of Munich), 85748, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany

5
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning, we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences. Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
- Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
- Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India

6
Chakraborty A, Hussain A, Sabnam N. Uncovering the structural stability of Magnaporthe oryzae effectors: a secretome-wide in silico analysis. J Biomol Struct Dyn 2023:1-22. [PMID: 38109060 DOI: 10.1080/07391102.2023.2292795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 11/23/2023] [Indexed: 12/19/2023]
Abstract
Rice blast, caused by the ascomycete fungus Magnaporthe oryzae, is a deadly disease and a major threat to global food security. The pathogen secretes small proteinaceous effectors, virulence factors, inside the host to manipulate and perturb the host immune system, allowing the pathogen to colonize and establish a successful infection. While the molecular functions of several effectors are characterized, very little is known about the structural stability of these effectors. We analyzed a total of 554 small secretory proteins (SSPs) from the M. oryzae secretome to decipher key features of intrinsic disorder (ID) and the structural dynamics of the selected putative effectors through thorough and systematic in silico studies. Our results suggest that out of the total SSPs, 66% were predicted as effector proteins, released either into the apoplast or cytoplasm of the host cell. Of these, 68% were found to be intrinsically disordered effector proteins (IDEPs). Among the six distinct classes of disordered effectors, we observed peculiar relationships between the localization of several effectors in the apoplast or cytoplasm and the degree of disorder. We determined the degree of structural disorder and its impact on protein foldability across all the putative small secretory effector proteins from the blast pathogen, further validated by molecular dynamics simulation studies. This study provides definite clues toward unraveling the mystery behind the importance of structural distortions in effectors and their impact on plant-pathogen interactions. The study of these dynamical segments may help identify new effectors as well. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Afzal Hussain
- Department of Bioinformatics, Maulana Azad National Institute of Technology, Bhopal, India
- Nazmiara Sabnam
- Department of Life Sciences, Presidency University, Kolkata, India

7
Kurgan L, Hu G, Wang K, Ghadermarzi S, Zhao B, Malhis N, Erdős G, Gsponer J, Uversky VN, Dosztányi Z. Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins. Nat Protoc 2023; 18:3157-3172. [PMID: 37740110 DOI: 10.1038/s41596-023-00876-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/13/2022] [Accepted: 06/21/2023] [Indexed: 09/24/2023]
Abstract
Intrinsic disorder is instrumental for a wide range of protein functions, and its analysis, using computational predictions from primary structures, complements secondary and tertiary structure-based approaches. In this Tutorial, we provide an overview and comparison of 23 publicly available computational tools with complementary parameters useful for intrinsic disorder prediction, partly relying on results from the Critical Assessment of protein Intrinsic Disorder prediction experiment. We consider factors such as accuracy, runtime, availability and the need for functional insights. The selected tools are available as web servers and downloadable programs, offer state-of-the-art predictions and can be used in a high-throughput manner. We provide examples and instructions for the selected tools to illustrate practical aspects related to the submission, collection and interpretation of predictions, as well as the timing and their limitations. We highlight two predictors for intrinsically disordered proteins, flDPnn as accurate and fast and IUPred as very fast and moderately accurate, while suggesting ANCHOR2 and MoRFchibi as two of the best-performing predictors for intrinsically disordered region binding. We link these tools to additional resources, including databases of predictions and web servers that integrate multiple predictive methods. Altogether, this Tutorial provides a hands-on guide to comparatively evaluating multiple predictors, submitting and collecting their own predictions, and reading and interpreting results. It is suitable for experimentalists and computational biologists interested in accurately and conveniently identifying intrinsic disorder, facilitating the functional characterization of the rapidly growing collections of protein sequences.
Affiliation(s)
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Kui Wang
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Nawar Malhis
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Gábor Erdős
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Byrd Alzheimer's Center and Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Zsuzsanna Dosztányi
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary

8
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
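The idea of screening IDRs by prediction confidence can be sketched as follows. This is not the authors' pipeline: it assumes the per-residue confidence is a pLDDT-style score on a 0 to 100 scale, that a cutoff of 70 marks a "confident" residue, and that an IDR is flagged when at least half of its residues pass the cutoff. The cutoff, the fraction, the function names, and the toy scores are all illustrative assumptions.

```python
def confident_fraction(plddt, start, end, cutoff=70.0):
    """Fraction of residues in the 1-based inclusive region [start, end] scoring >= cutoff."""
    region = plddt[start - 1:end]
    if not region:
        raise ValueError("region is empty or outside the sequence")
    return sum(score >= cutoff for score in region) / len(region)


def flag_candidate_idrs(plddt, idrs, cutoff=70.0, min_fraction=0.5):
    """Return the IDRs whose confident fraction reaches min_fraction (assumed criterion)."""
    return [
        (start, end)
        for start, end in idrs
        if confident_fraction(plddt, start, end, cutoff) >= min_fraction
    ]


# Toy per-residue scores and two annotated IDRs (1-based, inclusive).
plddt = [35, 40, 82, 88, 91, 86, 45, 30, 28, 33]
idrs = [(1, 6), (7, 10)]
flagged = flag_candidate_idrs(plddt, idrs)
print(flagged)
```

As the abstract itself cautions, a confidently predicted structure in an IDR is at most a candidate for one conditionally folded state and says nothing about the ensemble of the free form.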
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada

9
Németh-Szatmári O, Nagy-Mikó B, Györkei Á, Varga D, Kovács BBH, Igaz N, Bognár B, Rázga Z, Nagy G, Zsindely N, Bodai L, Papp B, Erdélyi M, Kiricsi M, Blastyák A, Collart MA, Boros IM, Villányi Z. Phase-separated ribosome-nascent chain complexes in genotoxic stress response. RNA 2023; 29:1557-1574. [PMID: 37460154 PMCID: PMC10578487 DOI: 10.1261/rna.079755.123] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/21/2023] [Accepted: 06/26/2023] [Indexed: 09/20/2023]
Abstract
Assemblysomes are EDTA- and RNase-resistant ribonucleoprotein (RNP) complexes of paused ribosomes with protruding nascent polypeptide chains. They have been described in yeast and human cells for the proteasome subunit Rpt1, and the disordered amino-terminal part of the nascent chain was found to be indispensable for the accumulation of the Rpt1-RNP into assemblysomes. Motivated by this, to find other assemblysome-associated RNPs we used bioinformatics to rank subunits of Saccharomyces cerevisiae protein complexes according to their amino-terminal disorder propensity. The results revealed that gene products involved in DNA repair are enriched among the top candidates. The Sgs1 DNA helicase was chosen for experimental validation. We found that indeed nascent chains of Sgs1 form EDTA-resistant RNP condensates, assemblysomes by definition. Moreover, upon exposure to UV, SGS1 mRNA shifted from assemblysomes to polysomes, suggesting that external stimuli are regulators of assemblysome dynamics. We extended our studies to human cell lines. The BLM helicase, ortholog of yeast Sgs1, was identified upon sequencing assemblysome-associated RNAs from the MCF7 human breast cancer cell line, and mRNAs encoding DNA repair proteins were overall enriched. Using the radiation-resistant A549 cell line, we observed by transmission electron microscopy that 1,6-hexanediol, an agent known to disrupt phase-separated condensates, depletes ring ribosome structures compatible with assemblysomes from the cytoplasm of cells and makes the cells more sensitive to X-ray treatment. Taken together, these findings suggest that assemblysomes may be a component of the DNA damage response from yeast to human.
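The ranking step described above can be illustrated with a minimal sketch. It assumes that each subunit already has per-residue disorder scores from some predictor and that "amino-terminal disorder propensity" is summarized as the mean score over the first `window` residues. The window size, the function names, and the protein names are hypothetical; the abstract does not state how the authors computed the ranking.

```python
def n_terminal_disorder(scores, window=50):
    """Mean disorder score over the first `window` residues (assumed summary statistic)."""
    head = scores[:window]
    if not head:
        raise ValueError("no scores supplied")
    return sum(head) / len(head)


def rank_by_n_terminal_disorder(proteins, window=50):
    """Sort protein names from most to least disordered amino terminus."""
    return sorted(
        proteins,
        key=lambda name: n_terminal_disorder(proteins[name], window),
        reverse=True,
    )


# Toy per-residue disorder scores for three hypothetical subunits.
proteins = {
    "protA": [0.9, 0.8, 0.7, 0.1],
    "protB": [0.2, 0.3, 0.1, 0.9],
    "protC": [0.5, 0.6, 0.4, 0.5],
}
ranking = rank_by_n_terminal_disorder(proteins, window=3)
print(ranking)
```

A short window is used here only because the toy sequences are four residues long; a realistic analysis would pick the window from the length of the nascent segment of interest.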
Affiliation(s)
- Orsolya Németh-Szatmári
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Bence Nagy-Mikó
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Ádám Györkei
- Institute of Biochemistry, Biological Research Centre, 6726 Szeged, Hungary
- Section for Physiology and Cell Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Dániel Varga
- Department of Optics and Quantum Electronics, University of Szeged, 6720 Szeged, Hungary
- Bálint Barna H Kovács
- Department of Optics and Quantum Electronics, University of Szeged, 6720 Szeged, Hungary
- Nóra Igaz
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Bence Bognár
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Zsolt Rázga
- Department of Pathology, Faculty of Medicine, University of Szeged, 6720 Szeged, Hungary
- Gábor Nagy
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Nóra Zsindely
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- László Bodai
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Balázs Papp
- Institute of Biochemistry, Biological Research Centre, 6726 Szeged, Hungary
- Miklós Erdélyi
- Department of Optics and Quantum Electronics, University of Szeged, 6720 Szeged, Hungary
- Mónika Kiricsi
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- András Blastyák
- Institute of Genetics, Biological Research Centre, 6726 Szeged, Hungary
- Martine A Collart
- Department of Microbiology and Molecular Medicine, Institute of Genetics and Genomics Geneva, Faculty of Medicine, University of Geneva, 1211 Geneva 4, Switzerland
- Imre M Boros
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary
- Zoltán Villányi
- Department of Biochemistry and Molecular Biology, University of Szeged, 6726 Szeged, Hungary

10
Abstract
There are over 100 computational predictors of intrinsic disorder. These methods predict amino acid-level propensities for disorder directly from protein sequences. The propensities can be used to annotate putative disordered residues and regions. This unit provides a practical and holistic introduction to sequence-based intrinsic disorder prediction. We define intrinsic disorder, explain the format of computational prediction of disorder, and identify and describe several accurate predictors. We also introduce recently released databases of intrinsic disorder predictions and use an illustrative example to provide insights into how predictions should be interpreted and combined. Lastly, we summarize key experimental methods that can be used to validate computational predictions. © 2023 Wiley Periodicals LLC.
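The step from per-residue propensities to annotated regions can be sketched as a simple thresholding pass. The threshold of 0.5 and the optional minimum region length are assumptions made for illustration; actual predictors define their own cutoffs and post-processing, and the function name is our own.

```python
def disordered_regions(propensities, threshold=0.5, min_length=1):
    """Return 1-based inclusive (start, end) runs where propensity >= threshold."""
    regions = []
    start = None
    for i, p in enumerate(propensities, start=1):
        if p >= threshold:
            if start is None:
                start = i  # open a new run
        elif start is not None:
            # Close the run that ended at residue i - 1.
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(propensities) - start + 1 >= min_length:
        regions.append((start, len(propensities)))
    return regions


# Toy propensities for a nine-residue sequence.
props = [0.9, 0.8, 0.2, 0.1, 0.6, 0.7, 0.75, 0.3, 0.55]
all_runs = disordered_regions(props)
long_runs = disordered_regions(props, min_length=2)
print(all_runs)
print(long_runs)
```

Raising `min_length` drops isolated high-scoring residues, which is one common way to avoid reporting single-residue "regions".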
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia

11
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023] Open
Abstract
Intrinsically Disordered Proteins (IDPs) and Regions (IDRs) exist widely. Although without well-defined structures, they participate in many important biological processes. In addition, they are also widely related to human diseases and have become potential targets in drug discovery. However, there is a big gap between the experimental annotations related to IDPs/IDRs and their actual number. In recent decades, the computational methods related to IDPs/IDRs have been developed vigorously, including predicting IDPs/IDRs, the binding modes of IDPs/IDRs, the binding sites of IDPs/IDRs, and the molecular functions of IDPs/IDRs according to different tasks. In view of the correlation between these predictors, we have reviewed these prediction methods uniformly for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Affiliation(s)
- Bingqing Han
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Chongjiao Ren
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Wenda Wang
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Jiashan Li
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Xinqi Gong
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Beijing Academy of Intelligence, Beijing 100083, China

12
Peng Z, Li Z, Meng Q, Zhao B, Kurgan L. CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information. Brief Bioinform 2023; 24:6858950. [PMID: 36458437 DOI: 10.1093/bib/bbac502] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/14/2022] [Revised: 09/30/2022] [Accepted: 10/24/2022] [Indexed: 12/04/2022] Open
Abstract
One of the key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acid interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. The vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes a robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures an area under the receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.qd.sdu.edu.cn/download/CLIP/.
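The two ingredients named above, a logistic regression that combines several per-residue inputs and evaluation by AUC, can be sketched in a few lines. This is not CLIP's model: the weights, bias, feature values, and labels below are made up for illustration, and the three features merely stand in for the co-evolutionary, physicochemical, and disorder inputs mentioned in the abstract.

```python
import math


def logistic_combine(features, weights, bias):
    """Logistic regression score: sigmoid of a weighted sum of the inputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical weights and toy per-residue features:
# (co-evolution, physicochemical, disorder), each scaled to [0, 1].
weights, bias = (1.5, 0.5, 1.0), -1.5
residues = [(0.9, 0.6, 0.8), (0.7, 0.5, 0.9), (0.2, 0.4, 0.3), (0.1, 0.5, 0.6)]
labels = [1, 1, 0, 0]  # 1 = residue inside a LIP (toy annotation)

scores = [logistic_combine(f, weights, bias) for f in residues]
auc = roc_auc(labels, scores)
print(f"AUC on toy data = {auc:.2f}")
```

The pairwise definition of AUC used here is quadratic in the number of residues, which is fine for a sketch; a rank-based computation would be preferable on proteome-scale data.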
Affiliation(s)
- Zhenling Peng
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China; Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
- Zixia Li
- Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
- Qiaozhen Meng
- College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
13
Magyar C, Németh BZ, Cserző M, Simon I. Molecular Dynamics Simulation as a Tool to Identify Mutual Synergistic Folding Proteins. Int J Mol Sci 2023; 24:ijms24021790. [PMID: 36675304 PMCID: PMC9861041 DOI: 10.3390/ijms24021790] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Revised: 01/12/2023] [Accepted: 01/13/2023] [Indexed: 01/18/2023]
Abstract
Mutual synergistic folding (MSF) proteins belong to a recently emerged subclass of disordered proteins, which are disordered in their monomeric forms but become ordered in their oligomeric forms. They can be identified by experimental methods following their unfolding, which happens in a single-step cooperative process, without the presence of stable monomeric intermediates. Only a limited number of experimentally validated MSF proteins are accessible. The amino acid composition of MSF proteins shows high similarity to globular ordered proteins, rather than to disordered ones. However, they have some special structural features, which makes it possible to distinguish them from globular proteins. Even in the possession of their oligomeric three-dimensional structure, classification can only be performed based on unfolding experiments, which are frequently absent. In this work, we demonstrate a simple protocol using molecular dynamics simulations, which is able to indicate that a protein structure belongs to the MSF subclass. The presumption of the known atomic resolution quaternary structure is an obvious limitation of the method, and because of its high computational time requirements, it is not suitable for screening large databases; still, it is a valuable in silico tool for identification of MSF proteins.
Affiliation(s)
- Csaba Magyar
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Bálint Zoltán Németh
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Miklós Cserző
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
- Department of Physiology, Faculty of Medicine, Semmelweis University, 1094 Budapest, Hungary
- István Simon
- Institute of Enzymology, Research Centre for Natural Sciences, Eötvös Loránd Research Network, 1117 Budapest, Hungary
14
Deutsch N, Pajkos M, Erdős G, Dosztányi Z. DisCanVis: Visualizing integrated structural and functional annotations to better understand the effect of cancer mutations located within disordered proteins. Protein Sci 2023; 32:e4522. [PMID: 36452990 PMCID: PMC9793970 DOI: 10.1002/pro.4522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 11/15/2022] [Accepted: 11/16/2022] [Indexed: 12/03/2022]
Abstract
Intrinsically disordered proteins (IDPs) play important roles in a wide range of biological processes and have been associated with various diseases, including cancer. In the last few years, cancer genome projects have systematically collected genetic variations underlying multiple cancer types. In parallel, the number and different types of disordered proteins characterized by experimental methods have also significantly increased. Nevertheless, the role of IDPs in various types of cancer is still not well understood. In this work, we present DisCanVis, a novel visualization tool for cancer mutations with a special focus on IDPs. In order to aid the interpretation of observed mutations, genome level information is combined with information about the structural and functional properties of proteins. The web server enables users to inspect individual proteins, collect examples with existing annotations of protein disorder and associated function or to discover currently uncharacterized examples with likely disease relevance. Through a REST API interface and precompiled tables the analysis can be extended to a group of proteins.
Affiliation(s)
- Norbert Deutsch
- Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- Mátyás Pajkos
- Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- Gábor Erdős
- Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
- Zsuzsanna Dosztányi
- Department of Biochemistry, Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary
15
Sun C, Feng Y, Fan G. IDPsBind: a repository of binding sites for intrinsically disordered proteins complexes with known 3D structures. BMC Mol Cell Biol 2022; 23:33. [PMID: 35883018 PMCID: PMC9327236 DOI: 10.1186/s12860-022-00434-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/28/2021] [Accepted: 07/14/2022] [Indexed: 11/10/2022]
Abstract
Background
Intrinsically disordered proteins (IDPs) lack a stable three-dimensional structure under physiological conditions but play crucial roles in many biological processes. Intrinsically disordered proteins perform various biological functions by interacting with other ligands.
Results
Here, we present a database, IDPsBind, which displays the interacting sites between IDPs and their ligands, identified by applying a distance threshold method to IDP complexes with known 3D structures from the PDB. IDPsBind contains 9626 IDP complexes and 880 experimentally verified intrinsically disordered proteins. The current release of the IDPsBind database is defined as version 1.0. IDPsBind is freely accessible at http://www.s-bioinformatics.cn/idpsbind/home/.
Conclusions
IDPsBind provides more comprehensive interaction sites for IDP complexes of known 3D structures. It can not only help subsequent studies of the interaction mechanisms of intrinsically disordered proteins but also provide a suitable background for developing algorithms that predict their interaction sites.
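A distance threshold method of the kind mentioned in the Results can be sketched as below. This is a minimal illustration, not the IDPsBind pipeline: the abstract does not state the cutoff, so the 4.0 Å default is an assumption, and the input layout (a dictionary from residue identifier to a list of atom coordinates) and the function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two (x, y, z) points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def contact_residues(chain_a, chain_b, cutoff=4.0):
    """Return residues of each chain having any atom within `cutoff` of the other chain.

    chain_a, chain_b: dict mapping residue id -> list of (x, y, z) atom coordinates.
    cutoff: distance threshold in angstroms (assumed value, not taken from the source).
    """
    cutoff_sq = cutoff * cutoff
    sites_a, sites_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(squared_distance(p, q) <= cutoff_sq
                   for p in atoms_a for q in atoms_b):
                sites_a.add(res_a)
                sites_b.add(res_b)
    return sorted(sites_a), sorted(sites_b)


# Toy coordinates (one atom per residue) standing in for parsed PDB records.
idp_chain = {1: [(0.0, 0.0, 0.0)], 2: [(3.0, 0.0, 0.0)], 3: [(20.0, 0.0, 0.0)]}
ligand_chain = {101: [(0.0, 3.0, 0.0)], 102: [(30.0, 0.0, 0.0)]}

idp_sites, ligand_sites = contact_residues(idp_chain, ligand_chain)
print("IDP interacting residues:", idp_sites)
print("Ligand interacting residues:", ligand_sites)
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a single complex; a spatial index such as a k-d tree would be the usual choice when scanning thousands of structures.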
16
Piovesan D, Del Conte A, Clementel D, Monzon A, Bevilacqua M, Aspromonte M, Iserte J, Orti FE, Marino-Buslje C, Tosatto SE. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res 2022; 51:D438-D444. [PMID: 36416266 PMCID: PMC9825420 DOI: 10.1093/nar/gkac1065] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 09/22/2022] [Revised: 10/11/2022] [Accepted: 10/25/2022] [Indexed: 11/24/2022]
Abstract
The MobiDB database (URL: https://mobidb.org/) is a knowledge base of intrinsically disordered proteins. MobiDB aggregates disorder annotations derived from the literature and from experimental evidence along with predictions for all known protein sequences. MobiDB generates new knowledge and captures the functional significance of disordered regions by processing and combining complementary sources of information. Since its first release 10 years ago, the MobiDB database has evolved in order to improve the quality and coverage of protein disorder annotations and its accessibility. MobiDB has now reached its maturity in terms of data standardization and visualization. Here, we present a new release which focuses on the optimization of user experience and database content. The major advances compared to the previous version are the integration of AlphaFoldDB predictions and the re-implementation of the homology transfer pipeline, which expands manually curated annotations by two orders of magnitude. Finally, the entry page has been restyled in order to provide an overview of the available annotations along with two separate views that highlight structural disorder evidence and functions associated with different binding modes.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Clementel
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
- Fernando E Orti
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
17
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022]
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is crucial to identify IDP binding sites for the functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein interaction sites of IDPs (IDP-PPIS) based on protein sequence information. Moreover, new IDP-PPIS predictors are developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of the methods in this field. In this paper, we collected 30 representative predictors published recently and summarized their databases, features and algorithms. We described how the features were generated from public data and used for the prediction of IDP-PPIS, along with the methods used to generate the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of the algorithms and their performance. We hope that our manuscript will provide not only a full picture of the status quo of IDP binding prediction but also a guide for selecting among the different methods. More importantly, it may inspire future development trends and principles.
Affiliation(s)
- Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xinlu Li
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Yaqing Yang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xixi Song
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Dongdong Qiao
- Shandong Mental Health Center, Shandong University, Jinan, China
18
Roca-Martinez J, Lazar T, Gavalda-Garcia J, Bickel D, Pancsa R, Dixit B, Tzavella K, Ramasamy P, Sanchez-Fornaris M, Grau I, Vranken WF. Challenges in describing the conformation and dynamics of proteins with ambiguous behavior. Front Mol Biosci 2022; 9:959956. [PMID: 35992270 PMCID: PMC9382080 DOI: 10.3389/fmolb.2022.959956] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/02/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022]
Abstract
Traditionally, our understanding of how proteins operate and how evolution shapes them is based on two main data sources: the overall protein fold and the protein amino acid sequence. However, a significant part of the proteome shows highly dynamic and/or structurally ambiguous behavior, which cannot be correctly represented by the traditional fixed set of static coordinates. Representing such protein behaviors remains challenging and necessarily involves a complex interpretation of conformational states, including probabilistic descriptions. Relating protein dynamics and multiple conformations to their function as well as their physiological context (e.g., post-translational modifications and subcellular localization), therefore, remains elusive for much of the proteome, with studies to investigate the effect of protein dynamics relying heavily on computational models. We here investigate the possibility of delineating three classes of protein conformational behavior: order, disorder, and ambiguity. These definitions are explored based on three different datasets, using interpretable machine learning from a set of features, from AlphaFold2 to sequence-based predictions, to understand the overlap and differences between these datasets. This forms the basis for a discussion on the current limitations in describing the behavior of dynamic and ambiguous proteins.
Affiliation(s)
- Joel Roca-Martinez
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Tamas Lazar
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- VIB-VUB Center for Structural Biology, Brussels, Belgium
- Jose Gavalda-Garcia
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- David Bickel
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Rita Pancsa
- Research Centre for Natural Sciences, Institute of Enzymology, Budapest, Hungary
- Bhawna Dixit
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- IBiTech-Biommeda, Universiteit Gent, Gent, Belgium
- Konstantina Tzavella
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Pathmanaban Ramasamy
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- VIB-UGent Center for Medical Biotechnology, Universiteit Gent, Gent, Belgium
- Maite Sanchez-Fornaris
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Department of Computer Sciences, University of Camagüey, Camagüey, Cuba
- Isel Grau
- Information Systems, Eindhoven University of Technology, Eindhoven, Netherlands
- Wim F. Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
19
Banani SF, Afeyan LK, Hawken SW, Henninger JE, Dall'Agnese A, Clark VE, Platt JM, Oksuz O, Hannett NM, Sagi I, Lee TI, Young RA. Genetic variation associated with condensate dysregulation in disease. Dev Cell 2022; 57:1776-1788.e8. [PMID: 35809564 PMCID: PMC9339523 DOI: 10.1016/j.devcel.2022.06.010] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 03/30/2021] [Revised: 03/11/2022] [Accepted: 06/14/2022] [Indexed: 12/18/2022]
Abstract
A multitude of cellular processes involve biomolecular condensates, which has led to the suggestion that diverse pathogenic mutations may dysregulate condensates. Although proof-of-concept studies have identified specific mutations that cause condensate dysregulation, the full scope of the pathological genetic variation that affects condensates is not yet known. Here, we comprehensively map pathogenic mutations to condensate-promoting protein features in putative condensate-forming proteins and find over 36,000 pathogenic mutations that plausibly contribute to condensate dysregulation in over 1,200 Mendelian diseases and 550 cancers. This resource captures mutations presently known to dysregulate condensates, and experimental tests confirm that additional pathological mutations do indeed affect condensate properties in cells. These findings suggest that condensate dysregulation may be a pervasive pathogenic mechanism underlying a broad spectrum of human diseases, provide a strategy to identify proteins and mutations involved in pathologically altered condensates, and serve as a foundation for mechanistic insights into disease and therapeutic hypotheses.
Affiliation(s)
- Salman F Banani
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Lena K Afeyan
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Susana W Hawken
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Program of Computational & Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Victoria E Clark
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Neurosurgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Jesse M Platt
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Ozgur Oksuz
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Nancy M Hannett
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Ido Sagi
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Tong Ihn Lee
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Richard A Young
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
20
Compositional Bias of Intrinsically Disordered Proteins and Regions and Their Predictions. Biomolecules 2022; 12:biom12070888. [PMID: 35883444 PMCID: PMC9313023 DOI: 10.3390/biom12070888] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/27/2022] [Revised: 06/10/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022]
Abstract
Intrinsically disordered regions (IDRs) carry out many cellular functions and vary in length and placement in protein sequences. This diversity leads to variations in the underlying compositional biases, which were demonstrated for the short vs. long IDRs. We analyze compositional biases across four classes of disorder: fully disordered proteins; short IDRs; long IDRs; and binding IDRs. We identify three distinct biases: for the fully disordered proteins, the short IDRs and the long and binding IDRs combined. We also investigate compositional bias for putative disorder produced by leading disorder predictors and find that it is similar to the bias of the native disorder. Interestingly, the accuracy of disorder predictions across different methods is correlated with the correctness of the compositional bias of their predictions highlighting the importance of the compositional bias. The predictive quality is relatively low for the disorder classes with compositional bias that is the most different from the “generic” disorder bias, while being much higher for the classes with the most similar bias. We discover that different predictors perform best across different classes of disorder. This suggests that no single predictor is universally best and motivates the development of new architectures that combine models that target specific disorder classes.
21
Ahmed SS, Rifat ZT, Lohia R, Campbell AJ, Dunker AK, Rahman MS, Iqbal S. Characterization of intrinsically disordered regions in proteins informed by human genetic diversity. PLoS Comput Biol 2022; 18:e1009911. [PMID: 35275927 PMCID: PMC8942211 DOI: 10.1371/journal.pcbi.1009911] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/21/2021] [Revised: 03/23/2022] [Accepted: 02/10/2022] [Indexed: 01/21/2023]
Abstract
All proteomes contain both proteins and polypeptide segments that don’t form a defined three-dimensional structure yet are biologically active, called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase (“UniProt features”: active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected the genetic variant data from the general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
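An observed-to-expected comparison of mutation counts per region can be sketched as follows. The abstract does not give the expectation model or the cutoff, so this sketch makes two loud assumptions: the expected count is simply region length times a uniform background rate, and a region is flagged as intolerant when the ratio falls below a hypothetical threshold of 0.5. Region names, counts and function names are invented for illustration.

```python
def observed_to_expected(observed, length, background_rate):
    """Ratio of observed mutations to the count expected under a uniform background.

    background_rate: assumed mutations per residue (not a value from the source).
    """
    expected = length * background_rate
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def intolerant_regions(regions, background_rate, threshold=0.5):
    """Return names of regions whose observed/expected ratio is below `threshold`.

    regions: dict mapping region name -> (observed mutation count, length in residues).
    threshold: hypothetical cutoff chosen for this example only.
    """
    flagged = []
    for name, (observed, length) in regions.items():
        if observed_to_expected(observed, length, background_rate) < threshold:
            flagged.append(name)
    return sorted(flagged)


# Invented toy regions: (observed missense count, region length).
toy_regions = {"IDR_A": (2, 100), "IDR_B": (12, 100), "IDR_C": (4, 50)}
toy_rate = 0.1  # assumed background of one mutation per ten residues

for name, (obs, length) in sorted(toy_regions.items()):
    print(name, round(observed_to_expected(obs, length, toy_rate), 2))
print("Flagged as intolerant:", intolerant_regions(toy_regions, toy_rate))
```

A real analysis would derive the expectation from sequence context and sampling depth rather than a single uniform rate, and would attach a significance test to each ratio.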
Affiliation(s)
- Shehab S. Ahmed
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- Zaara T. Rifat
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- Ruchi Lohia
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Arthur J. Campbell
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
- M. Sohel Rahman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka-1205, Bangladesh
- * E-mail: (MSR); (SI)
- Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- * E-mail: (MSR); (SI)
22
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022]
Abstract
Intrinsic disorder prediction is an active area that has developed over 100 predictors. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners account for the majority of the new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
23
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022]
24
Piovesan D, Monzon AM, Quaglia F, Tosatto SCE. Databases for intrinsically disordered proteins. Acta Crystallogr D Struct Biol 2022; 78:144-151. [PMID: 35102880 PMCID: PMC8805306 DOI: 10.1107/s2059798321012109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 11/12/2021] [Indexed: 11/28/2022]
Abstract
Intrinsically disordered regions (IDRs) lacking a fixed three-dimensional protein structure are widespread and play a central role in cell regulation. Only a small fraction of IDRs have been functionally characterized, with heterogeneous experimental evidence that is largely buried in the literature. Predictions of IDRs are still difficult to estimate and are poorly characterized. Here, an overview of the publicly available knowledge about IDRs is reported, including manually curated resources, deposition databases and prediction repositories. The types, scopes and availability of the various resources are analyzed, and their complementarity and overlap are highlighted. The volume of information included and the relevance to the field of structural biology are compared.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR–IBIOM), Bari, Italy
25
Tamburrini KC, Pesce G, Nilsson J, Gondelaud F, Kajava AV, Berrin JG, Longhi S. Predicting Protein Conformational Disorder and Disordered Binding Sites. Methods Mol Biol 2022; 2449:95-147. [PMID: 35507260 DOI: 10.1007/978-1-0716-2095-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In the last two decades it has become increasingly evident that a large number of proteins adopt either a fully or a partially disordered conformation. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded by the amino acid sequence, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting protein disorder and identifying intrinsically disordered binding sites.
Collapse
Affiliation(s)
- Ketty C Tamburrini
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Giulia Pesce
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Juliet Nilsson
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Frank Gondelaud
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Université Montpellier, Montpellier, France
| | - Jean-Guy Berrin
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Sonia Longhi
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France.
|
26
|
Abstract
INTRODUCTION The intrinsic disorder prediction field develops, assesses, and deploys computational predictors of disorder in protein sequences, and constructs and disseminates databases of these predictions. Over 40 years of research have resulted in the release of numerous resources. AREAS COVERED We identify and briefly summarize the most comprehensive collection to date of over 100 disorder predictors. We focus on their predictive models, availability, and predictive performance. We categorize and study them from a historical point of view to highlight informative trends. EXPERT OPINION We find a consistent trend of improvements in predictive quality as newer and more advanced predictors are developed. The original focus on machine learning methods shifted to meta-predictors in the early 2010s, followed by a recent transition to deep learning. The use of deep learners will continue in the foreseeable future given the recent and convincing success of these methods. Moreover, a broad range of resources that facilitate convenient collection of accurate disorder predictions is available to users. They include web servers and standalone programs for disorder prediction, servers that combine prediction of disorder and disorder functions, and large databases of pre-computed predictions. We also point to the need to address the shortage of accurate methods that predict disordered binding regions.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA
|
27
|
Origin of Increased Solvent Accessibility of Peptide Bonds in Mutual Synergetic Folding Proteins. Int J Mol Sci 2021; 22:ijms222413404. [PMID: 34948202 PMCID: PMC8704591 DOI: 10.3390/ijms222413404] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/14/2021] [Revised: 12/10/2021] [Accepted: 12/11/2021] [Indexed: 11/16/2022] Open
Abstract
Mutual Synergetic Folding (MSF) proteins belong to a recently discovered class of proteins. These proteins are disordered in their monomeric forms but ordered in their oligomeric forms. Their amino acid composition is more similar to that of globular proteins than to that of disordered ones. Our preceding work shed light on important aspects of the structural organization of these proteins, but the background of this behavior is still unknown. We suggest that solvent accessibility is an important factor; in particular, the solvent accessibility of the peptide bonds can account for this phenomenon. The side chains of the amino acids that form a peptide bond make a high local contribution to shielding the peptide bond from the solvent. During the oligomerization step, other, non-local residues contribute to the shielding. We investigated these local and non-local effects of shielding based on Shannon information entropy calculations. We found that MSF and globular homodimeric proteins have different local contributions resulting from different amino acid pair frequencies. Their non-local distribution is also different because of distinctive inter-subunit contacts.
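As a rough illustration of the kind of entropy calculation this abstract mentions, the Python sketch below computes the Shannon entropy of adjacent amino acid pair frequencies over a set of sequences. The function name `pair_entropy` and the toy sequences are hypothetical, and the abstract does not say how the authors defined their residue pairs or shielding terms, so this shows only the general formula H = -sum(p * log2(p)), not the published method.

```python
import math
from collections import Counter


def pair_entropy(sequences):
    """Shannon entropy (in bits) of adjacent residue-pair frequencies.

    Hypothetical helper for illustration only; pairs are taken as
    neighbouring residues (i, i+1) within each sequence.
    """
    counts = Counter()
    for seq in sequences:
        # zip(seq, seq[1:]) yields every pair of neighbouring residues.
        for first, second in zip(seq, seq[1:]):
            counts[first + second] += 1

    total = sum(counts.values())
    if total == 0:
        # No pairs observed (empty input or single-residue sequences).
        return 0.0

    # H = -sum over observed pairs of p * log2(p)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Toy sequences, not taken from any dataset.
    print(pair_entropy(["MKVLAAGIVG", "MSTNPKPQRK"]))
```

A lower entropy means a few pair types dominate; comparing the values for two protein classes is one simple way to express a difference in pair frequencies.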
|
28
|
Pajkos M, Dosztányi Z. Functions of intrinsically disordered proteins through evolutionary lenses. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2021; 183:45-74. [PMID: 34656334 DOI: 10.1016/bs.pmbts.2021.06.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Abstract
Protein sequences are the result of an evolutionary process that involves the balancing act of experimenting with novel mutations and selecting out those that have an undesirable functional outcome. In the case of globular proteins, the function relies on a well-defined conformation, therefore, there is a strong evolutionary pressure to preserve the structure. However, different evolutionary rules might apply for the group of intrinsically disordered regions and proteins (IDR/IDPs) that exist as an ensemble of fluctuating conformations. The function of IDRs can directly originate from their disordered state or arise through different types of molecular recognition processes. There is an amazing variety of ways IDRs can carry out their functions, and this is also reflected in their evolutionary properties. In this chapter we give an overview of the different types of evolutionary behavior of disordered proteins and associated functions in normal and disease settings.
Affiliation(s)
- Mátyás Pajkos
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Hungary.
|
29
|
Erdős G, Pajkos M, Dosztányi Z. IUPred3: prediction of protein disorder enhanced with unambiguous experimental annotation and visualization of evolutionary conservation. Nucleic Acids Res 2021; 49:W297-W303. [PMID: 34048569 PMCID: PMC8262696 DOI: 10.1093/nar/gkab408] [Citation(s) in RCA: 248] [Impact Index Per Article: 82.7] [Received: 03/15/2021] [Revised: 04/21/2021] [Accepted: 05/14/2021] [Indexed: 12/22/2022] Open
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) exist without a single well-defined conformation. They carry out important biological functions with multifaceted roles, which is also reflected in their evolutionary behavior. Computational methods play important roles in the characterization of IDRs. One of the commonly used disorder prediction methods is IUPred, which relies on an energy estimation approach. The IUPred web server takes an amino acid sequence or a UniProt ID/accession as input and predicts the tendency of each amino acid to be in a disordered region, with an option to also predict context-dependent disordered regions. In this new iteration of IUPred, we added multiple novel features to enhance the prediction capabilities of the server. First, learning from the latest evaluation of disorder prediction methods, we introduced multiple new smoothing functions that decrease noise and increase the performance of the predictions. We constructed a dataset consisting of experimentally verified ordered/disordered regions with unambiguous annotations, which were added to the prediction. We also introduced a novel tool that enables the exploration of the evolutionary conservation of protein disorder coupled to sequence conservation in model organisms. The web server is freely available to users and accessible at https://iupred3.elte.hu.
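To make the idea of smoothing a per-residue disorder profile concrete, here is a minimal Python sketch of a centred moving average. It is a generic technique chosen for illustration; the abstract does not describe the smoothing functions actually used in IUPred3, and the function name `moving_average`, the window size, and the example scores are all assumptions.

```python
def moving_average(scores, window=5):
    """Smooth per-residue scores with a centred moving average.

    Generic illustration only, not the IUPred3 smoothing functions.
    Near the sequence ends the window is truncated, so the output has
    the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")

    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)                 # clip window at the N-terminus
        hi = min(len(scores), i + half + 1)   # clip window at the C-terminus
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Hypothetical raw scores with a single noisy spike at position 2.
    raw = [0.1, 0.1, 0.9, 0.1, 0.1]
    print(moving_average(raw, window=3))
```

Averaging over neighbours damps isolated spikes, which is why smoothing of this general kind reduces residue-level noise at the cost of blurring sharp region boundaries.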
Affiliation(s)
- Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
|
30
|
Mészáros B, Hajdu-Soltész B, Zeke A, Dosztányi Z. Mutations of Intrinsically Disordered Protein Regions Can Drive Cancer but Lack Therapeutic Strategies. Biomolecules 2021; 11:biom11030381. [PMID: 33806614 PMCID: PMC8000335 DOI: 10.3390/biom11030381] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/25/2021] [Revised: 02/22/2021] [Accepted: 02/24/2021] [Indexed: 12/22/2022] Open
Abstract
Many proteins contain intrinsically disordered regions (IDRs) which carry out important functions without relying on a single well-defined conformation. IDRs are increasingly recognized as critical elements of regulatory networks and have been also associated with cancer. However, it is unknown whether mutations targeting IDRs represent a distinct class of driver events associated with specific molecular and system-level properties, cancer types and treatment options. Here, we used an integrative computational approach to explore the direct role of intrinsically disordered protein regions driving cancer. We showed that around 20% of cancer drivers are primarily targeted through a disordered region. These IDRs can function in multiple ways which are distinct from the functional mechanisms of ordered drivers. Disordered drivers play a central role in context-dependent interaction networks and are enriched in specific biological processes such as transcription, gene expression regulation and protein degradation. Furthermore, their modulation represents an alternative mechanism for the emergence of all known cancer hallmarks. Importantly, in certain cancer patients, mutations of disordered drivers represent key driving events. However, treatment options for such patients are currently severely limited. The presented study highlights a largely overlooked class of cancer drivers associated with specific cancer types that need novel therapeutic options.
Affiliation(s)
- Bálint Mészáros
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary; (B.M.); (B.H.-S.)
- EMBL Heidelberg, Meyerhofstraße 1, 69117 Heidelberg, Germany
| | - Borbála Hajdu-Soltész
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary; (B.M.); (B.H.-S.)
| | - András Zeke
- Institute of Enzymology, RCNS, P.O. Box 7, H-1518 Budapest, Hungary;
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary; (B.M.); (B.H.-S.)
- Correspondence: ; Tel.: +36-1-372 2500/8537
|
31
|
Monzon AM, Bonato P, Necci M, Tosatto SCE, Piovesan D. FLIPPER: Predicting and Characterizing Linear Interacting Peptides in the Protein Data Bank. J Mol Biol 2021; 433:166900. [PMID: 33647288 DOI: 10.1016/j.jmb.2021.166900] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/03/2020] [Revised: 02/22/2021] [Accepted: 02/22/2021] [Indexed: 12/31/2022]
Abstract
A large fraction of peptides or protein regions are disordered in isolation and fold upon binding. These regions, also called MoRFs, SLiMs or LIPs, are often associated with signaling and regulation processes. However, despite their importance, only a limited number of examples are available in public databases, and their automatic detection at the proteome level is problematic. Here we present FLIPPER, an automatic method for the detection of structurally linear sub-regions or peptides that interact with another chain in a protein complex. FLIPPER is a random forest classifier that takes the protein structure as input and provides the propensity of each amino acid to be part of a LIP region. Models are built taking into consideration structural features such as intra- and inter-chain contacts, secondary structure, solvent accessibility in both bound and unbound states, structural linearity and chain length. FLIPPER is accurate when evaluated on non-redundant independent datasets, with 99% precision and 99% sensitivity on PixelDB-25 and 87% precision and 88% sensitivity on DIBS-25. Finally, we used FLIPPER to process the entire Protein Data Bank and identified different classes of LIPs based on different binding modes and partner molecules. We provide a detailed description of these LIP categories and show that a large fraction of these regions are not detected by disorder predictors. All FLIPPER predictions are integrated in the MobiDB 4.0 database.
Affiliation(s)
| | - Paolo Bonato
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy.
| | - Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
|
32
|
Lazar T, Martínez-Pérez E, Quaglia F, Hatos A, Chemes L, Iserte JA, Méndez NA, Garrone NA, Saldaño T, Marchetti J, Rueda A, Bernadó P, Blackledge M, Cordeiro TN, Fagerberg E, Forman-Kay JD, Fornasari M, Gibson TJ, Gomes GNW, Gradinaru C, Head-Gordon T, Jensen MR, Lemke E, Longhi S, Marino-Buslje C, Minervini G, Mittag T, Monzon A, Pappu RV, Parisi G, Ricard-Blum S, Ruff KM, Salladini E, Skepö M, Svergun D, Vallet S, Varadi M, Tompa P, Tosatto SCE, Piovesan D. PED in 2021: a major update of the protein ensemble database for intrinsically disordered proteins. Nucleic Acids Res 2021; 49:D404-D411. [PMID: 33305318 PMCID: PMC7778965 DOI: 10.1093/nar/gkaa1021] [Citation(s) in RCA: 80] [Impact Index Per Article: 26.7] [Received: 09/14/2020] [Revised: 10/13/2020] [Accepted: 12/08/2020] [Indexed: 12/21/2022] Open
Abstract
The Protein Ensemble Database (PED) (https://proteinensemble.org), which holds structural ensembles of intrinsically disordered proteins (IDPs), has been significantly updated and upgraded since its last release in 2016. The new version, PED 4.0, has been completely redesigned and reimplemented with cutting-edge technology and now holds about six times more data (162 versus 24 entries and 242 versus 60 structural ensembles) and a broader representation of state of the art ensemble generation methods than the previous version. The database has a completely renewed graphical interface with an interactive feature viewer for region-based annotations, and provides a series of descriptors of the qualitative and quantitative properties of the ensembles. High quality of the data is guaranteed by a new submission process, which combines both automatic and manual evaluation steps. A team of biocurators integrate structured metadata describing the ensemble generation methodology, experimental constraints and conditions. A new search engine allows the user to build advanced queries and search all entry fields including cross-references to IDP-related resources such as DisProt, MobiDB, BMRB and SASBDB. We expect that the renewed PED will be useful for researchers interested in the atomic-level understanding of IDP function, and promote the rational, structure-based design of IDP-targeting drugs.
Affiliation(s)
- Tamas Lazar
- VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology, Brussels 1050, Belgium
- Structural Biology Brussels, Bioengineering Sciences Department, Vrije Universiteit Brussel, Brussels 1050, Belgium
| | - Elizabeth Martínez-Pérez
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Federica Quaglia
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
| | - András Hatos
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
| | - Lucía B Chemes
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
| | - Nicolás A Méndez
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Nicolás A Garrone
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Tadeo E Saldaño
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Julia Marchetti
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Ana Julia Velez Rueda
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Pau Bernadó
- Centre de Biochimie Structurale (CBS), CNRS, INSERM, University of Montpellier, Montpellier 34090, France
| | | | - Tiago N Cordeiro
- Centre de Biochimie Structurale (CBS), CNRS, INSERM, University of Montpellier, Montpellier 34090, France
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras 2780-157, Portugal
| | - Eric Fagerberg
- Theoretical Chemistry, Lund University, Lund, POB 124, SE-221 00, Sweden
| | - Julie D Forman-Kay
- Molecular Medicine Program, Hospital for Sick Children, Toronto, M5G 1X8, Ontario, Canada
- Department of Biochemistry, University of Toronto, Toronto, M5S 1A8, Ontario, Canada
| | - Maria S Fornasari
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Gregory-Neal W Gomes
- Department of Physics, University of Toronto, Toronto, M5S 1A7, Ontario, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, L5L 1C6, Ontario, Canada
| | - Claudiu C Gradinaru
- Department of Physics, University of Toronto, Toronto, M5S 1A7, Ontario, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, L5L 1C6, Ontario, Canada
| | - Teresa Head-Gordon
- Departments of Chemistry, Bioengineering, Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, USA
| | | | - Edward A Lemke
- Biocentre, Johannes Gutenberg-University Mainz, Mainz 55128, Germany
- Institute of Molecular Biology, Mainz 55128, Germany
| | - Sonia Longhi
- Aix-Marseille University, CNRS, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille 13288, France
| | | | | | - Tanja Mittag
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | | | - Rohit V Pappu
- Department of Biomedical Engineering, Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
| | - Gustavo Parisi
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Sylvie Ricard-Blum
- Univ Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, Villeurbanne, 69629 Lyon Cedex 07, France
| | - Kiersten M Ruff
- Department of Biomedical Engineering, Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
| | - Edoardo Salladini
- Aix-Marseille University, CNRS, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille 13288, France
| | - Marie Skepö
- Theoretical Chemistry, Lund University, Lund, POB 124, SE-221 00, Sweden
- LINXS - Lund Institute of Advanced Neutron and X-ray Science, Lund 223 70, Sweden
| | - Dmitri Svergun
- European Molecular Biology Laboratory, Hamburg Unit, Hamburg 22607, Germany
| | - Sylvain D Vallet
- Univ Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, Villeurbanne, 69629 Lyon Cedex 07, France
| | - Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Peter Tompa
- To whom correspondence should be addressed. Tel +32 473 785386;
| | - Silvio C E Tosatto
- Correspondence may also be addressed to Silvio C. E. Tosatto. Tel: +39 049 827 6269;
| | - Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
|
33
|
Csizmadia G, Erdős G, Tordai H, Padányi R, Tosatto S, Dosztányi Z, Hegedűs T. The MemMoRF database for recognizing disordered protein regions interacting with cellular membranes. Nucleic Acids Res 2021; 49:D355-D360. [PMID: 33119751 PMCID: PMC7778998 DOI: 10.1093/nar/gkaa954] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/08/2020] [Revised: 09/25/2020] [Accepted: 10/28/2020] [Indexed: 12/19/2022] Open
Abstract
Protein and lipid membrane interactions play fundamental roles in a large number of cellular processes (e.g. signalling, vesicle trafficking, or viral invasion). A growing number of examples indicate that such interactions can also rely on intrinsically disordered protein regions (IDRs), which can form specific reversible interactions not only with proteins but also with lipids. We named IDRs involved in such membrane lipid-induced disorder-to-order transitions MemMoRFs, by analogy to Molecular Recognition Features (MoRFs), the IDRs that exhibit a disorder-to-order transition upon interaction with protein partners. Currently, both the experimental detection and computational characterization of MemMoRFs are challenging, and information about these regions is scattered in the literature. To facilitate the related investigations, we generated a comprehensive database of experimentally validated MemMoRFs based on manual curation of literature and structural data. To characterize the dynamics of MemMoRFs, secondary structure propensity and flexibility calculated from nuclear magnetic resonance chemical shifts were incorporated into the database. These data were supplemented by the inclusion of sentences from papers, functional data and disease-related information. The MemMoRF database can be accessed via a user-friendly interface at https://memmorf.hegelab.org, potentially providing a central resource for the characterization of disordered regions in transmembrane and membrane-associated proteins.
Affiliation(s)
- Georgina Csizmadia
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Gábor Erdős
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Hedvig Tordai
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Rita Padányi
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Silvio Tosatto
- Department of Biomedical Sciences, University of Padua, Padua 35131, Italy
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Tamás Hegedűs
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
|
34
|
Piovesan D, Necci M, Escobedo N, Monzon AM, Hatos A, Mičetić I, Quaglia F, Paladin L, Ramasamy P, Dosztányi Z, Vranken WF, Davey N, Parisi G, Fuxreiter M, Tosatto SE. MobiDB: intrinsically disordered proteins in 2021. Nucleic Acids Res 2021; 49:D361-D367. [PMID: 33237329 PMCID: PMC7779018 DOI: 10.1093/nar/gkaa1058] [Citation(s) in RCA: 130] [Impact Index Per Article: 43.3] [Received: 09/15/2020] [Revised: 10/16/2020] [Accepted: 11/19/2020] [Indexed: 12/13/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) provides predictions and annotations for intrinsically disordered proteins. Here, we report recent developments implemented in MobiDB version 4, regarding the database format, with novel types of annotations and an improved update process. The new website includes a re-designed user interface, a more effective search engine and advanced API for programmatic access. The new database schema gives more flexibility for the users, as well as simplifying the maintenance and updates. In addition, the new entry page provides more visualisation tools including customizable feature viewer and graphs of the residue contact maps. MobiDB v4 annotates the binding modes of disordered proteins, whether they undergo disorder-to-order transitions or remain disordered in the bound state. In addition, disordered regions undergoing liquid-liquid phase separation or post-translational modifications are defined. The integrated information is presented in a simplified interface, which enables faster searches and allows large customized datasets to be downloaded in TSV, Fasta or JSON formats. An alternative advanced interface allows users to drill deeper into features of interest. A new statistics page provides information at database and proteome levels. The new MobiDB version presents state-of-the-art knowledge on disordered proteins and improves data accessibility for both computational and experimental users.
Affiliation(s)
- Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Nahuel Escobedo
- Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | | | - András Hatos
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Ivan Mičetić
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Federica Quaglia
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Lisanna Paladin
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Pathmanaban Ramasamy
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
- Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, Ghent 9000, Belgium
| | | | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Norman E Davey
- Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK
| | - Gustavo Parisi
- Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | - Monika Fuxreiter
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
|
35
|
Abstract
Intrinsically disordered proteins, defying the traditional protein structure-function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
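The Fmax values quoted in this abstract are maxima of the F1 score (the harmonic mean of precision and recall) over the possible score thresholds. The Python sketch below shows that calculation on pooled per-residue scores. The function name `f_max` and the toy data are hypothetical, and the abstract does not state whether CAID pools residues across the dataset or averages per protein, so this is an assumption-laden illustration rather than the CAID evaluation code.

```python
def f_max(scores, labels):
    """Maximum F1 over all thresholds drawn from the observed scores.

    scores: predicted disorder propensities, one per residue.
    labels: 1 for disordered, 0 for ordered (reference annotation).
    A residue is predicted disordered when its score >= threshold.
    """
    best = 0.0
    for threshold in sorted(set(scores)):
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
        fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
        fn = sum(1 for s, y in pairs if s < threshold and y == 1)
        if tp == 0:
            # F1 is zero (or undefined) without any true positives.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # Toy example: four residues, two of them annotated as disordered.
    print(f_max([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

Because the threshold is chosen after the fact to maximize F1, Fmax describes the best achievable trade-off for a predictor rather than its performance at its default cut-off.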
|
36
|
Chasing coevolutionary signals in intrinsically disordered proteins complexes. Sci Rep 2020; 10:17962. [PMID: 33087759 PMCID: PMC7578644 DOI: 10.1038/s41598-020-74791-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/14/2020] [Accepted: 08/27/2020] [Indexed: 11/30/2022] Open
Abstract
Intrinsically disordered proteins/regions (IDPs/IDRs) are crucial components of the cell: they are highly abundant and participate ubiquitously in a wide range of biological functions, such as regulatory processes and cell signaling. Many of their important functions rely on protein interactions, by which they trigger or modulate different pathways. Sequence covariation, a powerful tool for protein contact prediction, has been applied successfully to predict protein structure and to identify protein–protein interactions, mostly of globular proteins. IDPs/IDRs also mediate a plethora of protein–protein interactions, highlighting the importance of addressing sequence covariation-based inter-protein contact prediction for this class of proteins. Despite their importance, a systematic approach to analyzing the covariation phenomena of intrinsically disordered proteins and their complexes is still missing. Here we carry out a comprehensive critical assessment of coevolution-based contact prediction in IDP/IDR complexes and detail the challenges and possible limitations that emerge from their analysis. We found that the coevolutionary signal is faint in most of the complexes of disordered proteins but positively correlates with the interface size and binding affinity between partners. In addition, we discuss the state-of-the-art methodology through biological interpretation of the results, formulate evaluation guidelines and suggest future directions of development for the field.
|
37
|
Khramushin A, Marcu O, Alam N, Shimony O, Padhorny D, Brini E, Dill KA, Vajda S, Kozakov D, Schueler-Furman O. Modeling beta-sheet peptide-protein interactions: Rosetta FlexPepDock in CAPRI rounds 38-45. Proteins 2020; 88:1037-1049. [PMID: 31891416 PMCID: PMC7539656 DOI: 10.1002/prot.25871] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/06/2019] [Revised: 12/17/2019] [Accepted: 12/26/2019] [Indexed: 01/09/2023]
Abstract
Peptide-protein docking is challenging due to the considerable conformational freedom of the peptide. CAPRI rounds 38-45 included two peptide-protein interactions, both characterized by a peptide forming an additional beta strand of a beta sheet in the receptor. Using the Rosetta FlexPepDock peptide docking protocol, we generated top-performing, high-accuracy models for targets 134 and 135, involving an interaction between a peptide derived from L-MAG with DLC8. In addition, we were able to generate the only medium-accuracy models for a particularly challenging target, T121. In contrast to the classical peptide-mediated interaction, in which receptor side chains contact both peptide backbone and side chains, beta-sheet complementation involves a major contribution to binding by hydrogen bonds between main chain atoms. To determine how binding affinity and specificity are established in this special class of peptide-protein interactions, we extracted PeptiDBeta, a benchmark of solved structures of different protein domains that are bound by peptides via beta-sheet complementation, and tested our global peptide-docking protocol, PIPER-FlexPepDock, on this dataset. We find that the beta-strand part of the peptide is sufficient to generate approximate and even high-resolution models of many interactions, but inclusion of adjacent motif residues often provides additional information necessary to achieve high-resolution model quality.
Affiliation(s)
- Alisa Khramushin
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Orly Marcu
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Nawsad Alam
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Orly Shimony
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, New York, New York
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
| | - Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
| | - Ken A. Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
- Department of Physics and Astronomy, Stony Brook University, New York, New York
- Department of Chemistry, Stony Brook University, New York, New York
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Department of Chemistry, Boston University, Boston, Massachusetts
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, New York, New York
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, New York
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| |
|
38
|
Zhou J, Oldfield CJ, Yan W, Shen B, Dunker A. Identification of Intrinsic Disorder in Complexes from the Protein Data Bank. ACS Omega 2020; 5:17883-17891. [PMID: 32743159 PMCID: PMC7391252 DOI: 10.1021/acsomega.9b03927] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/17/2019] [Accepted: 03/18/2020] [Indexed: 02/08/2023]
Abstract
Background: Intrinsically disordered proteins or regions (IDPs or IDRs) lack stable structures in solution, yet often fold upon binding with partners. IDPs or IDRs are highly abundant in all proteomes and represent a significant modification of the sequence → structure → function paradigm. The Protein Data Bank (PDB) includes complexes containing disordered segments bound to globular proteins, but the molecular mechanisms of such binding interactions remain largely unknown. Results: In this study, we present the results of various disorder predictions on a nonredundant set of PDB complexes. In contrast to their structural appearances, many PDB proteins were predicted to be disordered when separated from their binding partners. These predicted-to-be-disordered proteins were observed to form structures depending upon various factors, including heterogroup binding, protein/DNA/RNA binding, disulfide bonds, and ion binding. Conclusions: This study collects many examples of disorder-to-order transition in IDP complex formation, thus revealing the unusual structure–function relationships of IDPs and providing additional support for the newly proposed paradigm of sequence → IDP/IDR ensemble → function.
Affiliation(s)
- Jianhong Zhou
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| | - Christopher J. Oldfield
- Computer Science Department, Virginia Commonwealth University, Richmond, Virginia 23284, United States
| | - Wenying Yan
- School of Biology & Basic Medical Sciences, Soochow University, Suzhou 215123, China
| | - Bairong Shen
- Institutes for Systems Genetics, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
| |
|
39
|
Exploring Protein Intrinsic Disorder with MobiDB. Methods Mol Biol 2020; 2141:127-143. [PMID: 32696355 DOI: 10.1007/978-1-0716-0524-0_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
It is now well established that many proteins or protein regions lack a fixed three-dimensional structure under physiological conditions and are intrinsically disordered. MobiDB is the main repository of protein disorder and mobility annotations, combining different data sources to provide an exhaustive overview of intrinsic disorder. MobiDB includes curated annotations from other databases, indirect disorder evidence from structural data, and disorder predictions from protein sequences. It provides an easy-to-use web server to visualize and explore disorder information. This chapter describes the data available in MobiDB, emphasizing how to use and access the intrinsic disorder data. MobiDB is available at http://mobidb.bio.unipd.it.
|
40
|
Monzon AM, Necci M, Quaglia F, Walsh I, Zanotti G, Piovesan D, Tosatto SCE. Experimentally Determined Long Intrinsically Disordered Protein Regions Are Now Abundant in the Protein Data Bank. Int J Mol Sci 2020; 21:ijms21124496. [PMID: 32599863 PMCID: PMC7349999 DOI: 10.3390/ijms21124496] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/25/2020] [Revised: 06/18/2020] [Accepted: 06/19/2020] [Indexed: 01/12/2023]
Abstract
Intrinsically disordered protein regions are commonly defined from missing electron density in X-ray structures. Experimental evidence for long disorder regions (LDRs) of at least 30 residues was so far limited to manually curated proteins. Here, we describe a comprehensive and large-scale analysis of experimental LDRs for 3133 unique proteins, demonstrating an increasing coverage of intrinsic disorder in the Protein Data Bank (PDB) in the last decade. The results suggest that long missing-residue regions are a good-quality source for annotating intrinsically disordered regions and performing functional analysis in large data sets. The consensus approach used to define LDRs allows evaluation of context-dependent disorder and provides a common definition at the protein level.
Affiliation(s)
- Alexander Miguel Monzon
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
| | - Ian Walsh
- Bioprocessing Technology Institute, A*STAR, Singapore 138668, Singapore
| | - Giuseppe Zanotti
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Corresponding author
| | - Silvio C. E. Tosatto
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy
- Corresponding author
| |
|
41
|
Hatos A, Hajdu-Soltész B, Monzon AM, Palopoli N, Álvarez L, Aykac-Fas B, Bassot C, Benítez GI, Bevilacqua M, Chasapi A, Chemes L, Davey NE, Davidović R, Dunker AK, Elofsson A, Gobeill J, Foutel NSG, Sudha G, Guharoy M, Horvath T, Iglesias V, Kajava AV, Kovacs OP, Lamb J, Lambrughi M, Lazar T, Leclercq JY, Leonardi E, Macedo-Ribeiro S, Macossay-Castillo M, Maiani E, Manso JA, Marino-Buslje C, Martínez-Pérez E, Mészáros B, Mičetić I, Minervini G, Murvai N, Necci M, Ouzounis CA, Pajkos M, Paladin L, Pancsa R, Papaleo E, Parisi G, Pasche E, Barbosa Pereira PJ, Promponas VJ, Pujols J, Quaglia F, Ruch P, Salvatore M, Schad E, Szabo B, Szaniszló T, Tamana S, Tantos A, Veljkovic N, Ventura S, Vranken W, Dosztányi Z, Tompa P, Tosatto SCE, Piovesan D. DisProt: intrinsic protein disorder annotation in 2020. Nucleic Acids Res 2020; 48:D269-D276. [PMID: 31713636 PMCID: PMC7145575 DOI: 10.1093/nar/gkz975] [Citation(s) in RCA: 98] [Impact Index Per Article: 24.5] [Received: 09/15/2019] [Revised: 10/11/2019] [Accepted: 10/12/2019] [Indexed: 11/29/2022]
Abstract
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the ‘dark’ proteome.
Affiliation(s)
- András Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Borbála Hajdu-Soltész
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Alexander M Monzon
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Nicolas Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Lucía Álvarez
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
| | - Burcu Aykac-Fas
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - Claudio Bassot
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Guillermo I Benítez
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Martina Bevilacqua
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Anastasia Chasapi
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica GR-57500, Greece
| | - Lucia Chemes
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina; Departamento de Fisiología y Biología Molecular y Celular (DFBMC), Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Norman E Davey
- Division of Cancer Biology, The Institute of Cancer Research, Chelsea, London SW3 6BJ, UK
| | - Radoslav Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade 11001, Serbia
| | - A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN 46202, USA
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Julien Gobeill
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Nicolás S González Foutel
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
| | - Govindarajan Sudha
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Mainak Guharoy
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Tamas Horvath
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Valentin Iglesias
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, Montpellier 34293, France; Institut de Biologie Computationnelle (IBC), Montpellier 34095, France
| | - Orsolya P Kovacs
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - John Lamb
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Matteo Lambrughi
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - Tamas Lazar
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Jeremy Y Leclercq
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, Montpellier 34293, France
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova 35127, Italy; Fondazione Istituto di Ricerca Pediatrica (IRP), Città della Speranza, Padova 35127, Italy
| | - Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Mauricio Macossay-Castillo
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Emiliano Maiani
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - José A Manso
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Cristina Marino-Buslje
- Bioinformatics Unit. Fundación Instituto Leloir, Ciudad de Buenos Aires C1405BWE, Argentina
| | | | - Bálint Mészáros
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Ivan Mičetić
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Giovanni Minervini
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Nikoletta Murvai
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Marco Necci
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Christos A Ouzounis
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica GR-57500, Greece
| | - Mátyás Pajkos
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark; Translational Disease Systems Biology, Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research University of Copenhagen, Copenhagen DK-2200, Denmark
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Emilie Pasche
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Pedro J Barbosa Pereira
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY 1678, Cyprus
| | - Jordi Pujols
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Patrick Ruch
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Marco Salvatore
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Beata Szabo
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Tamás Szaniszló
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Stella Tamana
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY 1678, Cyprus
| | - Agnes Tantos
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade 11001, Serbia
| | - Salvador Ventura
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Wim Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB2), ULB-VUB, Brussels 1050, Belgium
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Peter Tompa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy; CNR Institute of Neuroscience, Padova 35121, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| |
|
42
|
Alderson TR, Ying J, Bax A, Benesch JLP, Baldwin AJ. Conditional Disorder in Small Heat-shock Proteins. J Mol Biol 2020; 432:3033-3049. [PMID: 32081587 PMCID: PMC7245567 DOI: 10.1016/j.jmb.2020.02.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/30/2019] [Revised: 01/27/2020] [Accepted: 02/09/2020] [Indexed: 12/31/2022]
Abstract
Small heat-shock proteins (sHSPs) are molecular chaperones that respond to cellular stresses to combat protein aggregation. HSP27 is a critical human sHSP that forms large, dynamic oligomers whose quaternary structures and chaperone activities depend on environmental factors. Upon exposure to cellular stresses, such as heat shock or acidosis, HSP27 oligomers can dissociate into dimers and monomers, which leads to significantly enhanced chaperone activity. The structured core of the protein, the α-crystallin domain (ACD), forms dimers and can prevent the aggregation of substrate proteins to a similar degree as the full-length protein. When the ACD dimer dissociates into monomers, it partially unfolds and exhibits enhanced activity. Here, we used solution-state NMR spectroscopy to characterize the structure and dynamics of the HSP27 ACD monomer. We show that the monomer is stabilized at low pH and that its backbone chemical shifts, 15N relaxation rates, and 1H-15N residual dipolar couplings suggest structural changes and rapid motions in the region responsible for dimerization. By analyzing the solvent-accessible and buried surface areas of sHSP structures in the context of a database of dimers that are known to dissociate into disordered monomers, we predict that ACD dimers from sHSPs across all kingdoms of life may partially unfold upon dissociation. We propose a general model in which conditional disorder, the partial unfolding of ACDs upon monomerization, is a common mechanism for sHSP activity.
Affiliation(s)
- T Reid Alderson
- Department of Chemistry, Physical and Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QZ, UK; Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Jinfa Ying
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Ad Bax
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Justin L P Benesch
- Department of Chemistry, Physical and Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QZ, UK
| | - Andrew J Baldwin
- Department of Chemistry, Physical and Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QZ, UK
| |
|
43
|
Affiliation(s)
- Gábor Erdős
- Department of Biochemistry, MTA‐ELTE Momentum Bioinformatics Research Group ELTE Eötvös Loránd University Budapest Hungary
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, MTA‐ELTE Momentum Bioinformatics Research Group ELTE Eötvös Loránd University Budapest Hungary
| |
|
44
|
Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci 2020. [DOI: 10.1007/s12038-020-0010-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]
|
45
|
Simon I. Macromolecular Interactions of Disordered Proteins. Int J Mol Sci 2020; 21:ijms21020504. [PMID: 31941113 PMCID: PMC7014052 DOI: 10.3390/ijms21020504] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/03/2020] [Revised: 01/08/2020] [Accepted: 01/10/2020] [Indexed: 02/03/2023]
Affiliation(s)
- István Simon
- Institute of Enzymology, RCNS, Lorand Eotvos Research Network, Center of Excellence of the Hungarian Academy of Sciences, Magyar Tudósok krt. 2., H-1117 Budapest, Hungary
| |
|
46
|
Bhattarai A, Emerson IA. Dynamic conformational flexibility and molecular interactions of intrinsically disordered proteins. J Biosci 2020; 45:29. [PMID: 32020911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Intrinsically disordered proteins (IDPs) are highly flexible and undergo a disorder-to-order transition upon binding. They are highly abundant in the human proteome and play critical roles in cell signaling and regulatory processes. This review mainly focuses on the dynamics of disordered proteins, including their conformational heterogeneity, protein-protein interactions, and the phase transition of biomolecular condensates that are central to various biological functions. In addition, the role of RNA-mediated chaperones in protein folding and the stability of IDPs is discussed. Finally, we explore the dynamic binding interface of IDPs as a novel therapeutic target and the effect of small molecules on their interactions.
Affiliation(s)
- Anil Bhattarai
- Bioinformatics Programming Laboratory, Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632 014, India
| | | |
|
47
|
Gouw M, Alvarado-Valverde J, Čalyševa J, Diella F, Kumar M, Michael S, Van Roey K, Dinkel H, Gibson TJ. How to Annotate and Submit a Short Linear Motif to the Eukaryotic Linear Motif Resource. Methods Mol Biol 2020; 2141:73-102. [PMID: 32696353 DOI: 10.1007/978-1-0716-0524-0_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022]
Abstract
Over the past few years, it has become apparent that approximately 35% of the human proteome consists of intrinsically disordered regions. Many of these disordered regions are rich in short linear motifs (SLiMs) which mediate protein-protein interactions. Although these motifs are short and often partially conserved, they are involved in many important aspects of protein function, including cleavage, targeting, degradation, docking, phosphorylation, and other posttranslational modifications. The Eukaryotic Linear Motif resource (ELM) was established over 15 years ago as a repository to store and catalogue the scientific discoveries of motifs. Each motif in the database is annotated and curated manually, based on the experimental evidence gathered from publications. The entries themselves are submitted to ELM by filling in two annotation templates designed for motif class and motif instance annotation. In this protocol, we describe the steps involved in annotating new motifs and how to submit them to ELM.
Affiliation(s)
- Marc Gouw
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Jesús Alvarado-Valverde
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - Jelena Čalyševa
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - Francesca Diella
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Manjeet Kumar
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Sushama Michael
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Kim Van Roey
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Holger Dinkel
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| |
|
48
|
Sequence and Structure Properties Uncover the Natural Classification of Protein Complexes Formed by Intrinsically Disordered Proteins via Mutual Synergistic Folding. Int J Mol Sci 2019; 20:ijms20215460. [PMID: 31683980 PMCID: PMC6862064 DOI: 10.3390/ijms20215460] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/09/2019] [Revised: 10/28/2019] [Accepted: 10/30/2019] [Indexed: 12/17/2022]
Abstract
Intrinsically disordered proteins mediate crucial biological functions through their interactions with other proteins. Mutual synergistic folding (MSF) occurs when all interacting proteins are disordered, folding into a stable structure in the course of the complex formation. In these cases, the folding and binding processes occur in parallel, lending the resulting structures uniquely heterogeneous features. Currently there are no dedicated classification approaches that take into account the particular biological and biophysical properties of MSF complexes. Here, we present a scalable clustering-based classification scheme, built on redundancy-filtered features that describe the sequence and structure properties of the complexes and the role of the interaction, which is directly responsible for structure formation. Using this approach, we define six major types of MSF complexes, corresponding to biologically meaningful groups. Hence, the presented method also shows that differences in binding strength, subcellular localization, and regulation are encoded in the sequence and structural properties of proteins. While current protein structure classification methods can also handle complex structures, we show that the developed scheme is fundamentally different, and since it takes into account defining features of MSF complexes, it serves as a better representation of structures arising through this specific interaction mode.
|
49
|
Sequential, Structural and Functional Properties of Protein Complexes Are Defined by How Folding and Binding Intertwine. J Mol Biol 2019; 431:4408-4428. [DOI: 10.1016/j.jmb.2019.07.034] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/28/2019] [Revised: 07/10/2019] [Accepted: 07/29/2019] [Indexed: 12/15/2022]
|
50
|
Analysis of Heterodimeric "Mutual Synergistic Folding"-Complexes. Int J Mol Sci 2019; 20:ijms20205136. [PMID: 31623284 PMCID: PMC6829572 DOI: 10.3390/ijms20205136] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/13/2019] [Revised: 10/07/2019] [Accepted: 10/15/2019] [Indexed: 12/17/2022]
Abstract
Several intrinsically disordered proteins (IDPs) are capable of adopting stable structures without interacting with a folded partner. When the folding of all interacting partners happens at the same time, coupled with the interaction in a synergistic manner, the process is called Mutual Synergistic Folding (MSF). These complexes represent a discrete subset of IDPs. Recently, we collected information on their complexes and created the MFIB (Mutual Folding Induced by Binding) database. In a previous study, we compared homodimeric MSF complexes with homodimeric and monomeric globular proteins with similar amino acid sequence lengths. We concluded that MSF homodimers, compared to globular homodimeric proteins, have a greater solvent-accessible main-chain surface area on the contact surface of the subunits, which becomes buried during dimerization. The main driving force of the folding is the mutual shielding of the water-accessible backbones, but the formation of further intermolecular interactions can also be relevant. In this paper, we report analyses of heterodimeric MSF complexes. Our results indicate that the amino acid composition of the heterodimeric MSF monomer subunits slightly diverges from that of globular monomer proteins, while after dimerization, the amino acid composition of the overall MSF complexes becomes more similar to the overall amino acid composition of globular complexes. We found that inter-subunit interactions are strengthened and that, in addition to the shielding of the solvent-accessible backbone, other factors might play an important role in the stabilization of the heterodimeric structures, such as the energy gain resulting from the interaction of two subunits with different amino acid compositions. We suggest that the shielding of the β-sheet backbones and the formation of a buried structural core, along with the general strengthening of inter-subunit interactions, could together be the driving forces of MSF protein structural ordering upon dimerization.
|