1
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning, we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences.
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
- Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
- Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
2
Fang C, He J, Yamana H. MoRF_ESM: Prediction of MoRFs in disordered proteins based on a deep transformer protein language model. J Bioinform Comput Biol 2024; 22:2450006. [PMID: 38812466 DOI: 10.1142/s0219720024500069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
Molecular recognition features (MoRFs) are particular functional segments of disordered proteins, which play crucial roles in regulating the phase transition of membrane-less organelles and frequently serve as central sites in cellular interaction networks. As the association between disordered proteins and severe diseases continues to be discovered, identifying MoRFs has gained growing significance. Due to the limited number of experimentally validated MoRFs, the performance of existing MoRF prediction algorithms still needs to be improved. In this research, we present a model named MoRF_ESM, which utilizes deep-learning protein representations to predict MoRFs in disordered proteins. This approach employs a pretrained ESM-2 protein language model to generate embedding representations of residues in the form of attention map matrices. These representations are combined with a self-learned TextCNN model for feature extraction and prediction. In addition, an averaging step was incorporated at the end of the MoRF_ESM model to refine the output and generate final prediction results. In comparison to other methods on benchmark datasets, the MoRF_ESM approach demonstrates state-of-the-art performance, achieving [Formula: see text] higher AUC than other methods when tested on TEST1 and [Formula: see text] higher AUC than other methods when tested on TEST2. These results imply that the combination of ESM-2 and TextCNN can effectively extract deep evolutionary features related to protein structure and function, along with capturing shallow pattern features located in protein sequences, and is well qualified for the prediction task of MoRFs. Given that ESM-2 is a highly versatile protein language model, the methodology proposed in this study can be readily applied to other tasks involving the classification of protein sequences.
Affiliation(s)
- Chun Fang
- Department of Information Engineering, Beijing Institute of Petrochemical Technology, 19 Qingyuan North Road, Daxing District, Beijing 102617, P. R. China
- Department of Computer Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Jiasheng He
- Department of Information Engineering, Beijing Institute of Petrochemical Technology, 19 Qingyuan North Road, Daxing District, Beijing 102617, P. R. China
- Hayato Yamana
- Department of Computer Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
3
Luo S, Wohl S, Zheng W, Yang S. Biophysical and Integrative Characterization of Protein Intrinsic Disorder as a Prime Target for Drug Discovery. Biomolecules 2023; 13:biom13030530. [PMID: 36979465 PMCID: PMC10046839 DOI: 10.3390/biom13030530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 03/07/2023] [Accepted: 03/10/2023] [Indexed: 03/17/2023] Open
Abstract
Protein intrinsic disorder is increasingly recognized for its biological and disease-driven functions. However, it presents significant challenges for biophysical studies due to its high conformational flexibility. In addressing these challenges, we highlight the complementary and distinct capabilities of a range of experimental and computational methods and further describe integrative strategies available for combining these techniques. Integrative biophysics methods provide valuable insights into the sequence–structure–function relationship of disordered proteins, setting the stage for protein intrinsic disorder to become a promising target for drug discovery. Finally, we briefly summarize recent advances in the development of new small molecule inhibitors targeting the disordered N-terminal domains of three vital transcription factors.
Affiliation(s)
- Shuqi Luo
- Center for Proteomics and Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Samuel Wohl
- Department of Physics, Arizona State University, Tempe, AZ 85287, USA
- Wenwei Zheng
- College of Integrative Sciences and Arts, Arizona State University, Mesa, AZ 85212, USA
- Correspondence: W.Z.; S.Y.
- Sichun Yang
- Center for Proteomics and Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH 44106, USA
- Correspondence: W.Z.; S.Y.
4
Computational prediction of disordered binding regions. Comput Struct Biotechnol J 2023; 21:1487-1497. [PMID: 36851914 PMCID: PMC9957716 DOI: 10.1016/j.csbj.2023.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/26/2022] [Revised: 02/08/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023] Open
Abstract
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs have been identified, including molecular recognition fragments (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acid- and lipid-binding regions. Prediction of binding IDRs in protein sequences has gained momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids. We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and extending coverage to RNA, DNA and lipid-binding IDRs. We analyze the availability of these methods and show that providing implementations and webservers results in much higher rates of citations/use. We also make several recommendations to take advantage of modern deep network architectures, develop tools that bundle predictions of multiple and different types of binding IDRs, and work on algorithms that model structures of the resulting complexes.
5
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is thus crucial to identify the IDP binding sites for functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein binding sites of IDPs (IDP-PPIS) based on protein sequence information. Moreover, new IDP-PPIS predictors are developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of these methods in this field. In this paper, we collected 30 representative predictors published recently and summarized the databases, features and algorithms. We described how the features were generated from public data and used for the prediction of IDP-PPIS, along with the methods for generating the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of the algorithms and their performance. Hopefully, our manuscript will provide not only a full picture of the status quo of IDP binding prediction, but also a guide for selecting different methods. More importantly, it will shed light on the inspirations for future development trends and principles.
Affiliation(s)
- Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xinlu Li
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Yaqing Yang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Xixi Song
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
- *Correspondence: Cheng Wang; Dongdong Qiao
- Dongdong Qiao
- Shandong Mental Health Center, Shandong University, Jinan, China
- *Correspondence: Cheng Wang; Dongdong Qiao
6
Bondos SE, Dunker AK, Uversky VN. Intrinsically disordered proteins play diverse roles in cell signaling. Cell Commun Signal 2022; 20:20. [PMID: 35177069 PMCID: PMC8851865 DOI: 10.1186/s12964-022-00821-7] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 09/22/2021] [Accepted: 12/11/2021] [Indexed: 11/29/2022] Open
Abstract
Signaling pathways allow cells to detect and respond to a wide variety of chemical (e.g., Ca2+ or chemokine proteins) and physical stimuli (e.g., shear stress, light). Together, these pathways form an extensive communication network that regulates basic cell activities and coordinates the function of multiple cells or tissues. The process of cell signaling imposes many demands on the proteins that comprise these pathways, including the abilities to form active and inactive states, and to engage in multiple protein interactions. Furthermore, successful signaling often requires amplifying the signal, regulating or tuning the response to the signal, and combining information sourced from multiple pathways, all while ensuring fidelity of the process. This sensitivity, adaptability, and tunability are possible, in part, due to the inclusion of intrinsically disordered regions in many proteins involved in cell signaling. The goal of this collection is to highlight the many roles of intrinsic disorder in cell signaling. Following an overview of resources that can be used to study intrinsically disordered proteins, this review highlights the critical role of intrinsically disordered proteins for signaling in widely diverse organisms (animals, plants, bacteria, fungi), in every category of cell signaling pathway (autocrine, juxtacrine, intracrine, paracrine, and endocrine) and at each stage (ligand, receptor, transducer, effector, terminator) in the cell signaling process. Thus, a cell signaling pathway cannot be fully described without understanding how intrinsically disordered protein regions contribute to its function. The ubiquitous presence of intrinsic disorder in different stages of diverse cell signaling pathways suggests that more mechanisms by which disorder modulates intra- and inter-cell signals remain to be discovered.
Affiliation(s)
- Sarah E Bondos
- Department of Molecular and Cellular Medicine, Texas A&M Health Science Center, College Station, TX, 77843, USA
- A Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Moscow Region, Russia, 142290
7
Tamburrini KC, Pesce G, Nilsson J, Gondelaud F, Kajava AV, Berrin JG, Longhi S. Predicting Protein Conformational Disorder and Disordered Binding Sites. Methods Mol Biol 2022; 2449:95-147. [PMID: 35507260 DOI: 10.1007/978-1-0716-2095-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In the last two decades it has become increasingly evident that a large number of proteins adopt either a fully or a partially disordered conformation. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded by the amino acid sequence, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting protein disorder and identifying intrinsically disordered binding sites.
Affiliation(s)
- Ketty C Tamburrini
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
- Giulia Pesce
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Juliet Nilsson
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Frank Gondelaud
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Université Montpellier, Montpellier, France
- Jean-Guy Berrin
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
- Sonia Longhi
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
8
Zhang J, Ghadermarzi S, Kurgan L. Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins. Bioinformatics 2021; 36:4729-4738. [PMID: 32860044 DOI: 10.1093/bioinformatics/btaa573] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/07/2020] [Revised: 05/22/2020] [Accepted: 06/10/2020] [Indexed: 01/08/2023] Open
Abstract
Motivation: There are over 30 sequence-based predictors of the protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).
Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative to predict PBRs.
Availability and implementation: HybridPBRpred webserver, benchmark dataset and supplementary information are available at http://biomine.cs.vcu.edu/servers/hybridPBRpred/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
9
Yamada T, Figueroa EE, Denton JS, Strange K. LRRC8A homohexameric channels poorly recapitulate VRAC regulation and pharmacology. Am J Physiol Cell Physiol 2020; 320:C293-C303. [PMID: 33356947 DOI: 10.1152/ajpcell.00454.2020] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Abstract
Swelling-activated volume-regulated anion channels (VRACs) are heteromeric channels comprising LRRC8A and at least one other LRRC8 paralog. Cryoelectron microscopy (cryo-EM) structures of nonnative LRRC8A and LRRC8D homohexamers have been described. We demonstrate here that LRRC8A homohexamers poorly recapitulate VRAC functional properties. Unlike VRACs, LRRC8A channels heterologously expressed in Lrr8c-/- HCT116 cells are poorly activated by low intracellular ionic strength (µ) and insensitive to cell swelling with normal µ. Combining low µ with swelling modestly activates LRRC8A, allowing characterization of pore properties. VRACs are strongly inhibited by 10 µM 4-[(2-butyl-6,7-dichloro-2-cyclopentyl-2,3-dihydro-1-oxo-1H-inden-5-yl)oxy]butanoic acid (DCPIB) in a voltage-independent manner. In contrast, DCPIB block of LRRC8A is weak and voltage sensitive. Cryo-EM structures indicate that DCPIB block is dependent on arginine 103. Consistent with this, LRRC8A R103F mutants are insensitive to DCPIB. However, an LRRC8 chimeric channel in which R103 is replaced by a leucine at the homologous position is inhibited ∼90% by 10 µM DCPIB in a voltage-independent manner. Coexpression of LRRC8A and LRRC8C gives rise to channels with DCPIB sensitivity that is strongly µ dependent. At normal intracellular µ, LRRC8A + LRRC8C heteromers exhibit strong, voltage-independent DCPIB block that is insensitive to R103F. DCPIB inhibition is greatly reduced and exhibits voltage dependence with low intracellular µ. The R103F mutation has no effect on maximal DCPIB inhibition but eliminates voltage dependence under low µ conditions. Our findings demonstrate that the LRRC8A cryo-EM structure and the use of heterologously expressed LRRC8 heteromeric channels pose significant limitations for VRAC mutagenesis-based structure-function analysis. Native VRAC function is most closely mimicked by chimeric LRRC8 homomeric channels.
Affiliation(s)
- Toshiki Yamada
- Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee
- Eric E Figueroa
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
- Jerod S Denton
- Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
- Kevin Strange
- Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee
10
Fang C, Moriwaki Y, Li C, Shimizu K. MoRFPred_en: Sequence-based prediction of MoRFs using an ensemble learning strategy. J Bioinform Comput Biol 2020; 17:1940015. [PMID: 32019410 DOI: 10.1142/s0219720019400158] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Molecular recognition features (MoRFs) usually act as "hub" sites in the interaction networks of intrinsically disordered proteins (IDPs). Because an increasing number of serious diseases have been found to be associated with disordered proteins, identifying MoRFs has become increasingly important. In this study, we propose an ensemble learning strategy, named MoRFPred_en, to predict MoRFs from protein sequences. This approach combines four submodels that utilize different sequence-derived features for the prediction, including a multichannel one-dimensional convolutional neural network (CNN_1D multichannel) based model, two deep two-dimensional convolutional neural network (DCNN_2D) based models, and a support vector machine (SVM) based model. When compared with other methods on the same datasets, the MoRFPred_en approach produced better results than existing state-of-the-art MoRF prediction methods, achieving an AUC of 0.762 on the VALIDATION419 dataset, 0.795 on the TEST45 dataset, and 0.776 on the TEST49 dataset. Availability: http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/MoRFPred_en.php.
Affiliation(s)
- Chun Fang
- Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
- Yoshitaka Moriwaki
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
- Caihong Li
- Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
- Kentaro Shimizu
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
11
Abstract
Functions of intrinsically disordered proteins do not require structure. Such structure-independent functionality has melted away the classic rigid "lock and key" representation of structure-function relationships in proteins, opening a new page in protein science, where molten keys operate on melted locks and where conformational flexibility and intrinsic disorder, structural plasticity and extreme malleability, multifunctionality and binding promiscuity represent a new-fangled reality. Analysis and understanding of this new reality require novel tools, and some of the techniques elaborated for the examination of intrinsically disordered protein functions are outlined in this review.
Affiliation(s)
- Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33620, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Russian Federation
12
Zerze GH, Stillinger FH, Debenedetti PG. Computational investigation of retro-isomer equilibrium structures: Intrinsically disordered, foldable, and cyclic peptides. FEBS Lett 2019; 594:104-113. [PMID: 31356683 DOI: 10.1002/1873-3468.13558] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/12/2019] [Revised: 06/20/2019] [Accepted: 07/26/2019] [Indexed: 11/08/2022]
Abstract
We use all-atom modeling and advanced-sampling molecular dynamics simulations to investigate quantitatively the effect of peptide bond directionality on the equilibrium structures of four linear (two foldable, two disordered) and two cyclic peptides. We find that the retro forms of cyclic and foldable linear peptides adopt distinctively different conformations compared to their parents. While the retro form of a linear intrinsically disordered peptide with transient secondary structure fails to reproduce a secondary structure content similar to that of its parent, the retro form of a shorter disordered linear peptide shows only minor differences compared to its parent.
Affiliation(s)
- Gül H Zerze
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
- Pablo G Debenedetti
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ, USA
13
Katuwawala A, Ghadermarzi S, Kurgan L. Computational prediction of functions of intrinsically disordered regions. Progress in Molecular Biology and Translational Science 2019; 166:341-369. [PMID: 31521235 DOI: 10.1016/bs.pmbts.2019.04.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/20/2022]
Abstract
Intrinsically disordered regions (IDRs) are abundant in nature, particularly among Eukaryotes. While they facilitate a wide spectrum of cellular functions including signaling, molecular assembly and recognition, translation, transcription and regulation, only several hundred IDRs are annotated functionally. This annotation gap motivates the development of fast and accurate computational methods that predict IDR functions directly from protein sequences. We introduce and describe a comprehensive collection of 25 methods that provide accurate predictions of IDRs that interact with proteins and nucleic acids, that function as flexible linkers and that moonlight multiple functions. Virtually all of these predictors can be accessed online and many were developed in the last few years. They utilize a wide range of predictive architectures and take advantage of modern machine learning algorithms. Our empirical analysis shows that predictors that are available as webservers enjoy high rates of citations, attesting to their practical value and popularity. The most cited methods include DISOPRED3, ANCHOR, alpha-MoRFpred, MoRFpred, fMoRFpred and MoRFCHiBi. We present two case studies to demonstrate that predictions produced by these computational tools are relatively easy to interpret and that they deliver valuable functional clues. However, the current computational tools cover a relatively narrow range of disorder functions. Further development efforts that would cover a broader range of functions should be pursued. We demonstrate that a sufficient number of functionally annotated IDRs that are associated with several other disorder functions is already available and can be used to design and validate novel predictors.
Affiliation(s)
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Sina Ghadermarzi
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
14
Harnoš J, Cañizal MCA, Jurásek M, Kumar J, Holler C, Schambony A, Hanáková K, Bernatík O, Zdráhal Z, Gömöryová K, Gybeľ T, Radaszkiewicz TW, Kravec M, Trantírek L, Ryneš J, Dave Z, Fernández-Llamazares AI, Vácha R, Tripsianes K, Hoffmann C, Bryja V. Dishevelled-3 conformation dynamics analyzed by FRET-based biosensors reveals a key role of casein kinase 1. Nat Commun 2019; 10:1804. [PMID: 31000703 PMCID: PMC6472409 DOI: 10.1038/s41467-019-09651-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/16/2018] [Accepted: 03/20/2019] [Indexed: 01/17/2023]
Abstract
Dishevelled (DVL) is the key component of the Wnt signaling pathway. Currently, DVL conformational dynamics under native conditions is unknown. To overcome this limitation, we develop the Fluorescein Arsenical Hairpin Binder- (FlAsH-) based FRET in vivo approach to study DVL conformation in living cells. Using this single-cell FRET approach, we demonstrate that (i) Wnt ligands induce open DVL conformation, (ii) DVL variants that are predominantly open, show more even subcellular localization and more efficient membrane recruitment by Frizzled (FZD) and (iii) Casein kinase 1 ɛ (CK1ɛ) has a key regulatory function in DVL conformational dynamics. In silico modeling and in vitro biophysical methods explain how CK1ɛ-specific phosphorylation events control DVL conformations via modulation of the PDZ domain and its interaction with DVL C-terminus. In summary, our study describes an experimental tool for DVL conformational sampling in living cells and elucidates the essential regulatory role of CK1ɛ in DVL conformational dynamics.
Affiliation(s)
- Jakub Harnoš
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
  - Department of Cell, Developmental & Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Maria Consuelo Alonso Cañizal
  - Department of Pharmacology and Toxicology, University of Würzburg, Würzburg, 97078, Germany
  - Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Würzburg, 97078, Germany
  - Institute for Molecular Cell Biology, CMB-Center for Molecular Biomedicine, University Hospital Jena, Friedrich Schiller University Jena, Jena, 07745, Germany
- Miroslav Jurásek
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Jitender Kumar
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
- Cornelia Holler
  - Max Planck Institute for the Science of Light, Erlangen, 91058, Germany
  - Biology Department, Developmental Biology, Friedrich-Alexander University Erlangen-Nüremberg, Erlangen, 91058, Germany
- Alexandra Schambony
  - Max Planck Institute for the Science of Light, Erlangen, 91058, Germany
  - Biology Department, Developmental Biology, Friedrich-Alexander University Erlangen-Nüremberg, Erlangen, 91058, Germany
- Kateřina Hanáková
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Ondřej Bernatík
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Zbyněk Zdráhal
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Kristína Gömöryová
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Tomáš Gybeľ
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Marek Kravec
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Lukáš Trantírek
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, 612 65, Czech Republic
- Jan Ryneš
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
- Zankruti Dave
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Robert Vácha
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
- Konstantinos Tripsianes
  - CEITEC-Central European Institute of Technology, Masaryk University, Brno, 62500, Czech Republic
- Carsten Hoffmann
  - Department of Pharmacology and Toxicology, University of Würzburg, Würzburg, 97078, Germany
  - Rudolf Virchow Center for Experimental Biomedicine, University of Würzburg, Würzburg, 97078, Germany
  - Institute for Molecular Cell Biology, CMB-Center for Molecular Biomedicine, University Hospital Jena, Friedrich Schiller University Jena, Jena, 07745, Germany
- Vítězslav Bryja
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czech Republic
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, 612 65, Czech Republic
15
Katuwawala A, Peng Z, Yang J, Kurgan L. Computational Prediction of MoRFs, Short Disorder-to-order Transitioning Protein Binding Regions. Comput Struct Biotechnol J 2019; 17:454-462. [PMID: 31007871 PMCID: PMC6453775 DOI: 10.1016/j.csbj.2019.03.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 02/06/2019] [Revised: 03/22/2019] [Accepted: 03/23/2019] [Indexed: 12/28/2022]
Abstract
Molecular recognition features (MoRFs) are short protein-binding regions that undergo disorder-to-order transitions (induced folding) upon binding protein partners. These regions are abundant in nature and can be predicted from protein sequences based on their distinctive sequence signatures. This first-of-its-kind survey covers 14 MoRF predictors and six related methods for the prediction of short protein-binding linear motifs, disordered protein-binding regions and semi-disordered regions. We show that the development of MoRF predictors has accelerated in the recent years. These predictors depend on machine learning-derived models that were generated using training datasets where MoRFs are annotated using putative disorder. Our analysis reveals that they generate accurate predictions. We identified eight methods that offer area under the ROC curve (AUC) ≥ 0.7 on experimentally-validated test datasets. We show that modern MoRF predictors accurately find experimentally annotated MoRFs even though they were trained using the putative disorder annotations. They are relatively highly-cited, particularly the methods available as webservers that on average secure three times more citations than methods without this option. MoRF predictions contribute to the experimental discovery of protein-protein interactions, annotation of protein functions and computational analysis of a variety of proteomes, protein families, and pathways. We outline future development and application directions for these tools, stressing the importance to develop novel tools that would target interactions of disordered regions with other types of partners.
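As an illustration of the AUC ≥ 0.7 criterion mentioned in the abstract above, the following minimal sketch computes the area under the ROC curve from per-residue scores using the pairwise-ranking definition. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the survey; real benchmarks would use the experimentally validated test datasets the authors describe.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive residue outranks a random negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative residue")
    # Count pairwise wins of positives over negatives; ties count as half a win.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = annotated MoRF residue, 0 = any other residue.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.2f}, meets 0.7 threshold: {toy_auc >= 0.7}")
```

The quadratic pairwise loop is chosen for clarity; for proteome-scale evaluations a rank-based formulation would be the more practical choice.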
Affiliation(s)
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, USA
- Zhenling Peng
  - Center for Applied Mathematics, Tianjin University, Tianjin, China
- Jianyi Yang
  - School of Mathematical Sciences, Nankai University, Tianjin, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, USA
16
Fang C, Moriwaki Y, Tian A, Li C, Shimizu K. Identifying short disorder-to-order binding regions in disordered proteins with a deep convolutional neural network method. J Bioinform Comput Biol 2019; 17:1950004. [PMID: 30866736 DOI: 10.1142/s0219720019500045] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022]
Abstract
Molecular recognition features (MoRFs) are key functional regions of intrinsically disordered proteins (IDPs), which play important roles in the molecular interaction network of cells and are implicated in many serious human diseases. Identifying MoRFs is essential for both functional studies of IDPs and drug design. This study adopts the cutting-edge machine learning method of artificial intelligence to develop a powerful model for improving MoRFs prediction. We proposed a method, named as en_DCNNMoRF (ensemble deep convolutional neural network-based MoRF predictor). It combines the outcomes of two independent deep convolutional neural network (DCNN) classifiers that take advantage of different features. The first, DCNNMoRF1, employs position-specific scoring matrix (PSSM) and 22 types of amino acid-related factors to describe protein sequences. The second, DCNNMoRF2, employs PSSM and 13 types of amino acid indexes to describe protein sequences. For both single classifiers, DCNN with a novel two-dimensional attention mechanism was adopted, and an average strategy was added to further process the output probabilities of each DCNN model. Finally, en_DCNNMoRF combined the two models by averaging their final scores. When compared with other well-known tools applied to the same datasets, the accuracy of the novel proposed method was comparable with that of state-of-the-art methods. The related web server can be accessed freely via http://vivace.bi.a.u-tokyo.ac.jp:8008/fang/en_MoRFs.php .
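The ensemble step described in the abstract above (each classifier's output probabilities are post-processed by an averaging strategy, and the two final scores are then averaged) can be sketched as follows. This is only a sketch under stated assumptions: the abstract does not specify the averaging strategy, so a centered moving average with a hypothetical `window` parameter is used here, and the names `smooth` and `ensemble_scores` are illustrative rather than part of en_DCNNMoRF.

```python
from typing import List, Sequence


def smooth(scores: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average (assumed post-processing); the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def ensemble_scores(scores_1: Sequence[float],
                    scores_2: Sequence[float],
                    window: int = 3) -> List[float]:
    """Average the smoothed per-residue scores of two classifiers."""
    if len(scores_1) != len(scores_2):
        raise ValueError("both classifiers must score the same residues")
    s1 = smooth(scores_1, window)
    s2 = smooth(scores_2, window)
    return [(a + b) / 2.0 for a, b in zip(s1, s2)]


# Hypothetical per-residue probabilities from two classifiers for a 5-residue sequence.
classifier_1 = [0.2, 0.8, 0.9, 0.7, 0.1]
classifier_2 = [0.1, 0.6, 0.8, 0.9, 0.3]
print(ensemble_scores(classifier_1, classifier_2))
```

Averaging gives each classifier equal weight, which matches the description in the abstract; a weighted combination would be a different design and is not implied by the source.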
Affiliation(s)
- Chun Fang
  - Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
- Yoshitaka Moriwaki
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
- Aikui Tian
  - Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
- Caihong Li
  - Department of Computer Science and Engineering, Shandong University of Technology, Shandong 255049, P. R. China
- Kentaro Shimizu
  - Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657, Japan
17
Strange K, Yamada T, Denton JS. A 30-year journey from volume-regulated anion currents to molecular structure of the LRRC8 channel. J Gen Physiol 2019; 151:100-117. [PMID: 30651298 PMCID: PMC6363415 DOI: 10.1085/jgp.201812138] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 11/29/2018] [Accepted: 01/03/2019] [Indexed: 12/18/2022]
Abstract
Strange et al. review recent advances in our understanding of the molecular and structural basis of volume-regulated anion channel function within the framework of classical biophysical and physiological studies. The swelling-activated anion channel VRAC has fascinated and frustrated physiologists since it was first described in 1988. Multiple laboratories have defined VRAC’s biophysical properties and have shown that it plays a central role in cell volume regulation and possibly other fundamental physiological processes. However, confusion and intense controversy surrounding the channel’s molecular identity greatly hindered progress in the field for >15 yr. A major breakthrough came in 2014 with the demonstration that VRAC is a heteromeric channel encoded by five members of the Lrrc8 gene family, Lrrc8A–E. A mere 4 yr later, four laboratories described cryo-EM structures of LRRC8A homomeric channels. As the melee of structure/function and physiology studies begins, it is critical that this work be framed by a clear understanding of VRAC biophysics, regulation, and cellular physiology as well as by the field’s past confusion and controversies. That understanding is essential for the design and interpretation of structure/function studies, studies of VRAC physiology, and studies aimed at addressing the vexing problem of how the channel detects cell volume changes. In this review we discuss key aspects of VRAC biophysics, regulation, and function and integrate these into our emerging understanding of LRRC8 protein structure/function.
Affiliation(s)
- Kevin Strange
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN
  - Novo Biosciences, Inc., Bar Harbor, ME
- Toshiki Yamada
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN
- Jerod S Denton
  - Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN
18
Abstract
Intrinsically disordered proteins and regions are involved in a wide range of cellular functions, and they often facilitate protein-protein interactions. Molecular recognition features (MoRFs) are segments of intrinsically disordered regions that bind to partner proteins, where binding is concomitant with a transition to a structured conformation. MoRFs facilitate translation, transport, signaling, and regulatory processes and are found across all domains of life. A popular computational tool, MoRFpred, accurately predicts MoRFs in protein sequences. MoRFpred is implemented as a user-friendly web server that is freely available at http://biomine.cs.vcu.edu/servers/MoRFpred/ . We describe this predictor, explain how to run the web server, and show how to interpret the results it generates. We also demonstrate the utility of this web server based on two case studies, focusing on the relevance of evolutionary conservation of MoRF regions.
Affiliation(s)
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Moscow Region, Russia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
19
Oldfield CJ, Uversky VN, Dunker AK, Kurgan L. Introduction to intrinsically disordered proteins and regions. Proteins 2019. [DOI: 10.1016/b978-0-12-816348-1.00001-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022]
20
Sharma R, Sharma A, Raicar G, Tsunoda T, Patil A. OPAL+: Length‐Specific MoRF Prediction in Intrinsically Disordered Protein Sequences. Proteomics 2018; 19:e1800058. [DOI: 10.1002/pmic.201800058] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/13/2018] [Revised: 10/10/2018] [Indexed: 11/09/2022]
Affiliation(s)
- Ronesh Sharma
  - School of Engineering and Physics, The University of the South Pacific, Suva, Fiji
  - School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji
- Alok Sharma
  - School of Engineering and Physics, The University of the South Pacific, Suva, Fiji
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230‐0045, Japan
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo 113–8510, Japan
  - Institute for Integrated and Intelligent Systems, Griffith University, Nathan, Brisbane, QLD, Australia
- Gaurav Raicar
  - School of Engineering and Physics, The University of the South Pacific, Suva, Fiji
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230‐0045, Japan
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo 113–8510, Japan
  - CREST, JST, Tokyo 113–8510, Japan
- Ashwini Patil
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo 108–8639, Japan
21
Yamada T, Strange K. Intracellular and extracellular loops of LRRC8 are essential for volume-regulated anion channel function. J Gen Physiol 2018; 150:1003-1015. [PMID: 29853476 PMCID: PMC6028502 DOI: 10.1085/jgp.201812016] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 01/30/2018] [Revised: 04/02/2018] [Accepted: 05/01/2018] [Indexed: 12/13/2022]
Abstract
The volume-regulated anion channel (VRAC) is expressed ubiquitously in vertebrate cells and mediates swelling-induced release of Cl- and organic solutes. Recent studies by several laboratories have demonstrated conclusively that VRAC is encoded by members of the leucine-rich repeat containing 8 (Lrrc8) gene family, which comprises five members, termed Lrrc8a-e. Numerous observations indicate that VRAC is a heteromeric channel comprising the essential subunit LRRC8A and one or more of the other LRRC8 paralogs. Here we demonstrate that the intracellular loop (IL) connecting transmembrane domains 2 and 3 of LRRC8A and the first extracellular loop (EL1) connecting transmembrane domains 1 and 2 of LRRC8C, LRRC8D, or LRRC8E are both essential for VRAC activity. We generate homomeric VRACs by replacing EL1 of LRRC8A with that of LRRC8C and demonstrate normal regulation by cell swelling and shrinkage. We also observe normal volume-dependent regulation in VRAC homomers in which the IL of LRRC8C, LRRC8D, or LRRC8E is replaced with the LRRC8A IL. A 25-amino acid sequence unique to the LRRC8A IL is sufficient to generate homomeric VRAC activity when inserted into the corresponding region of LRRC8C and LRRC8E. LRRC8 chimeras containing these partial LRRC8A IL sequences exhibit altered anion permeability, rectification, and voltage sensitivity, suggesting that the LRRC8A IL plays a role in VRAC pore structure and function. Our studies provide important new insights into the structure/function roles of the LRRC8 EL1 and IL. Homomeric LRRC8 channels will simplify future studies aimed at understanding channel structure and the longstanding and vexing problem of how VRAC is regulated by cell volume changes.
22
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight the newest methods and the predictors that offer strong predictive performance in recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to the end users, via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict the many not yet addressed functions of intrinsic disorder remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Vladimir N Uversky
  - Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, USA
23
Malhis N, Jacobson M, Gsponer J. MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences. Nucleic Acids Res 2016; 44:W488-93. [PMID: 27174932 PMCID: PMC4987941 DOI: 10.1093/nar/gkw409] [Citation(s) in RCA: 103] [Impact Index Per Article: 12.9] [Received: 02/29/2016] [Accepted: 05/03/2016] [Indexed: 11/13/2022]
Abstract
Molecular recognition features, MoRFs, are short segments within longer disordered protein regions that bind to globular protein domains in a process known as disorder-to-order transition. MoRFs have been found to play a significant role in signaling and regulatory processes in cells. High-confidence computational identification of MoRFs remains an important challenge. In this work, we introduce MoRFchibi SYSTEM that contains three MoRF predictors: MoRFCHiBi, a basic predictor best suited as a component in other applications, MoRFCHiBi_Light, ideal for high-throughput predictions and MoRFCHiBi_Web, slower than the other two but best for high accuracy predictions. Results show that MoRFchibi SYSTEM provides more than double the precision of other predictors. MoRFchibi SYSTEM is available in three different forms: as HTML web server, RESTful web server and downloadable software at: http://www.chibi.ubc.ca/faculty/joerg-gsponer/gsponer-lab/software/morf_chibi/.
Affiliation(s)
- Nawar Malhis
  - Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Matthew Jacobson
  - Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Jörg Gsponer
  - Michael Smith Laboratories-Centre for High-Throughput Biology, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
24
Yuan J, Xue B. Role of structural flexibility in the evolution of emerin. J Theor Biol 2015; 385:102-11. [PMID: 26319992 DOI: 10.1016/j.jtbi.2015.08.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/08/2015] [Revised: 08/07/2015] [Accepted: 08/17/2015] [Indexed: 02/07/2023]
Abstract
Emerin is a short inner nuclear membrane protein with an LEM-domain at the N-terminal end and a transmembrane domain at the C-terminal end. The middle region of human emerin contains multiple binding motifs. Since emerin is often found in evolutionarily newer species, the functional conservation of emerin becomes an interesting topic. In this study, we have demonstrated that most of the functional motifs of emerin are intrinsically disordered or highly flexible. Many post-translational modification sites and mutation sites are associated with these disordered regions. The averaged substitution rates of most functional motifs between species correlate positively with the averaged disorder scores of those functional motifs. The human emerin sequence may have acquired new protein-protein interaction functions through the formation of hydrophobic motifs in the middle region, which resulted from mutations accumulated during evolution.
Affiliation(s)
- Jia Yuan
  - Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, 4202 E. Fowler Ave, ISA 2015, Tampa, FL 33620, USA
- Bin Xue
  - Department of Cell Biology, Microbiology and Molecular Biology, School of Natural Sciences and Mathematics, College of Arts and Sciences, University of South Florida, 4202 E. Fowler Ave, ISA 2015, Tampa, FL 33620, USA
25
Identifying Similar Patterns of Structural Flexibility in Proteins by Disorder Prediction and Dynamic Programming. Int J Mol Sci 2015; 16:13829-49. [PMID: 26086829 PMCID: PMC4490526 DOI: 10.3390/ijms160613829] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/01/2015] [Revised: 06/03/2015] [Accepted: 06/05/2015] [Indexed: 12/31/2022]
Abstract
Computational methods are prevailing in identifying protein intrinsic disorder. The results from predictors are often given as per-residue disorder scores. The scores describe the disorder propensity of amino acids of a protein and can be further represented as a disorder curve. Many proteins share similar patterns in their disorder curves. The similar patterns are often associated with similar functions and evolutionary origins. Therefore, finding and characterizing specific patterns of disorder curves provides a unique and attractive perspective of studying the function of intrinsically disordered proteins. In this study, we developed a new computational tool named IDalign using dynamic programming. This tool is able to identify similar patterns among disorder curves, as well as to present the distribution of intrinsic disorder in query proteins. The disorder-based information generated by IDalign is significantly different from the information retrieved from classical sequence alignments. This tool can also be used to infer functions of disordered regions and disordered proteins. The web server of IDalign is available at (http://labs.cas.usf.edu/bioinfo/service.html).
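To make the idea of comparing disorder curves by dynamic programming concrete, here is a minimal sketch of a global (Needleman-Wunsch style) alignment of two per-residue disorder score profiles. It is not IDalign's actual algorithm: the similarity function `1 - 2*|a - b|`, the gap penalty of -0.5, and the function name `align_disorder_curves` are assumptions chosen only for illustration.

```python
from typing import Sequence


def align_disorder_curves(x: Sequence[float],
                          y: Sequence[float],
                          gap: float = -0.5) -> float:
    """Return the best global alignment score between two disorder profiles.

    Scores in x and y are assumed to lie in [0, 1]. Aligning two residues
    earns 1 - 2*|a - b| (from +1 for identical scores down to -1 for
    opposite ones); skipping a residue costs the gap penalty.
    """
    n, m = len(x), len(y)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            similarity = 1.0 - 2.0 * abs(x[i - 1] - y[j - 1])
            dp[i][j] = max(dp[i - 1][j - 1] + similarity,  # align residue i with j
                           dp[i - 1][j] + gap,             # gap in y
                           dp[i][j - 1] + gap)             # gap in x
    return dp[n][m]


# Hypothetical disorder profiles: the second has one extra residue in the middle.
profile_a = [0.9, 0.1]
profile_b = [0.9, 0.5, 0.1]
print(align_disorder_curves(profile_a, profile_b))
```

Because the alignment operates on disorder scores rather than residue identities, two sequences with little sequence similarity can still align well, which is consistent with the abstract's point that disorder-based information differs from classical sequence alignment.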
26
Malhis N, Gsponer J. Computational identification of MoRFs in protein sequences. Bioinformatics 2015; 31:1738-44. [PMID: 25637562 DOI: 10.1093/bioinformatics/btv060] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Received: 10/21/2014] [Accepted: 01/25/2015] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is the binding of molecular recognition features (MoRFs) to globular protein domains in a process known as a disorder-to-order transition. Predicting the location of MoRFs in protein sequences with high accuracy remains an important computational challenge. METHOD: In this study, we introduce MoRFCHiBi, a new computational approach for fast and accurate prediction of MoRFs in protein sequences. MoRFCHiBi combines the outcomes of two support vector machine (SVM) models that take advantage of two different kernels with high noise tolerance. The first, SVMS, is designed to extract maximal information from the general contrast in amino acid compositions between MoRFs, their surrounding regions (Flanks), and the remainders of the sequences. The second, SVMT, is used to identify similarities between regions in a query sequence and MoRFs of the training set. RESULTS: We evaluated the performance of our predictor by comparing its results with those of two currently available MoRF predictors, MoRFpred and ANCHOR. Using three test sets that have previously been collected and used to evaluate MoRFpred and ANCHOR, we demonstrate that MoRFCHiBi outperforms the other predictors with respect to different evaluation metrics. In addition, MoRFCHiBi is downloadable and fast, which makes it useful as a component in other computational prediction tools. AVAILABILITY AND IMPLEMENTATION: http://www.chibi.ubc.ca/morf/.
Affiliation(s)
- Nawar Malhis
  - Centre for High-Throughput Biology and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Jörg Gsponer
  - Centre for High-Throughput Biology and Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
27
DBC1/CCAR2 and CCAR1 Are Largely Disordered Proteins that Have Evolved from One Common Ancestor. BIOMED RESEARCH INTERNATIONAL 2014; 2014:418458. [PMID: 25610865 PMCID: PMC4287135 DOI: 10.1155/2014/418458] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 06/11/2014] [Revised: 09/18/2014] [Accepted: 09/18/2014] [Indexed: 01/07/2023]
Abstract
Deleted in breast cancer 1 (DBC1, CCAR2, KIAA1967) is a large, predominantly nuclear, multidomain protein that modulates gene expression by inhibiting several epigenetic modifiers, including the deacetylases SIRT1 and HDAC3, and the methyltransferase SUV39H1. DBC1 shares many highly conserved protein domains with its paralog cell cycle and apoptosis regulator 1 (CCAR1, CARP-1). In this study, we examined the full-length sequential and structural properties of DBC1 and CCAR1 from multiple species and correlated these properties with evolution. Our data shows that the conserved domains shared between DBC1 and CCAR1 have similar domain structures, as well as similar patterns of predicted disorder in less-conserved intrinsically disordered regions. Our analysis indicates similarities between DBC1, CCAR1, and the nematode protein lateral signaling target 3 (LST-3), suggesting that DBC1 and CCAR1 may have evolved from LST-3. Our data also suggests that DBC1 emerged later in evolution than CCAR1. DBC1 contains regions that show less conservation across species as compared to the same regions in CCAR1, suggesting a continuously evolving scenario for DBC1. Overall, this study provides insight into the structure and evolution of DBC1 and CCAR1, which may impact future studies on the biological functions of these proteins.
28
The role of the N-terminal tail for the oligomerization, folding and stability of human frataxin. FEBS Open Bio 2013; 3:310-20. [PMID: 23951553 PMCID: PMC3741918 DOI: 10.1016/j.fob.2013.07.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/05/2013] [Revised: 07/10/2013] [Accepted: 07/15/2013] [Indexed: 01/30/2023]
Abstract
The N-terminal stretch of human frataxin (hFXN) intermediate (residues 42–80) is not conserved throughout evolution and, under defined experimental conditions, behaves as a random-coil. Overexpression of hFXN56–210 in Escherichia coli yields a multimer, whereas the mature form of hFXN (hFXN81–210) is monomeric. Thus, cumulative experimental evidence points to the N-terminal moiety as an essential element for the assembly of a high molecular weight oligomer. The secondary structure propensity of peptide 56–81, the moiety putatively responsible for promoting protein–protein interactions, was also studied. Depending on the environment (TFE or SDS), this peptide adopts α-helical or β-strand structure. In this context, we explored the conformation and stability of hFXN56–210. The biophysical characterization by fluorescence, CD and SEC-FPLC shows that subunits are well folded, sharing similar stability to hFXN90–210. However, controlled proteolysis indicates that the N-terminal stretch is labile in the context of the multimer, whereas the FXN domain (residues 81–210) remains strongly resistant. In addition, guanidine hydrochloride at low concentration disrupts intermolecular interactions, shifting the ensemble toward the monomeric form. The conformational plasticity of the N-terminal tail might impart on hFXN the ability to act as a recognition signal as well as an oligomerization trigger. Understanding the fine-tuning of these activities and their resulting balance will bear direct relevance for ultimately comprehending hFXN function. hFXN56–210 is well-folded and shares similar stability to hFXN90–210. The oligomeric form of hFXN56–210 can be disassembled and reassembled in vitro. Proteolysis leads to the oligomer disassembly: subunits are abridged to hFXN81–210. Isolated peptide hFXN56–81 acquires structure in TFE and SDS solutions. The N-terminal tail is structurally malleable and triggers oligomerization.
|
29
|
Kutyshenko VP, Prokhorov DA, Molochkov NV, Sharapov MG, Kolesnikov I, Uversky VN. Dancing retro: solution structure and micelle interactions of the retro-SH3-domain, retro-SHH-'Bergerac'. J Biomol Struct Dyn 2013; 32:257-72. [PMID: 23527530 DOI: 10.1080/07391102.2012.762724] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Abstract
A protein with the reversed direction of its polypeptide chain, retro-SHH, was analyzed by several spectroscopic techniques, including circular dichroism and high-resolution NMR, to understand its solution structure and the structural consequences of its interaction with micelles formed by the zwitterionic detergent dodecylphosphocholine (DPC). This analysis revealed that retro-SHH does not contain a rigid 3-D structure but is characterized by the presence of residual secondary structure. Intriguingly, interaction with the DPC micelles affected the structures of SHH and retro-SHH very differently: micelles induce pronounced folding of retro-SHH, whereas micelle-bound SHH was noticeably disordered. Finally, we performed a disorder prediction with the PONDR-FIT algorithm and found that reversing the chain direction barely affects the propensity of a polypeptide for intrinsic disorder, since the disorder plot for retro-SHH was almost a mirror image of that for the normal SHH.
Affiliation(s)
- Victor P Kutyshenko
- a Institute of Theoretical and Experimental Biophysics of Russian Academy of Science , Pushchino , Moscow Region , 142290 , Russia
|
30
|
Xue B, Dunker AK, Uversky VN. The roles of intrinsic disorder in orchestrating the Wnt-pathway. J Biomol Struct Dyn 2012; 29:843-61. [PMID: 22292947 DOI: 10.1080/073911012010525024] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/27/2022]
Abstract
The canonical Wnt-pathway plays a number of crucial roles in the development of organisms. Malfunctions of this pathway lead to various diseases, including cancer. In the inactivated state, this pathway involves five proteins: Axin, CKI-α, GSK-3β, APC, and β-catenin. We analyzed these proteins with a number of computational tools, such as PONDR® VLXT, PONDR® VSL2, the MoRF-II predictor, and Hydrophobic Cluster Analysis (HCA), to show that each of the Wnt-pathway proteins contains several intrinsically disordered regions. Based on a comprehensive analysis of published data, we conclude that these disordered regions facilitate protein-protein interactions, post-translational modifications, and signaling. The scaffold protein Axin and another large protein, APC, act as flexible concentrators in gathering together all other proteins involved in the Wnt-pathway, emphasizing the role of intrinsically disordered regions in orchestrating the complex protein-protein interactions. We further explore the intricate roles of highly disordered APC in the regulation of β-catenin function. Intrinsically disordered APC helps collect β-catenin from the cytoplasm, facilitates β-catenin delivery to the binding sites on Axin, and controls the final detachment of β-catenin from Axin.
Affiliation(s)
- Bin Xue
- Department of Molecular Medicine, University of South Florida, Tampa, FL 33612, USA.
|
31
|
Weatheritt RJ, Jehl P, Dinkel H, Gibson TJ. iELM--a web server to explore short linear motif-mediated interactions. Nucleic Acids Res 2012; 40:W364-9. [PMID: 22638578 PMCID: PMC3394315 DOI: 10.1093/nar/gks444] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 01/18/2023]
Abstract
The recent expansion in our knowledge of protein–protein interactions (PPIs) has allowed the annotation and prediction of hundreds of thousands of interactions. However, the function of many of these interactions remains elusive. The interactions of Eukaryotic Linear Motif (iELM) web server provides a resource for predicting the function and positional interface for a subset of interactions mediated by short linear motifs (SLiMs). The iELM prediction algorithm is based on the annotated SLiM classes from the Eukaryotic Linear Motif (ELM) resource and allows users to explore both annotated and user-generated PPI networks for SLiM-mediated interactions. By incorporating the annotated information from the ELM resource, iELM provides functional details of PPIs. This can be used in proteomic analysis, for example, to infer whether an interaction promotes complex formation or degradation. Furthermore, details of the molecular interface of the SLiM-mediated interactions are also predicted. This information is displayed in a fully searchable table, as well as graphically with the modular architecture of the participating proteins extracted from the UniProt and Phospho.ELM resources. A network figure is also presented to aid the interpretation of results. The iELM server supports single protein queries as well as large-scale proteomic submissions and is freely available at http://i.elm.eu.org.
Affiliation(s)
- Robert J Weatheritt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
|
32
|
Edwards RJ, Davey NE, O'Brien K, Shields DC. Interactome-wide prediction of short, disordered protein interaction motifs in humans. MOLECULAR BIOSYSTEMS 2011; 8:282-95. [PMID: 21879107 DOI: 10.1039/c1mb05212h] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022]
Abstract
Many of the specific functions of intrinsically disordered protein segments are mediated by Short Linear Motifs (SLiMs) interacting with other proteins. Well known examples include SLiMs that interact with 14-3-3, PDZ, SH2, SH3, and WW domains but the true extent and diversity of SLiM-mediated interactions is largely unknown. Here, we attempt to expand our knowledge of human SLiMs by applying in silico SLiM prediction to the human interactome. Combining data from seven different interaction databases, we analysed approximately 6000 protein-centred and 1600 domain-centred human interaction datasets of 3+ unrelated proteins that interact with a common partner. Results were placed in context through comparison to randomised datasets of similar size and composition. The search returned thousands of evolutionarily conserved, intrinsically disordered occurrences of hundreds of significantly enriched recurring motifs, including many that have never been previously identified. In addition to True Positive results for at least 25 different known SLiMs, a striking number of "off-target" proteins/domains also returned significantly enriched known motifs. Often, this was due to the non-independence of the datasets, with many proteins sharing interaction partners or contributing interactions to multiple domain datasets. The majority of these motif classes, however, were also found to be significantly enriched in one or more randomised datasets. This highlights the need for care when interpreting motif predictions of this nature but also raises the possibility that SLiM occurrences may be successfully identified independently of interaction data. Although not as compositionally biased as previous studies, patterns matching known SLiMs tended to cluster into a few large groups of similar sequence, while novel predictions tended to be more distinctive and less abundant. Whether this is due to ascertainment bias or a true functional composition bias of SLiMs is not clear and warrants further investigation.
Affiliation(s)
- Richard J Edwards
- Centre for Biological Sciences, University of Southampton, Southampton, UK.
|
33
|
Xue B, Oldfield CJ, Van YY, Dunker AK, Uversky VN. Protein intrinsic disorder and induced pluripotent stem cells. MOLECULAR BIOSYSTEMS 2011; 8:134-50. [PMID: 21761058 DOI: 10.1039/c1mb05163f] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 11/21/2022]
Abstract
Induced pluripotent stem (iPS) cells can be obtained from terminally differentiated somatic cells by overexpression of defined sets of reprogramming transcription factors. These protein sets have been called the Yamanaka factors, namely Sox2, Oct3/4 (Pou5f1), Klf4, and c-Myc, and the Thomson factors, namely Sox2, Oct3, Lin28, and Nanog. Other sets of proteins, while not essential for the formation of iPS cells, are important for improving the efficiency of the induction and still other sets of proteins are important as markers for embryonic stem cells. Structural information about most of these important proteins is very sparse. Our bioinformatics analysis herein reveals that these reprogramming factors and most of the efficiency-improving and embryonic stem cell markers are highly enriched in intrinsic disorder. As is typical for transcription factors, these proteins are modular. Specific sites for interaction with other proteins and DNA are dispersed in the long regions of intrinsic disorder. These highly dynamic interaction sites are evidently responsible for the delicate interplay among various molecules. The bioinformatics analysis given herein should facilitate the investigation of the roles and organization of these modular interaction sites, thereby helping to shed further light on the pathways that underlie the mechanism(s) by which terminally differentiated cells are converted to iPS cells.
Affiliation(s)
- Bin Xue
- Department of Molecular Medicine, College of Medicine, University of South Florida, Tampa, Florida 33612, USA.
|
34
|
Abstract
MOTIVATION: Predictions, and experiments to a lesser extent, following the decoding of the human genome showed that a significant fraction of gene products do not have well-defined 3D structures. While the presence of structured domains traditionally suggested function, it was not clear what the absence of structure implied. These and many other findings initiated the extensive theoretical and experimental research into these types of proteins, commonly known as intrinsically disordered proteins (IDPs). Crucial to understanding IDPs is the evaluation of structural predictors based on different principles and trained on various datasets, which is currently the subject of active research. The view is emerging that structural disorder can be considered as a separate structural category and not simply as absence of secondary and/or tertiary structure. IDPs perform essential functions and their improper functioning is responsible for human diseases such as neurodegenerative disorders.
Affiliation(s)
- Ferenc Orosz
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Karolina út 29, Budapest, H-1113 Hungary.
|
35
|
Uversky VN. Intrinsically disordered proteins from A to Z. Int J Biochem Cell Biol 2011; 43:1090-103. [PMID: 21501695 DOI: 10.1016/j.biocel.2011.04.001] [Citation(s) in RCA: 327] [Impact Index Per Article: 25.2] [Received: 02/08/2011] [Revised: 03/31/2011] [Accepted: 04/01/2011] [Indexed: 01/13/2023]
Abstract
The ideas that proteins might possess specific functions without being uniquely folded into rigid 3D-structures and that these floppy polypeptides might constitute a noticeable part of any given proteome would have been considered as a preposterous fiction 15 or even 10 years ago. The situation has changed recently, and the existence of functional yet intrinsically disordered proteins and regions has become accepted by a significant number of protein scientists. These fuzzy objects with fuzzy structures and fuzzy functions are among the most interesting and attractive targets for modern protein research. This review summarizes some of the major discoveries and breakthroughs in the field of intrinsic disorder by representing related concepts and definitions.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine, University of South Florida, FL 33612, USA.
|