1
Volzhenin K, Bittner L, Carbone A. SENSE-PPI reconstructs interactomes within, across, and between species at the genome scale. iScience 2024; 27:110371. [PMID: 39055916 PMCID: PMC11269938 DOI: 10.1016/j.isci.2024.110371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 05/04/2024] [Accepted: 06/21/2024] [Indexed: 07/28/2024] Open
Abstract
Ab initio computational reconstructions of protein-protein interaction (PPI) networks will provide invaluable insights into cellular systems, enabling the discovery of novel molecular interactions and elucidating biological mechanisms within and between organisms. Leveraging the latest generation protein language models and recurrent neural networks, we present SENSE-PPI, a sequence-based deep learning model that efficiently reconstructs ab initio PPIs, distinguishing partners among tens of thousands of proteins and identifying specific interactions within functionally similar proteins. SENSE-PPI demonstrates high accuracy, limited training requirements, and versatility in cross-species predictions, even with non-model organisms and human-virus interactions. Its performance decreases for phylogenetically more distant model and non-model organisms, but signal alteration is very slow. In this regard, it demonstrates the important role of parameters in protein language models. SENSE-PPI is very fast and can test 10,000 proteins against themselves in a matter of hours, enabling the reconstruction of genome-wide proteomes.
Affiliation(s)
- Konstantin Volzhenin
  - Sorbonne Université, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Lucie Bittner
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
  - Institut Universitaire de France, Paris, France
- Alessandra Carbone
  - Sorbonne Université, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Institut Universitaire de France, Paris, France
2
Maneix L, Iakova P, Lee CG, Moree SE, Lu X, Datar GK, Hill CT, Spooner E, King JCK, Sykes DB, Saez B, Di Stefano B, Chen X, Krause DS, Sahin E, Tsai FTF, Goodell MA, Berk BC, Scadden DT, Catic A. Cyclophilin A supports translation of intrinsically disordered proteins and affects haematopoietic stem cell ageing. Nat Cell Biol 2024; 26:593-603. [PMID: 38553595 PMCID: PMC11021199 DOI: 10.1038/s41556-024-01387-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 02/23/2024] [Indexed: 04/11/2024]
Abstract
Loss of protein function is a driving force of ageing. We have identified peptidyl-prolyl isomerase A (PPIA or cyclophilin A) as a dominant chaperone in haematopoietic stem and progenitor cells. Depletion of PPIA accelerates stem cell ageing. We found that proteins with intrinsically disordered regions (IDRs) are frequent PPIA substrates. IDRs facilitate interactions with other proteins or nucleic acids and can trigger liquid-liquid phase separation. Over 20% of PPIA substrates are involved in the formation of supramolecular membrane-less organelles. PPIA affects regulators of stress granules (PABPC1), P-bodies (DDX6) and nucleoli (NPM1) to promote phase separation and increase cellular stress resistance. Haematopoietic stem cell ageing is associated with a post-transcriptional decrease in PPIA expression and reduced translation of IDR-rich proteins. Here we link the chaperone PPIA to the synthesis of intrinsically disordered proteins, which indicates that impaired protein interaction networks and macromolecular condensation may be potential determinants of haematopoietic stem cell ageing.
Affiliation(s)
- Laure Maneix
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- Polina Iakova
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- Charles G Lee
  - Department of BioSciences, Rice University, Houston, TX, USA
- Shannon E Moree
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- Xuan Lu
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Gandhar K Datar
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Cedric T Hill
  - Center for Regenerative Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Eric Spooner
  - Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Jordon C K King
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- David B Sykes
  - Center for Regenerative Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Borja Saez
  - Center for Applied Medical Research, Hematology-Oncology Unit, Pamplona, Navarra, Spain
- Bruno Di Stefano
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- Xi Chen
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Daniela S Krause
  - Georg-Speyer-Haus, Institute for Tumor Biology and Experimental Therapy, Frankfurt am Main, Germany
- Ergun Sahin
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, USA
- Francis T F Tsai
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA
- Margaret A Goodell
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
- Bradford C Berk
  - Department of Medicine, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- David T Scadden
  - Center for Regenerative Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- André Catic
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, USA
  - Stem Cells and Regenerative Medicine Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
  - Cell and Gene Therapy Program at the Dan L. Duncan Comprehensive Cancer Center, Houston, TX, USA
  - Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX, USA
3
Bakhtiar D, Vondraskova K, Pengelly RJ, Chivers M, Kralovicova J, Vorechovsky I. Exonic splicing code and coordination of divalent metals in proteins. Nucleic Acids Res 2024; 52:1090-1106. [PMID: 38055834 PMCID: PMC10853796 DOI: 10.1093/nar/gkad1161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 11/15/2023] [Accepted: 11/17/2023] [Indexed: 12/08/2023] Open
Abstract
Exonic sequences contain both protein-coding and RNA splicing information but the interplay of the protein and splicing code is complex and poorly understood. Here, we have studied traditional and auxiliary splicing codes of human exons that encode residues coordinating two essential divalent metals at the opposite ends of the Irving-Williams series, a universal order of relative stabilities of metal-organic complexes. We show that exons encoding Zn2+-coordinating amino acids are supported much less by the auxiliary splicing motifs than exons coordinating Ca2+. The handicap of the former is compensated by stronger splice sites and uridine-richer polypyrimidine tracts, except for position -3 relative to 3' splice junctions. However, both Ca2+ and Zn2+ exons exhibit close-to-constitutive splicing in multiple tissues, consistent with their critical importance for metalloprotein function and a relatively small fraction of expendable, alternatively spliced exons. These results indicate that constraints imposed by metal coordination spheres on RNA splicing have been efficiently overcome by the plasticity of exon-intron architecture to ensure adequate metalloprotein expression.
Affiliation(s)
- Dara Bakhtiar
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Katarina Vondraskova
  - Slovak Academy of Sciences, Centre of Biosciences, 840 05 Bratislava, Slovak Republic
- Reuben J Pengelly
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Martin Chivers
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Jana Kralovicova
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
  - Slovak Academy of Sciences, Centre of Biosciences, 840 05 Bratislava, Slovak Republic
- Igor Vorechovsky
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
4
Munro TA. Reanalysis of a μ opioid receptor crystal structure reveals a covalent adduct with BU72. BMC Biol 2023; 21:213. [PMID: 37817141 PMCID: PMC10566028 DOI: 10.1186/s12915-023-01689-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 08/25/2023] [Indexed: 10/12/2023] Open
Abstract
BACKGROUND The first crystal structure of the active μ opioid receptor (μOR) exhibited several unexplained features. The ligand BU72 exhibited many extreme deviations from ideal geometry, along with unexplained electron density. I previously showed that inverting the benzylic configuration resolved these problems, establishing revised stereochemistry of BU72 and its analog BU74. However, another problem remains unresolved: additional unexplained electron density contacts both BU72 and a histidine residue in the N-terminus, revealing the presence of an as-yet unidentified atom. RESULTS These short contacts and uninterrupted density are inconsistent with non-covalent interactions. Therefore, BU72 and μOR form a covalent adduct, rather than representing two separate entities as in the original model. A subsequently proposed magnesium complex is inconsistent with multiple lines of evidence. However, oxygen fits the unexplained density well. While the structure I propose is tentative, similar adducts have been reported previously in the presence of reactive oxygen species. Moreover, known sources of reactive oxygen species were present: HEPES buffer, nickel ions, and a sequence motif that forms redox-active nickel complexes. This motif contacts the unexplained density. The adduct exhibits severe strain, and the tethered N-terminus forms contacts with adjacent residues. These forces, along with the nanobody used as a G protein substitute, would be expected to influence the receptor conformation. Consistent with this, the intracellular end of the structure differs markedly from subsequent structures of active μOR bound to Gi protein. CONCLUSIONS Later Gi-bound structures are likely to be more accurate templates for ligand docking and modelling of active G protein-bound μOR. The possibility of reactions like this should be considered in the choice of protein truncation sites and purification conditions, and in the interpretation of excess or unexplained density.
Affiliation(s)
- Thomas A Munro
  - School of Life and Environmental Sciences, Deakin University, Burwood, VIC, 3125, Australia
5
Subedi S, Nag N, Shukla H, Padhi AK, Tripathi T. Comprehensive analysis of liquid-liquid phase separation propensities of HSV-1 proteins and their interaction with host factors. J Cell Biochem 2023. [PMID: 37796176 DOI: 10.1002/jcb.30480] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/28/2023] [Revised: 09/08/2023] [Accepted: 09/17/2023] [Indexed: 10/06/2023]
Abstract
In recent years, it has been shown that the liquid-liquid phase separation (LLPS) of virus proteins plays a crucial role in their life cycle. It promotes the formation of viral replication organelles, concentrating viral components for efficient replication and facilitates the assembly of viral particles. LLPS has emerged as a crucial process in the replication and assembly of herpes simplex virus-1 (HSV-1). Recent studies have identified several HSV-1 proteins involved in LLPS, including the myristylated tegument protein UL11 and infected cell protein 4; however, a complete proteome-level understanding of the LLPS-prone HSV-1 proteins is not available. We provide a comprehensive analysis of the HSV-1 proteome and explore the potential of its proteins to undergo LLPS. By integrating sequence analysis, prediction algorithms and an array of tools and servers, we identified 10 HSV-1 proteins that exhibit high LLPS potential. By analysing the amino acid sequences of the LLPS-prone proteins, we identified specific sequence motifs and enriched amino acid residues commonly found in LLPS-prone regions. Our findings reveal a diverse range of LLPS-prone proteins within the HSV-1, which are involved in critical viral processes such as replication, transcriptional regulation and assembly of viral particles. This suggests that LLPS might play a crucial role in facilitating the formation of specialized viral replication compartments and the assembly of HSV-1 virion. The identification of LLPS-prone proteins in HSV-1 opens up new avenues for understanding the molecular mechanisms underlying viral pathogenesis. Our work provides valuable insights into the LLPS landscape of HSV-1, highlighting potential targets for further experimental validation and enhancing our understanding of viral replication and pathogenesis.
Affiliation(s)
- Sushma Subedi
  - Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Niharika Nag
  - Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Harish Shukla
  - Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Aditya K Padhi
  - Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, India
- Timir Tripathi
  - Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
  - Department of Zoology, North-Eastern Hill University, Shillong, India
6
Guo G, Wang X, Zhang Y, Li T. Sequence variations of phase-separating proteins and resources for studying biomolecular condensates. Acta Biochim Biophys Sin (Shanghai) 2023; 55:1119-1132. [PMID: 37464880 PMCID: PMC10423696 DOI: 10.3724/abbs.2023131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 06/06/2023] [Indexed: 07/20/2023] Open
Abstract
Phase separation (PS) is an important mechanism underlying the formation of biomolecular condensates. Physiological condensates are associated with numerous biological processes, such as transcription, immunity, signaling, and synaptic transmission. Changes in particular amino acids or segments can disturb the protein's phase behavior and interactions with other biomolecules in condensates. It is thus presumed that variations in the phase-separating-prone domains can significantly impact the properties and functions of condensates. The dysfunction of condensates contributes to a number of pathological processes. Pharmacological perturbation of these condensates is proposed as a promising way to restore physiological states. In this review, we characterize the variations observed in PS proteins that lead to aberrant biomolecular compartmentalization. We also showcase recent advancements in bioinformatics of membraneless organelles (MLOs), focusing on available databases useful for screening PS proteins and describing endogenous condensates, guiding researchers to seek the underlying pathogenic mechanisms of biomolecular condensates.
Affiliation(s)
- Gaigai Guo
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
- Xinxin Wang
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
- Yi Zhang
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
- Tingting Li
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
  - Key Laboratory for Neuroscience, Ministry of Education/National Health Commission of China, Peking University, Beijing 100191, China
7
Koehler Leman J, Künze G. Recent Advances in NMR Protein Structure Prediction with ROSETTA. Int J Mol Sci 2023; 24:7835. [PMID: 37175539 PMCID: PMC10178863 DOI: 10.3390/ijms24097835] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/16/2023] [Revised: 04/15/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023] Open
Abstract
Nuclear magnetic resonance (NMR) spectroscopy is a powerful method for studying the structure and dynamics of proteins in their native state. For high-resolution NMR structure determination, the collection of a rich restraint dataset is necessary. This can be difficult to achieve for proteins with high molecular weight or a complex architecture. Computational modeling techniques can complement sparse NMR datasets (<1 restraint per residue) with additional structural information to elucidate protein structures in these difficult cases. The Rosetta software for protein structure modeling and design is used by structural biologists for structure determination tasks in which limited experimental data is available. This review gives an overview of the computational protocols available in the Rosetta framework for modeling protein structures from NMR data. We explain the computational algorithms used for the integration of different NMR data types in Rosetta. We also highlight new developments, including modeling tools for data from paramagnetic NMR and hydrogen-deuterium exchange, as well as chemical shifts in CS-Rosetta. Furthermore, strategies are discussed to complement and improve structure predictions made by the current state-of-the-art AlphaFold2 program using NMR-guided Rosetta modeling.
Affiliation(s)
- Julia Koehler Leman
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY 10010, USA
- Georg Künze
  - Institute for Drug Discovery, Medical Faculty, University of Leipzig, Brüderstr. 34, D-04103 Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstr. 16-18, D-04107 Leipzig, Germany
8
Tibble RW, Gross JD. A call to order: Examining structured domains in biomolecular condensates. J Magn Reson 2023; 346:107318. [PMID: 36657879 PMCID: PMC10878105 DOI: 10.1016/j.jmr.2022.107318] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/15/2022] [Revised: 09/20/2022] [Accepted: 10/13/2022] [Indexed: 06/17/2023]
Abstract
Diverse cellular processes have been observed or predicted to occur in biomolecular condensates, which are comprised of proteins and nucleic acids that undergo liquid-liquid phase separation (LLPS). Protein-driven LLPS often involves weak, multivalent interactions between intrinsically disordered regions (IDRs). Due to their inherent lack of defined tertiary structures, NMR has been a powerful resource for studying the behavior and interactions of IDRs in condensates. While IDRs in proteins are necessary for phase separation, core proteins enriched in condensates often contain structured domains that are essential for their function and contribute to phase separation. How phase separation can affect the structure and conformational dynamics of structured domains is critical for understanding how biochemical reactions can be effectively regulated in cellular condensates. In this perspective, we discuss the consequences phase separation can have on structured domains and outline NMR observables we believe are useful for assessing protein structure and dynamics in condensates.
Affiliation(s)
- Ryan W Tibble
  - Program in Chemistry and Chemical Biology, University of California, San Francisco, United States; Department of Pharmaceutical Chemistry, University of California, San Francisco, United States
- John D Gross
  - Program in Chemistry and Chemical Biology, University of California, San Francisco, United States; Department of Pharmaceutical Chemistry, University of California, San Francisco, United States
9
Intrinsically Disordered Proteins: An Overview. Int J Mol Sci 2022; 23:14050. [PMID: 36430530 PMCID: PMC9693201 DOI: 10.3390/ijms232214050] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 09/04/2022] [Revised: 11/07/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022] Open
Abstract
Many proteins and protein segments cannot attain a single stable three-dimensional structure under physiological conditions; instead, they adopt multiple interconverting conformational states. Such intrinsically disordered proteins or protein segments are highly abundant across proteomes, and are involved in various effector functions. This review focuses on different aspects of disordered proteins and disordered protein regions, which form the basis of the so-called "Disorder-function paradigm" of proteins. Additionally, various experimental approaches and computational tools used for characterizing disordered regions in proteins are discussed. Finally, the role of disordered proteins in diseases and their utility as potential drug targets are explored.
10
Ilzhöfer D, Heinzinger M, Rost B. SETH predicts nuances of residue disorder from protein embeddings. Front Bioinform 2022; 2:1019597. [PMID: 36304335 PMCID: PMC9580958 DOI: 10.3389/fbinf.2022.1019597] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/15/2022] [Accepted: 09/20/2022] [Indexed: 11/07/2022] Open
Abstract
Predictions for millions of protein three-dimensional structures are only a few clicks away since the release of AlphaFold2 results for UniProt. However, many proteins have so-called intrinsically disordered regions (IDRs) that do not adopt unique structures in isolation. These IDRs are associated with several diseases, including Alzheimer's Disease. We showed that three recent disorder measures of AlphaFold2 predictions (pLDDT, "experimentally resolved" prediction and "relative solvent accessibility") correlated to some extent with IDRs. However, expert methods predict IDRs more reliably by combining complex machine learning models with expert-crafted input features and evolutionary information from multiple sequence alignments (MSAs). MSAs are not always available, especially for IDRs, and are computationally expensive to generate, limiting the scalability of the associated tools. Here, we present the novel method SETH that predicts residue disorder from embeddings generated by the protein Language Model ProtT5, which explicitly only uses single sequences as input. Thereby, our method, relying on a relatively shallow convolutional neural network, outperformed much more complex solutions while being much faster, allowing to create predictions for the human proteome in about 1 hour on a consumer-grade PC with one NVIDIA GeForce RTX 3060. Trained on a continuous disorder scale (CheZOD scores), our method captured subtle variations in disorder, thereby providing important information beyond the binary classification of most methods. High performance paired with speed revealed that SETH's nuanced disorder predictions for entire proteomes capture aspects of the evolution of organisms. Additionally, SETH could also be used to filter out regions or proteins with probable low-quality AlphaFold2 3D structures to prioritize running the compute-intensive predictions for large data sets. SETH is freely publicly available at: https://github.com/Rostlab/SETH.
Affiliation(s)
- Dagmar Ilzhöfer
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Michael Heinzinger
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
  - Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), TUM Graduate School, Garching, Germany
- Burkhard Rost
  - Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
  - Institute for Advanced Study (TUM-IAS), TUM (Technical University of Munich), Garching, Germany
  - TUM School of Life Sciences Weihenstephan (WZW), TUM (Technical University of Munich), Freising, Germany
11
Bezerra RP, Conniff AS, Uversky VN. Comparative study of structures and functional motifs in lectins from the commercially important photosynthetic microorganisms. Biochimie 2022; 201:63-74. [PMID: 35839918 DOI: 10.1016/j.biochi.2022.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 06/17/2022] [Accepted: 07/08/2022] [Indexed: 11/26/2022]
Abstract
Photosynthetic microorganisms, specifically cyanobacteria and microalgae, can synthesize a vast array of biologically active molecules, such as lectins, that have great potential for various biotechnological and biomedical applications. However, since the structures of these proteins are not well established, likely due to the presence of intrinsically disordered regions, our ability to better understand their functionality is hampered. We embarked on a study of the carbohydrate recognition domain (CRD), intrinsically disordered regions (IDRs), amino acid composition, and functional motifs in lectins from cyanobacteria of the genus Arthrospira and microalgae of the genera Chlorella and Dunaliella using a combination of bioinformatics techniques. This search revealed the presence of five distinctive CRD types differently distributed between the genera. Most CRDs displayed a group-specific distribution, except for C. sorokiniana, which possesses a distinctive CRD, probably due to its specific lifestyle. We also found that all CRDs contain short IDRs. The bacterial lectin of the prokaryote Arthrospira showed lower intrinsic disorder and proline content than the lectins from the eukaryotic microalgae (Chlorella and Dunaliella). Among the important functions predicted in all lectins were several specific motifs, which directly interact with proteins involved in cell-cycle control and which may be used for pharmaceutical purposes. Since the aforementioned properties of each type of lectin were investigated in silico, they need experimental confirmation. The results of our study provide an overview of the distribution of CRD, IDRs, and functional motifs within lectins from the commercially important microalgae.
Affiliation(s)
- Raquel P Bezerra
  - Department of Morphology and Animal Physiology, Federal Rural University of Pernambuco-UFRPE, Dom Manoel de Medeiros Ave, Recife, PE, 52171-900, Brazil
- Amanda S Conniff
  - Department of Medical Engineering, Morsani College of Medicine and College of Engineering, University of South Florida, Tampa, FL, 33612, USA
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33612, USA
12
Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022; 20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] Open
Abstract
Sequence-based predictors of the residue-level protein function and structure cover a broad spectrum of characteristics including intrinsic disorder, secondary structure, solvent accessibility and binding to nucleic acids. They were catalogued and evaluated in numerous surveys and assessments. However, methods focusing on a given characteristic are studied separately from predictors of other characteristics, while they are typically used on the same proteins. We fill this void by studying complementarity of a representative collection of methods that target different predictions using a large, taxonomically consistent, and low similarity dataset of human proteins. First, we bridge the gap between the communities that develop structure-trained vs. disorder-trained predictors of binding residues. Motivated by a recent study of the protein-binding residue predictions, we empirically find that combining the structure-trained and disorder-trained predictors of the DNA-binding and RNA-binding residues leads to substantial improvements in predictive quality. Second, we investigate whether diverse predictors generate results that accurately reproduce relations between secondary structure, solvent accessibility, interaction sites, and intrinsic disorder that are present in the experimental data. Our empirical analysis concludes that predictions accurately reflect all combinations of these relations. Altogether, this study provides unique insights that support combining results produced by diverse residue-level predictors of protein function and structure.
Affiliation(s)
- Bálint Biró
- Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
|
13
|
Insights into Membrane Curvature Sensing and Membrane Remodeling by Intrinsically Disordered Proteins and Protein Regions. J Membr Biol 2022; 255:237-259. [PMID: 35451616 PMCID: PMC9028910 DOI: 10.1007/s00232-022-00237-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/27/2022] [Accepted: 03/29/2022] [Indexed: 12/15/2022]
Abstract
Cellular membranes are highly dynamic in shape. They can rapidly and precisely regulate their shape to perform various cellular functions. A protein’s ability to sense membrane curvature is essential in various biological events such as cell signaling and membrane trafficking. Once bound, these curvature-sensing proteins may also change the local membrane shape by one or more curvature-driving mechanisms. Established curvature-sensing/driving mechanisms rely on proteins with specific structural features such as amphipathic helices and intrinsically curved shapes. However, the recent discovery and characterization of many proteins have shattered the protein structure–function paradigm, which holds that protein function requires a unique structural feature. Typically, such structure-independent functions are carried out either entirely by intrinsically disordered proteins or by hybrid proteins containing disordered regions and structured domains. It is becoming more apparent that disordered proteins and regions can be potent sensors/inducers of membrane curvature. In this article, we outline the basic features of disordered proteins and regions, the motifs in such proteins that encode the function, membrane remodeling by disordered proteins and regions, and assays that may be employed to investigate curvature sensing and generation by ordered/disordered proteins.
|
14
|
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022] Open
|
15
|
Alghamdi M, Alamry SA, Bahlas SM, Uversky VN, Redwan EM. Circulating extracellular vesicles and rheumatoid arthritis: a proteomic analysis. Cell Mol Life Sci 2021; 79:25. [PMID: 34971426 PMCID: PMC11072894 DOI: 10.1007/s00018-021-04020-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/14/2021] [Revised: 10/28/2021] [Accepted: 10/29/2021] [Indexed: 12/14/2022]
Abstract
Circulating extracellular vesicles (EVs) are membrane-bound nanoparticles secreted by most cells for intercellular communication and transportation of biomolecules. EVs carry proteins, lipids, nucleic acids, and receptors that are involved in human physiology and pathology. EV cargo is variable and highly related to the type and state of the cellular origin. Three subtypes of EVs have been identified: exosomes, microvesicles, and apoptotic bodies. Exosomes are the smallest and the most well-studied class of EVs that regulate different biological processes and participate in several diseases, such as cancers and autoimmune diseases. Proteomic analysis of exosomes succeeded in profiling numerous types of proteins involved in disease development and prognosis. In rheumatoid arthritis (RA), exosomes revealed a potential function in joint inflammation. These EVs possess a unique function, as they can transfer specific autoantigens and mediators between distant cells. Current proteomic data demonstrated that exosomes could provide beneficial effects against autoimmunity and exert an immunosuppressive action, particularly in RA. Based on these observations, effective therapeutic strategies have been developed for arthritis and other inflammatory disorders.
Affiliation(s)
- Mohammed Alghamdi
- Biological Sciences Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah, 21589, Saudi Arabia
- Laboratory Department, University Medical Services Center, King Abdulaziz University, P.O. Box 80200, Jeddah, 21589, Saudi Arabia
| | - Sultan Abdulmughni Alamry
- Immunology Diagnostic Laboratory Department, King Abdulaziz University Hospital, P.O Box 80215, Jeddah, 21589, Saudi Arabia
| | - Sami M Bahlas
- Department of Internal Medicine, Faculty of Medicine, King Abdulaziz University, P.O. Box 80215, Jeddah, 21589, Saudi Arabia
| | - Vladimir N Uversky
- Biological Sciences Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah, 21589, Saudi Arabia
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
| | - Elrashdy M Redwan
- Biological Sciences Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah, 21589, Saudi Arabia.
- Therapeutic and Protective Proteins Laboratory, Protein Research Department, Genetic Engineering and Biotechnology Research Institute, City for Scientific Research and Technology Applications, New Borg EL-Arab, 21934, Alexandria, Egypt.
|
16
|
Flanking Disorder of the Folded αα-Hub Domain from Radical Induced Cell Death1 Affects Transcription Factor Binding by Ensemble Redistribution. J Mol Biol 2021; 433:167320. [PMID: 34687712 DOI: 10.1016/j.jmb.2021.167320] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/18/2021] [Revised: 09/28/2021] [Accepted: 10/13/2021] [Indexed: 11/22/2022]
Abstract
Protein intrinsic disorder is essential for organization of transcription regulatory interactomes. In these interactomes, the majority of transcription factors as well as their interaction partners have co-existing order and disorder. Yet, little attention has been paid to their interplay. Here, we investigate how order is affected by flanking disorder in the folded αα-hub domain RST from Radical-Induced Cell Death1 (RCD1), central in a large interactome of transcription factors. We show that the intrinsically disordered C-terminal tail of RCD1-RST shifts its conformational ensemble towards a pseudo-bound state through weak interactions with the ligand-binding pocket. An unfolded excited state is also accessible on the ms timescale independent of surrounding disordered regions, but its population is lowered by 50% in their presence. Flanking disorder additionally lowers transcription factor binding-affinity without affecting the dissociation rate constant, in accordance with similar bound-states assessed by NMR. The extensive dynamics of the RCD1-RST domain, modulated by flanking disorder, is suggestive of its adaptation to many different transcription factor ligands. The study illustrates how disordered flanking regions can tune fold and function through ensemble redistribution and is of relevance to modular proteins in general, many of which play key roles in regulation of genes.
|
17
|
Marzullo L, Turco MC, Uversky VN. What's in the BAGs? Intrinsic disorder angle of the multifunctionality of the members of a family of chaperone regulators. J Cell Biochem 2021; 123:22-42. [PMID: 34339540 DOI: 10.1002/jcb.30123] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/30/2021] [Revised: 06/28/2021] [Accepted: 07/22/2021] [Indexed: 01/22/2023]
Abstract
In humans, the family of Bcl-2 associated athanogene (BAG) proteins includes six members characterized by exceptional multifunctionality and engagement in the pathogenesis of various diseases. All of them are capable of interacting with a multitude of often unrelated binding partners. Such binding promiscuity and related functional and pathological multifacetedness cannot be explained or understood within the frames of the classical "one protein-one structure-one function" model, which also fails to explain the presence of multiple isoforms generated for BAG proteins by alternative splicing or alternative translation initiation and their extensive posttranslational modifications. However, all these mysteries can be solved by taking into account the intrinsic disorder phenomenon. In fact, high binding promiscuity and potential to participate in a broad spectrum of interactions with multiple binding partners, as well as a capability to be multifunctional and multipathogenic, are some of the characteristic features of intrinsically disordered proteins and intrinsically disordered protein regions. Such functional proteins or protein regions lacking unique tertiary structures constitute a cornerstone of the protein structure-function continuum concept. The aim of this paper is to provide an overview of the functional roles of human BAG proteins from the perspective of protein intrinsic disorder which will provide a means for understanding their binding promiscuity, multifunctionality, and relation to the pathogenesis of various diseases.
Affiliation(s)
- Liberato Marzullo
- Department of Medicine, Surgery and Dentistry Schola Medica Salernitana, University of Salerno, Baronissi, Italy
- Research and Development Division, BIOUNIVERSA s.r.l., Baronissi, Italy
| | - Maria C Turco
- Department of Medicine, Surgery and Dentistry Schola Medica Salernitana, University of Salerno, Baronissi, Italy
- Research and Development Division, BIOUNIVERSA s.r.l., Baronissi, Italy
| | - Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
|
18
|
Uversky VN, Giuliani A. Networks of Networks: An Essay on Multi-Level Biological Organization. Front Genet 2021; 12:706260. [PMID: 34234818 PMCID: PMC8255927 DOI: 10.3389/fgene.2021.706260] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/07/2021] [Accepted: 05/31/2021] [Indexed: 01/01/2023] Open
Abstract
The multi-level organization of nature is self-evident: proteins interact with one another to give rise to an organized metabolism, while at the same time each protein (a single node of such an interaction network) is itself a network of interacting amino-acid residues, allowing coordinated motion of the macromolecule and systemic effects such as allosteric behavior. Similar pictures can be drawn for the structure and function of cells, organs, tissues, and ecological systems. The majority of biologists are used to thinking that causally relevant events originate from the lower level (the molecular one) in the form of perturbations that “climb up” the hierarchy, reaching the ultimate layer of macroscopic behavior (e.g., causing a specific disease). Such a causative model, stemming from the usual genotype-phenotype distinction, is not the only one. As a matter of fact, one can observe top-down, bottom-up, as well as middle-out perturbation/control trajectories. Recent complex network studies make it possible to go beyond the purely qualitative observation of the existence of both non-linear and non-bottom-up processes and to uncover the deep nature of multi-level organization. Here, taking protein structural and interaction networks as a paradigm, we review some of the most relevant results dealing with between-network communication, shedding light on the basic principles of complex system control and dynamics and offering a more realistic frame of causation in biology.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine, Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, United States
| | - Alessandro Giuliani
- Department of Environment and Health, Istituto Superiore di Sanità, Rome, Italy
|
19
|
Interactome Mapping Provides a Network of Neurodegenerative Disease Proteins and Uncovers Widespread Protein Aggregation in Affected Brains. Cell Rep 2021; 32:108050. [PMID: 32814053 DOI: 10.1016/j.celrep.2020.108050] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 04/05/2019] [Revised: 02/15/2020] [Accepted: 07/28/2020] [Indexed: 12/12/2022] Open
Abstract
Interactome maps are valuable resources to elucidate protein function and disease mechanisms. Here, we report on an interactome map that focuses on neurodegenerative disease (ND), connects ∼5,000 human proteins via ∼30,000 candidate interactions and is generated by systematic yeast two-hybrid interaction screening of ∼500 ND-related proteins and integration of literature interactions. This network reveals interconnectivity across diseases and links many known ND-causing proteins, such as α-synuclein, TDP-43, and ATXN1, to a host of proteins previously unrelated to NDs. It facilitates the identification of interacting proteins that significantly influence mutant TDP-43 and HTT toxicity in transgenic flies, as well as of ARF-GEP100 that controls misfolding and aggregation of multiple ND-causing proteins in experimental model systems. Furthermore, it enables the prediction of ND-specific subnetworks and the identification of proteins, such as ATXN1 and MKL1, that are abnormally aggregated in postmortem brains of Alzheimer's disease patients, suggesting widespread protein aggregation in NDs.
|
20
|
Sharma R, Srivastava T, Pandey AR, Mishra T, Gupta B, Reddy SS, Singh SP, Narender T, Tripathi A, Chandramouli B, Sashidhara KV, Priya S, Kumar N. Identification of Natural Products as Potential Pharmacological Chaperones for Protein Misfolding Diseases. ChemMedChem 2021; 16:2146-2156. [PMID: 33760394 DOI: 10.1002/cmdc.202100147] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/03/2021] [Indexed: 01/12/2023]
Abstract
Defective protein folding and accumulation of misfolded proteins are associated with neurodegenerative, cardiovascular, secretory, and metabolic disorders. Efforts are being made to identify small-molecule modulators or structural-correctors for conformationally destabilized proteins implicated in various protein aggregation diseases. Using a metastable-reporter-based primary screen, we evaluated pharmacological chaperone activity of a diverse class of natural products. We found that a flavonoid glycoside (C-10, chrysoeriol-7-O-β-D-glucopyranoside) stabilizes metastable proteins, prevents their aggregation, and remodels the oligomers into protease-sensitive species. The data were corroborated with an additional secondary screen with a disease-specific pathogenic protein. In vitro and cell-based experiments showed that C-10 inhibits α-synuclein aggregation, which is implicated in synucleinopathy-related neurodegeneration. C-10 interferes with its structural transition into β-sheeted fibrils and mitigates α-synuclein aggregation-associated cytotoxic effects. Computational modeling suggests that C-10 binds to unique sites in α-synuclein, which may interfere with its aggregation amplification. These findings open an avenue for comprehensive SAR development for flavonoid glycosides as pharmacological chaperones for metastable and aggregation-prone proteins implicated in protein conformational diseases.
Affiliation(s)
- Richa Sharma
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
| | - Tulika Srivastava
- CSIR-Indian Institute of Toxicology Research, Lucknow, 226 001, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
| | - Alka Raj Pandey
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
| | - Tripti Mishra
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
| | - Bhagyashri Gupta
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
| | | | - Suriya P Singh
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
| | - Tadigoppula Narender
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
| | - Aradhya Tripathi
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
| | | | - Koneni V Sashidhara
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
| | - Smriti Priya
- CSIR-Indian Institute of Toxicology Research, Lucknow, 226 001, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
| | - Niti Kumar
- CSIR-Central Drug Research Institute, Lucknow, 226031, Uttar Pradesh, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, 201 002, India
|
21
|
Shin JE, Riesselman AJ, Kollasch AW, McMahon C, Simon E, Sander C, Manglik A, Kruse AC, Marks DS. Protein design and variant prediction using autoregressive generative models. Nat Commun 2021; 12:2403. [PMID: 33893299 PMCID: PMC8065141 DOI: 10.1038/s41467-021-22732-w] [Citation(s) in RCA: 130] [Impact Index Per Article: 43.3] [Received: 02/28/2021] [Accepted: 03/26/2021] [Indexed: 12/11/2022] Open
Abstract
The ability to design functional sequences and predict effects of variation is central to protein engineering and biotherapeutics. State-of-art computational methods rely on models that leverage evolutionary information but are inadequate for important applications where multiple sequence alignments are not robust. Such applications include the prediction of variant effects of indels, disordered proteins, and the design of proteins such as antibodies due to the highly variable complementarity determining regions. We introduce a deep generative model adapted from natural language processing for prediction and design of diverse functional sequences without the need for alignments. The model performs state-of-art prediction of missense and indel effects and we successfully design and test a diverse 10^5-nanobody library that shows better expression than a 1000-fold larger synthetic library. Our results demonstrate the power of the alignment-free autoregressive model in generalizing to regions of sequence space traditionally considered beyond the reach of prediction and design.
Affiliation(s)
- Jung-Eun Shin
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Adam J Riesselman
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- insitro, South San Francisco, CA, USA
| | - Aaron W Kollasch
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Conor McMahon
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA
- Vertex Pharmaceuticals, Boston, MA, USA
| | - Elana Simon
- Harvard College, Cambridge, MA, USA
- Reverie Labs, Cambridge, MA, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Aashish Manglik
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA, USA
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA, USA
| | - Andrew C Kruse
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA.
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
22
|
Zhao B, Katuwawala A, Uversky VN, Kurgan L. IDPology of the living cell: intrinsic disorder in the subcellular compartments of the human cell. Cell Mol Life Sci 2021; 78:2371-2385. [PMID: 32997198 PMCID: PMC11071772 DOI: 10.1007/s00018-020-03654-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/08/2020] [Revised: 09/09/2020] [Accepted: 09/22/2020] [Indexed: 12/11/2022]
Abstract
Intrinsic disorder can be found in all proteomes of all kingdoms of life and in viruses, being particularly prevalent in the eukaryotes. We conduct a comprehensive analysis of the intrinsic disorder in the human proteins while mapping them into 24 compartments of the human cell. In agreement with previous studies, we show that human proteins are significantly enriched in disorder relative to a generic protein set that represents the protein universe. In fact, the fraction of proteins with long disordered regions and the average protein-level disorder content in the human proteome are about 3 times higher than in the protein universe. Furthermore, levels of intrinsic disorder in the majority of human subcellular compartments significantly exceed the average disorder content in the protein universe. Relative to the overall amount of disorder in the human proteome, proteins localized in the nucleus and cytoskeleton have significantly increased amounts of disorder, measured by both high disorder content and presence of multiple long intrinsically disordered regions. We empirically demonstrate that, on average, human proteins are assigned to 2.3 subcellular compartments, with proteins localized to few subcellular compartments being more disordered than the proteins that are localized to many compartments. Functionally, the disordered proteins localized in the most disorder-enriched subcellular compartments are primarily responsible for interactions with nucleic acids and protein partners. This is the first time disorder is comprehensively mapped into the human cell. Our observations add a missing piece to the puzzle of functional disorder and its organization inside the cell.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, 33612, USA.
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Russia.
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA.
|
23
|
Exploring Potential Signals of Selection for Disordered Residues in Prokaryotic and Eukaryotic Proteins. GENOMICS PROTEOMICS & BIOINFORMATICS 2020; 18:549-564. [PMID: 33346088 PMCID: PMC8377245 DOI: 10.1016/j.gpb.2020.06.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2019] [Revised: 03/29/2020] [Accepted: 06/10/2020] [Indexed: 11/22/2022]
Abstract
Intrinsically disordered proteins (IDPs) are an important class of proteins in all domains of life for their functional importance. However, how nature has shaped the disorder potential of prokaryotic and eukaryotic proteins is still not clearly known. Randomly generated sequences are free of any selective constraints, thus these sequences are commonly used as null models. Considering different types of random protein models, here we seek to understand how the disorder potential of natural eukaryotic and prokaryotic proteins differs from random sequences. Comparing proteome-wide disorder content between real and random sequences of 12 model organisms, we noticed that eukaryotic proteins are enriched in disordered regions compared to random sequences, but in prokaryotes such regions are depleted. By analyzing the position-wise disorder profile, we show that there is a generally higher disorder near the N- and C-terminal regions of eukaryotic proteins as compared to the random models; however, either no such trend or only a weak one was found in prokaryotic proteins. Moreover, here we show that this preference is not caused by the amino acid or nucleotide composition at the respective sites. Instead, these regions were found to be endowed with a higher fraction of protein–protein binding sites, suggesting their functional importance. We discuss several possible explanations for this pattern, such as improving the efficiency of protein–protein interaction, ribosome movement during translation, and post-translational modification. However, further studies are needed to clearly understand the biophysical mechanisms causing the trend.
|
24
|
Intrinsically disordered protein domain of human ameloblastin in synthetic fusion with calmodulin increases calmodulin stability and modulates its function. Int J Biol Macromol 2020; 168:1-12. [PMID: 33290768 DOI: 10.1016/j.ijbiomac.2020.11.216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/20/2020] [Revised: 11/29/2020] [Accepted: 11/30/2020] [Indexed: 11/21/2022]
Abstract
Constantly increasing attention to bioengineered proteins has led to the rapid development of new functional targets. Here we present the biophysical and functional characteristics of the newly designed CaM/AMBN-Ct fusion protein. The two-domain artificial target consists of calmodulin (CaM) and ameloblastin C-terminus (AMBN-Ct). CaM as a well-characterized calcium ions (Ca2+) binding protein offers plenty of options in terms of Ca2+ detection in biomedicine and biotechnologies. Highly negatively charged AMBN-Ct belongs to intrinsically disordered proteins (IDPs). CaM/AMBN-Ct was designed to open new ways of communication synergies between the domains with potential functional improvement. The character and function of CaM/AMBN-Ct were explored by biophysical and molecular modelling methods. Experimental studies have revealed increased stability and preserved CaM/AMBN-Ct function. The results of molecular dynamic simulations (MDs) outlined different interface patterns between the domains with potential allosteric communication within the fusion.
|
25
|
Uversky VN. Functions of short lifetime biological structures at large: the case of intrinsically disordered proteins. Brief Funct Genomics 2020; 19:60-68. [PMID: 29982297 DOI: 10.1093/bfgp/ely023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open
Abstract
Although for more than a century protein function was intimately associated with the presence of a unique structure in a protein molecule, recent years witnessed a skyrocketing rise in the appreciation of the protein intrinsic disorder concept, which emphasizes the importance of biologically active proteins without ordered structures. In different proteins, the depth and breadth of disorder penetrance are different, generating an amusing spatiotemporal heterogeneity of intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs), which are typically described as highly dynamic ensembles of rapidly interconverting conformations (or a multitude of short lifetime structures). IDPs/IDPRs constitute a substantial part of the protein kingdom and have unique functions complementary to the functional repertoires of ordered proteins. They are recognized as interaction specialists and global controllers that play crucial roles in regulation of functions of their binding partners and in controlling large biological networks. IDPs/IDPRs are characterized by immense binding promiscuity and are able to use a broad spectrum of binding modes, often resulting in the formation of short lifetime complexes. In their turn, functions of IDPs and IDPRs are controlled by various means, such as numerous posttranslational modifications and alternative splicing. Some of the functions of IDPs/IDPRs are briefly considered in this review to shed some light on the biological roles of short-lived structures at large.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA and Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
|
26
|
Van Bibber NW, Haerle C, Khalife R, Dayhoff GW, Uversky VN. Intrinsic Disorder in Human Proteins Encoded by Core Duplicon Gene Families. J Phys Chem B 2020; 124:8050-8070. [PMID: 32880174 DOI: 10.1021/acs.jpcb.0c07676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Segmental duplications (i.e., highly homologous DNA fragments greater than 1 kb in length that are present within a genome at more than one site) are typically found in genome regions that are prone to rearrangements. A noticeable fraction of the human genome (∼5%) includes segmental duplications (or duplicons) that are assumed to play a number of vital roles in human evolution, human-specific adaptation, and genomic instability. Despite their importance for crucial events such as synaptogenesis, neuronal migration, and neocortical expansion, these segmental duplications continue to be rather poorly characterized. Of particular interest are the core duplicon gene (CDG) families, which are replicates sharing common "core" DNA among the randomly attached pieces and which expand along single chromosomes and might harbor newly acquired protein domains. Another important feature of proteins encoded by CDG families is their multifunctionality. Although it seems that these proteins might possess many characteristic features of intrinsically disordered proteins, to the best of our knowledge, a systematic investigation of the intrinsic disorder predisposition of the proteins encoded by core duplicon gene families has not been conducted yet. To fill this gap and to determine the degree to which these proteins might be affected by intrinsic disorder, we analyzed a set of human proteins encoded by the members of 10 core duplicon gene families, such as NBPF, RGPD, GUSBP, PMS2P, SPATA31, TRIM51, GOLGA8, NPIP, TBC1D3, and LRRC37. Our analysis revealed that the vast majority of these proteins are highly disordered, with their disordered regions often being utilized as means for the protein-protein interactions and/or targeted for numerous posttranslational modifications of different nature.
Affiliation(s)
- Nathan W Van Bibber
- Department of Molecular Medicine Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
| | - Cornelia Haerle
- Department of Molecular Medicine Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
| | - Roy Khalife
- Department of Molecular Medicine Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States
| | - Guy W Dayhoff
- Department of Chemistry, College of Art and Sciences, University of South Florida, Tampa, Florida 33620, United States
| | - Vladimir N Uversky
- Department of Molecular Medicine Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States.,USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Boulevard, Tampa, Florida 33612, United States.,Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", 4 Institutskaya St., Pushchino, 142290, Moscow Region, Russia
| |
|
27
|
Protein-Protein Interactions Mediated by Intrinsically Disordered Protein Regions Are Enriched in Missense Mutations. Biomolecules 2020; 10:biom10081097. [PMID: 32722039 PMCID: PMC7463635 DOI: 10.3390/biom10081097] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/13/2020] [Revised: 07/15/2020] [Accepted: 07/20/2020] [Indexed: 12/27/2022] Open
Abstract
Because proteins are fundamental to most biological processes, many genetic diseases can be traced back to single nucleotide variants (SNVs) that cause changes in protein sequences. However, not all SNVs that result in amino acid substitutions cause disease, as each residue is under different structural and functional constraints. Influential studies have shown that protein–protein interaction interfaces are enriched in disease-associated SNVs and depleted in SNVs that are common in the general population. These studies focus primarily on folded (globular) protein domains and overlook the prevalent class of protein interactions mediated by intrinsically disordered regions (IDRs). Therefore, we investigated the enrichment patterns of missense mutation-causing SNVs that are associated with disease and cancer, as well as those present in the healthy population, in structures of IDR-mediated interactions with comparisons to classical globular interactions. When comparing the different categories of interaction interfaces, division of the interface regions into solvent-exposed rim residues and buried core residues reveals distinctive enrichment patterns for the various types of missense mutations. Most notably, we demonstrate a strong enrichment at the interface core of interacting IDRs in disease mutations and its depletion in neutral ones, which supports the view that the disruption of IDR interactions is a mechanism underlying many diseases. Intriguingly, we also found an asymmetry across the IDR interaction interface in the enrichment of certain missense mutation types, which may hint at an increased variant tolerance and urges further investigations of IDR interactions.
|
28
|
Disorder under stress: Role of polyol osmolytes in modulating fibrillation and aggregation of intrinsically disordered proteins. Biophys Chem 2020; 264:106422. [PMID: 32707418 DOI: 10.1016/j.bpc.2020.106422] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/21/2020] [Revised: 06/19/2020] [Accepted: 06/20/2020] [Indexed: 12/18/2022]
Abstract
Intrinsically disordered proteins (IDPs) comprise ~30-40% of the proteome, have key roles in cellular processes, and have been reported to be involved in stress regulation working in synergy with osmolytes. Osmolytes are known to accumulate against various stresses in living systems and are known to stabilize the native conformation of globular proteins. However, little is known of their effect on IDPs and their mechanism of action is unclear. We have investigated the effect of a series of polyol osmolytes on the conformation, aggregation and fibrillation properties of the IDPs α- and β-synuclein, involved in Parkinson's disease, using fluorescence, CD, light scattering and TEM. We observe inhibition of fibril and aggregate formation with increasing concentration as well as with increasing number of hydroxyl groups in polyols, as observed by light scattering measurements, which correlates well with the increase in viscosity of solution with increasing number of OH groups in them. However, the ThT assay, while indicating suppression of fibril formation at various concentrations of polyols, shows enhanced fibrillation at some other concentrations, which could be due to the heterogeneity of the species formed that are ThT insensitive. Fibril formation was, thus, probed by using Nile red fluorescence, which showed sensitivity towards the species formed. ANS binding fluorescence also indicates a decrease in the hydrophobicity of the fibrils with increasing number of OH groups in polyols. Polyols do not have any effect on the fibrillation of β-syn but lead to enhanced amorphous aggregate formation in the presence of Ethylene Glycol and Glycerol and a reduction in the presence of Sorbitol. The net free energy of transfer of the proteins from water to Sorbitol is large and positive while it is relatively negligible in the case of Glycerol, suggestive of a greater preferential exclusion effect of Sorbitol in comparison with Glycerol in the case of IDPs as well. The results overall show a differential and complex effect of osmolytes on the fibrillation/aggregation properties of the two IDPs and suggest that an appropriate balance between the concentration and type of polyol or osmolyte would be required for the survival of organisms rich in IDPs under various stress conditions.
|
29
|
Intrinsic Disorder in Tetratricopeptide Repeat Proteins. Int J Mol Sci 2020; 21:ijms21103709. [PMID: 32466138 PMCID: PMC7279152 DOI: 10.3390/ijms21103709] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/21/2020] [Revised: 05/12/2020] [Accepted: 05/22/2020] [Indexed: 12/27/2022] Open
Abstract
Within the realm of repeat-containing proteins that commonly serve as “scaffolds” promoting protein-protein interactions, there is a family of proteins containing between 2 and 20 tetratricopeptide repeats (TPRs), which are functional motifs consisting of 34 amino acids. The most distinguishing feature of TPR domains is their ability to stack continuously one upon the other, with these stacked repeats being able to affect interaction with binding partners either sequentially or in combination. It is known that many repeat-containing proteins are characterized by high levels of intrinsic disorder, and that many protein tandem repeats can be intrinsically disordered. Furthermore, it seems that TPR-containing proteins share many characteristics with hybrid proteins containing ordered domains and intrinsically disordered protein regions. However, there has not been a systematic analysis of the intrinsic disorder status of TPR proteins. To fill this gap, we analyzed 166 human TPR proteins to determine the degree to which proteins containing TPR motifs are affected by intrinsic disorder. Our analysis revealed that these proteins are characterized by different levels of intrinsic disorder and contain functional disordered regions that are utilized for protein-protein interactions and often serve as targets of various posttranslational modifications.
|
30
|
Hu G, Wu Z, Oldfield CJ, Wang C, Kurgan L. Quality assessment for the putative intrinsic disorder in proteins. Bioinformatics 2020; 35:1692-1700. [PMID: 30329008 DOI: 10.1093/bioinformatics/bty881] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/07/2018] [Revised: 09/19/2018] [Accepted: 10/15/2018] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION: While putative intrinsic disorder is widely used, none of the predictors provides quality assessment (QA) scores. QA scores estimate the likelihood that predictions are correct at a residue level and have been applied in other bioinformatics areas. We recently reported that QA scores derived from putative disorder propensities perform relatively poorly for native disordered residues. Here we design and validate a general approach to construct QA predictors for disorder predictions. RESULTS: The QUARTER (QUality Assessment for pRotein inTrinsic disordEr pRedictions) toolbox of methods accommodates a diverse set of ten disorder predictors. It builds upon several innovative design elements including use and scaling of selected physicochemical properties of the input sequence, post-processing of disorder propensity scores, and a feature selection that optimizes the predictive models to a specific disorder predictor. We empirically establish that each one of these elements contributes to the overall predictive performance of our tool and that QUARTER's outputs significantly outperform QA scores derived from the outputs generated by the disorder predictors. The best performing QA scores for a single disorder predictor identify 13% of residues that are predicted with 98% precision. QA scores computed by combining results of the ten disorder predictors cover 40% of residues with 95% precision. Case studies are used to show how to interpret the QA scores. QA scores based on the high precision combined predictions are applied to analyze disorder in the human proteome. AVAILABILITY AND IMPLEMENTATION: http://biomine.cs.vcu.edu/servers/QUARTER/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
| | - Zhonghua Wu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China
| | | | - Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| |
|
31
|
Badierah RA, Uversky VN, Redwan EM. Dancing with Trojan horses: an interplay between the extracellular vesicles and viruses. J Biomol Struct Dyn 2020; 39:3034-3060. [DOI: 10.1080/07391102.2020.1756409] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Affiliation(s)
- Raied A. Badierah
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Molecular Diagnostic Laboratory, King Abdulaziz University Hospital, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Vladimir N. Uversky
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center ‘Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences’, Pushchino, Moscow Region, Russia
| | - Elrashdy M. Redwan
- Biological Science Department, Faculty of Science, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
32
|
A New Census of Protein Tandem Repeats and Their Relationship with Intrinsic Disorder. Genes (Basel) 2020; 11:genes11040407. [PMID: 32283633 PMCID: PMC7230257 DOI: 10.3390/genes11040407] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 03/09/2020] [Revised: 03/29/2020] [Accepted: 04/01/2020] [Indexed: 12/31/2022] Open
Abstract
Protein tandem repeats (TRs) are often associated with immunity-related functions and diseases. Since the last census of protein TRs in 1999, the number of curated proteins increased more than seven-fold and new TR prediction methods were published. TRs appear to be enriched with intrinsic disorder and vice versa. The significance and the biological reasons for this association are unknown. Here, we characterize protein TRs across all kingdoms of life and their overlap with intrinsic disorder in unprecedented detail. Using state-of-the-art prediction methods, we estimate that 50.9% of proteins contain at least one TR, often located at the sequence flanks. A positive linear correlation between the proportion of TRs and the protein length was observed universally, with Eukaryotes in general having more TRs, but when the difference in length is taken into account the difference is quite small. TRs were enriched with disorder-promoting amino acids and were inside intrinsically disordered regions. Many such TRs were homorepeats. Our results support that TRs mostly originate by duplication and are involved in essential functions such as transcription processes, structural organization, electron transport and iron-binding. In viruses, TRs are found in proteins essential for virulence.
|
33
|
Liu Y, Wang X, Liu B. RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins. Brief Bioinform 2020; 22:2000-2011. [PMID: 32112084 PMCID: PMC7986600 DOI: 10.1093/bib/bbaa018] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/15/2022] Open
Abstract
As an important type of protein, intrinsically disordered proteins/regions (IDPs/IDRs) are related to many crucial biological functions. Accurate prediction of IDPs/IDRs is beneficial to the prediction of protein structures and functions. Most of the existing methods ignore the fully ordered proteins without IDRs during training and test processes. As a result, the corresponding predictors tend to predict the fully ordered proteins as disordered proteins. Unfortunately, these methods were only evaluated on datasets consisting of disordered proteins without or with only a few fully ordered proteins, and therefore, this problem escapes the attention of the researchers. However, most of the newly sequenced proteins are fully ordered proteins in nature. These predictors fail to accurately predict the ordered and disordered proteins in real-world applications. In this regard, we propose a new method called RFPR-IDP trained with both fully ordered proteins and disordered proteins, which is constructed based on the combination of a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM). The experimental results show that although the existing predictors perform well for predicting the disordered proteins, they tend to predict the fully ordered proteins as disordered proteins. In contrast, the RFPR-IDP predictor can correctly predict the fully ordered proteins and outperforms the other 10 state-of-the-art methods when evaluated on a test dataset with both fully ordered proteins and disordered proteins. The web server and datasets of RFPR-IDP are freely available at http://bliulab.net/RFPR-IDP/server.
Affiliation(s)
- Yumeng Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
| | - Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China.,School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China.,Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
| |
|
34
|
Abstract
Functions of intrinsically disordered proteins do not require structure. Such structure-independent functionality has melted away the classic rigid "lock and key" representation of structure-function relationships in proteins, opening a new page in protein science, where molten keys operate on melted locks and where conformational flexibility and intrinsic disorder, structural plasticity and extreme malleability, multifunctionality and binding promiscuity represent a new-fangled reality. Analysis and understanding of this new reality require novel tools, and some of the techniques elaborated for the examination of intrinsically disordered protein functions are outlined in this review.
Affiliation(s)
- Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, 33620, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Russian Federation
| |
|
35
|
Arginine π-stacking drives binding to fibrils of the Alzheimer protein Tau. Nat Commun 2020; 11:571. [PMID: 31996674 PMCID: PMC6989696 DOI: 10.1038/s41467-019-13745-7] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/13/2019] [Accepted: 11/15/2019] [Indexed: 01/26/2023] Open
Abstract
Aggregation of the Tau protein into fibrils defines progression of neurodegenerative diseases, including Alzheimer’s Disease. The molecular basis for potentially toxic reactions of Tau aggregates is poorly understood. Here we show that π-stacking by Arginine side-chains drives protein binding to Tau fibrils. We mapped an aggregation-dependent interaction pattern of Tau. Fibrils recruit specifically aberrant interactors characterised by intrinsically disordered regions of atypical sequence features. Arginine residues are key to initiate these aberrant interactions. Crucial for scavenging is the guanidinium group of its side chain, not its charge, indicating a key role of π-stacking chemistry for driving aberrant fibril interactions. Remarkably, despite the non-hydrophobic interaction mode, the molecular chaperone Hsp90 can modulate aberrant fibril binding. Together, our data present a molecular mode of action for derailment of protein-protein interaction by neurotoxic fibrils. Tau fibril formation is a hallmark of Alzheimer’s disease. Here the authors reveal an aggregation-dependent protein interaction pattern of Tau and further show that π-stacking of the arginine side-chains drives aberrant protein binding to Tau fibrils.
|
36
|
Carmicheal J, Atri P, Sharma S, Kumar S, Chirravuri Venkata R, Kulkarni P, Salgia R, Ghersi D, Kaur S, Batra SK. Presence and structure-activity relationship of intrinsically disordered regions across mucins. FASEB J 2020; 34:1939-1957. [PMID: 31908009 DOI: 10.1096/fj.201901898rr] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/26/2019] [Revised: 11/18/2019] [Accepted: 12/05/2019] [Indexed: 12/24/2022]
Abstract
Many members of the mucin family are evolutionarily conserved and are often aberrantly expressed and glycosylated in various benign and malignant pathologies leading to tumor invasion, metastasis, and immune evasion. The large size and extensive glycosylation present challenges to study the mucin structure using traditional methods, including crystallography. We offer the hypothesis that the functional versatility of mucins may be attributed to the presence of intrinsically disordered regions (IDRs) that provide dynamism and flexibility and that the IDRs offer potential therapeutic targets. Herein, we examined the links between the mucin structure and function based on IDRs, posttranslational modifications (PTMs), and potential impact on their interactome. Using sequence-based bioinformatics tools, we observed that mucins are predicted to be moderately (20%-40%) to highly (>40%) disordered and many conserved mucin domains could be disordered. Phosphorylation sites overlap with IDRs throughout the mucin sequences. Additionally, the majority of predicted O- and N- glycosylation sites in the tandem repeat regions occur within IDRs and these IDRs contain a large number of functional motifs, that is, molecular recognition features (MoRFs), which directly influence protein-protein interactions (PPIs). This investigation provides a novel perspective and offers an insight into the complexity and dynamic nature of mucins.
Affiliation(s)
- Joseph Carmicheal
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska
| | - Pranita Atri
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska
| | - Sunandini Sharma
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska
| | - Sushil Kumar
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska.,Buffett Cancer Center, University of Nebraska Medical Center, Omaha, Nebraska
| | | | - Prakash Kulkarni
- Department of Medical Oncology and Therapeutics Research, City of Hope, Duarte, California
| | - Ravi Salgia
- Department of Medical Oncology and Therapeutics Research, City of Hope, Duarte, California
| | - Dario Ghersi
- School of Interdisciplinary Informatics, University of Nebraska Omaha, Omaha, Nebraska
| | - Sukhwinder Kaur
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska.,Buffett Cancer Center, University of Nebraska Medical Center, Omaha, Nebraska
| | - Surinder K Batra
- Department of Biochemistry and Molecular Biology, University of Nebraska Medical Center, Omaha, Nebraska.,Buffett Cancer Center, University of Nebraska Medical Center, Omaha, Nebraska
| |
|
37
|
Uversky VN, Finkelstein AV. Life in Phases: Intra- and Inter- Molecular Phase Transitions in Protein Solutions. Biomolecules 2019; 9:E842. [PMID: 31817975 PMCID: PMC6995567 DOI: 10.3390/biom9120842] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 11/19/2019] [Revised: 12/05/2019] [Accepted: 12/06/2019] [Indexed: 02/06/2023] Open
Abstract
Proteins, these evolutionarily-edited biological polymers, are able to undergo intramolecular and intermolecular phase transitions. Spontaneous intramolecular phase transitions define the folding of globular proteins, whereas binding-induced, intra- and inter- molecular phase transitions play a crucial role in the functionality of many intrinsically-disordered proteins. On the other hand, intermolecular phase transitions are the behind-the-scenes players in a diverse set of macrosystemic phenomena taking place in protein solutions, such as new phase nucleation in bulk, on the interface, and on the impurities, protein crystallization, protein aggregation, the formation of amyloid fibrils, and intermolecular liquid-liquid or liquid-gel phase transitions associated with the biogenesis of membraneless organelles in the cells. This review is dedicated to the systematic analysis of the phase behavior of protein molecules and their ensembles, and provides a description of the major physical principles governing intramolecular and intermolecular phase transitions in protein solutions.
Affiliation(s)
- Vladimir N. Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Moscow, Russia
| | - Alexei V. Finkelstein
- Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow, Russia
- Biology Department, Lomonosov Moscow State University, 119192 Moscow, Russia
- Biotechnology Department, Lomonosov Moscow State University, 142290 Pushchino, Moscow, Russia
| |
|
38
|
Ghadermarzi S, Li X, Li M, Kurgan L. Sequence-Derived Markers of Drug Targets and Potentially Druggable Human Proteins. Front Genet 2019; 10:1075. [PMID: 31803227 PMCID: PMC6872670 DOI: 10.3389/fgene.2019.01075] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/30/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022] Open
Abstract
Recent research shows that the majority of the druggable human proteome is yet to be annotated and explored. Accurate identification of these unexplored druggable proteins would facilitate development, screening, repurposing, and repositioning of drugs, as well as prediction of new drug–protein interactions. We contrast the current drug targets against the datasets of non-druggable and possibly druggable proteins to formulate markers that could be used to identify druggable proteins. We focus on the markers that can be extracted from protein sequences or names/identifiers to ensure that they can be applied across the entire human proteome. These markers quantify key features covered in past works (topological features of PPIs, cellular functions, and subcellular locations) and several novel factors (intrinsic disorder, residue-level conservation, alternative splicing isoforms, domains, and sequence-derived solvent accessibility). We find that the possibly druggable proteins have a significantly higher abundance of alternative splicing isoforms, a relatively large number of domains, a higher degree of centrality in the protein-protein interaction networks, and lower numbers of conserved and surface residues, when compared with the non-druggable proteins. We show that the current drug targets and possibly druggable proteins share involvement in the catalytic and signaling functions. However, unlike the drug targets, the possibly druggable proteins participate in the metabolic and biosynthesis processes, are enriched in intrinsic disorder, interact with proteins and nucleic acids, and are localized across the cell. To sum up, we formulate several markers that can help with finding novel druggable human proteins and provide interesting insights into the cellular functions and subcellular locations of the current drug targets and potentially druggable proteins.
Affiliation(s)
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Xingyi Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| |
|
39
|
El Hadidy N, Uversky VN. Intrinsic Disorder of the BAF Complex: Roles in Chromatin Remodeling and Disease Development. Int J Mol Sci 2019; 20:ijms20215260. [PMID: 31652801 PMCID: PMC6862534 DOI: 10.3390/ijms20215260] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/20/2019] [Revised: 10/12/2019] [Accepted: 10/21/2019] [Indexed: 12/13/2022] Open
Abstract
The two-meter-long DNA is compressed into chromatin in the nucleus of every cell, which serves as a significant barrier to transcription. Therefore, for processes such as replication and transcription to occur, the highly compacted chromatin must be relaxed, and the processes required for chromatin reorganization for the aim of replication or transcription are controlled by ATP-dependent nucleosome remodelers. One of the most highly studied remodelers of this kind is the BRG1- or BRM-associated factor complex (BAF complex, also known as SWItch/sucrose non-fermentable (SWI/SNF) complex), which is crucial for the regulation of gene expression and differentiation in eukaryotes. Chromatin remodeling complex BAF is characterized by a highly polymorphic structure, containing from four to 17 subunits encoded by 29 genes. The aim of this paper is to provide an overview of the role of BAF complex in chromatin remodeling and also to use literature mining and a set of computational and bioinformatics tools to analyze structural properties, intrinsic disorder predisposition, and functionalities of its subunits, along with the description of the relations of different BAF complex subunits to the pathogenesis of various human diseases.
Affiliation(s)
- Nashwa El Hadidy
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL 33612, USA.
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, 142290 Moscow Region, Russia.
| |
|
40
|
Djulbegovic MB, Uversky VN. Ferroptosis - An iron- and disorder-dependent programmed cell death. Int J Biol Macromol 2019; 135:1052-1069. [PMID: 31175900 DOI: 10.1016/j.ijbiomac.2019.05.221] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/24/2019] [Revised: 05/30/2019] [Accepted: 05/31/2019] [Indexed: 12/20/2022]
Abstract
Programmed cell death (PCD) is an integral component of both developmental and pathological features of an organism. Recently, ferroptosis, a new form of PCD that is dependent on reactive oxygen species and iron, has been described. As with apoptosis, necroptosis, and autophagy, ferroptosis is associated with a large set of proteins assembled in protein-protein interaction (PPI) networks, interactability of which is driven by the presence of intrinsically disordered proteins (IDPs) and IDP regions (IDPRs). Previous investigations have identified the prevalence and functionality of IDPs/IDPRs in non-ferroptosis PCD. The intrinsic disorder in protein structures is used to increase the regulatory powers of these processes. As uncontrolled PCD is associated with the onset of various pathological traits, uncovering the association between intrinsic disorder and ferroptosis-related proteins is crucial. To understand this association, 31 human ferroptosis-related proteins were analyzed via a multi-dimensional array of bioinformatics and computational techniques. In addition, a disorder meta-predictor (PONDR® FIT) was employed to look at the evolutionary conservation of intrinsic disorder in these proteins. This study presents evidence that IDPs and IDPRs are prevalent in ferroptosis. The intrinsic disorder found in ferroptosis has far-reaching functional implications related to ferroptosis-related PPIs and molecular interactions.
Affiliation(s)
- Mak B Djulbegovic
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; Protein Research Group, Institute for Biological Instrumentation of the Russian Academy of Sciences, 142290 Pushchino, Moscow region, Russia
|
41
|
Robinson TJ, Freedman JA, Al Abo M, Deveaux AE, LaCroix B, Patierno BM, George DJ, Patierno SR. Alternative RNA Splicing as a Potential Major Source of Untapped Molecular Targets in Precision Oncology and Cancer Disparities. Clin Cancer Res 2019; 25:2963-2968. [PMID: 30755441 PMCID: PMC6653604 DOI: 10.1158/1078-0432.ccr-18-2445] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/28/2018] [Revised: 09/18/2018] [Accepted: 01/29/2019] [Indexed: 12/12/2022]
Abstract
Studies of alternative RNA splicing (ARS) have the potential to provide an abundance of novel targets for development of new biomarkers and therapeutics in oncology, which will be necessary to improve outcomes for patients with cancer and mitigate cancer disparities. ARS, a key step in gene expression enabling individual genes to encode multiple proteins, is emerging as a major driver of abnormal phenotypic heterogeneity. Recent studies have begun to identify RNA splicing-related genetic and genomic variation in tumors, oncogenes dysregulated by ARS, RNA splice variants driving race-related cancer aggressiveness and drug response, spliceosome-dependent transformation, and RNA splicing-related immunogenic epitopes in cancer. In addition, recent studies have begun to identify and test, preclinically and clinically, approaches to modulate and exploit ARS for therapeutic application, including splice-switching oligonucleotides, small molecules targeting RNA splicing or RNA splice variants, and combination regimens with immunotherapies. Although ARS data hold such promise for precision oncology, inclusion of studies of ARS in translational and clinical cancer research remains limited. Technologic developments in sequencing and bioinformatics are being routinely incorporated into clinical oncology that permit investigation of clinically relevant ARS events, yet ARS remains largely overlooked either because of a lack of awareness within the clinical oncology community or perceived barriers to the technical complexity of analyzing ARS. This perspective aims to increase such awareness, propose immediate opportunities to improve identification and analysis of ARS, and call for bioinformaticians and cancer researchers to work together to address the urgent need to incorporate ARS into cancer biology and precision oncology.
Affiliation(s)
- Jennifer A Freedman
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
  - Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
- Muthana Al Abo
  - Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
- April E Deveaux
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
- Bonnie LaCroix
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
- Brendon M Patierno
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
- Daniel J George
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
  - Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
- Steven R Patierno
  - Department of Medicine, Division of Medical Oncology, Duke University Medical Center, Durham, North Carolina
  - Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
|
42
|
Barski M. BASILIScan: a tool for high-throughput analysis of intrinsic disorder patterns in homologous proteins. BMC Genomics 2018; 19:902. [PMID: 30537929 PMCID: PMC6290515 DOI: 10.1186/s12864-018-5322-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2018] [Accepted: 11/28/2018] [Indexed: 12/02/2022] Open
Abstract
Background: Intrinsic structural disorder is a common property of many proteins, especially in eukaryotic and virus proteomes. The tendency of some proteins or protein regions to exist in a disordered state usually precludes their structural characterisation and renders them especially difficult for experimental handling after recombinant expression. Results: A new intuitive, publicly-available computational resource, called BASILIScan, is presented here. It provides a BLAST-based search for close homologues of the protein of interest, integrated with a simultaneous prediction of intrinsic disorder together with a robust data viewer and interpreter. This allows for a quick, high-throughput screening, scoring and selection of closely-related yet highly structured homologues of the protein of interest. Comparative parallel analysis of the conservation of extended regions of disorder in multiple sequences is also offered. The use of BASILIScan and its capacity for yielding biologically applicable predictions is demonstrated. Using a high-throughput BASILIScan screen it is also shown that a large proportion of the human proteome displays homologous sequences of superior intrinsic structural order in many related species. Conclusion: Through the swift identification of intrinsically stable homologues and poorly conserved disordered regions by the BASILIScan software, the chances of successful recombinant protein expression and compatibility with downstream applications such as crystallisation can be greatly increased. Electronic supplementary material: The online version of this article (10.1186/s12864-018-5322-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Michal Barski
  - Section of Virology, Department of Medicine, St Mary's Hospital, Imperial College London, London, W2 1PG, UK
|
43
|
Meng F, Kurgan L. High‐throughput prediction of disordered moonlighting regions in protein sequences. Proteins 2018; 86:1097-1110. [DOI: 10.1002/prot.25590] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/11/2018] [Revised: 07/25/2018] [Accepted: 08/05/2018] [Indexed: 01/20/2023]
Affiliation(s)
- Fanchi Meng
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Lukasz Kurgan
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA
|
44
|
Wang C, Kurgan L. Review and comparative assessment of similarity-based methods for prediction of drug–protein interactions in the druggable human proteome. Brief Bioinform 2018; 20:2066-2087. [DOI: 10.1093/bib/bby069] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/07/2018] [Revised: 06/26/2018] [Accepted: 07/10/2018] [Indexed: 12/18/2022] Open
Abstract
Drug–protein interactions (DPIs) underlie the desired therapeutic actions and the adverse side effects of a significant majority of drugs. Computational prediction of DPIs facilitates research in drug discovery, characterization and repurposing. Similarity-based methods that do not require knowledge of protein structures are particularly suitable for druggable genome-wide predictions of DPIs. We review 35 high-impact similarity-based predictors that were published in the past decade. We group them based on the three types of similarities, and their combinations, that they use. We discuss and compare key aspects of these methods including source databases, internal databases and their predictive models. Using our novel benchmark database, we perform comparative empirical analysis of predictive performance of seven types of representative predictors that utilize each type of similarity individually and all possible combinations of similarities. We assess predictive quality at the database-wide DPI level and we are the first to also include evaluation over individual drugs. Our comprehensive analysis shows that predictors that use more similarity types outperform methods that employ fewer similarities, and that the model combining all three types of similarities secures area under the receiver operating characteristic curve of 0.93. We offer a comprehensive analysis of sensitivity of predictive performance to intrinsic and extrinsic characteristics of the considered predictors. We find that predictive performance is sensitive to low levels of similarities between sequences of the drug targets and several extrinsic properties of the input drug structures, drug profiles and drug targets. The benchmark database and a webserver for the seven predictors are freely available at http://biomine.cs.vcu.edu/servers/CONNECTOR/.
Affiliation(s)
- Chen Wang
  - Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
|
45
|
Myers N, Olender T, Savidor A, Levin Y, Reuven N, Shaul Y. The Disordered Landscape of the 20S Proteasome Substrates Reveals Tight Association with Phase Separated Granules. Proteomics 2018; 18:e1800076. [PMID: 30039638 DOI: 10.1002/pmic.201800076] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/31/2018] [Revised: 06/28/2018] [Indexed: 12/11/2022]
Abstract
Proteasomal degradation is the main route of regulated proteostasis. The 20S proteasome is the core particle (CP) responsible for the catalytic activity of all proteasome complexes. Structural constraints mean that only unfolded, extended polypeptide chains may enter the catalytic core of the 20S proteasome. It has been previously shown that the 20S CP is active in degradation of certain intrinsically disordered proteins (IDPs) lacking structural constraints. Here, a comprehensive analysis of the 20S CP substrates in vitro is conducted. It is revealed that the 20S CP substrates are highly disordered. However, not all IDPs are 20S CP substrates. The group of IDPs that are 20S CP substrates, termed the 20S-IDPome, is characterized by significantly more protein binding partners and more posttranslational modification sites, and is highly enriched for RNA binding proteins. The vast majority of them are involved in splicing, mRNA processing, and translation. Remarkably, it is found that low complexity proteins with a prion-like domain (PrLD), which interact with GR or PR di-peptide repeats, are the most preferential 20S CP substrates. The finding suggests roles of the 20S CP in gene transcription and formation of phase-separated granules.
Affiliation(s)
- Nadav Myers
  - Department of Molecular Genetics, Weizmann Institute of Science, 76100, Rehovot, Israel
- Tsviya Olender
  - Department of Molecular Genetics, Weizmann Institute of Science, 76100, Rehovot, Israel
- Alon Savidor
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine (G-INCPM), Weizmann Institute of Science, 76100, Rehovot, Israel
- Yishai Levin
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine (G-INCPM), Weizmann Institute of Science, 76100, Rehovot, Israel
- Nina Reuven
  - Department of Molecular Genetics, Weizmann Institute of Science, 76100, Rehovot, Israel
- Yosef Shaul
  - Department of Molecular Genetics, Weizmann Institute of Science, 76100, Rehovot, Israel
|
46
|
Functional Analysis of Human Hub Proteins and Their Interactors Involved in the Intrinsic Disorder-Enriched Interactions. Int J Mol Sci 2017; 18:ijms18122761. [PMID: 29257115 PMCID: PMC5751360 DOI: 10.3390/ijms18122761] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 11/01/2017] [Revised: 12/13/2017] [Accepted: 12/15/2017] [Indexed: 12/15/2022] Open
Abstract
Some of the intrinsically disordered proteins and protein regions are promiscuous interactors that are involved in one-to-many and many-to-one binding. Several studies have analyzed enrichment of intrinsic disorder among the promiscuous hub proteins. We extended these works by providing a detailed functional characterization of the disorder-enriched hub protein-protein interactions (PPIs), including both hubs and their interactors, and by analyzing their enrichment among disease-associated proteins. We focused on the human interactome, given its high degree of completeness and relevance to the analysis of the disease-linked proteins. We quantified and investigated numerous functional and structural characteristics of the disorder-enriched hub PPIs, including protein binding, structural stability, evolutionary conservation, several categories of functional sites, and presence of over twenty types of posttranslational modifications (PTMs). We showed that the disorder-enriched hub PPIs have a significantly enlarged number of disordered protein binding regions and long intrinsically disordered regions. They also include high numbers of targeting, catalytic, and many types of PTM sites. We empirically demonstrated that these hub PPIs are significantly enriched among 11 out of 18 considered classes of human diseases that are associated with at least 100 human proteins. Finally, we also illustrated how over a dozen specific human hubs utilize intrinsic disorder for their promiscuous PPIs.
|
47
|
Meng F, Uversky VN, Kurgan L. Comprehensive review of methods for prediction of intrinsic disorder and its molecular functions. Cell Mol Life Sci 2017; 74:3069-3090. [PMID: 28589442 PMCID: PMC11107660 DOI: 10.1007/s00018-017-2555-4] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Received: 05/18/2017] [Accepted: 06/01/2017] [Indexed: 12/19/2022]
Abstract
Computational prediction of intrinsic disorder in protein sequences dates back to the late 1970s and has flourished in the last two decades. We provide a brief historical overview, and we review over 30 recent predictors of disorder. We are the first to also cover predictors of molecular functions of disorder, including 13 methods that focus on disordered linkers and disordered protein-protein, protein-RNA, and protein-DNA binding regions. We overview their predictive models, usability, and predictive performance. We highlight the newest methods and the predictors that offer strong predictive performance in recent comparative assessments. We conclude that the modern predictors are relatively accurate, enjoy widespread use, and many of them are fast. Their predictions are conveniently accessible to end users via web servers and databases that store pre-computed predictions for millions of proteins. However, research into methods that predict the many not yet addressed functions of intrinsic disorder remains an outstanding challenge.
Affiliation(s)
- Fanchi Meng
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Vladimir N Uversky
  - Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
  - Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, Moscow Region, Russian Federation
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, USA
|
48
|
DeForte S, Uversky VN. Not an exception to the rule: the functional significance of intrinsically disordered protein regions in enzymes. Mol Biosyst 2017; 13:463-469. [PMID: 28098335 DOI: 10.1039/c6mb00741d] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 12/30/2022]
Abstract
Intrinsically disordered protein regions (IDPRs) are remarkably common and have unique and important biological functions. Enzymes have long been considered an exception to the rule of protein intrinsic disorder due to the structural requirements for catalysis. Although functionally significant IDPRs have been described in several enzymes, there has been no study quantifying the extent of this phenomenon. We have conducted a multilevel computational analysis of missing regions in X-ray crystal structures in the PDB and predicted disorder in 66 representative proteomes. We found that the fraction of predicted disorder was higher in non-enzymes than enzymes, because non-enzymes were more likely to be fully disordered. However, we also found that transferases, hydrolases and enzymes with multiple assigned functional classifications were similar to non-enzymes in terms of the length of the longest continuous stretch of predicted disorder. Both eukaryotic enzymes and non-enzymes had a greater disorder content than was seen in bacteria. Disorder at the proteome level appears to emerge in response to organismic and functional complexity, and enzymes are not an exception to this rule.
Affiliation(s)
- Shelly DeForte
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russian Federation
|
49
|
Na I, Meng F, Kurgan L, Uversky VN. Autophagy-related intrinsically disordered proteins in intra-nuclear compartments. Mol Biosyst 2017; 12:2798-817. [PMID: 27377881 DOI: 10.1039/c6mb00069j] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/19/2023]
Abstract
Recent analyses indicated that autophagy can be regulated via some nuclear transcriptional networks and many important players in the autophagy and other forms of programmed cell death are known to be intrinsically disordered. To this end, we analyzed similarities and differences in the intrinsic disorder distribution of nuclear and non-nuclear proteins related to autophagy. We also looked at the peculiarities of the distribution of the intrinsically disordered autophagy-related proteins in various intra-nuclear organelles, such as the nucleolus, chromatin, Cajal bodies, nuclear speckles, promyelocytic leukemia (PML) nuclear bodies, nuclear lamina, nuclear pores, and perinucleolar compartment. This analysis revealed that the autophagy-related proteins constitute about 2.5% of the non-nuclear proteins and 3.3% of the nuclear proteins, which corresponds to a substantial enrichment by about 32% in the nucleus. Curiously, although, in general, the autophagy-related proteins share similar characteristics of disorder with a generic set of all non-nuclear proteins, chromatin and nuclear speckles are enriched in the intrinsically disordered autophagy proteins (29 and 37% of these proteins are disordered, respectively) and have high disorder content at 0.24 and 0.27, respectively. Therefore, our data suggest that some of the nuclear disordered proteins may play important roles in autophagy.
Affiliation(s)
- Insung Na
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Fanchi Meng
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23219, USA
- Vladimir N Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; Biology Department, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia; Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
|
50
|
Davies HA, Rigden DJ, Phelan MM, Madine J. Probing Medin Monomer Structure and its Amyloid Nucleation Using 13C-Direct Detection NMR in Combination with Structural Bioinformatics. Sci Rep 2017; 7:45224. [PMID: 28327552 PMCID: PMC5361114 DOI: 10.1038/srep45224] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 12/02/2016] [Accepted: 02/20/2017] [Indexed: 12/21/2022] Open
Abstract
Aortic medial amyloid is the most prevalent amyloid found to date, but remarkably little is known about it. It is characterised by aberrant deposition of a 5.4 kDa protein called medin within the medial layer of large arteries. Here we employ a combined approach of ab initio protein modelling and 13C-direct detection NMR to generate a model for soluble monomeric medin comprising a stable core of three β-strands and shorter more labile strands at the termini. Molecular dynamics simulations suggested that detachment of the short, C-terminal β-strand from the soluble fold exposes key amyloidogenic regions as a potential site of nucleation enabling dimerisation and subsequent fibril formation. This mechanism resembles models proposed for several other amyloidogenic proteins suggesting that despite variations in sequence and protomer structure these proteins may share a common pathway for amyloid nucleation and subsequent protofibril and fibril formation.
Affiliation(s)
- Hannah A. Davies
  - Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB, UK
- Daniel J. Rigden
  - Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB, UK
- Marie M. Phelan
  - Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB, UK
- Jillian Madine
  - Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB, UK
|