1
Gavalda-Garcia J, Bickel D, Roca-Martinez J, Raimondi D, Orlando G, Vranken W. Data-driven probabilistic definition of the low energy conformational states of protein residues. NAR Genom Bioinform 2024; 6:lqae082. [PMID: 38984065 PMCID: PMC11231583 DOI: 10.1093/nargab/lqae082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 06/14/2024] [Accepted: 06/26/2024] [Indexed: 07/11/2024] Open
Abstract
Protein dynamics and related conformational changes are essential for their function but difficult to characterise and interpret. Amino acids in a protein behave according to their local energy landscape, which is determined by their local structural context and environmental conditions. The lowest energy state for a given residue can correspond to sharply defined conformations, e.g. in a stable helix, or can cover a wide range of conformations, e.g. in intrinsically disordered regions. A good definition of such low energy states is therefore important to describe the behaviour of a residue and how it changes with its environment. We propose a data-driven probabilistic definition of six low energy conformational states typically accessible for amino acid residues in proteins. This definition is based on solution NMR information of 1322 proteins through a combined analysis of structure ensembles with interpreted chemical shifts. We further introduce a conformational state variability parameter that captures, based on an ensemble of protein structures from molecular dynamics or other methods, how often a residue moves between these conformational states. The approach enables a different perspective on the local conformational behaviour of proteins that is complementary to their static interpretation from single structure models.
Affiliation(s)
- Jose Gavalda-Garcia
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- David Bickel
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Joel Roca-Martinez
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Wim Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
2
Pesce F, Bremer A, Tesei G, Hopkins JB, Grace CR, Mittag T, Lindorff-Larsen K. Design of intrinsically disordered protein variants with diverse structural properties. Science Advances 2024; 10:eadm9926. [PMID: 39196930 PMCID: PMC11352843 DOI: 10.1126/sciadv.adm9926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 06/07/2024] [Indexed: 08/30/2024]
Abstract
Intrinsically disordered proteins (IDPs) perform a broad range of functions in biology, suggesting that the ability to design IDPs could help expand the repertoire of proteins with novel functions. Computational design of IDPs with specific conformational properties has, however, been difficult because of their substantial dynamics and structural complexity. We describe a general algorithm for designing IDPs with specific structural properties. We demonstrate the power of the algorithm by generating variants of naturally occurring IDPs that differ in compaction, long-range contacts, and propensity to phase separate. We experimentally tested and validated our designs and analyzed the sequence features that determine conformations. We show how our results are captured by a machine learning model, enabling us to speed up the algorithm. Our work expands the toolbox for computational protein design and will facilitate the design of proteins whose functions exploit the many properties afforded by protein disorder.
Affiliation(s)
- Francesco Pesce
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Anne Bremer
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Giulio Tesei
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jesse B. Hopkins
  - BioCAT, Department of Physics, Illinois Institute of Technology, Chicago, IL 60616, USA
- Christy R. Grace
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Tanja Mittag
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
3
Bastida A, Zúñiga J, Fogolari F, Soler MA. Statistical accuracy of molecular dynamics-based methods for sampling conformational ensembles of disordered proteins. Phys Chem Chem Phys 2024. [PMID: 39190324 DOI: 10.1039/d4cp02564d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/28/2024]
Abstract
The characterization of the statistical ensemble of conformations of intrinsically disordered regions (IDRs) is a great challenge both from experimental and computational points of view. In this respect, a number of protocols have been developed using molecular dynamics (MD) simulations to sample the huge conformational space of the molecule. In this work, we consider one of the best methods available, replica exchange solute tempering (REST), as a reference to compare the results obtained using this method with the results obtained using other methods, in terms of experimentally measurable quantities. Along with the methods assessed, we propose here a novel protocol called probabilistic MD chain growth (PMD-CG), which combines the flexible-meccano and hierarchical chain growth methods with the statistical data obtained from tripeptide MD trajectories as the starting point. The system chosen for testing is a 20-residue region from the C-terminal domain of the p53 tumor suppressor protein (p53-CTD). Our results show that PMD-CG provides an ensemble of conformations extremely quickly, after suitable computation of the conformational pool for all peptide triplets of the IDR sequence. The measurable quantities computed on the ensemble of conformations agree well with those based on the REST conformational ensemble.
Affiliation(s)
- Adolfo Bastida
  - Departamento de Química Física, Universidad de Murcia, 30100 Murcia, Spain.
- José Zúñiga
  - Departamento de Química Física, Universidad de Murcia, 30100 Murcia, Spain.
- Federico Fogolari
  - Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Università di Udine, 33100 Udine, Italy.
- Miguel A Soler
  - Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Università di Udine, 33100 Udine, Italy.
4
Pesce G, Gondelaud F, Ptchelkine D, Bignon C, Fourquet P, Longhi S. Dissecting Henipavirus W proteins conformational and fibrillation properties: contribution of their N- and C-terminal constituent domains. FEBS J 2024. [PMID: 39180270 DOI: 10.1111/febs.17239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/07/2024] [Accepted: 07/23/2024] [Indexed: 08/26/2024]
Abstract
The Nipah and Hendra viruses are severe human pathogens. In addition to the P protein, their P gene also encodes the V and W proteins that share with P their N-terminal intrinsically disordered domain (NTD) and possess distinct C-terminal domains (CTDs). The W protein is a key player in the evasion of the host innate immune response. We previously showed that the W proteins are intrinsically disordered and can form amyloid-like fibrils. However, structural information on W CTD (CTDW) and its potential contribution to the fibrillation process is lacking. In this study, we demonstrate that CTDWs are disordered and able to form dimers mediated by disulfide bridges. We also show that the NTD and the CTDW interact with each other and that this interaction triggers both a gain of secondary structure and a chain compaction within the NTD. Finally, despite the lack of intrinsic fibrillogenic properties, we show that the CTDW favors the formation of fibrils by the NTD both in cis and in trans. Altogether, the results herein presented shed light on the molecular mechanisms underlying Henipavirus pathogenesis and may thus contribute to the development of targeted therapies.
Affiliation(s)
- Giulia Pesce
  - Laboratoire Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Centre National de la Recherche Scientifique (CNRS) and Aix Marseille University, France
- Frank Gondelaud
  - Laboratoire Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Centre National de la Recherche Scientifique (CNRS) and Aix Marseille University, France
- Denis Ptchelkine
  - Laboratoire Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Centre National de la Recherche Scientifique (CNRS) and Aix Marseille University, France
- Christophe Bignon
  - Laboratoire Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Centre National de la Recherche Scientifique (CNRS) and Aix Marseille University, France
- Patrick Fourquet
  - INSERM, Centre de Recherche en Cancérologie de Marseille (CRCM), Centre National de la Recherche Scientifique (CNRS), Marseille Protéomique, Institut Paoli-Calmettes, Aix Marseille University, France
- Sonia Longhi
  - Laboratoire Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Centre National de la Recherche Scientifique (CNRS) and Aix Marseille University, France
5
Zhang O, Naik SA, Liu ZH, Forman-Kay J, Head-Gordon T. A curated rotamer library for common post-translational modifications of proteins. Bioinformatics 2024; 40:btae444. [PMID: 38995731 PMCID: PMC11254353 DOI: 10.1093/bioinformatics/btae444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 06/06/2024] [Accepted: 07/11/2024] [Indexed: 07/14/2024] Open
Abstract
MOTIVATION: Sidechain rotamer libraries of the common amino acids of a protein are useful for folded protein structure determination and for generating ensembles of intrinsically disordered proteins (IDPs). However, much of protein function is modulated beyond the translated sequence through the introduction of post-translational modifications (PTMs).
RESULTS: In this work, we have provided a curated set of side chain rotamers for the most common PTMs derived from the RCSB PDB database, including phosphorylated, methylated, and acetylated sidechains. Our rotamer libraries improve upon existing methods such as SIDEpro, Rosetta, and AlphaFold3 in predicting the experimental structures for PTMs in folded proteins. In addition, we showcase our PTM libraries in full use by generating ensembles with the Monte Carlo Side Chain Entropy (MCSCE) for folded proteins, and combining MCSCE with the Local Disordered Region Sampling algorithms within IDPConformerGenerator for proteins with intrinsically disordered regions.
AVAILABILITY AND IMPLEMENTATION: The codes for dihedral angle computations and library creation are available at https://github.com/THGLab/ptm_sc.git.
Affiliation(s)
- Oufan Zhang
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, CA 94720, United States
- Shubhankar A Naik
  - Department of Chemistry, University of California, Berkeley, CA 94720, United States
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Julie Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, CA 94720, United States
  - Department of Chemistry, University of California, Berkeley, CA 94720, United States
  - Department of Bioengineering, University of California, Berkeley, CA 94720, United States
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, United States
6
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences. 
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
7
Viegas RG, Martins IBS, Leite VBP. Understanding the Energy Landscape of Intrinsically Disordered Protein Ensembles. J Chem Inf Model 2024; 64:4149-4157. [PMID: 38713459 DOI: 10.1021/acs.jcim.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
A substantial portion of various organisms' proteomes comprises intrinsically disordered proteins (IDPs) that lack a defined three-dimensional structure. These IDPs exhibit a diverse array of conformations, displaying remarkable spatiotemporal heterogeneity and exceptional conformational flexibility. Characterizing the structure or structural ensemble of IDPs presents significant conceptual and methodological challenges owing to the absence of a well-defined native structure. While databases such as the Protein Ensemble Database (PED) provide IDP ensembles obtained through a combination of experimental data and molecular modeling, the absence of reaction coordinates poses challenges in comprehensively understanding pertinent aspects of the system. In this study, we leverage the energy landscape visualization method (JCTC, 6482, 2019) to scrutinize four IDP ensembles sourced from PED. ELViM, a methodology that circumvents the need for a priori reaction coordinates, aids in analyzing the ensembles. The specific IDP ensembles investigated are as follows: two fragments of nucleoporin (NUL: 884-993 and NUS: 1313-1390), yeast sic 1 N-terminal (1-90), and the N-terminal SH3 domain of Drk (1-59). Utilizing ELViM enables the comprehensive validation of ensembles, facilitating the detection of potential inconsistencies in the sampling process. Additionally, it allows for identifying and characterizing the most prevalent conformations within an ensemble. Moreover, ELViM facilitates the comparative analysis of ensembles obtained under diverse conditions, thereby providing a powerful tool for investigating the functional mechanisms of IDPs.
Affiliation(s)
- Rafael G Viegas
  - Federal Institute of Education, Science and Technology of São Paulo (IFSP), Catanduva, São Paulo 15.808-305, Brazil
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
- Ingrid B S Martins
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
- Vitor B P Leite
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo 15054-000, Brazil
8
Fazekas Z, K Menyhárd D, Perczel A. LoCoHD: a metric for comparing local environments of proteins. Nat Commun 2024; 15:4029. [PMID: 38740745 DOI: 10.1038/s41467-024-48225-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 04/22/2024] [Indexed: 05/16/2024] Open
Abstract
Protein folds and the local environments they create can be compared using a variety of differently designed measures, such as the root mean squared deviation, the global distance test, the template modeling score or the local distance difference test. Although these measures have proven to be useful for a variety of tasks, each fails to fully incorporate the valuable chemical information inherent to atoms and residues, and considers these only partially and indirectly. Here, we develop the highly flexible local composition Hellinger distance (LoCoHD) metric, which is based on the chemical composition of local residue environments. Using LoCoHD, we analyze the chemical heterogeneity of amino acid environments and identify valines having the most conserved-, and arginines having the most variable chemical environments. We use LoCoHD to investigate structural ensembles, to evaluate critical assessment of structure prediction (CASP) competitors, to compare the results with the local distance difference test (lDDT) scoring system, and to evaluate a molecular dynamics simulation. We show that LoCoHD measurements provide unique information about protein structures that is distinct from, for example, those derived using the alignment-based RMSD metric, or the similarly distance matrix-based but alignment-free lDDT metric.
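As an illustration of the quantity named in the abstract above, the following is a minimal sketch of the plain Hellinger distance between two discrete composition vectors. It is not the LoCoHD implementation: the function name `hellinger` and the three-category example environments are hypothetical, and the abstract does not describe how LoCoHD builds or weights its environments.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of non-negative weights (for
    example, counts of atom types in a local environment); each is
    normalised to sum to 1 before the distance is computed.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs a positive total weight")
    squared = sum(
        (math.sqrt(a / total_p) - math.sqrt(b / total_q)) ** 2
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / 2.0)


# Hypothetical composition vectors: counts of three made-up atom categories
# around two residues (illustrative numbers only, not data from the paper).
env_a = [6, 3, 1]
env_b = [2, 3, 5]

print(hellinger(env_a, env_a))    # identical compositions give 0.0
print(hellinger([1, 0], [0, 1]))  # non-overlapping compositions give 1.0
print(hellinger(env_a, env_b))    # a value strictly between 0 and 1
```

The distance is bounded between 0 (identical compositions) and 1 (no shared categories), which makes values comparable across environments of different size.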
Affiliation(s)
- Zsolt Fazekas
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
  - ELTE Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- Dóra K Menyhárd
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
  - HUN-REN-ELTE Protein Modeling Research Group, ELTE Eötvös Loránd University, Budapest, Hungary
- András Perczel
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary.
  - HUN-REN-ELTE Protein Modeling Research Group, ELTE Eötvös Loránd University, Budapest, Hungary.
9
Zhang O, Naik SA, Liu ZH, Forman-Kay J, Head-Gordon T. A Curated Rotamer Library for Common Post-Translational Modifications of Proteins. arXiv 2024:arXiv:2405.03120v1. [PMID: 38764597 PMCID: PMC11100909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2024]
Abstract
Sidechain rotamer libraries of the common amino acids of a protein are useful for folded protein structure determination and for generating ensembles of intrinsically disordered proteins (IDPs). However, much of protein function is modulated beyond the translated sequence through the introduction of post-translational modifications (PTMs). In this work, we have provided a curated set of side chain rotamers for the most common PTMs derived from the RCSB PDB database, including phosphorylated, methylated, and acetylated sidechains. Our rotamer libraries improve upon existing methods such as SIDEpro and Rosetta in predicting the experimental structures for PTMs in folded proteins. In addition, we showcase our PTM libraries in full use by generating ensembles with the Monte Carlo Side Chain Entropy (MCSCE) for folded proteins, and combining MCSCE with the Local Disordered Region Sampling algorithms within IDPConformerGenerator for proteins with intrinsically disordered regions.
Affiliation(s)
- Oufan Zhang
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Shubhankar A. Naik
  - Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Zi Hao Liu
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Julie Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, California 94720, USA
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California 94720, USA
10
Maiti S, Singh A, Maji T, Saibo NV, De S. Experimental methods to study the structure and dynamics of intrinsically disordered regions in proteins. Curr Res Struct Biol 2024; 7:100138. [PMID: 38707546 PMCID: PMC11068507 DOI: 10.1016/j.crstbi.2024.100138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 03/12/2024] [Accepted: 03/15/2024] [Indexed: 05/07/2024] Open
Abstract
Eukaryotic proteins often feature long stretches of amino acids that lack a well-defined three-dimensional structure and are referred to as intrinsically disordered proteins (IDPs) or regions (IDRs). Although these proteins challenge conventional structure-function paradigms, they play vital roles in cellular processes. Recent progress in experimental techniques, such as NMR spectroscopy, single molecule FRET, high speed AFM and SAXS, has provided valuable insights into the biophysical basis of IDP function. This review discusses the advancements made in these techniques, particularly for the study of disordered regions in proteins. In NMR spectroscopy, new strategies such as 13C detection, non-uniform sampling, segmental isotope labeling, and rapid data acquisition methods address the challenges posed by spectral overcrowding and low stability of IDPs. The importance of various NMR parameters, including chemical shifts, hydrogen exchange rates, and relaxation measurements, in revealing transient secondary structures within IDRs and IDPs is presented. Given the high flexibility of IDPs, the review outlines NMR methods for assessing their dynamics at both fast (ps-ns) and slow (μs-ms) timescales. IDPs exert their functions through interactions with other molecules such as proteins, DNA, or RNA. NMR-based titration experiments yield insights into the thermodynamics and kinetics of these interactions. Detailed study of IDPs requires multiple experimental techniques, and thus, several methods are described for studying disordered proteins, highlighting their respective advantages and limitations. The potential for integrating these complementary techniques, each offering unique perspectives, is explored to achieve a comprehensive understanding of IDPs.
Affiliation(s)
- Aakanksha Singh
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Tanisha Maji
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Nikita V. Saibo
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Soumya De
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
11
Jeschke G. Protein ensemble modeling and analysis with MMMx. Protein Sci 2024; 33:e4906. [PMID: 38358120 PMCID: PMC10868441 DOI: 10.1002/pro.4906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 01/04/2024] [Accepted: 01/06/2024] [Indexed: 02/16/2024]
Abstract
Proteins, especially of eukaryotes, often have disordered domains and may contain multiple folded domains whose relative spatial arrangement is distributed. The MMMx ensemble modeling and analysis toolbox (https://github.com/gjeschke/MMMx) can support the design of experiments to characterize the distributed structure of such proteins, starting from AlphaFold2 predictions or folded domain structures. Weak order can be analyzed with reference to a random coil model or to peptide chains that match the residue-specific Ramachandran angle distribution of the loop regions and are otherwise unrestrained. The deviation of the mean square end-to-end distance of chain sections from their average over sections of the same sequence length reveals localized compaction or expansion of the chain. The shape sampled by disordered chains is visualized by superposition in the principal axes frame of their inertia tensor. Ensembles of different sizes and with weighted conformers can be compared based on a similarity parameter that abstracts from the ensemble width.
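The compaction analysis described in the abstract above can be sketched roughly as follows. This is an illustrative Python example, not code from the MMMx toolbox: the function names, the coordinate layout, and the choice to report a plain difference from the average (rather than a ratio or another normalisation) are all assumptions made here.

```python
def mean_square_distance(ensemble, i, j):
    """Ensemble-averaged squared distance between residues i and j.

    ensemble is a list of conformers; each conformer is a list of
    (x, y, z) tuples, one per residue. Conformers are weighted equally.
    """
    total = 0.0
    for conformer in ensemble:
        total += sum((a - b) ** 2 for a, b in zip(conformer[i], conformer[j]))
    return total / len(ensemble)


def section_deviations(ensemble, k):
    """Deviation of <R^2> for each chain section of sequence separation k.

    Returns, for every section from residue i to residue i + k, its mean
    square end-to-end distance minus the average over all sections with
    the same separation. Negative values suggest local compaction,
    positive values local expansion.
    """
    n_residues = len(ensemble[0])
    if not 0 < k < n_residues:
        raise ValueError("k must be between 1 and the chain length minus 1")
    msd = [mean_square_distance(ensemble, i, i + k) for i in range(n_residues - k)]
    average = sum(msd) / len(msd)
    return [value - average for value in msd]


# Toy ensemble of two four-residue conformers (made-up coordinates):
# the last bond is stretched relative to the first two.
toy_ensemble = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (4, 0, 0)],
    [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 4, 0)],
]

print(section_deviations(toy_ensemble, 1))  # [-1.0, -1.0, 2.0]
```

In the toy case the third section stands out with a positive deviation, the kind of localized expansion signal the abstract refers to; weighted conformers would require replacing the plain average in `mean_square_distance` with a weighted one.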
Affiliation(s)
- Gunnar Jeschke
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Zürich, Switzerland
12
Holehouse AS, Kragelund BB. The molecular basis for cellular function of intrinsically disordered protein regions. Nat Rev Mol Cell Biol 2024; 25:187-211. [PMID: 37957331 DOI: 10.1038/s41580-023-00673-0] [Citation(s) in RCA: 55] [Impact Index Per Article: 55.0] [Accepted: 09/26/2023] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered protein regions exist in a collection of dynamic interconverting conformations that lack a stable 3D structure. These regions are structurally heterogeneous, ubiquitous and found across all kingdoms of life. Despite the absence of a defined 3D structure, disordered regions are essential for cellular processes ranging from transcriptional control and cell signalling to subcellular organization. Through their conformational malleability and adaptability, disordered regions extend the repertoire of macromolecular interactions and are readily tunable by their structural and chemical context, making them ideal responders to regulatory cues. Recent work has led to major advances in understanding the link between protein sequence and conformational behaviour in disordered regions, yet the link between sequence and molecular function is less well defined. Here we consider the biochemical and biophysical foundations that underlie how and why disordered regions can engage in productive cellular functions, provide examples of emerging concepts and discuss how protein disorder contributes to intracellular information processing and regulation of cellular function.
Affiliation(s)
- Alex S Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St Louis, MO, USA.
  - Center for Biomolecular Condensates, Washington University in St Louis, St Louis, MO, USA.
- Birthe B Kragelund
  - REPIN, Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
13
Vieira MFM, Hernandez G, Zhong Q, Arbesú M, Veloso T, Gomes T, Martins ML, Monteiro H, Frazão C, Frankel G, Zanzoni A, Cordeiro TN. The pathogen-encoded signalling receptor Tir exploits host-like intrinsic disorder for infection. Commun Biol 2024; 7:179. [PMID: 38351154 PMCID: PMC10864410 DOI: 10.1038/s42003-024-05856-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 01/26/2024] [Indexed: 02/16/2024] Open
Abstract
The translocated intimin receptor (Tir) is an essential type III secretion system (T3SS) effector of attaching and effacing pathogens contributing to the global foodborne disease burden. Tir acts as a cell-surface receptor in host cells, rewiring intracellular processes by targeting multiple host proteins. We investigated the molecular basis for Tir's binding diversity in signalling, finding that Tir is a disordered protein with host-like binding motifs. Unexpectedly, so are several other T3SS effectors. By an integrative approach, we reveal that Tir dimerises via an antiparallel OB-fold within a highly disordered N-terminal cytosolic domain. Also, it has a long disordered C-terminal cytosolic domain partially structured at host-like motifs that bind lipids. Membrane affinity depends on lipid composition and phosphorylation, highlighting a previously unrecognised host interaction impacting Tir-induced actin polymerisation and cell death. Furthermore, multi-site tyrosine phosphorylation enables Tir to engage host SH2 domains in a multivalent fuzzy complex, consistent with Tir's scaffolding role and binding promiscuity. Our findings provide insights into the intracellular Tir domains, highlighting the ability of T3SS effectors to exploit host-like protein disorder as a strategy for host evasion.
Affiliation(s)
- Marta F M Vieira
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Guillem Hernandez
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Qiyun Zhong
- Department of Life Sciences, Imperial College London, South Kensington Campus, London, UK
- Miguel Arbesú
- Department of NMR-supported Structural Biology, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin, Germany
- InstaDeep Ltd, 5 Merchant Square, London, UK
- Tiago Veloso
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Tiago Gomes
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Maria L Martins
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Hugo Monteiro
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Carlos Frazão
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Gad Frankel
- Department of Life Sciences, Imperial College London, South Kensington Campus, London, UK
- Andreas Zanzoni
- Aix-Marseille Université, Inserm, TAGC, UMR_S1090, Marseille, France
- Tiago N Cordeiro
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal.
14
Condeminas M, Macias MJ. Overcoming challenges in structural biology with integrative approaches and nanobody-derived technologies. Curr Opin Struct Biol 2024; 84:102764. [PMID: 38215529 DOI: 10.1016/j.sbi.2023.102764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 11/24/2023] [Accepted: 12/11/2023] [Indexed: 01/14/2024]
Abstract
A full understanding of protein structure is key to unraveling how these systems work, how mutations affect their function, and discovering new hotspots for drug discovery. Research tackling this field began with the analysis of globular proteins. In recent years, as technology has improved, research efforts have broadened their focus to include the study of multidomain proteins and the analysis of conformational variability, flexibility, and dynamic systems. Here, we have selected five recent examples that integrate complementary structural methods to provide insight into the behavior of modular, flexible, and transient contacts. We also describe the structural application of domains derived from single-chain antibodies, which are instrumental in overcoming the size limitation of cryogenic electron microscopy (cryoEM) studies. As these methods are continuously developed, they will lead to the interrogation of more complex systems, revealing how large signaling and transcriptional machines are assembled in the context of health and disease.
Affiliation(s)
- Miriam Condeminas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Carrer de Baldiri Reixac 10, Barcelona 08028, Spain; Department of Medicine and Life Sciences, Universitat Pompeu Fabra (MELIS-UPF), Carrer del Doctor Aiguader 88, Barcelona 08003, Spain
- Maria J Macias
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Carrer de Baldiri Reixac 10, Barcelona 08028, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, Barcelona 08010, Spain.
15
Aspromonte MC, Nugnes MV, Quaglia F, Bouharoua A, Tosatto SCE, Piovesan D. DisProt in 2024: improving function annotation of intrinsically disordered proteins. Nucleic Acids Res 2024; 52:D434-D441. [PMID: 37904585 PMCID: PMC10767923 DOI: 10.1093/nar/gkad928] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/08/2023] [Revised: 10/05/2023] [Accepted: 10/10/2023] [Indexed: 11/01/2023] Open
Abstract
DisProt (URL: https://disprot.org) is the gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions. The latest version of DisProt brings significant advancements, including a broader representation of functions and an enhanced curation process. These improvements aim to increase both the quality of annotations and their coverage at the sequence level. Higher coverage has been achieved by adopting additional evidence codes. Quality of annotations has been improved by systematically applying Minimum Information About Disorder Experiments (MIADE) principles and reporting all the details of the experimental setup that could potentially influence the structural state of a protein. The DisProt database now includes new thematic datasets and has expanded the adoption of Gene Ontology terms, resulting in an extensive functional repertoire which is automatically propagated to UniProtKB. Finally, we show that DisProt's curated annotations strongly correlate with disorder predictions inferred from AlphaFold2 pLDDT (predicted Local Distance Difference Test) confidence scores. This comparison highlights the utility of DisProt in explaining apparent uncertainty of certain well-defined predicted structures, which often correspond to folding-upon-binding fragments. Overall, DisProt serves as a comprehensive resource, combining experimental evidence of disorder information to enhance our understanding of intrinsically disordered proteins and their functional implications.
Affiliation(s)
- Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Adel Bouharoua
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
16
Thakur M, Buniello A, Brooksbank C, Gurwitz KT, Hall M, Hartley M, Hulcoop DG, Leach AR, Marques D, Martin M, Mithani A, McDonagh EM, Mutasa-Gottgens E, Ochoa D, Perez-Riverol Y, Stephenson J, Varadi M, Velankar S, Vizcaino JA, Witham R, McEntyre J. EMBL's European Bioinformatics Institute (EMBL-EBI) in 2023. Nucleic Acids Res 2024; 52:D10-D17. [PMID: 38015445 PMCID: PMC10767983 DOI: 10.1093/nar/gkad1088] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/26/2023] [Revised: 10/23/2023] [Accepted: 10/30/2023] [Indexed: 11/29/2023] Open
Abstract
The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the latest developments in the services provided by EMBL-EBI data resources to scientific communities globally. These developments aim to ensure EMBL-EBI resources meet the current and future needs of these scientific communities, accelerating the impact of open biological data for all.
Affiliation(s)
- Matthew Thakur
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Annalisa Buniello
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Catherine Brooksbank
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Kim T Gurwitz
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Matthew Hall
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Matthew Hartley
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- David G Hulcoop
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Andrew R Leach
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Diana Marques
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Maria Martin
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Aziz Mithani
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Ellen M McDonagh
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Euphemia Mutasa-Gottgens
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- David Ochoa
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Yasset Perez-Riverol
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- James Stephenson
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Mihaly Varadi
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Sameer Velankar
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Juan Antonio Vizcaino
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Rick Witham
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Johanna McEntyre
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
17
Ghafouri H, Lazar T, Del Conte A, Tenorio Ku LG, Tompa P, Tosatto SCE, Monzon AM. PED in 2024: improving the community deposition of structural ensembles for intrinsically disordered proteins. Nucleic Acids Res 2024; 52:D536-D544. [PMID: 37904608 PMCID: PMC10767937 DOI: 10.1093/nar/gkad947] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/15/2023] [Revised: 10/10/2023] [Accepted: 10/13/2023] [Indexed: 11/01/2023] Open
Abstract
The Protein Ensemble Database (PED) (URL: https://proteinensemble.org) is the primary resource for depositing structural ensembles of intrinsically disordered proteins. This updated version of PED reflects advancements in the field, denoting a continual expansion with a total of 461 entries and 538 ensembles, including those generated without explicit experimental data through novel machine learning (ML) techniques. With this significant increment in the number of ensembles, a few yet-unprecedented new entries entered the database, including those also determined or refined by electron paramagnetic resonance or circular dichroism data. In addition, PED was enriched with several new features, including a novel deposition service, improved user interface, new database cross-referencing options and integration with the 3D-Beacons network, all representing efforts to improve the FAIRness of the database. Foreseeably, PED will keep growing in size and expanding with new types of ensembles generated by accurate and fast ML-based generative models and coarse-grained simulations. Therefore, among future efforts, priority will be given to further develop the database to be compatible with ensembles modeled at a coarse-grained level.
Affiliation(s)
- Tamas Lazar
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels, Department of Bioengineering, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Peter Tompa
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels, Department of Bioengineering, Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences (RCNS), Budapest, Hungary
18
Bolding JE, Nielsen AL, Jensen I, Hansen TN, Ryberg LA, Jameson ST, Harris P, Peters GHJ, Denu JM, Rogers JM, Olsen CA. Substrates and Cyclic Peptide Inhibitors of the Oligonucleotide-Activated Sirtuin 7. Angew Chem Int Ed Engl 2023; 62:e202314597. [PMID: 37873919 DOI: 10.1002/anie.202314597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 10/19/2023] [Accepted: 10/23/2023] [Indexed: 10/25/2023]
Abstract
The sirtuins are NAD+-dependent lysine deacylases, comprising seven isoforms (SIRT1-7) in humans, which are involved in the regulation of a plethora of biological processes, including gene expression and metabolism. The sirtuins share a common hydrolytic mechanism but display preferences for different ϵ-N-acyllysine substrates. SIRT7 deacetylates targets in nuclei and nucleoli but remains one of the lesser studied of the seven isoforms, in part due to a lack of chemical tools to specifically probe SIRT7 activity. Here we expressed SIRT7 and, using small-angle X-ray scattering, reveal SIRT7 to be a monomeric enzyme with a low degree of globular flexibility in solution. We developed a fluorogenic assay for investigation of the substrate preferences of SIRT7 and to evaluate compounds that modulate its activity. We report several mechanism-based SIRT7 inhibitors as well as de novo cyclic peptide inhibitors selected from mRNA-display library screening that exhibit selectivity for SIRT7 over other sirtuin isoforms, stabilize SIRT7 in cells, and cause an increase in the acetylation of H3K18.
Affiliation(s)
- Julie E Bolding
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Alexander L Nielsen
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Current address: Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Iben Jensen
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Tobias N Hansen
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Line A Ryberg
- Department of Chemistry, Technical University of Denmark, 2800, Kgs. Lyngby, Denmark
- Current address: Department of Immunology and Microbiology, University of Copenhagen, 2200, Copenhagen, Denmark
- Samuel T Jameson
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Pernille Harris
- Department of Chemistry, Technical University of Denmark, 2800, Kgs. Lyngby, Denmark
- Current address: Department of Chemistry, University of Copenhagen, 2200, Copenhagen, Denmark
- Günther H J Peters
- Department of Chemistry, Technical University of Denmark, 2800, Kgs. Lyngby, Denmark
- John M Denu
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Joseph M Rogers
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
- Christian A Olsen
- Center for Biopharmaceuticals & Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Jagtvej 160, 2100, Copenhagen, Denmark
19
Zhang Y, Zhang Z, Kagaya Y, Terashi G, Zhao B, Xiong Y, Kihara D. Distance-AF: Modifying Predicted Protein Structure Models by Alphafold2 with User-Specified Distance Constraints. bioRxiv 2023:2023.12.01.569498. [PMID: 38106200 PMCID: PMC10723377 DOI: 10.1101/2023.12.01.569498] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/19/2023]
Abstract
The three-dimensional structure of a protein plays a fundamental role in determining its function and has an essential impact on understanding biological processes. Despite significant progress in protein structure prediction, such as AlphaFold2, challenges remain for hard targets on which AlphaFold2 often does not perform well, owing to complex protein folding and the large number of possible conformations. Here we present a modified version of AlphaFold2, called Distance-AF, which aims to improve the performance of AlphaFold2 by including distance constraints as input information. Distance-AF uses AlphaFold2's predicted structure as a starting point and incorporates distance constraints between amino acids to adjust the folding of the protein structure until it meets the constraints. Distance-AF can correct the domain orientation on challenging targets, leading to more accurate structures with a lower root mean square deviation (RMSD). This ability of Distance-AF is also useful in fitting protein structures into cryo-electron microscopy maps.
Affiliation(s)
- Yuanyuan Zhang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Yuki Kagaya
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
- Bowen Zhao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA
20
Liu ZH, Teixeira JMC, Zhang O, Tsangaris TE, Li J, Gradinaru CC, Head-Gordon T, Forman-Kay JD. Local Disordered Region Sampling (LDRS) for ensemble modeling of proteins with experimentally undetermined or low confidence prediction segments. Bioinformatics 2023; 39:btad739. [PMID: 38060268 PMCID: PMC10733734 DOI: 10.1093/bioinformatics/btad739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 10/30/2023] [Accepted: 12/06/2023] [Indexed: 12/08/2023] Open
Abstract
SUMMARY: The Local Disordered Region Sampling (LDRS, pronounced loaders) tool is a new module developed for IDPConformerGenerator, a previously validated approach to model intrinsically disordered proteins (IDPs). The IDPConformerGenerator LDRS module provides a method for generating all-atom conformations of intrinsically disordered protein regions at the N- and C-termini of, and in loops or linkers between, folded regions of an existing protein structure. These disordered elements often lead to missing coordinates in experimental structures or low confidence in predicted structures. Requiring only a pre-existing PDB or mmCIF formatted structural template of the protein with missing coordinates or with predicted confidence scores and its full-length primary sequence, LDRS will automatically generate physically meaningful conformational ensembles of the missing flexible regions to complete the full-length protein. The capabilities of the LDRS tool of IDPConformerGenerator include modeling phosphorylation sites using enhanced Monte Carlo-Side Chain Entropy, transmembrane proteins within an all-atom bilayer, and multi-chain complexes. The modeling capacity of LDRS capitalizes on the modularity, the ability to be used as a library and via command-line, and the computational speed of the IDPConformerGenerator platform. AVAILABILITY AND IMPLEMENTATION: The LDRS module is part of the IDPConformerGenerator modeling suite, which can be downloaded from GitHub at https://github.com/julie-forman-kay-lab/IDPConformerGenerator. IDPConformerGenerator is written in Python3 and works on Linux, Microsoft Windows, and Mac OS versions that support DSSP. Users can utilize LDRS's Python API for scripting the same way they can use any part of IDPConformerGenerator's API, by importing functions from the "idpconfgen.ldrs_helper" library. Otherwise, LDRS can be used as a command line interface application within IDPConformerGenerator. Full documentation is available within the command-line interface as well as on IDPConformerGenerator's official documentation pages (https://idpconformergenerator.readthedocs.io/en/latest/).
Affiliation(s)
- Zi Hao Liu
- Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- João M C Teixeira
- Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Oufan Zhang
- Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
- Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
- Thomas E Tsangaris
- Department of Physics, University of Toronto, Toronto, ON M5S 1A7, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, ON L5L 1C6, Canada
- Jie Li
- Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
- Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
- Claudiu C Gradinaru
- Department of Physics, University of Toronto, Toronto, ON M5S 1A7, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, ON L5L 1C6, Canada
- Teresa Head-Gordon
- Pitzer Center for Theoretical Chemistry, University of California, Berkeley, Berkeley, CA 94720, United States
- Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720-1460, United States
- Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, CA 94720-1462, United States
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94720-1762, United States
- Julie D Forman-Kay
- Molecular Medicine Program, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
21
Jones MS, Shmilovich K, Ferguson AL. DiAMoNDBack: Diffusion-Denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces. J Chem Theory Comput 2023; 19:7908-7923. [PMID: 37906711 DOI: 10.1021/acs.jctc.3c00840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Coarse-grained molecular models of proteins permit access to length and time scales unattainable by all-atom models and the simulation of processes that occur on long time scales, such as aggregation and folding. The reduced resolution realizes computational accelerations, but an atomistic representation can be vital for a complete understanding of mechanistic details. Backmapping is the process of restoring all-atom resolution to coarse-grained molecular models. In this work, we report DiAMoNDBack (Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping) as an autoregressive denoising diffusion probability model to restore all-atom details to coarse-grained protein representations retaining only Cα coordinates. The autoregressive generation process proceeds from the protein N-terminus to C-terminus in a residue-by-residue fashion conditioned on the Cα trace and previously backmapped backbone and side-chain atoms within the local neighborhood. The local and autoregressive nature of our model makes it transferable between proteins. The stochastic nature of the denoising diffusion process means that the model generates a realistic ensemble of backbone and side-chain all-atom configurations consistent with the coarse-grained Cα trace. We train DiAMoNDBack on more than 65k structures from the Protein Data Bank (PDB) and validate it in applications to a hold-out PDB test set, intrinsically disordered protein structures from the Protein Ensemble Database (PED), molecular dynamics simulations of fast-folding mini-proteins from DE Shaw Research, and coarse-grained simulation data. We achieve state-of-the-art reconstruction performance in terms of correct bond formation, avoidance of side-chain clashes, and the diversity of the generated side-chain configurational states. We make the DiAMoNDBack model publicly available as a free and open-source Python package.
Affiliation(s)
- Michael S Jones
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Kirill Shmilovich
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
22
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
23
Pesce F, Bremer A, Tesei G, Hopkins JB, Grace CR, Mittag T, Lindorff-Larsen K. Design of intrinsically disordered protein variants with diverse structural properties. bioRxiv 2023:2023.10.22.563461. [PMID: 37961110 PMCID: PMC10634714 DOI: 10.1101/2023.10.22.563461] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered proteins (IDPs) perform a wide range of functions in biology, suggesting that the ability to design IDPs could help expand the repertoire of proteins with novel functions. Designing IDPs with specific structural or functional properties has, however, been difficult, in part because determining accurate conformational ensembles of IDPs generally requires a combination of computational modelling and experiments. Motivated by recent advancements in efficient physics-based models for simulations of IDPs, we have developed a general algorithm for designing IDPs with specific structural properties. We demonstrate the power of the algorithm by generating variants of naturally occurring IDPs that have different levels of compaction and that vary more than 100-fold in their propensity to undergo phase separation, even while keeping a fixed amino acid composition. We experimentally tested designs of variants of the low-complexity domain of hnRNPA1 and find high accuracy in our computational predictions, both in terms of single-chain compaction and propensity to undergo phase separation. We analyze the sequence features that determine changes in compaction and propensity to phase separate and find overall good agreement with previous findings for naturally occurring sequences. Our general, physics-based method enables the design of disordered sequences with specified conformational properties. Our algorithm thus expands the toolbox for protein design to include the most flexible proteins and will enable the design of proteins whose functions exploit the many properties afforded by protein disorder.
Affiliation(s)
- Francesco Pesce
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Anne Bremer
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Giulio Tesei
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jesse B. Hopkins
  - BioCAT, Department of Physics, Illinois Institute of Technology, Chicago, IL, USA
- Christy R. Grace
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Tanja Mittag
  - Department of Structural Biology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
24
|
Varadi M, Tsenkov M, Velankar S. Challenges in bridging the gap between protein structure prediction and functional interpretation. Proteins 2023. [PMID: 37850517 DOI: 10.1002/prot.26614] [Received: 06/28/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]
Abstract
The rapid evolution of protein structure prediction tools has significantly broadened access to protein structural data. Although predicted structure models have the potential to accelerate and impact fundamental and translational research significantly, it is essential to note that they are not validated and cannot be considered the ground truth. Thus, challenges persist, particularly in capturing protein dynamics, predicting multi-chain structures, interpreting protein function, and assessing model quality. Interdisciplinary collaborations are crucial to overcoming these obstacles. Databases like the AlphaFold Protein Structure Database, the ESM Metagenomic Atlas, and initiatives like the 3D-Beacons Network provide FAIR access to these data, enabling their interpretation and application across a broader scientific community. Whilst substantial advancements have been made in protein structure prediction, further progress is required to address the remaining challenges. Developing training materials, nurturing collaborations, and ensuring open data sharing will be paramount in this pursuit. The continued evolution of these tools and methodologies will deepen our understanding of protein function and accelerate disease pathogenesis and drug development discoveries.
Affiliation(s)
- Mihaly Varadi
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maxim Tsenkov
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sameer Velankar
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
|
25
|
Mészáros B, Hatos A, Palopoli N, Quaglia F, Salladini E, Van Roey K, Arthanari H, Dosztányi Z, Felli IC, Fischer PD, Hoch JC, Jeffries CM, Longhi S, Maiani E, Orchard S, Pancsa R, Papaleo E, Pierattelli R, Piovesan D, Pritisanac I, Tenorio L, Viennet T, Tompa P, Vranken W, Tosatto SCE, Davey NE. Minimum information guidelines for experiments structurally characterizing intrinsically disordered protein regions. Nat Methods 2023; 20:1291-1303. [PMID: 37400558 DOI: 10.1038/s41592-023-01915-x] [Received: 07/12/2022] [Accepted: 05/18/2023] [Indexed: 07/05/2023]
Abstract
An unambiguous description of an experiment, and the subsequent biological observation, is vital for accurate data interpretation. Minimum information guidelines define the fundamental complement of data that can support an unambiguous conclusion based on experimental observations. We present the Minimum Information About Disorder Experiments (MIADE) guidelines to define the parameters required for the wider scientific community to understand the findings of an experiment studying the structural properties of intrinsically disordered regions (IDRs). MIADE guidelines provide recommendations for data producers to describe the results of their experiments at source, for curators to annotate experimental data to community resources and for database developers maintaining community resources to disseminate the data. The MIADE guidelines will improve the interpretability of experimental results for data consumers, facilitate direct data submission, simplify data curation, improve data exchange among repositories and standardize the dissemination of the key metadata on an IDR experiment by IDR data sources.
Affiliation(s)
- Bálint Mészáros
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Department of Structural Biology and Center for Data Driven Discovery, St Jude Children's Research Hospital, Memphis, TN, USA
- András Hatos
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Swiss Cancer Center Leman, Lausanne, Switzerland
- Nicolas Palopoli
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires, Argentina
- Federica Quaglia
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Edoardo Salladini
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Kim Van Roey
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Haribabu Arthanari
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Isabella C Felli
  - Department of Chemistry 'Ugo Schiff' and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
- Patrick D Fischer
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Jeffrey C Hoch
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, USA
- Cy M Jeffries
  - European Molecular Biology Laboratory (EMBL), Hamburg Unit, c/o Deutsches Elektronen-Synchrotron, Hamburg, Germany
- Sonia Longhi
  - Laboratory Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), Marseille, France
- Emiliano Maiani
  - Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
  - UniCamillus - Saint Camillus International University of Health and Medical Sciences, Rome, Italy
- Sandra Orchard
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, UK
- Rita Pancsa
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Elena Papaleo
  - Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
- Roberta Pierattelli
  - Department of Chemistry 'Ugo Schiff' and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Iva Pritisanac
  - Hospital for Sick Children, Toronto, Ontario, Canada
  - Medical University of Graz, Graz, Austria
- Luiggi Tenorio
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Thibault Viennet
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Peter Tompa
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - VIB-VUB Center for Structural Biology, Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Wim Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Norman E Davey
  - Division Of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, Chelsea, London, UK.
|
26
|
Tsangaris TE, Smyth S, Gomes GNW, Liu ZH, Milchberg M, Bah A, Wasney GA, Forman-Kay JD, Gradinaru CC. Delineating Structural Propensities of the 4E-BP2 Protein via Integrative Modeling and Clustering. J Phys Chem B 2023; 127:7472-7486. [PMID: 37595014 PMCID: PMC10858721 DOI: 10.1021/acs.jpcb.3c04052] [Indexed: 08/20/2023]
Abstract
The intrinsically disordered 4E-BP2 protein regulates mRNA cap-dependent translation through interaction with the predominantly folded eukaryotic initiation factor 4E (eIF4E). Phosphorylation of 4E-BP2 dramatically reduces the level of eIF4E binding, in part by stabilizing a binding-incompatible folded domain. Here, we used a Rosetta-based sampling algorithm optimized for IDRs to generate initial ensembles for two phospho forms of 4E-BP2, non- and 5-fold phosphorylated (NP and 5P, respectively), with the 5P folded domain flanked by N- and C-terminal IDRs (N-IDR and C-IDR, respectively). We then applied an integrative Bayesian approach to obtain NP and 5P conformational ensembles that agree with experimental data from nuclear magnetic resonance, small-angle X-ray scattering, and single-molecule Förster resonance energy transfer (smFRET). For the NP state, inter-residue distance scaling and 2D maps revealed the role of charge segregation and pi interactions in driving contacts between distal regions of the chain (∼70 residues apart). The 5P ensemble shows prominent contacts of the N-IDR region with the two phosphosites in the folded domain, pT37 and pT46, and, to a lesser extent, delocalized interactions with the C-IDR region. Agglomerative hierarchical clustering led to partitioning of each of the two ensembles into four clusters with different global dimensions and contact maps. This helped delineate an NP cluster that, based on our smFRET data, is compatible with the eIF4E-bound state. 5P clusters were differentiated by interactions of C-IDR with the folded domain and of the N-IDR with the two phosphosites in the folded domain. Our study provides both a better visualization of fundamental structural poses of 4E-BP2 and a set of falsifiable insights on intrachain interactions that bias folding and binding of this protein.
Affiliation(s)
- Thomas E Tsangaris
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
- Spencer Smyth
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
- Gregory-Neal W Gomes
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
- Zi Hao Liu
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Moses Milchberg
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Alaji Bah
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Gregory A Wasney
  - Peter Gilgan Centre for Research and Learning, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Julie D Forman-Kay
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Claudiu C Gradinaru
  - Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
|
27
|
Maiti S, Heyden M. Model-Dependent Solvation of the K-18 Domain of the Intrinsically Disordered Protein Tau. J Phys Chem B 2023; 127:7220-7230. [PMID: 37556237 DOI: 10.1021/acs.jpcb.3c01726] [Indexed: 08/11/2023]
Abstract
A known imbalance between intra-protein and protein-water interactions in many empirical force fields results in collapsed conformational ensembles of intrinsically disordered proteins in explicit solvent simulations that disagree with experiments. Multiple strategies have been introduced in the literature to modify protein-water interactions, which improve agreement between experiments and simulations. In this work, we combine simulations with standard and modified force fields with a spatially resolved analysis of solvation free energy contributions and compare the consequences of each strategy. We find that enhanced Lennard-Jones (LJ) interactions between protein atoms and water oxygens primarily improve the solvation of nonpolar functional groups of the protein. In contrast, modified electrostatics in the water model or strengthened LJ interactions between the protein and water hydrogens mainly affect the hydration of polar functional groups. Modified electrostatics further impact the average orientation of water molecules in the hydration shell. As a result, protein-water interactions with the first hydration layers are strengthened, while interactions with water molecules in higher hydration shells are weakened. Hence, distinct strategies to balance intra-protein and protein-water interactions in simulations have qualitatively different effects on protein solvation. These differences are not necessarily captured by comparisons to experiments that report on global parameters describing protein conformational ensembles, e.g., the radius of gyration, but will influence the tendency of a protein to form aggregates or phase-separated droplets.
Affiliation(s)
- Sthitadhi Maiti
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
- Matthias Heyden
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
|
28
|
Mondal A, Lenz S, MacCallum JL, Perez A. Hybrid computational methods combining experimental information with molecular dynamics. Curr Opin Struct Biol 2023; 81:102609. [PMID: 37224642 DOI: 10.1016/j.sbi.2023.102609] [Received: 02/05/2023] [Revised: 04/12/2023] [Accepted: 04/23/2023] [Indexed: 05/26/2023]
Abstract
A goal of structural biology is to understand how macromolecules carry out their biological roles by identifying their metastable states, mechanisms of action, pathways leading to conformational changes, and the thermodynamic and kinetic relationships between those states. Integrative modeling brings structural insights into systems where traditional structure determination approaches cannot help. We focus on the synergies and challenges of integrative modeling combining experimental data with molecular dynamics simulations.
Affiliation(s)
- Arup Mondal
  - Quantum Theory Project, Department of Chemistry, University of Florida, Leigh, UK. https://twitter.com/@amondal_chem
- Stefan Lenz
  - Department of Chemistry, University of Calgary, 2500 University Drive, Canada
- Justin L MacCallum
  - Department of Chemistry, University of Calgary, 2500 University Drive, Canada. https://twitter.com/@jlmaccal
- Alberto Perez
  - Quantum Theory Project, Department of Chemistry, University of Florida, Leigh, UK.
|
29
|
Ricard-Blum S, Couchman JR. Conformations, interactions and functions of intrinsically disordered syndecans. Biochem Soc Trans 2023:BST20221085. [PMID: 37334846 DOI: 10.1042/bst20221085] [Received: 02/19/2023] [Revised: 06/03/2023] [Accepted: 06/07/2023] [Indexed: 06/21/2023]
Abstract
Syndecans are transmembrane heparan sulfate proteoglycans present on most mammalian cell surfaces. They have a long evolutionary history, a single syndecan gene being expressed in bilaterian invertebrates. Syndecans have attracted interest because of their potential roles in development and disease, including vascular diseases, inflammation and various cancers. Recent structural data are providing important insights into their functions, which are complex, involving both intrinsic signaling through cytoplasmic binding partners and co-operative mechanisms where syndecans form a signaling nexus with other receptors such as integrins and tyrosine kinase growth factor receptors. While the cytoplasmic domain of syndecan-4 has a well-defined dimeric structure, the syndecan ectodomains are intrinsically disordered, which is linked to a capacity to interact with multiple partners. However, the impact of glycanation and partner proteins on syndecan core protein conformations remains to be fully established. Genetic models indicate that a conserved property of syndecans links the cytoskeleton to calcium channels of the transient receptor potential class, compatible with roles as mechanosensors. In turn, syndecans influence actin cytoskeleton organization to impact motility, adhesion and the extracellular matrix environment. Syndecan clustering with other cell surface receptors into signaling microdomains has relevance to tissue differentiation in development, for example in stem cells, but also in disease where syndecan expression can be markedly up-regulated. Since syndecans have potential as diagnostic and prognostic markers as well as possible targets in some forms of cancer, it remains important to unravel structure/function relationships in the four mammalian syndecans.
Affiliation(s)
- Sylvie Ricard-Blum
  - ICBMS, UMR 5246 CNRS, Universite Claude Bernard Lyon 1, F-69622 Villeurbanne, France
- John R Couchman
  - Biotech Research & Innovation Center, University of Copenhagen, 2200 Copenhagen, Denmark
|
30
|
Paladino A, Vitagliano L, Graziano G. The Action of Chemical Denaturants: From Globular to Intrinsically Disordered Proteins. Biology (Basel) 2023; 12:754. [PMID: 37237566 DOI: 10.3390/biology12050754] [Received: 03/15/2023] [Revised: 05/15/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023]
Abstract
Proteins perform their many functions by adopting either a minimal number of strictly similar conformations, the native state, or a vast ensemble of highly flexible conformations. In both cases, their structural features are highly influenced by the chemical environment. Even though a plethora of experimental studies have demonstrated the impact of chemical denaturants on protein structure, the molecular mechanism underlying their action is still debated. In the present review, after a brief recapitulation of the main experimental data on protein denaturants, we survey both classical and more recent interpretations of the molecular basis of their action. In particular, we highlight the differences and similarities of the impact that denaturants have on different structural classes of proteins, i.e., globular, intrinsically disordered (IDP), and amyloid-like assemblies. Particular attention has been given to IDPs, as recent studies are unraveling their fundamental importance in many physiological processes. The role that computational techniques are expected to play in the near future is illustrated.
Affiliation(s)
- Antonella Paladino
  - Institute of Biostructures and Bioimaging, CNR, Via Pietro Castellino 111, 80131 Naples, Italy
- Luigi Vitagliano
  - Institute of Biostructures and Bioimaging, CNR, Via Pietro Castellino 111, 80131 Naples, Italy
- Giuseppe Graziano
  - Department of Science and Technology, University of Sannio, via Francesco de Sanctis snc, 82100 Benevento, Italy
|
31
|
Aina A, Hsueh SCC, Plotkin SS. PROTHON: A Local Order Parameter-Based Method for Efficient Comparison of Protein Ensembles. J Chem Inf Model 2023. [PMID: 37178169 DOI: 10.1021/acs.jcim.3c00145] [Indexed: 05/15/2023]
Abstract
The comparison of protein conformational ensembles is of central importance in structural biology. However, there are few computational methods for ensemble comparison, and those that are readily available, such as ENCORE, utilize methods that are sufficiently computationally expensive to be prohibitive for large ensembles. Here, a new method is presented for efficient representation and comparison of protein conformational ensembles. The method is based on the representation of a protein ensemble as a vector of probability distribution functions (pdfs), with each pdf representing the distribution of a local structural property such as the number of contacts between Cβ atoms. Dissimilarity between two conformational ensembles is quantified by the Jensen-Shannon distance between the corresponding set of probability distribution functions. The method is validated for conformational ensembles generated by molecular dynamics simulations of ubiquitin, as well as experimentally derived conformational ensembles of a 130 amino acid truncated form of human tau protein. In the ubiquitin ensemble data set, the method was up to 88 times faster than the existing ENCORE software, while simultaneously utilizing 48 times fewer computing cores. We make the method available as a Python package, called PROTHON, and provide a GitHub page with the Python source code at https://github.com/PlotkinLab/Prothon.
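The comparison described in this abstract can be illustrated with a minimal sketch. It is not the PROTHON implementation: the function names, the equal-width binning, the value range and the use of a mean over residues to combine per-residue distances are all assumptions made here for illustration. Each ensemble is given as a list of conformations, each conformation holding one local property value per residue (for example a Cβ contact count).

```python
import math

def histogram(values, bins, lo, hi):
    """Normalised equal-width histogram of values over [lo, hi] (hypothetical helper)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base-2 logs), in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))

def ensemble_dissimilarity(ens_a, ens_b, bins=10, lo=0.0, hi=20.0):
    """Mean per-residue JS distance between two ensembles.

    Averaging over residues is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    n_res = len(ens_a[0])
    distances = []
    for r in range(n_res):
        pa = histogram([conf[r] for conf in ens_a], bins, lo, hi)
        pb = histogram([conf[r] for conf in ens_b], bins, lo, hi)
        distances.append(js_distance(pa, pb))
    return sum(distances) / n_res
```

With base-2 logarithms the distance is 0 for identical distributions and 1 for distributions with no overlap, so the averaged score also lies between 0 and 1.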
Affiliation(s)
- Adekunle Aina
  - Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Shawn C C Hsueh
  - Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Steven S Plotkin
  - Department of Physics and Astronomy, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
  - Genome Science and Technology Program, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
|
32
|
Mészáros B, Park E, Malinverni D, Sejdiu BI, Immadisetty K, Sandhu M, Lang B, Babu MM. Recent breakthroughs in computational structural biology harnessing the power of sequences and structures. Curr Opin Struct Biol 2023; 80:102608. [PMID: 37182396 DOI: 10.1016/j.sbi.2023.102608] [Received: 02/05/2023] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 05/16/2023]
Abstract
Recent advances in computational approaches and their integration into structural biology enable tackling increasingly complex questions. Here, we discuss several key areas, highlighting breakthroughs and remaining challenges. Theoretical modeling has provided tools to accurately predict and design protein structures on a scale currently difficult to achieve using experimental approaches. Molecular Dynamics simulations have become faster and more precise, delivering actionable information inaccessible by current experimental methods. Virtual screening workflows allow a high-throughput approach to discover ligands that bind and modulate protein function, while Machine Learning methods enable the design of proteins with new functionalities. Integrative structural biology combines several of these approaches, pushing the frontiers of structural and functional characterization to ever larger systems, advancing towards a complete understanding of the living cell. These breakthroughs will accelerate and significantly impact diverse areas of science.
Affiliation(s)
- Bálint Mészáros
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
- Electa Park
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
- Duccio Malinverni
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/DucMalinverni
- Besian I Sejdiu
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/bisejdiu
- Kalyan Immadisetty
  - Department of Bone Marrow Transplantation & Cellular Therapy, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/k_immadisetty
- Manbir Sandhu
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/M5andhu
- Benjamin Lang
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA. https://twitter.com/langbnj
- M Madan Babu
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
|
33
|
Liu J, Liu J, Deng L, Liu H, Liu H, Zhao W, Zhao Y, Sun X, Fan S, Wang H, Hua W. An intrinsically disordered region-containing protein mitigates the drought-growth trade-off to boost yields. Plant Physiol 2023; 192:274-292. [PMID: 36746783 PMCID: PMC10152686 DOI: 10.1093/plphys/kiad074] [Received: 09/25/2022] [Revised: 12/16/2022] [Accepted: 01/16/2023] [Indexed: 05/03/2023]
Abstract
Drought stress poses a serious threat to global agricultural productivity and food security. Plant resistance to drought is typically accompanied by a growth deficit and yield penalty. Herein, we report a previously uncharacterized, dicotyledon-specific gene, Stress and Growth Interconnector (SGI), that promotes growth during drought in the oil crop rapeseed (Brassica napus) and the model plant Arabidopsis (Arabidopsis thaliana). Overexpression of SGI conferred enhanced biomass and yield under water-deficient conditions, whereas corresponding CRISPR SGI mutants exhibited the opposite effects. These attributes were achieved by mediating reactive oxygen species (ROS) homeostasis while maintaining photosynthetic efficiency to increase plant fitness under water-limiting environments. Further spatial-temporal transcriptome profiling revealed dynamic reprogramming of pathways for photosynthesis and stress responses during drought and the subsequent recovery. Mechanistically, SGI represents an intrinsically disordered region-containing protein that interacts with itself, catalase isoforms, dehydrins, and other drought-responsive positive factors, restraining ROS generation. These multifaceted interactions stabilize catalases in response to drought and facilitate their ROS-scavenging activities. Taken together, these findings provide insights into currently underexplored mechanisms to circumvent trade-offs between plant growth and stress tolerance that will inform strategies to breed climate-resilient, higher yielding crops for sustainable agriculture.
Collapse
Affiliation(s)
- Jun Liu
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Jing Liu
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
- Hubei Hongshan Laboratory, Wuhan 430070, China
| | - Linbin Deng
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Hongmei Liu
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Hongfang Liu
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Wei Zhao
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Yuwei Zhao
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Xingchao Sun
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Shihang Fan
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Hanzhong Wang
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
| | - Wei Hua
- Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Wuhan 430062, China
- Hubei Hongshan Laboratory, Wuhan 430070, China
|
34
|
Panigrahi R, Krishnan R, Singh JS, Padinhateeri R, Kumar A. SUMO1 hinders α-Synuclein fibrillation by inducing structural compaction. Protein Sci 2023; 32:e4632. [PMID: 36974517 PMCID: PMC10108436 DOI: 10.1002/pro.4632] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/30/2022] [Revised: 03/17/2023] [Accepted: 03/23/2023] [Indexed: 03/29/2023]
Abstract
Small Ubiquitin-like Modifier 1 (SUMO1) is an essential protein for many cellular functions, including regulation, signaling, etc., achieved by a process known as SUMOylation, which involves covalent attachment of SUMO1 to target proteins. SUMO1 also regulates the function of several proteins via non-covalent interactions involving the hydrophobic patch in the target protein identified as SUMO Binding or Interacting Motif (SBM/SIM). Here, we demonstrate a crucial functional potential of SUMO1 mediated by its non-covalent interactions with α-Synuclein, a protein responsible for many neurodegenerative diseases called α-Synucleinopathies. SUMO1 hinders the fibrillation of α-Synuclein, an intrinsically disordered protein (IDP) that undergoes a transition to β-structures during the fibrillation process. Using a plethora of biophysical techniques, we show that SUMO1 transiently binds to the N-terminus region of α-Synuclein non-covalently and causes structural compaction, which hinders the self-association process and thereby delays the fibrillation process. On the one hand, this study demonstrates an essential functional role of SUMO1 protein concerning neurodegeneration; it also illustrates the commonly stated mechanism that IDPs carry out multiple functions by structural adaptation to suit specific target proteins, on the other. Residue-level details about the SUMO1-α-Synuclein interaction obtained here also serve as a reliable approach for investigating the detailed mechanisms of IDP functions.
Affiliation(s)
- Rajlaxmi Panigrahi
- Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Bombay, Mumbai, Maharashtra, India
| | - Rakesh Krishnan
- Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Bombay, Mumbai, Maharashtra, India
| | - Jai Shankar Singh
- Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Bombay, Mumbai, Maharashtra, India
| | - Ranjith Padinhateeri
- Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Bombay, Mumbai, Maharashtra, India
| | - Ashutosh Kumar
- Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Bombay, Mumbai, Maharashtra, India
|
35
|
Wohl S, Zheng W. Interpreting Transient Interactions of Intrinsically Disordered Proteins. J Phys Chem B 2023; 127:2395-2406. [PMID: 36917561 PMCID: PMC10038935 DOI: 10.1021/acs.jpcb.3c00096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
The flexible nature of intrinsically disordered proteins (IDPs) gives rise to a conformational ensemble with a diverse set of conformations. The simplest way to describe this ensemble is through a homopolymer model without any specific interactions. However, there has been growing evidence that the conformational properties of IDPs and their relevant functions can be affected by transient interactions between specific and even nonlocal pairs of amino acids. Interpreting these interactions from experimental methods, each of which is most sensitive to a different distance regime referred to as probing length, remains a challenging and unsolved problem. Here, we first show that transient interactions can be realized between short fragments of charged amino acids by generating conformational ensembles using model disordered peptides and coarse-grained simulations. Using these ensembles, we investigate how sensitive different types of experimental measurements are to the presence of transient interactions. We find methods with shorter probing lengths to be more appropriate for detecting these transient interactions, but one experimental method is not sufficient due to the existence of other weak interactions typically seen in IDPs. Finally, we develop an adjusted polymer model with an additional short-distance peak which can robustly reproduce the distance distribution function from two experimental measurements with complementary short and long probing lengths. This new model can suggest whether a homopolymer model is insufficient for describing a specific IDP and meets the challenge of quantitatively identifying specific, transient interactions from a background of nonspecific, weak interactions.
Affiliation(s)
- Samuel Wohl
- Department of Physics, Arizona State University, Tempe, Arizona 85287, United States
| | - Wenwei Zheng
- College of Integrative Sciences and Arts, Arizona State University, Mesa, Arizona 85212, United States
|
36
|
Ding Y, Yu K, Huang J. Data science techniques in biomolecular force field development. Curr Opin Struct Biol 2023; 78:102502. [PMID: 36462448 DOI: 10.1016/j.sbi.2022.102502] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/03/2022] [Revised: 10/18/2022] [Accepted: 10/25/2022] [Indexed: 12/03/2022]
Abstract
Recent advances in data science are impacting the development of classical force fields. Here we review some ideas and techniques from data science that have been used in force field development, including database construction, atom typing, and machine learning potentials. We highlight how new tools such as active learning and automatic differentiation are facilitating the generation of target data and the direct fitting with macroscopic observables. Philosophical changes on how force field models should be built and used are also discussed. It's inspiring that more accurate biomolecular force fields can be developed with the aid of data science techniques.
Affiliation(s)
- Ye Ding
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China
| | - Kuang Yu
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong, 518055, China
| | - Jing Huang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China.
|
37
|
Duran-Frigola M, Cigler M, Winter GE. Advancing Targeted Protein Degradation via Multiomics Profiling and Artificial Intelligence. J Am Chem Soc 2023; 145:2711-2732. [PMID: 36706315 PMCID: PMC9912273 DOI: 10.1021/jacs.2c11098] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/28/2023]
Abstract
Only around 20% of the human proteome is considered to be druggable with small-molecule antagonists. This leaves some of the most compelling therapeutic targets outside the reach of ligand discovery. The concept of targeted protein degradation (TPD) promises to overcome some of these limitations. In brief, TPD is dependent on small molecules that induce the proximity between a protein of interest (POI) and an E3 ubiquitin ligase, causing ubiquitination and degradation of the POI. In this perspective, we want to reflect on current challenges in the field, and discuss how advances in multiomics profiling, artificial intelligence, and machine learning (AI/ML) will be vital in overcoming them. The presented roadmap is discussed in the context of small-molecule degraders but is equally applicable for other emerging proximity-inducing modalities.
Affiliation(s)
- Miquel Duran-Frigola
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
- Ersilia Open Source Initiative, 28 Belgrave Road, CB1 3DE, Cambridge, United Kingdom
| | - Marko Cigler
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
| | - Georg E. Winter
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
|
38
|
Sun B, Kekenes-Huskey PM. Myofilament-associated proteins with intrinsic disorder (MAPIDs) and their resolution by computational modeling. Q Rev Biophys 2023; 56:e2. [PMID: 36628457 PMCID: PMC11070111 DOI: 10.1017/s003358352300001x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/12/2023]
Abstract
The cardiac sarcomere is a cellular structure in the heart that enables muscle cells to contract. Dozens of proteins belong to the cardiac sarcomere, which work in tandem to generate force and adapt to demands on cardiac output. Intriguingly, the majority of these proteins have significant intrinsic disorder that contributes to their functions, yet the biophysics of these intrinsically disordered regions (IDRs) have been characterized in limited detail. In this review, we first enumerate these myofilament-associated proteins with intrinsic disorder (MAPIDs) and recent biophysical studies to characterize their IDRs. We secondly summarize the biophysics governing IDR properties and the state-of-the-art in computational tools toward MAPID identification and characterization of their conformation ensembles. We conclude with an overview of future computational approaches toward broadening the understanding of intrinsic disorder in the cardiac sarcomere.
Affiliation(s)
- Bin Sun
- Research Center for Pharmacoinformatics (The State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Department of Medicinal Chemistry and Natural Medicine Chemistry, College of Pharmacy, Harbin Medical University, Harbin 150081, China
|
39
|
Functional benefit of structural disorder for the replication of measles, Nipah and Hendra viruses. Essays Biochem 2022; 66:915-934. [PMID: 36148633 DOI: 10.1042/ebc20220045] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/18/2022] [Accepted: 08/25/2022] [Indexed: 12/24/2022]
Abstract
Measles, Nipah and Hendra viruses are severe human pathogens within the Paramyxoviridae family. Their non-segmented, single-stranded, negative-sense RNA genome is encapsidated by the nucleoprotein (N) within a helical nucleocapsid that is the substrate used by the viral RNA-dependent-RNA-polymerase (RpRd) for transcription and replication. The RpRd is a complex made of the large protein (L) and of the phosphoprotein (P), the latter serving as an obligate polymerase cofactor and as a chaperon for N. Both the N and P proteins are enriched in intrinsically disordered regions (IDRs), i.e. regions devoid of stable secondary and tertiary structure. N possesses a C-terminal IDR (NTAIL), while P consists of a large, intrinsically disordered N-terminal domain (NTD) and a C-terminal domain (CTD) encompassing alternating disordered and ordered regions. The V and W proteins, two non-structural proteins that are encoded by the P gene via a mechanism of co-transcriptional edition of the P mRNA, are prevalently disordered too, sharing with P the disordered NTD. They are key players in the evasion of the host antiviral response and were shown to phase separate and to form amyloid-like fibrils in vitro. In this review, we summarize the available information on IDRs within the N, P, V and W proteins from these three model paramyxoviruses and describe their molecular partnership. We discuss the functional benefit of disorder to virus replication in light of the critical role of IDRs in affording promiscuity, multifunctionality, fine regulation of interaction strength, scaffolding functions and in promoting liquid-liquid phase separation and fibrillation.
|
40
|
Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:6854872. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
- Correspondence address. Mihaly Varadi, PDBe team, Wellcome Trust Genome Campus, Saffron Walden CB10 1SA, UK.
| | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Clemente Borges
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Andras Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Swiss Cancer Center Leman, Lausanne 1005, Switzerland
| | - Tamas Hegedus
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Robbie Joosten
- Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
| | - Dmitry Molodenskiy
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Steven L Salzberg
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Markus J Sommer
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Martin Steinegger
- School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
| | - Erzsebet Suhajda
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Dmitri Svergun
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Luiggi Tenorio-Ku
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Silvio Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Andrew Mark Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Christine Orengo
- Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
|
41
|
Piovesan D, Del Conte A, Clementel D, Monzon A, Bevilacqua M, Aspromonte M, Iserte J, Orti FE, Marino-Buslje C, Tosatto SE. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res 2022; 51:D438-D444. [PMID: 36416266 PMCID: PMC9825420 DOI: 10.1093/nar/gkac1065] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Received: 09/22/2022] [Revised: 10/11/2022] [Accepted: 10/25/2022] [Indexed: 11/24/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) is a knowledge base of intrinsically disordered proteins. MobiDB aggregates disorder annotations derived from the literature and from experimental evidence along with predictions for all known protein sequences. MobiDB generates new knowledge and captures the functional significance of disordered regions by processing and combining complementary sources of information. Since its first release 10 years ago, the MobiDB database has evolved in order to improve the quality and coverage of protein disorder annotations and its accessibility. MobiDB has now reached its maturity in terms of data standardization and visualization. Here, we present a new release which focuses on the optimization of user experience and database content. The major advances compared to the previous version are the integration of AlphaFoldDB predictions and the re-implementation of the homology transfer pipeline, which expands manually curated annotations by two orders of magnitude. Finally, the entry page has been restyled in order to provide an overview of the available annotations along with two separate views that highlight structural disorder evidence and functions associated with different binding modes.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Clementel
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
| | - Fernando E Orti
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
|
42
|
Zheng W, Du Z, Ko SB, Wickramasinghe NP, Yang S. Incorporation of D2O-Induced Fluorine Chemical Shift Perturbations into Ensemble-Structure Characterization of the ERalpha Disordered Region. J Phys Chem B 2022; 126:9176-9186. [PMID: 36331868 PMCID: PMC10066504 DOI: 10.1021/acs.jpcb.2c05456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Structural characterization of intrinsically disordered proteins (IDPs) requires a concerted effort between experiments and computations by accounting for their conformational heterogeneity. Given the diversity of experimental tools providing local and global structural information, constructing an experimental restraint-satisfying structural ensemble remains challenging. Here, we use the disordered N-terminal domain (NTD) of the estrogen receptor alpha (ERalpha) as a model system to combine existing small-angle X-ray scattering (SAXS) and hydroxyl radical protein footprinting (HRPF) data and newly acquired solvent accessibility data via D2O-induced fluorine chemical shifting (DFCS) measurements. A new set of DFCS data for the solvent exposure of a set of 12 amino acid positions were added to complement previously acquired HRPF measurements for the solvent exposure of the other 16 nonoverlapping amino acids, thereby improving the NTD ensemble characterization considerably. We also found that while choosing an initial ensemble of structures generated from a different atomic-level force field or sampling/modeling method can lead to distinct contact maps even when the same sets of experimental measurements were used for ensemble-fitting, comparative analyses from these initial ensembles reveal commonly recurring structural features in their ensemble-averaged contact map. Specifically, nonlocal or long-range transient interactions were found consistently between the N-terminal segments and the central region, sufficient to mediate the conformational ensemble and regulate how the NTD interacts with its coactivator proteins.
Affiliation(s)
- Wenwei Zheng
- College of Integrative Sciences and Arts, Arizona State University, Mesa, Arizona 85212, United States
| | - Zhanwen Du
- Center for Proteomics and Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, United States
| | - Soo Bin Ko
- Center for Proteomics and Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, United States
| | - Sichun Yang
- Center for Proteomics and Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, United States
|
43
|
Förster D, Idier J, Liberti L, Mucherino A, Lin JH, Malliavin TE. Low-resolution description of the conformational space for intrinsically disordered proteins. Sci Rep 2022; 12:19057. [PMID: 36352011 PMCID: PMC9646904 DOI: 10.1038/s41598-022-21648-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 09/29/2022] [Indexed: 11/11/2022] Open
Abstract
Intrinsically disordered proteins (IDP) are at the center of numerous biological processes, and attract consequently extreme interest in structural biology. Numerous approaches have been developed for generating sets of IDP conformations verifying a given set of experimental measurements. We propose here to perform a systematic enumeration of protein conformations, carried out using the TAiBP approach based on distance geometry. This enumeration was performed on two proteins, Sic1 and pSic1, corresponding to unphosphorylated and phosphorylated states of an IDP. The relative populations of the obtained conformations were then obtained by fitting SAXS curves as well as Ramachandran probability maps, the original finite mixture approach RamaMix being developed for this second task. The similarity between profiles of local gyration radii provides to a certain extent a converged view of the Sic1 and pSic1 conformational space. Profiles and populations are thus proposed for describing IDP conformations. Different variations of the resulting gyration radius between phosphorylated and unphosphorylated states are observed, depending on the set of enumerated conformations as well as on the methods used for obtaining the populations.
Affiliation(s)
- Daniel Förster
- UMR7374 Interfaces, Confinement, Matériaux et Nanostructures, Université d’Orléans, Orléans, France
| | - Jérôme Idier
- UMR6004 Laboratoire des Sciences du Numérique de Nantes, Nantes, France
| | - Leo Liberti
- LIX UMR 7161 CNRS École Polytechnique, Institut Polytechnique de Paris, 91128 Palaiseau, France
| | - Antonio Mucherino
- IRISA, University of Rennes 1, Rennes, France
| | - Jung-Hsin Lin
- Biomedical Translation Research Center, Academia Sinica, Taipei, Taiwan
| | - Thérèse E. Malliavin
- Institut Pasteur, Université Paris Cité, CNRS UMR3528, Unité de Bioinformatique Structurale, F-75015 Paris, France
- Université de Lorraine, CNRS UMR7019, LPCT, F-54000 Nancy, France
|
44
|
Zhu J, Salvatella X, Robustelli P. Small molecules targeting the disordered transactivation domain of the androgen receptor induce the formation of collapsed helical states. Nat Commun 2022; 13:6390. [PMID: 36302916 PMCID: PMC9613762 DOI: 10.1038/s41467-022-34077-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/07/2022] [Accepted: 10/13/2022] [Indexed: 12/25/2022] Open
Abstract
Intrinsically disordered proteins, which do not adopt well-defined structures under physiological conditions, are implicated in many human diseases. Small molecules that target the disordered transactivation domain of the androgen receptor have entered human trials for the treatment of castration-resistant prostate cancer (CRPC), but no structural or mechanistic rationale exists to explain their inhibition mechanisms or relative potencies. Here, we utilize all-atom molecular dynamics computer simulations to elucidate atomically detailed binding mechanisms of the compounds EPI-002 and EPI-7170 to the androgen receptor. Our simulations reveal that both compounds bind at the interface of two transiently helical regions and induce the formation of partially folded collapsed helical states. We find that EPI-7170 binds androgen receptor more tightly than EPI-002 and we identify a network of intermolecular interactions that drives higher affinity binding. Our results suggest strategies for developing more potent androgen receptor inhibitors and general strategies for disordered protein drug design.
Affiliation(s)
- Jiaqi Zhu
- Dartmouth College, Department of Chemistry, Hanover, NH 03755 USA
| | - Xavier Salvatella
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain
- ICREA, Passeig Lluís Companys 23, 0810 Barcelona, Spain
| | - Paul Robustelli
- Dartmouth College, Department of Chemistry, Hanover, NH 03755 USA
|
45
|
Meszaros A, Ahmed J, Russo G, Tompa P, Lazar T. The evolution and polymorphism of mono-amino acid repeats in androgen receptor and their regulatory role in health and disease. Front Med (Lausanne) 2022; 9:1019803. [PMID: 36388907 PMCID: PMC9642029 DOI: 10.3389/fmed.2022.1019803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 09/30/2022] [Indexed: 12/24/2022] Open
Abstract
Androgen receptor (AR) is a key member of nuclear hormone receptors with the longest intrinsically disordered N-terminal domain (NTD) in its protein family. There are four mono-amino acid repeats (polyQ1, polyQ2, polyG, and polyP) located within its NTD, of which two are polymorphic (polyQ1 and polyG). The length of both polymorphic repeats shows clinically important correlations with disease, especially with cancer and neurodegenerative diseases, as shorter and longer alleles exhibit significant differences in expression, activity and solubility. Importantly, AR has also been shown to undergo condensation in the nucleus by liquid-liquid phase separation, a process highly sensitive to protein solubility and concentration. Nonetheless, in prostate cancer cells, AR variants also partition into transcriptional condensates, which have been shown to alter the expression of target gene products. In this review, we summarize current knowledge on the link between AR repeat polymorphisms and cancer types, including mechanistic explanations and models comprising the relationship between condensate formation, polyQ1 length and transcriptional activity. Moreover, we outline the evolutionary paths of these recently evolved amino acid repeats across mammalian species, and discuss new research directions with potential breakthroughs and controversies in the literature.
Affiliation(s)
- Attila Meszaros
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Junaid Ahmed
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Giorgio Russo
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Peter Tompa
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- Research Centre for Natural Sciences (RCNS), ELKH, Budapest, Hungary
- *Correspondence: Peter Tompa
| | - Tamas Lazar
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), Brussels, Belgium
- Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
- *Correspondence: Tamas Lazar
|
46
|
Gomes GNW, Namini A, Gradinaru CC. Integrative Conformational Ensembles of Sic1 Using Different Initial Pools and Optimization Methods. Front Mol Biosci 2022; 9:910956. [PMID: 35923464 PMCID: PMC9342850 DOI: 10.3389/fmolb.2022.910956] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/01/2022] [Accepted: 06/21/2022] [Indexed: 01/02/2023] Open
Abstract
Intrinsically disordered proteins play key roles in regulatory protein interactions, but their detailed structural characterization remains challenging. Here we calculate and compare conformational ensembles for the disordered protein Sic1 from yeast, starting from initial ensembles that were generated either by statistical sampling of the conformational landscape, or by molecular dynamics simulations. Two popular, yet contrasting optimization methods were used, ENSEMBLE and Bayesian Maximum Entropy, to achieve agreement with experimental data from nuclear magnetic resonance, small-angle X-ray scattering and single-molecule Förster resonance energy transfer. The comparative analysis of the optimized ensembles, including secondary structure propensity, inter-residue contact maps, and the distributions of hydrogen bond and pi interactions, revealed the importance of the physics-based generation of initial ensembles. The analysis also provides insights into designing new experiments that report on the least restrained features among the optimized ensembles. Overall, differences between ensembles optimized from different priors were greater than when using the same prior with different optimization methods. Generating increasingly accurate, reliable and experimentally validated ensembles for disordered proteins is an important step towards a mechanistic understanding of their biological function and involvement in various diseases.
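The idea of reweighting an initial ensemble to agree with experiment can be sketched in a few lines. The example below is a simplified, noise-free maximum-entropy illustration, not the ENSEMBLE or Bayesian Maximum Entropy software: it assumes a single ensemble-averaged observable, a uniform prior, and hypothetical per-conformer values. The function name `maxent_reweight` and all numbers are invented for illustration.

```python
import math

def maxent_reweight(values, target, prior=None, lo=-50.0, hi=50.0):
    """Return weights w_i proportional to prior_i * exp(-lam * values_i)
    whose weighted average of `values` matches `target`.

    Simplified sketch: one observable, no experimental error term.
    """
    n = len(values)
    if prior is None:
        prior = [1.0 / n] * n  # assumption: uniform initial ensemble
    if not (min(values) < target < max(values)):
        raise ValueError("target must lie strictly inside the range of values")

    def weights(lam):
        exponents = [-lam * v for v in values]
        shift = max(exponents)  # subtract the maximum for numerical stability
        raw = [p * math.exp(e - shift) for p, e in zip(prior, exponents)]
        total = sum(raw)
        return [r / total for r in raw]

    def average(lam):
        return sum(w * v for w, v in zip(weights(lam), values))

    # The weighted average decreases monotonically with lam, so bisection works.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
    return weights(0.5 * (lo + hi))

# Hypothetical per-conformer observable (e.g. a size measure in nm).
per_conformer = [2.0, 3.0, 4.0]
new_weights = maxent_reweight(per_conformer, target=3.5)
print(new_weights)
```

With a uniform prior the unweighted average of the toy values is 3.0, so matching a target of 3.5 shifts weight toward the conformer with the largest value while staying as close to the prior as the constraint allows. This is also why the choice of initial pool matters: conformations absent from the prior cannot gain weight.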
Affiliation(s)
- Gregory-Neal W. Gomes
  - Department of Physics, University of Toronto, Toronto, ON, Canada
  - *Correspondence: Gregory-Neal W. Gomes; Claudiu C. Gradinaru
- Ashley Namini
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, ON, Canada
- Claudiu C. Gradinaru
  - Department of Physics, University of Toronto, Toronto, ON, Canada
  - Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, ON, Canada
  - *Correspondence: Gregory-Neal W. Gomes; Claudiu C. Gradinaru

47

Wang X, Chong B, Sun Z, Ruan H, Yang Y, Song P, Liu Z. More is simpler: Decomposition of ligand-binding affinity for proteins being disordered. Protein Sci 2022; 31:e4375. [PMID: 35762723 PMCID: PMC9214758 DOI: 10.1002/pro.4375] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/18/2022] [Revised: 06/03/2022] [Accepted: 06/06/2022] [Indexed: 11/08/2022]
Abstract
In statistical mechanics, it is well known that the huge number of degrees of freedom does not complicate the problem as it seems, but actually greatly simplifies the analysis (e.g., to give a Boltzmann distribution). Here, we reveal that the ensemble averaging from the vast conformations of intrinsically disordered proteins (IDPs) greatly simplifies the nature of binding affinity, which can be reliably decomposed into a sum of the ligandability of IDP and the capacity of ligand. Such an unexpected regularity is applied to facilitate the virtual screening upon IDPs. It also provides essential insight in understanding the specificity difference between IDPs and conventional ordered proteins since the specificity is caused by deviation from the baseline behavior of protein-ligand binding.
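The additive decomposition described in this abstract can be illustrated with a small sketch. It assumes a hypothetical, fully populated affinity matrix (rows are IDPs, columns are ligands) and fits the model affinity[i][j] ≈ baseline + row_effect[i] + column_effect[j] by least squares, which for a complete matrix reduces to row and column means. The function names and the toy values are invented for illustration and are not data or code from the paper.

```python
def additive_fit(matrix):
    """Least-squares additive fit of a complete matrix.

    Returns (baseline, row_effects, col_effects) such that
    matrix[i][j] is approximated by baseline + row_effects[i] + col_effects[j].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    baseline = sum(sum(row) for row in matrix) / (n_rows * n_cols)
    row_effects = [sum(row) / n_cols - baseline for row in matrix]
    col_effects = [
        sum(matrix[i][j] for i in range(n_rows)) / n_rows - baseline
        for j in range(n_cols)
    ]
    return baseline, row_effects, col_effects

def residuals(matrix, baseline, row_effects, col_effects):
    """Deviation of each entry from the additive baseline."""
    return [
        [matrix[i][j] - (baseline + row_effects[i] + col_effects[j])
         for j in range(len(matrix[0]))]
        for i in range(len(matrix))
    ]

# Hypothetical affinities (arbitrary units): 2 IDPs x 3 ligands, exactly additive.
toy = [[-5.0, -6.0, -7.0],
       [-4.0, -5.0, -6.0]]
baseline, idp_term, ligand_term = additive_fit(toy)
print(baseline, idp_term, ligand_term)
print(residuals(toy, baseline, idp_term, ligand_term))
```

In this picture the row term plays the role of the "ligandability" of each IDP and the column term the "capacity" of each ligand. Nonzero residuals would mark pairs that deviate from the baseline, which is how the abstract frames specificity.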
Affiliation(s)
- Xiaohui Wang
  - College of Chemistry and Molecular Engineering, and Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, China
- Bin Chong
  - School of Economics and Management, Tsinghua University, Beijing, China
- Zhaoxi Sun
  - College of Chemistry and Molecular Engineering, and Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, China
- Hao Ruan
  - College of Chemistry and Molecular Engineering, and Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, China
- Yingguang Yang
  - School of Cyberscience, University of Science and Technology of China, Hefei, China
- Pengbo Song
  - College of Chemistry and Molecular Engineering, and Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, China
- Zhirong Liu
  - College of Chemistry and Molecular Engineering, and Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, China

48

Compositional Bias of Intrinsically Disordered Proteins and Regions and Their Predictions. Biomolecules 2022; 12:biom12070888. [PMID: 35883444 PMCID: PMC9313023 DOI: 10.3390/biom12070888] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/27/2022] [Revised: 06/10/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022] Open
Abstract
Intrinsically disordered regions (IDRs) carry out many cellular functions and vary in length and placement in protein sequences. This diversity leads to variations in the underlying compositional biases, which were demonstrated for the short vs. long IDRs. We analyze compositional biases across four classes of disorder: fully disordered proteins; short IDRs; long IDRs; and binding IDRs. We identify three distinct biases: for the fully disordered proteins, the short IDRs and the long and binding IDRs combined. We also investigate compositional bias for putative disorder produced by leading disorder predictors and find that it is similar to the bias of the native disorder. Interestingly, the accuracy of disorder predictions across different methods is correlated with the correctness of the compositional bias of their predictions highlighting the importance of the compositional bias. The predictive quality is relatively low for the disorder classes with compositional bias that is the most different from the “generic” disorder bias, while being much higher for the classes with the most similar bias. We discover that different predictors perform best across different classes of disorder. This suggests that no single predictor is universally best and motivates the development of new architectures that combine models that target specific disorder classes.

49

Guo HB, Perminov A, Bekele S, Kedziora G, Farajollahi S, Varaljay V, Hinkle K, Molinero V, Meister K, Hung C, Dennis P, Kelley-Loughnane N, Berry R. AlphaFold2 models indicate that protein sequence determines both structure and dynamics. Sci Rep 2022; 12:10696. [PMID: 35739160 PMCID: PMC9226352 DOI: 10.1038/s41598-022-14382-9] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 02/28/2022] [Accepted: 06/06/2022] [Indexed: 12/29/2022] Open
Abstract
AlphaFold 2 (AF2) has placed Molecular Biology in a new era where we can visualize, analyze and interpret the structures and functions of all proteins solely from their primary sequences. We performed AF2 structure predictions for various protein systems, including globular proteins, a multi-domain protein, an intrinsically disordered protein (IDP), a randomized protein, two larger proteins (> 1000 AA), a heterodimer and a homodimer protein complex. Our results show that along with the three dimensional (3D) structures, AF2 also decodes protein sequences into residue flexibilities via both the predicted local distance difference test (pLDDT) scores of the models, and the predicted aligned error (PAE) maps. We show that PAE maps from AF2 are correlated with the distance variation (DV) matrices from molecular dynamics (MD) simulations, which reveals that the PAE maps can predict the dynamical nature of protein residues. Here, we introduce the AF2-scores, which are simply derived from pLDDT scores and are in the range of [0, 1]. We found that for most protein models, including large proteins and protein complexes, the AF2-scores are highly correlated with the root mean square fluctuations (RMSF) calculated from MD simulations. However, for an IDP and a randomized protein, the AF2-scores do not correlate with the RMSF from MD, especially for the IDP. Our results indicate that the protein structures predicted by AF2 also convey information of the residue flexibility, i.e., protein dynamics.
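The comparison between a pLDDT-derived score and MD fluctuations can be sketched as follows. The abstract does not give the formula for the AF2-scores, so the rescaling used here (1 - pLDDT/100, which maps pLDDT onto [0, 1] with higher values meaning lower confidence) is an assumption made purely for illustration and should not be read as the authors' definition. The per-residue pLDDT and RMSF values are hypothetical.

```python
import math

def flexibility_score(plddt):
    """Hypothetical rescaling of pLDDT (0-100) to [0, 1].

    Assumption for illustration only; not the paper's AF2-score formula.
    """
    return [1.0 - p / 100.0 for p in plddt]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-residue values: pLDDT from a model, RMSF (angstrom) from MD.
plddt = [95.0, 90.0, 70.0, 50.0, 30.0]
rmsf = [0.5, 0.7, 1.5, 2.8, 4.0]

scores = flexibility_score(plddt)
print(pearson(scores, rmsf))
```

A coefficient near 1 for a given protein would correspond to the "highly correlated" case reported for most models, while the abstract notes that such a correlation is not found for the IDP and the randomized protein.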
Affiliation(s)
- Hao-Bo Guo
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Alexander Perminov
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - Computer Science Department, Miami University, Oxford, OH, USA
- Selemon Bekele
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Gary Kedziora
  - General Dynamics Information Technology, Inc., Wright-Patterson Air Force Base, 45433, OH, USA
- Sanaz Farajollahi
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
  - UES Inc., Dayton, OH, USA
- Vanessa Varaljay
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Kevin Hinkle
  - Department of Chemical and Materials Engineering, Dayton University, Dayton, OH, USA
- Valeria Molinero
  - Department of Chemistry, The University of Utah, Salt Lake City, UT, USA
- Konrad Meister
  - Department of Natural Sciences, University of Alaska Southeast, Juneau, AK, USA
  - Max Planck Institute for Polymer Research, Mainz, Germany
- Chia Hung
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Patrick Dennis
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Nancy Kelley-Loughnane
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA
- Rajiv Berry
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, 45433, OH, USA

50

Conformational ensemble of the full-length SARS-CoV-2 nucleocapsid (N) protein based on molecular simulations and SAXS data. Biophys Chem 2022; 288:106843. [PMID: 35696898 PMCID: PMC9172258 DOI: 10.1016/j.bpc.2022.106843] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/08/2022] [Revised: 05/10/2022] [Accepted: 06/02/2022] [Indexed: 11/02/2022]
Abstract
The nucleocapsid protein of the SARS-CoV-2 virus comprises two RNA-binding domains and three regions that are intrinsically disordered. While the structures of the RNA-binding domains have been solved using protein crystallography and NMR, current knowledge of the conformations of the full-length nucleocapsid protein is rather limited. To fill in this knowledge gap, we combined coarse-grained molecular simulations with data from small-angle X-ray scattering (SAXS) experiments using the ensemble refinement of SAXS (EROS) method. Our results show that the dimer of the full-length nucleocapsid protein exhibits large conformational fluctuations with its radius of gyration ranging from about 4 to 8 nm. The RNA-binding domains do not make direct contacts. The disordered region that links these two domains comprises a hydrophobic α-helix which makes frequent and nonspecific contacts with the RNA-binding domains. Each of the intrinsically disordered regions adopts conformations that are locally compact, yet on average, much more extended than Gaussian chains of equivalent lengths. We offer a detailed picture of the conformational ensemble of the nucleocapsid protein dimer under near-physiological conditions, which will be important for understanding the nucleocapsid assembly process.
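The two size measures mentioned in this abstract, the radius of gyration of a conformation and the Gaussian-chain reference, can be written out as a short sketch. It assumes equal-mass beads (as in a one-bead-per-residue coarse-grained model) and uses the standard ideal-chain result Rg = b·sqrt(N/6) for N segments of length b. The coordinates, the segment length, and the function names are hypothetical and not taken from the paper.

```python
import math

def radius_of_gyration(coords):
    """Radius of gyration of equal-mass points given as (x, y, z) tuples."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)

def gaussian_chain_rg(n_segments, segment_length):
    """Ideal (Gaussian) chain radius of gyration: b * sqrt(N / 6)."""
    return segment_length * math.sqrt(n_segments / 6.0)

# Hypothetical coordinates in nm: two beads placed 2 nm apart.
toy_coords = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(radius_of_gyration(toy_coords))

# Hypothetical reference: 6 segments of length 1 nm.
print(gaussian_chain_rg(6, 1.0))
```

Computing `radius_of_gyration` for each conformation in an ensemble yields the kind of distribution the abstract summarizes as ranging from about 4 to 8 nm, and comparing a disordered segment's average against `gaussian_chain_rg` for a chain of equivalent length is one way to express "more extended than a Gaussian chain".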