1
Jeschke G. Protein ensemble modeling and analysis with MMMx. Protein Sci 2024; 33:e4906. [PMID: 38358120 PMCID: PMC10868441 DOI: 10.1002/pro.4906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 01/04/2024] [Accepted: 01/06/2024] [Indexed: 02/16/2024]
Abstract
Proteins, especially of eukaryotes, often have disordered domains and may contain multiple folded domains whose relative spatial arrangement is distributed. The MMMx ensemble modeling and analysis toolbox (https://github.com/gjeschke/MMMx) can support the design of experiments to characterize the distributed structure of such proteins, starting from AlphaFold2 predictions or folded domain structures. Weak order can be analyzed with reference to a random coil model or to peptide chains that match the residue-specific Ramachandran angle distribution of the loop regions and are otherwise unrestrained. The deviation of the mean square end-to-end distance of chain sections from their average over sections of the same sequence length reveals localized compaction or expansion of the chain. The shape sampled by disordered chains is visualized by superposition in the principal axes frame of their inertia tensor. Ensembles of different sizes and with weighted conformers can be compared based on a similarity parameter that abstracts from the ensemble width.
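As a rough illustration of the end-to-end distance analysis described in this abstract, the following Python sketch computes the mean square distance for every chain section of an ensemble and compares it with the average over all sections of the same sequence length. It is not the MMMx implementation: the function name `section_deviation`, the unweighted ensemble average, the input layout, and the choice of a relative (rather than absolute) deviation are all assumptions made for this example.

```python
import numpy as np

def section_deviation(coords):
    """Relative deviation of section mean square end-to-end distances.

    coords: array of shape (n_conformers, n_residues, 3), e.g. C-alpha
    positions (hypothetical input layout; conformers are equally weighted).
    Returns (msd, dev), both of shape (n_residues, n_residues):
      msd[i, j] is the ensemble mean square distance between residues i and j,
      dev[i, j] is msd[i, j] divided by the average over all sections with the
      same |i - j|, minus 1 (negative: compaction, positive: expansion).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise difference vectors for every conformer.
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    msd = (diff ** 2).sum(axis=-1).mean(axis=0)
    n = msd.shape[0]
    dev = np.zeros_like(msd)
    for k in range(1, n):
        # All sections spanning k residues lie on the k-th off-diagonal.
        diag = np.diagonal(msd, offset=k)
        rel = diag / diag.mean() - 1.0
        idx = np.arange(n - k)
        dev[idx, idx + k] = rel
        dev[idx + k, idx] = rel
    return msd, dev
```

For a perfectly uniform chain every section of a given length has the same mean square distance, so `dev` is zero everywhere; localized compaction or expansion shows up as negative or positive patches near the diagonal.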
Affiliation(s)
- Gunnar Jeschke
- Department of Chemistry and Applied Biosciences, ETH Zürich, Zürich, Switzerland
2
Holehouse AS, Kragelund BB. The molecular basis for cellular function of intrinsically disordered protein regions. Nat Rev Mol Cell Biol 2024; 25:187-211. [PMID: 37957331 DOI: 10.1038/s41580-023-00673-0] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Accepted: 09/26/2023] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered protein regions exist in a collection of dynamic interconverting conformations that lack a stable 3D structure. These regions are structurally heterogeneous, ubiquitous and found across all kingdoms of life. Despite the absence of a defined 3D structure, disordered regions are essential for cellular processes ranging from transcriptional control and cell signalling to subcellular organization. Through their conformational malleability and adaptability, disordered regions extend the repertoire of macromolecular interactions and are readily tunable by their structural and chemical context, making them ideal responders to regulatory cues. Recent work has led to major advances in understanding the link between protein sequence and conformational behaviour in disordered regions, yet the link between sequence and molecular function is less well defined. Here we consider the biochemical and biophysical foundations that underlie how and why disordered regions can engage in productive cellular functions, provide examples of emerging concepts and discuss how protein disorder contributes to intracellular information processing and regulation of cellular function.
Affiliation(s)
- Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St Louis, MO, USA.
- Center for Biomolecular Condensates, Washington University in St Louis, St Louis, MO, USA.
- Birthe B Kragelund
- REPIN, Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
3
Tesei G, Trolle AI, Jonsson N, Betz J, Knudsen FE, Pesce F, Johansson KE, Lindorff-Larsen K. Conformational ensembles of the human intrinsically disordered proteome. Nature 2024; 626:897-904. [PMID: 38297118 DOI: 10.1038/s41586-023-07004-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 05/12/2023] [Accepted: 12/19/2023] [Indexed: 02/02/2024]
Abstract
Intrinsically disordered proteins and regions (collectively, IDRs) are pervasive across proteomes in all kingdoms of life, help to shape biological functions and are involved in numerous diseases. IDRs populate a diverse set of transiently formed structures and defy conventional sequence-structure-function relationships [1]. Developments in protein science have made it possible to predict the three-dimensional structures of folded proteins at the proteome scale [2]. By contrast, there is a lack of knowledge about the conformational properties of IDRs, partly because the sequences of disordered proteins are poorly conserved and also because only a few of these proteins have been characterized experimentally. The inability to predict structural properties of IDRs across the proteome has limited our understanding of the functional roles of IDRs and how evolution shapes them. As a supplement to previous structural studies of individual IDRs [3], we developed an efficient molecular model to generate conformational ensembles of IDRs and thereby to predict their conformational properties from sequences [4,5]. Here we use this model to simulate nearly all of the IDRs in the human proteome. Examining conformational ensembles of 28,058 IDRs, we show how chain compaction is correlated with cellular function and localization. We provide insights into how sequence features relate to chain compaction and, using a machine-learning model trained on our simulation data, show the conservation of conformational properties across orthologues. Our results recapitulate observations from previous studies of individual protein systems and exemplify how to link, at the proteome scale, conformational ensembles with cellular function and localization, amino acid sequence, evolutionary conservation and disease variants. Our freely available database of conformational properties will encourage further experimental investigation and enable the generation of hypotheses about the biological roles and evolution of IDRs.
Affiliation(s)
- Giulio Tesei
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
- Anna Ida Trolle
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Nicolas Jonsson
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Johannes Betz
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Frederik E Knudsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Francesco Pesce
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kristoffer E Johansson
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
4
Valverde JM, Dubra G, Phillips M, Haider A, Elena-Real C, Fournet A, Alghoul E, Chahar D, Andrés-Sanchez N, Paloni M, Bernadó P, van Mierlo G, Vermeulen M, van den Toorn H, Heck AJR, Constantinou A, Barducci A, Ghosh K, Sibille N, Knipscheer P, Krasinska L, Fisher D, Altelaar M. A cyclin-dependent kinase-mediated phosphorylation switch of disordered protein condensation. Nat Commun 2023; 14:6316. [PMID: 37813838 PMCID: PMC10562473 DOI: 10.1038/s41467-023-42049-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/05/2023] [Accepted: 09/28/2023] [Indexed: 10/11/2023] Open
Abstract
Cell cycle transitions result from global changes in protein phosphorylation states triggered by cyclin-dependent kinases (CDKs). To understand how this complexity produces an ordered and rapid cellular reorganisation, we generated a high-resolution map of changing phosphosites throughout unperturbed early cell cycles in single Xenopus embryos, derived the emergent principles through systems biology analysis, and tested them by biophysical modelling and biochemical experiments. We found that most dynamic phosphosites share two key characteristics: they occur on highly disordered proteins that localise to membraneless organelles, and are CDK targets. Furthermore, CDK-mediated multisite phosphorylation can switch homotypic interactions of such proteins between favourable and inhibitory modes for biomolecular condensate formation. These results provide insight into the molecular mechanisms and kinetics of mitotic cellular reorganisation.
Affiliation(s)
- Juan Manuel Valverde
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, 3584 CH, Utrecht, Netherlands
- Netherlands Proteomics Center, Padualaan 8, 3584 CH, Utrecht, Netherlands
- Geronimo Dubra
- IGMM, CNRS, University of Montpellier, INSERM, Montpellier, France
- Equipe Labellisée LIGUE 2018, Ligue Nationale Contre le Cancer, Paris, France
- Michael Phillips
- Department of Physics and Astronomy, University of Denver, Denver, CO, 80208, USA
- Austin Haider
- Department of Molecular and Cellular Biophysics, University of Denver, Denver, CO, 80208, USA
- Aurélie Fournet
- CBS, CNRS, University of Montpellier, INSERM, Montpellier, France
- Emile Alghoul
- IGH, CNRS, University of Montpellier, Montpellier, France
- Dhanvantri Chahar
- IGMM, CNRS, University of Montpellier, INSERM, Montpellier, France
- Equipe Labellisée LIGUE 2018, Ligue Nationale Contre le Cancer, Paris, France
- Nuria Andrés-Sanchez
- IGMM, CNRS, University of Montpellier, INSERM, Montpellier, France
- Equipe Labellisée LIGUE 2018, Ligue Nationale Contre le Cancer, Paris, France
- Matteo Paloni
- Department of Physics and Astronomy, University of Denver, Denver, CO, 80208, USA
- Pau Bernadó
- CBS, CNRS, University of Montpellier, INSERM, Montpellier, France
- Guido van Mierlo
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, 6525 GA, The Netherlands
- Michiel Vermeulen
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, 6525 GA, The Netherlands
- Henk van den Toorn
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, 3584 CH, Utrecht, Netherlands
- Netherlands Proteomics Center, Padualaan 8, 3584 CH, Utrecht, Netherlands
- Albert J R Heck
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, 3584 CH, Utrecht, Netherlands
- Netherlands Proteomics Center, Padualaan 8, 3584 CH, Utrecht, Netherlands
- Kingshuk Ghosh
- Department of Physics and Astronomy, University of Denver, Denver, CO, 80208, USA
- Department of Molecular and Cellular Biophysics, University of Denver, Denver, CO, 80208, USA
- Nathalie Sibille
- CBS, CNRS, University of Montpellier, INSERM, Montpellier, France
- Puck Knipscheer
- Oncode Institute, Hubrecht Institute-KNAW and University Medical Center, Utrecht, 3584 CT, Netherlands
- Liliana Krasinska
- IGMM, CNRS, University of Montpellier, INSERM, Montpellier, France
- Equipe Labellisée LIGUE 2018, Ligue Nationale Contre le Cancer, Paris, France
- Daniel Fisher
- IGMM, CNRS, University of Montpellier, INSERM, Montpellier, France.
- Equipe Labellisée LIGUE 2018, Ligue Nationale Contre le Cancer, Paris, France.
- Maarten Altelaar
- Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, 3584 CH, Utrecht, Netherlands.
- Netherlands Proteomics Center, Padualaan 8, 3584 CH, Utrecht, Netherlands.
5
Tang YJ, Yan K, Zhang X, Tian Y, Liu B. Protein intrinsically disordered region prediction by combining neural architecture search and multi-objective genetic algorithm. BMC Biol 2023; 21:188. [PMID: 37674132 PMCID: PMC10483879 DOI: 10.1186/s12915-023-01672-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND: Intrinsically disordered regions (IDRs) are widely distributed in proteins and related to many important biological functions. Accurately identifying IDRs is of great significance for protein structure and function analysis. Because long disordered regions (LDRs) and short disordered regions (SDRs) have different characteristics, existing predictors fail to achieve good and stable performance on datasets with different ratios of LDRs to SDRs. There are two main reasons. First, existing predictors use network structures designed from their authors' experience, such as convolutional neural networks (CNNs) to extract features of neighboring residues and long short-term memory (LSTM) networks to extract long-distance dependencies between residues, but these networks cannot capture the hidden length-dependent features between residues. Second, many deep-learning algorithms have been proposed, but the complementarity of the existing predictors has not been fully explored and used.
RESULTS: In this study, the neural architecture search (NAS) algorithm was employed to automatically construct the network structures so as to capture the hidden features in protein sequences. To predict both LDRs and SDRs stably, the model constructed by NAS was combined with length-dependent models that capture the unique features of SDRs or LDRs and with general models that capture the features common to LDRs and SDRs. The resulting predictor is called IDP-Fusion.
CONCLUSIONS: Experimental results showed that IDP-Fusion achieves more stable performance than other existing predictors on independent test sets with different ratios of SDRs to LDRs.
Affiliation(s)
- Yi-Jun Tang
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China
- Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China
- Xingyi Zhang
- School of Artificial Intelligence, Anhui University, Hefei, 230601, China
- Ye Tian
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China.
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, 100081, China.
6
Chao TH, Rekhi S, Mittal J, Tabor DP. Data-Driven Models for Predicting Intrinsically Disordered Protein Polymer Physics Directly from Composition or Sequence. Mol Syst Des Eng 2023; 8:1146-1155. [PMID: 38222029 PMCID: PMC10786636 DOI: 10.1039/d3me00053b] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/16/2024]
Abstract
The molecular-level understanding of intrinsically disordered proteins is challenging due to experimental characterization difficulties. Computational understanding of IDPs also requires fundamental advances, as the leading tools for predicting protein folding (e.g., AlphaFold), typically fail to describe the structural ensembles of IDPs. The focus of this paper is to 1) develop new representations for intrinsically disordered proteins and 2) pair these representations with classical machine learning and deep learning models to predict the radius of gyration and derived scaling exponent of IDPs. Here, we build a new physically-motivated feature called the bag of amino acid interactions representation, which encodes pairwise interactions explicitly into the representation. This feature essentially counts and weights all possible non-bonded interactions in a sequence and thus is, in principle, compatible with arbitrary sequence lengths. To see how well this new feature performs, both categorical and physically-motivated featurization techniques are tested on a computational dataset containing 10,000 sequences simulated at the coarse-grained level. The results indicate that this new feature outperforms the other purely categorical and physically-motivated features and possesses solid extrapolation capabilities. For future use, this feature can potentially provide physical insights into amino acid interactions, including their temperature dependence, and be applied to other protein spaces.
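The two target quantities named in this abstract can be illustrated with a short Python sketch. It shows the standard definition of the radius of gyration for equally weighted beads and one common way to estimate a scaling exponent, a log-log fit of Rg = R0 * N^nu across chain lengths. Both function names are hypothetical, and the fitting procedure is an assumption for illustration; the paper may derive its exponent differently.

```python
import numpy as np

def radius_of_gyration(coords):
    """Rg of one conformer; coords has shape (n_beads, 3), equal bead masses."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

def fit_scaling_exponent(lengths, rg_values):
    """Fit Rg = R0 * N**nu by linear regression in log-log space.

    lengths: chain lengths N; rg_values: ensemble-averaged Rg per chain.
    Returns (nu, R0). The power-law form is assumed, not taken from the paper.
    """
    slope, intercept = np.polyfit(np.log(lengths), np.log(rg_values), 1)
    return float(slope), float(np.exp(intercept))
```

An exponent near 0.5 corresponds to ideal-chain behaviour, while larger values indicate an expanded chain and smaller values a compact one.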
Affiliation(s)
- Tzu-Hsuan Chao
- Department of Chemistry, Texas A&M University, PO Box 30012, College Station, TX 77842-3012, USA
- Shiv Rekhi
- Department of Chemistry, Texas A&M University, PO Box 30012, College Station, TX 77842-3012, USA
- Jeetain Mittal
- Department of Chemistry, Texas A&M University, PO Box 30012, College Station, TX 77842-3012, USA
- Daniel P Tabor
- Department of Chemistry, Texas A&M University, PO Box 30012, College Station, TX 77842-3012, USA
7
Wohl S, Zheng W. Interpreting Transient Interactions of Intrinsically Disordered Proteins. J Phys Chem B 2023; 127:2395-2406. [PMID: 36917561 PMCID: PMC10038935 DOI: 10.1021/acs.jpcb.3c00096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
The flexible nature of intrinsically disordered proteins (IDPs) gives rise to a conformational ensemble with a diverse set of conformations. The simplest way to describe this ensemble is through a homopolymer model without any specific interactions. However, there has been growing evidence that the conformational properties of IDPs and their relevant functions can be affected by transient interactions between specific and even nonlocal pairs of amino acids. Interpreting these interactions from experimental methods, each of which is most sensitive to a different distance regime referred to as probing length, remains a challenging and unsolved problem. Here, we first show that transient interactions can be realized between short fragments of charged amino acids by generating conformational ensembles using model disordered peptides and coarse-grained simulations. Using these ensembles, we investigate how sensitive different types of experimental measurements are to the presence of transient interactions. We find methods with shorter probing lengths to be more appropriate for detecting these transient interactions, but one experimental method is not sufficient due to the existence of other weak interactions typically seen in IDPs. Finally, we develop an adjusted polymer model with an additional short-distance peak which can robustly reproduce the distance distribution function from two experimental measurements with complementary short and long probing lengths. This new model can suggest whether a homopolymer model is insufficient for describing a specific IDP and meets the challenge of quantitatively identifying specific, transient interactions from a background of nonspecific, weak interactions.
Affiliation(s)
- Samuel Wohl
- Department of Physics, Arizona State University, Tempe, Arizona 85287, United States
- Wenwei Zheng
- College of Integrative Sciences and Arts, Arizona State University, Mesa, Arizona 85212, United States
8
González-Delgado J, Sagar A, Zanon C, Lindorff-Larsen K, Bernadó P, Neuvial P, Cortés J. WASCO: A Wasserstein-based statistical tool to compare conformational ensembles of intrinsically disordered proteins. J Mol Biol 2023:168053. [PMID: 36934808 DOI: 10.1016/j.jmb.2023.168053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 02/10/2023] [Accepted: 03/14/2023] [Indexed: 03/19/2023]
Abstract
The structural investigation of intrinsically disordered proteins (IDPs) requires ensemble models describing the diversity of the conformational states of the molecule. Due to their probabilistic nature, there is a need for new paradigms that understand and treat IDPs from a purely statistical point of view, considering their conformational ensembles as well-defined probability distributions. In this work, we define a conformational ensemble as an ordered set of probability distributions and provide a suitable metric to detect differences between two given ensembles at the residue level, both locally and globally. The underlying geometry of the conformational space is properly integrated, one ensemble being characterized by a set of probability distributions supported on the three-dimensional Euclidean space (for global-scale comparisons) and on the two-dimensional flat torus (for local-scale comparisons). The inherent uncertainty of the data is also taken into account to provide finer estimations of the differences between ensembles. Additionally, an overall distance between ensembles is defined from the differences at the residue level. We illustrate the interest of the approach with several examples of applications for the comparison of conformational ensembles: (i) produced from molecular dynamics (MD) simulations using different force fields, and (ii) before and after refinement with experimental data. We also show the usefulness of the method to assess the convergence of MD simulations, and discuss other potential applications such as in machine-learning-based approaches. The numerical tool has been implemented in Python through easy-to-use Jupyter Notebooks available at https://gitlab.laas.fr/moma/WASCO.
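The idea of comparing two ensembles residue by residue with a Wasserstein distance can be sketched in one dimension. The Python example below is a deliberate simplification and not the WASCO code: WASCO works with distributions in three-dimensional space and on the flat torus and accounts for uncertainty, whereas this sketch assumes a single scalar observable per residue and conformer, equally weighted conformers, and equal ensemble sizes. The function names are hypothetical.

```python
import numpy as np

def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equally sized empirical samples.

    For equal-size, equally weighted 1D samples the optimal transport plan
    pairs the sorted values, so the distance is the mean absolute difference
    of the sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized samples")
    return float(np.abs(x - y).mean())

def per_residue_distances(ens_a, ens_b):
    """Compare two ensembles residue by residue.

    ens_a, ens_b: arrays of shape (n_conformers, n_residues) holding one
    scalar per residue and conformer (a hypothetical observable, for example
    the distance of each residue from the chain centroid).
    """
    ens_a = np.asarray(ens_a, dtype=float)
    ens_b = np.asarray(ens_b, dtype=float)
    return np.array([wasserstein_1d(ens_a[:, i], ens_b[:, i])
                     for i in range(ens_a.shape[1])])
```

The per-residue values could then be aggregated into one overall score, in the spirit of the overall distance mentioned in the abstract; how that aggregation is done in WASCO is not reproduced here.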
Affiliation(s)
- Javier González-Delgado
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France; Institut de Mathématiques de Toulouse, Université de Toulouse, CNRS, Toulouse, France
- Amin Sagar
- Centre de Biologie Structurale, Université de Montpellier, INSERM, CNRS, Montpellier, France
- Kresten Lindorff-Larsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Denmark
- Pau Bernadó
- Centre de Biologie Structurale, Université de Montpellier, INSERM, CNRS, Montpellier, France
- Pierre Neuvial
- Institut de Mathématiques de Toulouse, Université de Toulouse, CNRS, Toulouse, France
- Juan Cortés
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
9
Krasinska L, Fisher D. A Mechanistic Model for Cell Cycle Control in Which CDKs Act as Switches of Disordered Protein Phase Separation. Cells 2022; 11:2189. [PMID: 35883632 PMCID: PMC9321858 DOI: 10.3390/cells11142189] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/08/2022] [Revised: 07/05/2022] [Accepted: 07/06/2022] [Indexed: 12/30/2022] Open
Abstract
Cyclin-dependent kinases (CDKs) are presumed to control the cell cycle by phosphorylating a large number of proteins involved in S-phase and mitosis, two mechanistically disparate biological processes. While the traditional qualitative model of CDK-mediated cell cycle control relies on differences in inherent substrate specificity between distinct CDK-cyclin complexes, they are largely dispensable according to the opposing quantitative model, which states that changes in the overall CDK activity level promote orderly progression through S-phase and mitosis. However, a mechanistic explanation for how such an activity can simultaneously regulate many distinct proteins is lacking. New evidence suggests that the CDK-dependent phosphorylation of ostensibly very diverse proteins might be achieved due to underlying similarity of phosphorylation sites and of the biochemical effects of their phosphorylation: they are preferentially located within intrinsically disordered regions of proteins that are components of membraneless organelles, and they regulate phase separation. Here, we review this evidence and suggest a mechanism for how a single enzyme’s activity can generate the dynamics required to remodel the cell at mitosis.
10
Ghosh K, Huihui J, Phillips M, Haider A. Rules of Physical Mathematics Govern Intrinsically Disordered Proteins. Annu Rev Biophys 2022; 51:355-376. [PMID: 35119946 PMCID: PMC9190209 DOI: 10.1146/annurev-biophys-120221-095357] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Indexed: 12/29/2022]
Abstract
In stark contrast to foldable proteins with a unique folded state, intrinsically disordered proteins and regions (IDPs) persist in perpetually disordered ensembles. Yet an IDP ensemble has conformational features, even when averaged, that are specific to its sequence. In fact, subtle changes in an IDP sequence can modulate its conformational features and its function. Recent advances in theoretical physics reveal a set of elegant mathematical expressions that describe the intricate relationships among IDP sequences, their ensemble conformations, and the regulation of their biological functions. These equations also describe the molecular properties of IDP sequences that predict similarities and dissimilarities in their functions and facilitate classification of sequences by function, an unmet challenge to traditional bioinformatics. These physical sequence-patterning metrics offer a promising new avenue for advancing synthetic biology at a time when multiple novel functional modes mediated by IDPs are emerging.
Affiliation(s)
- Kingshuk Ghosh
- Department of Physics and Astronomy, University of Denver, Denver, Colorado, USA
- Molecular and Cellular Biophysics Program, University of Denver, Denver, Colorado, USA
- Jonathan Huihui
- Department of Physics and Astronomy, University of Denver, Denver, Colorado, USA
- Michael Phillips
- Department of Physics and Astronomy, University of Denver, Denver, Colorado, USA
- Austin Haider
- Molecular and Cellular Biophysics Program, University of Denver, Denver, Colorado, USA
11
Nassar R, Dignon GL, Razban RM, Dill KA. The Protein Folding Problem: The Role of Theory. J Mol Biol 2021; 433:167126. [PMID: 34224747 PMCID: PMC8547331 DOI: 10.1016/j.jmb.2021.167126] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 04/30/2021] [Revised: 06/21/2021] [Accepted: 06/26/2021] [Indexed: 10/20/2022]
Abstract
The protein folding problem was first articulated as the question of how order arose from disorder in proteins: How did the various native structures of proteins arise from interatomic driving forces encoded within their amino acid sequences, and how did they fold so fast? These matters have now been largely resolved by theory and statistical mechanics combined with experiments. There are general principles. Chain randomness is overcome by solvation-based codes. And in the needle-in-a-haystack metaphor, native states are found efficiently because protein haystacks (conformational ensembles) are funnel-shaped. Order-disorder theory has now grown to encompass a large swath of protein physical science across biology.
Affiliation(s)
- Roy Nassar
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA
- Gregory L Dignon
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Rostam M Razban
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA.
12
Lindorff-Larsen K, Kragelund BB. On the potential of machine learning to examine the relationship between sequence, structure, dynamics and function of intrinsically disordered proteins. J Mol Biol 2021; 433:167196. [PMID: 34390736 DOI: 10.1016/j.jmb.2021.167196] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 06/02/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 11/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs, and intrinsically disordered regions (IDRs) interspersed between folded domains, are generally characterized as having no persistent tertiary structure; instead they interconvert between a large number of different and often expanded structures. IDPs and IDRs are involved in an enormously wide range of biological functions and reveal novel mechanisms of interactions, and while they defy the common structure-function paradigm of folded proteins, their structural preferences and dynamics are important for their function. We here discuss open questions in the field of IDPs and IDRs, focusing on areas where machine learning and other computational methods play a role. We discuss computational methods aimed to predict transiently formed local and long-range structure, including methods for integrative structural biology. We discuss the many different ways in which IDPs and IDRs can bind to other molecules, both via short linear motifs, as well as in the formation of larger dynamic complexes such as biomolecular condensates. We discuss how experiments are providing insight into such complexes and may enable more accurate predictions. Finally, we discuss the role of IDPs in disease and how new methods are needed to interpret the mechanistic effects of genomic variants in IDPs.
Affiliation(s)
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
- Birthe B Kragelund
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
13
Ozkan SB. Can sequence-specific and dynamics-based metrics allow us to decipher the function in IDP sequences? Biophys J 2021; 120:1857-1859. [PMID: 33951452 PMCID: PMC8204289 DOI: 10.1016/j.bpj.2021.04.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Revised: 03/05/2021] [Accepted: 04/12/2021] [Indexed: 12/29/2022] Open
Affiliation(s)
- S Banu Ozkan
- Department of Physics, Center for Biological Physics, Arizona State University, Tempe, Arizona.