1
Malebary SJ, Alromema N. iDLB-Pred: identification of disordered lipid binding residues in protein sequences using convolutional neural network. Sci Rep 2024; 14:24724. [PMID: 39433833 PMCID: PMC11494137 DOI: 10.1038/s41598-024-75700-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Accepted: 10/08/2024] [Indexed: 10/23/2024] Open
Abstract
Proteins, nucleic acids, and lipids all interact with intrinsically disordered protein regions. Lipid-binding regions are involved in a variety of biological processes as well as a number of human illnesses. The expanding body of experimental evidence for these interactions and the dearth of techniques to predict them from the protein sequence motivate this work. Although large-scale laboratory techniques are considered essential for studying binding residues, they are time-consuming and costly, which makes it challenging for researchers to identify lipid-binding residues. As a result, computational techniques are being explored as an alternative strategy to overcome this difficulty. To predict disordered lipid-binding residues (DLBRs), we propose the iDLB-Pred predictor, which uses a benchmark dataset and feature extraction techniques to identify relevant patterns and information. Various classification techniques, including deep learning methods such as Convolutional Neural Networks (CNNs), Deep Neural Networks (DNNs), Multilayer Perceptrons (MLPs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs), were employed for model training. The proposed model, iDLB-Pred, was rigorously validated using metrics such as accuracy, sensitivity, specificity, and the Matthews correlation coefficient. The results demonstrate the predictor's exceptional performance, achieving accuracy rates of 81% on an independent dataset and 86% in 10-fold cross-validation.
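The four validation metrics named in this abstract all derive from a binary confusion matrix. The sketch below is illustrative only and is not taken from iDLB-Pred; the function name `binary_metrics` and the example counts are assumptions.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    Hypothetical helper: tp/fn count true binding residues that were
    predicted correctly/incorrectly, tn/fp the same for non-binding residues.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Sensitivity (recall): fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, not results from the paper.
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

Because lipid-binding residues are typically a small minority of all residues, MCC is usually more informative than accuracy alone on such imbalanced data.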
Affiliation(s)
- Sharaf J Malebary
  - Department of Information Technology, Faculty of Computing and Information Technology-Rabigh, King Abdulaziz University, P.O. Box 344, 21911, Rabigh, Saudi Arabia
- Nashwan Alromema
  - Department of Computer Science, Faculty of Computing and Information Technology-Rabigh, King Abdulaziz University, P.O. Box 344, 21911, Rabigh, Saudi Arabia

2
Uversky VN. On the Roles of Protein Intrinsic Disorder in the Origin of Life and Evolution. Life (Basel) 2024; 14:1307. [PMID: 39459607 PMCID: PMC11509291 DOI: 10.3390/life14101307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Revised: 10/13/2024] [Accepted: 10/14/2024] [Indexed: 10/28/2024] Open
Abstract
Obviously, the discussion of different factors that could have contributed to the origin of life and evolution is clear speculation, since there is no way of checking the validity of most of the related hypotheses in practice, as the corresponding events not only already happened, but took place in a very distant past. However, there are a few undisputable facts that are present at the moment, such as the existence of a wide variety of living forms and the abundant presence of intrinsically disordered proteins (IDPs) or hybrid proteins containing ordered domains and intrinsically disordered regions (IDRs) in all living forms. Since it seems that the currently existing living forms originated from a common ancestor, their variety is a result of evolution. Therefore, one could ask a logical question of what role(s) the structureless and highly dynamic but vastly abundant and multifunctional IDPs/IDRs might have in evolution. This study represents an attempt to consider various ideas pertaining to the potential roles of protein intrinsic disorder in the origin of life and evolution.
Affiliation(s)
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA

3
Qin Z, Yuan B, Qu G, Sun Z. Rational enzyme design by reducing the number of hotspots and library size. Chem Commun (Camb) 2024; 60:10451-10463. [PMID: 39210728 DOI: 10.1039/d4cc01394h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Biocatalysts that are eco-friendly, sustainable, and highly specific have great potential for applications in the production of fine chemicals, food, detergents, biofuels, pharmaceuticals, and more. However, due to factors such as low activity, narrow substrate scope, poor thermostability, or incorrect selectivity, most natural enzymes cannot be directly used for large-scale production of the desired products. To overcome these obstacles, protein engineering methods have been developed over decades and have become powerful and versatile tools for adapting enzymes with improved catalytic properties or new functions. The vastness of the protein sequence space makes screening a bottleneck in obtaining advantageous mutated enzymes in traditional directed evolution. In the realm of mathematics, there are two major constraints in the protein sequence space: (1) the number of residue substitutions (M); and (2) the number of codons encoding amino acids as building blocks (N). This feature review highlights protein engineering strategies to reduce screening efforts from two dimensions by reducing the numbers M and N, and also discusses representative seminal studies of rationally engineered natural enzymes to deliver new catalytic functions.
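The two constraints M and N can be read as a combinatorial library size. As a rough sketch, assuming each of the M randomized positions is built from the same N codons, the library holds N^M DNA-level variants. The coverage estimate T = -V ln(1 - F) is a standard oversampling approximation added here for illustration; it is not stated in the abstract. The function names and the codon counts (32 for NNK and 12 for NDT degeneracy) are likewise assumptions chosen as examples.

```python
import math


def library_size(n_codons: int, m_positions: int) -> int:
    """Number of DNA-level variants when m positions each use n codons."""
    if n_codons < 1 or m_positions < 0:
        raise ValueError("need n_codons >= 1 and m_positions >= 0")
    return n_codons ** m_positions


def clones_for_coverage(variants: int, coverage: float = 0.95) -> int:
    """Approximate clones to screen so a fraction `coverage` of variants is seen.

    Uses T = -V * ln(1 - F), which assumes all variants are equally likely.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return math.ceil(-variants * math.log(1.0 - coverage))


if __name__ == "__main__":
    # Compare a 32-codon and a 12-codon alphabet at three randomized positions.
    for label, n in (("NNK", 32), ("NDT", 12)):
        v = library_size(n, 3)
        print(label, v, clones_for_coverage(v))
```

Under these assumptions, shrinking either the codon set or the number of randomized positions reduces the screening effort multiplicatively, which is the motivation the abstract gives for reducing M and N.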
Affiliation(s)
- Zongmin Qin
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
- Bo Yuan
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China
- Ge Qu
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China
- Zhoutong Sun
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China

4
Wang K, Hu G, Basu S, Kurgan L. flDPnn2: Accurate and Fast Predictor of Intrinsic Disorder in Proteins. J Mol Biol 2024; 436:168605. [PMID: 39237195 DOI: 10.1016/j.jmb.2024.168605] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/03/2024] [Revised: 04/16/2024] [Accepted: 05/04/2024] [Indexed: 09/07/2024]
Abstract
Prediction of the intrinsic disorder in protein sequences is an active research area, with well over 100 predictors released to date. These efforts are motivated by the functional importance and high levels of abundance of intrinsic disorder, combined with relatively low amounts of experimental annotations. The disorder predictors are periodically evaluated by independent assessors in the Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiments. The recently completed CAID2 experiment assessed close to 40 state-of-the-art methods, demonstrating that some of them produce accurate results. In particular, the flDPnn2 method, which is the successor of flDPnn that performed well in the CAID1 experiment, secured the overall most accurate results on the Disorder-NOX dataset in CAID2. flDPnn2 implements a number of improvements when compared to its predecessor, including changes to the inputs, increased size of the deep network model that we retrained on a larger training set, and addition of an alignment module. Using results from CAID2, we show that flDPnn2 produces accurate predictions very quickly, modestly improving over the accuracy of flDPnn and reducing the runtime by half, to about 27 s per protein. flDPnn2 is freely available as a convenient web server at http://biomine.cs.vcu.edu/servers/flDPnn2/.
Affiliation(s)
- Kui Wang
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Sushmita Basu
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA

5
Aftab A, Sil S, Nath S, Basu A, Basu S. Intrinsic Disorder and Other Malleable Arsenals of Evolved Protein Multifunctionality. J Mol Evol 2024:10.1007/s00239-024-10196-7. [PMID: 39214891 DOI: 10.1007/s00239-024-10196-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 08/18/2024] [Indexed: 09/04/2024]
Abstract
Microscopic evolution at the functional biomolecular level is an ongoing process. Leveraging functional and high-throughput assays, along with computational data mining, has led to a remarkable expansion of our understanding of multifunctional protein (and gene) families over the past few decades. Various molecular and intermolecular mechanisms are now known that collectively meet the cumulative multifunctional demands in higher organisms along an evolutionary path. This multitasking ability is attributed to a certain degree of intrinsic or adapted flexibility at the structure-function level. Evolutionary diversification of structure-function relationships in proteins highlights the functional importance of intrinsically disordered proteins/regions (IDPs/IDRs) which are highly dynamic biological soft matter. Multifunctionality is favorably supported by the fluid-like shapes of IDPs/IDRs, enabling them to undergo disorder-to-order transitions upon binding to different molecular partners. Other new malleable members of the protein superfamily, such as those involved in fold-switching, also undergo structural transitions. This new insight diverges from all traditional notions of functional singularity in enzyme classes and emphasizes a far more complex, multi-layered diversification of protein functionality. However, a thorough review in this line, focusing on flexibility and function-driven structural transitions related to evolved multifunctionality in proteins, is currently missing. This review attempts to address this gap while broadening the scope of multifunctionality beyond single protein sequences. It argues that protein intrinsic disorder is likely the most striking mechanism for expressing multifunctionality in proteins. A phenomenological analogy has also been drawn to illustrate the increasingly complex nature of modern digital life, driven by the need for multitasking, particularly involving media.
Affiliation(s)
- Asifa Aftab
  - Department of Zoology, Asutosh College (affiliated with University of Calcutta), Kolkata, 700026, India
- Souradeep Sil
  - Department of Genetics, Osmania University, Hyderabad, 500007, India
- Seema Nath
  - Department of Biochemistry and Structural Biology, University of Texas Health Science Center at San Antonio, San Antonio, TX, 78229, USA
- Anirneya Basu
  - Department of Microbiology, Asutosh College (affiliated with University of Calcutta), Kolkata, 700026, India
- Sankar Basu
  - Department of Microbiology, Asutosh College (affiliated with University of Calcutta), Kolkata, 700026, India

6
Pepelnjak M, Velten B, Näpflin N, von Rosen T, Palmiero UC, Ko JH, Maynard HD, Arosio P, Weber-Ban E, de Souza N, Huber W, Picotti P. In situ analysis of osmolyte mechanisms of proteome thermal stabilization. Nat Chem Biol 2024; 20:1053-1065. [PMID: 38424171 PMCID: PMC11288892 DOI: 10.1038/s41589-024-01568-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Accepted: 02/03/2024] [Indexed: 03/02/2024]
Abstract
Organisms use organic molecules called osmolytes to adapt to environmental conditions. In vitro studies indicate that osmolytes thermally stabilize proteins, but mechanisms are controversial, and systematic studies within the cellular milieu are lacking. We analyzed Escherichia coli and human protein thermal stabilization by osmolytes in situ and across the proteome. Using structural proteomics, we probed osmolyte effects on protein thermal stability, structure and aggregation, revealing common mechanisms but also osmolyte- and protein-specific effects. All tested osmolytes (trimethylamine N-oxide, betaine, glycerol, proline, trehalose and glucose) stabilized many proteins, predominantly via a preferential exclusion mechanism, and caused an upward shift in temperatures at which most proteins aggregated. Thermal profiling of the human proteome provided evidence for intrinsic disorder in situ but also identified potential structure in predicted disordered regions. Our analysis provides mechanistic insight into osmolyte function within a complex biological matrix and sheds light on the in situ prevalence of intrinsically disordered regions.
Affiliation(s)
- Monika Pepelnjak
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Britta Velten
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Centre for Organismal Studies (COS) & Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
- Nicolas Näpflin
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Tatjana von Rosen
  - Department of Biology, Institute of Molecular Biology & Biophysics, ETH Zurich, Zurich, Switzerland
- Umberto Capasso Palmiero
  - Department of Chemistry and Applied Biosciences, Institute of Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland
- Jeong Hoon Ko
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA
- Heather D Maynard
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA
- Paolo Arosio
  - Department of Chemistry and Applied Biosciences, Institute of Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland
- Eilika Weber-Ban
  - Department of Biology, Institute of Molecular Biology & Biophysics, ETH Zurich, Zurich, Switzerland
- Natalie de Souza
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Wolfgang Huber
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Paola Picotti
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland

7
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning, we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences.
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India

8
Idrees S, Paudel KR, Sadaf T, Hansbro PM. Uncovering domain motif interactions using high-throughput protein-protein interaction detection methods. FEBS Lett 2024; 598:725-742. [PMID: 38439692 DOI: 10.1002/1873-3468.14841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/09/2024] [Accepted: 02/18/2024] [Indexed: 03/06/2024]
Abstract
Protein-protein interactions (PPIs) are often mediated by short linear motifs (SLiMs) in one protein and a domain in another, known as domain-motif interactions (DMIs). During the past decade, SLiMs have been studied to find their role in cellular functions such as post-translational modifications, regulatory processes, protein scaffolding, cell cycle progression, cell adhesion, cell signalling and substrate selection for proteasomal degradation. This review provides a comprehensive overview of the current PPI detection techniques and resources, focusing on their relevance to capturing interactions mediated by SLiMs. We also address the challenges associated with capturing DMIs. Moreover, a case study analysing the BioGrid database as a source of DMI prediction revealed significant enrichment of known DMIs across different PPI detection methods. Overall, current high-throughput PPI detection methods can be a reliable source for predicting DMIs.
Affiliation(s)
- Sobia Idrees
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Keshav Raj Paudel
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Tayyaba Sadaf
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia
- Philip M Hansbro
  - Centre for Inflammation, Centenary Institute and Faculty of Science, School of Life Sciences, University of Technology Sydney, Australia

9
Pepelnjak M, Rogawski R, Arkind G, Leushkin Y, Fainer I, Ben-Nissan G, Picotti P, Sharon M. Systematic identification of 20S proteasome substrates. Mol Syst Biol 2024; 20:403-427. [PMID: 38287148 PMCID: PMC10987551 DOI: 10.1038/s44320-024-00015-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 12/13/2023] [Accepted: 01/05/2024] [Indexed: 01/31/2024] Open
Abstract
For years, proteasomal degradation was predominantly attributed to the ubiquitin-26S proteasome pathway. However, it is now evident that the core 20S proteasome can independently target proteins for degradation. With approximately half of the cellular proteasomes comprising free 20S complexes, this degradation mechanism is not rare. Identifying 20S-specific substrates is challenging due to the dual-targeting of some proteins to either 20S or 26S proteasomes and the non-specificity of proteasome inhibitors. Consequently, knowledge of 20S proteasome substrates relies on limited hypothesis-driven studies. To comprehensively explore 20S proteasome substrates, we employed advanced mass spectrometry, along with biochemical and cellular analyses. This systematic approach revealed hundreds of 20S proteasome substrates, including proteins undergoing specific N- or C-terminal cleavage, possibly for regulation. Notably, these substrates were enriched in RNA- and DNA-binding proteins with intrinsically disordered regions, often found in the nucleus and stress granules. Under cellular stress, we observed reduced proteolytic activity in oxidized proteasomes, with oxidized protein substrates exhibiting higher structural disorder compared to unmodified proteins. Overall, our study illuminates the nature of 20S substrates, offering crucial insights into 20S proteasome biology.
Affiliation(s)
- Monika Pepelnjak
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Rivkah Rogawski
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Galina Arkind
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Yegor Leushkin
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Irit Fainer
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Gili Ben-Nissan
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Paola Picotti
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Michal Sharon
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel

10
Fang C, He J, Yamana H. MoRF_ESM: Prediction of MoRFs in disordered proteins based on a deep transformer protein language model. J Bioinform Comput Biol 2024; 22:2450006. [PMID: 38812466 DOI: 10.1142/s0219720024500069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
Molecular recognition features (MoRFs) are particular functional segments of disordered proteins, which play crucial roles in regulating the phase transition of membrane-less organelles and frequently serve as central sites in cellular interaction networks. As the association between disordered proteins and severe diseases continues to be discovered, identifying MoRFs has gained growing significance. Due to the limited number of experimentally validated MoRFs, the performance of existing MoRF prediction algorithms still needs to be improved. In this research, we present a model named MoRF_ESM, which utilizes deep-learning protein representations to predict MoRFs in disordered proteins. This approach employs a pretrained ESM-2 protein language model to generate embedding representations of residues in the form of attention map matrices. These representations are combined with a self-learned TextCNN model for feature extraction and prediction. In addition, an averaging step was incorporated at the end of the MoRF_ESM model to refine the output and generate final prediction results. In comparison to other methods on benchmark datasets, the MoRF_ESM approach demonstrates state-of-the-art performance, achieving [Formula: see text] higher AUC than other methods when tested on TEST1 and achieving [Formula: see text] higher AUC than other methods when tested on TEST2. These results imply that the combination of ESM-2 and TextCNN can effectively extract deep evolutionary features related to protein structure and function, along with capturing shallow pattern features located in protein sequences, and is well qualified for the prediction task of MoRFs. Given that ESM-2 is a highly versatile protein language model, the methodology proposed in this study can be readily applied to other tasks involving the classification of protein sequences.
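The abstract mentions a final averaging step but does not specify it. One plausible reading, shown here purely as an assumption, is a centered sliding-window mean over per-residue scores; the function name `smooth_scores`, the window size and the edge handling are all hypothetical and may differ from what MoRF_ESM actually does.

```python
from typing import List, Sequence


def smooth_scores(scores: Sequence[float], window: int = 5) -> List[float]:
    """Centered moving average over per-residue scores.

    Hypothetical post-processing sketch: near the sequence ends the window
    is truncated, so each output is the mean of the residues that exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        segment = scores[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


if __name__ == "__main__":
    # Made-up raw scores for a short sequence; an isolated spike is flattened.
    raw = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.8]
    print(smooth_scores(raw, window=3))
```

Smoothing of this kind suppresses isolated single-residue spikes and favours contiguous predicted segments, which suits features that span several consecutive residues.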
Affiliation(s)
- Chun Fang
  - Department of Information Engineering, Beijing Institute of Petrochemical Technology, 19 Qingyuan North Road, Daxing District, Beijing 102617, P. R. China
  - Department of Computer Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Jiasheng He
  - Department of Information Engineering, Beijing Institute of Petrochemical Technology, 19 Qingyuan North Road, Daxing District, Beijing 102617, P. R. China
- Hayato Yamana
  - Department of Computer Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan

11
Maiti S, Singh A, Maji T, Saibo NV, De S. Experimental methods to study the structure and dynamics of intrinsically disordered regions in proteins. Curr Res Struct Biol 2024; 7:100138. [PMID: 38707546 PMCID: PMC11068507 DOI: 10.1016/j.crstbi.2024.100138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 03/12/2024] [Accepted: 03/15/2024] [Indexed: 05/07/2024] Open
Abstract
Eukaryotic proteins often feature long stretches of amino acids that lack a well-defined three-dimensional structure and are referred to as intrinsically disordered proteins (IDPs) or regions (IDRs). Although these proteins challenge conventional structure-function paradigms, they play vital roles in cellular processes. Recent progress in experimental techniques, such as NMR spectroscopy, single molecule FRET, high speed AFM and SAXS, has provided valuable insights into the biophysical basis of IDP function. This review discusses the advancements made in these techniques, particularly for the study of disordered regions in proteins. In NMR spectroscopy, new strategies such as 13C detection, non-uniform sampling, segmental isotope labeling, and rapid data acquisition methods address the challenges posed by spectral overcrowding and low stability of IDPs. The importance of various NMR parameters, including chemical shifts, hydrogen exchange rates, and relaxation measurements, for revealing transient secondary structures within IDRs and IDPs is presented. Given the high flexibility of IDPs, the review outlines NMR methods for assessing their dynamics at both fast (ps-ns) and slow (μs-ms) timescales. IDPs exert their functions through interactions with other molecules such as proteins, DNA, or RNA. NMR-based titration experiments yield insights into the thermodynamics and kinetics of these interactions. Detailed study of IDPs requires multiple experimental techniques, and thus, several methods are described for studying disordered proteins, highlighting their respective advantages and limitations. The potential for integrating these complementary techniques, each offering unique perspectives, is explored to achieve a comprehensive understanding of IDPs.
Affiliation(s)
- Aakanksha Singh
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Tanisha Maji
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Nikita V. Saibo
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India
- Soumya De
  - School of Bioscience, Indian Institute of Technology Kharagpur, Kharagpur, WB, 721302, India

12
Pasani S, Menon KS, Viswanath S. The molecular architecture of the desmosomal outer dense plaque by integrative structural modeling. bioRxiv 2024:2023.06.13.544884. [PMID: 37398295 PMCID: PMC10312763 DOI: 10.1101/2023.06.13.544884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Desmosomes mediate cell-cell adhesion and are prevalent in tissues under mechanical stress. However, their detailed structural characterization is not available. Here, we characterized the molecular architecture of the desmosomal outer dense plaque (ODP) using Bayesian integrative structural modeling via the Integrative Modeling Platform. Starting principally from the structural interpretation of an electron cryo-tomogram, we integrated information from X-ray crystallography, an immuno-electron microscopy study, biochemical assays, in-silico predictions of transmembrane and disordered regions, homology modeling, and stereochemistry information. The integrative structure was validated by information from imaging, tomography, and biochemical studies that were not used in modeling. The ODP resembles a densely packed cylinder with a PKP layer and a PG layer; the desmosomal cadherins and PKP span these two layers. Our integrative approach allowed us to localize disordered regions, such as N-PKP and PG-C. We refined previous protein-protein interactions between desmosomal proteins and provided possible structural hypotheses for defective cell-cell adhesion in several diseases by mapping disease-related mutations on the structure. Finally, we point to features of the structure that could confer resilience to mechanical stress. Our model provides a basis for generating experimentally verifiable hypotheses on the structure and function of desmosomal proteins in normal and disease states.
Affiliation(s)
- Satwik Pasani
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
- Kavya S Menon
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
- Shruthi Viswanath
- National Center for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
13
Pang Y, Liu B. DisoFLAG: accurate prediction of protein intrinsic disorder and its functions using graph-based interaction protein language model. BMC Biol 2024; 22:3. [PMID: 38166858 PMCID: PMC10762911 DOI: 10.1186/s12915-023-01803-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024] Open
Abstract
Intrinsically disordered proteins and regions (IDPs/IDRs) are functionally important proteins and regions that exist as highly dynamic conformations under natural physiological conditions. IDPs/IDRs exhibit a broad range of molecular functions, and their functions involve binding interactions with partners and remaining native structural flexibility. The rapid increase in the number of proteins in sequence databases and the diversity of disordered functions challenge existing computational methods for predicting protein intrinsic disorder and disordered functions. A disordered region interacts with different partners to perform multiple functions, and these disordered functions exhibit different dependencies and correlations. In this study, we introduce DisoFLAG, a computational method that leverages a graph-based interaction protein language model (GiPLM) for jointly predicting disorder and its multiple potential functions. GiPLM integrates protein semantic information based on pre-trained protein language models into graph-based interaction units to enhance the correlation of the semantic representation of multiple disordered functions. The DisoFLAG predictor takes amino acid sequences as the only inputs and provides predictions of intrinsic disorder and six disordered functions for proteins, including protein-binding, DNA-binding, RNA-binding, ion-binding, lipid-binding, and flexible linker. We evaluated the predictive performance of DisoFLAG following the Critical Assessment of protein Intrinsic Disorder (CAID) experiments, and the results demonstrated that DisoFLAG offers accurate and comprehensive predictions of disordered functions, extending the current coverage of computationally predicted disordered function categories. The standalone package and web server of DisoFLAG have been established to provide accurate prediction tools for intrinsic disorders and their associated functions.
Affiliation(s)
- Yihe Pang
- School of Computer Science and Technology, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China
14
Donald H, Blane A, Buthelezi S, Naicker P, Stoychev S, Majakwara J, Fanucchi S. Assessing the dynamics and macromolecular interactions of the intrinsically disordered protein YY1. Biosci Rep 2023; 43:BSR20231295. [PMID: 37815922 PMCID: PMC10611921 DOI: 10.1042/bsr20231295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 09/26/2023] [Accepted: 10/10/2023] [Indexed: 10/12/2023] Open
Abstract
YY1 is a ubiquitously expressed, intrinsically disordered transcription factor involved in neural development. The oligomeric state of YY1 varies depending on the environment. These structural changes may alter its DNA binding ability and hence its transcriptional activity. Just as YY1's oligomeric state can impact its role in transcription, so does its interaction with other proteins such as FOXP2. The aim of this work is to study the structure and dynamics of YY1 so as to determine the influence of oligomerisation and associations with FOXP2 on its DNA binding mechanism. The results confirm that YY1 is primarily a disordered protein, but it does consist of certain specific structured regions. We observed that YY1 quaternary structure is a heterogeneous mixture of oligomers, the overall size of which is dependent on ionic strength. Both YY1 oligomerisation and its dynamic behaviour are further subject to changes upon DNA binding, whereby increases in DNA concentration result in a decrease in the size of YY1 oligomers. YY1 and the FOXP2 forkhead domain were found to interact with each other both in isolation and in the presence of YY1-specific DNA. The heterogeneous, dynamic multimerisation of YY1 identified in this work is therefore likely to be important for its ability to make heterologous associations with other proteins such as FOXP2. The interactions that YY1 makes with itself, FOXP2 and DNA form part of an intricate mechanism of transcriptional regulation by YY1, which is vital for appropriate neural development.
Affiliation(s)
- Heather Donald
- Protein Structure-Function Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Jan Smuts Ave, Braamfontein, 2050 Johannesburg, Gauteng, South Africa
- Ashleigh Blane
- Protein Structure-Function Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Jan Smuts Ave, Braamfontein, 2050 Johannesburg, Gauteng, South Africa
- Sindisiwe Buthelezi
- CSIR Biosciences, CSIR, Meiring Naude Road, Brummeria, 0001 Pretoria, Gauteng, South Africa
- Previn Naicker
- CSIR Biosciences, CSIR, Meiring Naude Road, Brummeria, 0001 Pretoria, Gauteng, South Africa
- Stoyan Stoychev
- CSIR Biosciences, CSIR, Meiring Naude Road, Brummeria, 0001 Pretoria, Gauteng, South Africa
- Jacob Majakwara
- School of Statistics and Actuarial Science, University of the Witwatersrand, Jan Smuts Ave, Braamfontein, 2050 Johannesburg, Gauteng, South Africa
- Sylvia Fanucchi
- Protein Structure-Function Unit, School of Molecular and Cell Biology, University of the Witwatersrand, Jan Smuts Ave, Braamfontein, 2050 Johannesburg, Gauteng, South Africa
15
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
16
Subedi S, Nag N, Shukla H, Padhi AK, Tripathi T. Comprehensive analysis of liquid-liquid phase separation propensities of HSV-1 proteins and their interaction with host factors. J Cell Biochem 2023. [PMID: 37796176 DOI: 10.1002/jcb.30480] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/28/2023] [Revised: 09/08/2023] [Accepted: 09/17/2023] [Indexed: 10/06/2023]
Abstract
In recent years, it has been shown that the liquid-liquid phase separation (LLPS) of virus proteins plays a crucial role in their life cycle. It promotes the formation of viral replication organelles, concentrating viral components for efficient replication and facilitates the assembly of viral particles. LLPS has emerged as a crucial process in the replication and assembly of herpes simplex virus-1 (HSV-1). Recent studies have identified several HSV-1 proteins involved in LLPS, including the myristylated tegument protein UL11 and infected cell protein 4; however, a complete proteome-level understanding of the LLPS-prone HSV-1 proteins is not available. We provide a comprehensive analysis of the HSV-1 proteome and explore the potential of its proteins to undergo LLPS. By integrating sequence analysis, prediction algorithms and an array of tools and servers, we identified 10 HSV-1 proteins that exhibit high LLPS potential. By analysing the amino acid sequences of the LLPS-prone proteins, we identified specific sequence motifs and enriched amino acid residues commonly found in LLPS-prone regions. Our findings reveal a diverse range of LLPS-prone proteins within the HSV-1, which are involved in critical viral processes such as replication, transcriptional regulation and assembly of viral particles. This suggests that LLPS might play a crucial role in facilitating the formation of specialized viral replication compartments and the assembly of HSV-1 virion. The identification of LLPS-prone proteins in HSV-1 opens up new avenues for understanding the molecular mechanisms underlying viral pathogenesis. Our work provides valuable insights into the LLPS landscape of HSV-1, highlighting potential targets for further experimental validation and enhancing our understanding of viral replication and pathogenesis.
Affiliation(s)
- Sushma Subedi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Niharika Nag
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Harish Shukla
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Aditya K Padhi
- Laboratory for Computational Biology & Biomolecular Design, School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, India
- Timir Tripathi
- Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Department of Zoology, North-Eastern Hill University, Shillong, India
17
Kumar G, Hazra JP, Sinha S. Disordered regions endow structural flexibility to shell proteins and function towards shell-enzyme interactions in 1,2-propanediol utilization microcompartment. J Biomol Struct Dyn 2023; 41:8891-8901. [PMID: 36318590 DOI: 10.1080/07391102.2022.2138552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 10/16/2022] [Indexed: 11/07/2022]
Abstract
Intrinsically disordered regions in proteins have been functionally linked to protein-protein interactions and the genesis of several membraneless organelles. Depending on their residue makeup, hydrophobicity or charge distribution, they may remain in extended form or may assume certain conformations upon binding to a partner protein or peptide. The present work investigates the distribution and potential roles of disordered regions in the integral proteins of 1,2-propanediol utilization microcompartments. We use bioinformatics tools to identify the probable disordered regions in the shell proteins and enzyme of the 1,2-propanediol utilization microcompartment. Using a combination of computational modelling and biochemical techniques, we elucidate the role of disordered terminal regions of a major shell protein and enzyme. Our findings throw light on the importance of disordered regions in self-assembly, providing flexibility to the shell protein and mediating its interaction with a native enzyme. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Gaurav Kumar
- Chemical Biology Unit, Institute of Nano Science and Technology, Mohali, India
- Jagadish Prasad Hazra
- Department of Chemical Sciences, Indian Institute of Science Education and Research, Mohali, India
- Sharmistha Sinha
- Chemical Biology Unit, Institute of Nano Science and Technology, Mohali, India
18
Tomar V, Rikkerink EHA, Song J, Sofkova-Bobcheva S, Bus VGM. Structure-Function Characterisation of Eop1 Effectors from the Erwinia-Pantoea Clade Reveals They May Acetylate Their Defence Target through a Catalytic Dyad. Int J Mol Sci 2023; 24:14664. [PMID: 37834112 PMCID: PMC10572645 DOI: 10.3390/ijms241914664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 09/21/2023] [Accepted: 09/25/2023] [Indexed: 10/15/2023] Open
Abstract
The YopJ group of acetylating effectors from phytopathogens of the genera Pseudomonas and Ralstonia have been widely studied to understand how they modify and suppress their host defence targets. In contrast, studies on a related group of effectors, the Eop1 group, lag far behind. Members of the Eop1 group are widely present in the Erwinia-Pantoea clade of Gram-negative bacteria, which contains phytopathogens, non-pathogens and potential biocontrol agents, implying that they may play an important role in agroecological or pathological adaptations. The lack of research in this group of YopJ effectors has left a significant knowledge gap in their functioning and role. For the first time, we perform a comparative analysis combining AlphaFold modelling, in planta transient expressions and targeted mutational analyses of the Eop1 group effectors from the Erwinia-Pantoea clade, to help elucidate their likely activity and mechanism(s). This integrated study revealed several new findings, including putative binding sites for inositol hexakisphosphate and acetyl coenzyme A and newly postulated target-binding domains, and raises questions about whether these effectors function through a catalytic triad mechanism. The results imply that some Eop1s may use a catalytic dyad acetylation mechanism that we found could be promoted by the electronegative environment around the active site.
Affiliation(s)
- Vishant Tomar
- Mt Albert Research Centre, The New Zealand Institute for Plant and Food Research Limited, Auckland 1025, New Zealand
- School of Agriculture and Environment, Massey University, Private Bag 11222, Palmerston North 4442, New Zealand
- Erik H. A. Rikkerink
- Mt Albert Research Centre, The New Zealand Institute for Plant and Food Research Limited, Auckland 1025, New Zealand
- Janghoon Song
- Pear Research Institute, National Institute of Horticultural & Herbal Science, Rural Development Administration, Naju 58216, Republic of Korea
- Svetla Sofkova-Bobcheva
- School of Agriculture and Environment, Massey University, Private Bag 11222, Palmerston North 4442, New Zealand
- Vincent G. M. Bus
- Hawkes Bay Research Centre, The New Zealand Institute for Plant and Food Research Limited, Havelock North 4130, New Zealand
19
Manyilov VD, Ilyinsky NS, Nesterov SV, Saqr BMGA, Dayhoff GW, Zinovev EV, Matrenok SS, Fonin AV, Kuznetsova IM, Turoverov KK, Ivanovich V, Uversky VN. Chaotic aging: intrinsically disordered proteins in aging-related processes. Cell Mol Life Sci 2023; 80:269. [PMID: 37634152 PMCID: PMC11073068 DOI: 10.1007/s00018-023-04897-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/16/2023] [Revised: 07/03/2023] [Accepted: 07/24/2023] [Indexed: 08/29/2023]
Abstract
The development of aging is associated with the disruption of key cellular processes manifested as well-established hallmarks of aging. Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) have no stable tertiary structure, which allows them to act as configurable hubs in signaling cascades and to regulate many processes, potentially including those related to aging. There is a need to clarify the roles of IDPs/IDRs in aging. The dataset of 1702 aging-related proteins was collected from established aging databases and experimental studies. There is a noticeable presence of IDPs/IDRs, accounting for about 36% of the aging-related dataset, which is, however, less than the disorder content of the whole human proteome (about 40%). A Gene Ontology analysis of the aging proteome used here reveals an abundance of IDPs/IDRs in one-third of aging-associated processes, especially in genome regulation. Signaling pathways associated with aging also contain IDPs/IDRs on different hierarchical levels, revealing the importance of the "structure-function continuum" in aging. Protein-protein interaction network analysis showed that IDPs are present in different clusters associated with different aging hallmarks. The protein cluster enriched in IDPs simultaneously has a high liquid-liquid phase separation (LLPS) probability, "nuclear" localization and DNA-associated functions, related to the aging hallmarks of genomic instability, telomere attrition, epigenetic alterations, and stem cell exhaustion. Intrinsic disorder, LLPS, and aggregation propensity should be considered as features that could be markers of pathogenic proteins. Overall, our analyses indicate that IDPs/IDRs play significant roles in aging-associated processes, particularly in the regulation of DNA functioning. IDP aggregation, which can lead to loss of function and toxicity, could be critically harmful to the cell. A structure-based analysis of aging and the identification of proteins that are particularly susceptible to disturbances can enhance our understanding of the molecular mechanisms of aging and open up new avenues for slowing it down.
Affiliation(s)
- Vladimir D Manyilov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Nikolay S Ilyinsky
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Semen V Nesterov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Baraa M G A Saqr
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Guy W Dayhoff
- Department of Chemistry, University of South Florida, Tampa, FL, USA
- Egor V Zinovev
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Simon S Matrenok
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Alexander V Fonin
- Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Irina M Kuznetsova
- Institute of Cytology, Russian Academy of Sciences, Saint Petersburg, 194064, Russia
- Valentin Ivanovich
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Vladimir N Uversky
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy Pereulok, 9, Dolgoprudny, 141700, Russia
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd., MDC07, Tampa, FL, 33612, USA
20
Flynn AJ, Miller K, Codjoe JM, King MR, Haswell ES. Mechanosensitive ion channels MSL8, MSL9, and MSL10 have environmentally sensitive intrinsically disordered regions with distinct biophysical characteristics in vitro. Plant Direct 2023; 7:e515. [PMID: 37547488 PMCID: PMC10400277 DOI: 10.1002/pld3.515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 08/08/2023]
Abstract
Intrinsically disordered protein regions (IDRs) are highly dynamic sequences that rapidly sample a collection of conformations over time. In the past several decades, IDRs have emerged as a major component of many proteomes, comprising ~30% of all eukaryotic protein sequences. Proteins with IDRs function in a wide range of biological pathways and are notably enriched in signaling cascades that respond to environmental stresses. Here, we identify and characterize intrinsic disorder in the soluble cytoplasmic N-terminal domains of MSL8, MSL9, and MSL10, three members of the MscS-like (MSL) family of mechanosensitive ion channels. In plants, MSL channels are proposed to mediate cell and organelle osmotic homeostasis. Bioinformatic tools unanimously predicted that the cytosolic N-termini of MSL channels are intrinsically disordered. We examined the N-terminus of MSL10 (MSL10N) as an exemplar of these IDRs and circular dichroism spectroscopy confirms its disorder. MSL10N adopted a predominately helical structure when exposed to the helix-inducing compound trifluoroethanol (TFE). Furthermore, in the presence of molecular crowding agents, MSL10N underwent structural changes and exhibited alterations to its homotypic interaction favorability. Lastly, interrogations of collective behavior via in vitro imaging of condensates indicated that MSL8N, MSL9N, and MSL10N have sharply differing propensities for self-assembly into condensates, both inherently and in response to salt, temperature, and molecular crowding. Taken together, these data establish the N-termini of MSL channels as intrinsically disordered regions with distinct biophysical properties and the potential to respond uniquely to changes in their physiochemical environment.
Affiliation(s)
- Aidan J. Flynn
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- NSF Center for Engineering Mechanobiology, Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- Department of Biochemistry and Biophysics, Washington University in St. Louis, St. Louis, Missouri, USA
- Kari Miller
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- NSF Center for Engineering Mechanobiology, Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- Jennette M. Codjoe
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- NSF Center for Engineering Mechanobiology, Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- Matthew R. King
- Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
- Elizabeth S. Haswell
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
- NSF Center for Engineering Mechanobiology, Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
21
Osipov SD, Ryzhykau YL, Zinovev EV, Minaeva AV, Ivashchenko SD, Verteletskiy DP, Sudarev VV, Kuklina DD, Nikolaev MY, Semenov YS, Zagryadskaya YA, Okhrimenko IS, Gette MS, Dronova EA, Shishkin AY, Dencher NA, Kuklin AI, Ivanovich V, Uversky VN, Vlasov AV. I-Shaped Dimers of a Plant Chloroplast FOF1-ATP Synthase in Response to Changes in Ionic Strength. Int J Mol Sci 2023; 24:10720. [PMID: 37445905 DOI: 10.3390/ijms241310720] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/08/2023] [Revised: 06/12/2023] [Accepted: 06/24/2023] [Indexed: 07/15/2023] Open
Abstract
F-type ATP synthases play a key role in oxidative and photophosphorylation processes generating adenosine triphosphate (ATP) for most biochemical reactions in living organisms. In contrast to the mitochondrial FOF1-ATP synthases, those of chloroplasts are known to be mostly monomers with approx. 15% fraction of oligomers interacting presumably non-specifically in a thylakoid membrane. To shed light on the nature of this difference we studied interactions of the chloroplast ATP synthases using small-angle X-ray scattering (SAXS) method. Here, we report evidence of I-shaped dimerization of solubilized FOF1-ATP synthases from spinach chloroplasts at different ionic strengths. The structural data were obtained by SAXS and demonstrated dimerization in response to ionic strength. The best model describing SAXS data was two ATP-synthases connected through F1/F1' parts, presumably via their δ-subunits, forming "I" shape dimers. Such I-shaped dimers might possibly connect the neighboring lamellae in thylakoid stacks assuming that the FOF1 monomers comprising such dimers are embedded in parallel opposing stacked thylakoid membrane areas. If this type of dimerization exists in nature, it might be one of the pathways of inhibition of chloroplast FOF1-ATP synthase for preventing ATP hydrolysis in the dark, when ionic strength in plant chloroplasts is rising. Together with a redox switch inserted into a γ-subunit of chloroplast FOF1 and lateral oligomerization, an I-shaped dimerization might comprise a subtle regulatory process of ATP synthesis and stabilize the structure of thylakoid stacks in chloroplasts.
Affiliation(s)
- Stepan D Osipov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Yury L Ryzhykau
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, 141980 Dubna, Russia
- Egor V Zinovev
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Andronika V Minaeva
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Sergey D Ivashchenko
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Dmitry P Verteletskiy
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Vsevolod V Sudarev
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Daria D Kuklina
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Mikhail Yu Nikolaev
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Yury S Semenov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Yuliya A Zagryadskaya
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Ivan S Okhrimenko
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Margarita S Gette
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Elizaveta A Dronova
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Aleksei Yu Shishkin
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Norbert A Dencher
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Alexander I Kuklin
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, 141980 Dubna, Russia
- Valentin Ivanovich
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Vladimir N Uversky
- Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Alexey V Vlasov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, 141700 Dolgoprudny, Russia
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, 141980 Dubna, Russia
| |
|
22
|
Zhao B, Ghadermarzi S, Kurgan L. Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J 2023; 21:3248-3258. [PMID: 38213902 PMCID: PMC10782001 DOI: 10.1016/j.csbj.2023.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 01/13/2024] Open
Abstract
We expand studies of AlphaFold2 (AF2) in the context of intrinsic disorder prediction by comparing it against a broad selection of 20 accurate, popular and recently released disorder predictors. We use 25% larger benchmark dataset with 646 proteins and cover protein-level predictions of disorder content and fully disordered proteins. AF2-based disorder predictions secure a relatively high Area Under receiver operating characteristic Curve (AUC) of 0.77 and are statistically outperformed by several modern disorder predictors that secure AUCs around 0.8 with median runtime of about 20 s compared to 1200 s for AF2. Moreover, AF2 provides modestly accurate predictions of fully disordered proteins (F1 = 0.59 vs. 0.91 for the best disorder predictor) and disorder content (mean absolute error of 0.21 vs. 0.15). AF2 also generates statistically more accurate disorder predictions for about 20% of proteins that have relatively short sequences and a few disordered regions that tend to be located at the sequence termini, and which are absent of disordered protein-binding regions. Interestingly, AF2 and the most accurate disorder predictors rely on deep neural networks, suggesting that these models are useful for protein structure and disorder predictions.
Affiliation(s)
- Bi Zhao
- Genomics program, College of Public Health, University of South Florida, Tampa, FL, United States
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| |
|
23
|
Reinar WB, Greulich A, Stø IM, Knutsen JB, Reitan T, Tørresen OK, Jentoft S, Butenko MA, Jakobsen KS. Adaptive protein evolution through length variation of short tandem repeats in Arabidopsis. SCIENCE ADVANCES 2023; 9:eadd6960. [PMID: 36947624 PMCID: PMC10032594 DOI: 10.1126/sciadv.add6960] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/06/2022] [Accepted: 02/22/2023] [Indexed: 06/18/2023]
Abstract
Intrinsically disordered protein regions are of high importance for biotic and abiotic stress responses in plants. Tracts of identical amino acids accumulate in these regions and can vary in length over generations because of expansions and retractions of short tandem repeats at the genomic level. However, little attention has been paid to what extent length variation is shaped by natural selection. By environmental association analysis on 2514 length variable tracts in 770 whole-genome sequenced Arabidopsis thaliana, we show that length variation in glutamine and asparagine amino acid homopolymers, as well as in interaction hotspots, correlate with local bioclimatic habitat. We determined experimentally that the promoter activity of a light-stress gene depended on polyglutamine length variants in a disordered transcription factor. Our results show that length variations affect protein function and are likely adaptive. Length variants modulating protein function at a global genomic scale has implications for understanding protein evolution and eco-evolutionary biology.
Affiliation(s)
- William B. Reinar
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Anne Greulich
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Ida M. Stø
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Jonfinn B. Knutsen
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Trond Reitan
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Ole K. Tørresen
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Sissel Jentoft
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Melinka A. Butenko
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Kjetill S. Jakobsen
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| |
|
24
|
Computational prediction of disordered binding regions. Comput Struct Biotechnol J 2023; 21:1487-1497. [PMID: 36851914 PMCID: PMC9957716 DOI: 10.1016/j.csbj.2023.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/26/2022] [Revised: 02/08/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023] Open
Abstract
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs were identified including molecular recognition fragments (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acids- and lipid-binding regions. Prediction of binding IDRs in protein sequences is gaining momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids. We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and extending coverage to RNA, DNA and lipid-binding IDRs. We analyze availability of these methods and show that providing implementations and webservers results in much higher rates of citations/use. We also make several recommendations to take advantage of modern deep network architectures, develop tools that bundle predictions of multiple and different types of binding IDRs, and work on algorithms that model structures of the resulting complexes.
|
25
|
Chakravarty D, Schafer JW, Porter LL. Distinguishing features of fold-switching proteins. Protein Sci 2023; 32:e4596. [PMID: 36782353 PMCID: PMC9951197 DOI: 10.1002/pro.4596] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/23/2022] [Revised: 01/30/2023] [Accepted: 02/09/2023] [Indexed: 02/15/2023]
Abstract
Though many folded proteins assume one stable structure that performs one function, a small-but-increasing number remodel their secondary and tertiary structures and change their functions in response to cellular stimuli. These fold-switching proteins regulate biological processes and are associated with autoimmune dysfunction, severe acute respiratory syndrome coronavirus-2 infection, and more. Despite their biological importance, it is difficult to computationally predict fold switching. With the aim of advancing computational prediction and experimental characterization of fold switchers, this review discusses several features that distinguish fold-switching proteins from their single-fold and intrinsically disordered counterparts. First, the isolated structures of fold switchers are less stable and more heterogeneous than single folders but more stable and less heterogeneous than intrinsically disordered proteins (IDPs). Second, the sequences of single fold, fold switching, and intrinsically disordered proteins can evolve at distinct rates. Third, proteins from these three classes are best predicted using different computational techniques. Finally, late-breaking results suggest that single folders, fold switchers, and IDPs have distinct patterns of residue-residue coevolution. The review closes by discussing high-throughput and medium-throughput experimental approaches that might be used to identify new fold-switching proteins.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Joseph W. Schafer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Lauren L. Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
| |
|
26
|
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023] Open
Abstract
Intrinsically Disordered Proteins (IDPs) and Regions (IDRs) exist widely. Although without well-defined structures, they participate in many important biological processes. In addition, they are also widely related to human diseases and have become potential targets in drug discovery. However, there is a big gap between the experimental annotations related to IDPs/IDRs and their actual number. In recent decades, the computational methods related to IDPs/IDRs have been developed vigorously, including predicting IDPs/IDRs, the binding modes of IDPs/IDRs, the binding sites of IDPs/IDRs, and the molecular functions of IDPs/IDRs according to different tasks. In view of the correlation between these predictors, we have reviewed these prediction methods uniformly for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Affiliation(s)
- Bingqing Han
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
| | - Chongjiao Ren
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
| | - Wenda Wang
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
| | - Jiashan Li
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
| | - Xinqi Gong
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Beijing Academy of Intelligence, Beijing 100083, China
| |
|
27
|
Anbo H, Sakuma K, Fukuchi S, Ota M. How AlphaFold2 Predicts Conditionally Folding Regions Annotated in an Intrinsically Disordered Protein Database, IDEAL. BIOLOGY 2023; 12:182. [PMID: 36829461 PMCID: PMC9952413 DOI: 10.3390/biology12020182] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/26/2022] [Revised: 01/19/2023] [Accepted: 01/21/2023] [Indexed: 01/27/2023]
Abstract
AlphaFold2 (AF2) is a protein structure prediction program which provides accurate models. In addition to predicting structural domains, AF2 assigns intrinsically disordered regions (IDRs) by identifying regions with low prediction reliability (pLDDT). Some regions in IDRs undergo disorder-to-order transition upon binding the interaction partner. Here we assessed model structures of AF2 based on the annotations in IDEAL, in which segments with disorder-to-order transition have been collected as Protean Segments (ProSs). We non-redundantly selected ProSs from IDEAL and classified them based on the root mean square deviation to the corresponding region of AF2 models. Statistical analysis identified 11 structural and sequential features, possibly contributing toward the prediction of ProS structures. These features were categorized into two groups: one that contained pLDDT and the other that contained normalized radius of gyration. The typical ProS structures in the former group comprise a long α helix or a whole or part of the structural domain and those in the latter group comprise a short α helix with terminal loops.
Affiliation(s)
- Hiroto Anbo
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi 371-0816, Japan
| | - Koya Sakuma
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
| | - Satoshi Fukuchi
- Faculty of Engineering, Maebashi Institute of Technology, Maebashi 371-0816, Japan
| | - Motonori Ota
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- Institute for Glyco-core Research, Nagoya University, Nagoya 464-8601, Japan
| |
|
28
|
Peng Z, Li Z, Meng Q, Zhao B, Kurgan L. CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information. Brief Bioinform 2023; 24:6858950. [PMID: 36458437 DOI: 10.1093/bib/bbac502] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/14/2022] [Revised: 09/30/2022] [Accepted: 10/24/2022] [Indexed: 12/04/2022] Open
Abstract
One of key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acids interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. Vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures area under receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.qd.sdu.edu.cn/download/CLIP/.
Affiliation(s)
- Zhenling Peng
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
- Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
| | - Zixia Li
- Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China
| | - Qiaozhen Meng
- College of Intelligence and Computing, Tianjin University, Tianjin, 300072, China
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
29
|
Zhang F, Li M, Zhang J, Shi W, Kurgan L. DeepPRObind: Modular Deep Learner that Accurately Predicts Structure and Disorder-Annotated Protein Binding Residues. J Mol Biol 2023:167945. [PMID: 36621533 DOI: 10.1016/j.jmb.2023.167945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 12/15/2022] [Accepted: 01/01/2023] [Indexed: 01/07/2023]
Abstract
Current sequence-based predictors of protein-binding residues (PBRs) belong to two distinct categories: structure-trained vs. intrinsic disorder-trained. Since disordered PBRs differ from structured PBRs in several ways, including ability to bind multiple partners by folding into different conformations and enrichment in different amino acids, the structure-trained and disorder-trained predictors were shown to provide inaccurate results for the other annotation type. A simple consensus-based solution that combines structure- and disorder-trained methods provides limited levels of predictive performance and generates relatively many cross-predictions, where residues that interact with other ligand types are predicted as PBRs. We address this unsolved problem by designing a novel and fast deep-learner, DeepPRObind, that relies on carefully designed modular convolutional architecture and uses innovative aggregate input features. Comparative empirical tests on a low-similarity test dataset reveal that DeepPRObind generates accurate predictions of structured and disordered PBRs and low amounts of cross-predictions, outperforming a comprehensive collection of 12 predictors of PBRs. Given the relatively low runtime of DeepPRObind (40 seconds per protein), we further validate its results based on an analysis of putative PBRs in the yeast proteome, confirming that interactions in disordered regions are enriched among hub proteins. We release DeepPRObind as a convenient web server at https://www.csuligroup.com/DeepPRObind/.
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
| | - Wenbo Shi
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
30
|
Portability of a Small-Molecule Binding Site between Disordered Proteins. Biomolecules 2022; 12:biom12121887. [PMID: 36551315 PMCID: PMC9775153 DOI: 10.3390/biom12121887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 12/07/2022] [Accepted: 12/09/2022] [Indexed: 12/23/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) are important in both normal and disease states. Small molecules can be targeted to disordered regions, but we currently have only a limited understanding of the nature of small-molecule binding sites in IDPs. Here, we show that a minimal small-molecule binding sequence of eight contiguous residues derived from the Myc protein can be ported into a different disordered protein and recapitulate small-molecule binding activity in the new context. We also find that the residue immediately flanking the binding site can have opposing effects on small-molecule binding in the different disordered protein contexts. The results demonstrate that small-molecule binding sites can act modularly and are portable between disordered protein contexts but that residues outside of the minimal binding site can modulate binding affinity.
|
31
|
Soft disorder modulates the assembly path of protein complexes. PLoS Comput Biol 2022; 18:e1010713. [DOI: 10.1371/journal.pcbi.1010713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 12/01/2022] [Accepted: 11/06/2022] [Indexed: 11/18/2022] Open
Abstract
The relationship between interactions, flexibility and disorder in proteins has been explored from many angles over the years: folding upon binding, flexibility of the core relative to the periphery, entropy changes, etc. In this work, we provide statistical evidence for the involvement of highly mobile and disordered regions in complex assembly. We ordered the entire set of X-ray crystallographic structures in the Protein Data Bank into hierarchies of progressive interactions involving identical or very similar protein chains, yielding 40205 hierarchies of protein complexes with increasing numbers of partners. We then examine them as proxies for the assembly pathways. Using this database, we show that upon oligomerisation, the new interfaces tend to be observed at residues that were characterised as softly disordered (flexible, amorphous or missing residues) in the complexes preceding them in the hierarchy. We also rule out the possibility that this correlation is just a surface effect by restricting the analysis to residues on the surface of the complexes. Interestingly, we find that the location of soft disordered residues in the sequence changes as the number of partners increases. Our results show that there is a general mechanism for protein assembly that involves soft disorder and modulates the way protein complexes are assembled. This work highlights the difficulty of predicting the structure of large protein complexes from sequence and emphasises the importance of linking predictors of soft disorder to the next generation of predictors of complex structure. Finally, we investigate the relationship between the Alphafold2’s confidence metric pLDDT for structure prediction in unbound versus bound structures, and soft disorder. We show a strong correlation between Alphafold2 low confidence residues and the union of all regions of soft disorder observed in the hierarchy. This paves the way for using the pLDDT metric as a proxy for predicting interfaces and assembly paths.
|
32
|
Pang Y, Liu B. DMFpred: Predicting protein disorder molecular functions based on protein cubic language model. PLoS Comput Biol 2022; 18:e1010668. [PMID: 36315580 PMCID: PMC9674156 DOI: 10.1371/journal.pcbi.1010668] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/03/2022] [Revised: 11/18/2022] [Accepted: 10/19/2022] [Indexed: 11/05/2022] Open
Abstract
Intrinsically disordered proteins and regions (IDP/IDRs) are widespread in living organisms and perform various essential molecular functions. These functions are summarized as six general categories, including entropic chain, assembler, scavenger, effector, display site, and chaperone. The alteration of IDP functions is responsible for many human diseases. Therefore, identifying the function of disordered proteins is helpful for the studies of drug target discovery and rational drug design. Experimental identification of the molecular functions of IDP in the wet lab is an expensive and laborious procedure that is not applicable on a large scale. Some computational methods have been proposed and mainly focus on predicting the entropic chain function of IDRs, while the computational predictive methods for the remaining five important categories of disordered molecular functions are desired. Motivated by the growing numbers of experimental annotated functional sequences and the need to expand the coverage of disordered protein function predictors, we proposed DMFpred for disordered molecular functions prediction, covering disordered assembler, scavenger, effector, display site and chaperone. DMFpred employs the Protein Cubic Language Model (PCLM), which incorporates three protein language models for characterizing sequences, structural and functional features of proteins, and attention-based alignment for understanding the relationship among three captured features and generating a joint representation of proteins. The PCLM was pre-trained with large-scaled IDR sequences and fine-tuned with functional annotation sequences for molecular function prediction. The predictive performance evaluation on five categories of functional and multi-functional residues suggested that DMFpred provides high-quality predictions. The web-server of DMFpred can be freely accessed from http://bliulab.net/DMFpred/.
Affiliation(s)
- Yihe Pang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
| |
|
33
|
Chen R, Li X, Yang Y, Song X, Wang C, Qiao D. Prediction of protein-protein interaction sites in intrinsically disordered proteins. Front Mol Biosci 2022; 9:985022. [PMID: 36250006 PMCID: PMC9567019 DOI: 10.3389/fmolb.2022.985022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/03/2022] [Accepted: 07/27/2022] [Indexed: 11/25/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) participate in many biological processes by interacting with other proteins, including the regulation of transcription, translation, and the cell cycle. With the increasing amount of disorder sequence data available, it is thus crucial to identify the IDP binding sites for functional annotation of these proteins. Over the decades, many computational approaches have been developed to predict protein-protein binding sites of IDP (IDP-PPIS) based on protein sequence information. Moreover, there are new IDP-PPIS predictors developed every year with the rapid development of artificial intelligence. It is thus necessary to provide an up-to-date overview of these methods in this field. In this paper, we collected 30 representative predictors published recently and summarized the databases, features and algorithms. We described the procedure how the features were generated based on public data and used for the prediction of IDP-PPIS, along with the methods to generate the feature representations. All the predictors were divided into three categories: scoring functions, machine learning-based prediction, and consensus approaches. For each category, we described the details of algorithms and their performances. Hopefully, our manuscript will not only provide a full picture of the status quo of IDP binding prediction, but also a guide for selecting different methods. More importantly, it will shed light on the inspirations for future development trends and principles.
Affiliation(s)
- Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Xinlu Li
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Yaqing Yang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Xixi Song
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- National Institute of Health Data Science of China, Shandong University, Jinan, China
| | - Dongdong Qiao
- Shandong Mental Health Center, Shandong University, Jinan, China
| |
|
34
|
Jaipuria G, Shet D, Malik S, Swain M, Atreya HS, Galea CA, Slomiany MG, Rosenzweig SA, Forbes BE, Norton RS, Mondal S. IGF-dependent dynamic modulation of a protease cleavage site in the intrinsically disordered linker domain of human IGFBP2. Proteins 2022; 90:1732-1743. [PMID: 35443068 PMCID: PMC9357107 DOI: 10.1002/prot.26350] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/11/2021] [Revised: 03/02/2022] [Accepted: 03/22/2022] [Indexed: 12/29/2022]
Abstract
Functional regulation via conformational dynamics is well known in structured proteins but less well characterized in intrinsically disordered proteins and their complexes. Using NMR spectroscopy, we have identified a dynamic regulatory mechanism in the human insulin-like growth factor (IGF) system involving the central, intrinsically disordered linker domain of human IGF-binding protein-2 (hIGFBP2). The bioavailability of IGFs is regulated by the proteolysis of IGF-binding proteins. In the case of hIGFBP2, the linker domain (L-hIGFBP2) retains its intrinsic disorder upon binding IGF-1, but its dynamics are significantly altered, both in the IGF binding region and distantly located protease cleavage sites. The increase in flexibility of the linker domain upon IGF-1 binding may explain the IGF-dependent modulation of proteolysis of IGFBP2 in this domain. As IGF homeostasis is important for cell growth and function, and its dysregulation is a key contributor to several cancers, our findings open up new avenues for the design of IGFBP analogs inhibiting IGF-dependent tumors.
Affiliation(s)
- Garima Jaipuria
- NMR Research Centre, Indian Institute of Science, Bangalore-560012, India
| | - Divya Shet
- NMR Research Centre, Indian Institute of Science, Bangalore-560012, India,Nanobiophysics lab, Raman Research Institute, Sadashivnagar, Bangalore-80, India
| | - Shahid Malik
- NMR Research Centre, Indian Institute of Science, Bangalore-560012, India
| | - Monalisa Swain
- NMR Research Centre, Indian Institute of Science, Bangalore-560012, India,Frederick National Laboratory for Cancer Research, Maryland-21701, USA
| | | | - Charles A. Galea
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia
| | - Mark G. Slomiany
- Department of Cell and Molecular Pharmacology, Medical University of South Carolina, Charleston SC 29425, USA
| | - Steven A. Rosenzweig
- Department of Cell and Molecular Pharmacology, Medical University of South Carolina, Charleston SC 29425, USA
| | - Briony E. Forbes
- Flinders Health and Medical Research Institute, Flinders University, SA 5042, Australia
| | - Raymond S. Norton
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia,ARC Centre for Fragment-Based Design, Monash University, Parkville 3052, Australia
| | - Somnath Mondal
- NMR Research Centre, Indian Institute of Science, Bangalore-560012, India,Univ. Bordeaux, Institut Européen de Chimie et Biologie and INSERM U1212, ARNA Laboratory, 2 rue Robert Escarpit, 33607 Pessac Cedex, Bordeaux, France
| |
|
35
|
Chaudhary A, Chaurasia PK, Kushwaha S, Chauhan P, Chawade A, Mani A. Correlating multi-functional role of cold shock domain proteins with intrinsically disordered regions. Int J Biol Macromol 2022; 220:743-753. [PMID: 35987358 DOI: 10.1016/j.ijbiomac.2022.08.100] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/09/2022] [Revised: 07/26/2022] [Accepted: 08/14/2022] [Indexed: 11/05/2022]
Abstract
Cold shock proteins (CSPs) are an ancient and conserved family of proteins. They are renowned for their role in the response to low-temperature stress in bacteria and for their nucleic acid binding activities. In prokaryotes, cold-inducible and non-cold-inducible CSPs are involved in various cellular and metabolic processes such as growth and development, osmotic oxidation, starvation, stress tolerance, and host cell invasion. In prokaryotes, cold shock conditions reduce the efficiency of transcription and translation. Eukaryotic cold shock domain (CSD) proteins are an evolved form of prokaryotic CSPs in which the CSD is flanked by N- and C-terminal domains. Eukaryotic CSPs are multi-functional proteins. CSPs also act as nucleic acid chaperones by preventing the formation of secondary structures in mRNA at low temperatures. In humans, CSD proteins play a crucial role in the progression of breast cancer, colon cancer, lung cancer, and Alzheimer's disease. A well-defined three-dimensional structure of the intrinsically disordered regions of CSP family members is still undetermined. In this article, the intrinsically disordered regions of CSPs are explored systematically to understand the pleiotropic role of the cold shock family of proteins.
Affiliation(s)
- Amit Chaudhary
- Department of Metallurgical Engineering & Materials Science, Indian Institute of Technology Bombay
| | - Pankaj Kumar Chaurasia
- PG Department of Chemistry, L.S. College, Babasaheb Bhimrao Ambedkar Bihar University, Muzaffarpur, Bihar 842001, India
| | - Sandeep Kushwaha
- National Institute of Animal Biotechnology, Hyderabad 500032, India.
| | | | - Aakash Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, 230 53 Alnarp, Sweden.
| | - Ashutosh Mani
- Department of Biotechnology, Motilal Nehru National Institute of Technology Allahabad, Prayagraj 211004, India.
| |
|
36
|
Intrinsically disordered proteins and proteins with intrinsically disordered regions in neurodegenerative diseases. Biophys Rev 2022; 14:679-707. [DOI: 10.1007/s12551-022-00968-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/23/2021] [Accepted: 05/28/2022] [Indexed: 12/14/2022] Open
|
37
|
Li H, Pang Y, Liu B, Yu L. MoRF-FUNCpred: Molecular Recognition Feature Function Prediction Based on Multi-Label Learning and Ensemble Learning. Front Pharmacol 2022; 13:856417. [PMID: 35350759 PMCID: PMC8957949 DOI: 10.3389/fphar.2022.856417] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/17/2022] [Accepted: 02/14/2022] [Indexed: 01/13/2023] Open
Abstract
Intrinsically disordered regions (IDRs) without stable structure are important for protein structures and functions. Some IDRs undergo a disorder-to-order transition when they bind molecular partners; these are called molecular recognition features (MoRFs). MoRFs have five main functions: molecular recognition assembler (MoR_assembler), molecular recognition chaperone (MoR_chaperone), molecular recognition display sites (MoR_display_sites), molecular recognition effector (MoR_effector), and molecular recognition scavenger (MoR_scavenger). Research on the functions of molecular recognition features is important for pharmaceutical research and for understanding disease pathogenesis. However, the existing computational methods can only predict the MoRFs in proteins, failing to distinguish their different functions. In this paper, we treat MoRF function prediction as a multi-label learning task and solve it with the Binary Relevance (BR) strategy. We use Support Vector Machine (SVM), Logistic Regression (LR), Decision Tree (DT), and Random Forest (RF) as base models to construct MoRF-FUNCpred through ensemble learning. Experimental results show that MoRF-FUNCpred performs well for MoRF function prediction. To the best of our knowledge, MoRF-FUNCpred is the first predictor of the functions of MoRFs. Availability and Implementation: The stand-alone package of MoRF-FUNCpred can be accessed from https://github.com/LiangYu-Xidian/MoRF-FUNCpred.
Affiliation(s)
- Haozheng Li
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Yihe Pang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China.,Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
| | - Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| |
|
38
|
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
| | - Tâp Ha-Duong
- Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
| |
|
39
|
Bondos SE, Dunker AK, Uversky VN. Intrinsically disordered proteins play diverse roles in cell signaling. Cell Commun Signal 2022; 20:20. [PMID: 35177069 PMCID: PMC8851865 DOI: 10.1186/s12964-022-00821-7] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 09/22/2021] [Accepted: 12/11/2021] [Indexed: 11/29/2022] Open
Abstract
Signaling pathways allow cells to detect and respond to a wide variety of chemical (e.g., Ca2+ or chemokine proteins) and physical stimuli (e.g., shear stress, light). Together, these pathways form an extensive communication network that regulates basic cell activities and coordinates the function of multiple cells or tissues. The process of cell signaling imposes many demands on the proteins that comprise these pathways, including the abilities to form active and inactive states, and to engage in multiple protein interactions. Furthermore, successful signaling often requires amplifying the signal, regulating or tuning the response to the signal, and combining information sourced from multiple pathways, all while ensuring fidelity of the process. This sensitivity, adaptability, and tunability are possible, in part, due to the inclusion of intrinsically disordered regions in many proteins involved in cell signaling. The goal of this collection is to highlight the many roles of intrinsic disorder in cell signaling. Following an overview of resources that can be used to study intrinsically disordered proteins, this review highlights the critical role of intrinsically disordered proteins for signaling in widely diverse organisms (animals, plants, bacteria, fungi), in every category of cell signaling pathway (autocrine, juxtacrine, intracrine, paracrine, and endocrine) and at each stage (ligand, receptor, transducer, effector, terminator) in the cell signaling process. Thus, a cell signaling pathway cannot be fully described without understanding how intrinsically disordered protein regions contribute to its function. The ubiquitous presence of intrinsic disorder in different stages of diverse cell signaling pathways suggests that more mechanisms by which disorder modulates intra- and inter-cell signals remain to be discovered.
Affiliation(s)
- Sarah E. Bondos
- Department of Molecular and Cellular Medicine, Texas A&M Health Science Center, College Station, TX 77843 USA
| | - A. Keith Dunker
- Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202 USA
| | - Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612 USA
- Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, Pushchino, Moscow Region, Russia 142290
| |
|
40
|
Li L, Zhou X, Chen Z, Cao Y, Zhao G. The group 3 LEA protein of Artemia franciscana for cryopreservation. Cryobiology 2022; 106:1-12. [DOI: 10.1016/j.cryobiol.2022.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Revised: 01/25/2022] [Accepted: 01/25/2022] [Indexed: 11/03/2022]
|
41
|
Grape ASR-Silencing Sways Nuclear Proteome, Histone Marks and Interplay of Intrinsically Disordered Proteins. Int J Mol Sci 2022; 23:ijms23031537. [PMID: 35163458 PMCID: PMC8835812 DOI: 10.3390/ijms23031537] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/21/2021] [Revised: 01/25/2022] [Accepted: 01/26/2022] [Indexed: 01/27/2023] Open
Abstract
To unravel the functions of ASR (Abscisic acid, Stress, Ripening-induced) proteins in the nucleus, we created a new model of genetically transformed grape embryogenic cells by RNAi-knockdown of grape ASR (VvMSA). Nuclear proteomes of wild-type and VvMSA-RNAi grape cell lines were analyzed by quantitative isobaric tagging (iTRAQ 8-plex). The most significantly up- or down-regulated nuclear proteins were involved in epigenetic regulation, DNA replication/repair, transcription, mRNA splicing/stability/editing, rRNA processing/biogenesis, metabolism, cell division/differentiation and stress responses. The most spectacular up-regulation in VvMSA-silenced cells was that of the stress response protein VvLEA D-29 (Late Embryogenesis Abundant). The VvMSA and VvLEA D-29 genes displayed strong and contrasting responsiveness to auxin depletion: repression of VvMSA and induction of VvLEA D-29. In silico analysis of the VvMSA and VvLEA D-29 proteins highlighted their intrinsically disordered nature and possible compensatory relationship. Semi-quantitative evaluation by medium-throughput immunoblotting of eighteen post-translational modifications of histones H3 and H4 in VvMSA-knockdown cells showed significant enrichment/depletion of the histone marks H3K4me1, H3K4me3, H3K9me1, H3K9me2, H3K36me2, H3K36me3 and H4K16ac. We demonstrate that grape ASR repression differentially affects members of complex nucleoprotein structures, and that grape ASR may not only act as a molecular chaperone/transcription factor but may also participate in plant responses to developmental and environmental cues through epigenetic mechanisms.
|
42
|
Kulkarni P, Behal A, Mohanty A, Salgia R, Nedelcu AM, Uversky VN. Co-opting disorder into order: Intrinsically disordered proteins and the early evolution of complex multicellularity. Int J Biol Macromol 2022; 201:29-36. [PMID: 34998872 DOI: 10.1016/j.ijbiomac.2021.12.182] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/29/2021] [Revised: 12/18/2021] [Accepted: 12/28/2021] [Indexed: 02/07/2023]
Abstract
Intrinsically disordered proteins (IDPs) are proteins that lack rigid structures yet play important roles in myriad biological phenomena. A distinguishing feature of IDPs is that they often mediate specific biological outcomes via multivalent weak cooperative interactions with multiple partners. Here, we show that several proteins specifically associated with processes that were key in the evolution of complex multicellularity in the lineage leading to the multicellular green alga Volvox carteri are IDPs. We suggest that, by rewiring cellular protein interaction networks, IDPs facilitated the co-option of ancestral pathways for specialized multicellular functions, underscoring the importance of IDPs in the early evolution of complex multicellularity.
Affiliation(s)
- Prakash Kulkarni
- Department of Medical Oncology and Experimental Therapeutics, City of Hope National Medical Center, Duarte, CA, USA.
| | - Amita Behal
- Department of Medical Oncology and Experimental Therapeutics, City of Hope National Medical Center, Duarte, CA, USA
| | - Atish Mohanty
- Department of Medical Oncology and Experimental Therapeutics, City of Hope National Medical Center, Duarte, CA, USA
| | - Ravi Salgia
- Department of Medical Oncology and Experimental Therapeutics, City of Hope National Medical Center, Duarte, CA, USA
| | - Aurora M Nedelcu
- Department of Biology, University of New Brunswick, Fredericton, Canada.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA; Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Institutskiy pereulok, 9, Dolgoprudny, Moscow region 141700, Russia.
| |
|
43
|
Tamburrini KC, Pesce G, Nilsson J, Gondelaud F, Kajava AV, Berrin JG, Longhi S. Predicting Protein Conformational Disorder and Disordered Binding Sites. Methods Mol Biol 2022; 2449:95-147. [PMID: 35507260 DOI: 10.1007/978-1-0716-2095-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In the last two decades it has become increasingly evident that a large number of proteins adopt either a fully or a partially disordered conformation. Intrinsically disordered proteins are ubiquitous proteins that fulfill essential biological functions while lacking a stable 3D structure. Their conformational heterogeneity is encoded by the amino acid sequence, thereby allowing intrinsically disordered proteins or regions to be recognized based on their sequence properties. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to crystallization. This chapter focuses on the methods currently employed for predicting protein disorder and identifying intrinsically disordered binding sites.
Affiliation(s)
- Ketty C Tamburrini
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Giulia Pesce
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Juliet Nilsson
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Frank Gondelaud
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Université Montpellier, Montpellier, France
| | - Jean-Guy Berrin
- INRAE, Aix Marseille Univ, Biodiversité et Biotechnologie Fongiques (BBF), UMR 1163, Marseille, France
| | - Sonia Longhi
- Aix Marseille Univ, CNRS, Architecture et Fonction des Macromolécules Biologiques, AFMB, UMR 7257, Marseille, France.
| |
|
44
|
Katuwawala A, Zhao B, Kurgan L. DisoLipPred: accurate prediction of disordered lipid-binding residues in protein sequences with deep recurrent networks and transfer learning. Bioinformatics 2021; 38:115-124. [PMID: 34487138 DOI: 10.1093/bioinformatics/btab640] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/31/2021] [Revised: 08/05/2021] [Accepted: 09/02/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Intrinsically disordered protein regions interact with proteins, nucleic acids and lipids. Regions that bind lipids are implicated in a wide spectrum of cellular functions and several human diseases. Motivated by the growing amount of experimental data for these interactions and the lack of tools that can predict them from the protein sequence, we develop DisoLipPred, the first predictor of disordered lipid-binding residues (DLBRs). RESULTS: DisoLipPred relies on a deep bidirectional recurrent network that implements three innovative features: transfer learning, a bypass module that sidesteps predictions for putative structured residues, and expanded inputs that cover physicochemical properties associated with protein-lipid interactions. Ablation analysis shows that these features drive the predictive quality of DisoLipPred. Tests on an independent test dataset and the yeast proteome reveal that DisoLipPred generates accurate results and that none of the related existing tools can be used to indirectly identify DLBRs. We also show that DisoLipPred's predictions complement the results generated by predictors of transmembrane regions. Altogether, we conclude that DisoLipPred provides high-quality predictions of DLBRs that complement the currently available methods. AVAILABILITY AND IMPLEMENTATION: DisoLipPred's webserver is available at http://biomine.cs.vcu.edu/servers/DisoLipPred/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
45
|
Zhang F, Zhao B, Shi W, Li M, Kurgan L. DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning. Brief Bioinform 2021; 23:6461158. [PMID: 34905768 DOI: 10.1093/bib/bbab521] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 08/02/2021] [Revised: 10/30/2021] [Accepted: 11/14/2021] [Indexed: 12/14/2022] Open
Abstract
Proteins with intrinsically disordered regions (IDRs) are common among eukaryotes. Many IDRs interact with nucleic acids and proteins. Annotation of these interactions is supported by computational predictors, but to date, only one tool that predicts interactions with nucleic acids was released, and recent assessments demonstrate that current predictors offer modest levels of accuracy. We have developed DeepDISOBind, an innovative deep multi-task architecture that accurately predicts deoxyribonucleic acid (DNA)-, ribonucleic acid (RNA)- and protein-binding IDRs from protein sequences. DeepDISOBind relies on an information-rich sequence profile that is processed by an innovative multi-task deep neural network, where subsequent layers are gradually specialized to predict interactions with specific partner types. The common input layer links to a layer that differentiates protein- and nucleic acid-binding, which further links to layers that discriminate between DNA and RNA interactions. Empirical tests show that this multi-task design provides statistically significant gains in predictive quality across the three partner types when compared to a single-task design and a representative selection of the existing methods that cover both disorder- and structure-trained tools. Analysis of the predictions on the human proteome reveals that DeepDISOBind predictions can be encoded into protein-level propensities that accurately predict DNA- and RNA-binding proteins and protein hubs. DeepDISOBind is available at https://www.csuligroup.com/DeepDISOBind/.
Affiliation(s)
- Fuhao Zhang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Wenbo Shi
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, China
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
| |
|
46
|
Chio US, Liu Y, Chung S, Shim WJ, Chandrasekar S, Weiss S, Shan SO. Subunit cooperation in the Get1/2 receptor promotes tail-anchored membrane protein insertion. J Cell Biol 2021; 220:212681. [PMID: 34614151 PMCID: PMC8530227 DOI: 10.1083/jcb.202103079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2021] [Revised: 08/03/2021] [Accepted: 08/19/2021] [Indexed: 11/29/2022] Open
Abstract
The guided entry of tail-anchored protein (GET) pathway, in which the Get3 ATPase delivers an essential class of tail-anchored membrane proteins (TAs) to the Get1/2 receptor at the endoplasmic reticulum, provides a conserved mechanism for TA biogenesis in eukaryotic cells. The membrane-associated events of this pathway remain poorly understood. Here we show that complex assembly between the cytosolic domains (CDs) of Get1 and Get2 strongly enhances the affinity of the individual subunits for Get3•TA, thus enabling efficient capture of the targeting complex. In addition to the known role of Get1CD in remodeling Get3 conformation, two molecular recognition features (MoRFs) in Get2CD induce Get3 opening, and both subunits are required for optimal TA release from Get3. Mutation of the MoRFs attenuates TA insertion into the ER in vivo. Our results demonstrate extensive cooperation between the Get1/2 receptor subunits in the capture and remodeling of the targeting complex, and emphasize the role of MoRFs in receptor function during membrane protein biogenesis.
Affiliation(s)
- Un Seng Chio
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA
| | - Yumeng Liu
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA
| | - SangYoon Chung
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA
| | - Woo Jun Shim
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA
| | - Sowmya Chandrasekar
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA
| | - Shimon Weiss
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA.,Department of Physics, Institute for Nanotechnology and Advanced Materials, Bar-Ilan University, Ramat-Gan, Israel
| | - Shu-Ou Shan
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA
| |
|
47
|
Shafat Z, Ahmed A, Parvez MK, Parveen S. Role of "dual-personality" fragments in HEV adaptation-analysis of Y-domain region. J Genet Eng Biotechnol 2021; 19:154. [PMID: 34637041 PMCID: PMC8511232 DOI: 10.1186/s43141-021-00238-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2021] [Accepted: 08/30/2021] [Indexed: 01/06/2023]
Abstract
BACKGROUND: Hepatitis E is a liver disease caused by the hepatitis E virus (HEV). The largest polyprotein, open reading frame 1 (ORF1), contains a nonstructural Y-domain region (YDR) whose activity in HEV adaptation remains uncharted. Disordered regions in several nonstructural proteins have been demonstrated to participate in the multiplication and multiple regulatory functions of viruses. Thus, the intrinsic disorder of YDR, including its structural and functional annotation, was comprehensively studied using computational methodologies to delineate its role in viral adaptation. RESULTS: Our findings show that YDR contains significantly higher levels of ordered regions, with a lower prevalence of disordered residues. Sequence-based analysis of YDR revealed it to be a "dual personality" (DP) protein due to the presence of both structured and unstructured (intrinsically disordered) regions. The evolution of YDR was shaped by pressures that lead towards a predominance of both disordered and regularly folded amino acids (Ala, Arg, Gly, Ile, Leu, Phe, Pro, Ser, Tyr, Val). Additionally, the predominance of characteristic DP residues (Thr, Arg, Gly, and Pro) further showed the order as well as disorder characteristics possessed by YDR. The intrinsic disorder propensity analysis of YDR revealed it to be a moderately disordered protein. All the YDR sequences contained molecular recognition features (MoRFs), i.e., intrinsic disorder-based protein-protein interaction (PPI) sites, in addition to several nucleotide-binding sites. Thus, the presence of molecular recognition (PPI, RNA binding, and DNA binding) signifies YDR's interaction with specific partners and host membranes, leading to further viral infection. The presence of various disorder-based phosphorylation sites further signifies the role of YDR in various biological processes. Furthermore, functional annotation of YDR revealed it to be a multifunctional protein, due to its susceptibility to binding a wide range of ligands and its involvement in various catalytic activities. CONCLUSIONS: Because DP proteins are targets for regulation, YDR contributes to cellular signaling processes through PPIs. Because YDR is incompletely understood, our data on disorder-based function could help in better understanding its associated functions. Collectively, this comprehensive investigation is the first attempt to delineate the role of YDR in the regulation and pathogenesis of HEV.
Affiliation(s)
- Zoya Shafat
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, India
| | - Anwar Ahmed
- Centre of Excellence in Biotechnology Research, College of Science, King Saud University, Riyadh, Saudi Arabia
| | - Mohammad K Parvez
- Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
| | - Shama Parveen
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, India.
| |
|
48
|
He H, Zhou Y, Chi Y, He J. Prediction of MoRFs based on sequence properties and convolutional neural networks. BioData Min 2021; 14:39. [PMID: 34391457 PMCID: PMC8364704 DOI: 10.1186/s13040-021-00275-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/13/2021] [Accepted: 08/08/2021] [Indexed: 12/02/2022] Open
Abstract
Background: Intrinsically disordered proteins possess flexible 3-D structures, which allows them to play an important role in a variety of biological functions. Molecular recognition features (MoRFs) are an important type of functional region; they are located within longer intrinsically disordered regions and undergo disorder-to-order transitions upon binding their interaction partners. Results: We develop a method, MoRFCNN, to predict MoRFs based on sequence properties and convolutional neural networks (CNNs). The sequence properties comprise structural and physicochemical properties that are used to describe the differences between MoRFs and non-MoRFs. In particular, to highlight the correlation between the target residue and adjacent residues, three windows are selected to preprocess the selected properties. These calculated properties are then combined into the feature matrix to predict MoRFs through the constructed CNN. Compared with other existing methods, MoRFCNN obtains better performance. Conclusions: MoRFCNN is a new individual MoRF prediction method that uses only protein sequence properties, without evolutionary information. The simulation results show that MoRFCNN is effective and competitive.
Affiliation(s)
- Hao He
- School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, China
- Yatong Zhou
- School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, China
- Yue Chi
- School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, China
- Jingfei He
- School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, China
49
Chen TR, Lo CH, Juan SH, Lo WC. The influence of dataset homology and a rigorous evaluation strategy on protein secondary structure prediction. PLoS One 2021; 16:e0254555. [PMID: 34260641 PMCID: PMC8279362 DOI: 10.1371/journal.pone.0254555] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/16/2020] [Accepted: 06/29/2021] [Indexed: 11/28/2022] Open
Abstract
The secondary structure prediction (SSP) of proteins has long been an essential structural biology technique with various applications. Despite its vital role in many research and industrial fields, in recent years, as the accuracy of state-of-the-art secondary structure predictors has approached the theoretical upper limit, SSP has come to be considered either no longer challenging or too challenging to advance. Believing that substantial improvement of SSP will move forward the many fields that depend on it, we conducted this study, which focuses on three issues that have not yet been noticed or thoroughly examined but may have affected the reliability of the evaluation of previous SSP algorithms. These issues all concern the sequence homology between or within the developmental and evaluation datasets. We thus designed many different homology layouts of datasets to train and evaluate SSP models. Multiple repeats were performed in each experiment by random sampling. The conclusions obtained with small experimental datasets were verified with large-scale datasets using state-of-the-art SSP algorithms. Contrary to the long-established assumption, we find that the sequence homology between query datasets for training, testing, and independent tests exerts little influence on SSP accuracy. In addition, sequence homology redundancy between or within most datasets causes the accuracy of an SSP algorithm to be overestimated, while redundancy within the reference dataset used for extracting predictive features causes the accuracy to be underestimated. Since the overestimating effects are more significant than the underestimating effect, the accuracy of some SSP methods might have been overestimated.
Based on these findings, we propose a rigorous procedure for developing SSP algorithms and making reliable evaluations, hoping to bring substantial improvements to future SSP methods and to benefit all research and application fields that rely on accurate prediction of protein secondary structures.
Affiliation(s)
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Chia-Hua Lo
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Sheng-Hung Juan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- The Center for Bioinformatics Research, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
50
Erdős G, Pajkos M, Dosztányi Z. IUPred3: prediction of protein disorder enhanced with unambiguous experimental annotation and visualization of evolutionary conservation. Nucleic Acids Res 2021; 49:W297-W303. [PMID: 34048569 PMCID: PMC8262696 DOI: 10.1093/nar/gkab408] [Citation(s) in RCA: 254] [Impact Index Per Article: 84.7] [Received: 03/15/2021] [Revised: 04/21/2021] [Accepted: 05/14/2021] [Indexed: 12/22/2022] Open
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) exist without a single well-defined conformation. They carry out important biological functions with multifaceted roles, which is also reflected in their evolutionary behavior. Computational methods play important roles in the characterization of IDRs. One of the commonly used disorder prediction methods is IUPred, which relies on an energy estimation approach. The IUPred web server takes an amino acid sequence or a UniProt ID/accession as input and predicts the tendency of each amino acid to be in a disordered region, with an option to also predict context-dependent disordered regions. In this new iteration of IUPred, we added multiple novel features to enhance the prediction capabilities of the server. First, learning from the latest evaluation of disorder prediction methods, we introduced multiple new smoothing functions that decrease noise and increase the performance of the predictions. We constructed a dataset of experimentally verified ordered/disordered regions with unambiguous annotations, which were added to the prediction. We also introduced a novel tool that enables the exploration of the evolutionary conservation of protein disorder coupled to sequence conservation in model organisms. The web server is freely available to users and accessible at https://iupred3.elte.hu.
Affiliation(s)
- Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary