1. Erdős G, Dosztányi Z. AIUPred: combining energy estimation with deep learning for the enhanced prediction of protein disorder. Nucleic Acids Res 2024; 52:W176-W181. [PMID: 38747347 PMCID: PMC11223784 DOI: 10.1093/nar/gkae385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/19/2024] [Accepted: 05/07/2024] [Indexed: 07/06/2024] [Open Access]
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) carry out important biological functions without relying on a single well-defined conformation. As these proteins are a challenge to study experimentally, computational methods play important roles in their characterization. One of the commonly used tools is the IUPred web server, which provides prediction of disordered regions and their binding sites. IUPred is rooted in a simple biophysical model and uses a limited number of parameters derived largely from globular protein structures. This enabled an incredibly fast and robust prediction method; however, its limitations have also become apparent in light of recent breakthrough methods using deep learning techniques. Here, we present AIUPred, a novel version of IUPred which incorporates deep learning techniques into the energy estimation framework. It achieves improved performance while keeping the robustness of the original method. Based on the evaluation of recent benchmark datasets, AIUPred scored amongst the top three single-sequence-based methods. With a new web server we offer fast and reliable visual analysis for users, as well as options to analyze whole genomes in mere seconds with the downloadable package. AIUPred is available at https://aiupred.elte.hu.
Affiliation(s)
- Gábor Erdős
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary

2. Krokengen OC, Touma C, Mularski A, Sutinen A, Dunkel R, Ytterdal M, Raasakka A, Mertens HDT, Simonsen AC, Kursula P. The cytoplasmic tail of myelin protein zero induces morphological changes in lipid membranes. Biochim Biophys Acta Biomembr 2024; 1866:184368. [PMID: 38971517 DOI: 10.1016/j.bbamem.2024.184368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 06/24/2024] [Accepted: 07/01/2024] [Indexed: 07/08/2024]
Abstract
The major myelin protein expressed by the peripheral nervous system Schwann cells is protein zero (P0), which represents 50% of the total protein content in myelin. This 30-kDa integral membrane protein consists of an immunoglobulin (Ig)-like domain, a transmembrane helix, and a 69-residue C-terminal cytoplasmic tail (P0ct). The basic residues in P0ct contribute to the tight packing of myelin lipid bilayers, and alterations in the tail affect how P0 functions as an adhesion molecule necessary for the stability of compact myelin. Several neurodegenerative neuropathies are related to P0, including the more common Charcot-Marie-Tooth disease (CMT) and Dejerine-Sottas syndrome (DSS) as well as rare cases of motor and sensory polyneuropathy. We found that high P0ct concentrations affected the membrane properties of bicelles and induced a lamellar-to-inverted hexagonal phase transition, which caused bicelles to fuse into long, protein-containing filament-like structures. These structures likely reflect the formation of semicrystalline lipid domains with potential relevance for myelination. Not only is P0ct important for stacking lipid membranes, but time-lapse fluorescence microscopy also shows that it might affect membrane properties during myelination. We further describe recombinant production and low-resolution structural characterization of full-length human P0. Our findings shed light on P0ct effects on membrane properties, and with the successful purification of full-length P0, we have new tools to study the role of P0 in myelin formation and maintenance in vitro.
Affiliation(s)
- Oda C Krokengen
  - Department of Biomedicine, University of Bergen, Bergen, Norway
- Christine Touma
  - Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, Finland
- Anna Mularski
  - Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense, Denmark
- Aleksi Sutinen
  - Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, Finland
- Ryan Dunkel
  - Department of Biomedicine, University of Bergen, Bergen, Norway
- Marie Ytterdal
  - Department of Biomedicine, University of Bergen, Bergen, Norway
- Arne Raasakka
  - Department of Biomedicine, University of Bergen, Bergen, Norway
- Haydyn D T Mertens
  - European Molecular Biology Laboratory EMBL, Hamburg Site, c/o DESY, Hamburg, Germany
- Adam Cohen Simonsen
  - Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense, Denmark
- Petri Kursula
  - Department of Biomedicine, University of Bergen, Bergen, Norway; Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, Finland

3. Hutchins CM, Gorfe AA. From disorder comes function: Regulation of small GTPase function by intrinsically disordered lipidated membrane anchor. Curr Opin Struct Biol 2024; 87:102869. [PMID: 38943706 DOI: 10.1016/j.sbi.2024.102869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/23/2024] [Accepted: 06/04/2024] [Indexed: 07/01/2024]
Abstract
The intrinsically disordered, lipid-modified membrane anchor of small GTPases is emerging as a critical modulator of function through its ability to sort lipids in a conformation-dependent manner. We reviewed recent computational and experimental studies that have begun to shed light on the sequence-ensemble-function relationship in this unique class of lipidated intrinsically disordered regions (LIDRs).
Affiliation(s)
- Chase M Hutchins
  - Department of Integrative Biology and Pharmacology, McGovern Medical School, University of Texas Health Science Center at Houston, 6431 Fannin St., Houston, TX 77030, USA; Biochemistry and Cell Biology Program & Therapeutics and Pharmacology Program, MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, 6431 Fannin St., Houston, TX 77030, USA. https://twitter.com/chasedsims
- Alemayehu A Gorfe
  - Department of Integrative Biology and Pharmacology, McGovern Medical School, University of Texas Health Science Center at Houston, 6431 Fannin St., Houston, TX 77030, USA; Biochemistry and Cell Biology Program & Therapeutics and Pharmacology Program, MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, 6431 Fannin St., Houston, TX 77030, USA

4. Jung J, Yagi K, Tan C, Oshima H, Mori T, Yu I, Matsunaga Y, Kobayashi C, Ito S, Ugarte La Torre D, Sugita Y. GENESIS 2.1: High-Performance Molecular Dynamics Software for Enhanced Sampling and Free-Energy Calculations for Atomistic, Coarse-Grained, and Quantum Mechanics/Molecular Mechanics Models. J Phys Chem B 2024; 128:6028-6048. [PMID: 38876465 PMCID: PMC11215777 DOI: 10.1021/acs.jpcb.4c02096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/15/2024] [Accepted: 05/21/2024] [Indexed: 06/16/2024]
Abstract
GENeralized-Ensemble SImulation System (GENESIS) is a molecular dynamics (MD) software developed to simulate the conformational dynamics of a single biomolecule, as well as molecular interactions in large biomolecular assemblies and between multiple biomolecules in cellular environments. To achieve the latter purpose, the earlier versions of GENESIS emphasized high performance in atomistic MD simulations on massively parallel supercomputers, with or without graphics processing units (GPUs). Here, we implemented multiscale MD simulations that include atomistic, coarse-grained, and hybrid quantum mechanics/molecular mechanics (QM/MM) calculations. They demonstrate high performance and are integrated with enhanced conformational sampling algorithms and free-energy calculations without using external programs except for the QM programs. In this article, we review new functions, molecular models, and other essential features in GENESIS version 2.1 and discuss ongoing developments for future releases.
Affiliation(s)
- Jaewoon Jung
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Kiyoshi Yagi
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Cheng Tan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Hiraku Oshima
  - Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
  - Graduate School of Life Science, University of Hyogo, Harima Science Park City, Hyogo 678-1297, Japan
- Takaharu Mori
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Department of Chemistry, Tokyo University of Science, Shinjuku-ku, Tokyo 162-8601, Japan
- Isseki Yu
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Department of Bioinformatics, Maebashi Institute of Technology, Maebashi, Gunma 371-0816, Japan
- Yasuhiro Matsunaga
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
  - Graduate School of Science and Engineering, Saitama University, Saitama 338-8570, Japan
- Chigusa Kobayashi
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Shingo Ito
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Diego Ugarte La Torre
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Yuji Sugita
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan

5. Davis MC, André AAM, Kjaergaard M. Entering the Next Phase: Predicting Biological Effects of Biomolecular Condensates. J Mol Biol 2024:168645. [PMID: 38848869 DOI: 10.1016/j.jmb.2024.168645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 06/02/2024] [Accepted: 06/03/2024] [Indexed: 06/09/2024]
Abstract
Biomolecular condensates are increasingly recognized as important drivers of cellular function; their dysregulation leads to pathology and disease. We discuss three questions in terms of the impending utility of data-driven techniques to predict condensate-driven biological outcomes, i.e., the impact of cellular state changes on condensates, the effect of condensates on biochemical processes within, and condensate properties that result in cellular dysregulation and disease.
Affiliation(s)
- Maria C Davis
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Alain A M André
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark
- Magnus Kjaergaard
  - Department of Molecular Biology and Genetics, Aarhus University, Aarhus, Denmark; The Danish Research Institute for Translational Neuroscience (DANDRITE), Denmark

6. Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024; 16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2024] [Indexed: 06/06/2024] [Open Access]
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) with comparative approaches to gene duplicates. We found that de novo genes evolve faster than gene duplicates in the intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution align with changes in amino acid composition over time as well. We also revealed higher positive charges but smaller molecular weights for de novo proteins than duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
Affiliation(s)
- Jianhai Chen
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Qingrong Li
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Shengqian Xia
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dong Wang
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA

7. Ginell GM, Emenecker RJ, Lotthammer JM, Usher ET, Holehouse AS. Direct prediction of intermolecular interactions driven by disordered regions. bioRxiv 2024:2024.06.03.597104. [PMID: 38895487 PMCID: PMC11185574 DOI: 10.1101/2024.06.03.597104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Intrinsically disordered regions (IDRs) are critical for a wide variety of cellular functions, many of which involve interactions with partner proteins. Molecular recognition is typically considered through the lens of sequence-specific binding events. However, a growing body of work has shown that IDRs often interact with partners in a manner that does not depend on the precise order of the amino acids and is instead driven by complementary chemical interactions leading to disordered bound-state complexes. Despite this emerging paradigm, we lack tools to describe, quantify, predict, and interpret these types of structurally heterogeneous interactions from the underlying amino acid sequences. Here, we repurpose the chemical physics developed originally for molecular simulations to develop an approach for predicting intermolecular interactions between IDRs and partner proteins. Our approach enables the direct prediction of phase diagrams, the identification of chemically specific interaction hotspots on IDRs, and a route to develop and test mechanistic hypotheses regarding IDR function in the context of molecular recognition. We use our approach to examine a range of systems and questions to highlight its versatility and applicability.
Affiliation(s)
- Garrett M. Ginell
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Ryan J. Emenecker
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Jeffrey M. Lotthammer
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Emery T. Usher
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Alex S. Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO

8. Khan MI, Pathania S, Al-Rabia MW, Ethayathulla AS, Khan MI, Allemailem KS, Azam M, Hariprasad G, Imran MA. MolDy: molecular dynamics simulation made easy. Bioinformatics 2024; 40:btae313. [PMID: 38867698 PMCID: PMC11187490 DOI: 10.1093/bioinformatics/btae313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/26/2024] [Accepted: 06/11/2024] [Indexed: 06/14/2024] [Open Access]
Abstract
MOTIVATION: Molecular dynamics (MD) is a computational experiment that is crucial for understanding the structure of biological macro- and micromolecules, their folding, and their intermolecular interactions. Accurate knowledge of these structural features is the cornerstone of drug development and of elucidating macromolecular function. The open-source GROMACS biomolecular MD simulation program is recognized as a reliable and frequently used simulation program, valued for its precision. However, the user needs expertise and scripting skills to carry out MD simulations.
RESULTS: We have developed an end-to-end interactive MD simulation application, MolDy, for GROMACS. This front-end application provides a customizable user interface integrated with a Python- and Perl-based logical backend connecting the Linux shell and the GROMACS software. The tool performs analysis and provides the user with simulation trajectories and graphical representations of relevant biophysical parameters. MolDy (i) is user-friendly and does not require the researcher to have prior knowledge of Linux; (ii) installs with a single command; (iii) is freely available for academic research; (iv) can run with a minimal operating-system configuration; and (v) has valid prefilled default parameters for beginners while leaving scope for modification by expert users.
AVAILABILITY AND IMPLEMENTATION: MolDy is freely available as compressed source code files, with a user manual for installation and operation, on GitHub: https://github.com/AIBResearchMolDy/Moldyv01.git and on https://aibresearch.com/innovations.
Affiliation(s)
- Mohd Imran Khan
  - Division of Bioinformatics, AIBR Artificial Intelligence and Biochemical Research Pvt. Ltd., New Delhi 110076, India
- Sheetal Pathania
  - Division of Bioinformatics, AIBR Artificial Intelligence and Biochemical Research Pvt. Ltd., New Delhi 110076, India
- Mohammed W Al-Rabia
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, King Abdul Aziz University, Jeddah 21589, Saudi Arabia
  - Department of Clinical and Molecular Microbiology Laboratory, King Abdulaziz University Hospital, Jeddah 21589, Saudi Arabia
- Abdul S Ethayathulla
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Mohammad Imran Khan
  - Research Center, King Faisal Specialist Hospital and Research Center, Jeddah 21589, Saudi Arabia
- Khaled S Allemailem
  - Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
- Mohd Azam
  - Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
- Gururao Hariprasad
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Mohammad Azhar Imran
  - Division of Bioinformatics, AIBR Artificial Intelligence and Biochemical Research Pvt. Ltd., New Delhi 110076, India

9. Hutson M. Software tools identify forgotten genes. Nature 2024:10.1038/d41586-024-01548-w. [PMID: 38789607 DOI: 10.1038/d41586-024-01548-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]

10. Waszkiewicz R, Michaś A, Białobrzewski MK, Klepka BP, Cieplak-Rotowska MK, Staszałek Z, Cichocki B, Lisicki M, Szymczak P, Niedzwiecka A. Hydrodynamic Radii of Intrinsically Disordered Proteins: Fast Prediction by Minimum Dissipation Approximation and Experimental Validation. J Phys Chem Lett 2024; 15:5024-5033. [PMID: 38696815 PMCID: PMC11103702 DOI: 10.1021/acs.jpclett.4c00312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/12/2024] [Accepted: 04/26/2024] [Indexed: 05/04/2024]
Abstract
The diffusion coefficients of globular and fully unfolded proteins can be predicted with high accuracy solely from their mass or chain length. However, this approach fails for intrinsically disordered proteins (IDPs) containing structural domains. We propose a rapid predictive methodology for estimating the diffusion coefficients of IDPs. The methodology uses accelerated conformational sampling based on self-avoiding random walks and includes hydrodynamic interactions between coarse-grained protein subunits, modeled using the generalized Rotne-Prager-Yamakawa approximation. To estimate the hydrodynamic radius, we rely on the minimum dissipation approximation recently introduced by Cichocki et al. Using a large set of experimentally measured hydrodynamic radii of IDPs over a wide range of chain lengths and domain contributions, we demonstrate that our predictions are more accurate than the Kirkwood approximation and phenomenological approaches. Our technique may prove to be valuable in predicting the hydrodynamic properties of both fully unstructured and multidomain disordered proteins.
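The link between a diffusion coefficient and a hydrodynamic radius that this abstract relies on can be illustrated with the textbook Stokes-Einstein relation, D = k_B T / (6 π η R_h). The sketch below is not the minimum dissipation approximation described by the authors; it only shows the conversion between the two quantities. The function names and the default temperature and viscosity (water at about 25 °C) are assumptions chosen for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def diffusion_coefficient(radius_m, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: translational diffusion coefficient (m^2/s) of a sphere.

    The defaults are illustrative assumptions (water near 25 degrees C).
    """
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def hydrodynamic_radius(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Inverse relation: hydrodynamic radius (m) from a measured D (m^2/s)."""
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


if __name__ == "__main__":
    # Hypothetical 3 nm hydrodynamic radius, converted to D and back.
    d = diffusion_coefficient(3.0e-9)
    print(f"D   = {d:.3e} m^2/s")
    print(f"R_h = {hydrodynamic_radius(d) * 1e9:.2f} nm")
```

In practice the direction of use is usually from a measured D (for example from fluorescence correlation spectroscopy) to R_h, which is then compared with a predicted value.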
Affiliation(s)
- Radost Waszkiewicz
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, L. Pasteura 5, 02-093 Warsaw, Poland
- Agnieszka Michaś
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland
- Michał K. Białobrzewski
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland
- Barbara P. Klepka
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland
- Zuzanna Staszałek
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland
- Bogdan Cichocki
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, L. Pasteura 5, 02-093 Warsaw, Poland
- Maciej Lisicki
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, L. Pasteura 5, 02-093 Warsaw, Poland
- Piotr Szymczak
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, L. Pasteura 5, 02-093 Warsaw, Poland
- Anna Niedzwiecka
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotnikow 32/46, PL-02668 Warsaw, Poland

11. Manfredi M, Savojardo C, Iardukhin G, Salomoni D, Costantini A, Martelli PL, Casadio R. Alpha&ESMhFolds: A Web Server for Comparing AlphaFold2 and ESMFold Models of the Human Reference Proteome. J Mol Biol 2024:168593. [PMID: 38718922 DOI: 10.1016/j.jmb.2024.168593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/22/2024] [Accepted: 04/30/2024] [Indexed: 05/16/2024]
Abstract
We develop a novel database, Alpha&ESMhFolds, which allows the direct comparison of AlphaFold2 and ESMFold predicted models for 42,942 proteins of the Reference Human Proteome and, when available, their comparison with 2,900 directly associated PDB structures with a structure-to-sequence coverage of at least 70%. Statistics indicate that good-quality models tend to overlap with a TM-score >0.6 as long as some PDB structural information is available. As expected, a direct model superimposition to the PDB structure highlights that AlphaFold2 models are slightly superior to ESMFold ones. However, some 55% of the database is endowed with models overlapping with TM-score <0.6. This highlights the different outputs of the two methods. The database is freely available for usage at https://alpha-esmhfolds.biocomp.unibo.it/.
Affiliation(s)
- Matteo Manfredi
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Castrense Savojardo
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Georgii Iardukhin
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy

12. Song FV, Su J, Huang S, Zhang N, Li K, Ni M, Liao M. DeepSS2GO: protein function prediction from secondary structure. Brief Bioinform 2024; 25:bbae196. [PMID: 38701416 PMCID: PMC11066904 DOI: 10.1093/bib/bbae196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 03/31/2024] [Accepted: 04/10/2024] [Indexed: 05/05/2024] [Open Access]
Abstract
Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model, DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.
Affiliation(s)
- Fu V Song
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Jiaqi Su
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Sixing Huang
  - Gemini Data Japan, Kitaku Oujikamiya 1-11-11, 115-0043, Tokyo, Japan
- Neng Zhang
  - Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, E1 4NS, London, UK
- Kaiyue Li
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
- Ming Ni
  - MGI Tech, Beishan Industrial Zone, 518083, Shenzhen, China
- Maofu Liao
  - Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China
  - Institute for Biological Electron Microscopy, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China

13. Lotthammer JM, Ginell GM, Griffith D, Emenecker RJ, Holehouse AS. Direct prediction of intrinsically disordered protein conformational properties from sequence. Nat Methods 2024; 21:465-476. [PMID: 38297184 PMCID: PMC10927563 DOI: 10.1038/s41592-023-02159-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 05/28/2023] [Accepted: 12/20/2023] [Indexed: 02/02/2024]
Abstract
Intrinsically disordered regions (IDRs) are ubiquitous across all domains of life and play a range of functional roles. While folded domains are generally well described by a stable three-dimensional structure, IDRs exist in a collection of interconverting states known as an ensemble. This structural heterogeneity means that IDRs are largely absent from the Protein Data Bank, contributing to a lack of computational approaches to predict ensemble conformational properties from sequence. Here we combine rational sequence design, large-scale molecular simulations and deep learning to develop ALBATROSS, a deep-learning model for predicting ensemble dimensions of IDRs, including the radius of gyration, end-to-end distance, polymer-scaling exponent and ensemble asphericity, directly from sequences at a proteome-wide scale. ALBATROSS is lightweight, easy to use and accessible as both a locally installable software package and a point-and-click-style interface via Google Colab notebooks. We first demonstrate the applicability of our predictors by examining the generalizability of sequence-ensemble relationships in IDRs. Then, we leverage the high-throughput nature of ALBATROSS to characterize the sequence-specific biophysical behavior of IDRs within and between proteomes.
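Two of the ensemble dimensions named in this abstract, the radius of gyration and the end-to-end distance, have simple geometric definitions. The sketch below computes them from explicit coordinates and averages them over a toy ensemble. It is not ALBATROSS and does not use its API; the function names, the equal-mass bead assumption, and the toy coordinates are all illustrative assumptions.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of beads from their centroid (equal masses assumed)."""
    n = len(coords)
    centroid = [sum(p[axis] for p in coords) / n for axis in range(3)]
    mean_sq = sum(math.dist(p, centroid) ** 2 for p in coords) / n
    return math.sqrt(mean_sq)


def end_to_end_distance(coords):
    """Distance between the first and last bead of the chain."""
    return math.dist(coords[0], coords[-1])


def ensemble_average(conformations, observable):
    """Plain arithmetic mean of an observable over all conformations.

    Some studies report sqrt(<Rg^2>) instead; this simple mean is an assumption.
    """
    values = [observable(c) for c in conformations]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy three-bead "ensemble": one extended and one bent conformation.
    straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    ensemble = [straight, bent]
    print("mean Rg:", ensemble_average(ensemble, radius_of_gyration))
    print("mean end-to-end:", ensemble_average(ensemble, end_to_end_distance))
```

A sequence-based predictor replaces the expensive step of generating the conformations; the observables themselves are defined as above.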
Affiliation(s)
- Jeffrey M Lotthammer
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Garrett M Ginell
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Daniel Griffith
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Ryan J Emenecker
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Alex S Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA

14. An easy-to-use computational tool for predicting 3D properties of disordered proteins. Nat Methods 2024; 21:385-386. [PMID: 38297185 DOI: 10.1038/s41592-023-02160-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2024]