1
da Rocha W, Liberti L, Mucherino A, Malliavin TE. Influence of Stereochemistry in a Local Approach for Calculating Protein Conformations. J Chem Inf Model 2024; 64:8999-9008. [PMID: 39560315 DOI: 10.1021/acs.jcim.4c01232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2024]
Abstract
Protein structure prediction is generally based on the use of local conformational information coupled with long-range distance restraints. Such restraints can be derived from the knowledge of a template structure or the analysis of protein sequence alignment in the framework of models arising from the physics of disordered systems. The accuracy of approaches based on sequence alignment, however, is limited when the number of aligned sequences is small. Here, we derive protein conformations using only knowledge of local conformations by means of the interval Branch-and-Prune algorithm. The computational efficiency is directly related to the knowledge of stereochemistry (bond angle and ω values) along the protein sequence and, in particular, to the variations of the torsion angle ω. The impact of stereochemistry variations is particularly strong in the case of protein topologies defined from numerous long-range restraints, as in the case of proteins with β secondary structures. The systematic enumeration of the conformations improves the efficiency of the calculations. The analysis of DNA codons makes it possible to connect the variations of torsion angle ω to the positions of rare DNA codons.
Affiliation(s)
- Wagner da Rocha
  - LIX CNRS, École Polytechnique, Institut Polytechnique de Paris, Palaiseau 91128, France
- Leo Liberti
  - LIX CNRS, École Polytechnique, Institut Polytechnique de Paris, Palaiseau 91128, France
- Thérèse E Malliavin
  - LPCT, UMR 7019 Université de Lorraine CNRS, Vandoeuvre-lès-Nancy 54500, France
2
Disela R, Neijenhuis T, Le Bussy O, Geldhof G, Klijn M, Pabst M, Ottens M. Experimental characterization and prediction of Escherichia coli host cell proteome retention during preparative chromatography. Biotechnol Bioeng 2024; 121:3848-3859. [PMID: 39267334 DOI: 10.1002/bit.28840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Revised: 08/29/2024] [Accepted: 08/31/2024] [Indexed: 09/17/2024]
Abstract
Purification of recombinantly produced biopharmaceuticals involves removal of host cell material, such as host cell proteins (HCPs). For lysates of the common expression host Escherichia coli (E. coli), over 1500 unique proteins can be identified. Currently, understanding of the behavior of individual HCPs in purification operations, such as preparative chromatography, is limited. Therefore, we aim to elucidate the elution behavior of individual HCPs from E. coli strain BLR(DE3) during chromatography. Understanding this complex mixture and knowing the chromatographic behavior of each individual HCP improves the capacity for rational purification process design. Specifically, linear gradient experiments were performed using ion exchange (IEX) and hydrophobic interaction chromatography, coupled with mass spectrometry-based proteomics to map the retention of individual HCPs. We combined knowledge of protein location, function, and interaction available in literature to identify trends in elution behavior. Additionally, quantitative structure-property relationship models were trained relating the protein 3D structure to elution behavior during IEX. For the complete data set, a model with a cross-validated R² of 0.55 was constructed, which could be improved to an R² of 0.70 by considering only monomeric proteins. Ultimately, this study is a significant step toward greater process understanding.
Affiliation(s)
- Roxana Disela
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Tim Neijenhuis
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Marieke Klijn
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Martin Pabst
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Marcel Ottens
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
3
Mitra R, Cohen AS, Sagendorf JM, Berman HM, Rohs R. DNAproDB: an updated database for the automated and interactive analysis of protein-DNA complexes. Nucleic Acids Res 2024:gkae970. [PMID: 39494533 DOI: 10.1093/nar/gkae970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2024] [Revised: 10/07/2024] [Accepted: 10/11/2024] [Indexed: 11/05/2024] Open
Abstract
DNAproDB (https://dnaprodb.usc.edu/) is a database, visualization tool, and processing pipeline for analyzing structural features of protein-DNA interactions. Here, we present a substantially updated version of the database through additional structural annotations, search, and user interface functionalities. The update expands the number of pre-analyzed protein-DNA structures, which are automatically updated weekly. The analysis pipeline identifies water-mediated hydrogen bonds that are incorporated into the visualizations of protein-DNA complexes. Tertiary structure-aware nucleotide layouts are now available. New file formats and external database annotations are supported. The website has been redesigned, and interacting with graphs and data is more intuitive. We also present a statistical analysis on the updated collection of structures revealing salient patterns in protein-DNA interactions.
Affiliation(s)
- Raktim Mitra
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Ari S Cohen
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Jared M Sagendorf
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Helen M Berman
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Remo Rohs
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
  - Department of Physics & Astronomy, University of Southern California, Los Angeles, CA 90089, USA
  - Thomas Lord Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
4
Raouraoua N, Mirabello C, Véry T, Blanchet C, Wallner B, Lensink MF, Brysbaert G. MassiveFold: unveiling AlphaFold's hidden potential with optimized and parallelized massive sampling. Nature Computational Science 2024; 4:824-828. [PMID: 39528570 PMCID: PMC11578886 DOI: 10.1038/s43588-024-00714-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Accepted: 10/03/2024] [Indexed: 11/16/2024]
Abstract
Massive sampling in AlphaFold enables access to increased structural diversity. In combination with its efficient confidence ranking, this unlocks elevated modeling capabilities for monomeric structures and foremost for protein assemblies. However, the approach struggles with GPU cost and data storage. Here we introduce MassiveFold, an optimized and customizable version of AlphaFold that runs predictions in parallel, reducing the computing time from several months to hours. MassiveFold is scalable and able to run on anything from a single computer to a large GPU infrastructure, where it can fully benefit from all the computing nodes.
Affiliation(s)
- Nessim Raouraoua
  - Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
- Claudio Mirabello
  - Science for Life Laboratory, Department of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Linköping University, Linköping, Sweden
- Thibaut Véry
  - Institut du Développement et des Ressources en Informatique Scientifique (IDRIS), CNRS, Université Paris-Saclay, Orsay, France
- Christophe Blanchet
  - IFB-core, Institut Français de Bioinformatique (IFB), CNRS, INSERM, INRAE, CEA, Evry, France
- Björn Wallner
  - Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
- Marc F Lensink
  - Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
- Guillaume Brysbaert
  - Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
5
Zeng C, Zhuo C, Gao J, Liu H, Zhao Y. Advances and Challenges in Scoring Functions for RNA-Protein Complex Structure Prediction. Biomolecules 2024; 14:1245. [PMID: 39456178 PMCID: PMC11506084 DOI: 10.3390/biom14101245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 09/24/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024] Open
Abstract
RNA-protein complexes play a crucial role in cellular functions, providing insights into cellular mechanisms and potential therapeutic targets. However, experimental determination of these complex structures is often time-consuming and resource-intensive, and it rarely yields high-resolution data. Many computational approaches have been developed to predict RNA-protein complex structures in recent years. Despite these advances, achieving accurate and high-resolution predictions remains a formidable challenge, primarily due to the limitations inherent in current RNA-protein scoring functions. These scoring functions are critical tools for evaluating and interpreting RNA-protein interactions. This review comprehensively explores the latest advancements in scoring functions for RNA-protein docking, delving into the fundamental principles underlying various approaches, including coarse-grained knowledge-based, all-atom knowledge-based, and machine-learning-based methods. We critically evaluate the strengths and limitations of existing scoring functions, providing a detailed performance assessment. Considering the significant progress demonstrated by machine learning techniques, we discuss emerging trends and propose future research directions to enhance the accuracy and efficiency of scoring functions in RNA-protein complex prediction. We aim to inspire the development of more sophisticated and reliable computational tools in this rapidly evolving field.
Affiliation(s)
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China; (C.Z.); (C.Z.); (J.G.); (H.L.)
6
Bergquist T, Stenton SL, Nadeau EA, Byrne AB, Greenblatt MS, Harrison SM, Tavtigian SV, O'Donnell-Luria A, Biesecker LG, Radivojac P, Brenner SE, Pejaver V. Calibration of additional computational tools expands ClinGen recommendation options for variant classification with PP3/BP4 criteria. bioRxiv 2024:2024.09.17.611902. [PMID: 39345488 PMCID: PMC11429929 DOI: 10.1101/2024.09.17.611902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Purpose: We previously developed an approach to calibrate computational tools for clinical variant classification, updating recommendations for the reliable use of variant impact predictors to provide evidence strength up to Strong. A new generation of tools using distinctive approaches has since been released, and these methods must be independently calibrated for clinical application. Method: Using our local posterior probability-based calibration and our established data set of ClinVar pathogenic and benign variants, we determined the strength of evidence provided by three new tools (AlphaMissense, ESM1b, VARITY) and calibrated scores meeting each evidence strength. Results: All three tools reached the Strong level of evidence for variant pathogenicity and Moderate for benignity, though sometimes for few variants. Compared to previously recommended tools, these yielded at best only modest improvements in the tradeoffs of evidence strength and false positive predictions. Conclusion: At calibrated thresholds, three new computational predictors provided evidence for variant pathogenicity at similar strength to the four previously recommended predictors (and comparable with functional assays for some variants). This calibration broadens the scope of computational tools for application in clinical variant classification. Their new approaches offer promise for future advancement of the field.
Affiliation(s)
- Timothy Bergquist
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Sarah L. Stenton
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Emily A.W. Nadeau
  - Department of Medicine and University of Vermont Cancer Center, University of Vermont, Larner College of Medicine, Burlington, VT 05405, USA
- Alicia B. Byrne
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Marc S. Greenblatt
  - Department of Medicine and University of Vermont Cancer Center, University of Vermont, Larner College of Medicine, Burlington, VT 05405, USA
- Steven M. Harrison
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Ambry Genetics, Aliso Viejo, CA 92656, USA
- Sean V. Tavtigian
  - Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
- Anne O'Donnell-Luria
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Leslie G. Biesecker
  - Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
- Steven E. Brenner
  - Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, CA 94720, USA
- Vikas Pejaver
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
7
Szikszai M, Magnus M, Sanghi S, Kadyan S, Bouatta N, Rivas E. RNA3DB: A structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction. J Mol Biol 2024; 436:168552. [PMID: 38552946 PMCID: PMC11377173 DOI: 10.1016/j.jmb.2024.168552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 03/19/2024] [Accepted: 03/22/2024] [Indexed: 04/09/2024]
Abstract
With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed by the existing literature, much of which reports inflated performance due to using training and testing sets with significant structural overlap. Further, the most recent Critical Assessment of Structure Prediction (CASP15) has shown that deep learning models for RNA structure are currently outperformed by traditional methods. In this paper we present RNA3DB, a dataset of structured RNAs, derived from the Protein Data Bank (PDB), that is designed for training and benchmarking deep learning models. The RNA3DB method arranges the RNA 3D chains into distinct groups (Components) that are non-redundant with regard to both sequence and structure, providing a robust way of dividing training, validation, and testing sets. Any split of these structurally-dissimilar Components is guaranteed to produce test and validation sets that are distinct by sequence and structure from those in the training set. We provide the RNA3DB dataset, a particular train/test split of the RNA3DB Components (in an approximate 70/30 ratio) that will be updated periodically. We also provide the RNA3DB methodology along with the source code, with the goal of creating a reproducible and customizable tool for producing structurally-dissimilar dataset splits for structural RNAs.
Affiliation(s)
- Marcell Szikszai
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
- Marcin Magnus
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
- Siddhant Sanghi
  - Department of Systems Biology, Columbia University, New York 10027, NY, USA; College of Biological Sciences, UC Davis, Davis 95616, CA, USA
- Sachin Kadyan
  - Department of Systems Biology, Columbia University, New York 10027, NY, USA
- Nazim Bouatta
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston 02115, MA, USA
- Elena Rivas
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
8
Collins KW, Copeland MM, Brysbaert G, Wodak SJ, Bonvin AMJJ, Kundrotas PJ, Vakser IA, Lensink MF. CAPRI-Q: The CAPRI resource evaluating the quality of predicted structures of protein complexes. J Mol Biol 2024; 436:168540. [PMID: 39237205 PMCID: PMC11458157 DOI: 10.1016/j.jmb.2024.168540] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/01/2024] [Revised: 03/08/2024] [Accepted: 03/11/2024] [Indexed: 09/07/2024]
Abstract
Protein interactions are essential for cellular processes. In recent years there has been significant progress in computational prediction of 3D structures of individual protein chains, with the best-performing algorithms reaching sub-Ångström accuracy. These techniques are now finding their way into the prediction of protein interactions, adding to the existing modeling approaches. The community-wide Critical Assessment of Predicted Interactions (CAPRI) has been a catalyst for the development of procedures for the structural modeling of protein assemblies by organizing blind prediction experiments. The predicted structures are assessed against unpublished experimentally determined structures using a set of metrics with proven robustness that have been established in the CAPRI community. In addition, several advanced benchmarking databases provide targets against which users can test docking and assembly modeling software. These include the Protein-Protein Docking Benchmark, the CAPRI Scoreset, and the Dockground database, all developed by members of the CAPRI community. Here we present CAPRI-Q, a stand-alone model quality assessment tool, which can be freely downloaded or used via a publicly available web server. This tool applies the CAPRI metrics to assess the quality of query structures against given target structures, along with other popular quality metrics such as DockQ, TM-score and l-DDT, and classifies the models according to the CAPRI model quality criteria. The tool can handle a variety of protein complex types including those involving peptides, nucleic acids, and oligosaccharides. The source code is freely available from https://gitlab.in2p3.fr/cmsb-public/CAPRI-Q and its web interface through the Dockground resource at https://dockground.compbio.ku.edu/assessment/.
Affiliation(s)
- Keeley W Collins
  - Computational Biology Program, The University of Kansas, Lawrence, KS 66045, USA
- Matthew M Copeland
  - Computational Biology Program, The University of Kansas, Lawrence, KS 66045, USA
- Guillaume Brysbaert
  - Univ. Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, F-59000 Lille, France
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, The Netherlands
- Petras J Kundrotas
  - Computational Biology Program, The University of Kansas, Lawrence, KS 66045, USA
- Ilya A Vakser
  - Computational Biology Program, The University of Kansas, Lawrence, KS 66045, USA; Department of Molecular Biology, The University of Kansas, Lawrence, KS 66045, USA
- Marc F Lensink
  - Univ. Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, F-59000 Lille, France
9
Manfredi M, Savojardo C, Iardukhin G, Salomoni D, Costantini A, Martelli PL, Casadio R. Alpha&ESMhFolds: A Web Server for Comparing AlphaFold2 and ESMFold Models of the Human Reference Proteome. J Mol Biol 2024; 436:168593. [PMID: 38718922 DOI: 10.1016/j.jmb.2024.168593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/22/2024] [Accepted: 04/30/2024] [Indexed: 05/16/2024]
Abstract
We develop a novel database, Alpha&ESMhFolds, which allows the direct comparison of AlphaFold2 and ESMFold predicted models for 42,942 proteins of the Reference Human Proteome and, when available, their comparison with 2,900 directly associated PDB structures with a structure-to-sequence coverage of at least 70%. Statistics indicate that good-quality models tend to overlap with a TM-score >0.6 as long as some PDB structural information is available. As expected, a direct model superimposition to the PDB structure highlights that AlphaFold2 models are slightly superior to ESMFold ones. However, some 55% of the database is endowed with models overlapping with TM-score <0.6. This highlights the different outputs of the two methods. The database is freely available for usage at https://alpha-esmhfolds.biocomp.unibo.it/.
Affiliation(s)
- Matteo Manfredi
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Castrense Savojardo
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Georgii Iardukhin
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
10
Zhang J, Qian J. Advances in Computational Intelligence-Based Methods of Structure and Function Prediction of Proteins. Biomolecules 2024; 14:1083. [PMID: 39334850 PMCID: PMC11430421 DOI: 10.3390/biom14091083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Accepted: 08/26/2024] [Indexed: 09/30/2024] Open
Abstract
Proteins serve as the building blocks of life and play essential roles in almost every cellular process [...].
Affiliation(s)
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
11
Ali MA, Caetano-Anollés G. AlphaFold2 Reveals Structural Patterns of Seasonal Haplotype Diversification in SARS-CoV-2 Nucleocapsid Protein Variants. Viruses 2024; 16:1358. [PMID: 39339835 PMCID: PMC11435742 DOI: 10.3390/v16091358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 08/10/2024] [Accepted: 08/21/2024] [Indexed: 09/30/2024] Open
Abstract
The COVID-19 pandemic saw the emergence of various Variants of Concern (VOCs) that took the world by storm, often replacing the ones that preceded them. The characteristic mutant constellations of these VOCs increased viral transmissibility and infectivity. Their origin and evolution remain puzzling. With the help of data mining efforts and the GISAID database, a chronology of 22 haplotypes described viral evolution up until 23 July 2023. Since the three-dimensional atomic structures of proteins corresponding to the identified haplotypes are not available, ab initio methods were used here. Regions of intrinsic disorder proved to be important for viral evolution, as evidenced by the targeted change to the nucleocapsid (N) protein at the sequence, structure, and biochemical levels. The linker region of the N-protein, which binds to the RNA genome and self-oligomerizes for efficient genome packaging, was greatly impacted by mutations throughout the pandemic, followed by changes in structure and intrinsic disorder. Remarkably, VOC constellations acted co-operatively to balance the more extreme effects of individual haplotypes. Our strategy of mapping the dynamic evolutionary landscape of genetically linked mutations to the N-protein structure demonstrates the utility of ab initio modeling and deep learning tools for therapeutic intervention.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
12
Nieto-Fabregat F, Lenza MP, Marseglia A, Di Carluccio C, Molinaro A, Silipo A, Marchetti R. Computational toolbox for the analysis of protein-glycan interactions. Beilstein J Org Chem 2024; 20:2084-2107. [PMID: 39189002 PMCID: PMC11346309 DOI: 10.3762/bjoc.20.180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 08/01/2024] [Indexed: 08/28/2024] Open
Abstract
Protein-glycan interactions play pivotal roles in numerous biological processes, ranging from cellular recognition to immune response modulation. Understanding the intricate details of these interactions is crucial for deciphering the molecular mechanisms underlying various physiological and pathological conditions. Computational techniques have emerged as powerful tools that can help in drawing, building and visualising complex biomolecules and provide insights into their dynamic behaviour at atomic and molecular levels. This review provides an overview of the main computational tools useful for studying biomolecular systems, particularly glycans, both in the free state and in complex with proteins, also with reference to the principles, methodologies, and applications of all-atom molecular dynamics simulations. Herein, we focus on the programs that are generally employed for preparing protein and glycan input files to execute molecular dynamics simulations and analyse the corresponding results. The presented computational toolbox represents a valuable resource for researchers studying protein-glycan interactions and incorporates advanced computational methods for building, visualising and predicting protein/glycan structures, modelling protein-ligand complexes, and analysing MD outcomes. Moreover, selected case studies are reported to highlight the importance of computational tools in studying protein-glycan systems, revealing the capability of these tools to provide valuable insights into the binding kinetics, energetics, and structural determinants that govern specific molecular interactions.
Affiliation(s)
- Ferran Nieto-Fabregat
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Maria Pia Lenza
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Angela Marseglia
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Cristina Di Carluccio
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Antonio Molinaro
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Alba Silipo
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
- Roberta Marchetti
  - Department of Chemical Sciences, University of Naples Federico II, Via Cinthia 4, 80126, Italy
13
Dapkūnas J, Timinskas A, Olechnovič K, Tomkuvienė M, Venclovas Č. PPI3D: a web server for searching, analyzing and modeling protein-protein, protein-peptide and protein-nucleic acid interactions. Nucleic Acids Res 2024; 52:W264-W271. [PMID: 38619046 PMCID: PMC11223826 DOI: 10.1093/nar/gkae278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 03/19/2024] [Accepted: 04/03/2024] [Indexed: 04/16/2024] Open
Abstract
Structure-resolved protein interactions with other proteins, peptides and nucleic acids are key for understanding molecular mechanisms. The PPI3D web server enables researchers to query preprocessed and clustered structural data, analyze the results and make homology-based inferences for protein interactions. PPI3D offers three interaction exploration modes: (i) all interactions for proteins homologous to the query, (ii) interactions between two proteins or their homologs and (iii) interactions within a specific PDB entry. The server allows interactive analysis of the identified interactions in both summarized and detailed form. This includes protein annotations, structures, the interface residues and the corresponding contact surface areas. In addition, users can make inferences about residues at the interaction interface for the query protein(s) from the sequence alignments and homology models. The weekly updated PPI3D database includes all the interaction interfaces and binding sites from PDB, clustered based on both protein sequence and structural similarity, yielding non-redundant datasets without loss of alternative interaction modes. Consequently, PPI3D users avoid being flooded with redundant information, a typical situation for intensely studied proteins. Furthermore, PPI3D allows users to download user-defined sets of interaction interfaces and analyze them locally. The PPI3D web server is available at https://bioinformatics.lt/ppi3d.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
| | - Albertas Timinskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, 38000 Grenoble, France
| | - Miglė Tomkuvienė
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
| |
|
14
|
Bryant P, Noé F. Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile. PLoS Comput Biol 2024; 20:e1012253. [PMID: 39052676 DOI: 10.1371/journal.pcbi.1012253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 08/06/2024] [Accepted: 06/14/2024] [Indexed: 07/27/2024] Open
Abstract
Structure prediction of protein complexes has improved significantly with AlphaFold2 and AlphaFold-multimer (AFM), but only 60% of dimers are accurately predicted. Here, we learn a bias to the MSA representation that improves the predictions by performing gradient descent through the AFM network. We demonstrate the performance on seven difficult targets from CASP15 and increase the average MMscore to 0.76 compared to 0.63 with AFM. We evaluate the procedure on 487 protein complexes where AFM fails and obtain an increased success rate (MMscore>0.75) of 33% on these difficult targets. Our protocol, AFProfile, provides a way to direct predictions towards a defined target function guided by the MSA. We expect gradient descent over the MSA to be useful for different tasks.
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, Solna, Sweden
| | - Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- Microsoft Research AI4Science, Berlin, Germany
| |
|
15
|
Üresin D, Schulte J, Morgner N, Soppa J. C(P)XCG Proteins of Haloferax volcanii with Predicted Zinc Finger Domains: The Majority Bind Zinc, but Several Do Not. Int J Mol Sci 2024; 25:7166. [PMID: 39000272 PMCID: PMC11241148 DOI: 10.3390/ijms25137166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 07/16/2024] Open
Abstract
In recent years, interest in very small proteins (µ-proteins) has increased significantly, and they were found to fulfill important functions in all prokaryotic and eukaryotic species. The halophilic archaeon Haloferax volcanii encodes about 400 µ-proteins of less than 70 amino acids, 49 of which contain at least two C(P)XCG motifs and are, thus, predicted zinc finger proteins. The determination of the NMR solution structure of HVO_2753 revealed that only one of two predicted zinc fingers actually bound zinc, while a second one was metal-free. Therefore, the aim of the current study was the homologous production of additional C(P)XCG proteins and the quantification of their zinc content. Attempts to produce 31 proteins failed, underscoring the particular difficulties of working with µ-proteins. In total, 14 proteins could be produced and purified, and the zinc content was determined. Only nine proteins complexed zinc, while five proteins were zinc-free. Three of the latter could be analyzed using ESI-MS and were found to contain another metal, most likely cobalt or nickel. Therefore, at least in haloarchaea, the variability of predicted C(P)XCG zinc finger motifs is higher than anticipated, and they can be metal-free, bind zinc, or bind another metal. Notably, AlphaFold2 cannot correctly predict whether or not the four cysteines have the tetrahedral configuration that is a prerequisite for metal binding.
Affiliation(s)
- Deniz Üresin
- Institute for Molecular Biosciences, Goethe University, 60438 Frankfurt, Germany
| | - Jonathan Schulte
- Institute of Physical and Theoretical Chemistry, Goethe University, 60438 Frankfurt, Germany
| | - Nina Morgner
- Institute of Physical and Theoretical Chemistry, Goethe University, 60438 Frankfurt, Germany
| | - Jörg Soppa
- Institute for Molecular Biosciences, Goethe University, 60438 Frankfurt, Germany
| |
|
16
|
Dahlström KM, Salminen TA. Apprehensions and emerging solutions in ML-based protein structure prediction. Curr Opin Struct Biol 2024; 86:102819. [PMID: 38631107 DOI: 10.1016/j.sbi.2024.102819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/05/2024] [Accepted: 03/31/2024] [Indexed: 04/19/2024]
Abstract
The three-dimensional structure of proteins determines their function in vital biological processes. Thus, when the structure is known, the molecular mechanism of protein function can be understood in more detail, and the information obtained can be utilized in biotechnological, diagnostic, and therapeutic applications. Over the past five years, machine learning (ML)-based modeling has pushed protein structure prediction to the next level, with AlphaFold in the front line, predicting the structure for hundreds of millions of proteins. Recent further advances report promising ML-based approaches for solving remaining challenges by incorporating functionally important metals, co-factors, post-translational modifications, structural dynamics, and interdomain and multimer interactions in the structure prediction process.
Affiliation(s)
- Käthe M Dahlström
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland
| | - Tiina A Salminen
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland.
| |
|
17
|
Fazekas Z, K Menyhárd D, Perczel A. LoCoHD: a metric for comparing local environments of proteins. Nat Commun 2024; 15:4029. [PMID: 38740745 DOI: 10.1038/s41467-024-48225-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 04/22/2024] [Indexed: 05/16/2024] Open
Abstract
Protein folds and the local environments they create can be compared using a variety of differently designed measures, such as the root mean squared deviation, the global distance test, the template modeling score or the local distance difference test. Although these measures have proven to be useful for a variety of tasks, each fails to fully incorporate the valuable chemical information inherent to atoms and residues, and considers these only partially and indirectly. Here, we develop the highly flexible local composition Hellinger distance (LoCoHD) metric, which is based on the chemical composition of local residue environments. Using LoCoHD, we analyze the chemical heterogeneity of amino acid environments and identify valines as having the most conserved, and arginines the most variable, chemical environments. We use LoCoHD to investigate structural ensembles, to evaluate critical assessment of structure prediction (CASP) competitors, to compare the results with the local distance difference test (lDDT) scoring system, and to evaluate a molecular dynamics simulation. We show that LoCoHD measurements provide unique information about protein structures that is distinct from, for example, that derived using the alignment-based RMSD metric, or the similarly distance matrix-based but alignment-free lDDT metric.
Affiliation(s)
- Zsolt Fazekas
- Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- ELTE Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Dóra K Menyhárd
- Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary
- HUN-REN-ELTE Protein Modeling Research Group, ELTE Eötvös Loránd University, Budapest, Hungary
| | - András Perczel
- Laboratory of Structural Chemistry and Biology, Institute of Chemistry, ELTE Eötvös Loránd University, Budapest, Hungary.
- HUN-REN-ELTE Protein Modeling Research Group, ELTE Eötvös Loránd University, Budapest, Hungary.
| |
|
18
|
Burley SK, Piehl DW, Vallat B, Zardecki C. RCSB Protein Data Bank: supporting research and education worldwide through explorations of experimentally determined and computationally predicted atomic level 3D biostructures. IUCrJ 2024; 11:279-286. [PMID: 38597878 PMCID: PMC11067742 DOI: 10.1107/s2052252524002604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine in 1971 with seven X-ray crystal structures of proteins. Today, the PDB houses >210 000 experimentally determined, atomic level, 3D structures of proteins and nucleic acids as well as their complexes with one another and small molecules (e.g. approved drugs, enzyme cofactors). These data provide insights into fundamental biology, biomedicine, bioenergy and biotechnology. They proved particularly important for understanding the SARS-CoV-2 global pandemic. The US-funded Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and other members of the Worldwide Protein Data Bank (wwPDB) partnership jointly manage the PDB archive and support >60 000 'data depositors' (structural biologists) around the world. wwPDB ensures the quality and integrity of the data in the ever-expanding PDB archive and supports global open access without limitations on data usage. The RCSB PDB research-focused web portal at https://www.rcsb.org/ (RCSB.org) supports millions of users worldwide, representing a broad range of expertise and interests. In addition to retrieving 3D structure data, PDB 'data consumers' access comparative data and external annotations, such as information about disease-causing point mutations and genetic variations. RCSB.org also provides access to >1 000 000 computed structure models (CSMs) generated using artificial intelligence/machine-learning methods. To avoid doubt, the provenance and reliability of experimentally determined PDB structures and CSMs are identified. Related training materials are available to support users in their RCSB.org explorations.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Biology Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
19
|
Vallat B, Berman HM. Structural highlights of macromolecular complexes and assemblies. Curr Opin Struct Biol 2024; 85:102773. [PMID: 38271778 DOI: 10.1016/j.sbi.2023.102773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/22/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024]
Abstract
The structures of macromolecular assemblies have given us deep insights into cellular processes and have profoundly impacted biological research and drug discovery. We highlight the structures of macromolecular assemblies that have been modeled using integrative and computational methods and describe how open access to these structures from structural archives has empowered the research community. The arsenal of experimental and computational methods for structure determination ensures a future where whole organelles and cells can be modeled.
Affiliation(s)
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles CA 90089, USA
| |
|
20
|
Caparotta M, Perez A. Advancing Molecular Dynamics: Toward Standardization, Integration, and Data Accessibility in Structural Biology. J Phys Chem B 2024; 128:2219-2227. [PMID: 38418288 DOI: 10.1021/acs.jpcb.3c04823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2024]
Abstract
Molecular dynamics (MD) simulations have become a valuable tool in structural biology, offering insights into complex biological systems that are difficult to obtain through experimental techniques alone. The lack of available data sets and structures in most published computational work has limited other researchers' use of these models. In recent years, the emergence of online sharing platforms and MD database initiatives has favored the deposition of ensembles and structures to accompany publications, promoting reuse of the data sets. However, the lack of uniformity in metadata collection, in formats, and in what data are deposited limits the impact of these data and their use by different communities that are not necessarily experts in MD. This Perspective highlights the need for standardization and better resource sharing for processing and interpreting MD simulation results, akin to efforts in other areas of structural biology. As the field moves forward, we will see an increase in popularity and benefits of MD-based integrative approaches combining experimental data and simulations through probabilistic reasoning, but these too are limited by uniformity in experimental data availability and choices on how the data are modeled that are not trivial to decipher from papers. Other fields have addressed similar challenges comprehensively by establishing task forces with different degrees of success. The large scope and number of communities to represent the breadth of types of MD simulations complicates a parallel approach that would fit all. Thus, each group typically decides what data and which format to upload on servers like Zenodo. Uploading data with FAIR (findable, accessible, interoperable, reusable) principles in mind, including optimal metadata collection, will make the data more accessible and actionable by the community. Such a wealth of simulation data will foster method development and infrastructure advancements, thus propelling the field forward.
Affiliation(s)
- Marcelo Caparotta
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
| |
|
21
|
Szikszai M, Magnus M, Sanghi S, Kadyan S, Bouatta N, Rivas E. RNA3DB: A structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction. bioRxiv 2024:2024.01.30.578025. [PMID: 38352531 PMCID: PMC10862857 DOI: 10.1101/2024.01.30.578025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed in the existing literature, where many studies report inflated performance due to using training and testing sets with significant structural overlap. Further, the most recent Critical Assessment of Structure Prediction (CASP15) has shown that deep learning models for RNA structure are currently outperformed by traditional methods. In this paper we present RNA3DB, a dataset of structured RNAs, derived from the Protein Data Bank (PDB), that is designed for training and benchmarking deep learning models. The RNA3DB method arranges the RNA 3D chains into distinct groups (Components) that are non-redundant with regard to both sequence and structure, providing a robust way of dividing training, validation, and testing sets. Any split of these structurally-dissimilar Components is guaranteed to produce test and validation sets that are distinct by sequence and structure from those in the training set. We provide the RNA3DB dataset, a particular train/test split of the RNA3DB Components (in an approximate 70/30 ratio) that will be updated periodically. We also provide the RNA3DB methodology along with the source code, with the goal of creating a reproducible and customizable tool for producing structurally-dissimilar dataset splits for structural RNAs.
Affiliation(s)
- Marcell Szikszai
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
| | - Marcin Magnus
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
| | - Siddhant Sanghi
- Department of Systems Biology, Columbia University, New York, 10027, NY, USA
- College of Biological Sciences, UC Davis, Davis, 95616, CA, USA
| | - Sachin Kadyan
- Department of Systems Biology, Columbia University, New York, 10027, NY, USA
| | - Nazim Bouatta
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, 02115, MA, USA
| | - Elena Rivas
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, 02138, MA, USA
| |
|
22
|
Arrowsmith CH. Structure-guided drug discovery: back to the future. Nat Struct Mol Biol 2024; 31:395-396. [PMID: 38486110 DOI: 10.1038/s41594-024-01244-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Affiliation(s)
- Cheryl H Arrowsmith
- Princess Margaret Cancer Centre, Toronto, Ontario, Canada.
- Structural Genomics Consortium, University of Toronto, Toronto, Ontario, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.
| |
|
23
|
Sriramoju MK, Ko KT, Hsu STD. Tying a true topological protein knot by cyclization. Biochem Biophys Res Commun 2024; 696:149470. [PMID: 38244314 DOI: 10.1016/j.bbrc.2024.149470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 12/23/2023] [Accepted: 01/03/2024] [Indexed: 01/22/2024]
Abstract
Knotted proteins are fascinating to biophysicists because of their robust ability to fold into intricately defined three-dimensional structures with complex and topologically knotted arrangements. Exploring the biophysical properties of the knotted proteins is of significant interest, as they could offer enhanced chemical, thermal, and mechanostabilities. A true mathematical knot requires a closed path; in contrast, knotted protein structures have open N- and C-termini. To address the question of how a truly knotted protein differs from the naturally occurring counterpart, we enzymatically cyclized a 3₁-knotted YibK protein from Haemophilus influenzae (HiYibK) to investigate the impact of path closure on its structure-function relationship and folding stability. Through the use of a multitude of structural and biophysical tools, including X-ray crystallography, NMR spectroscopy, small angle X-ray scattering, differential scanning calorimetry, and isothermal calorimetry, we showed that the path closure minimally perturbs the native structure and ligand binding of HiYibK. Nevertheless, the cyclization did alter the folding stability and mechanism according to chemical and thermal unfolding analysis. These molecular insights contribute to our fundamental understanding of protein folding and knotting and could have implications for the design of proteins with higher stabilities.
Affiliation(s)
| | - Kuang-Ting Ko
- Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
| | - Shang-Te Danny Hsu
- Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan; Institute of Biochemical Sciences, National Taiwan University, Taipei, 106319, Taiwan; International Institute for Sustainability with Knotted Chiral Meta Matter (SKCM²), Hiroshima University, Higashihiroshima, 739-8527, Japan.
| |
|