1. Sikora M, Klimentova E, Uchal D, Sramkova D, Perlinska AP, Nguyen ML, Korpacz M, Malinowska R, Nowakowski S, Rubach P, Simecek P, Sulkowska JI. Knot or not? Identifying unknotted proteins in knotted families with sequence-based Machine Learning model. Protein Sci 2024; 33:e4998. [PMID: 38888487 DOI: 10.1002/pro.4998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 03/14/2024] [Accepted: 04/09/2024] [Indexed: 06/20/2024]
Abstract
Knotted proteins, although scarce, are crucial structural components of certain protein families, and their roles continue to be a topic of intense research. Capitalizing on the vast collection of protein structure predictions offered by AlphaFold (AF), this study computationally examines the entire UniProt database to create a robust dataset of knotted and unknotted proteins. Utilizing this dataset, we develop a machine learning (ML) model capable of accurately predicting the presence of knots in protein structures solely from their amino acid sequences. We tested the model's capabilities on 100 proteins whose structures had not yet been predicted by AF and found agreement with our local prediction in 92% of cases. From the point of view of structural biology, we found that all potentially knotted proteins predicted by AF can be classified into only 17 families. This allows us to discover the presence of unknotted proteins in families with a highly conserved knot. We found only three new protein families (UCH, DUF4253, and DUF2254) that contain both knotted and unknotted proteins, and demonstrate that deletions within the knot core could potentially account for the observed unknotted (trivial) topology. Finally, we have shown that in the majority of knotted families (11 out of 15), the knotted topology is strictly conserved in functional proteins with very low sequence similarity. We have conclusively demonstrated that proteins AF predicts as unknotted are structurally accurate in their unknotted configurations. However, these proteins often represent nonfunctional fragments, lacking significant portions of the knot core (amino acid sequence).
Affiliation(s)
- Maciej Sikora
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Eva Klimentova
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
- Dawid Uchal
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Physics, University of Warsaw, Warsaw, Poland
- Denisa Sramkova
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
- Mai Lan Nguyen
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Marta Korpacz
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Roksana Malinowska
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Szymon Nowakowski
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
  - Faculty of Physics, University of Warsaw, Warsaw, Poland
- Pawel Rubach
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Warsaw School of Economics, Warsaw, Poland
- Petr Simecek
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
2. Hsu STD. Folding and functions of knotted proteins. Curr Opin Struct Biol 2023; 83:102709. [PMID: 37778185 DOI: 10.1016/j.sbi.2023.102709] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/30/2023] [Revised: 09/02/2023] [Accepted: 09/05/2023] [Indexed: 10/03/2023]
Abstract
Topologically knotted proteins have entangled structural elements within their native structures that cannot be disentangled simply by pulling from the N- and C-termini. Systematic surveys have identified different types of knotted protein structures, constituting as much as 1% of the total entries within the Protein Data Bank. Many knotted proteins rely on their knotted structural elements to carry out evolutionarily conserved biological functions. Being knotted may also provide mechanical stability to withstand unfolding-coupled proteolysis. Reconfiguring a knotted protein topology by circular permutation or cyclization provides insights into the importance of being knotted in the context of folding and functions. With the explosion of predicted protein structures by artificial intelligence, we are now entering a new era of exploring the entangled protein universe.
Affiliation(s)
- Shang-Te Danny Hsu
  - Institute of Biological Chemistry, Academia Sinica, Taipei 11529, Taiwan; Institute of Biochemical Sciences, National Taiwan University, Taipei 10617, Taiwan; International Institute for Sustainability with Knotted Chiral Meta Matter (WPI-SKCM(2)), Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8526, Japan
3. Repositioning of Etravirine as a Potential CK1ε Inhibitor by Virtual Screening. Pharmaceuticals (Basel) 2021; 15:ph15010008. [PMID: 35056065 PMCID: PMC8778358 DOI: 10.3390/ph15010008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 12/17/2021] [Accepted: 12/19/2021] [Indexed: 11/16/2022] Open
Abstract
CK1ε is a key regulator of WNT/β-catenin and other pathways that are linked to tumor progression; thus, CK1ε is considered a target for the development of antineoplastic therapies. In this study, we performed a virtual screening to search for potential CK1ε inhibitors. First, we characterized the dynamic noncovalent interaction profiles for a set of reported CK1ε inhibitors to generate a pharmacophore model, which was used to identify new potential inhibitors among FDA-approved drugs. We found that etravirine and abacavir, two drugs that are approved for HIV infections, can be repurposed as CK1ε inhibitors. The interaction of these drugs with CK1ε was further examined by molecular docking and molecular dynamics. Etravirine and abacavir formed stable complexes with the target, emulating the binding behavior of known inhibitors. However, only etravirine showed high theoretical binding affinity to CK1ε. Our findings provide a new pharmacophore for targeting CK1ε and implicate etravirine as a CK1ε inhibitor and antineoplastic agent.
4. Muthu SA, Jadav HC, Srivastava S, Pissurlenkar RRS, Ahmad B. The reorganization of conformations, stability and aggregation of serum albumin isomers through the interaction of glycopeptide antibiotic teicoplanin: A thermodynamic and spectroscopy study. Int J Biol Macromol 2020; 163:66-78. [PMID: 32615213 DOI: 10.1016/j.ijbiomac.2020.06.258] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/30/2020] [Revised: 06/10/2020] [Accepted: 06/26/2020] [Indexed: 11/18/2022]
Abstract
The study of drug-protein binding is of growing importance for drug repurposing against amyloidosis. In this work, we study the binding of teicoplanin (TPN), a glycopeptide antibiotic, with bovine serum albumin (BSA) in its neutral (N), physiological (P) and basic (B) forms, which exist at pH 6, pH 7.4 and pH 9, respectively. The binding and thermodynamic parameters of TPN binding were determined by isothermal titration calorimetry (ITC) and fluorescence quench titration methods. Two binding sites were observed for the N and P forms, whereas the B form showed only one binding site. ITC and molecular docking results indicated that TPN-BSA complex formation is stabilized by hydrogen bonds, salt bridges and hydrophobic interactions. The red-edge excitation shift (REES) study indicated an ordered, compact spatial arrangement of the TPN-bound protein molecule. TPN was found to affect the secondary and tertiary structures of the B form only. TPN binding was observed to marginally stabilize BSA isomers. TPN was also found to inhibit BSA aggregation, as monitored by Rayleigh light scattering and the thioflavin T binding assay. The current in vitro study will open a new path to explore the possible use of TPN as a potential drug to treat amyloidosis.
Affiliation(s)
- Shivani A Muthu
  - Protein Assembly Laboratory (PAL), JH-Institute of Molecular Medicine, Jamia Hamdard, Hamdard Nagar, New Delhi 110062, India
- Helly Chetan Jadav
  - School of Chemical Sciences, UM-DAE Centre for Excellence in Basic Sciences, University of Mumbai, Vidyanagari Campus, Mumbai 400098, India
- Sadhavi Srivastava
  - School of Chemical Sciences, UM-DAE Centre for Excellence in Basic Sciences, University of Mumbai, Vidyanagari Campus, Mumbai 400098, India; Department of Biotechnology, Central University of South Bihar, Gaya 824236, India
- Raghuvir R S Pissurlenkar
  - Department of Pharmaceutical and Medicinal Chemistry, Goa College of Pharmacy, 18th June Road, Panaji, Goa 403001, India
- Basir Ahmad
  - Protein Assembly Laboratory (PAL), JH-Institute of Molecular Medicine, Jamia Hamdard, Hamdard Nagar, New Delhi 110062, India
5. Niemyska W, Millett KC, Sulkowska JI. GLN: a method to reveal unique properties of lasso type topology in proteins. Sci Rep 2020; 10:15186. [PMID: 32938999 PMCID: PMC7494857 DOI: 10.1038/s41598-020-71874-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/17/2020] [Accepted: 08/17/2020] [Indexed: 02/02/2023] Open
Abstract
Geometry and topology are the main factors that determine the functional properties of proteins. In this work, we show how to use the Gauss linking integral (GLN) in the form of a matrix diagram (for a pair of a loop and a tail) to study both the geometry and topology of proteins with closed loops, e.g., lassos. We show that the GLN method is a significantly faster technique to detect entanglement in lasso proteins in comparison with other methods. Based on the GLN technique, we conduct a comprehensive analysis of all proteins deposited in the PDB and compare it to the statistical properties of polymers. We show how high and low GLN values correlate with the internal flexibility of proteins, and how the GLN in the form of a matrix diagram can be used to study folding and unfolding routes. Finally, we discuss how the GLN method can be applied to study entanglement between two structures, neither of which is a closed loop. Since this approach is much faster than other linking invariants, the next step will be the evaluation of lassos in much longer molecules such as RNA or loops in a single chromosome.
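As an illustration of the quantity this abstract refers to, the following is a minimal sketch of a discretized Gauss linking integral between two polylines, using a midpoint rule over pairs of segments. It is not the authors' implementation: the function names (`gauss_linking`, `circle`) and the test geometry are assumptions made here for illustration, and the paper's matrix diagram over loop and tail sub-chains is not reproduced.

```python
import math

def gauss_linking(chain_a, chain_b):
    """Midpoint-rule estimate of the Gauss linking integral
    (1 / 4*pi) * sum of r . (da x db) / |r|^3 over all segment pairs."""
    total = 0.0
    for i in range(len(chain_a) - 1):
        a0, a1 = chain_a[i], chain_a[i + 1]
        da = [a1[k] - a0[k] for k in range(3)]          # segment vector
        ma = [(a1[k] + a0[k]) / 2.0 for k in range(3)]  # segment midpoint
        for j in range(len(chain_b) - 1):
            b0, b1 = chain_b[j], chain_b[j + 1]
            db = [b1[k] - b0[k] for k in range(3)]
            mb = [(b1[k] + b0[k]) / 2.0 for k in range(3)]
            r = [ma[k] - mb[k] for k in range(3)]
            cross = [da[1] * db[2] - da[2] * db[1],
                     da[2] * db[0] - da[0] * db[2],
                     da[0] * db[1] - da[1] * db[0]]
            dist_sq = sum(c * c for c in r)
            total += sum(r[k] * cross[k] for k in range(3)) / dist_sq ** 1.5
    return total / (4.0 * math.pi)

def circle(center, u, v, n=100):
    """Closed polyline on the unit circle spanned by orthonormal u and v
    (hypothetical helper; the first point is repeated at the end)."""
    pts = []
    for t in range(n + 1):
        ang = 2.0 * math.pi * t / n
        pts.append(tuple(center[k] + math.cos(ang) * u[k] + math.sin(ang) * v[k]
                         for k in range(3)))
    return pts

ring = circle((0, 0, 0), (1, 0, 0), (0, 1, 0))
linked = circle((1, 0, 0), (1, 0, 0), (0, 0, 1))  # threads through `ring`
far = circle((5, 0, 0), (1, 0, 0), (0, 0, 1))     # well separated from `ring`
gln_linked = gauss_linking(ring, linked)
gln_far = gauss_linking(ring, far)
```

For two closed curves the integral approaches an integer, with the sign set by the orientation of the curves. For an open tail threading a loop, as in lasso proteins, the value is generally non-integer, which is why it carries geometric as well as topological information.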
Affiliation(s)
- Wanda Niemyska
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097, Warsaw, Poland
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097, Warsaw, Poland
- Kenneth C Millett
  - Department of Mathematics, University of California Santa Barbara, Santa Barbara, CA, 93106, USA
- Joanna I Sulkowska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097, Warsaw, Poland
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093, Warsaw, Poland
6. Piejko M, Niewieczerzal S, Sulkowska JI. The Folding of Knotted Proteins: Distinguishing the Distinct Behavior of Shallow and Deep Knots. Isr J Chem 2020. [DOI: 10.1002/ijch.202000036] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Affiliation(s)
- Maciej Piejko
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, Warsaw 02-093, Poland
  - Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-097, Poland
- Joanna I. Sulkowska
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, Warsaw 02-093, Poland
  - Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-097, Poland
7. Grønbæk C, Hamelryck T, Røgen P. GISA: using Gauss Integrals to identify rare conformations in protein structures. PeerJ 2020; 8:e9159. [PMID: 32566389 PMCID: PMC7293858 DOI: 10.7717/peerj.9159] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/30/2019] [Accepted: 04/18/2020] [Indexed: 12/13/2022] Open
Abstract
The native structure of a protein is important for its function, and therefore methods for exploring protein structures have attracted much research. However, rather few methods are sensitive to topological-geometric features, examples being knots, slipknots, lassos, links, and pokes, and each method is aimed only at a specific set of such configurations. We here propose a general method which transforms a structure into a "fingerprint of topological-geometric values" consisting of a series of real-valued descriptors from mathematical knot theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this, our new algorithm, GISA, produces the descriptors, so-called Gauss integrals, not only for the full chains of a protein but for all its sub-chains; this is its key novelty. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high-resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but also knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA's tool-box revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe 14 cases in cis-trans isomerases sharing a spatial motif of little secondary structure content, which possibly has gone unnoticed.
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. By its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).
Affiliation(s)
- Christian Grønbæk
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark; Current affiliation: Department of Biology, Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, Denmark
- Thomas Hamelryck
  - Department of Biology, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Peter Røgen
  - DTU COMPUTE, Technical University of Denmark, Kgs. Lyngby, Denmark
8. Sulkowska JI. On folding of entangled proteins: knots, lassos, links and θ-curves. Curr Opin Struct Biol 2020; 60:131-141. [PMID: 32062143 DOI: 10.1016/j.sbi.2020.01.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 12/01/2019] [Revised: 01/02/2020] [Accepted: 01/12/2020] [Indexed: 12/15/2022]
Abstract
Around 6% of protein structures deposited in the PDB are entangled, forming knots, slipknots, lassos, links, and θ-curves. In each of these cases, the protein backbone weaves through itself in a complex way, and at some point passes through a closed loop formed by other regions of the protein structure. Such a passing can be interpreted as crossing a topological barrier. How proteins overcome such barriers, and therefore different degrees of frustration, has challenged scientists and shed new light on the field of protein folding. In this review, we summarize the current knowledge about the free energy landscape of proteins with non-trivial topology. We describe identified mechanisms which lead proteins to self-tying. We discuss the influence of excluded volume, such as crowding and chaperones, on tying, based on available data. We briefly discuss the diversity of topological complexity of proteins and their evolution. We also list available tools to investigate non-trivial topology. Finally, we formulate intriguing and challenging questions at the boundary of biophysics, bioinformatics, biology, and mathematics, which arise from the discovery of entangled proteins.
Affiliation(s)
- Joanna Ida Sulkowska
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Chemistry, University of Warsaw, Warsaw, Poland
9. Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023] Open
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often of too low a resolution. In this mini-review, we outline the computational methods for protein structure reconstruction from incomplete coarse-grained models to all-atom models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chain rebuilding based on the protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all-atom structures.
Affiliation(s)
- Sebastian Kmiecik
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
10. Zuk PJ, Cichocki B, Szymczak P. GRPY: An Accurate Bead Method for Calculation of Hydrodynamic Properties of Rigid Biomacromolecules. Biophys J 2018; 115:782-800. [PMID: 30144937 PMCID: PMC6127458 DOI: 10.1016/j.bpj.2018.07.015] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/04/2017] [Revised: 07/08/2018] [Accepted: 07/16/2018] [Indexed: 10/28/2022] Open
Abstract
Two main problems that arise in the context of hydrodynamic bead modeling are an inaccurate treatment of bead overlaps and the necessity of using volume corrections when calculating intrinsic viscosity. We present a formalism based on the generalized Rotne-Prager-Yamakawa approximation that successfully addresses both of these issues. The generalized Rotne-Prager-Yamakawa method is shown to be highly effective for the calculation of transport properties of rigid biomolecules represented as assemblies of spherical beads of different sizes, both overlapping and nonoverlapping. We test the method on simple molecular shapes as well as real protein structures and compare its performance with other computational approaches.
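To make the bead-overlap issue concrete, here is a minimal sketch of the classical translational Rotne-Prager-Yamakawa mobility tensor for two beads of equal radius, including the standard overlap correction for separations below two radii. It is a simplified special case and not the GRPY code: the generalized method described above also covers beads of different sizes and the rotational and dipolar terms, and the function name `rpy_mobility` and its parameters are assumptions made here for illustration.

```python
import math

def rpy_mobility(r_vec, a, eta):
    """Translational RPY mobility tensor (3x3 nested list) coupling two beads
    of equal radius `a` in a fluid of viscosity `eta`, separated by `r_vec`."""
    r = math.sqrt(sum(c * c for c in r_vec))
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if r == 0.0:
        # Coincident beads: reduces to the Stokes self-mobility.
        pref = 1.0 / (6.0 * math.pi * eta * a)
        return [[pref * identity[i][j] for j in range(3)] for i in range(3)]
    rhat = [c / r for c in r_vec]
    if r > 2.0 * a:
        # Non-overlapping beads: Rotne-Prager form.
        pref = 1.0 / (8.0 * math.pi * eta * r)
        c_iso = 1.0 + 2.0 * a * a / (3.0 * r * r)
        c_dyad = 1.0 - 2.0 * a * a / (r * r)
    else:
        # Overlapping beads: Yamakawa correction, keeps the tensor well behaved.
        pref = 1.0 / (6.0 * math.pi * eta * a)
        c_iso = 1.0 - 9.0 * r / (32.0 * a)
        c_dyad = 3.0 * r / (32.0 * a)
    return [[pref * (c_iso * identity[i][j] + c_dyad * rhat[i] * rhat[j])
             for j in range(3)] for i in range(3)]
```

The two branches agree at a separation of exactly two radii, so the tensor is continuous as beads start to overlap, and at zero separation it reduces to the single-bead Stokes mobility.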
Affiliation(s)
- Pawel J Zuk
  - Department of Biosystems and Soft Matter, Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland; Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey
- Bogdan Cichocki
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, Warsaw, Poland
- Piotr Szymczak
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, Warsaw, Poland