1
Perlinska AP, Sikora M, Sulkowska JI. Everything AlphaFold tells us about protein knots. J Mol Biol 2024; 436:168715. [PMID: 39029890 DOI: 10.1016/j.jmb.2024.168715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 06/29/2024] [Accepted: 07/14/2024] [Indexed: 07/21/2024]
Abstract
Recent advances in Machine Learning methods in structural biology opened up new perspectives for protein analysis. Utilizing these methods allows us to go beyond the limitations of empirical research, and take advantage of the vast amount of generated data. We use a complete set of potentially knotted protein models identified in all high-quality predictions from the AlphaFold Database to search for any common trends that describe them. We show that the vast majority of knotted proteins have the 3₁ knot and that the presence of knots is not preferred in any of the Bacteria, Eukaryota, or Archaea domains. On the contrary, the percentage of knotted proteins in any given proteome is around 0.4%, regardless of the taxonomical group. We also verified that the organism's living conditions do not impact the number of knotted proteins in its proteome, as previously expected. We did not encounter an organism without a single knotted protein. What is more, we found four universally present families of knotted proteins in Bacteria, consisting of SAM synthase, and TrmD, TrmH, and RsmE methyltransferases.
Affiliation(s)
- Agata P Perlinska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-097, Poland
- Maciej Sikora
  - Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-097, Poland
- Joanna I Sulkowska
  - Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-097, Poland
2
Dabrowski-Tumanski P, Goundaroulis D, Stasiak A, Rawdon EJ, Sulkowska JI. Theta-curves in proteins. Protein Sci 2024; 33:e5133. [PMID: 39167036 PMCID: PMC11337915 DOI: 10.1002/pro.5133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 06/22/2024] [Accepted: 07/10/2024] [Indexed: 08/23/2024]
Abstract
We study and characterize the topology of connectivity circuits observed in natively folded protein structures whose coordinates are deposited in the Protein Data Bank (PDB). Polypeptide chains of some proteins naturally fold into unique knotted configurations. Another kind of nontrivial topology of polypeptide chains is observed when, in addition to covalent bonds connecting consecutive amino acids in polypeptide chains, one also considers disulfide and ionic bonds between non-consecutive amino acids. Bonds between non-consecutive amino acids introduce bifurcation points into connectivity circuits defined by bonds between consecutive and nonconsecutive amino acids in analyzed proteins. Circuits with bifurcation points can form θ-curves with various topologies. We catalog here the observed topologies of θ-curves passing through bridges between consecutive and non-consecutive amino acids in studied proteins.
Affiliation(s)
- Dimos Goundaroulis
  - Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Andrzej Stasiak
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Eric J. Rawdon
  - Department of Mathematics, University of St. Thomas, St. Paul, Minnesota, USA
3
Park J, Yamashita E, Yu J, Lee SJ, Hyun S. De Novo Designed Cell-Penetrating Peptide Self-Assembly Featuring Distinctive Tertiary Structure. ACS Omega 2024; 9:32991-32999. [PMID: 39100342 PMCID: PMC11292830 DOI: 10.1021/acsomega.4c04004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 07/04/2024] [Accepted: 07/09/2024] [Indexed: 08/06/2024]
Abstract
Recent attention has focused on the de novo design of proteins, paralleling advancements in biopharmaceuticals. Achieving protein designs with both structure and function poses a significant challenge, particularly considering the importance of quaternary structures, such as oligomers, in protein function. The cell penetration properties of peptides are of particular interest as they involve the penetration of large molecules into cells. We previously suggested a link between the oligomerization propensity of amphipathic peptides and their cell penetration abilities, yet concrete evidence at cellular-relevant concentrations was lacking due to oligomers' instability. In this study, we sought to characterize oligomerization states using various techniques, including X-ray crystallography, acceptor photobleaching Förster resonance energy transfer (FRET), native mass spectrometry (MS), and differential scanning calorimetry (DSC), while exploring the function related to oligomer status. X-ray crystallography revealed the atomic structures of oligomers formed by LK-3, a bis-disulfide bridged dimer with amino acid sequence LKKLCLKLKKLCKLAG, and its derivatives, highlighting the formation of hexamers, specifically the trimer of dimers, which exhibited a stable hydrophobic core. FRET experiments showed that LK-3 oligomer formation was associated with cell penetration. Native MS confirmed higher-order oligomers of LK-3, while an intriguing finding was the enhanced cell-penetrating capability of a 1:1 mixture of l/d-peptide dimers compared to pure enantiomers. DSC analysis supported the notion that this enantiomeric mixture promotes the formation of functional oligomers, crucial for cell penetration. In conclusion, our study provides direct evidence that amphipathic peptide LK-3 forms oligomers at low nanomolar concentrations, underscoring their significance in cell penetration behavior.
Affiliation(s)
- Jaehui Park
  - College of Pharmacy, Chungbuk National University, Cheongju 28160, Korea
- Eiki Yamashita
  - Institute for Protein Research, Osaka University, 3-2 Yamada-oka, Suita, Osaka 565-0871, Japan
- Jaehoon Yu
  - Department of Chemistry & Education, Seoul National University, Seoul 08826, Korea
  - CAMP Therapeutics Co., Ltd., Seoul 08826, Korea
- Soo Jae Lee
  - College of Pharmacy, Chungbuk National University, Cheongju 28160, Korea
- Soonsil Hyun
  - College of Pharmacy, Chungbuk National University, Cheongju 28160, Korea
4
Sikora M, Klimentova E, Uchal D, Sramkova D, Perlinska AP, Nguyen ML, Korpacz M, Malinowska R, Nowakowski S, Rubach P, Simecek P, Sulkowska JI. Knot or not? Identifying unknotted proteins in knotted families with sequence-based Machine Learning model. Protein Sci 2024; 33:e4998. [PMID: 38888487 PMCID: PMC11184937 DOI: 10.1002/pro.4998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 03/14/2024] [Accepted: 04/09/2024] [Indexed: 06/20/2024]
Abstract
Knotted proteins, although scarce, are crucial structural components of certain protein families, and their roles continue to be a topic of intense research. Capitalizing on the vast collection of protein structure predictions offered by AlphaFold (AF), this study computationally examines the entire UniProt database to create a robust dataset of knotted and unknotted proteins. Utilizing this dataset, we develop a machine learning (ML) model capable of accurately predicting the presence of knots in protein structures solely from their amino acid sequences. We tested the model's capabilities on 100 proteins whose structures had not yet been predicted by AF and found agreement with our local prediction in 92% of cases. From the point of view of structural biology, we found that all potentially knotted proteins predicted by AF can be classified into only 17 families. This allows us to discover the presence of unknotted proteins in families with a highly conserved knot. We found only three new protein families, UCH, DUF4253, and DUF2254, that contain both knotted and unknotted proteins, and demonstrate that deletions within the knot core could potentially account for the observed unknotted (trivial) topology. Finally, we have shown that in the majority of knotted families (11 out of 15), the knotted topology is strictly conserved in functional proteins with very low sequence similarity. We have conclusively demonstrated that proteins AF predicts as unknotted are structurally accurate in their unknotted configurations. However, these proteins often represent nonfunctional fragments, lacking significant portions of the knot core (amino acid sequence).
Affiliation(s)
- Maciej Sikora
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Eva Klimentova
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
- Dawid Uchal
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Physics, University of Warsaw, Warsaw, Poland
- Denisa Sramkova
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
- Mai Lan Nguyen
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Marta Korpacz
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Roksana Malinowska
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Szymon Nowakowski
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
  - Faculty of Physics, University of Warsaw, Warsaw, Poland
- Pawel Rubach
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Warsaw School of Economics, Warsaw, Poland
- Petr Simecek
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
5
Castells-Graells R, Yeates TO. Making topological protein links using enzymatic reactions. Natl Sci Rev 2024; 11:nwae071. [PMID: 38572076 PMCID: PMC10990160 DOI: 10.1093/nsr/nwae071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 02/25/2024] [Indexed: 04/05/2024] Open
Affiliation(s)
- Roger Castells-Graells
  - Department of Chemistry and Biochemistry, University of California, USA
  - UCLA-DOE Institute for Genomics and Proteomics, USA
- Todd O Yeates
  - Department of Chemistry and Biochemistry, University of California, USA
  - UCLA-DOE Institute for Genomics and Proteomics, USA
6
Wang J, Chen C, Yao G, Ding J, Wang L, Jiang H. Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review. Molecules 2023; 28:7865. [PMID: 38067593 PMCID: PMC10707872 DOI: 10.3390/molecules28237865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 11/13/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
In recent years, the widespread application of artificial intelligence algorithms in protein structure, function prediction, and de novo protein design has significantly accelerated the process of intelligent protein design and led to many noteworthy achievements. This advancement in protein intelligent design holds great potential to accelerate the development of new drugs, enhance the efficiency of biocatalysts, and even create entirely new biomaterials. Protein characterization is the key to the performance of intelligent protein design. However, there is no consensus on the most suitable characterization method for intelligent protein design tasks. This review describes the methods, characteristics, and representative applications of traditional descriptors, sequence-based and structure-based protein characterization. It discusses their advantages, disadvantages, and scope of application. It is hoped that this could help researchers to better understand the limitations and application scenarios of these methods, and provide valuable references for choosing appropriate protein characterization techniques for related research in the field, so as to better carry out protein research.
Affiliation(s)
- Junjie Ding
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
- Liangliang Wang
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
- Hui Jiang
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
7
Dabrowski-Tumanski P, Stasiak A. AlphaFold Blindness to Topological Barriers Affects Its Ability to Correctly Predict Proteins' Topology. Molecules 2023; 28:7462. [PMID: 38005184 PMCID: PMC10672856 DOI: 10.3390/molecules28227462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/03/2023] [Accepted: 11/04/2023] [Indexed: 11/26/2023] Open
Abstract
AlphaFold is a groundbreaking deep learning tool for protein structure prediction. It achieved remarkable accuracy in modeling many 3D structures while taking as the user input only the known amino acid sequence of proteins in question. Intriguingly though, in the early steps of each individual structure prediction procedure, AlphaFold does not respect topological barriers that, in real proteins, result from the reciprocal impermeability of polypeptide chains. This study aims to investigate how this failure to respect topological barriers affects AlphaFold predictions with respect to the topology of protein chains. We focus on such classes of proteins that, during their natural folding, reproducibly form the same knot type on their linear polypeptide chain, as revealed by their crystallographic analysis. We use partially artificial test constructs in which the mutual non-permeability of polypeptide chains should not permit the formation of complex composite knots during natural protein folding. We find that despite the formal impossibility that the protein folding process could produce such knots, AlphaFold predicts these proteins to form complex composite knots. Our study underscores the necessity for cautious interpretation and further validation of topological features in protein structures predicted by AlphaFold.
Affiliation(s)
- Pawel Dabrowski-Tumanski
  - Faculty of Mathematics and Natural Sciences, School of Exact Sciences, Cardinal Wyszynski University in Warsaw, Wóycickiego 1/3, 01-938 Warsaw, Poland
- Andrzej Stasiak
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland