1. O'Donnell TJ, Kanduri C, Isacchini G, Limenitakis JP, Brachman RA, Alvarez RA, Haff IH, Sandve GK, Greiff V. Reading the repertoire: Progress in adaptive immune receptor analysis using machine learning. Cell Syst 2024; 15:1168-1189. [PMID: 39701034 DOI: 10.1016/j.cels.2024.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 08/16/2024] [Accepted: 11/14/2024] [Indexed: 12/21/2024]
Abstract
The adaptive immune system holds invaluable information on past and present immune responses in the form of B and T cell receptor sequences, but we are limited in our ability to decode this information. Machine learning approaches are under active investigation for a range of tasks relevant to understanding and manipulating the adaptive immune receptor repertoire, including matching receptors to the antigens they bind, generating antibodies or T cell receptors for use as therapeutics, and diagnosing disease based on patient repertoires. Progress on these tasks has the potential to substantially improve the development of vaccines, therapeutics, and diagnostics, as well as advance our understanding of fundamental immunological principles. We outline key challenges for the field, highlighting the need for software benchmarking, targeted large-scale data generation, and coordinated research efforts.
Affiliation(s)
- Chakravarthi Kanduri
- Department of Informatics, University of Oslo, Oslo, Norway; UiO:RealArt Convergence Environment, University of Oslo, Oslo, Norway
- Rebecca A Brachman
- Imprint Labs, LLC, New York, NY, USA; Cornell Tech, Cornell University, New York, NY, USA
- Ingrid H Haff
- Department of Mathematics, University of Oslo, 0371 Oslo, Norway
- Geir K Sandve
- Department of Informatics, University of Oslo, Oslo, Norway; UiO:RealArt Convergence Environment, University of Oslo, Oslo, Norway
- Victor Greiff
- Imprint Labs, LLC, New York, NY, USA; Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.

2. Li C, Luo Y, Xie Y, Zhang Z, Liu Y, Zou L, Xiao F. Structural and functional prediction, evaluation, and validation in the post-sequencing era. Comput Struct Biotechnol J 2024; 23:446-451. [PMID: 38223342 PMCID: PMC10787220 DOI: 10.1016/j.csbj.2023.12.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 01/16/2024] Open
Abstract
The surge in genome sequencing data has highlighted a substantial number of genetic variants of uncertain significance (VUS). Deciphering the VUS discovered by sequencing poses a major challenge in the post-sequencing era. Although experimental assays have progressed in classifying VUS, only a tiny fraction of human genes have been explored experimentally. Thus, state-of-the-art in silico functional predictors of VUS are urgently needed. Artificial intelligence (AI) is an invaluable tool to assist in the identification of VUS with high efficiency and accuracy. An increasing number of studies indicate that AI has substantially accelerated the interpretation of VUS, and our group has already used AI to develop protein structure-based prediction models. In this review, we provide an overview of previous research on AI-based prediction of missense variants, and elucidate the challenges and opportunities for protein structure-based variant prediction in the post-sequencing era.
Affiliation(s)
- Chang Li
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Yixuan Luo
- Beijing Normal University, Beijing, China
- Yibo Xie
- Information Center, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Zaifeng Zhang
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Ye Liu
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Lihui Zou
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Fei Xiao
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Beijing Normal University, Beijing, China

3. Lazou M, Khan O, Nguyen T, Padhorny D, Kozakov D, Joseph-McCarthy D, Vajda S. Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much. Proc Natl Acad Sci U S A 2024; 121:e2412719121. [PMID: 39565312 PMCID: PMC11621821 DOI: 10.1073/pnas.2412719121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 10/21/2024] [Indexed: 11/21/2024] Open
Abstract
The goal of this paper is predicting the conformational distributions of ligand binding sites using the AlphaFold2 (AF2) protein structure prediction program with stochastic subsampling of the multiple sequence alignment (MSA). We explored the opening of cryptic ligand binding sites in 16 proteins, where the closed and open conformations define the expected extreme points of the conformational variation. Due to the many structures of these proteins in the Protein Data Bank (PDB), we were able to study whether the distribution of X-ray structures affects the distribution of AF2 models. We have found that AF2 generates both a cluster of open and a cluster of closed models for proteins that have comparable numbers of open and closed structures in the PDB and not too many other conformations. This was observed even with default MSA parameters, thus without further subsampling. In contrast, with the exception of a single protein, AF2 did not yield multiple clusters of conformations for proteins that had imbalanced numbers of open and closed structures in the PDB, or had substantial numbers of other structures. Subsampling improved the results only for a single protein, but very shallow MSA led to incorrect structures. The ability of generating both open and closed conformations for six out of the 16 proteins agrees with the success rates of similar studies reported in the literature. However, we showed that this partial success is due to AF2 "remembering" the conformational distributions in the PDB and that the approach fails to predict rarely seen conformations.
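The stochastic MSA subsampling mentioned in this abstract can be illustrated with a minimal Python sketch. It is not AF2's internal code and not the authors' protocol: the function name `subsample_msa`, its parameters, and the representation of the MSA as a list of aligned strings are all assumptions made for illustration. The sketch keeps the query row and draws a random subset of the remaining rows, so repeated runs with different seeds give the predictor different shallow alignments.

```python
import random


def subsample_msa(msa, max_seqs, seed=None):
    """Return a random subsample of an MSA (hypothetical helper).

    msa      -- list of aligned sequences; msa[0] is assumed to be the query
    max_seqs -- maximum number of rows to keep, including the query
    seed     -- optional seed so a given subsample can be reproduced
    """
    if not msa:
        raise ValueError("MSA must contain at least the query sequence")
    if max_seqs < 1:
        raise ValueError("max_seqs must be at least 1")

    rng = random.Random(seed)
    query, rest = msa[0], msa[1:]

    # Draw without replacement; never ask for more rows than exist.
    k = min(max_seqs - 1, len(rest))
    picked = sorted(rng.sample(range(len(rest)), k))

    # The query always stays first; the other rows keep their original order.
    return [query] + [rest[i] for i in picked]


if __name__ == "__main__":
    toy_msa = ["MKTAYIAK", "MKTAYLAK", "MRTAYIAK", "MKSAYIAK", "MKTAFIAK"]
    for s in range(3):
        print(subsample_msa(toy_msa, max_seqs=3, seed=s))
```

Each seed yields one shallow alignment; a very small `max_seqs` corresponds to the "very shallow MSA" regime that the abstract reports as leading to incorrect structures.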
Affiliation(s)
- Maria Lazou
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Omeir Khan
- Department of Chemistry, Boston University, Boston, MA 02215
- Thu Nguyen
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
- Diane Joseph-McCarthy
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215

4. Wu D, Yin R, Chen G, Ribeiro-Filho HV, Cheung M, Robbins PF, Mariuzza RA, Pierce BG. Structural characterization and AlphaFold modeling of human T cell receptor recognition of NRAS cancer neoantigens. Sci Adv 2024; 10:eadq6150. [PMID: 39576860 PMCID: PMC11584006 DOI: 10.1126/sciadv.adq6150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 10/21/2024] [Indexed: 11/24/2024]
Abstract
T cell receptors (TCRs) that recognize cancer neoantigens are important for anticancer immune responses and immunotherapy. Understanding the structural basis of TCR recognition of neoantigens provides insights into their exquisite specificity and can enable design of optimized TCRs. We determined crystal structures of a human TCR in complex with NRAS Q61K and Q61R neoantigen peptides and HLA-A1 major histocompatibility complex (MHC), revealing the molecular underpinnings for dual recognition and specificity versus wild-type NRAS peptide. We then used multiple versions of AlphaFold to model the corresponding complex structures, given the challenge of immune recognition for such methods. One implementation of AlphaFold2 (TCRmodel2) with additional sampling was able to generate accurate models of the complexes, while AlphaFold3 also showed strong performance, although success was lower for other complexes. This study provides insights into TCR recognition of a shared cancer neoantigen as well as the utility and practical considerations for using AlphaFold to model TCR-peptide-MHC complexes.
MESH Headings
- Humans
- Receptors, Antigen, T-Cell/metabolism
- Receptors, Antigen, T-Cell/immunology
- Receptors, Antigen, T-Cell/chemistry
- Antigens, Neoplasm/immunology
- Antigens, Neoplasm/chemistry
- Antigens, Neoplasm/metabolism
- Membrane Proteins/chemistry
- Membrane Proteins/immunology
- Membrane Proteins/metabolism
- Membrane Proteins/genetics
- Models, Molecular
- GTP Phosphohydrolases/metabolism
- GTP Phosphohydrolases/chemistry
- GTP Phosphohydrolases/genetics
- GTP Phosphohydrolases/immunology
- Protein Binding
- Neoplasms/immunology
- Neoplasms/genetics
- Neoplasms/metabolism
- Crystallography, X-Ray
- Protein Conformation
- Peptides/chemistry
- Peptides/immunology
- Peptides/metabolism
Affiliation(s)
- Daichao Wu
- Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Rui Yin
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Guodong Chen
- Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Helder V. Ribeiro-Filho
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Melyssa Cheung
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
- Paul F. Robbins
- Surgery Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
- Roy A. Mariuzza
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Brian G. Pierce
- W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA

5. Yudenko A, Bukhdruker S, Shishkin P, Rodin S, Burtseva A, Petrov A, Pigareva N, Sokolov A, Zinovev E, Eliseev I, Remeeva A, Marin E, Mishin A, Gordeliy V, Gushchin I, Ischenko A, Borshchevskiy V. Structural basis of signaling complex inhibition by IL-6 domain-swapped dimers. Structure 2024:S0969-2126(24)00463-5. [PMID: 39566503 DOI: 10.1016/j.str.2024.10.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 09/16/2024] [Accepted: 10/24/2024] [Indexed: 11/22/2024]
Abstract
Interleukin-6 (IL-6) is a multifaceted cytokine essential in many immune system processes and their regulation. It also plays a key role in hematopoiesis, and in triggering the acute phase reaction. IL-6 overproduction is critical in chronic inflammation associated with autoimmune diseases like rheumatoid arthritis and contributes to cytokine storms in COVID-19 patients. Over 20 years ago, researchers proposed that IL-6, which is typically monomeric, can also form dimers via a domain-swap mechanism, with indirect evidence supporting their existence. The physiological significance of IL-6 dimers was shown in B-cell chronic lymphocytic leukemia. However, no structures have been reported so far. Here, we present the crystal structure of an IL-6 domain-swapped dimer that computational approaches could not predict. The structure explains why the IL-6 dimer is antagonistic to the IL-6 monomer in signaling complex formation and provides insights for IL-6 targeted therapies.
Affiliation(s)
- Anna Yudenko
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Sergey Bukhdruker
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Pavel Shishkin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Sergey Rodin
- Institute of Experimental Medicine, St. Petersburg 197022, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Anastasia Burtseva
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Aleksandr Petrov
- Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia; Medicinal Chemistry Center, Togliatti State University, Togliatti, Samara Region 445020, Russia
- Natalia Pigareva
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Alexey Sokolov
- Institute of Experimental Medicine, St. Petersburg 197022, Russia
- Egor Zinovev
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Igor Eliseev
- Alferov University, St. Petersburg 194021, Russia; St. Petersburg School of Physics, Mathematics, and Computer Science, HSE University, St. Petersburg 194100, Russia
- Alina Remeeva
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Egor Marin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Alexey Mishin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Valentin Gordeliy
- Institut de Biologie Structurale J.-P. Ebel, Université Grenoble Alpes-CEA-CNRS, 38000 Grenoble, France
- Ivan Gushchin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Aleksandr Ischenko
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia.
- Valentin Borshchevskiy
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia; Joint Institute for Nuclear Research, Dubna, Moscow Region 141980, Russia.

6. Raouraoua N, Mirabello C, Véry T, Blanchet C, Wallner B, Lensink MF, Brysbaert G. MassiveFold: unveiling AlphaFold's hidden potential with optimized and parallelized massive sampling. Nat Comput Sci 2024; 4:824-828. [PMID: 39528570 PMCID: PMC11578886 DOI: 10.1038/s43588-024-00714-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Accepted: 10/03/2024] [Indexed: 11/16/2024]
Abstract
Massive sampling in AlphaFold enables access to increased structural diversity. In combination with its efficient confidence ranking, this unlocks elevated modeling capabilities for monomeric structures and foremost for protein assemblies. However, the approach struggles with GPU cost and data storage. Here we introduce MassiveFold, an optimized and customizable version of AlphaFold that runs predictions in parallel, reducing the computing time from several months to hours. MassiveFold is scalable and able to run on anything from a single computer to a large GPU infrastructure, where it can fully benefit from all the computing nodes.
Affiliation(s)
- Nessim Raouraoua
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
- Claudio Mirabello
- Science for Life Laboratory, Department of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Linköping University, Linköping, Sweden
- Thibaut Véry
- Institut du Développement et des Ressources en Informatique Scientifique (IDRIS), CNRS, Université Paris-Saclay, Orsay, France
- Christophe Blanchet
- IFB-core, Institut Français de Bioinformatique (IFB), CNRS, INSERM, INRAE, CEA, Evry, France
- Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
- Marc F Lensink
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
- Guillaume Brysbaert
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France.

7. McFee M, Kim J, Kim PM. EuDockScore: Euclidean graph neural networks for scoring protein-protein interfaces. Bioinformatics 2024; 40:btae636. [PMID: 39441796 PMCID: PMC11543620 DOI: 10.1093/bioinformatics/btae636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Revised: 10/16/2024] [Accepted: 10/21/2024] [Indexed: 10/25/2024] Open
Abstract
MOTIVATION: Protein-protein interactions are essential for a variety of biological phenomena, including mediating biochemical reactions, cell signaling, and the immune response. Proteins seek to form interfaces that reduce overall system energy. Although determination of single polypeptide chain protein structures has been revolutionized by deep learning techniques, complex prediction has still not been perfected. Additionally, experimentally determining structures is extremely expensive in both resources and time. An alternative is computational docking, which takes the solved individual structures of proteins and produces candidate interfaces (decoys). Decoys are then scored using mathematical functions that assess the quality of the system, known as scoring functions. Beyond docking, scoring functions are a critical component of assessing structures produced by many protein generative models. Scoring models are also used as a final filter in many generative deep learning models, including those that generate antibody binders and those that perform docking. RESULTS: In this work, we present improved scoring functions for protein-protein interactions that use cutting-edge Euclidean graph neural network architectures to assess protein-protein interfaces. These Euclidean docking score models are known as EuDockScore and EuDockScore-Ab, the latter being specific to antibody-antigen docking. Finally, we provide EuDockScore-AFM, a model trained on antibody-antigen outputs from AlphaFold-Multimer (AFM), which proves useful in reranking large numbers of AFM outputs. AVAILABILITY AND IMPLEMENTATION: The code for these models is available at https://gitlab.com/mcfeemat/eudockscore.
Affiliation(s)
- Matthew McFee
- Department of Molecular Genetics, The University of Toronto, Toronto, ON M5S 1A8, Canada
- Donnelly Centre for Cellular and Biomolecular Research, The University of Toronto, Toronto, ON M5S 3E1, Canada
- Jisun Kim
- Donnelly Centre for Cellular and Biomolecular Research, The University of Toronto, Toronto, ON M5S 3E1, Canada
- Philip M Kim
- Department of Molecular Genetics, The University of Toronto, Toronto, ON M5S 1A8, Canada
- Donnelly Centre for Cellular and Biomolecular Research, The University of Toronto, Toronto, ON M5S 3E1, Canada
- Department of Computer Science, The University of Toronto, Toronto, ON M5S 2E4, Canada

8. Xu J, Wang Y. Generating Multistate Conformations of P-type ATPases with a Conditional Diffusion Model. J Chem Inf Model 2024. [PMID: 39480276 DOI: 10.1021/acs.jcim.4c01519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
Understanding and predicting the diverse conformational states of membrane proteins is essential for elucidating their biological functions. Despite advancements in computational methods, accurately capturing these complex structural changes remains a significant challenge. Here, we introduce a computational approach to generate diverse and biologically relevant conformations of membrane proteins using a conditional diffusion model. Our approach integrates forward and backward diffusion processes, incorporating state classifiers and additional conditioners to control the generation gradient of conformational states. We specifically targeted the P-type ATPases, a critical family of membrane transporters, and constructed a comprehensive data set through a combination of experimental structures and molecular dynamics simulations. Our model, incorporating a graph neural network with specialized membrane constraints, demonstrates exceptional accuracy in generating a wide range of P-type ATPase conformations associated with different functional states. This approach represents a meaningful step forward in the computational generation of membrane protein conformations using AI and holds promise for studying the dynamics of other membrane proteins.
Affiliation(s)
- Jingtian Xu
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
- Yong Wang
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China

9. Büttiker P, Boukherissa A, Weissenberger S, Ptacek R, Anders M, Raboch J, Stefano GB. Cognitive Impact of Neurotropic Pathogens: Investigating Molecular Mimicry through Computational Methods. Cell Mol Neurobiol 2024; 44:72. [PMID: 39467848 PMCID: PMC11519248 DOI: 10.1007/s10571-024-01509-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2024] [Accepted: 10/22/2024] [Indexed: 10/30/2024]
Abstract
Neurotropic pathogens, notably, herpesviruses, have been associated with significant neuropsychiatric effects. As a group, these pathogens can exploit molecular mimicry mechanisms to manipulate the host central nervous system to their advantage. Here, we present a systematic computational approach that may ultimately be used to unravel protein-protein interactions and molecular mimicry processes that have not yet been solved experimentally. Toward this end, we validate this approach by replicating a set of pre-existing experimental findings that document the structural and functional similarities shared by the human cytomegalovirus-encoded UL144 glycoprotein and human tumor necrosis factor receptor superfamily member 14 (TNFRSF14). We began with a thorough exploration of the Homo sapiens protein database using the Basic Local Alignment Search Tool (BLASTx) to identify proteins sharing sequence homology with UL144. Subsequently, we used AlphaFold2 to predict the independent three-dimensional structures of UL144 and TNFRSF14. This was followed by a comprehensive structural comparison facilitated by Distance-Matrix Alignment and Foldseek. Finally, we used AlphaFold-multimer and PPIscreenML to elucidate potential protein complexes and confirm the predicted binding activities of both UL144 and TNFRSF14. We then used our in silico approach to replicate the experimental finding that revealed TNFRSF14 binding to both B- and T-lymphocyte attenuator (BTLA) and glycoprotein domain and UL144 binding to BTLA alone. This computational framework offers promise in identifying structural similarities and interactions between pathogen-encoded proteins and their host counterparts. This information will provide valuable insights into the cognitive mechanisms underlying the neuropsychiatric effects of viral infections.
Affiliation(s)
- Pascal Büttiker
- Department of Psychiatry, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
- Amira Boukherissa
- Institute for Integrative Biology of the Cell (I2BC), UMR91918, CNRS, CEA, Paris-Saclay University, Gif-Sur-Yvette, France
- Ecology Systematics Evolution (ESE), CNRS, AgroParisTech, Paris-Saclay University, Orsay, France
- Simon Weissenberger
- Department of Psychology, University of New York in Prague, Prague, Czech Republic
- Radek Ptacek
- Department of Psychiatry, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
- Martin Anders
- Department of Psychiatry, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
- Jiri Raboch
- Department of Psychiatry, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
- George B Stefano
- Department of Psychiatry, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic.

10. Bellinzona G, Sassera D, Bonvin AMJJ. Accelerating protein-protein interaction screens with reduced AlphaFold-Multimer sampling. Bioinform Adv 2024; 4:vbae153. [PMID: 39464748 PMCID: PMC11513016 DOI: 10.1093/bioadv/vbae153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Revised: 09/30/2024] [Accepted: 10/10/2024] [Indexed: 10/29/2024]
Abstract
Motivation: Discovering new protein-protein interactions (PPIs) across entire proteomes offers vast potential for understanding novel protein functions and elucidating system properties within or between organisms. While recent advances in computational structural biology, particularly AlphaFold-Multimer, have facilitated this task, scaling to large screens remains a challenge because it requires significant computational resources. Results: We evaluated the impact of reducing the number of models generated by AlphaFold-Multimer from five to one on the method's ability to distinguish true PPIs from false ones. Our evaluation was conducted on a dataset containing both intra- and inter-species PPIs, which included proteins from bacterial and eukaryotic sources. We demonstrate that reducing the sampling does not compromise the accuracy of the method, offering a faster, more efficient, and more environmentally friendly solution for PPI predictions. Availability and implementation: The code used in this article is available at https://github.com/MIDIfactory/AlphaFastPPi. Note that the same can be achieved using the latest version of AlphaPulldown, available at https://github.com/KosinskiLab/AlphaPulldown.
Affiliation(s)
- Greta Bellinzona
- Department of Biology and Biotechnology, University of Pavia, Pavia 27100, Italy
- Davide Sassera
- Department of Biology and Biotechnology, University of Pavia, Pavia 27100, Italy
- IRCCS Policlinico San Matteo, Pavia 27100, Italy
- Alexandre M J J Bonvin
- Department of Chemistry, Faculty of Science, Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Utrecht 3584 CS, The Netherlands

11. Mirabello C, Wallner B, Nystedt B, Azinas S, Carroni M. Unmasking AlphaFold to integrate experiments and predictions in multimeric complexes. Nat Commun 2024; 15:8724. [PMID: 39379372 PMCID: PMC11461844 DOI: 10.1038/s41467-024-52951-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 09/26/2024] [Indexed: 10/10/2024] Open
Abstract
Since the release of AlphaFold, researchers have actively refined its predictions and attempted to integrate it into existing pipelines for determining protein structures. These efforts have introduced a number of functionalities and optimisations at the latest Critical Assessment of protein Structure Prediction edition (CASP15), resulting in a marked improvement in the prediction of multimeric protein structures. However, AlphaFold's capability of predicting large protein complexes is still limited and integrating experimental data in the prediction pipeline is not straightforward. In this study, we introduce AF_unmasked to overcome these limitations. Our results demonstrate that AF_unmasked can integrate experimental information to build larger or hard to predict protein assemblies with high confidence. The resulting predictions can help interpret and augment experimental data. This approach generates high quality (DockQ score > 0.8) structures even when little to no evolutionary information is available and imperfect experimental structures are used as a starting point. AF_unmasked is developed and optimised to fill incomplete experimental structures (structural inpainting), which may provide insights into protein dynamics. In summary, AF_unmasked provides an easy-to-use method that efficiently integrates experiments to predict large protein complexes more confidently.
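For context on the "DockQ score > 0.8" figure quoted in this abstract, the sketch below maps a DockQ value to the quality classes commonly used with that score (incorrect below 0.23, acceptable from 0.23, medium from 0.49, high from 0.80). The function name `dockq_class` is hypothetical, and the cutoffs are the conventional ones associated with DockQ rather than anything stated in this abstract, so treat the whole block as an assumption-laden illustration.

```python
def dockq_class(score):
    """Map a DockQ score in [0, 1] to a conventional quality class.

    Cutoffs follow the commonly cited DockQ convention (an assumption here):
    incorrect < 0.23 <= acceptable < 0.49 <= medium < 0.80 <= high.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("DockQ scores are expected to lie in [0, 1]")
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


if __name__ == "__main__":
    for value in (0.10, 0.30, 0.60, 0.85):
        print(value, dockq_class(value))
```

Under this convention, the structures the abstract describes as high quality fall in the top class.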
Affiliation(s)
- Claudio Mirabello
- Dept of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Linköping University, 581 83, Linköping, Sweden.
| | - Björn Wallner
- Dept of Physics, Chemistry and Biology, Linköping University, 581 83, Linköping, Sweden
| | - Björn Nystedt
- Dept of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Husargatan 3, SE-752 37, Uppsala, Sweden
| | - Stavros Azinas
- Dept of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
| | - Marta Carroni
- Dept of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
| |
|
12
|
Träger TK, Tüting C, Kastritis PL. The human touch: Utilizing AlphaFold 3 to analyze structures of endogenous metabolons. Structure 2024; 32:1555-1562. [PMID: 39303718 DOI: 10.1016/j.str.2024.08.018] [Received: 05/15/2024] [Revised: 07/10/2024] [Accepted: 08/26/2024] [Indexed: 09/22/2024]
Abstract
Computational structural biology aims to accurately predict biomolecular complexes with AlphaFold 3 spearheading the field. However, challenges loom for structural analysis, especially when complex assemblies such as the pyruvate dehydrogenase complex (PDHc), which catalyzes the link reaction in cellular respiration, are studied. PDHc subcomplexes are challenging to predict, particularly interactions involving weaker, lower-affinity subcomplexes. Supervised modeling, i.e., integrative structural biology, will continue to play a role in fine-tuning this type of prediction (e.g., removing clashes, rebuilding loops/disordered regions, and redocking interfaces). 3D analysis of endogenous metabolic complexes continues to require, in addition to AI, precise and multi-faceted interrogation methods.
Affiliation(s)
- Toni K Träger
- Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Straße 3, 06120 Halle/Saale, Germany; Biozentrum, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle/Saale, Germany
| | - Christian Tüting
- Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Straße 3, 06120 Halle/Saale, Germany; Biozentrum, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle/Saale, Germany
| | - Panagiotis L Kastritis
- Institute of Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Straße 3, 06120 Halle/Saale, Germany; Biozentrum, Martin Luther University Halle-Wittenberg, Weinbergweg 22, 06120 Halle/Saale, Germany; Institute of Chemical Biology, National Hellenic Research Foundation, 11635 Athens, Greece; Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Straße 3a, 06120 Halle/Saale, Germany.
| |
|
13
|
Rosignoli S, Pacelli M, Manganiello F, Paiardini A. An outlook on structural biology after AlphaFold: tools, limits and perspectives. FEBS Open Bio 2024. [PMID: 39313455 DOI: 10.1002/2211-5463.13902] [Received: 03/13/2024] [Revised: 08/19/2024] [Accepted: 09/13/2024] [Indexed: 09/25/2024]
Abstract
AlphaFold and similar groundbreaking AI-based tools have revolutionized the field of structural bioinformatics with their remarkable accuracy in ab initio protein structure prediction. This success has catalyzed the development of new software and pipelines aimed at incorporating AlphaFold's predictions, often focusing on addressing the algorithm's remaining challenges. Here, we present the current landscape of structural bioinformatics shaped by AlphaFold, and discuss how the field is dynamically responding to this revolution with new software, methods, and pipelines. While the excitement around AI-based tools has led to their widespread application, it is essential to acknowledge that their practical success hinges on their integration into established protocols within structural bioinformatics, which are often neglected in the context of AI-driven advancements. Indeed, user-driven intervention is still as pivotal in the structure prediction process as in complementing state-of-the-art algorithms with functional and biological knowledge.
Affiliation(s)
- Serena Rosignoli
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
| | - Maddalena Pacelli
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
| | - Francesca Manganiello
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
| | - Alessandro Paiardini
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
| |
|
14
|
Benavides TL, Montelione GT. Integrative Modeling of Protein-Polypeptide Complexes by Bayesian Model Selection using AlphaFold and NMR Chemical Shift Perturbation Data. bioRxiv 2024:2024.09.19.613999. [PMID: 39345459 PMCID: PMC11430059 DOI: 10.1101/2024.09.19.613999] [Indexed: 10/01/2024]
Abstract
Protein-polypeptide interactions, including those involving intrinsically disordered peptides and intrinsically disordered regions of protein binding partners, are crucial for many biological functions. However, experimental structure determination of protein-peptide complexes can be challenging. Computational methods, while promising, generally require experimental data for validation and refinement. Here we present CSP_Rank, an integrated modeling approach to determine the structures of protein-peptide complexes. This method combines AlphaFold2 (AF2) enhanced sampling methods with a Bayesian conformational selection process based on experimental Nuclear Magnetic Resonance (NMR) Chemical Shift Perturbation (CSP) data and AF2 confidence metrics. Using a curated dataset of 108 protein-peptide complexes from the Biological Magnetic Resonance Data Bank (BMRB), we observe that while AF2 typically yields models with excellent consistency with experimental CSP data, applying enhanced sampling followed by data-guided conformational selection routinely results in ensembles of structures with improved agreement with NMR observables. For two systems, we cross-validate the CSP-selected models using independently acquired nuclear Overhauser effect (NOE) NMR data and demonstrate how CSP and NOE data can be combined using our Bayesian framework for model selection. CSP_Rank is a novel method for integrative modeling of protein-peptide complexes and has broad implications for studies of protein-peptide interactions and for understanding their biological functions.
Affiliation(s)
- Tiburon L. Benavides
- Department of Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Gaetano T. Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| |
|
15
|
Liu J, Guo Z, You H, Zhang C, Lai L. All-Atom Protein Sequence Design Based on Geometric Deep Learning. Angew Chem Int Ed Engl 2024:e202411461. [PMID: 39295564 DOI: 10.1002/anie.202411461] [Received: 06/18/2024] [Revised: 09/09/2024] [Accepted: 09/18/2024] [Indexed: 09/21/2024]
Abstract
Designing sequences for specific protein backbones is a key step in creating new functional proteins. Here, we introduce GeoSeqBuilder, a deep learning framework that integrates protein sequence generation with side chain conformation prediction to produce the complete all-atom structures for designed sequences. GeoSeqBuilder uses spatial geometric features from protein backbones and explicitly includes three-body interactions of neighboring residues. GeoSeqBuilder achieves a native residue type recovery rate of 51.6%, comparable to ProteinMPNN and other leading methods, while accurately predicting side chain conformations. We first used GeoSeqBuilder to design sequences for thioredoxin and a hallucinated three-helical bundle protein. All 15 tested sequences expressed as soluble monomeric proteins with high thermal stability, and the 2 high-resolution crystal structures solved closely match the designed models. The generated protein sequences exhibit low similarity (minimum 23%) to the original sequences, with significantly altered hydrophobic cores. We further redesigned the hydrophobic core of glutathione peroxidase 4, and 3 of the 5 designs showed improved enzyme activity. Although further testing is needed, the high experimental success rate in our testing demonstrates that GeoSeqBuilder is a powerful tool for designing novel sequences for predefined protein structures with atomic details. GeoSeqBuilder is available at https://github.com/PKUliujl/GeoSeqBuilder.
Affiliation(s)
- Jiale Liu
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
| | - Zheng Guo
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
| | - Hantian You
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
| | - Changsheng Zhang
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
| | - Luhua Lai
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Chengdu Academy for Advanced Interdisciplinary Biotechnologies, Peking University, Chengdu, 510100, Sichuan, China
| |
|
16
|
Stahl K, Warneke R, Demann L, Bremenkamp R, Hormes B, Brock O, Stülke J, Rappsilber J. Modelling protein complexes with crosslinking mass spectrometry and deep learning. Nat Commun 2024; 15:7866. [PMID: 39251624 PMCID: PMC11383924 DOI: 10.1038/s41467-024-51771-2] [Received: 09/08/2023] [Accepted: 08/16/2024] [Indexed: 09/11/2024]
Abstract
Scarcity of structural and evolutionary information on protein complexes poses a challenge to deep learning-based structure modelling. We integrate experimental distance restraints obtained by crosslinking mass spectrometry (MS) into AlphaFold-Multimer, by extending AlphaLink to protein complexes. Integrating crosslinking MS data substantially improves modelling performance on challenging targets, by helping to identify interfaces, focusing sampling, and improving model selection. This extends to single crosslinks from whole-cell crosslinking MS, opening the possibility of whole-cell structural investigations driven by experimental data. We demonstrate this by revealing the molecular basis of iron homoeostasis in Bacillus subtilis.
Affiliation(s)
- Kolja Stahl
- Technische Universität Berlin, Chair of Bioanalytics, Berlin, Germany
| | - Robert Warneke
- Georg-August-Universität Göttingen, Department of General Microbiology, Institute for Microbiology & Genetics, GZMB, Göttingen, Germany
| | - Lorenz Demann
- Georg-August-Universität Göttingen, Department of General Microbiology, Institute for Microbiology & Genetics, GZMB, Göttingen, Germany
| | - Rica Bremenkamp
- Georg-August-Universität Göttingen, Department of General Microbiology, Institute for Microbiology & Genetics, GZMB, Göttingen, Germany
| | - Björn Hormes
- Georg-August-Universität Göttingen, Department of General Microbiology, Institute for Microbiology & Genetics, GZMB, Göttingen, Germany
| | - Oliver Brock
- Technische Universität Berlin, Robotics and Biology Laboratory, Berlin, Germany
- Science of Intelligence, Research Cluster of Excellence, Berlin, Germany
| | - Jörg Stülke
- Georg-August-Universität Göttingen, Department of General Microbiology, Institute for Microbiology & Genetics, GZMB, Göttingen, Germany.
| | - Juri Rappsilber
- Technische Universität Berlin, Chair of Bioanalytics, Berlin, Germany.
- Si-M/"Der Simulierte Mensch", a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK.
| |
|
17
|
Xing C, Li G, Zheng X, Li P, Yuan J, Yan W. Characterization of a Novel Monoclonal Antibody with High Affinity and Specificity against Aflatoxins: A Discovery from Rosetta Antibody-Ligand Computational Simulation. J Chem Inf Model 2024; 64:6814-6826. [PMID: 39157865 DOI: 10.1021/acs.jcim.4c00736] [Indexed: 08/20/2024]
Abstract
Aflatoxin B1 (AFB1) accumulates in crops, where it poses a threat to human health. To detect AFB1, anti-AFB1 monoclonal antibodies have been developed and are widely used. While the sensitivity and specificity of these antibodies have been extensively studied, information regarding the atomic-level docking of AFB1 (and its derivatives) with these antibodies is limited. Such information is crucial for understanding the key interactions that are required for high affinity and specificity in aflatoxin binding. First, a 3D comparative model of the anti-AFB1 antibody (Ab-4B5G6) was predicted from the sequence using RosettaAntibody. We then utilized RosettaLigand to dock AFB1 onto ten homology models, producing a total of 10,000 binding modes. Interestingly, the best-scoring mode predicted strong interactions involving four sites within the heavy chain: ALA33, ASN52, HIS95, and TRP99. Importantly, these strong binding interactions exclusively involve the variable domain of the heavy chain. The best-scoring mode with AFB1 was also obtained through AF multimer combined with RosettaLigand, and two interactions at TRP and HIS were consistent with those found by Rosetta antibody-ligand computational simulation. The role of tryptophan in π interactions in antibodies was confirmed through mutation experiments, and the resulting mutant (W99A) exhibited a >1000-fold reduction in binding affinity for AFB1 and analogs, indicating the effect of tryptophan on the stability of the CDR-H3 region. Additionally, we evaluated the binding of two glycolic acid-derived molecular derivatives (with impaired hydrogen bonding potential), and these derivatives (AFB2-GA and AFG2-GA) demonstrated a very weak binding affinity for Ab-4B5G6. The heavy chain was successfully isolated, and its sensitivity and specificity were consistent with those of the intact antibody. The homology models of variable heavy (VH) single-domain antibodies were established by RosettaAntibody, and the docking analysis revealed the same residues, including Ala, His, and Trp. Compared to the potential binding mode of the fragment variable (FV) region, the results from a model of VH indicated that there are seven models involved in hydrophobic interaction with TYR32, which is usually referred to as a polar amino acid and has both hydrophobic and hydrophilic features depending on the circumstances. Our work encompasses the entire process of Rosetta antibody-ligand computational simulation, highlighting the significance of variable heavy domain structural design in enhancing molecular interactions.
Affiliation(s)
- Changrui Xing
- College of Food Science and Engineering, Collaborative Innovation Center for Modern Grain Circulation and Safety, Key Laboratory of Grains and Oils Quality Control and Processing, Nanjing University of Finance and Economics, Nanjing 210023, China
| | - Guanglei Li
- College of Food Science and Engineering, Collaborative Innovation Center for Modern Grain Circulation and Safety, Key Laboratory of Grains and Oils Quality Control and Processing, Nanjing University of Finance and Economics, Nanjing 210023, China
| | - Xin Zheng
- College of Food Science and Engineering, Collaborative Innovation Center for Modern Grain Circulation and Safety, Key Laboratory of Grains and Oils Quality Control and Processing, Nanjing University of Finance and Economics, Nanjing 210023, China
| | - Peng Li
- College of Food Science and Engineering, Collaborative Innovation Center for Modern Grain Circulation and Safety, Key Laboratory of Grains and Oils Quality Control and Processing, Nanjing University of Finance and Economics, Nanjing 210023, China
| | - Jian Yuan
- College of Food Science and Engineering, Collaborative Innovation Center for Modern Grain Circulation and Safety, Key Laboratory of Grains and Oils Quality Control and Processing, Nanjing University of Finance and Economics, Nanjing 210023, China
| | - Wenjing Yan
- National Center of Meat Quality & Safety Control, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
| |
|
18
|
Guzmán-Vega FJ, Arold ST. AlphaCRV: a pipeline for identifying accurate binder topologies in mass-modeling with AlphaFold. Bioinformatics Advances 2024; 4:vbae131. [PMID: 39286602 PMCID: PMC11405088 DOI: 10.1093/bioadv/vbae131] [Received: 02/21/2024] [Revised: 08/05/2024] [Accepted: 09/04/2024] [Indexed: 09/19/2024]
Abstract
Motivation: The speed and accuracy of deep learning-based structure prediction algorithms make it now possible to perform in silico "pull-downs" to identify protein-protein interactions on a proteome-wide scale. However, on such a large scale, existing scoring algorithms are often insufficient to discriminate biologically relevant interactions from false positives. Results: Here, we introduce AlphaCRV, a Python package that helps identify correct interactors in a one-against-many AlphaFold screen by clustering, ranking, and visualizing conserved binding topologies, based on protein sequence and fold. Availability and implementation: AlphaCRV is a Python package for Linux, freely available at https://github.com/strubelab/AlphaCRV.
Affiliation(s)
- Francisco J Guzmán-Vega
- Biological and Environmental Science and Engineering Division, Computational Biology Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Stefan T Arold
- Biological and Environmental Science and Engineering Division, Computational Biology Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
| |
|
19
|
Konold PE, Monrroy L, Bellisario A, Filipe D, Adams P, Alvarez R, Bean R, Bielecki J, Bódizs S, Ducrocq G, Grubmueller H, Kirian RA, Kloos M, Koliyadu JCP, Koua FHM, Larkiala T, Letrun R, Lindsten F, Maihöfer M, Martin AV, Mészáros P, Mutisya J, Nimmrich A, Okamoto K, Round A, Sato T, Valerio J, Westphal D, Wollter A, Yenupuri TV, You T, Maia F, Westenhoff S. Microsecond time-resolved X-ray scattering by utilizing MHz repetition rate at second-generation XFELs. Nat Methods 2024; 21:1608-1611. [PMID: 38969722 PMCID: PMC11399097 DOI: 10.1038/s41592-024-02344-0] [Received: 11/20/2023] [Accepted: 06/10/2024] [Indexed: 07/07/2024]
Abstract
Detecting microsecond structural perturbations in biomolecules has wide relevance in biology, chemistry and medicine. Here we show how MHz repetition rates at X-ray free-electron lasers can be used to produce microsecond time-series of protein scattering with exceptionally low noise levels of 0.001%. We demonstrate the approach by examining Jα helix unfolding of a light-oxygen-voltage photosensory domain. This time-resolved acquisition strategy is easy to implement and widely applicable for direct observation of structural dynamics of many biochemical processes.
Affiliation(s)
- Patrick E Konold
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Leonardo Monrroy
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
| | - Alfredo Bellisario
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Diogo Filipe
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Patrick Adams
- School of Science, STEM College, RMIT University, Melbourne, Victoria, Australia
| | - Roberto Alvarez
- Department of Physics, Arizona State University, Tempe, AZ, USA
| | | | | | - Szabolcs Bódizs
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
| | - Gabriel Ducrocq
- Department of Computer and Information Science (IDA), Linköping University, Linköping, Sweden
- The Division of Statistics and Machine Learning (STIMA), Linköping University, Linköping, Sweden
| | - Helmut Grubmueller
- Department of Computer and Information Science (IDA), Linköping University, Linköping, Sweden
| | | | - Marco Kloos
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
| | - Jayanath C P Koliyadu
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
| | | | - Taru Larkiala
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
| | | | - Fredrik Lindsten
- Department of Computer and Information Science (IDA), Linköping University, Linköping, Sweden
- The Division of Statistics and Machine Learning (STIMA), Linköping University, Linköping, Sweden
| | - Michael Maihöfer
- Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
| | - Andrew V Martin
- School of Science, STEM College, RMIT University, Melbourne, Victoria, Australia
| | - Petra Mészáros
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
| | - Jennifer Mutisya
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
| | - Amke Nimmrich
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden
- Department of Chemistry, University of Washington, Seattle, WA, USA
| | - Kenta Okamoto
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | | | | | | | - Daniel Westphal
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - August Wollter
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Tej Varma Yenupuri
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Tong You
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Filipe Maia
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden.
| | - Sebastian Westenhoff
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden.
- Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
| |
|
20
|
Bryant P, Noé F. Structure prediction of alternative protein conformations. Nat Commun 2024; 15:7328. [PMID: 39187507 PMCID: PMC11347660 DOI: 10.1038/s41467-024-51507-2] [Received: 10/11/2023] [Accepted: 08/07/2024] [Indexed: 08/28/2024]
Abstract
Proteins are dynamic molecules whose movements result in different conformations with different functions. Neural networks such as AlphaFold2 can predict the structure of single-chain proteins with conformations most likely to exist in the PDB. However, almost all protein structures with multiple conformations represented in the PDB have been used while training these models. Therefore, it is unclear whether alternative protein conformations can be genuinely predicted using these networks, or if they are simply reproduced from memory. Here, we train a structure prediction network, Cfold, on a conformational split of the PDB to generate alternative conformations. Cfold enables efficient exploration of the conformational landscape of monomeric protein structures. Over 50% of experimentally known nonredundant alternative protein conformations evaluated here are predicted with high accuracy (TM-score > 0.8).
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany.
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrhenius väg 20C, 114 18, Stockholm, Sweden.
- Science for Life Laboratory, 172 21, Solna, Sweden.
| | - Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
| |
|
21
|
Edmunds NS, Genc AG, McGuffin LJ. Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes. Bioinformatics 2024; 40:btae491. [PMID: 39115813 PMCID: PMC11322044 DOI: 10.1093/bioinformatics/btae491] [Received: 05/25/2024] [Revised: 07/09/2024] [Indexed: 08/15/2024]
Abstract
Motivation: Despite an increase in protein modelling accuracy following the development of AlphaFold2, there remains an accuracy gap between predicted and observed model quality assessment (MQA) scores. In CASP15, variations in AlphaFold2 model accuracy prediction were noticed for quaternary models of very similar observed quality. In this study, we compare plDDT and pTM to their observed counterparts, the local distance difference test (lDDT) and TM-score, for both tertiary and quaternary models to examine whether reliability is retained across the scoring range under normal modelling conditions and in situations where AlphaFold2 functionality is customized. We also explore plDDT and pTM ranking accuracy in comparison with the published independent MQA programmes ModFOLD9 and ModFOLDdock. Results: plDDT was found to be an accurate descriptor of tertiary model quality compared to observed lDDT-Cα scores (Pearson r = 0.97), and achieved a ranking agreement true positive rate (TPR) of 0.34 with observed scores, which ModFOLD9 could not improve. However, quaternary structure accuracy was reduced (plDDT r = 0.67, pTM r = 0.70) and significant overprediction was seen with both scores for some lower quality models. Additionally, ModFOLDdock was able to improve upon AF2-Multimer model ranking compared to TM-score (TPR 0.34) and oligo-lDDT score (TPR 0.43). Finally, evidence is presented for increased variability in plDDT and pTM when using custom template recycling, which is more pronounced for quaternary structures. Availability and implementation: The ModFOLD9 and ModFOLDdock quality assessment servers are available at https://www.reading.ac.uk/bioinf/ModFOLD/ and https://www.reading.ac.uk/bioinf/ModFOLDdock/, respectively. A docker image is available at https://hub.docker.com/r/mcguffin/multifold.
Affiliation(s)
- Nicholas S Edmunds
- School of Biological Sciences, University of Reading, Whiteknights, Reading, RG6 6EX, United Kingdom
| | - Ahmet G Genc
- School of Biological Sciences, University of Reading, Whiteknights, Reading, RG6 6EX, United Kingdom
| | - Liam J McGuffin
- School of Biological Sciences, University of Reading, Whiteknights, Reading, RG6 6EX, United Kingdom
| |
|
22
|
Agarwal V, McShan AC. The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins. Nat Chem Biol 2024; 20:950-959. [PMID: 38907110 DOI: 10.1038/s41589-024-01638-w] [Received: 05/22/2023] [Accepted: 04/29/2024] [Indexed: 06/23/2024]
Abstract
Artificial intelligence-driven advances in protein structure prediction in recent years have raised the question: has the protein structure-prediction problem been solved? Here, with a focus on nonglobular proteins, we highlight the many strengths and potential weaknesses of DeepMind's AlphaFold2 in the context of its biological and therapeutic applications. We summarize the subtleties associated with evaluation of AlphaFold2 model quality and reliability using the predicted local distance difference test (pLDDT) and predicted aligned error (PAE) values. We highlight various classes of proteins that AlphaFold2 can be applied to and the caveats involved. Concrete examples of how AlphaFold2 models can be integrated with experimental data in the form of small-angle X-ray scattering (SAXS), solution NMR, cryo-electron microscopy (cryo-EM) and X-ray diffraction are discussed. Finally, we highlight the need to move beyond structure prediction of rigid, static structural snapshots toward conformational ensembles and alternate biologically relevant states. The overarching theme is that careful consideration is due when using AlphaFold2-generated models to generate testable hypotheses and structural models, rather than treating predicted models as de facto ground truth structures.
Affiliation(s)
- Vinayak Agarwal
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
| | - Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
| |
|
23
|
Shor B, Schneidman-Duhovny D. Integrative modeling meets deep learning: Recent advances in modeling protein assemblies. Curr Opin Struct Biol 2024; 87:102841. [PMID: 38795564 DOI: 10.1016/j.sbi.2024.102841] [Received: 02/29/2024] [Revised: 04/24/2024] [Accepted: 04/27/2024] [Indexed: 05/28/2024]
Abstract
Recent progress in protein structure prediction based on deep learning revolutionized the field of Structural Biology. Beyond single proteins, it also enabled high-throughput prediction of structures of protein-protein interactions. Despite the success in predicting complex structures, large macromolecular assemblies still require specialized approaches. Here we describe recent advances in modeling macromolecular assemblies using integrative and hierarchical approaches. We highlight applications that predict protein-protein interactions and challenges in modeling complexes based on the interaction networks, including the prediction of complex stoichiometry and heterogeneity.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel. https://twitter.com/ben_shor
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
|
24
|
Swapna GVT, Dube N, Roth MJ, Montelione GT. Modeling Alternative Conformational States of Pseudo-Symmetric Solute Carrier Transporters using Methods from Machine Learning. bioRxiv 2024:2024.07.15.603529. [PMID: 39071413 PMCID: PMC11275918 DOI: 10.1101/2024.07.15.603529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
The Solute Carrier (SLC) superfamily of integral membrane proteins functions to transport a wide array of solutes across the plasma and organelle membranes. SLC proteins also function as important drug transporters and as viral receptors. Despite being classified as a single superfamily, SLC proteins do not share a single common fold classification; however, most belong to multi-pass transmembrane helical protein fold families. SLC proteins populate different conformational states during the solute transport process, including outward open, intermediate (occluded), and inward open conformational states. For some SLC fold families this structural "flipping" corresponds to swapping between conformations of their N-terminal and C-terminal symmetry-related sub-structures. Conventional AlphaFold2 or Evolutionary Scale Modeling methods typically generate models for only one of these multiple conformational states of SLC proteins. Here we describe a fast and simple approach for modeling multiple conformational states of SLC proteins using a combined ESM-AF2 process. The resulting multi-state models are validated by comparison with sequence-based evolutionary co-variance data (ECs) that encode information about contacts present in the various conformational states adopted by the protein. We also explored the impact of mutations on conformational distributions of SLC proteins modeled by AlphaFold2 using both conventional and enhanced sampling methods. This approach for modeling conformational landscapes of pseudo-symmetric SLC proteins is demonstrated for several integral membrane protein transporters, including SLC35F2, the receptor of a feline leukemia virus envelope protein required for viral entry into eukaryotic cells.
Affiliation(s)
- G V T Swapna
- Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway NJ 08854 USA
- Namita Dube
- Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Monica J Roth
- Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway NJ 08854 USA
- Gaetano T Montelione
- Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
|
25
|
Bryant P, Noé F. Improved protein complex prediction with AlphaFold-multimer by denoising the MSA profile. PLoS Comput Biol 2024; 20:e1012253. [PMID: 39052676 DOI: 10.1371/journal.pcbi.1012253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 08/06/2024] [Accepted: 06/14/2024] [Indexed: 07/27/2024] Open
Abstract
Structure prediction of protein complexes has improved significantly with AlphaFold2 and AlphaFold-multimer (AFM), but only 60% of dimers are accurately predicted. Here, we learn a bias to the MSA representation that improves the predictions by performing gradient descent through the AFM network. We demonstrate the performance on seven difficult targets from CASP15 and increase the average MMscore to 0.76 compared to 0.63 with AFM. We evaluate the procedure on 487 protein complexes where AFM fails and obtain an increased success rate (MMscore>0.75) of 33% on these difficult targets. Our protocol, AFProfile, provides a way to direct predictions towards a defined target function guided by the MSA. We expect gradient descent over the MSA to be useful for different tasks.
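The success-rate criterion quoted in this abstract (MMscore > 0.75) can be illustrated with a minimal Python sketch. The function name `success_rate` and the example scores are hypothetical and purely illustrative; they are not data or code from the paper.

```python
def success_rate(scores, threshold=0.75):
    """Return the fraction of scores strictly above the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(s > threshold for s in scores) / len(scores)


# Made-up per-target scores, for illustration only.
example_scores = [0.81, 0.42, 0.77, 0.60]
rate = success_rate(example_scores)
print(f"{rate:.0%} of targets exceed the threshold")
```

The comparison is strict, so a target scoring exactly 0.75 would not count as a success under this reading of the criterion.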
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, Solna, Sweden
- Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Germany
- Microsoft Research AI4Science, Berlin, Germany
|
26
|
McLean TC. LazyAF, a pipeline for accessible medium-scale in silico prediction of protein-protein interactions. Microbiology (Reading, England) 2024; 170:001473. [PMID: 38967642 PMCID: PMC11316561 DOI: 10.1099/mic.0.001473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 06/14/2024] [Indexed: 07/06/2024]
Abstract
Artificial intelligence has revolutionized the field of protein structure prediction. However, with more powerful and complex software being developed, it is accessibility and ease of use rather than capability that is quickly becoming a limiting factor to end users. LazyAF is a Google Colaboratory-based pipeline which integrates the existing ColabFold BATCH software to streamline the process of medium-scale protein-protein interaction prediction. LazyAF was used to predict the interactome of the 76 proteins encoded on the broad-host-range multi-drug resistance plasmid RK2, demonstrating the ease and accessibility the pipeline provides.
Affiliation(s)
- Thomas C. McLean
- Department of Molecular Microbiology, John Innes Centre, Norwich, UK
|
27
|
Huang YJ, Montelione GT. Hidden Structural States of Proteins Revealed by Conformer Selection with AlphaFold-NMR. bioRxiv 2024:2024.06.26.600902. [PMID: 38979209 PMCID: PMC11230435 DOI: 10.1101/2024.06.26.600902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Recent advances in molecular modeling using deep learning can revolutionize our understanding of dynamic protein structures. NMR is particularly well-suited for determining dynamic features of biomolecular structures. The conventional process for determining biomolecular structures from experimental NMR data involves its representation as conformation-dependent restraints, followed by generation of structural models guided by these spatial restraints. Here we describe an alternative approach: generating a distribution of realistic protein conformational models using artificial intelligence (AI)-based methods and then selecting the sets of conformers that best explain the experimental data. We applied this conformational selection approach to redetermine the solution NMR structure of the enzyme Gaussia luciferase (Gluc). First, we generated a diverse set of conformer models using AlphaFold2 (AF2) with an enhanced sampling protocol. The models that best fit NOESY and chemical shift data were then selected with a Bayesian scoring metric. The resulting models include features of both the published NMR structure and the standard AF2 model generated without enhanced sampling. This "AlphaFold-NMR" protocol also generated an alternative "open" conformational state that fits nearly as well to the overall NMR data but accounts for some NOESY data that is not consistent with the first "closed" conformational state, while other NOESY data consistent with this second state are not consistent with the first conformational state. The structure of this "open" structural state differs from that of the "closed" state primarily by the position of a thumb-shaped loop between α-helices H5 and H6, revealing a cryptic surface pocket. These alternative conformational states of Gluc are supported by "double recall" analysis of NOESY data and AF2 models. Additional structural states are also indicated by backbone chemical shift data indicating partially disordered conformations for the C-terminal segment. Considered as a multistate ensemble, these multiple states of Gluc together fit the NOESY and chemical shift data better than the "restraint-based" NMR structure and provide novel insights into its structure-dynamic-function relationships. This study demonstrates the potential of AI-based modeling with enhanced sampling to generate conformational ensembles followed by conformer selection with experimental data as an alternative to conventional restraint satisfaction protocols for protein NMR structure determination.
Affiliation(s)
- Yuanpeng J. Huang
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Gaetano T. Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
|
28
|
El Salamouni NS, Cater JH, Spenkelink LM, Yu H. Nanobody engineering: computational modelling and design for biomedical and therapeutic applications. FEBS Open Bio 2024. [PMID: 38898362 DOI: 10.1002/2211-5463.13850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 05/25/2024] [Accepted: 06/10/2024] [Indexed: 06/21/2024] Open
Abstract
Nanobodies, the smallest functional antibody fragment derived from camelid heavy-chain-only antibodies, have emerged as powerful tools for diverse biomedical applications. In this comprehensive review, we discuss the structural characteristics, functional properties, and computational approaches driving the design and optimisation of synthetic nanobodies. We explore their unique antigen-binding domains, highlighting the critical role of complementarity-determining regions in target recognition and specificity. This review further underscores the advantages of nanobodies over conventional antibodies from a biosynthesis perspective, including their small size, stability, and solubility, which make them ideal candidates for economical antigen capture in diagnostics, therapeutics, and biosensing. We discuss the recent advancements in computational methods for nanobody modelling, epitope prediction, and affinity maturation, shedding light on their intricate antigen-binding mechanisms and conformational dynamics. Finally, we examine a direct example of how computational design strategies were implemented for improving a nanobody-based immunosensor, known as a Quenchbody. Through combining experimental findings and computational insights, this review elucidates the transformative impact of nanobodies in biotechnology and biomedical research, offering a roadmap for future advancements and applications in healthcare and diagnostics.
Affiliation(s)
- Nehad S El Salamouni
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Jordan H Cater
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Lisanne M Spenkelink
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Haibo Yu
- Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- ARC Centre of Excellence in Quantum Biotechnology, University of Wollongong, Australia
|
29
|
Urvas L, Chiesa L, Bret G, Jacquemard C, Kellenberger E. Benchmarking AlphaFold-Generated Structures of Chemokine-Chemokine Receptor Complexes. J Chem Inf Model 2024; 64:4587-4600. [PMID: 38809680 DOI: 10.1021/acs.jcim.3c01835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
AlphaFold and AlphaFold-Multimer have become two essential tools for the modeling of unknown structures of proteins and protein complexes. In this work, we extensively benchmarked the quality of chemokine-chemokine receptor structures generated by AlphaFold-Multimer against experimentally determined structures. Our analysis considered both the global quality of the model, as well as key structural features for chemokine recognition. To study the effects of template and multiple sequence alignment parameters on the results, a new prediction pipeline called LIT-AlphaFold (https://github.com/LIT-CCM-lab/LIT-AlphaFold) was developed, allowing extensive input customization. AlphaFold-Multimer correctly predicted differences in chemokine binding orientation and accurately reproduced the unique binding orientation of the CXCL12-ACKR3 complex. Further, the predictions of the full receptor N-terminus provided insights into a putative chemokine recognition site 0.5. The accuracy of chemokine N-terminus binding mode prediction varied between complexes, but the confidence score permitted the distinguishing of residues that were very likely well positioned. Finally, we generated a high-confidence model of the unsolved CXCL12-CXCR4 complex, which agreed with experimental mutagenesis and cross-linking data.
Affiliation(s)
- Lauri Urvas
- Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Luca Chiesa
- Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Guillaume Bret
- Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Célien Jacquemard
- Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Esther Kellenberger
- Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
|
30
|
Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, Bodenstein SW, Evans DA, Hung CC, O'Neill M, Reiman D, Tunyasuvunakool K, Wu Z, Žemgulytė A, Arvaniti E, Beattie C, Bertolli O, Bridgland A, Cherepanov A, Congreve M, Cowen-Rivers AI, Cowie A, Figurnov M, Fuchs FB, Gladman H, Jain R, Khan YA, Low CMR, Perlin K, Potapenko A, Savy P, Singh S, Stecula A, Thillaisundaram A, Tong C, Yakneen S, Zhong ED, Zielinski M, Žídek A, Bapst V, Kohli P, Jaderberg M, Hassabis D, Jumper JM. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 2024; 630:493-500. [PMID: 38718835 PMCID: PMC11168924 DOI: 10.1038/s41586-024-07487-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 12/19/2023] [Accepted: 04/29/2024] [Indexed: 06/13/2024]
Abstract
The introduction of AlphaFold 2¹ has spurred a revolution in modelling the structure of proteins and their interactions, enabling a huge range of applications in protein modelling and design²⁻⁶. Here we describe our AlphaFold 3 model with a substantially updated diffusion-based architecture that is capable of predicting the joint structure of complexes including proteins, nucleic acids, small molecules, ions and modified residues. The new AlphaFold model demonstrates substantially improved accuracy over many previous specialized tools: far greater accuracy for protein-ligand interactions compared with state-of-the-art docking tools, much higher accuracy for protein-nucleic acid interactions compared with nucleic-acid-specific predictors and substantially higher antibody-antigen prediction accuracy compared with AlphaFold-Multimer v.2.3⁷,⁸. Together, these results show that high-accuracy modelling across biomolecular space is possible within a single unified deep-learning framework.
Affiliation(s)
- Jonas Adler
- Core Contributor, Google DeepMind, London, UK
- Jack Dunger
- Core Contributor, Google DeepMind, London, UK
- Tim Green
- Core Contributor, Google DeepMind, London, UK
- Zachary Wu
- Core Contributor, Google DeepMind, London, UK
- Yousuf A Khan
- Google DeepMind, London, UK
- Department of Molecular and Cellular Physiology, Stanford University, Stanford, CA, USA
- Ellen D Zhong
- Google DeepMind, London, UK
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Demis Hassabis
- Core Contributor, Google DeepMind, London, UK.
- Core Contributor, Isomorphic Labs, London, UK.
|
31
|
Wu D, Yin R, Chen G, Ribeiro-Filho HV, Cheung M, Robbins PF, Mariuzza RA, Pierce BG. Structural characterization and AlphaFold modeling of human T cell receptor recognition of NRAS cancer neoantigens. bioRxiv 2024:2024.05.21.595215. [PMID: 38826362 PMCID: PMC11142219 DOI: 10.1101/2024.05.21.595215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
T cell receptors (TCRs) that recognize cancer neoantigens are important for anti-cancer immune responses and immunotherapy. Understanding the structural basis of TCR recognition of neoantigens provides insights into their exquisite specificity and can enable design of optimized TCRs. We determined crystal structures of a human TCR in complex with NRAS Q61K and Q61R neoantigen peptides and HLA-A1 MHC, revealing the molecular underpinnings for dual recognition and specificity versus wild-type NRAS peptide. We then used multiple versions of AlphaFold to model the corresponding complex structures, given the challenge of immune recognition for such methods. Interestingly, one implementation of AlphaFold2 (TCRmodel2) was able to generate accurate models of the complexes, while AlphaFold3 also showed strong performance, although success was lower for other complexes. This study provides insights into TCR recognition of a shared cancer neoantigen, as well as the utility and practical considerations for using AlphaFold to model TCR-peptide-MHC complexes.
Affiliation(s)
- Daichao Wu
- Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Rui Yin
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Guodong Chen
- Department of Hepatopancreatobiliary Surgery, The First Affiliated Hospital, Laboratory of Structural Immunology, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Helder V. Ribeiro-Filho
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Melyssa Cheung
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
- Paul F. Robbins
- Surgery Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
- Roy A. Mariuzza
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Brian G. Pierce
- W.M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
|
32
|
Sonmez C, Toia B, Eickhoff P, Matei AM, El Beyrouthy M, Wallner B, Douglas ME, de Lange T, Lottersberger F. DNA-PK controls Apollo's access to leading-end telomeres. Nucleic Acids Res 2024; 52:4313-4327. [PMID: 38407308 PMCID: PMC11077071 DOI: 10.1093/nar/gkae105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 01/23/2024] [Accepted: 02/01/2024] [Indexed: 02/27/2024] Open
Abstract
The complex formed by Ku70/80 and DNA-PKcs (DNA-PK) promotes the synapsis and the joining of double strand breaks (DSBs) during canonical non-homologous end joining (c-NHEJ). In c-NHEJ during V(D)J recombination, DNA-PK promotes the processing of the ends and the opening of the DNA hairpins by recruiting and/or activating the nuclease Artemis/DCLRE1C/SNM1C. Paradoxically, DNA-PK is also required to prevent the fusions of newly replicated leading-end telomeres. Here, we describe the role for DNA-PK in controlling Apollo/DCLRE1B/SNM1B, the nuclease that resects leading-end telomeres. We show that the telomeric function of Apollo requires DNA-PKcs's kinase activity and the binding of Apollo to DNA-PK. Furthermore, AlphaFold-Multimer predicts that Apollo's nuclease domain has extensive additional interactions with DNA-PKcs, and comparison to the cryo-EM structure of Artemis bound to DNA-PK phosphorylated on the ABCDE/Thr2609 cluster suggests that DNA-PK can similarly grant Apollo access to the DNA end. In agreement, the telomeric function of DNA-PK requires the ABCDE/Thr2609 cluster. These data reveal that resection of leading-end telomeres is regulated by DNA-PK through its binding to Apollo and its (auto)phosphorylation-dependent positioning of Apollo at the DNA end, analogous but not identical to DNA-PK dependent regulation of Artemis at hairpins.
Affiliation(s)
- Ceylan Sonmez
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Beatrice Toia
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Patrik Eickhoff
- Chester Beatty Laboratories, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
- Andreea Medeea Matei
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Michael El Beyrouthy
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Björn Wallner
- Department of Physics, Chemistry and Biology, Linköping University, Linköping 58 183, Sweden
- Max E Douglas
- Chester Beatty Laboratories, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
- Titia de Lange
- Laboratory for Cell Biology and Genetics, The Rockefeller University, 1230 York Avenue, NY, NY 10021, USA
- Francisca Lottersberger
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
|
33
|
Ellaway JIJ, Anyango S, Nair S, Zaki HA, Nadzirin N, Powell HR, Gutmanas A, Varadi M, Velankar S. Identifying protein conformational states in the Protein Data Bank: Toward unlocking the potential of integrative dynamics studies. Structural Dynamics (Melville, N.Y.) 2024; 11:034701. [PMID: 38774441 PMCID: PMC11106648 DOI: 10.1063/4.0000251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024]
Abstract
Studying protein dynamics and conformational heterogeneity is crucial for understanding biomolecular systems and treating disease. Despite the deposition of over 215 000 macromolecular structures in the Protein Data Bank and the advent of AI-based structure prediction tools such as AlphaFold2, RoseTTAFold, and ESMFold, static representations are typically produced, which fail to fully capture macromolecular motion. Here, we discuss the importance of integrating experimental structures with computational clustering to explore the conformational landscapes that manifest protein function. We describe the method developed by the Protein Data Bank in Europe - Knowledge Base to identify distinct conformational states, demonstrate the resource's primary use cases, through examples, and discuss the need for further efforts to annotate protein conformations with functional information. Such initiatives will be crucial in unlocking the potential of protein dynamics data, expediting drug discovery research, and deepening our understanding of macromolecular mechanisms.
Affiliation(s)
- Joseph I. J. Ellaway
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Stephen Anyango
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Sreenath Nair
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Hossam A. Zaki
- The Warren Alpert Medical School of Brown University, Providence, Rhode Island 02903, USA
- Nurul Nadzirin
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Harold R. Powell
- Imperial College London, Department of Life Sciences, London, United Kingdom
- Aleksandras Gutmanas
- WaveBreak Therapeutics Ltd., Clarendon House, Clarendon Road, Cambridge, United Kingdom
- Mihaly Varadi
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Sameer Velankar
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
|
34
|
Schmid EW, Walter JC. Predictomes: A classifier-curated database of AlphaFold-modeled protein-protein interactions. bioRxiv 2024:2024.04.09.588596. [PMID: 38645019 PMCID: PMC11030396 DOI: 10.1101/2024.04.09.588596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Protein-protein interactions (PPIs) are ubiquitous in biology, yet a comprehensive structural characterization of the PPIs underlying biochemical processes is lacking. Although AlphaFold-Multimer (AF-M) has the potential to fill this knowledge gap, standard AF-M confidence metrics do not reliably separate relevant PPIs from an abundance of false positive predictions. To address this limitation, we used machine learning on well curated datasets to train a Structure Prediction and Omics informed Classifier called SPOC that shows excellent performance in separating true and false PPIs, including in proteome-wide screens. We applied SPOC to an all-by-all matrix of nearly 300 human genome maintenance proteins, generating ~40,000 predictions that can be viewed at predictomes.org, where users can also score their own predictions with SPOC. High confidence PPIs discovered using our approach suggest novel hypotheses in genome maintenance. Our results provide a framework for interpreting large scale AF-M screens and help lay the foundation for a proteome-wide structural interactome.
Affiliation(s)
- Ernst W. Schmid
- Department of Biological Chemistry & Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Johannes C. Walter
- Department of Biological Chemistry & Molecular Pharmacology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
- Howard Hughes Medical Institute, Boston, MA 02115, USA
|
35
|
Monteiro da Silva G, Cui JY, Dalgarno DC, Lisi GP, Rubenstein BM. High-throughput prediction of protein conformational distributions with subsampled AlphaFold2. Nat Commun 2024; 15:2464. [PMID: 38538622 PMCID: PMC10973385 DOI: 10.1038/s41467-024-46715-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 02/28/2024] [Indexed: 04/12/2024] Open
Abstract
This paper presents an innovative approach for predicting the relative populations of protein conformations using AlphaFold 2, an AI-powered method that has revolutionized biology by enabling the accurate prediction of protein structures. While AlphaFold 2 has shown exceptional accuracy and speed, it is designed to predict proteins' ground state conformations and is limited in its ability to predict conformational landscapes. Here, we demonstrate how AlphaFold 2 can directly predict the relative populations of different protein conformations by subsampling multiple sequence alignments. We tested our method against nuclear magnetic resonance experiments on two proteins with drastically different amounts of available sequence data, Abl1 kinase and the granulocyte-macrophage colony-stimulating factor, and predicted changes in their relative state populations with more than 80% accuracy. Our subsampling approach worked best when used to qualitatively predict the effects of mutations or evolution on the conformational landscape and well-populated states of proteins. It thus offers a fast and cost-effective way to predict the relative populations of protein conformations at even single-point mutation resolution, making it a useful tool for pharmacology, analysis of experimental results, and predicting evolution.
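To make the phrase "subsampling multiple sequence alignments" concrete, here is a minimal Python sketch of one generic way to draw a shallower alignment: keep the query row and pick a random subset of the remaining rows. The abstract does not describe the authors' actual procedure or parameters, so the function name `subsample_msa`, its arguments, and the toy alignment are all assumptions made for illustration. The structure-prediction step that would consume each subsample is not shown.

```python
import random


def subsample_msa(msa, n_keep, seed=None):
    """Return the query (first row) plus up to n_keep randomly chosen other rows.

    msa is a list of aligned sequences (strings); the first entry is treated
    as the query and is always kept. Original row order is preserved.
    """
    if not msa:
        raise ValueError("msa must contain at least the query sequence")
    if n_keep < 0:
        raise ValueError("n_keep must be non-negative")
    query, others = msa[0], msa[1:]
    rng = random.Random(seed)
    k = min(n_keep, len(others))
    picked = sorted(rng.sample(range(len(others)), k))
    return [query] + [others[i] for i in picked]


# Made-up toy alignment, for illustration only.
toy_msa = ["MKTAYIA", "MKTAYLA", "MRTAYIA", "MKSAYIA", "MKTGYIA"]
shallow = subsample_msa(toy_msa, n_keep=2, seed=0)
print(len(shallow), "sequences kept; query:", shallow[0])
```

Repeating the draw with different seeds gives a set of distinct shallow alignments; how those relate to predicted state populations is specific to the paper and is not modelled here.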
Affiliation(s)
- Jennifer Y Cui
- Brown University Department of Molecular and Cell Biology and Biochemistry, Providence, RI, USA
- George P Lisi
- Brown University Department of Molecular and Cell Biology and Biochemistry, Providence, RI, USA
- Brown University Department of Chemistry, Providence, RI, USA
- Brenda M Rubenstein
- Brown University Department of Molecular and Cell Biology and Biochemistry, Providence, RI, USA.
- Brown University Department of Chemistry, Providence, RI, USA.
|
36
|
Jänes J, Beltrao P. Deep learning for protein structure prediction and design-progress and applications. Mol Syst Biol 2024; 20:162-169. [PMID: 38291232 PMCID: PMC10912668 DOI: 10.1038/s44320-024-00016-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 12/21/2023] [Accepted: 01/11/2024] [Indexed: 02/01/2024] Open
Abstract
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
Affiliation(s)
- Jürgen Jänes
- Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pedro Beltrao
- Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
37
Shor B, Schneidman-Duhovny D. CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2. Nat Methods 2024; 21:477-487. [PMID: 38326495 PMCID: PMC10927564 DOI: 10.1038/s41592-024-02174-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 01/09/2024] [Indexed: 02/09/2024]
Abstract
Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold's high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
Affiliation(s)
- Ben Shor
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
38
Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 PMCID: PMC11366440 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
- Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
39
Bret H, Gao J, Zea DJ, Andreani J, Guerois R. From interaction networks to interfaces, scanning intrinsically disordered regions using AlphaFold2. Nat Commun 2024; 15:597. [PMID: 38238291 PMCID: PMC10796318 DOI: 10.1038/s41467-023-44288-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 04/19/2023] [Accepted: 12/07/2023] [Indexed: 01/22/2024] Open
Abstract
The revolution brought about by AlphaFold2 opens promising perspectives to unravel the complexity of protein-protein interaction networks. The analysis of interaction networks obtained from proteomics experiments does not systematically provide the delimitations of the interaction regions. This is of particular concern in the case of interactions mediated by intrinsically disordered regions, in which the interaction site is generally small. Using a dataset of protein-peptide complexes involving intrinsically disordered regions that are non-redundant with the structures used in AlphaFold2 training, we show that when using the full sequences of the proteins, AlphaFold2-Multimer only achieves 40% success rate in identifying the correct site and structure of the interface. By delineating the interaction region into fragments of decreasing size and combining different strategies for integrating evolutionary information, we manage to raise this success rate up to 90%. We obtain similar success rates using a much larger dataset of protein complexes taken from the ELM database. Beyond the correct identification of the interaction site, our study also explores specificity issues. We show the advantages and limitations of using the AlphaFold2 confidence score to discriminate between alternative binding partners, a task that can be particularly challenging in the case of small interaction motifs.
Affiliation(s)
- Hélène Bret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jinmei Gao
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Diego Javier Zea
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Raphaël Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
40
Yin R, Pierce BG. Evaluation of AlphaFold antibody-antigen modeling with implications for improving predictive accuracy. Protein Sci 2024; 33:e4865. [PMID: 38073135 PMCID: PMC10751731 DOI: 10.1002/pro.4865] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 12/26/2023]
Abstract
High resolution antibody-antigen structures provide critical insights into immune recognition and can inform therapeutic design. The challenges of experimental structural determination and the diversity of the immune repertoire underscore the necessity of accurate computational tools for modeling antibody-antigen complexes. Initial benchmarking showed that despite overall success in modeling protein-protein complexes, AlphaFold and AlphaFold-Multimer have limited success in modeling antibody-antigen interactions. In this study, we performed a thorough analysis of AlphaFold's antibody-antigen modeling performance on 427 nonredundant antibody-antigen complex structures, identifying useful confidence metrics for predicting model quality, and features of complexes associated with improved modeling success. Notably, we found that the latest version of AlphaFold improves near-native modeling success to over 30%, versus approximately 20% for a previous version, while increased AlphaFold sampling gives approximately 50% success. With this improved success, AlphaFold can generate accurate antibody-antigen models in many cases, while additional training or other optimization may further improve performance.
Affiliation(s)
- Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
41
Lensink MF, Brysbaert G, Raouraoua N, Bates PA, Giulini M, Honorato RV, van Noort C, Teixeira JMC, Bonvin AMJJ, Kong R, Shi H, Lu X, Chang S, Liu J, Guo Z, Chen X, Morehead A, Roy RS, Wu T, Giri N, Quadir F, Chen C, Cheng J, Del Carpio CA, Ichiishi E, Rodriguez‐Lumbreras LA, Fernandez‐Recio J, Harmalkar A, Chu L, Canner S, Smanta R, Gray JJ, Li H, Lin P, He J, Tao H, Huang S, Roel‐Touris J, Jimenez‐Garcia B, Christoffer CW, Jain AJ, Kagaya Y, Kannan H, Nakamura T, Terashi G, Verburgt JC, Zhang Y, Zhang Z, Fujuta H, Sekijima M, Kihara D, Khan O, Kotelnikov S, Ghani U, Padhorny D, Beglov D, Vajda S, Kozakov D, Negi SS, Ricciardelli T, Barradas‐Bautista D, Cao Z, Chawla M, Cavallo L, Oliva R, Yin R, Cheung M, Guest JD, Lee J, Pierce BG, Shor B, Cohen T, Halfon M, Schneidman‐Duhovny D, Zhu S, Yin R, Sun Y, Shen Y, Maszota‐Zieleniak M, Bojarski KK, Lubecka EA, Marcisz M, Danielsson A, Dziadek L, Gaardlos M, Gieldon A, Liwo A, Samsonov SA, Slusarz R, Zieba K, Sieradzan AK, Czaplewski C, Kobayashi S, Miyakawa Y, Kiyota Y, Takeda‐Shitaka M, Olechnovic K, Valancauskas L, Dapkunas J, Venclovas C, Wallner B, Yang L, Hou C, He X, Guo S, Jiang S, Ma X, Duan R, Qui L, Xu X, Zou X, Velankar S, Wodak SJ. Impact of AlphaFold on structure prediction of protein complexes: The CASP15-CAPRI experiment. Proteins 2023; 91:1658-1683. [PMID: 37905971 PMCID: PMC10841881 DOI: 10.1002/prot.26609] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/07/2023] [Revised: 09/22/2023] [Accepted: 09/28/2023] [Indexed: 11/02/2023]
Abstract
We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
Affiliation(s)
- Marc F. Lensink
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Guillaume Brysbaert
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Nessim Raouraoua
- Univ. Lille, CNRS, UMR8576 – UGSF – Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France
- Paul A. Bates
- Biomolecular Modeling Laboratory, The Francis Crick Institute, London, UK
- Marco Giulini
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Rodrigo V. Honorato
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Charlotte van Noort
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Joao M. C. Teixeira
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Alexandre M. J. J. Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht, The Netherlands
- Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Hang Shi
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Xufeng Lu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Jian Liu
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Zhiye Guo
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Xiao Chen
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Alex Morehead
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Raj S. Roy
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Tianqi Wu
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Nabin Giri
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Farhan Quadir
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Chen Chen
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
- Dept. of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Eichiro Ichiishi
- International University of Health and Welfare (IUHV Hospital), Nasushiobara‐City, Japan
- Luis A. Rodriguez‐Lumbreras
- Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Juan Fernandez‐Recio
- Instituto de Ciencias de la Vida y del Vino (ICVV), CSIC ‐ Universidad de La Rioja ‐ Gobierno de La Rioja, Logrono, Spain
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Ameya Harmalkar
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Lee‐Shin Chu
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Sam Canner
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Rituparna Smanta
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Jeffrey J. Gray
- Dept. of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Peicong Lin
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jiahua He
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Huanyu Tao
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Sheng‐You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, China
- Jorge Roel‐Touris
- Protein Design and Modeling Lab, Dept. of Structural Biology, Molecular Biology Institute of Barcelona (IBMB‐CSIC), Barcelona, Spain
- Anika J. Jain
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuki Kagaya
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Harini Kannan
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Tsukasa Nakamura
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Genki Terashi
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Jacob C. Verburgt
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Zicong Zhang
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Hayato Fujuta
- Dept. of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Daisuke Kihara
- Dept. of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Dept. of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Surendra S. Negi
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, Galveston, Texas, USA
- Zhen Cao
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Mohit Chawla
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Luigi Cavallo
- King Abdullah University of Science and Technology (KAUST), Saudi Arabia
- Department of Chemistry and Biology, University of Salerno, Fisciano, Italy
- Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Melyssa Cheung
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Johnathan D. Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Jessica Lee
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Dept. of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Ben Shor
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Tomer Cohen
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Matan Halfon
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Shaowen Zhu
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Rujie Yin
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
- Institute of Biosciences and Technology and Department of Translational Medical Sciences, Texas A&M University, Houston, Texas, USA
- Yuta Miyakawa
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Yasuomi Kiyota
- School of Pharmacy, Kitasato University, Minato‐ku, Tokyo, Japan
- Kliment Olechnovic
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lukas Valancauskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Justas Dapkunas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Ceslovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Bjorn Wallner
- Bioinformatics Division, Department of Physics, Chemistry, and Biology, Linkoping University, Linköping, Sweden
- Lin Yang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney, New South Wales, Australia
- Chengyu Hou
- School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
- Xiaodong He
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenzhen STRONG Advanced Materials Research Institute Col, Ltd, Shenzhen, People's Republic of China
- Shuai Guo
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Shenda Jiang
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Xiaoliang Ma
- National Key Laboratory of Science and Technology on Advanced Composites in Special Environments, Center for Composite Materials and Structures, Harbin Institute of Technology, Harbin, China
- Rui Duan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Liming Qui
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, USA
- Dept. of Physics and Astronomy, University of Missouri, Columbia, Missouri, USA
- Dept. of Biochemistry, University of Missouri, Columbia, Missouri, USA
- Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL‐EBI), Hinxton, Cambridge, UK
42
Hou Y, Xie T, He L, Tao L, Huang J. Topological links in predicted protein complex structures reveal limitations of AlphaFold. Commun Biol 2023; 6:1098. [PMID: 37898666 PMCID: PMC10613300 DOI: 10.1038/s42003-023-05489-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 10/19/2023] [Indexed: 10/30/2023] Open
Abstract
AlphaFold is making great progress in protein structure prediction, not only for single-chain proteins but also for multi-chain protein complexes. When using AlphaFold-Multimer to predict protein‒protein complexes, we observed some unusual structures in which chains are looped around each other to form topologically intertwining links at the interface. Based on physical principles, such topological links should generally not exist in native protein complex structures unless covalent modifications of residues are involved. Although it is well known and has been well studied that protein structures may have topologically complex shapes such as knots and links, existing methods are hampered by the chain closure problem and show poor performance in identifying topologically linked structures in protein‒protein complexes. Therefore, we address the chain closure problem by using sliding windows from a local perspective and propose an algorithm to measure the topological-geometric features that can be used to identify topologically linked structures. An application of the method to AlphaFold-Multimer-predicted protein complex structures finds that approximately 1.72% of the predicted structures contain topological links. The method presented in this work will facilitate the computational study of protein‒protein interactions and help further improve the structural prediction of multi-chain protein complexes.
Affiliation(s)
- Yingnan Hou
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Tengyu Xie
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Liuqing He
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Liang Tao
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Center for Infectious Disease Research, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Jing Huang
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China