1
Madsen AV, Mejias-Gomez O, Pedersen LE, Preben Morth J, Kristensen P, Jenkins TP, Goletz S. Structural trends in antibody-antigen binding interfaces: a computational analysis of 1833 experimentally determined 3D structures. Comput Struct Biotechnol J 2024; 23:199-211. [PMID: 38161735 PMCID: PMC10755492 DOI: 10.1016/j.csbj.2023.11.056] [Received: 10/08/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 01/03/2024] Open
Abstract
Antibodies are attractive therapeutic candidates due to their ability to bind cognate antigens with high affinity and specificity. Still, the underlying molecular rules governing the antibody-antigen interface remain poorly understood, making in silico antibody design inherently difficult and keeping the discovery and design of novel antibodies a costly and laborious process. This study investigates the characteristics of antibody-antigen binding interfaces through a computational analysis of more than 850,000 atom-atom contacts from the largest reported set of antibody-antigen complexes with 1833 nonredundant, experimentally determined structures. The analysis compares binding characteristics of conventional antibodies and single-domain antibodies (sdAbs) targeting both protein and peptide antigens. We find clear patterns in the number of antibody-antigen contacts and amino acid frequencies in the paratope. The direct comparison of sdAbs and conventional antibodies helps elucidate the mechanisms employed by sdAbs to compensate for their smaller size and the fact that they harbor only half the number of complementarity-determining regions compared to conventional antibodies. Furthermore, we pinpoint antibody interface hotspot residues that are often found at the binding interface and the amino acid frequencies at these positions. These findings have direct potential applications in antibody engineering and the design of improved antibody libraries.
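As a rough illustration of what counting atom-atom contacts at an interface can look like, the following minimal Python sketch counts antibody-antigen atom pairs within a distance cutoff. The 4.0 Å cutoff, the function name `count_contacts`, and the toy coordinates are assumptions made for illustration; the abstract does not state the contact definition the authors used.

```python
import math


def count_contacts(antibody_atoms, antigen_atoms, cutoff=4.0):
    """Count antibody-antigen atom pairs within `cutoff` angstroms.

    The 4.0 A default is a hypothetical choice, not a value taken from the study.
    """
    contacts = 0
    for a in antibody_atoms:
        for b in antigen_atoms:
            # math.dist returns the Euclidean distance between two points
            if math.dist(a, b) <= cutoff:
                contacts += 1
    return contacts


# Toy coordinates (x, y, z) in angstroms, invented for this example
antibody = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
antigen = [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

n_contacts = count_contacts(antibody, antigen)
print(n_contacts)
```

A brute-force double loop like this is only practical for small inputs; real structure sets would typically use a spatial index such as a k-d tree.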
Affiliation(s)
- Andreas V. Madsen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Oscar Mejias-Gomez
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Lasse E. Pedersen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- J. Preben Morth
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Peter Kristensen
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Timothy P. Jenkins
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Steffen Goletz
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
2
Wang X, Gao X, Fan X, Huai Z, Zhang G, Yao M, Wang T, Huang X, Lai L. WUREN: Whole-modal union representation for epitope prediction. Comput Struct Biotechnol J 2024; 23:2122-2131. [PMID: 38817963 PMCID: PMC11137340 DOI: 10.1016/j.csbj.2024.05.023] [Received: 01/26/2024] [Revised: 05/14/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024] Open
Abstract
B-cell epitope identification plays a vital role in the development of vaccines, therapies, and diagnostic tools. Currently, molecular docking tools for B-cell epitope prediction are heavily influenced by empirical parameters and require significant computational resources, making it challenging to meet large-scale prediction demands. When predicting epitopes from antigen-antibody complexes, current artificial intelligence algorithms cannot make accurate predictions because of insufficient protein feature representations, indicating that a novel algorithm is needed for efficient protein information extraction. In this paper, we introduce a multimodal model called WUREN (Whole-modal Union Representation for Epitope predictioN), which effectively combines sequence, graph, and structural features. It achieved AUC-PR scores of 0.213 and 0.193 on the solved structures and AlphaFold-generated structures, respectively, for the independent test proteins selected from the DiscoTope3 benchmark. Our findings indicate that WUREN is an efficient feature extraction model for protein complexes, with generalizable application potential in the development of protein-based drugs. Moreover, the streamlined framework of WUREN could be readily extended to model similar biomolecules, such as nucleic acids, carbohydrates, and lipids.
Affiliation(s)
- Xuezhe Fan
  - XtalPi Innovation Center, Beijing, China
- Zhe Huai
  - XtalPi Innovation Center, Beijing, China
- Lipeng Lai
  - XtalPi Innovation Center, Beijing, China
3
Meng F, Zhou N, Hu G, Liu R, Zhang Y, Jing M, Hou Q. A comprehensive overview of recent advances in generative models for antibodies. Comput Struct Biotechnol J 2024; 23:2648-2660. [PMID: 39027650 PMCID: PMC11254834 DOI: 10.1016/j.csbj.2024.06.016] [Received: 03/31/2024] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/20/2024] Open
Abstract
Therapeutic antibodies are an important class of biopharmaceuticals. With the rapid development of deep learning methods and the increasing amount of antibody data, antibody generative models have made great progress recently. They aim to solve the antibody space searching problems and are widely incorporated into the antibody development process. Therefore, a comprehensive introduction to the development methods in this field is imperative. Here, we collected 34 representative antibody generative models published recently. Based on their principles and algorithms, these models can be divided into three categories: sequence-generating models, structure-generating models, and hybrid models. We further studied their performance and contributions to antibody sequence prediction, structure optimization, and affinity enhancement. This manuscript provides a comprehensive overview of the status of antibody generative models and offers guidance for selecting among the different approaches.
Affiliation(s)
- Fanxu Meng
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Na Zhou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Guangchun Hu
  - School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Ruotong Liu
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Yuanyuan Zhang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Ming Jing
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250000, China
- Qingzhen Hou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
4
Zheng F, Jiang X, Wen Y, Yang Y, Li M. Systematic investigation of machine learning on limited data: A study on predicting protein-protein binding strength. Comput Struct Biotechnol J 2024; 23:460-472. [PMID: 38235359 PMCID: PMC10792694 DOI: 10.1016/j.csbj.2023.12.018] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 01/19/2024] Open
Abstract
The application of machine learning techniques in biological research, especially when dealing with limited data availability, poses significant challenges. In this study, we leveraged advancements in method development for predicting protein-protein binding strength to conduct a systematic investigation into the application of machine learning on limited data. The binding strength, quantitatively measured as binding affinity, is vital for understanding the processes of recognition, association, and dysfunction that occur within protein complexes. By incorporating transfer learning, integrating domain knowledge, and employing both deep learning and traditional machine learning algorithms, we mitigated the impact of data limitations and made significant advancements in predicting protein-protein binding affinity. In particular, we developed over 20 models, ultimately selecting three representative best-performing ones that belong to distinct categories. The first model is structure-based, consisting of a random forest regression and thirteen handcrafted features. The second model is sequence-based, employing an architecture that combines transferred embedding features with a multilayer perceptron. Finally, we created an ensemble model by averaging the predictions of the two aforementioned models. The comparison with other predictors on three independent datasets confirms the significant improvements achieved by our models in predicting protein-protein binding affinity. The programs for running these three models are available at https://github.com/minghuilab/BindPPI.
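The ensemble step described in this abstract, averaging the predictions of a structure-based and a sequence-based model, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the BindPPI implementation: the function name `ensemble_average` and the example values are hypothetical, and the two input lists stand in for the outputs of models that are not reproduced here.

```python
def ensemble_average(structure_preds, sequence_preds):
    """Average two models' predictions element-wise (unweighted mean)."""
    if len(structure_preds) != len(sequence_preds):
        raise ValueError("both models must predict for the same complexes")
    return [(s + q) / 2.0 for s, q in zip(structure_preds, sequence_preds)]


# Hypothetical predicted binding affinities for three complexes
structure_preds = [-10.0, -8.0, -12.0]  # stand-in for the structure-based model
sequence_preds = [-9.0, -7.0, -11.0]    # stand-in for the sequence-based model

ensemble = ensemble_average(structure_preds, sequence_preds)
print(ensemble)
```

An unweighted mean needs no extra fitted parameters, which can be appealing when training data are limited.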
Affiliation(s)
- Feifan Zheng
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Xin Jiang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yuhao Wen
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Yan Yang
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
- Minghui Li
  - MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, Jiangsu Province 215123, China
5
McCoy KM, Ackerman ME, Grigoryan G. A comparison of antibody-antigen complex sequence-to-structure prediction methods and their systematic biases. Protein Sci 2024; 33:e5127. [PMID: 39167052 DOI: 10.1002/pro.5127] [Received: 03/15/2024] [Revised: 06/24/2024] [Accepted: 07/14/2024] [Indexed: 08/23/2024]
Abstract
The ability to accurately predict antibody-antigen complex structures from their sequences could greatly advance our understanding of the immune system and would aid in the development of novel antibody therapeutics. There have been considerable recent advancements in predicting protein-protein interactions (PPIs) fueled by progress in machine learning (ML). To understand the current state of the field, we compare six representative methods for predicting antibody-antigen complexes from sequence, including two deep learning approaches trained to predict PPIs in general (AlphaFold-Multimer and RoseTTAFold), two composite methods that initially predict antibody and antigen structures separately and dock them (using antibody-mode ClusPro), local refinement in Rosetta (SnugDock) of globally docked poses from ClusPro, and a pipeline combining homology modeling with rigid-body docking informed by ML-based epitope and paratope prediction (AbAdapt). We find that AlphaFold-Multimer outperformed other methods, although the absolute performance leaves considerable room for improvement. AlphaFold-Multimer models of lower quality display significant structural biases at the level of tertiary motifs (TERMs) toward having fewer structural matches in non-antibody-containing structures from the Protein Data Bank (PDB). Specifically, better models exhibit more common PDB-like TERMs at the antibody-antigen interface than worse ones. Importantly, the clear relationship between performance and the commonness of interfacial TERMs suggests that the scarcity of interfacial geometry data in the structural database may currently limit the application of ML to the prediction of antibody-antigen interactions.
Affiliation(s)
- Katherine Maia McCoy
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
- Margaret E Ackerman
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
  - Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, USA
- Gevorg Grigoryan
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
  - Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
6
Wang T, Zhang X, Zhang O, Chen G, Pan P, Wang E, Wang J, Wu J, Zhou D, Wang L, Jin R, Chen S, Shen C, Kang Y, Hsieh CY, Hou T. Highly Accurate and Efficient Deep Learning Paradigm for Full-Atom Protein Loop Modeling with KarmaLoop. Research (Wash D C) 2024; 7:0408. [PMID: 39055686 PMCID: PMC11268956 DOI: 10.34133/research.0408] [Received: 04/25/2024] [Accepted: 05/22/2024] [Indexed: 07/27/2024]
Abstract
Protein loop modeling is a challenging and highly nontrivial task in protein structure prediction. Despite recent progress, existing methods, including knowledge-based, ab initio, hybrid, and deep learning (DL) methods, fall substantially short of either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling. Our results demonstrate that KarmaLoop considerably outperforms conventional and DL-based methods of loop modeling in terms of both accuracy and efficiency, with average RMSDs of 1.77 and 1.95 Å for the CASP13+14 and CASP15 benchmark datasets, respectively, and is generally at least two orders of magnitude faster than other methods. Consequently, our comprehensive evaluations indicate that KarmaLoop provides a state-of-the-art DL solution for protein loop modeling, with the potential to hasten the advancement of protein engineering, antibody-antigen recognition, and drug design.
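For readers unfamiliar with the RMSD metric quoted in this abstract, the following minimal Python sketch computes the root-mean-square deviation between matched predicted and reference atom coordinates. It assumes the two structures are already in the same coordinate frame and performs no superposition; the function name `rmsd` and the toy coordinates are hypothetical, and the abstract does not specify the authors' exact evaluation protocol.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation over matched atom pairs, in input units.

    Assumes both coordinate lists share a frame; no alignment is performed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Toy loop atoms (x, y, z) in angstroms; every atom is displaced by 2 A along z
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]

loop_rmsd = rmsd(predicted, reference)
print(loop_rmsd)
```

Whether the loop is superposed on its own atoms or on the surrounding framework before this calculation changes the reported value, so the protocol matters when comparing numbers across papers.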
Affiliation(s)
- Tianyue Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xujun Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Odin Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Peichen Pan
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Ercheng Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jialu Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Donghao Zhou
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Langcheng Wang
  - Department of Pathology, New York University Medical Center, New York, NY 10016, USA
- Ruofan Jin
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - College of Life Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Shicheng Chen
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chao Shen
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chang-Yu Hsieh
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
7
Lee M, Lu M, Zhang B, Zhou T, Katte R, Han Y, Rawi R, Kwong PD. HIV-1-envelope trimer transitions from prefusion-closed to CD4-bound-open conformations through an occluded-intermediate state. bioRxiv 2024:2024.07.15.603531. [PMID: 39071380 PMCID: PMC11275901 DOI: 10.1101/2024.07.15.603531] [Indexed: 07/30/2024]
Abstract
HIV-1 infection is initiated by the interaction between the gp120 subunit in the envelope (Env) trimer and the cellular receptor CD4 on host cells. This interaction induces substantial structural rearrangement of the Env trimer. Currently, static structural information for prefusion-closed trimers, CD4-bound prefusion-open trimers, and various antibody-bound trimers is available. However, dynamic features between these static states (e.g., transition structures) are not well understood. Here, we investigate the full transition pathway of a site-specifically glycosylated Env trimer between prefusion-closed and CD4-bound-open conformations by collective molecular dynamics and single-molecule Förster resonance energy transfer (smFRET). Our investigations reveal and confirm important features of the transition pathway, including movement of variable loops to generate a glycan hole at the trimer apex and formation or rearrangements of α-helices and β-strands. Notably, by comparing the transition pathway to known Env structures, we uncover evidence for a transition intermediate, with four antibodies, Ab1303, Ab1573, b12, and DH851.3, recognizing this intermediate. Each of these four antibodies induces population shifts of Env to occupy a newly observed smFRET state: the "occluded-intermediate" state. We propose this occluded-intermediate state to be both a prevalent state of Env and an on-path conformation between prefusion-closed and CD4-bound-open states, previously overlooked in smFRET analyses.
8
Song B, Wang K, Na S, Yao J, Fattah FJ, von Itzstein MS, Yang DM, Liu J, Xue Y, Liang C, Guo Y, Raman I, Zhu C, Dowell JE, Homsi J, Rashdan S, Yang S, Gwin ME, Hsiehchen D, Gloria-McCutchen Y, Raj P, Bai X, Wang J, Conejo-Garcia J, Xie Y, Gerber DE, Huang J, Wang T. Cmai: Predicting Antigen-Antibody Interactions from Massive Sequencing Data. bioRxiv 2024:2024.06.27.601035. [PMID: 39005456 PMCID: PMC11244862 DOI: 10.1101/2024.06.27.601035] [Indexed: 07/16/2024]
Abstract
The interaction between antigens and antibodies (B cell receptors, BCRs) is the key step underlying the function of the humoral immune system in various biological contexts. The capability to profile the landscape of antigen-binding affinity of a vast number of BCRs will reveal novel insights at unprecedented levels and will yield powerful tools for translational development. However, current experimental approaches for profiling antibody-antigen interactions are costly and time-consuming, and can only achieve low-to-mid throughput. On the other hand, bioinformatics tools in the field of antibody informatics mostly focus on optimization of antibodies given known binding antigens, which is a very different research question and of limited scope. In this work, we developed an innovative Artificial Intelligence tool, Cmai, to address the prediction of the binding between antibodies and antigens in a way that can be scaled to high-throughput sequencing data. Cmai achieved an AUROC of 0.91 in our validation cohort. We devised a biomarker metric based on the output from Cmai applied to high-throughput BCR sequencing data. We found that, during immune-related adverse events (irAEs) caused by immune-checkpoint inhibitor (ICI) treatment, the humoral immunity is preferentially responsive to intracellular antigens from the organs affected by the irAEs. In contrast, extracellular antigens on malignant tumor cells induce B cell infiltrations, and the infiltrating B cells have a greater tendency to co-localize with tumor cells expressing these antigens. We further found that the abundance of tumor antigen-targeting antibodies is predictive of ICI treatment response. Overall, Cmai and our biomarker approach fill a gap that is addressed neither by current antibody optimization work nor by work such as AlphaFold3 that predicts the structures of complexes of proteins that are known to bind.
9
Ananya, Panchariya DC, Karthic A, Singh SP, Mani A, Chawade A, Kushwaha S. Vaccine design and development: Exploring the interface with computational biology and AI. Int Rev Immunol 2024:1-20. [PMID: 38982912 DOI: 10.1080/08830185.2024.2374546] [Received: 03/22/2024] [Accepted: 06/26/2024] [Indexed: 07/11/2024]
Abstract
Computational biology involves applying computer science and informatics techniques in biology to understand complex biological data. It allows us to collect, connect, and analyze biological data at a large scale and build predictive models. In the twenty-first century, computational resources along with Artificial Intelligence (AI) have been widely used in various fields of biological sciences such as biochemistry, structural biology, immunology, microbiology, and genomics to handle massive data for decision-making, including in applications such as drug design and vaccine development, one of the major areas of focus for human and animal welfare. The knowledge of available computational resources and AI-enabled tools in vaccine design and development can improve our ability to conduct cutting-edge research. Therefore, this review article aims to summarize important computational resources and AI-based tools. Further, the article discusses the various applications and limitations of AI tools in vaccine development.
Affiliation(s)
- Ananya
  - National Institute of Animal Biotechnology, Hyderabad, India
- Ashutosh Mani
  - Motilal Nehru National Institute of Technology, Prayagraj, India
- Aakash Chawade
  - Swedish University of Agricultural Sciences, Alnarp, Sweden
10
Papadopoulos S, Tinschert R, Papadopoulos I, Gerloff X, Schmitz F. Analytical Post-Embedding Immunogold-Electron Microscopy with Direct Gold-Labelled Monoclonal Primary Antibodies against RIBEYE A- and B-Domain Suggests a Refined Model of Synaptic Ribbon Assembly. Int J Mol Sci 2024; 25:7443. [PMID: 39000549 PMCID: PMC11242772 DOI: 10.3390/ijms25137443] [Received: 05/29/2024] [Revised: 07/02/2024] [Accepted: 07/04/2024] [Indexed: 07/16/2024] Open
Abstract
Synaptic ribbons are the eponymous specializations of continuously active ribbon synapses. They are primarily composed of the RIBEYE protein, which consists of a unique amino-terminal A-domain and a carboxy-terminal B-domain that is largely identical to the ubiquitously expressed transcriptional regulator protein CtBP2. Both RIBEYE A-domain and RIBEYE B-domain are essential for the assembly of the synaptic ribbon, as shown by previous analyses of RIBEYE knockout and knockin mice and related investigations. How exactly the synaptic ribbon is assembled from RIBEYE subunits is not yet clear. To achieve further insights into the architecture of the synaptic ribbon, we performed analytical post-embedding immunogold-electron microscopy with direct gold-labelled primary antibodies against RIBEYE A-domain and RIBEYE B-domain for improved ultrastructural resolution. With direct gold-labelled monoclonal antibodies against RIBEYE A-domain and RIBEYE B-domain, we found that both domains show a very similar localization within the synaptic ribbon of mouse photoreceptor synapses, with no obvious differential gradient between the centre and surface of the synaptic ribbon. These data favour a model of the architecture of the synaptic ribbon in which the RIBEYE A-domain and RIBEYE B-domain are located at similar distances from the midline of the synaptic ribbon.
Affiliation(s)
- Stella Papadopoulos
  - Institute of Anatomy, Department of Neuroanatomy, Medical School, Saarland University, 66421 Homburg, Germany; (S.P.); (R.T.)
- René Tinschert
  - Institute of Anatomy, Department of Neuroanatomy, Medical School, Saarland University, 66421 Homburg, Germany; (S.P.); (R.T.)
- Iason Papadopoulos
  - Mathematical Institute, University of Bonn, 53115 Bonn, Germany; (I.P.); (X.G.)
- Xenia Gerloff
  - Mathematical Institute, University of Bonn, 53115 Bonn, Germany; (I.P.); (X.G.)
- Frank Schmitz
  - Institute of Anatomy, Department of Neuroanatomy, Medical School, Saarland University, 66421 Homburg, Germany; (S.P.); (R.T.)
11
Pegoraro M, Dominé C, Rodolà E, Veličković P, Deac A. Geometric epitope and paratope prediction. Bioinformatics 2024; 40:btae405. [PMID: 38984742 PMCID: PMC11245313 DOI: 10.1093/bioinformatics/btae405] [Received: 12/16/2023] [Revised: 05/14/2024] [Accepted: 07/09/2024] [Indexed: 07/11/2024]
Abstract
MOTIVATION: Identifying the binding sites of antibodies is essential for developing vaccines and synthetic antibodies. In this article, we investigate the optimal representation for predicting the binding sites in the two molecules and emphasize the importance of geometric information. RESULTS: Specifically, we compare different geometric deep learning methods applied to proteins' inner (I-GEP) and outer (O-GEP) structures. We incorporate 3D coordinates and spectral geometric descriptors as input features to fully leverage the geometric information. Our research suggests that different geometric representations are useful for different tasks. Surface-based models are more efficient in predicting the binding of the epitope, while graph models are better in paratope prediction, both achieving significant performance improvements. Moreover, we analyze the impact of structural changes in antibodies and antigens resulting from conformational rearrangements or reconstruction errors. Through this investigation, we showcase the robustness of geometric deep learning methods and spectral geometric descriptors to such perturbations. AVAILABILITY AND IMPLEMENTATION: The Python code for the models, together with the data and the processing pipeline, is open-source and available at https://github.com/Marco-Peg/GEP.
Affiliation(s)
- Marco Pegoraro
  - Department of Computer Science, Sapienza University of Rome, 00185, Italy
- Clémentine Dominé
  - Gatsby Computational Neuroscience Unit, University College London, W1T 4JG, United Kingdom
- Emanuele Rodolà
  - Department of Computer Science, Sapienza University of Rome, 00185, Italy
- Andreea Deac
  - Département d'informatique et de recherche opérationnelle, Université de Montréal, QC H2S 3H1, Canada
12
McCoy KM, Ackerman ME, Grigoryan G. A Comparison of Antibody-Antigen Complex Sequence-to-Structure Prediction Methods and their Systematic Biases. bioRxiv 2024:2024.03.15.585121. [PMID: 38979267 PMCID: PMC11230293 DOI: 10.1101/2024.03.15.585121] [Indexed: 07/10/2024]
Abstract
The ability to accurately predict antibody-antigen complex structures from their sequences could greatly advance our understanding of the immune system and would aid in the development of novel antibody therapeutics. There have been considerable recent advancements in predicting protein-protein interactions (PPIs) fueled by progress in machine learning (ML). To understand the current state of the field, we compare six representative methods for predicting antibody-antigen complexes from sequence, including two deep learning approaches trained to predict PPIs in general (AlphaFold-Multimer, RoseTTAFold), two composite methods that initially predict antibody and antigen structures separately and dock them (using antibody-mode ClusPro), local refinement in Rosetta (SnugDock) of globally docked poses from ClusPro, and a pipeline combining homology modeling with rigid-body docking informed by ML-based epitope and paratope prediction (AbAdapt). We find that AlphaFold-Multimer outperformed other methods, although the absolute performance leaves considerable room for improvement. AlphaFold-Multimer models of lower quality display significant structural biases at the level of tertiary motifs (TERMs) towards having fewer structural matches in non-antibody-containing structures from the Protein Data Bank (PDB). Specifically, better models exhibit more common PDB-like TERMs at the antibody-antigen interface than worse ones. Importantly, the clear relationship between performance and the commonness of interfacial TERMs suggests that scarcity of interfacial geometry data in the structural database may currently limit application of machine learning to the prediction of antibody-antigen interactions.
Affiliation(s)
- Katherine Maia McCoy
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
- Margaret E Ackerman
  - Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, USA
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
- Gevorg Grigoryan
  - Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
  - Molecular and Cell Biology Graduate Program, Dartmouth College, Hanover, New Hampshire, USA
|
13
|
Chen H, Fan X, Zhu S, Pei Y, Zhang X, Zhang X, Liu L, Qian F, Tian B. Accurate prediction of CDR-H3 loop structures of antibodies with deep learning. eLife 2024; 12:RP91512. [PMID: 38921957 PMCID: PMC11208048 DOI: 10.7554/elife.91512] [Indexed: 06/27/2024] Open
Abstract
Accurate prediction of the structurally diverse complementarity determining region heavy chain 3 (CDR-H3) loop structure remains a primary and long-standing challenge for antibody modeling. Here, we present the H3-OPT toolkit for predicting the 3D structures of monoclonal antibodies and nanobodies. H3-OPT combines the strengths of AlphaFold2 with a pre-trained protein language model and provides a 2.24 Å average Cα RMSD between predicted and experimentally determined CDR-H3 loops, thus outperforming other current computational methods in our non-redundant high-quality dataset. The model was validated by experimentally solving three structures of anti-VEGF nanobodies predicted by H3-OPT. We examined the potential applications of H3-OPT through analyzing antibody surface properties and antibody-antigen interactions. This structural prediction tool can be used to optimize antibody-antigen binding and engineer therapeutic antibodies with biophysical properties for specialized drug administration routes.
Affiliation(s)
- Hedi Chen
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Xiaoyu Fan
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Shuqian Zhu
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Yuchan Pei
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, China
- Xiaochun Zhang
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Xiaonan Zhang
  - Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co Ltd, Shenzhen, China
- Lihang Liu
  - Department of Natural Language Processing, Baidu International Technology (Shenzhen) Co Ltd, Shenzhen, China
- Feng Qian
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
- Boxue Tian
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
|
14
|
Jamasb AR, Morehead A, Joshi CK, Zhang Z, Didi K, Mathis S, Harris C, Tang J, Cheng J, Liò P, Blundell TL. Evaluating Representation Learning on the Protein Structure Universe. arXiv 2024:arXiv:2406.13864v1. [PMID: 38947934 PMCID: PMC11213157] [Indexed: 07/02/2024]
Abstract
We introduce ProteinWorkshop, a comprehensive benchmark suite for representation learning on protein structures with Geometric Graph Neural Networks. We consider large-scale pre-training and downstream tasks on both experimental and predicted structures to enable the systematic evaluation of the quality of the learned structural representation and their usefulness in capturing functional relationships for downstream tasks. We find that: (1) large-scale pretraining on AlphaFold structures and auxiliary tasks consistently improve the performance of both rotation-invariant and equivariant GNNs, and (2) more expressive equivariant GNNs benefit from pretraining to a greater extent compared to invariant models. We aim to establish a common ground for the machine learning and computational biology communities to rigorously compare and advance protein structure representation learning. Our open-source codebase reduces the barrier to entry for working with large protein structure datasets by providing: (1) storage-efficient dataloaders for large-scale structural databases including AlphaFoldDB and ESM Atlas, as well as (2) utilities for constructing new tasks from the entire PDB. ProteinWorkshop is available at: github.com/a-r-j/ProteinWorkshop.
|
15
|
Joubbi S, Micheli A, Milazzo P, Maccari G, Ciano G, Cardamone D, Medini D. Antibody design using deep learning: from sequence and structure design to affinity maturation. Brief Bioinform 2024; 25:bbae307. [PMID: 38960409 PMCID: PMC11221890 DOI: 10.1093/bib/bbae307] [Received: 03/03/2024] [Revised: 05/20/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024] Open
Abstract
Deep learning has achieved impressive results in various fields such as computer vision and natural language processing, making it a powerful tool in biology. Its applications now encompass cellular image classification, genomic studies and drug discovery. While drug development traditionally focused deep learning applications on small molecules, recent innovations have incorporated it in the discovery and development of biological molecules, particularly antibodies. Researchers have devised novel techniques to streamline antibody development, combining in vitro and in silico methods. In particular, computational power expedites lead candidate generation, scaling and potential antibody development against complex antigens. This survey highlights significant advancements in protein design and optimization, specifically focusing on antibodies. This includes various aspects such as design, folding, antibody-antigen interactions docking and affinity maturation.
Affiliation(s)
- Sara Joubbi
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Alessio Micheli
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Paolo Milazzo
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Giuseppe Maccari
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Giorgio Ciano
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Dario Cardamone
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Duccio Medini
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
|
16
|
Li D, Pucci F, Rooman M. Prediction of Paratope-Epitope Pairs Using Convolutional Neural Networks. Int J Mol Sci 2024; 25:5434. [PMID: 38791470 PMCID: PMC11121317 DOI: 10.3390/ijms25105434] [Received: 04/02/2024] [Revised: 05/06/2024] [Accepted: 05/13/2024] [Indexed: 05/26/2024] Open
Abstract
Antibodies play a central role in the adaptive immune response of vertebrates through the specific recognition of exogenous or endogenous antigens. The rational design of antibodies has a wide range of biotechnological and medical applications, such as in disease diagnosis and treatment. However, there are currently no reliable methods for predicting the antibodies that recognize a specific antigen region (or epitope) and, conversely, epitopes that recognize the binding region of a given antibody (or paratope). To fill this gap, we developed ImaPEp, a machine learning-based tool for predicting the binding probability of paratope-epitope pairs, where the epitope and paratope patches were simplified into interacting two-dimensional patches, which were colored according to the values of selected features, and pixelated. The specific recognition of an epitope image by a paratope image was achieved by using a convolutional neural network-based model, which was trained on a set of two-dimensional paratope-epitope images derived from experimental structures of antibody-antigen complexes. Our method achieves good performance in cross-validation, with a balanced accuracy of 0.8. Finally, we showcase examples of application of ImaPEp, including extensive screening of large libraries to identify paratope candidates that bind to a selected epitope, and rescoring and refining antibody-antigen docking poses.
Affiliation(s)
- Dong Li
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, 1050 Brussels, Belgium
|
17
|
Sammut SJ, Galson JD, Minter R, Sun B, Chin SF, De Mattos-Arruda L, Finch DK, Schätzle S, Dias J, Rueda OM, Seoane J, Osbourn J, Caldas C, Bashford-Rogers RJM. Predictability of B cell clonal persistence and immunosurveillance in breast cancer. Nat Immunol 2024; 25:916-924. [PMID: 38698238 PMCID: PMC11065701 DOI: 10.1038/s41590-024-01821-0] [Received: 02/20/2024] [Accepted: 03/15/2024] [Indexed: 05/05/2024]
Abstract
B cells and T cells are important components of the adaptive immune system and mediate anticancer immunity. The T cell landscape in cancer is well characterized, but the contribution of B cells to anticancer immunosurveillance is less well explored. Here we show an integrative analysis of the B cell and T cell receptor repertoire from individuals with metastatic breast cancer and individuals with early breast cancer during neoadjuvant therapy. Using immune receptor, RNA and whole-exome sequencing, we show that both B cell and T cell responses seem to coevolve with the metastatic cancer genomes and mirror tumor mutational and neoantigen architecture. B cell clones associated with metastatic immunosurveillance and temporal persistence were more expanded and distinct from site-specific clones. B cell clonal immunosurveillance and temporal persistence are predictable from the clonal structure, with higher-centrality B cell antigen receptors more likely to be detected across multiple metastases or across time. This predictability was generalizable across other immune-mediated disorders. This work lays a foundation for prioritizing antibody sequences for therapeutic targeting in cancer.
MESH Headings
- Humans
- Female
- Breast Neoplasms/immunology
- B-Lymphocytes/immunology
- Immunologic Surveillance
- Receptors, Antigen, T-Cell/genetics
- Receptors, Antigen, T-Cell/immunology
- Receptors, Antigen, T-Cell/metabolism
- Receptors, Antigen, B-Cell/metabolism
- Receptors, Antigen, B-Cell/genetics
- Receptors, Antigen, B-Cell/immunology
- T-Lymphocytes/immunology
- Monitoring, Immunologic
- Exome Sequencing
- Antigens, Neoplasm/immunology
- Neoplasm Metastasis
- Clone Cells
Affiliation(s)
- Stephen-John Sammut
  - Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK
  - The Royal Marsden Hospital NHS Foundation Trust, London, UK
- Bo Sun
  - Wellcome Centre for Human Genetics, Oxford, UK
  - Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, UK
- Suet-Feung Chin
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Leticia De Mattos-Arruda
  - IrsiCaixa, Germans Trias i Pujol University Hospital, Badalona, Spain
  - Germans Trias i Pujol Research Institute (IGTP), Badalona, Spain
- Oscar M Rueda
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Joan Seoane
  - Vall d'Hebron Institute of Oncology (VHIO), Vall d'Hebron University Hospital, Institució Catalana de Recerca i Estudis Avançats (ICREA), Universitat Autònoma de Barcelona (UAB), CIBERONC, Barcelona, Spain
- Carlos Caldas
  - School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Rachael J M Bashford-Rogers
  - Wellcome Centre for Human Genetics, Oxford, UK
  - Department of Biochemistry, University of Oxford, Oxford, UK
  - Oxford Cancer Centre, Oxford, UK
|
18
|
Roel‐Touris J, Carcelén L, Marcos E. The structural landscape of the immunoglobulin fold by large-scale de novo design. Protein Sci 2024; 33:e4936. [PMID: 38501461 PMCID: PMC10949314 DOI: 10.1002/pro.4936] [Received: 09/28/2023] [Revised: 02/02/2024] [Accepted: 02/06/2024] [Indexed: 03/20/2024]
Abstract
De novo designing immunoglobulin-like frameworks that allow for functional loop diversification shows great potential for crafting antibody-like scaffolds with fully customizable structures and functions. In this work, we combined de novo parametric design with deep-learning methods for protein structure prediction and design to explore the structural landscape of 7-stranded immunoglobulin domains. After screening folding of nearly 4 million designs, we have assembled a structurally diverse library of ~50,000 immunoglobulin domains with high-confidence AlphaFold2 predictions and structures diverging from naturally occurring ones. The designed dataset enabled us to identify structural requirements for the correct folding of immunoglobulin domains, shed light on β-sheet-β-sheet rotational preferences and how these are linked to functional properties. Our approach eliminates the need for preset loop conformations and opens the route to large-scale de novo design of immunoglobulin-like frameworks.
Affiliation(s)
- Jorge Roel‐Touris
  - Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
- Lourdes Carcelén
  - Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
- Enrique Marcos
  - Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB), CSIC, Barcelona, Spain
|
19
|
Corcoran MM, Karlsson Hedestam GB. Adaptive immune receptor germline gene variation. Curr Opin Immunol 2024; 87:102429. [PMID: 38805851 DOI: 10.1016/j.coi.2024.102429] [Received: 01/22/2024] [Revised: 04/30/2024] [Accepted: 05/09/2024] [Indexed: 05/30/2024]
Abstract
Recognition of antigens by T cell receptors (TCRs) and B cell receptors (BCRs) is a key step in lymphocyte activation. T and B cells mediate adaptive immune responses, which protect us against infections and provide immunological memory, and also, in some instances, drive pathogenic responses in autoimmune diseases. TCRs and BCRs are encoded within loci that are known to be genetically diverse. However, the extent and functional impact of this variation, both in humans and model animals used in immunological research, remain largely unknown. Experimental and genetic evidence has demonstrated that the complementarity determining regions 1 and 2 (HCDR1 and HCDR2), encoded by the variable (V) region of TCRs and BCRs, also often make critical contacts with the targeted antigen. Thus, knowledge about allelic variation in the genes encoding TCRs and BCRs is critically important for understanding adaptive immune responses in outbred populations and to define responder and non-responder phenotypes.
Affiliation(s)
- Martin M Corcoran
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, 17177 Stockholm, Sweden
|
20
|
Jing X, Wu F, Luo X, Xu J. Single-sequence protein structure prediction by integrating protein language models. Proc Natl Acad Sci U S A 2024; 121:e2308788121. [PMID: 38507445 PMCID: PMC10990103 DOI: 10.1073/pnas.2308788121] [Received: 05/26/2023] [Accepted: 02/05/2024] [Indexed: 03/22/2024] Open
Abstract
Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs and thus, an MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method RaptorX-Single by integrating several protein language models and a structure generation module and then study its advantage over MSA-based methods. Our experimental results indicate that in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins with very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models will impact the performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs.
Affiliation(s)
- Fandi Wu
  - MoleculeMind Ltd., Beijing 100084, China
  - Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Xiao Luo
  - Toyota Technological Institute at Chicago, Chicago, IL 60637
  - Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Jinbo Xu
  - MoleculeMind Ltd., Beijing 100084, China
  - Toyota Technological Institute at Chicago, Chicago, IL 60637
|
21
|
Tandiana R, Barletta GP, Soler MA, Fortuna S, Rocchia W. Computational Mutagenesis of Antibody Fragments: Disentangling Side Chains from ΔΔG Predictions. J Chem Theory Comput 2024; 20:2630-2642. [PMID: 38445482 DOI: 10.1021/acs.jctc.3c01225] [Indexed: 03/07/2024]
Abstract
The development of highly potent antibodies and antibody fragments as binding agents holds significant implications in fields such as biosensing and biotherapeutics. Their binding strength is intricately linked to the arrangement and composition of residues at the binding interface. Computational techniques offer a robust means to predict the three-dimensional structure of these complexes and to assess the affinity changes resulting from mutations. Given the interdependence of structure and affinity prediction, our objective here is to disentangle their roles. We aim to evaluate independently six side-chain reconstruction methods and ten binding affinity estimation techniques. This evaluation was pivotal in predicting affinity alterations due to single mutations, a key step in computational affinity maturation protocols. Our analysis focuses on a data set comprising 27 distinct antibody/hen egg white lysozyme complexes, each with crystal structures and experimentally determined binding affinities. Using six different side-chain reconstruction methods, we transformed each structure into its corresponding mutant via in silico single-point mutations. Subsequently, these structures undergo minimization and molecular dynamics simulation. We therefore estimate ΔΔG values based on the original crystal structure, its energy-minimized form, and the ensuing molecular dynamics trajectories. Our research underscores the critical importance of selecting reliable side-chain reconstruction methods and conducting thorough molecular dynamics simulations to accurately predict the impact of mutations. In summary, our study demonstrates that the integration of conformational sampling and scoring is a potent approach to precisely characterizing mutation processes in single-point mutagenesis protocols and crucial for computational antibody design.
Affiliation(s)
- Rika Tandiana
  - Computational MOdelling of NanosCalE and BioPhysical SysTems (CONCEPT) Lab, Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
- German P Barletta
  - Computational MOdelling of NanosCalE and BioPhysical SysTems (CONCEPT) Lab, Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
  - The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, 34151 Trieste, Italy
- Miguel Angel Soler
  - Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Università di Udine, Via delle Scienze 206, 33100 Udine, Italy
- Sara Fortuna
  - Computational MOdelling of NanosCalE and BioPhysical SysTems (CONCEPT) Lab, Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
- Walter Rocchia
  - Computational MOdelling of NanosCalE and BioPhysical SysTems (CONCEPT) Lab, Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
|
22
|
Li S, Meng X, Li R, Huang B, Wang X. NanoBERTa-ASP: predicting nanobody paratope based on a pretrained RoBERTa model. BMC Bioinformatics 2024; 25:122. [PMID: 38515052 PMCID: PMC10956323 DOI: 10.1186/s12859-024-05750-5] [Received: 11/01/2023] [Accepted: 03/18/2024] [Indexed: 03/23/2024] Open
Abstract
BACKGROUND: Nanobodies, also known as VHH or single-domain antibodies, are unique antibody fragments derived solely from heavy chains. They offer advantages of small molecules and conventional antibodies, making them promising therapeutics. The paratope is the specific region on an antibody that binds to an antigen. Paratope prediction involves the identification and characterization of the antigen-binding site on an antibody. This process is crucial for understanding the specificity and affinity of antibody-antigen interactions. Various computational methods and experimental approaches have been developed to predict and analyze paratopes, contributing to advancements in antibody engineering, drug development, and immunotherapy. However, existing predictive models trained on traditional antibodies may not be suitable for nanobodies. Additionally, the limited availability of nanobody datasets poses challenges in constructing accurate models.
METHODS: To address these challenges, we have developed a novel nanobody prediction model, named NanoBERTa-ASP (Antibody Specificity Prediction), which is specifically designed for predicting nanobody-antigen binding sites. The model adopts a training strategy more suitable for nanobodies, based on an advanced natural language processing (NLP) model called BERT (Bidirectional Encoder Representations from Transformers). To be more specific, the model utilizes a masked language modeling approach named RoBERTa (Robustly Optimized BERT Pretraining Approach) to learn the contextual information of the nanobody sequence and predict its binding site.
RESULTS: NanoBERTa-ASP achieved exceptional performance in predicting nanobody binding sites, outperforming existing methods, indicating its proficiency in capturing sequence information specific to nanobodies and accurately identifying their binding sites. Furthermore, NanoBERTa-ASP provides insights into the interaction mechanisms between nanobodies and antigens, contributing to a better understanding of nanobodies and facilitating the design and development of nanobodies with therapeutic potential.
CONCLUSION: NanoBERTa-ASP represents a significant advancement in nanobody paratope prediction. Its superior performance highlights the potential of deep learning approaches in nanobody research. By leveraging the increasing volume of nanobody data, NanoBERTa-ASP can further refine its predictions, enhance its performance, and contribute to the development of novel nanobody-based therapeutics. GitHub repository: https://github.com/WangLabforComputationalBiology/NanoBERTa-ASP.
Affiliation(s)
- Shangru Li
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Xiangpeng Meng
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Rui Li
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Bingding Huang
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- Xin Wang
  - College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
|
23
|
Bennett NR, Watson JL, Ragotte RJ, Borst AJ, See DL, Weidle C, Biswas R, Shrock EL, Leung PJY, Huang B, Goreshnik I, Ault R, Carr KD, Singer B, Criswell C, Vafeados D, Sanchez MG, Kim HM, Torres SV, Chan S, Baker D. Atomically accurate de novo design of single-domain antibodies. bioRxiv 2024:2024.03.14.585103. [PMID: 38562682 PMCID: PMC10983868 DOI: 10.1101/2024.03.14.585103] [Indexed: 04/04/2024]
Abstract
Despite the central role that antibodies play in modern medicine, there is currently no way to rationally design novel antibodies to bind a specific epitope on a target. Instead, antibody discovery currently involves time-consuming immunization of an animal or library screening approaches. Here we demonstrate that a fine-tuned RFdiffusion network is capable of designing de novo antibody variable heavy chains (VHHs) that bind user-specified epitopes. We experimentally confirm binders to four disease-relevant epitopes, and the cryo-EM structure of a designed VHH bound to influenza hemagglutinin is nearly identical to the design model both in the configuration of the CDR loops and the overall binding pose.
Affiliation(s)
- Nathaniel R. Bennett
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
- Joseph L. Watson
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Robert J. Ragotte
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Andrew J. Borst
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Déjenaé L. See
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Connor Weidle
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Riti Biswas
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
- Ellen L. Shrock
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Philip J. Y. Leung
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Molecular Engineering, University of Washington, Seattle, WA 98105, USA
- Buwei Huang
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Inna Goreshnik
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Russell Ault
  - Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Kenneth D. Carr
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Benedikt Singer
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Cameron Criswell
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Dionne Vafeados
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- Ho Min Kim
  - Center for Biomolecular and Cellular Structure, Institute for Basic Science (IBS), Daejeon, 34126, Republic of Korea
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Susana Vázquez Torres
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
- Sidney Chan
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98105, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98105, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
24
|
Jeon W, Kim D. AbFlex: designing antibody complementarity determining regions with flexible CDR definition. Bioinformatics 2024; 40:btae122. [PMID: 38449295 DOI: 10.1093/bioinformatics/btae122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 02/04/2024] [Accepted: 03/05/2024] [Indexed: 03/08/2024] Open
Abstract
MOTIVATION: Antibodies are proteins that the immune system produces in response to foreign pathogens. Designing antibodies that specifically bind to antigens is a key step in developing antibody therapeutics. The complementarity determining regions (CDRs) of the antibody are mainly responsible for binding to the target antigen, and therefore must be designed to recognize the antigen. RESULTS: We develop an antibody design model, AbFlex, that exhibits state-of-the-art performance in terms of structure prediction accuracy and amino acid recovery rate. Furthermore, >38% of newly designed antibody models are estimated to have better binding energies for their antigens than wild types. The effectiveness of the model is attributed to two different strategies that are developed to overcome the difficulty associated with the scarcity of antibody-antigen complex structure data. One strategy is to use an equivariant graph neural network model that is more data-efficient. More importantly, a new data augmentation strategy based on the flexible definition of CDRs significantly increases the performance of the CDR prediction model. AVAILABILITY AND IMPLEMENTATION: The source code and implementation are available at https://github.com/wsjeon92/AbFlex.
Affiliation(s)
- Woosung Jeon
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Dongsup Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
25
Barton J, Gaspariunas A, Galson JD, Leem J. Building Representation Learning Models for Antibody Comprehension. Cold Spring Harb Perspect Biol 2024; 16:a041462. [PMID: 38012013 PMCID: PMC10910360 DOI: 10.1101/cshperspect.a041462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Antibodies are versatile proteins with both the capacity to bind a broad range of targets and a proven track record as some of the most successful therapeutics. However, the development of novel antibody therapeutics is a lengthy and costly process. It is challenging to predict the functional and biophysical properties of antibodies from their amino acid sequence alone, requiring numerous experiments for full characterization. Machine learning, specifically deep representation learning, has emerged as a family of methods that can complement wet lab approaches and accelerate the overall discovery and engineering process. Here, we review advances in antibody sequence representation learning, and how this has improved antibody structure prediction and facilitated antibody optimization. We discuss challenges in the development and implementation of such models, such as the lack of publicly available, well-curated antibody function data and highlight opportunities for improvement. These and future advances in machine learning for antibody sequences have the potential to increase the success rate in developing new therapeutics, resulting in broader access to transformative medicines and improved patient outcomes.
Affiliation(s)
- Justin Barton
- Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
- Jacob D Galson
- Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
- Jinwoo Leem
- Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
26
Greenshields-Watson A, Abanades B, Deane CM. Investigating the ability of deep learning-based structure prediction to extrapolate and/or enrich the set of antibody CDR canonical forms. Front Immunol 2024; 15:1352703. [PMID: 38482007 PMCID: PMC10933040 DOI: 10.3389/fimmu.2024.1352703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 01/30/2024] [Indexed: 04/13/2024] Open
Abstract
Deep learning models have been shown to accurately predict protein structure from sequence, allowing researchers to explore protein space from the structural viewpoint. In this paper we explore whether "novel" features, such as distinct loop conformations can arise from these predictions despite not being present in the training data. Here we have used ABodyBuilder2, a deep learning antibody structure predictor, to predict the structures of ~1.5M paired antibody sequences. We examined the predicted structures of the canonical CDR loops and found that most of these predictions fall into the already described CDR canonical form structural space. We also found a small number of "new" canonical clusters composed of heterogeneous sequences united by a common sequence motif and loop conformation. Analysis of these novel clusters showed their origins to be either shapes seen in the training data at very low frequency or shapes seen at high frequency but at a shorter sequence length. To evaluate explicitly the ability of ABodyBuilder2 to extrapolate, we retrained several models whilst withholding all antibody structures of a specific CDR loop length or canonical form. These "starved" models showed evidence of generalisation across CDRs of different lengths, but they did not extrapolate to loop conformations which were highly distinct from those present in the training data. However, the models were able to accurately predict a canonical form even if only a very small number of examples of that shape were in the training data. Our results suggest that deep learning protein structure prediction methods are unable to make completely out-of-domain predictions for CDR loops. However, in our analysis we also found that even minimal amounts of data of a structural shape allow the method to recover its original predictive abilities. We have made the ~1.5 M predicted structures used in this study available to download at https://doi.org/10.5281/zenodo.10280181.
27
Høie MH, Gade FS, Johansen J, Würtzen C, Winther O, Nielsen M, Marcatili P. DiscoTope-3.0: improved B-cell epitope prediction using inverse folding latent representations. Front Immunol 2024; 15:1322712. [PMID: 38390326 PMCID: PMC10882062 DOI: 10.3389/fimmu.2024.1322712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 01/08/2024] [Indexed: 02/24/2024] Open
Abstract
Accurate computational identification of B-cell epitopes is crucial for the development of vaccines, therapies, and diagnostic tools. However, current structure-based prediction methods face limitations due to the dependency on experimentally solved structures. Here, we introduce DiscoTope-3.0, a markedly improved B-cell epitope prediction tool that innovatively employs inverse folding structure representations and a positive-unlabelled learning strategy, and is adapted for both solved and predicted structures. Our tool demonstrates a considerable improvement in performance over existing methods, accurately predicting linear and conformational epitopes across multiple independent datasets. Most notably, DiscoTope-3.0 maintains high predictive performance across solved, relaxed and predicted structures, alleviating the need for experimental structures and extending the general applicability of accurate B-cell epitope prediction by 3 orders of magnitude. DiscoTope-3.0 is made widely accessible on two web servers, processing over 100 structures per submission, and as a downloadable package. In addition, the servers interface with RCSB and AlphaFoldDB, facilitating large-scale prediction across over 200 million cataloged proteins. DiscoTope-3.0 is available at: https://services.healthtech.dtu.dk/service.php?DiscoTope-3.0.
Affiliation(s)
- Magnus Haraldson Høie
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Frederik Steensgaard Gade
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Julie Maria Johansen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Charlotte Würtzen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Ole Winther
- Section for Cognitive Systems, DTU Compute, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen, Denmark
- Department of Biology, Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Morten Nielsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Paolo Marcatili
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
28
Guo D, De Sciscio ML, Chi-Fung Ng J, Fraternali F. Modelling the assembly and flexibility of antibody structures. Curr Opin Struct Biol 2024; 84:102757. [PMID: 38118364 DOI: 10.1016/j.sbi.2023.102757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/29/2023] [Accepted: 11/30/2023] [Indexed: 12/22/2023]
Abstract
Antibodies are large protein assemblies capable of both specifically recognising antigens and engaging with other proteins and receptors to coordinate immune action. Traditionally, structural studies have been dedicated to antibody variable regions, but efforts to determine and model full-length antibody structures are emerging. Here we review the current knowledge on modelling the structures of antibody assemblies, focusing on their conformational flexibility and the challenge this poses to obtaining and evaluating structural models. Integrative modelling approaches, combining experiments (cryo-electron microscopy, mass spectrometry, etc.) and computational methods (molecular dynamics simulations, deep-learning based approaches, etc.), hold the promise to map the complex conformational landscape of full-length antibody structures.
Affiliation(s)
- Dongjun Guo
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom; Randall Centre for Cell & Molecular Biophysics, King's College London, New Hunt's House, Guy's Campus, London, SE1 1UL, United Kingdom
- Maria Laura De Sciscio
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom; Department of Chemistry, Sapienza University of Rome, P.le A. Moro 5, Rome, 00185, Italy
- Joseph Chi-Fung Ng
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom
- Franca Fraternali
- Institute of Structural and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, United Kingdom
29
Guo Z, Liu J, Wang Y, Chen M, Wang D, Xu D, Cheng J. Diffusion models in bioinformatics and computational biology. Nature Reviews Bioengineering 2024; 2:136-154. [PMID: 38576453 PMCID: PMC10994218 DOI: 10.1038/s44222-023-00114-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 08/25/2023] [Indexed: 04/06/2024]
Abstract
Denoising diffusion models embody a type of generative artificial intelligence that can be applied in computer vision, natural language processing and bioinformatics. In this Review, we introduce the key concepts and theoretical foundations of three diffusion modelling frameworks (denoising diffusion probabilistic models, noise-conditioned scoring networks and score stochastic differential equations). We then explore their applications in bioinformatics and computational biology, including protein design and generation, drug and small-molecule design, protein-ligand interaction modelling, cryo-electron microscopy image data analysis and single-cell data analysis. Finally, we highlight open-source diffusion model tools and consider the future applications of diffusion models in bioinformatics.
Affiliation(s)
- Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Yanli Wang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Mengrui Chen
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Duolin Wang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Dong Xu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- NextGen Precision Health, University of Missouri, Columbia, MO, USA
30
Pinetre J, Delcourt V, Becher F, Garcia P, Barnabé A, Loup B, Popot MA, Fenaille F, Bailly-Chouriberry L. High-throughput untargeted screening of biotherapeutic macromolecules in equine plasma by UHPLC-HRMS/MS: Application to monoclonal antibodies and Fc-fusion proteins for doping control. Drug Test Anal 2024; 16:199-209. [PMID: 37337992 DOI: 10.1002/dta.3525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 05/24/2023] [Accepted: 05/25/2023] [Indexed: 06/21/2023]
Abstract
Many innovative biotherapeutics have been marketed in the last decade. Monoclonal antibodies (mAbs) and Fc-fusion proteins (Fc-proteins) have been developed for the treatment of diverse diseases (cancer, autoimmune diseases, and inflammatory disorders) and now represent an important part of targeted therapies. However, the ready availability of such biomolecules, sometimes characterized by their anabolic, anti-inflammatory, or erythropoiesis-stimulating properties, raises concerns about their potential misuse as performance enhancers for human and animal athletes. In equine doping control laboratories, a method has been reported to detect the administration of a specific human biotherapeutic in equine plasma; but no high-throughput method has been described for the screening without any a priori knowledge of human or murine biotherapeutic. In this context, a new broad-spectrum screening method involving UHPLC-HRMS/MS has been developed for the untargeted analysis of murine or human mAbs and related macromolecules in equine plasma. This approach, consisting of a "pellet digestion" strategy performed in a 96-well plate, demonstrates reliable performances at low concentrations (pmol/mL range) with high-throughput capability (≈100 samples/day). Targeting species-specific proteotypic peptides located within the constant parts of mAbs enables the "universal" detection of human biotherapeutics only by monitoring 10 peptides. As proof of principle, this strategy successfully detected different biotherapeutics in spiked plasma samples, and allowed, for the first time, the detection of a human mAb up to 10 days after a 0.12 mg/kg administration to a horse. This development will expand the analytical capabilities of horse doping control laboratories towards protein-based biotherapeutics with adequate sensitivity, throughput, and cost-effectiveness.
Affiliation(s)
- Justine Pinetre
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Gif sur Yvette, Ile de France, France
- Vivian Delcourt
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- François Becher
- Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Gif sur Yvette, Ile de France, France
- Patrice Garcia
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- Agnès Barnabé
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- Benoit Loup
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- Marie-Agnès Popot
- GIE LCH, Laboratoire des Courses Hippiques, Verrières-le-Buisson, Essonne, France
- François Fenaille
- Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Gif sur Yvette, Ile de France, France
31
Zhao N, Han B, Zhao C, Xu J, Gong X. ABAG-docking benchmark: a non-redundant structure benchmark dataset for antibody-antigen computational docking. Brief Bioinform 2024; 25:bbae048. [PMID: 38385879 PMCID: PMC10883643 DOI: 10.1093/bib/bbae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 01/05/2024] [Accepted: 01/15/2024] [Indexed: 02/23/2024] Open
Abstract
Accurate prediction of antibody-antigen complex structures is pivotal in drug discovery, vaccine design and disease treatment and can facilitate the development of more effective therapies and diagnostics. In this work, we first review the antibody-antigen docking (ABAG-docking) datasets. Then, we present the creation and characterization of a comprehensive benchmark dataset of antibody-antigen complexes. We categorize the dataset based on docking difficulty, interface properties and structural characteristics, to provide a diverse set of cases for rigorous evaluation. Compared with Docking Benchmark 5.5, we have added 112 cases, including 14 single-domain antibody (sdAb) cases and 98 monoclonal antibody (mAb) cases, and also increased the proportion of Difficult cases. Our dataset contains diverse cases, including human/humanized antibodies, sdAbs, rodent antibodies and other types, opening the door to better algorithm development. Furthermore, we provide details on the process of building the benchmark dataset and introduce a pipeline for periodic updates to keep it up to date. We also utilize multiple complex prediction methods including ZDOCK, ClusPro, HDOCK and AlphaFold-Multimer for testing and analyzing this dataset. This benchmark serves as a valuable resource for evaluating and advancing docking computational methods in the analysis of antibody-antigen interaction, enabling researchers to develop more accurate and effective tools for predicting and designing antibody-antigen complexes. The non-redundant ABAG-docking structure benchmark dataset is available at https://github.com/Zhaonan99/Antibody-antigen-complex-structure-benchmark-dataset.
Affiliation(s)
- Nan Zhao
- Institute for Mathematical Sciences, School of Mathematics, Renmin University of China, Beijing, China
- Bingqing Han
- Institute for Mathematical Sciences, School of Mathematics, Renmin University of China, Beijing, China
- Cuicui Zhao
- Institute for Mathematical Sciences, School of Mathematics, Renmin University of China, Beijing, China
- Jinbo Xu
- MoleculeMind Ltd., Beijing, China
- Xinqi Gong
- Institute for Mathematical Sciences, School of Mathematics, Renmin University of China, Beijing, China
- Beijing Academy of Artificial Intelligence, Beijing, China
32
Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024; 9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024] Open
Abstract
Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.
Affiliation(s)
- Barbara Bravi
- Department of Mathematics, Imperial College London, London, SW7 2AZ, UK
33
Johnson NV, Wall SC, Kramer KJ, Holt CM, Periasamy S, Richardson S, Suryadevara N, Andreano E, Paciello I, Pierleoni G, Piccini G, Huang Y, Ge P, Allen JD, Uno N, Shiakolas AR, Pilewski KA, Nargi RS, Sutton RE, Abu-Shmais AA, Parks R, Haynes BF, Carnahan RH, Crowe JE, Montomoli E, Rappuoli R, Bukreyev A, Ross TM, Sautto GA, McLellan JS, Georgiev IS. Discovery and Characterization of a Pan-betacoronavirus S2-binding antibody. bioRxiv 2024:2024.01.15.575741. [PMID: 38293237 PMCID: PMC10827111 DOI: 10.1101/2024.01.15.575741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Three coronaviruses have spilled over from animal reservoirs into the human population and caused deadly epidemics or pandemics. The continued emergence of coronaviruses highlights the need for pan-coronavirus interventions for effective pandemic preparedness. Here, using LIBRA-seq, we report a panel of 50 coronavirus antibodies isolated from human B cells. Of these antibodies, 54043-5 was shown to bind the S2 subunit of spike proteins from alpha-, beta-, and deltacoronaviruses. A cryo-EM structure of 54043-5 bound to the pre-fusion S2 subunit of the SARS-CoV-2 spike defined an epitope at the apex of S2 that is highly conserved among betacoronaviruses. Although non-neutralizing, 54043-5 induced Fc-dependent antiviral responses, including ADCC and ADCP. In murine SARS-CoV-2 challenge studies, protection against disease was observed after introduction of Leu234Ala, Leu235Ala, and Pro329Gly (LALA-PG) substitutions in the Fc region of 54043-5. Together, these data provide new insights into the protective mechanisms of non-neutralizing antibodies and define a broadly conserved epitope within the S2 subunit.
Affiliation(s)
- Nicole V. Johnson
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Steven C. Wall
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Kevin J. Kramer
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Clinton M. Holt
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Program in Chemical and Physical Biology, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Sivakumar Periasamy
- Department of Pathology, University of Texas Medical Branch at Galveston, Galveston, TX 77555, USA
- Galveston National Laboratory, University of Texas Medical Branch at Galveston, Galveston, TX 77555, USA
- Simone Richardson
- National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg 2131, South Africa
- Faculty of Health Sciences, University of the Witwatersrand, Johannesburg 2000, South Africa
- Emanuele Andreano
- Monoclonal Antibody Discovery (MAD) Lab, Fondazione Toscana Life Sciences, Siena 53100, Italy
- Ida Paciello
- Monoclonal Antibody Discovery (MAD) Lab, Fondazione Toscana Life Sciences, Siena 53100, Italy
- Giulio Pierleoni
- Monoclonal Antibody Discovery (MAD) Lab, Fondazione Toscana Life Sciences, Siena 53100, Italy
- Ying Huang
- Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, FL 34987, USA
- Centers for Disease Control and Prevention, Atlanta, GA 30329, USA
- Pan Ge
- Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, FL 34987, USA
- James D. Allen
- Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, FL 34987, USA
- Naoko Uno
- Department of Infection Biology, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44196, USA
- Center for Vaccines and Immunology, University of Georgia, Athens, GA 30602, USA
- Andrea R. Shiakolas
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Kelsey A. Pilewski
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Rachel S. Nargi
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Rachel E. Sutton
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Alexandria A. Abu-Shmais
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Robert Parks
- Duke Human Vaccine Institute, Duke University, Durham, NC 27710, USA
- Barton F. Haynes
- Duke Human Vaccine Institute, Duke University, Durham, NC 27710, USA
- Departments of Medicine and Immunology, Duke University, Durham, NC 27710, USA
- Robert H. Carnahan
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pediatrics, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- James E. Crowe
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Department of Pediatrics, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Emanuele Montomoli
- VisMederi Research S.r.l., Siena 53100, Italy
- VisMederi S.r.l, Siena 53100, Italy
- Department of Molecular and Developmental Medicine, University of Siena, Siena 53100, Italy
- Rino Rappuoli
- Monoclonal Antibody Discovery (MAD) Lab, Fondazione Toscana Life Sciences, Siena 53100, Italy
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Siena 53100, Italy
- Alexander Bukreyev
- Department of Pathology, University of Texas Medical Branch at Galveston, Galveston, TX 77555, USA
- Galveston National Laboratory, University of Texas Medical Branch at Galveston, Galveston, TX 77555, USA
- Ted M. Ross
- Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, FL 34987, USA
- Department of Infection Biology, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44196, USA
- Center for Vaccines and Immunology, University of Georgia, Athens, GA 30602, USA
- Department of Infectious Diseases, University of Georgia, Athens, GA 30602, USA
- Giuseppe A. Sautto
- Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, FL 34987, USA
- Jason S. McLellan
- Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Ivelin S. Georgiev
- Vanderbilt Vaccine Center, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center; Nashville, TN 73232, USA
- Vanderbilt Institute for Infection, Immunology, and Inflammation, Vanderbilt University Medical Center; Nashville, TN 37232, USA
- Department of Computer Science, Vanderbilt University; Nashville, TN 37232, USA
- Center for Structural Biology, Vanderbilt University; Nashville, TN 37232, USA
- Program in Computational Microbiology and Immunology, Vanderbilt University Medical Center; Nashville, TN 37232, USA
34
Raybould MIJ, Turnbull OM, Suter A, Guloglu B, Deane CM. Contextualising the developability risk of antibodies with lambda light chains using enhanced therapeutic antibody profiling. Commun Biol 2024; 7:62. [PMID: 38191620 PMCID: PMC10774428 DOI: 10.1038/s42003-023-05744-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 12/26/2023] [Indexed: 01/10/2024] Open
Abstract
Antibodies with lambda light chains (λ-antibodies) are generally considered to be less developable than those with kappa light chains (κ-antibodies). Though this hypothesis has not been formally established, it has led to substantial systematic biases in drug discovery pipelines and thus contributed to kappa dominance amongst clinical-stage therapeutics. However, the identification of increasing numbers of epitopes preferentially engaged by λ-antibodies shows there is a functional cost to neglecting to consider them as potential lead candidates. Here, we update our Therapeutic Antibody Profiler (TAP) tool to use the latest data and machine learning-based structure prediction, and apply it to evaluate developability risk profiles for κ-antibodies and λ-antibodies based on their surface physicochemical properties. We find that while human λ-antibodies on average have a higher risk of developability issues than κ-antibodies, a sizeable proportion are assigned lower-risk profiles by TAP and should represent more tractable candidates for therapeutic development. Through a comparative analysis of the low- and high-risk populations, we highlight opportunities for strategic design that TAP suggests would enrich for more developable λ-antibodies. Overall, we provide context to the differing developability of κ- and λ-antibodies, enabling a rational approach to incorporate more diversity into the initial pool of immunotherapeutic candidates.
Affiliation(s)
- Matthew I J Raybould
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Oliver M Turnbull
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Annabel Suter
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Bora Guloglu
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK.
35
Abanades B, Olsen T, Raybould MJ, Aguilar-Sanjuan B, Wong W, Georges G, Bujotzek A, Deane C. The Patent and Literature Antibody Database (PLAbDab): an evolving reference set of functionally diverse, literature-annotated antibody sequences and structures. Nucleic Acids Res 2024; 52:D545-D551. [PMID: 37971316 PMCID: PMC10767817 DOI: 10.1093/nar/gkad1056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 10/20/2023] [Accepted: 10/30/2023] [Indexed: 11/19/2023] Open
Abstract
Antibodies are key proteins of the adaptive immune system, and there exists a large body of academic literature and patents dedicated to their study and concomitant conversion into therapeutics, diagnostics, or reagents. These documents often contain extensive functional characterisations of the sets of antibodies they describe. However, leveraging these heterogeneous reports, for example to offer insights into the properties of query antibodies of interest, is currently challenging as there is no central repository through which this wide corpus can be mined by sequence or structure. Here, we present PLAbDab (the Patent and Literature Antibody Database), a self-updating repository containing over 150,000 paired antibody sequences and 3D structural models, of which over 65,000 are unique. We describe the methods used to extract, filter, pair, and model the antibodies in PLAbDab, and showcase how PLAbDab can be searched by sequence, structure, or keyword. PLAbDab uses include annotating query antibodies with potential antigen information from similar entries, analysing structural models of existing antibodies to identify modifications that could improve their properties, and facilitating the compilation of bespoke datasets of antibody sequences/structures that bind to a specific antigen. PLAbDab is freely available via GitHub (https://github.com/oxpig/PLAbDab) and as a searchable webserver (https://opig.stats.ox.ac.uk/webapps/plabdab/).
Affiliation(s)
- Brennan Abanades
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, UK
- Tobias H Olsen
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, UK
- Matthew I J Raybould
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, UK
- Broncio Aguilar-Sanjuan
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, UK
- Wing Ki Wong
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, DE-82377 Penzberg, Germany
- Guy Georges
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, DE-82377 Penzberg, Germany
- Alexander Bujotzek
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, DE-82377 Penzberg, Germany
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, UK
36
Barra C, Nilsson JB, Saksager A, Carri I, Deleuran S, Garcia Alvarez HM, Høie MH, Li Y, Clifford JN, Wan YTR, Moreta LS, Nielsen M. In Silico Tools for Predicting Novel Epitopes. Methods Mol Biol 2024; 2813:245-280. [PMID: 38888783 DOI: 10.1007/978-1-0716-3890-3_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Identifying antigens within a pathogen is a critical task to develop effective vaccines and diagnostic methods, as well as understanding the evolution and adaptation to host immune responses. Historically, antigenicity was studied with experiments that evaluate the immune response against selected fragments of pathogens. Using this approach, the scientific community has gathered abundant information regarding which pathogenic fragments are immunogenic. The systematic collection of this data has enabled unraveling many of the fundamental rules underlying the properties defining epitopes and immunogenicity, and has resulted in the creation of a large panel of immunologically relevant predictive (in silico) tools. The development and application of such tools have proven to accelerate the identification of novel epitopes within biomedical applications reducing experimental costs. This chapter introduces some basic concepts about MHC presentation, T cell and B cell epitopes, the experimental efforts to determine those, and focuses on state-of-the-art methods for epitope prediction, highlighting their strengths and limitations, and catering instructions for their rational use.
Affiliation(s)
- Carolina Barra
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark.
- Astrid Saksager
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Ibel Carri
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín (UNSAM) - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Martín, Argentina
- Sebastian Deleuran
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Heli M Garcia Alvarez
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín (UNSAM) - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Martín, Argentina
- Magnus Haraldson Høie
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Yuchen Li
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Yat-Tsai Richie Wan
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Lys Sanz Moreta
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
- Morten Nielsen
  - Section for Bioinformatics, Health Tech, Technical University of Denmark, Lyngby, Denmark
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín (UNSAM) - Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), San Martín, Argentina
37
Yin R, Pierce BG. Evaluation of AlphaFold antibody-antigen modeling with implications for improving predictive accuracy. Protein Sci 2024; 33:e4865. [PMID: 38073135 PMCID: PMC10751731 DOI: 10.1002/pro.4865] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 12/26/2023]
Abstract
High resolution antibody-antigen structures provide critical insights into immune recognition and can inform therapeutic design. The challenges of experimental structural determination and the diversity of the immune repertoire underscore the necessity of accurate computational tools for modeling antibody-antigen complexes. Initial benchmarking showed that despite overall success in modeling protein-protein complexes, AlphaFold and AlphaFold-Multimer have limited success in modeling antibody-antigen interactions. In this study, we performed a thorough analysis of AlphaFold's antibody-antigen modeling performance on 427 nonredundant antibody-antigen complex structures, identifying useful confidence metrics for predicting model quality, and features of complexes associated with improved modeling success. Notably, we found that the latest version of AlphaFold improves near-native modeling success to over 30%, versus approximately 20% for a previous version, while increased AlphaFold sampling gives approximately 50% success. With this improved success, AlphaFold can generate accurate antibody-antigen models in many cases, while additional training or other optimization may further improve performance.
Affiliation(s)
- Rui Yin
  - University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
  - University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
38
Dorey-Robinson D, Maccari G, Hammond JA. IgMAT: immunoglobulin sequence multi-species annotation tool for any species including those with incomplete antibody annotation or unusual characteristics. BMC Bioinformatics 2023; 24:491. [PMID: 38129777 PMCID: PMC10740263 DOI: 10.1186/s12859-023-05624-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND: The advent and continual improvement of high-throughput sequencing technologies has made immunoglobulin repertoire sequencing accessible and informative regardless of study species. However, to fully map dynamic changes in polyclonal responses, precise framework and complementarity-determining region annotation of rearranging genes is pivotal. Most sequence annotation tools are designed primarily for use with human and mouse antibody sequences and use databases with fixed species lists, applying very specific assumptions which select against unique structural characteristics. For this reason, data-agnostic tools able to learn from presented data can be very useful with new species or with novel datasets. RESULTS: We have developed IgMAT, which utilises a reduced amino acid alphabet and incorporates multiple HMM alignments into a single consensus to automatically annotate immunoglobulin sequences from most organisms. Additionally, the software allows the incorporation of user-defined databases to better represent the species and/or antibody class of interest. To demonstrate the accuracy and utility of IgMAT, we present analysis of sequences extracted from structural data and immunoglobulin sequence datasets from several different species. CONCLUSIONS: IgMAT is fully open-sourced and freely available on GitHub (https://github.com/TPI-Immunogenetics/igmat) for download under the GPLv3 license. It can be used as a CLI application or as a Python module to be integrated in custom scripts.
Affiliation(s)
- Giuseppe Maccari
  - The Pirbright Institute, Pirbright, UK
  - Anthony Nolan Research Institute, London, UK
39
Terekhova M, Swain A, Bohacova P, Aladyeva E, Arthur L, Laha A, Mogilenko DA, Burdess S, Sukhov V, Kleverov D, Echalar B, Tsurinov P, Chernyatchik R, Husarcikova K, Artyomov MN. Single-cell atlas of healthy human blood unveils age-related loss of NKG2C+GZMB-CD8+ memory T cells and accumulation of type 2 memory T cells. Immunity 2023; 56:2836-2854.e9. [PMID: 37963457 DOI: 10.1016/j.immuni.2023.10.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 04/13/2023] [Revised: 08/11/2023] [Accepted: 10/19/2023] [Indexed: 11/16/2023]
Abstract
Extensive, large-scale single-cell profiling of healthy human blood at different ages is one of the critical pending tasks required to establish a framework for the systematic understanding of human aging. Here, using single-cell RNA/T cell receptor (TCR)/BCR-seq with protein feature barcoding, we profiled 317 samples from 166 healthy individuals aged 25-85 years old. From this, we generated a dataset from ∼2 million cells that described 55 subpopulations of blood immune cells. Twelve subpopulations changed with age, including the accumulation of GZMK+CD8+ T cells and HLA-DR+CD4+ T cells. In contrast to other T cell memory subsets, transcriptionally distinct NKG2C+GZMB-CD8+ T cells counterintuitively decreased with age. Furthermore, we found a concerted age-associated increase in type 2/interleukin (IL)4-expressing memory subpopulations across CD4+ and CD8+ T cell compartments (CCR4+CD8+ Tcm and Th2 CD4+ Tmem), suggesting a systematic functional shift in immune homeostasis with age. Our work provides novel insights into healthy human aging and a comprehensive annotated resource.
Affiliation(s)
- Marina Terekhova
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Amanda Swain
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Pavla Bohacova
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Ekaterina Aladyeva
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Laura Arthur
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Anwesha Laha
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Denis A Mogilenko
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA; Department of Medicine, Department of Pathology, Microbiology, and Immunology, Vanderbilt Center for Immunobiology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Samantha Burdess
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Vladimir Sukhov
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA; Computer Technologies Laboratory, ITMO University, Saint Petersburg 197101, Russia
- Denis Kleverov
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA; Computer Technologies Laboratory, ITMO University, Saint Petersburg 197101, Russia
- Barbora Echalar
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Petr Tsurinov
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA; JetBrains Research, 8021 Paphos, Cyprus
- Roman Chernyatchik
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA; JetBrains Research, 80639 Munich, Germany
- Kamila Husarcikova
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Maxim N Artyomov
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO 63110, USA.
40
Tsuchiya Y, Yonezawa T, Yamamori Y, Inoura H, Osawa M, Ikeda K, Tomii K. PoSSuM v.3: A Major Expansion of the PoSSuM Database for Finding Similar Binding Sites of Proteins. J Chem Inf Model 2023; 63:7578-7587. [PMID: 38016694 PMCID: PMC10716853 DOI: 10.1021/acs.jcim.3c01405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/28/2023] [Accepted: 11/01/2023] [Indexed: 11/30/2023]
Abstract
Information on structures of protein-ligand complexes, including comparisons of known and putative protein-ligand-binding pockets, is valuable for protein annotation and drug discovery and development. To facilitate biomedical and pharmaceutical research, we developed PoSSuM (https://possum.cbrc.pj.aist.go.jp/PoSSuM/), a database for identifying similar binding pockets in proteins. The current PoSSuM database includes 191 million similar pairs among almost 10 million identified pockets. PoSSuM drug search (PoSSuMds) is a resource for investigating ligand and receptor diversity among a set of pockets that can bind to an approved drug compound. The enhanced PoSSuMds covers pockets associated with both approved drugs and drug candidates in clinical trials from the latest release of ChEMBL. Additionally, we developed two new databases: PoSSuMAg for investigating antibody-antigen interactions and PoSSuMAF to simplify exploring putative pockets in AlphaFold human protein models.
Affiliation(s)
- Yuko Tsuchiya
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Tomoki Yonezawa
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
- Yu Yamamori
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Hiroko Inoura
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Masanori Osawa
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
- Kazuyoshi Ikeda
  - Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
  - Medicinal Chemistry Applied AI Unit, HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
41
Chungyoun M, Gray JJ. AI Models for Protein Design are Driving Antibody Engineering. Curr Opin Biomed Eng 2023; 28:100473. [PMID: 37484815 PMCID: PMC10361400 DOI: 10.1016/j.cobme.2023.100473] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/25/2023]
Abstract
Therapeutic antibody engineering seeks to identify antibody sequences with specific binding to a target and optimized drug-like properties. When guided by deep learning, antibody generation methods can draw on prior knowledge and experimental efforts to improve this process. By leveraging the increasing quantity and quality of predicted structures of antibodies and target antigens, powerful structure-based generative models are emerging. In this review, we tie the advancements in deep learning-based protein structure prediction and design to the study of antibody therapeutics.
Affiliation(s)
- Michael Chungyoun
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, 21287, USA
- Jeffrey J Gray
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, 21287, USA
  - Program in Molecular Biophysics, Institute for Nanobiotechnology, and Center for Computational Biology, Johns Hopkins University, Baltimore, MD, 21287, USA
42
Castel J, Delaux S, Hernandez-Alba O, Cianférani S. Recent advances in structural mass spectrometry methods in the context of biosimilarity assessment: from sequence heterogeneities to higher order structures. J Pharm Biomed Anal 2023; 236:115696. [PMID: 37713983 DOI: 10.1016/j.jpba.2023.115696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/17/2023]
Abstract
Biotherapeutics and their biosimilar versions have been flourishing in the biopharmaceutical market for several years. Structural and functional characterization is needed to achieve analytical biosimilarity through the assessment of critical quality attributes as required by regulatory authorities. The role of analytical strategies, particularly mass spectrometry-based methods, is pivotal to gathering valuable information for the in-depth characterization of biotherapeutics and biosimilarity assessment. Structural mass spectrometry methods (native MS, HDX-MS, top-down MS, etc.) provide information ranging from primary sequence assessment to higher order structure evaluation. This review focuses on recent developments and applications in structural mass spectrometry for biotherapeutic and biosimilar characterization.
Affiliation(s)
- Jérôme Castel
  - Laboratoire de Spectrométrie de Masse Bio-Organique, IPHC UMR 7178, Université de Strasbourg, CNRS, Strasbourg 67087, France; Infrastructure Nationale de Protéomique ProFI, FR2048 CNRS CEA, Strasbourg 67087, France
- Sarah Delaux
  - Laboratoire de Spectrométrie de Masse Bio-Organique, IPHC UMR 7178, Université de Strasbourg, CNRS, Strasbourg 67087, France; Infrastructure Nationale de Protéomique ProFI, FR2048 CNRS CEA, Strasbourg 67087, France
- Oscar Hernandez-Alba
  - Laboratoire de Spectrométrie de Masse Bio-Organique, IPHC UMR 7178, Université de Strasbourg, CNRS, Strasbourg 67087, France; Infrastructure Nationale de Protéomique ProFI, FR2048 CNRS CEA, Strasbourg 67087, France
- Sarah Cianférani
  - Laboratoire de Spectrométrie de Masse Bio-Organique, IPHC UMR 7178, Université de Strasbourg, CNRS, Strasbourg 67087, France; Infrastructure Nationale de Protéomique ProFI, FR2048 CNRS CEA, Strasbourg 67087, France.
43
Liu Y, Tian B. Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning. Brief Bioinform 2023; 25:bbad488. [PMID: 38171929 PMCID: PMC10782905 DOI: 10.1093/bib/bbad488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 09/28/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024] Open
Abstract
Protein-DNA interaction is critical for life activities such as replication, transcription and splicing. Identifying protein-DNA binding residues is essential for modeling their interaction and downstream studies. However, developing accurate and efficient computational methods for this task remains challenging. Improvements in this area have the potential to drive novel applications in biotechnology and drug design. In this study, we propose a novel approach called Contrastive Learning And Pre-trained Encoder (CLAPE), which combines a pre-trained protein language model and the contrastive learning method to predict DNA binding residues. We trained the CLAPE-DB model on the protein-DNA binding sites dataset and evaluated the model performance and generalization ability through various experiments. The results showed that the area under ROC curve values of the CLAPE-DB model on the two benchmark datasets reached 0.871 and 0.881, respectively, indicating superior performance compared to other existing models. CLAPE-DB showed better generalization ability and was specific to DNA-binding sites. In addition, we trained CLAPE on different protein-ligand binding sites datasets, demonstrating that CLAPE is a general framework for binding sites prediction. To facilitate the scientific community, the benchmark datasets and codes are freely available at https://github.com/YAndrewL/clape.
Affiliation(s)
- Yufan Liu
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
- Boxue Tian
  - MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, China
44
Polonsky K, Pupko T, Freund NT. Evaluation of the Ability of AlphaFold to Predict the Three-Dimensional Structures of Antibodies and Epitopes. J Immunol 2023; 211:1578-1588. [PMID: 37782047 DOI: 10.4049/jimmunol.2300150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 09/06/2023] [Indexed: 10/03/2023]
Abstract
Being able to accurately predict the three-dimensional structure of an Ab can facilitate Ab characterization and epitope prediction, with important diagnostic and clinical implications. In this study, we evaluated the ability of AlphaFold to predict the structures of 222 recently published, high-resolution Fab H and L chain structures of Abs from different species directed against different Ags. We show that although the overall Ab prediction quality is in line with the results of CASP14, regions such as the complementarity-determining regions (CDRs) of the H chain, which are prone to higher variation, are predicted less accurately. Moreover, we discovered that AlphaFold mispredicts the bending angles between the variable and constant domains. To evaluate the ability of AlphaFold to model Ab-Ag interactions based only on sequence, we used AlphaFold-Multimer in combination with ZDOCK to predict the structures of 26 known Ab-Ag complexes. ZDOCK, which was applied on bound components of both the Ab and the Ag, succeeded in assembling 11 complexes, whereas AlphaFold succeeded in predicting only 2 of 26 models, with significant deviations in the docking contacts predicted in the rest of the molecules. Within the 11 complexes that were successfully predicted by ZDOCK, 9 involved short-peptide Ags (18-mer or less), whereas only 2 were complexes of Ab with a full-length protein. Docking of modeled unbound Ab and Ag was unsuccessful. In summary, our study provides important information about the abilities and limitations of using AlphaFold to predict Ab-Ag interactions and suggests areas for possible improvement.
Affiliation(s)
- Ksenia Polonsky
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Tal Pupko
  - Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Natalia T Freund
  - Department of Clinical Microbiology and Immunology, Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
45
Yuan M, Feng Z, Lv H, So N, Shen IR, Tan TJC, Teo QW, Ouyang WO, Talmage L, Wilson IA, Wu NC. Widespread impact of immunoglobulin V-gene allelic polymorphisms on antibody reactivity. Cell Rep 2023; 42:113194. [PMID: 37777966 PMCID: PMC10636607 DOI: 10.1016/j.celrep.2023.113194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 09/07/2023] [Accepted: 09/14/2023] [Indexed: 10/03/2023] Open
Abstract
The ability of the human immune system to generate antibodies to any given antigen can be strongly influenced by immunoglobulin V-gene allelic polymorphisms. However, previous studies have provided only limited examples. Therefore, the prevalence of this phenomenon has been unclear. By analyzing >1,000 publicly available antibody-antigen structures, we show that many V-gene allelic polymorphisms in antibody paratopes are determinants for antibody binding activity. Biolayer interferometry experiments further demonstrate that paratope allelic polymorphisms on both heavy and light chains often abolish antibody binding. We also illustrate the importance of minor V-gene allelic polymorphisms with low frequency in several broadly neutralizing antibodies to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza virus. Overall, this study not only highlights the pervasive impact of V-gene allelic polymorphisms on antibody binding but also provides mechanistic insights into the variability of antibody repertoires across individuals, which in turn have important implications for vaccine development and antibody discovery.
Affiliation(s)
- Meng Yuan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Ziqi Feng
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Huibin Lv
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Natalie So
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Ivana R Shen
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Timothy J C Tan
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Qi Wen Teo
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Wenhao O Ouyang
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Logan Talmage
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Ian A Wilson
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA; The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Nicholas C Wu
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
46
Zhou Y, Huang Z, Li W, Wei J, Jiang Q, Yang W, Huang J. Deep learning in preclinical antibody drug discovery and development. Methods 2023; 218:57-71. [PMID: 37454742 DOI: 10.1016/j.ymeth.2023.07.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 03/20/2023] [Accepted: 07/10/2023] [Indexed: 07/18/2023] Open
Abstract
Antibody drugs have become a key part of biotherapeutics. Patients suffering from various diseases have benefited from antibody therapies. However, their development process is rather long, expensive and risky. To speed up the process, reduce cost and improve success rate, artificial intelligence, especially deep learning methods, has been widely used in all aspects of preclinical antibody drug development, from library generation to hit identification, developability screening, lead selection and optimization. In this review, we systematically summarize antibody encodings, deep learning architectures and models used in preclinical antibody drug discovery and development. We also critically discuss challenges and opportunities, problems and possible solutions, current applications and future directions of deep learning in antibody drug development.
Affiliation(s)
- Yuwei Zhou
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Ziru Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wenzhen Li
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jinyi Wei
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Qianhu Jiang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wei Yang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
47
Bai G, Sun C, Guo Z, Wang Y, Zeng X, Su Y, Zhao Q, Ma B. Accelerating antibody discovery and design with artificial intelligence: Recent advances and prospects. Semin Cancer Biol 2023; 95:13-24. [PMID: 37355214 DOI: 10.1016/j.semcancer.2023.06.005] [Received: 03/23/2023] [Revised: 06/09/2023] [Accepted: 06/18/2023] [Indexed: 06/26/2023]
Abstract
Therapeutic antibodies are the largest class of biotherapeutics and have been successful in treating human diseases. However, the design and discovery of antibody drugs remain challenging and time-consuming. Recently, artificial intelligence technology has had an incredible impact on antibody design and discovery, resulting in significant advances in antibody discovery, optimization, and developability. This review summarizes major machine learning (ML) methods and their applications for computational predictors of antibody structure and antigen interface/interaction, as well as the evaluation of antibody developability. Additionally, this review addresses the current status of ML-based therapeutic antibodies under preclinical and clinical phases. While many challenges remain, ML may offer a new therapeutic option for the future direction of fully computational antibody design.
Affiliation(s)
- Ganggang Bai
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Chuance Sun
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Ziang Guo
  - Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China
- Yangjing Wang
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Xincheng Zeng
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Yuhong Su
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Qi Zhao
  - Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China; MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macao Special Administrative Region of China
- Buyong Ma
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Digiwiser BioTechnolgy, Limited, Shanghai 201203, China
48
Fischman S, Levin I, Rondeau JM, Štrajbl M, Lehmann S, Huber T, Nimrod G, Cebe R, Omer D, Kovarik J, Bernstein S, Sasson Y, Demishtein A, Shlamkovich T, Bluvshtein O, Grossman N, Barak-Fuchs R, Zhenin M, Fastman Y, Twito S, Vana T, Zur N, Ofran Y. "Redirecting an anti-IL-1β antibody to bind a new, unrelated and computationally predicted epitope on hIL-17A". Commun Biol 2023; 6:997. [PMID: 37773269 PMCID: PMC10542344 DOI: 10.1038/s42003-023-05369-x] [Received: 01/16/2023] [Accepted: 09/18/2023] [Indexed: 10/01/2023] Open
Abstract
Antibody engineering technology is at the forefront of therapeutic antibody development. The primary goal for engineering a therapeutic antibody is the generation of an antibody with a desired specificity, affinity, function, and developability profile. Mature antibodies are considered antigen specific, which may preclude their use as a starting point for antibody engineering. Here, we explore the plasticity of mature antibodies by engineering novel specificity and function to a pre-selected antibody template. Using a small, focused library, we engineered AAL160, an anti-IL-1β antibody, to bind the unrelated antigen IL-17A, with the introduction of seven mutations. The final redesigned antibody, 11.003, retains favorable biophysical properties, binds IL-17A with sub-nanomolar affinity, inhibits IL-17A binding to its cognate receptor and is functional in a cell-based assay. The epitope of the engineered antibody can be computationally predicted based on the sequence of the template antibody, as is confirmed by the crystal structure of the 11.003/IL-17A complex. The structures of the 11.003/IL-17A and the AAL160/IL-1β complexes highlight the contribution of germline residues to the paratopes of both the template and re-designed antibody. This case study suggests that the inherent plasticity of antibodies allows for re-engineering of mature antibodies to new targets, while maintaining desirable developability profiles.
Affiliation(s)
- Itay Levin
  - Biolojic Design LTD, Rehovot, Israel
  - Enzymit LTD, Ness Ziona, Israel
- Sylvie Lehmann
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
- Thomas Huber
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
  - Ridgelinediscovery, Basel, Switzerland
- Régis Cebe
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
- Dotan Omer
  - Biolojic Design LTD, Rehovot, Israel
  - EmendoBio Inc., Rehovot, Israel
- Jiri Kovarik
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
- Alik Demishtein
  - Biolojic Design LTD, Rehovot, Israel
  - Anima Biotech, Ramat-Gan, Israel
- Olga Bluvshtein
  - Biolojic Design LTD, Rehovot, Israel
  - Enzymit LTD, Ness Ziona, Israel
- Shir Twito
  - Biolojic Design LTD, Rehovot, Israel
  - Enzymit LTD, Ness Ziona, Israel
- Tal Vana
  - Biolojic Design LTD, Rehovot, Israel
- Nevet Zur
  - Biolojic Design LTD, Rehovot, Israel
- Yanay Ofran
  - Biolojic Design LTD, Rehovot, Israel
  - The Goodman Faculty of Life Sciences, Nanotechnology Building, Bar Ilan University, Ramat Gan, Israel
49
Spoendlin FC, Abanades B, Raybould MIJ, Wong WK, Georges G, Deane CM. Improved computational epitope profiling using structural models identifies a broader diversity of antibodies that bind to the same epitope. Front Mol Biosci 2023; 10:1237621. [PMID: 37790877 PMCID: PMC10544996 DOI: 10.3389/fmolb.2023.1237621] [Received: 06/09/2023] [Accepted: 08/28/2023] [Indexed: 10/05/2023] Open
Abstract
The function of an antibody is intrinsically linked to the epitope it engages. Clonal clustering methods, based on sequence identity, are commonly used to group antibodies that will bind to the same epitope. However, such methods neglect the fact that antibodies with highly diverse sequences can exhibit similar binding site geometries and engage common epitopes. In a previous study, we described SPACE1, a method that structurally clustered antibodies in order to predict their epitopes. This methodology was limited by the inaccuracies and incomplete coverage of template-based modeling. In addition, it was only benchmarked at the level of domain-consistency on one virus class. Here, we present SPACE2, which uses the latest machine learning-based structure prediction technology combined with a novel clustering protocol, and benchmark it on binding data that have epitope-level resolution. On six diverse sets of antigen-specific antibodies, we demonstrate that SPACE2 accurately clusters antibodies that engage common epitopes and achieves far higher dataset coverage than clonal clustering and SPACE1. Furthermore, we show that the functionally consistent structural clusters identified by SPACE2 are even more diverse in sequence, genetic lineage, and species origin than those found by SPACE1. These results reiterate that structural data improve our ability to identify antibodies that bind to the same epitope, adding information to sequence-based methods, especially in datasets of antibodies from diverse sources. SPACE2 is openly available on GitHub (https://github.com/oxpig/SPACE2).
Affiliation(s)
- Fabian C. Spoendlin
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
- Brennan Abanades
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
- Matthew I. J. Raybould
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
- Wing Ki Wong
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
- Guy Georges
  - Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
- Charlotte M. Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
50
Gaudreault F, Corbeil CR, Sulea T. Enhanced antibody-antigen structure prediction from molecular docking using AlphaFold2. Sci Rep 2023; 13:15107. [PMID: 37704686 PMCID: PMC10499836 DOI: 10.1038/s41598-023-42090-5] [Received: 02/03/2023] [Accepted: 09/05/2023] [Indexed: 09/15/2023] Open
Abstract
Predicting the structure of antibody-antigen complexes has tremendous value in biomedical research but unfortunately suffers from poor performance in real-life applications. AlphaFold2 (AF2) has provided renewed hope for improvements in the field of protein-protein docking but has shown limited success against antibody-antigen complexes due to the lack of co-evolutionary constraints. In this study, we used physics-based protein docking methods for building decoy sets consisting of low-energy docking solutions that were either geometrically close to the native structure (positives) or not (negatives). The docking models were then fed into AF2 to assess their confidence with a novel composite score based on normalized pLDDT and pTMscore metrics after AF2 structural refinement. We show benefits of the AF2 composite score for rescoring docking poses both in terms of (1) classification of positives/negatives and of (2) success rates with particular emphasis on early enrichment. Docking models of at least medium quality present in the decoy set, but not necessarily highly ranked by docking methods, benefitted most from AF2 rescoring by experiencing large advances towards the top of the reranked list of models. These improvements, obtained without any calibration or novel methodologies, led to a notable level of performance in antibody-antigen unbound docking that was never achieved previously.
Affiliation(s)
- Francis Gaudreault
  - Human Health Therapeutics Research Centre, National Research Council Canada, 6100 Royalmount Avenue, Montreal, QC, H4P 2R2, Canada
- Christopher R Corbeil
  - Human Health Therapeutics Research Centre, National Research Council Canada, 6100 Royalmount Avenue, Montreal, QC, H4P 2R2, Canada
- Traian Sulea
  - Human Health Therapeutics Research Centre, National Research Council Canada, 6100 Royalmount Avenue, Montreal, QC, H4P 2R2, Canada
  - Institute of Parasitology, McGill University, 21111 Lakeshore Road, Sainte-Anne-de-Bellevue, QC, H9X 3V9, Canada