1
Singh R, Im C, Qiu Y, Mackness B, Gupta A, Joren T, Sledzieski S, Erlach L, Wendt M, Fomekong Nanfack Y, Bryson B, Berger B. Learning the language of antibody hypervariability. Proc Natl Acad Sci U S A 2025; 122:e2418918121. [PMID: 39793083 PMCID: PMC11725859 DOI: 10.1073/pnas.2418918121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2024] [Accepted: 11/19/2024] [Indexed: 01/12/2025] Open
Abstract
Protein language models (PLMs) have demonstrated impressive success in modeling proteins. However, general-purpose "foundational" PLMs have limited performance in modeling antibodies due to the latter's hypervariable regions, which do not conform to the evolutionary conservation principles that such models rely on. In this study, we propose a transfer learning framework called Antibody Mutagenesis-Augmented Processing (AbMAP), which fine-tunes foundational models for antibody-sequence inputs by supervising on antibody structure and binding specificity examples. Our learned feature representations accurately predict mutational effects on antigen binding, paratope identification, and other key antibody properties. We experimentally validate AbMAP for antibody optimization by applying it to refine a set of antibodies that bind to a SARS-CoV-2 peptide, and obtain an 82% hit-rate and up to 22-fold increase in binding affinity. AbMAP also unlocks large-scale analyses of immune repertoires, revealing that B-cell receptor repertoires of individuals, while remarkably different in sequence, converge toward similar structural and functional coverage. Importantly, AbMAP's transfer learning approach can be readily adapted to advances in foundational PLMs. We anticipate AbMAP will accelerate the efficient design and modeling of antibodies, expedite the discovery of antibody-based therapeutics, and deepen our understanding of humoral immunity.
Affiliation(s)
- Rohit Singh
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
- Chiho Im
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
- Yu Qiu
  - Sanofi R&D Large Molecule Research, Cambridge, MA 02141
- Abhinav Gupta
  - Sanofi R&D Large Molecule Research, Cambridge, MA 02141
- Taylor Joren
  - Sanofi R&D Data and Data Science, Artificial Intelligence and Deep Analytics, Cambridge, MA 02141
- Samuel Sledzieski
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
- Lena Erlach
  - Department of Biosystems Science and Engineering, ETH Zürich, 8092, Switzerland
- Maria Wendt
  - Sanofi R&D Large Molecule Research, Cambridge, MA 02141
- Bryan Bryson
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139
2
O'Donnell TJ, Kanduri C, Isacchini G, Limenitakis JP, Brachman RA, Alvarez RA, Haff IH, Sandve GK, Greiff V. Reading the repertoire: Progress in adaptive immune receptor analysis using machine learning. Cell Syst 2024; 15:1168-1189. [PMID: 39701034 DOI: 10.1016/j.cels.2024.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 08/16/2024] [Accepted: 11/14/2024] [Indexed: 12/21/2024]
Abstract
The adaptive immune system holds invaluable information on past and present immune responses in the form of B and T cell receptor sequences, but we are limited in our ability to decode this information. Machine learning approaches are under active investigation for a range of tasks relevant to understanding and manipulating the adaptive immune receptor repertoire, including matching receptors to the antigens they bind, generating antibodies or T cell receptors for use as therapeutics, and diagnosing disease based on patient repertoires. Progress on these tasks has the potential to substantially improve the development of vaccines, therapeutics, and diagnostics, as well as advance our understanding of fundamental immunological principles. We outline key challenges for the field, highlighting the need for software benchmarking, targeted large-scale data generation, and coordinated research efforts.
Affiliation(s)
- Chakravarthi Kanduri
  - Department of Informatics, University of Oslo, Oslo, Norway; UiO:RealArt Convergence Environment, University of Oslo, Oslo, Norway
- Rebecca A Brachman
  - Imprint Labs, LLC, New York, NY, USA; Cornell Tech, Cornell University, New York, NY, USA
- Ingrid H Haff
  - Department of Mathematics, University of Oslo, 0371 Oslo, Norway
- Geir K Sandve
  - Department of Informatics, University of Oslo, Oslo, Norway; UiO:RealArt Convergence Environment, University of Oslo, Oslo, Norway
- Victor Greiff
  - Imprint Labs, LLC, New York, NY, USA; Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
3
Meng F, Zhou N, Hu G, Liu R, Zhang Y, Jing M, Hou Q. A comprehensive overview of recent advances in generative models for antibodies. Comput Struct Biotechnol J 2024; 23:2648-2660. [PMID: 39027650 PMCID: PMC11254834 DOI: 10.1016/j.csbj.2024.06.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/20/2024] Open
Abstract
Therapeutic antibodies are an important class of biopharmaceuticals. With the rapid development of deep learning methods and the increasing amount of antibody data, antibody generative models have made great progress recently. They aim to solve the antibody space searching problems and are widely incorporated into the antibody development process. Therefore, a comprehensive introduction to the development methods in this field is imperative. Here, we collected 34 representative antibody generative models published recently and all generative models can be divided into three categories: sequence-generating models, structure-generating models, and hybrid models, based on their principles and algorithms. We further studied their performance and contributions to antibody sequence prediction, structure optimization, and affinity enhancement. Our manuscript will provide a comprehensive overview of the status of antibody generative models and also offer guidance for selecting different approaches.
Affiliation(s)
- Fanxu Meng
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Na Zhou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Guangchun Hu
  - School of Information Science and Engineering, University of Jinan, Jinan 250022, China
- Ruotong Liu
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
- Yuanyuan Zhang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Ming Jing
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250000, China
- Qingzhen Hou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
  - National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
4
Deichmann M, Hansson FG, Jensen ED. Yeast-based screening platforms to understand and improve human health. Trends Biotechnol 2024; 42:1258-1272. [PMID: 38677901 DOI: 10.1016/j.tibtech.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 04/01/2024] [Accepted: 04/03/2024] [Indexed: 04/29/2024]
Abstract
Detailed molecular understanding of the human organism is essential to develop effective therapies. Saccharomyces cerevisiae has been used extensively for acquiring insights into important aspects of human health, such as studying genetics and cell-cell communication, elucidating protein-protein interaction (PPI) networks, and investigating human G protein-coupled receptor (hGPCR) signaling. We highlight recent advances and opportunities of yeast-based technologies for cost-efficient chemical library screening on hGPCRs, accelerated deciphering of PPI networks with mating-based screening and selection, and accurate cell-cell communication with human immune cells. Overall, yeast-based technologies constitute an important platform to support basic understanding and innovative applications towards improving human health.
Affiliation(s)
- Marcus Deichmann
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- Frederik G Hansson
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- Emil D Jensen
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
5
Li J, Liao L, Zhang C, Huang K, Zhang P, Zhang JZH, Wan X, Zhang H. Development and experimental validation of computational methods for human antibody affinity enhancement. Brief Bioinform 2024; 25:bbae488. [PMID: 39358035 PMCID: PMC11446602 DOI: 10.1093/bib/bbae488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Revised: 09/01/2024] [Accepted: 09/18/2024] [Indexed: 10/04/2024] Open
Abstract
High affinity is crucial for the efficacy and specificity of antibody. Due to involving high-throughput screens, biological experiments for antibody affinity maturation are time-consuming and have a low success rate. Precise computational-assisted antibody design promises to accelerate this process, but there is still a lack of effective computational methods capable of pinpointing beneficial mutations within the complementarity-determining region (CDR) of antibodies. Moreover, random mutations often lead to challenges in antibody expression and immunogenicity. In this study, to enhance the affinity of a human antibody against avian influenza virus, a CDR library was constructed and evolutionary information was acquired through sequence alignment to restrict the mutation positions and types. Concurrently, a statistical potential methodology was developed based on amino acid interactions between antibodies and antigens to calculate potential affinity-enhanced antibodies, which were further subjected to molecular dynamics simulations. Subsequently, experimental validation confirmed that a point mutation enhancing 2.5-fold affinity was obtained from 10 designs, resulting in the antibody affinity of 2 nM. A predictive model for antibody-antigen interactions based on the binding interface was also developed, achieving an Area Under the Curve (AUC) of 0.83 and a precision of 0.89 on the test set. Lastly, a novel approach involving combinations of affinity-enhancing mutations and an iterative mutation optimization scheme similar to the Monte Carlo method were proposed. This study presents computational methods that rapidly and accurately enhance antibody affinity, addressing issues related to antibody expression and immunogenicity.
Affiliation(s)
- Junxin Li
  - Center for Protein and Cell-based Drugs, Institute of Biomedicine and Biotechnology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
- Linbu Liao
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Los Angeles, CA 90048, United States
- Chao Zhang
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
- Kaifang Huang
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
  - School of Chemistry and Molecular Engineering, East China Normal University, 3663 Zhongshan North Road, Putuo District, Shanghai 200062, China
- Pengfei Zhang
  - Guangdong Key Laboratory of Nanomedicine, Shenzhen Engineering Laboratory of Nanomedicine and Nanoformulations, CAS Key Lab for Health Informatics, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
- John Z H Zhang
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
  - Faculty of Synthetic Biology, Shenzhen University of Advanced Technology, Shenzhen 518055, China
- Xiaochun Wan
  - Center for Protein and Cell-based Drugs, Institute of Biomedicine and Biotechnology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
- Haiping Zhang
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Nanshan District, Shenzhen 518055, China
6
He H, He B, Guan L, Zhao Y, Jiang F, Chen G, Zhu Q, Chen CYC, Li T, Yao J. De novo generation of SARS-CoV-2 antibody CDRH3 with a pre-trained generative large language model. Nat Commun 2024; 15:6867. [PMID: 39127753 PMCID: PMC11316817 DOI: 10.1038/s41467-024-50903-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 07/23/2024] [Indexed: 08/12/2024] Open
Abstract
Artificial Intelligence (AI) techniques have made great advances in assisting antibody design. However, antibody design still heavily relies on isolating antigen-specific antibodies from serum, which is a resource-intensive and time-consuming process. To address this issue, we propose a Pre-trained Antibody generative large Language Model (PALM-H3) for the de novo generation of artificial antibodies heavy chain complementarity-determining region 3 (CDRH3) with desired antigen-binding specificity, reducing the reliance on natural antibodies. We also build a high-precision model antigen-antibody binder (A2binder) that pairs antigen epitope sequences with antibody sequences to predict binding specificity and affinity. PALM-H3-generated antibodies exhibit binding ability to SARS-CoV-2 antigens, including the emerging XBB variant, as confirmed through in-silico analysis and in-vitro assays. The in-vitro assays validate that PALM-H3-generated antibodies achieve high binding affinity and potent neutralization capability against spike proteins of SARS-CoV-2 wild-type, Alpha, Delta, and the emerging XBB variant. Meanwhile, A2binder demonstrates exceptional predictive performance on binding specificity for various epitopes and variants. Furthermore, by incorporating the attention mechanism inherent in the Roformer architecture into the PALM-H3 model, we improve its interpretability, providing crucial insights into the fundamental principles of antibody design.
Affiliation(s)
- Haohuai He
  - AI Lab, Tencent, Shenzhen, 518052, China
  - Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Bing He
  - AI Lab, Tencent, Shenzhen, 518052, China
- Lei Guan
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
- Yu Zhao
  - AI Lab, Tencent, Shenzhen, 518052, China
- Feng Jiang
  - AI Lab, Tencent, Shenzhen, 518052, China
- Guanxing Chen
  - Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Qingge Zhu
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
- Calvin Yu-Chian Chen
  - AI for Science (AI4S)-Preferred Program, School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - Department of Medical Research, China Medical University Hospital, Taichung, 40447, Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan
  - Guangdong L-Med Biotechnology Co. Ltd, Meizhou, 514699, Guangdong, China
- Ting Li
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
7
Wang Q, Feng Y, Wang Y, Li B, Wen J, Zhou X, Song Q. AntiFormer: graph enhanced large language model for binding affinity prediction. Brief Bioinform 2024; 25:bbae403. [PMID: 39162312 PMCID: PMC11333967 DOI: 10.1093/bib/bbae403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 07/24/2024] [Accepted: 07/30/2024] [Indexed: 08/21/2024] Open
Abstract
Antibodies play a pivotal role in immune defense and serve as key therapeutic agents. The process of affinity maturation, wherein antibodies evolve through somatic mutations to achieve heightened specificity and affinity to target antigens, is crucial for effective immune response. Despite their significance, assessing antibody-antigen binding affinity remains challenging due to limitations in conventional wet lab techniques. To address this, we introduce AntiFormer, a graph-based large language model designed to predict antibody binding affinity. AntiFormer incorporates sequence information into a graph-based framework, allowing for precise prediction of binding affinity. Through extensive evaluations, AntiFormer demonstrates superior performance compared with existing methods, offering accurate predictions with reduced computational time. Application of AntiFormer to severe acute respiratory syndrome coronavirus 2 patient samples reveals antibodies with strong neutralizing capabilities, providing insights for therapeutic development and vaccination strategies. Furthermore, analysis of individual samples following influenza vaccination elucidates differences in antibody response between young and older adults. AntiFormer identifies specific clonotypes with enhanced binding affinity post-vaccination, particularly in young individuals, suggesting age-related variations in immune response dynamics. Moreover, our findings underscore the importance of large clonotype category in driving affinity maturation and immune modulation. Overall, AntiFormer is a promising approach to accelerate antibody-based diagnostics and therapeutics, bridging the gap between traditional methods and complex antibody maturation processes.
Affiliation(s)
- Qing Wang
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, FL 32611, USA
- Yuzhou Feng
  - Department of Laboratory Medicine and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
  - Shihezi University School of Medicine, Shihezi University, Shihezi 832003, China
- Yanfei Wang
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, FL 32611, USA
- Bo Li
  - Department of Computer and Information Science, University of Macau, Macau SAR, China
- Jianguo Wen
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Qianqian Song
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, FL 32611, USA
8
Ferretti F, Kardar M. Universal characterization of epitope immunodominance from a multiscale model of clonal competition in germinal centers. Phys Rev E 2024; 109:064409. [PMID: 39020898 DOI: 10.1103/physreve.109.064409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 04/02/2024] [Indexed: 07/20/2024]
Abstract
We introduce a multiscale model for affinity maturation, which aims to capture the intraclonal, interclonal, and epitope-specific organization of the B-cell population in a germinal center. We describe the evolution of the B-cell population via a quasispecies dynamics, with species corresponding to unique B-cell receptors (BCRs), where the desired multiscale structure is reflected on the mutational connectivity of the accessible BCR space, and on the statistical properties of its fitness landscape. Within this mathematical framework, we study the competition among classes of BCRs targeting different antigen epitopes, and we construct an effective immunogenic space where epitope immunodominance relations can be universally characterized. We finally study how varying the relative composition of a mixture of antigens with variable and conserved domains allows for a parametric exploration of this space, and we identify general principles for the rational design of two-antigen cocktails.
Affiliation(s)
- Federica Ferretti
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mehran Kardar
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
9
Gallo E. The rise of big data: deep sequencing-driven computational methods are transforming the landscape of synthetic antibody design. J Biomed Sci 2024; 31:29. [PMID: 38491519 PMCID: PMC10943851 DOI: 10.1186/s12929-024-01018-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 03/05/2024] [Indexed: 03/18/2024] Open
Abstract
Synthetic antibodies (Abs) represent a category of artificial proteins capable of closely emulating the functions of natural Abs. Their in vitro production eliminates the need for an immunological response, streamlining the process of Ab discovery, engineering, and development. These artificially engineered Abs offer novel approaches to antigen recognition, paratope site manipulation, and biochemical/biophysical enhancements. As a result, synthetic Abs are fundamentally reshaping conventional methods of Ab production. This mirrors the revolution observed in molecular biology and genomics as a result of deep sequencing, which allows for the swift and cost-effective sequencing of DNA and RNA molecules at scale. Within this framework, deep sequencing has enabled the exploration of whole genomes and transcriptomes, including particular gene segments of interest. Notably, the fusion of synthetic Ab discovery with advanced deep sequencing technologies is redefining the current approaches to Ab design and development. Such combination offers opportunity to exhaustively explore Ab repertoires, fast-tracking the Ab discovery process, and enhancing synthetic Ab engineering. Moreover, advanced computational algorithms have the capacity to effectively mine big data, helping to identify Ab sequence patterns/features hidden within deep sequencing Ab datasets. In this context, these methods can be utilized to predict novel sequence features thereby enabling the successful generation of de novo Ab molecules. Hence, the merging of synthetic Ab design, deep sequencing technologies, and advanced computational models heralds a new chapter in Ab discovery, broadening our comprehension of immunology and streamlining the advancement of biological therapeutics.
Affiliation(s)
- Eugenio Gallo
  - Department of Medicinal Chemistry, Avance Biologicals, 950 Dupont Street, Toronto, ON, M6H 1Z2, Canada
  - Department of Protein Engineering, RevivAb, Av. Ipiranga, 6681, Partenon, Porto Alegre, RS, 90619-900, Brazil
10
Barton J, Gaspariunas A, Galson JD, Leem J. Building Representation Learning Models for Antibody Comprehension. Cold Spring Harb Perspect Biol 2024; 16:a041462. [PMID: 38012013 PMCID: PMC10910360 DOI: 10.1101/cshperspect.a041462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Antibodies are versatile proteins with both the capacity to bind a broad range of targets and a proven track record as some of the most successful therapeutics. However, the development of novel antibody therapeutics is a lengthy and costly process. It is challenging to predict the functional and biophysical properties of antibodies from their amino acid sequence alone, requiring numerous experiments for full characterization. Machine learning, specifically deep representation learning, has emerged as a family of methods that can complement wet lab approaches and accelerate the overall discovery and engineering process. Here, we review advances in antibody sequence representation learning, and how this has improved antibody structure prediction and facilitated antibody optimization. We discuss challenges in the development and implementation of such models, such as the lack of publicly available, well-curated antibody function data and highlight opportunities for improvement. These and future advances in machine learning for antibody sequences have the potential to increase the success rate in developing new therapeutics, resulting in broader access to transformative medicines and improved patient outcomes.
Affiliation(s)
- Justin Barton
  - Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
- Jacob D Galson
  - Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
- Jinwoo Leem
  - Alchemab Therapeutics Ltd, London N1C 4AX, United Kingdom
11
Li L, Gupta E, Spaeth J, Shing L, Jaimes R, Engelhart E, Lopez R, Caceres RS, Bepler T, Walsh ME. Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries. Nat Commun 2023; 14:3454. [PMID: 37308471 DOI: 10.1038/s41467-023-39022-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 12/06/2022] [Accepted: 05/23/2023] [Indexed: 06/14/2023] Open
Abstract
Therapeutic antibodies are an important and rapidly growing drug modality. However, the design and discovery of early-stage antibody therapeutics remain a time and cost-intensive endeavor. Here we present an end-to-end Bayesian, language model-based method for designing large and diverse libraries of high-affinity single-chain variable fragments (scFvs) that are then empirically measured. In a head-to-head comparison with a directed evolution approach, we show that the best scFv generated from our method represents a 28.7-fold improvement in binding over the best scFv from the directed evolution. Additionally, 99% of designed scFvs in our most successful library are improvements over the initial candidate scFv. By comparing a library's predicted success to actual measurements, we demonstrate our method's ability to explore tradeoffs between library success and diversity. Results of our work highlight the significant impact machine learning models can have on scFv development. We expect our method to be broadly applicable and provide value to other protein engineering tasks.
Affiliation(s)
- Lin Li
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- Esther Gupta
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- John Spaeth
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- Leslie Shing
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- Rafael Jaimes
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- Rajmonda S Caceres
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
- Tristan Bepler
  - Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Simons Electron Microscopy Center, New York Structural Biology Center, New York, NY, USA
- Matthew E Walsh
  - Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA
  - Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA