1
He H, He B, Guan L, Zhao Y, Jiang F, Chen G, Zhu Q, Chen CYC, Li T, Yao J. De novo generation of SARS-CoV-2 antibody CDRH3 with a pre-trained generative large language model. Nat Commun 2024; 15:6867. [PMID: 39127753 PMCID: PMC11316817 DOI: 10.1038/s41467-024-50903-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 07/23/2024] [Indexed: 08/12/2024] Open
Abstract
Artificial Intelligence (AI) techniques have made great advances in assisting antibody design. However, antibody design still heavily relies on isolating antigen-specific antibodies from serum, which is a resource-intensive and time-consuming process. To address this issue, we propose a Pre-trained Antibody generative large Language Model (PALM-H3) for the de novo generation of artificial antibodies heavy chain complementarity-determining region 3 (CDRH3) with desired antigen-binding specificity, reducing the reliance on natural antibodies. We also build a high-precision model antigen-antibody binder (A2binder) that pairs antigen epitope sequences with antibody sequences to predict binding specificity and affinity. PALM-H3-generated antibodies exhibit binding ability to SARS-CoV-2 antigens, including the emerging XBB variant, as confirmed through in-silico analysis and in-vitro assays. The in-vitro assays validate that PALM-H3-generated antibodies achieve high binding affinity and potent neutralization capability against spike proteins of SARS-CoV-2 wild-type, Alpha, Delta, and the emerging XBB variant. Meanwhile, A2binder demonstrates exceptional predictive performance on binding specificity for various epitopes and variants. Furthermore, by incorporating the attention mechanism inherent in the Roformer architecture into the PALM-H3 model, we improve its interpretability, providing crucial insights into the fundamental principles of antibody design.
Affiliation(s)
- Haohuai He
  - AI Lab, Tencent, Shenzhen, 518052, China
  - Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Bing He
  - AI Lab, Tencent, Shenzhen, 518052, China
- Lei Guan
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
- Yu Zhao
  - AI Lab, Tencent, Shenzhen, 518052, China
- Feng Jiang
  - AI Lab, Tencent, Shenzhen, 518052, China
- Guanxing Chen
  - Artificial Intelligence Medical Research Center, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
- Qingge Zhu
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
- Calvin Yu-Chian Chen
  - AI for Science (AI4S)-Preferred Program, School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - Department of Medical Research, China Medical University Hospital, Taichung, 40447, Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan
  - Guangdong L-Med Biotechnology Co. Ltd, Meizhou, 514699, Guangdong, China
- Ting Li
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Xi'an, China
2
Xu J, Gong J, Bo X, Tong Y, Ren Z, Ni M. A benchmark for evaluation of structure-based online tools for antibody-antigen binding affinity. Biophys Chem 2024; 311:107253. [PMID: 38768531 DOI: 10.1016/j.bpc.2024.107253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 04/08/2024] [Accepted: 04/28/2024] [Indexed: 05/22/2024]
Abstract
The prediction of binding affinity changes caused by missense mutations can elucidate antigen-antibody interactions. A few accessible structure-based online computational tools have been proposed. However, selecting suitable software for a particular study is challenging, especially for research on the SARS-CoV-2 spike protein and its antibodies. Therefore, benchmarking on mutation-diverse SARS-CoV-2 datasets is critical. Here, we collected datasets comprising 1216 variants with measured changes in antigen binding affinity, drawn from 22 complexes of SARS-CoV-2 S proteins and 22 monoclonal antibodies, and used them to evaluate the performance of seven binding affinity prediction tools. The tested tools' Pearson correlations between predicted and measured changes in binding affinity ranged from -0.158 to 0.657, while accuracy in classifying whether affinity increased or decreased ranged from 0.444 to 0.834. The tools performed relatively better on single mutations, especially at epitope sites, but poorly on mutations that strongly decrease affinity. The tested tools were relatively insensitive to the experimental techniques used to obtain the structures of the complexes. In summary, we constructed a list of datasets and evaluated a range of structure-based online prediction tools that will help explain relevant processes of antigen-antibody interactions and enhance the computational design of therapeutic monoclonal antibodies.
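The two metrics reported in this abstract, Pearson correlation and increase/decrease classification accuracy, can be illustrated with a minimal sketch. The function names, the sign convention (positive versus non-positive change), and the sample values below are assumptions made for illustration only; they are not taken from the benchmark itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def direction_accuracy(predicted, measured):
    """Fraction of variants whose predicted change has the same direction
    (positive vs. non-positive) as the measured change."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum((p > 0) == (m > 0) for p, m in zip(predicted, measured))
    return hits / len(measured)


# Made-up affinity changes for five hypothetical variants (illustrative only).
measured = [1.2, -0.5, 0.3, 2.1, -1.0]
predicted = [0.8, -0.1, -0.2, 1.5, -0.7]

print(f"Pearson r: {pearson(predicted, measured):.3f}")
print(f"Direction accuracy: {direction_accuracy(predicted, measured):.3f}")
```

A tool can score well on one metric and poorly on the other: the correlation rewards getting the relative magnitudes right, while the direction accuracy only checks the sign of each prediction.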
Affiliation(s)
- Jiayi Xu
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Jianting Gong
  - Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
  - Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yigang Tong
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Zilin Ren
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China; Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
- Ming Ni
  - Institute of Health Service and Transfusion Medicine, Beijing 100850, China
3
Velloso JPL, de Sá AGC, Pires DEV, Ascher DB. Engineering G protein-coupled receptors for stabilization. Protein Sci 2024; 33:e5000. [PMID: 38747401 PMCID: PMC11094779 DOI: 10.1002/pro.5000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 03/21/2024] [Accepted: 04/10/2024] [Indexed: 05/19/2024]
Abstract
G protein-coupled receptors (GPCRs) are one of the most important families of targets for drug discovery. One of the limiting steps in the study of GPCRs has been their stability, with significant and time-consuming protein engineering often used to stabilize GPCRs for structural characterization and drug screening. Unfortunately, computational methods developed using globular soluble proteins have translated poorly to the rational engineering of GPCRs. To fill this gap, we propose GPCR-tm, a novel and personalized structurally driven web-based machine learning tool to study the impacts of mutations on GPCR stability. We show that GPCR-tm performs as well as or better than alternative methods, and that it can accurately rank the stability changes of a wide range of mutations occurring in various types of class A GPCRs. GPCR-tm achieved Pearson's correlation coefficients of 0.74 and 0.46 on 10-fold cross-validation and blind test sets, respectively. We observed that the (structural) graph-based signatures were the most important set of features for predicting destabilizing mutations, which points out that these signatures properly describe the changes in the environment where the mutations occur. More specifically, GPCR-tm was able to accurately rank mutations based on their effect on protein stability, guiding their rational stabilization. GPCR-tm is available through a user-friendly web server at https://biosig.lab.uq.edu.au/gpcr_tm/.
Affiliation(s)
- João Paulo L. Velloso
  - School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
- Alex G. C. de Sá
  - School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
- Douglas E. V. Pires
  - School of Computing and Information Systems, The University of Melbourne, Parkville, Victoria, Australia
- David B. Ascher
  - School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
4
Zhang X, Wang H, Sun C. BiSpec Pairwise AI: guiding the selection of bispecific antibody target combinations with pairwise learning and GPT augmentation. J Cancer Res Clin Oncol 2024; 150:237. [PMID: 38713378 PMCID: PMC11076393 DOI: 10.1007/s00432-024-05740-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 04/03/2024] [Indexed: 05/08/2024]
Abstract
PURPOSE: Bispecific antibodies (BsAbs), capable of targeting two antigens simultaneously, represent a significant advancement by employing dual mechanisms of action for tumor suppression. However, how to pair targets to develop effective and safe bispecific drugs is a major challenge for pharmaceutical companies.
METHODS: Using machine learning models, we refined the biological characteristics of currently approved or in clinical development BsAbs and analyzed hundreds of membrane proteins as bispecific targets to predict the likelihood of successful drug development for various target combinations. Moreover, to enhance the interpretability of prediction results in bispecific target combination, we combined machine learning models with Large Language Models (LLMs). Through a Retrieval-Augmented Generation (RAG) approach, we supplement each pair of bispecific targets' machine learning prediction with important features and rationales, generating interpretable analytical reports.
RESULTS: In this study, the XGBoost model with pairwise learning was employed to predict the druggability of BsAbs. By analyzing extensive data on BsAbs and designing features from perspectives such as target activity, safety, cell type specificity, pathway mechanism, and gene embedding representation, our model is able to predict target combinations of BsAbs with high market potential. Specifically, we integrated XGBoost with the GPT model to discuss the efficacy of each bispecific target pair, thereby aiding the decision-making for drug developers.
CONCLUSION: The novelty of this study lies in the integration of machine learning and GPT techniques to provide a novel framework for the design of BsAbs drugs. This holistic approach not only improves prediction accuracy, but also enhances the interpretability and innovativeness of drug design.
Affiliation(s)
- Xin Zhang
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Huiyu Wang
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
- Chunyun Sun
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
5
Lv Y, Gong H, Liu X, Hao J, Xu L, Sun Z, Yu C, Xu L. A dual computational and experimental strategy to enhance TSLP antibody affinity for improved asthma treatment. PLoS Comput Biol 2024; 20:e1011984. [PMID: 38536788 PMCID: PMC10971747 DOI: 10.1371/journal.pcbi.1011984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Accepted: 03/10/2024] [Indexed: 04/05/2024] Open
Abstract
Thymic stromal lymphopoietin is a key cytokine involved in the pathogenesis of asthma and other allergic diseases. Targeting TSLP and its signaling pathways is increasingly recognized as an effective strategy for asthma treatment. This study focused on enhancing the affinity of the T6 antibody, which specifically targets TSLP, by integrating computational and experimental methods. The initial affinity of the T6 antibody for TSLP was lower than the benchmark antibody AMG157. To improve this, we utilized alanine scanning, molecular docking, and computational tools including mCSM-PPI2 and GEO-PPI to identify critical amino acid residues for site-directed mutagenesis. Subsequent mutations and experimental validations resulted in an antibody with significantly enhanced blocking capacity against TSLP. Our findings demonstrate the potential of computer-assisted techniques in expediting antibody affinity maturation, thereby reducing both the time and cost of experiments. The integration of computational methods with experimental approaches holds great promise for the development of targeted therapeutic antibodies for TSLP-related diseases.
Affiliation(s)
- Yitong Lv
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- He Gong
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Xuechao Liu
  - Beijing Sungen Biomedical Technology Co., Ltd, Beijing, China
- Jia Hao
  - Beijing Sungen Biomedical Technology Co., Ltd, Beijing, China
- Lei Xu
  - Beijing Sungen Biomedical Technology Co., Ltd, Beijing, China
- Zhiwei Sun
  - Beijing Sungen Biomedical Technology Co., Ltd, Beijing, China
- Changyuan Yu
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Lida Xu
  - Beijing Sungen Biomedical Technology Co., Ltd, Beijing, China
  - Beijing Hotgen Biotech Co., Ltd, Beijing, China
6
Pan Q, Portelli S, Nguyen TB, Ascher DB. Characterization on the oncogenic effect of the missense mutations of p53 via machine learning. Brief Bioinform 2023; 25:bbad428. [PMID: 38018912 PMCID: PMC10685404 DOI: 10.1093/bib/bbad428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 10/13/2023] [Accepted: 11/05/2023] [Indexed: 11/30/2023] Open
Abstract
Dysfunctions caused by missense mutations in the tumour suppressor p53 have been extensively shown to be a leading driver of many cancers. Unfortunately, it is time-consuming and labour-intensive to experimentally elucidate the effects of all possible missense variants. Recent works presented a comprehensive dataset and machine learning model to predict the functional outcome of mutations in p53. Despite the well-established dataset and precise predictions, this tool was trained on a complicated model with limited predictions on p53 mutations. In this work, we first used computational biophysical tools to investigate the functional consequences of missense mutations in p53, informing a bias of deleterious mutations with destabilizing effects. Combining these insights with experimental assays, we present two interpretable machine learning models leveraging both experimental assays and in silico biophysical measurements to accurately predict the functional consequences on p53 and validate their robustness on clinical data. Our final model based on nine features obtained comparable predictive performance with the state-of-the-art p53 specific method and outperformed other generalized, widely used predictors. Interpreting our models revealed that information on residue p53 activity, polar atom distances and changes in p53 stability were instrumental in the decisions, consistent with a bias of the properties of deleterious mutations. Our predictions have been computed for all possible missense mutations in p53, offering clinical diagnostic utility, which is crucial for patient monitoring and the development of personalized cancer treatment.
Affiliation(s)
- Qisheng Pan
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne Victoria 3004, Australia
- Stephanie Portelli
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne Victoria 3004, Australia
- Thanh Binh Nguyen
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne Victoria 3004, Australia
- David B Ascher
  - School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne Victoria 3004, Australia
7
Li J, Kang G, Wang J, Yuan H, Wu Y, Meng S, Wang P, Zhang M, Wang Y, Feng Y, Huang H, de Marco A. Affinity maturation of antibody fragments: A review encompassing the development from random approaches to computational rational optimization. Int J Biol Macromol 2023; 247:125733. [PMID: 37423452 DOI: 10.1016/j.ijbiomac.2023.125733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 07/04/2023] [Accepted: 07/06/2023] [Indexed: 07/11/2023]
Abstract
Routinely screened antibody fragments usually require further in vitro maturation to achieve the desired biophysical properties. Blind in vitro strategies can produce improved ligands by introducing random mutations into the original sequences and selecting the resulting clones under increasingly stringent conditions. Rational approaches take an alternative perspective that aims first at identifying the specific residues potentially involved in the control of biophysical properties, such as affinity or stability, and then at evaluating which mutations could improve those characteristics. Understanding the antigen-antibody interactions is instrumental to this process, whose reliability consequently depends strongly on the quality and completeness of the structural information. Recently, methods based on deep learning have critically improved the speed and accuracy of model building and are promising tools for accelerating the docking step. Here, we review the features of the available bioinformatic instruments and analyze the reports illustrating the results obtained by applying them to optimize antibody fragments, and nanobodies in particular. Finally, the emerging trends and open questions are summarized.
Affiliation(s)
- Jiaqi Li
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Guangbo Kang
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Jiewen Wang
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Haibin Yuan
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Yili Wu
  - Zhejiang Provincial Clinical Research Center for Mental Disorders, School of Mental Health and the Affiliated Kangning Hospital, Institute of Aging, Key Laboratory of Alzheimer's Disease of Zhejiang Province, Wenzhou Medical University, Oujiang Laboratory, Wenzhou, Zhejiang 325035, China
- Shuxian Meng
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China
- Ping Wang
  - New Technology R&D Department, Tianjin Modern Innovative TCM Technology Company Limited, Tianjin 300392, China
- Miao Zhang
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; China Resources Biopharmaceutical Company Limited, Beijing 100029, China
- Yuli Wang
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Tianjin Pharmaceutical Da Ren Tang Group Corporation Limited, Traditional Chinese Pharmacy Research Institute, Tianjin Key Laboratory of Quality Control in Chinese Medicine, Tianjin 300457, China; State Key Laboratory of Drug Delivery Technology and Pharmacokinetics, Tianjin Institute of Pharmaceutical Research, Tianjin 300193, China
- Yuanhang Feng
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China
- He Huang
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China; Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
- Ario de Marco
  - Laboratory for Environmental and Life Sciences, University of Nova Gorica, Nova Gorica, Slovenia
8
Guarra F, Colombo G. Computational Methods in Immunology and Vaccinology: Design and Development of Antibodies and Immunogens. J Chem Theory Comput 2023; 19:5315-5333. [PMID: 37527403 PMCID: PMC10448727 DOI: 10.1021/acs.jctc.3c00513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Indexed: 08/03/2023]
Abstract
The design of new biomolecules able to harness immune mechanisms for the treatment of diseases is a prime challenge for computational and simulative approaches. For instance, in recent years, antibodies have emerged as an important class of therapeutics against a spectrum of pathologies. In cancer, immune-inspired approaches are witnessing a surge thanks to a better understanding of tumor-associated antigens and the mechanisms of their engagement or evasion from the human immune system. Here, we provide a summary of the main state-of-the-art computational approaches that are used to design antibodies and antigens, and in parallel, we review key methodologies for epitope identification for both B- and T-cell mediated responses. A special focus is devoted to the description of structure- and physics-based models, privileged over purely sequence-based approaches. We discuss the implications of novel methods in engineering biomolecules with tailored immunological properties for possible therapeutic uses. Finally, we highlight the extraordinary challenges and opportunities presented by the possible integration of structure- and physics-based methods with emerging Artificial Intelligence technologies for the prediction and design of novel antigens, epitopes, and antibodies.
Affiliation(s)
- Federica Guarra
  - Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
- Giorgio Colombo
  - Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
9
Sheng Z, Bimela JS, Wang M, Li Z, Guo Y, Ho DD. An optimized thermodynamics integration protocol for identifying beneficial mutations in antibody design. Front Immunol 2023; 14:1190416. [PMID: 37275896 PMCID: PMC10235760 DOI: 10.3389/fimmu.2023.1190416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 04/28/2023] [Indexed: 06/07/2023] Open
Abstract
Accurate identification of beneficial mutations is central to antibody design. Many knowledge-based (KB) computational approaches have been developed to predict beneficial mutations, but their accuracy leaves room for improvement. Thermodynamic integration (TI) is an alchemical free energy algorithm that offers an alternative technique for identifying beneficial mutations, but its performance has not been evaluated. In this study, we developed an efficient TI protocol with high accuracy for predicting binding free energy changes of antibody mutations. The improved TI method outperforms KB methods at identifying both beneficial and deleterious mutations. We observed that KB methods have higher accuracies in predicting deleterious mutations than beneficial mutations. A pipeline using KB methods to efficiently exclude deleterious mutations and TI to accurately identify beneficial mutations was developed for high-throughput mutation scanning. The pipeline was applied to optimize the binding affinity of a broadly sarbecovirus neutralizing antibody 10-40 against the circulating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) omicron variant. Three identified beneficial mutations show strong synergy and improve both binding affinity and neutralization potency of antibody 10-40. Molecular dynamics simulation revealed that the three mutations improve the binding affinity of antibody 10-40 through the stabilization of an altered binding mode with increased polar and hydrophobic interactions. Above all, this study presents an accurate and efficient TI-based approach for optimizing antibodies and other biomolecules.
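The screening pipeline described in this abstract (a fast knowledge-based filter that discards likely deleterious mutations, followed by a slower, more accurate method applied only to the survivors) can be sketched generically as below. This is an illustrative outline, not the authors' protocol: the function name, the cutoffs, the convention that a negative score means improved binding, and the mutation labels and scores are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def two_stage_scan(
    mutations: Iterable[str],
    fast_score: Callable[[str], float],
    slow_score: Callable[[str], float],
    fast_cutoff: float = 0.5,
    slow_cutoff: float = -0.5,
) -> List[Tuple[str, float]]:
    """Return mutations that pass both stages, best (most negative) first.

    Assumed convention: scores are predicted binding free energy changes,
    where negative values mean tighter binding.
    """
    # Stage 1: the cheap predictor removes mutations it flags as deleterious.
    survivors = [m for m in mutations if fast_score(m) <= fast_cutoff]
    # Stage 2: the expensive predictor is run only on the survivors.
    rescored = [(m, slow_score(m)) for m in survivors]
    beneficial = [(m, s) for m, s in rescored if s <= slow_cutoff]
    return sorted(beneficial, key=lambda pair: pair[1])


# Hypothetical lookup tables standing in for the two predictors.
fast_table = {"mutA": -0.2, "mutB": 1.8, "mutC": 0.1, "mutD": 0.4}
slow_table = {"mutA": -1.1, "mutC": 0.2, "mutD": -0.6}  # mutB is never rescored

hits = two_stage_scan(fast_table, fast_table.__getitem__, slow_table.__getitem__)
print(hits)
```

The point of the ordering is cost: the expensive calculation is only paid for candidates the cheap filter could not rule out, which is what makes high-throughput scanning practical.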
Affiliation(s)
- Zizhang Sheng
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Jude S. Bimela
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States
- Maple Wang
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Zhiteng Li
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Yicheng Guo
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- David D. Ho
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
10
Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023; 21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024] Open
Abstract
Therapeutic protein, represented by antibodies, is of increasing interest in human medicine. However, clinical translation of therapeutic protein is still largely hindered by different aspects of developability, including affinity and selectivity, stability and aggregation prevention, solubility and viscosity reduction, and deimmunization. Conventional optimization of the developability with widely used methods, like display technologies and library screening approaches, is a time and cost-intensive endeavor, and the efficiency in finding suitable solutions is still not enough to meet clinical needs. In recent years, the accelerated advancement of computational methodologies has ushered in a transformative era in the field of therapeutic protein design. Owing to their remarkable capabilities in feature extraction and modeling, the integration of cutting-edge computational strategies with conventional techniques presents a promising avenue to accelerate the progression of therapeutic protein design and optimization toward clinical implementation. Here, we compared the differences between therapeutic protein and small molecules in developability and provided an overview of the computational approaches applicable to the design or optimization of therapeutic protein in several developability issues.
Affiliation(s)
- Zhidong Chen
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xinpei Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xu Chen
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Juyang Huang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Chenglin Wang
  - Shenzhen Qiyu Biotechnology Co., Ltd, Shenzhen 518107, China
- Junqing Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Zhe Wang
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
11
Ascher DB, Kaminskas LM, Myung Y, Pires DEV. Using Graph-Based Signatures to Guide Rational Antibody Engineering. Methods Mol Biol 2023; 2552:375-397. [PMID: 36346604 DOI: 10.1007/978-1-0716-2609-2_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Antibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.
Affiliation(s)
- David B Ascher
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Department of Biochemistry, Cambridge University, Cambridge, UK
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
- Lisa M Kaminskas
  - School of Biological Sciences, University of Queensland, St Lucia, QLD, Australia
- Yoochan Myung
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
- Douglas E V Pires
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
|
12
|
Boer JC, Pan Q, Holien JK, Nguyen TB, Ascher DB, Plebanski M. A bias of Asparagine to Lysine mutations in SARS-CoV-2 outside the receptor binding domain affects protein flexibility. Front Immunol 2022; 13:954435. [PMID: 36569921 PMCID: PMC9788125 DOI: 10.3389/fimmu.2022.954435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 11/14/2022] [Indexed: 12/14/2022] Open
Abstract
Introduction: The COVID-19 pandemic has been threatening public health and economic development worldwide for over two years. Compared with the original SARS-CoV-2 strain reported in 2019, the Omicron variant (B.1.1.529.1) is more transmissible. This variant has 34 mutations in its Spike protein, 15 of which are present in the Receptor Binding Domain (RBD), facilitating viral internalization via binding to the angiotensin-converting enzyme 2 (ACE2) receptor on endothelial cells as well as promoting increased immune evasion capacity. Methods: Herein we compared SARS-CoV-2 proteins (including ORF3a, ORF7, ORF8, Nucleoprotein (N), membrane protein (M) and Spike (S) proteins) from multiple ancestral strains. We included the currently designated original Variant of Concern (VOC) Omicron, its subsequently emerged variants BA.1, BA.2, BA.3, BA.4 and BA.5, and the two currently emerging variants BQ.1 and XBB.1, and compared these with the previously circulating VOCs Alpha, Beta, Gamma, and Delta, to better understand the nature and potential impact of Omicron-specific mutations. Results: Only in Omicron and its subvariants was a bias toward an Asparagine to Lysine (N to K) mutation evident within the Spike protein, including regions outside the RBD, while none of the regions outside the Spike protein were characterized by this mutational bias. Computational structural analysis revealed that three of these specific mutations, located in the central core region, contribute to a preference for the alteration of conformations of the Spike protein. Several mutations in the RBD which have circulated across most Omicron subvariants were also analysed, and these showed more potential for immune escape. Conclusion: This study emphasizes the importance of understanding how specific N to K mutations outside of the RBD region affect SARS-CoV-2 conformational changes and the need for neutralizing antibodies for Omicron to target a subset of conformationally dependent B cell epitopes.
Affiliation(s)
- Jennifer C. Boer
  - School of Health and Biomedical Science, Royal Melbourne Institute of Technology, Melbourne, VIC, Australia
- Qisheng Pan
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Jessica K. Holien
  - School of Science, Royal Melbourne Institute of Technology (RMIT) University, Melbourne, VIC, Australia
- Thanh-Binh Nguyen
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- David B. Ascher
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Magdalena Plebanski
  - School of Health and Biomedical Science, Royal Melbourne Institute of Technology, Melbourne, VIC, Australia
  - Correspondence: Magdalena Plebanski
|
13
|
Zhou Y, Al‐Jarf R, Alavi A, Nguyen TB, Rodrigues CHM, Pires DEV, Ascher DB. kinCSM: Using graph-based signatures to predict small molecule CDK2 inhibitors. Protein Sci 2022; 31:e4453. [PMID: 36305769 PMCID: PMC9597374 DOI: 10.1002/pro.4453] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/31/2022] [Revised: 09/14/2022] [Accepted: 09/15/2022] [Indexed: 11/20/2022]
Abstract
Protein phosphorylation acts as an essential on/off switch in many cellular signaling pathways. This has led to ongoing interest in targeting kinases for therapeutic intervention. Computer-aided drug discovery has proven a useful and cost-effective approach for facilitating prioritization and enrichment of screening libraries, but limited effort has been devoted to providing insights into what makes a potent kinase inhibitor. To fill this gap, here we developed kinCSM, an integrative computational tool capable of accurately identifying potent cyclin-dependent kinase 2 (CDK2) inhibitors, quantitatively predicting CDK2 ligand-kinase inhibition constants (pKi) and classifying different types of inhibitors based on their favorable binding modes. kinCSM predictive models were built using supervised learning and leveraged the concept of graph-based signatures to capture both physicochemical and geometric properties of small molecules. CDK2 inhibitors were accurately identified with Matthews correlation coefficients (MCC) of up to 0.74, and inhibition constants were predicted with Pearson's correlations of up to 0.76, with consistent performances of 0.66 and 0.68, respectively, on a nonredundant blind test. kinCSM was also able to identify the potential type of inhibition for a given molecule, achieving MCC of up to 0.80 on cross-validation and 0.73 on the blind test. Analysis of molecular composition revealed enriched chemical fragments in CDK2 inhibitors and in different types of inhibitors, which provides insights into the molecular mechanisms behind ligand-kinase interactions. kinCSM will be an invaluable tool to guide future kinase drug discovery. To aid the fast and accurate screening of CDK2 inhibitors, kinCSM is freely available at https://biosig.lab.uq.edu.au/kin_csm/.
Affiliation(s)
- Yunzhuo Zhou
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Raghad Al‐Jarf
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Azadeh Alavi
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Thanh Binh Nguyen
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Carlos H. M. Rodrigues
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Douglas E. V. Pires
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B. Ascher
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
|
14
|
Iftkhar S, de Sá AGC, Velloso JPL, Aljarf R, Pires DEV, Ascher DB. cardioToxCSM: A Web Server for Predicting Cardiotoxicity of Small Molecules. J Chem Inf Model 2022; 62:4827-4836. [PMID: 36219164 DOI: 10.1021/acs.jcim.2c00822] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/28/2022]
Abstract
The design of novel, safe, and effective drugs to treat human diseases is a challenging venture, with toxicity being one of the main sources of attrition at later stages of development. Failure due to toxicity incurs a significant increase in costs and time to market, with multiple drugs being withdrawn from the market due to their adverse effects. Cardiotoxicity, for instance, was responsible for the failure of drugs such as fenspiride, propoxyphene, and valdecoxib. While significant effort has been dedicated to mitigating this issue by developing computational approaches that aim to identify molecules likely to be toxic, including quantitative structure-activity relationship models and machine learning methods, current approaches present limited performance and interpretability. To overcome these limitations, we propose a new web-based computational method, cardioToxCSM, which can efficiently and accurately predict six types of cardiac toxicity outcomes: arrhythmia, cardiac failure, heart block, hERG toxicity, hypertension, and myocardial infarction. cardioToxCSM was developed using the concept of graph-based signatures, molecular descriptors, toxicophore matchings, and molecular fingerprints, leveraging explainable machine learning, and was validated internally via different cross-validation schemes and externally via low-redundancy blind sets. The models presented robust performances with areas under ROC curves of up to 0.898 on 5-fold cross-validation, consistent with metrics on blind tests. Additionally, our models provide interpretation of the predictions by identifying whether substructures that are commonly enriched in toxic compounds were present. We believe cardioToxCSM will provide valuable insight into the potential cardiotoxicity of small molecules early in drug screening efforts. The method is made freely available as a web server at https://biosig.lab.uq.edu.au/cardiotoxcsm.
Affiliation(s)
- Saba Iftkhar
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
- Alex G C de Sá
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
  - Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
- João P L Velloso
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
- Raghad Aljarf
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
  - Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
- Douglas E V Pires
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
- David B Ascher
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
  - Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
|
15
|
Wilman W, Wróbel S, Bielska W, Deszynski P, Dudzic P, Jaszczyszyn I, Kaniewski J, Młokosiewicz J, Rouyan A, Satława T, Kumar S, Greiff V, Krawczyk K. Machine-designed biotherapeutics: opportunities, feasibility and advantages of deep learning in computational antibody discovery. Brief Bioinform 2022; 23:bbac267. [PMID: 35830864 PMCID: PMC9294429 DOI: 10.1093/bib/bbac267] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 03/14/2022] [Revised: 05/09/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Antibodies are versatile molecular binders with an established and growing role as therapeutics. Computational approaches to developing and designing these molecules are being increasingly used to complement traditional lab-based processes. Nowadays, in silico methods fill multiple elements of the discovery stage, such as characterizing antibody-antigen interactions and identifying developability liabilities. Recently, computational methods tackling such problems have begun to follow machine learning paradigms, in many cases deep learning specifically. This paradigm shift offers improvements in established areas such as structure or binding prediction and opens up new possibilities such as language-based modeling of antibody repertoires or machine-learning-based generation of novel sequences. In this review, we critically examine the recent developments in (deep) machine learning approaches to therapeutic antibody design with implications for fully computational antibody design.
|
16
|
An effective strategy for the humanization of antibody fragments under an accelerated timeline. Int J Biol Macromol 2022; 216:465-474. [PMID: 35803408 DOI: 10.1016/j.ijbiomac.2022.06.195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 06/23/2022] [Accepted: 06/24/2022] [Indexed: 11/22/2022]
Abstract
The use of monoclonal antibodies (mAbs) in therapy is steadily advancing, with ongoing discussion of their safety, profitability and effectiveness. To date, around a hundred mAbs have been approved by the FDA for the treatment of various diseases. For large-scale production, recombinant DNA technology is mainly employed, and antibodies can be expressed in various eukaryotic and prokaryotic systems. Moreover, considering their heterologous origin and potential immunogenicity, various strategies have been developed for mAb humanization, and around 50% of commercial mAbs are humanized. Here, we introduce LimAb7, a mouse mAb capable of binding and neutralizing the dermonecrotic toxins of the brown spider Loxosceles intermedia in vivo and in vitro. This antibody has been produced in mouse and humanized scFv and diabody formats; however, results indicated losses in antigen-binding affinity, stability, and neutralizing ability. Aiming to develop evolved, stable, and neutralizing antibody fragments, we report for the first time the design of humanized antibody V-domains, produced as Fab fragments, against spider venom toxins. Improvements in the constructs were observed regarding their physicochemical stability, target binding and binding pattern maintenance. Although their neutralizing features remain to be characterized, we believe these data shed new light on antibody humanization by producing a parental molecule in different recombinant formats.
|
17
|
Hummer AM, Abanades B, Deane CM. Advances in computational structure-based antibody design. Curr Opin Struct Biol 2022; 74:102379. [PMID: 35490649 DOI: 10.1016/j.sbi.2022.102379] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 12/13/2021] [Revised: 02/28/2022] [Accepted: 03/17/2022] [Indexed: 12/12/2022]
Abstract
Antibodies are currently the most important class of biotherapeutics and are used to treat numerous diseases. Recent advances in computational methods are ushering in a new era of antibody design, driven in part by accurate structure prediction. Previously, structure-based antibody design has been limited to a relatively small number of cases where accurate structures or models of both the target antigen and antibody were available. As we move towards a time where it is possible to accurately model most antibodies and antigens, and to reliably predict their binding site, there is vast potential for true computational antibody design. In this review, we describe the latest methods that promise to launch a paradigm shift towards entirely in silico structure-based antibody design.
Affiliation(s)
- Alissa M Hummer
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK. https://twitter.com/@AlissaHummer
- Brennan Abanades
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK. https://twitter.com/@brennanaba
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
|
18
|
Pires DEV, Stubbs KA, Mylne JS, Ascher DB. cropCSM: designing safe and potent herbicides with graph-based signatures. Brief Bioinform 2022; 23:bbac042. [PMID: 35211724 PMCID: PMC9155605 DOI: 10.1093/bib/bbac042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/09/2021] [Revised: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 12/11/2022] Open
Abstract
Herbicides have revolutionised weed management, increased crop yields and improved profitability allowing for an increase in worldwide food security. Their widespread use, however, has also led to a rise in resistance and concerns about their environmental impact. Despite the need for potent and safe herbicidal molecules, no herbicide with a new mode of action has reached the market in 30 years. Although development of computational approaches has proven invaluable to guide rational drug discovery pipelines, leading to higher hit rates and lower attrition due to poor toxicity, little has been done in contrast for herbicide design. To fill this gap, we have developed cropCSM, a computational platform to help identify new, potent, nontoxic and environmentally safe herbicides. By using a knowledge-based approach, we identified physicochemical properties and substructures enriched in safe herbicides. By representing the small molecules as a graph, we leveraged these insights to guide the development of predictive models trained and tested on the largest collected data set of molecules with experimentally characterised herbicidal profiles to date (over 4500 compounds). In addition, we developed six new environmental and human toxicity predictors, spanning five different species to assist in molecule prioritisation. cropCSM was able to correctly identify 97% of herbicides currently available commercially, while predicting toxicity profiles with accuracies of up to 92%. We believe cropCSM will be an essential tool for the enrichment of screening libraries and to guide the development of potent and safe herbicides. We have made the method freely available through a user-friendly webserver at http://biosig.unimelb.edu.au/crop_csm.
Affiliation(s)
- Douglas E V Pires
  - School of Computing and Information Systems at the University of Melbourne
- Keith A Stubbs
  - School of Molecular Sciences at the University of Western Australia
- Joshua S Mylne
  - Curtin University and Deputy Director of the Centre for Crop and Disease Management
- David B Ascher
  - University of Queensland, and head of Computational Biology and Clinical Informatics at the Baker Institute and Systems
|
19
|
Myung Y, Pires DEV, Ascher DB. CSM-AB: graph-based antibody-antigen binding affinity prediction and docking scoring function. Bioinformatics 2022; 38:1141-1143. [PMID: 34734992 DOI: 10.1093/bioinformatics/btab762] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/03/2021] [Revised: 10/18/2021] [Accepted: 11/01/2021] [Indexed: 02/03/2023] Open
Abstract
Motivation: Understanding antibody-antigen interactions is key to improving their binding affinities and specificities. While experimental approaches are fundamental for developing new therapeutics, computational methods can provide quick assessment of binding landscapes, guiding experimental design. Despite this, little effort has been devoted to accurately predicting the binding affinity between antibodies and antigens and to developing tailored docking scoring functions for this type of interaction. Here, we developed CSM-AB, a machine learning method capable of predicting antibody-antigen binding affinity by modelling interaction interfaces as graph-based signatures. Results: CSM-AB outperformed alternative methods, achieving a Pearson's correlation of up to 0.64 on blind tests. We also show CSM-AB can accurately rank near-native poses, working effectively as a docking scoring function. We believe CSM-AB will be an invaluable tool to assist in the development of new immunotherapies. Availability and implementation: CSM-AB is freely available as a user-friendly web interface and API at http://biosig.unimelb.edu.au/csm_ab/datasets. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yoochan Myung
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Douglas E V Pires
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- David B Ascher
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
|
20
|
Akbar R, Bashour H, Rawat P, Robert PA, Smorodina E, Cotet TS, Flem-Karlsen K, Frank R, Mehta BB, Vu MH, Zengin T, Gutierrez-Marcos J, Lund-Johansen F, Andersen JT, Greiff V. Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies. MAbs 2022; 14:2008790. [PMID: 35293269 PMCID: PMC8928824 DOI: 10.1080/19420862.2021.2008790] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Received: 08/22/2021] [Revised: 11/04/2021] [Accepted: 11/17/2021] [Indexed: 12/15/2022] Open
Abstract
Although the therapeutic efficacy and commercial success of monoclonal antibodies (mAbs) are tremendous, the design and discovery of new candidates remain a time- and cost-intensive endeavor. In this regard, progress in the generation of data describing antigen binding and developability, in computational methodology, and in artificial intelligence may pave the way for a new era of in silico on-demand immunotherapeutics design and discovery. Here, we argue that the main necessary machine learning (ML) components for an in silico mAb sequence generator are: understanding of the rules of mAb-antigen binding, capacity to modularly combine mAb design parameters, and algorithms for unconstrained parameter-driven in silico mAb sequence synthesis. We review the current progress toward the realization of these necessary components and discuss the challenges that must be overcome to allow the on-demand ML-based discovery and design of fit-for-purpose mAb therapeutic candidates.
Affiliation(s)
- Rahmad Akbar
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Habib Bashour
  - School of Life Sciences, University of Warwick, Coventry, UK
- Puneet Rawat
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Philippe A. Robert
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Eva Smorodina
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia
- Karine Flem-Karlsen
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
- Robert Frank
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Brij Bhushan Mehta
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mai Ha Vu
  - Department of Linguistics and Scandinavian Studies, University of Oslo, Norway
- Talip Zengin
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Department of Bioinformatics, Mugla Sitki Kocman University, Turkey
- Jan Terje Andersen
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
- Victor Greiff
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
|
21
|
Nguyen TB, Pires DEV, Ascher DB. CSM-carbohydrate: protein-carbohydrate binding affinity prediction and docking scoring function. Brief Bioinform 2021; 23:6457169. [PMID: 34882232 DOI: 10.1093/bib/bbab512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/23/2021] [Revised: 11/06/2021] [Accepted: 11/08/2021] [Indexed: 12/29/2022] Open
Abstract
Protein-carbohydrate interactions are crucial for many cellular processes but can be challenging to biologically characterise. To improve our understanding and ability to model these molecular interactions, we used a carefully curated set of 370 protein-carbohydrate complexes with experimental structural and biophysical data to train and validate a new tool, cutoff scanning matrix (CSM)-carbohydrate, which uses machine learning algorithms to accurately predict their binding affinity and to rank docking poses as a scoring function. Information on both protein and carbohydrate complementarity, in terms of shape and chemistry, was captured using graph-based structural signatures. Across both training and independent test sets, we achieved comparable Pearson's correlations of 0.72 under cross-validation [root mean square error (RMSE) of 1.58 kcal/mol] and 0.67 on the independent test (RMSE of 1.72 kcal/mol), providing confidence in the generalisability and robustness of the final model. Similar performance was obtained across mono-, di- and oligosaccharides, further highlighting the applicability of this approach to the study of larger complexes. We show that CSM-carbohydrate significantly outperformed previous approaches, and we have implemented our method and made all data freely available through both a user-friendly web interface and an application programming interface, to facilitate programmatic access, at http://biosig.unimelb.edu.au/csm_carbohydrate/. We believe CSM-carbohydrate will be an invaluable tool for helping assess docking poses and the effects of mutations on protein-carbohydrate affinity, unravelling important aspects that drive binding recognition.
Affiliation(s)
- Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia; Department of Biochemistry, University of Cambridge, Cambridge, UK
|
22
|
Uthayopas K, de Sá AGC, Alavi A, Pires DEV, Ascher DB. TSMDA: Target and symptom-based computational model for miRNA-disease-association prediction. Mol Ther Nucleic Acids 2021; 26:536-546. [PMID: 34631283 PMCID: PMC8479276 DOI: 10.1016/j.omtn.2021.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/14/2021] [Accepted: 08/19/2021] [Indexed: 02/06/2023]
Abstract
The emergence of high-throughput sequencing techniques has revealed a primary role of microRNAs (miRNAs) in a wide range of diseases, including cancers and neurodegenerative disorders. Understanding novel relationships between miRNAs and diseases can potentially unveil complex pathogenesis mechanisms, leading to effective diagnosis and treatment. The investigation of novel miRNA-disease associations, however, is currently costly and time consuming. Over the years, several computational models have been proposed to prioritize potential miRNA-disease associations, but with limited usability or predictive capability. In order to fill this gap, we introduce TSMDA, a novel machine-learning method that leverages target and symptom information and negative sample selection to predict miRNA-disease association. TSMDA significantly outperforms similar methods, achieving an area under the receiver operating characteristic (ROC) curve (AUC) of 0.989 and 0.982 under 5-fold cross-validation and blind test, respectively. We also demonstrate the capability of the method to uncover potential miRNA-disease associations in breast, prostate, and lung cancers, as case studies. We believe TSMDA will be an invaluable tool for the community to explore and prioritize potentially new miRNA-disease associations for further experimental characterization. The method was made available as a freely accessible and user-friendly web interface at http://biosig.unimelb.edu.au/tsmda/.
Affiliation(s)
- Korawich Uthayopas
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia
- Alex G C de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia; Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, VIC, Australia
- Azadeh Alavi
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia
- Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia; School of Computing and Information Systems, University of Melbourne, Parkville 3052, VIC, Australia
- David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, VIC, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, VIC, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, VIC, Australia; Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, VIC, Australia; Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
|
23
|
Nguyen TB, Myung Y, de Sá AGC, Pires DEV, Ascher DB. mmCSM-NA: accurately predicting effects of single and multiple mutations on protein-nucleic acid binding affinity. NAR Genom Bioinform 2021; 3:lqab109. [PMID: 34805992 PMCID: PMC8600011 DOI: 10.1093/nargab/lqab109] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/11/2021] [Revised: 09/20/2021] [Accepted: 10/27/2021] [Indexed: 02/02/2023] Open
Abstract
While protein-nucleic acid interactions are pivotal for many crucial biological processes, limited experimental data has made the development of computational approaches to characterise these interactions a challenge. Consequently, most approaches to understand the effects of missense mutations on protein-nucleic acid affinity have focused on single-point mutations and have shown limited performance on independent data sets. To overcome this, we have curated the largest dataset of experimentally measured effects of mutations on nucleic acid binding affinity to date, encompassing 856 single-point mutations and 141 multiple-point mutations across 155 experimentally solved complexes. This was used in combination with an optimized version of our graph-based signatures to develop mmCSM-NA (http://biosig.unimelb.edu.au/mmcsm_na), the first scalable method capable of quantitatively and accurately predicting the effects of multiple-point mutations on nucleic acid binding affinities. mmCSM-NA obtained a Pearson's correlation of up to 0.67 (RMSE of 1.06 kcal/mol) on single-point mutations under cross-validation, and up to 0.65 on independent non-redundant datasets of multiple-point mutations (RMSE of 1.12 kcal/mol), outperforming similar tools. mmCSM-NA is freely available as an easy-to-use web server and API. We believe it will be an invaluable tool to shed light on the role of mutations affecting protein-nucleic acid interactions in diseases.
Affiliation(s)
- Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Yoochan Myung
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Alex G C de Sá
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
|
24
|
Rodrigues CHM, Pires DEV, Ascher DB. pdCSM-PPI: Using Graph-Based Signatures to Identify Protein-Protein Interaction Inhibitors. J Chem Inf Model 2021; 61:5438-5445. [PMID: 34719929 DOI: 10.1021/acs.jcim.1c01135] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions are promising sites for development of selective drugs; however, they have generally been viewed as challenging targets. Molecules targeting protein-protein interactions tend to be larger and more lipophilic than other drug-like molecules, mimicking the properties of interacting interfaces. Here, we propose a machine learning approach that uses a graph-based representation of small molecules to guide identification of inhibitors modulating protein-protein interactions, pdCSM-PPI. This approach was applied to 21 different PPI targets. We developed interaction-specific models that were able to accurately identify active compounds achieving MCC and F1 scores up to 1, and Pearson's correlations up to 0.87, outperforming previous approaches. Using insights from these individual models, we developed a generic protein-protein interaction modulator predictive model, which accurately predicted IC50 with a Pearson's correlation of 0.64 on a low redundancy blind test. Importantly, we were able to accurately identify active from inactive compounds, achieving an AUC of 0.77 and sensitivity and specificity of 76% and 78%, respectively. We believe pdCSM-PPI will be an important tool to help guide more efficient screening of new PPI inhibitors; it is freely available as an easy-to-use web server and API at http://biosig.unimelb.edu.au/pdcsm_ppi.
Affiliation(s)
- Carlos H M Rodrigues
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
- Douglas E V Pires
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia; School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
- David B Ascher
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
|
25
|
da Silva BM, Myung Y, Ascher DB, Pires DEV. epitope3D: a machine learning method for conformational B-cell epitope prediction. Brief Bioinform 2021; 23:6407730. [PMID: 34676398 DOI: 10.1093/bib/bbab423] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 06/07/2021] [Revised: 08/25/2021] [Accepted: 09/14/2021] [Indexed: 11/13/2022] Open
Abstract
The ability to identify antigenic determinants of pathogens, or epitopes, is fundamental to guide rational vaccine development and immunotherapies, which are particularly relevant for rapid pandemic response. A range of computational tools has been developed over the past two decades to assist in epitope prediction; however, they have presented limited performance and generalization, particularly for the identification of conformational B-cell epitopes. Here, we present epitope3D, a novel scalable machine learning method capable of accurately identifying conformational epitopes, trained and evaluated on the largest curated epitope data set to date. Our method uses the concept of graph-based signatures to model epitope and non-epitope regions as graphs and extract distance patterns that are used as evidence to train and test predictive models. We show epitope3D outperforms available alternative approaches, achieving Matthews correlation coefficient and F1 scores of 0.55 and 0.57 on cross-validation and 0.45 and 0.36 during independent blind tests, respectively.
Affiliation(s)
- Bruna Moreira da Silva
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- YooChan Myung
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia; Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
- Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
|
26
|
Computational and Rational Design of Single-Chain Antibody against Tick-Borne Encephalitis Virus for Modifying Its Specificity. Viruses 2021; 13:v13081494. [PMID: 34452359 PMCID: PMC8402911 DOI: 10.3390/v13081494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Revised: 06/09/2021] [Accepted: 06/23/2021] [Indexed: 12/27/2022] Open
Abstract
Tick-borne encephalitis virus (TBEV) causes 5−7 thousand cases of human meningitis and encephalitis annually. The neutralizing and protective antibody ch14D5 is a potential therapeutic agent. This antibody exhibits a high affinity for binding with the D3 domain of the glycoprotein E of the Far Eastern subtype of the virus, but a lower affinity for the D3 domains of the Siberian and European subtypes. In this study, a 2.2-fold increase in the affinity of single-chain antibody sc14D5 to D3 proteins of the Siberian and European subtypes of the virus was achieved using rational design and computational modeling. This improvement can be further enhanced in the case of the bivalent binding of the full-length chimeric antibody containing the identified mutation.
|
27
|
Rodrigues CHM, Pires DEV, Ascher DB. mmCSM-PPI: predicting the effects of multiple point mutations on protein-protein interactions. Nucleic Acids Res 2021; 49:W417-W424. [PMID: 33893812 PMCID: PMC8262703 DOI: 10.1093/nar/gkab273] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 01/27/2021] [Revised: 03/18/2021] [Accepted: 04/15/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions play a crucial role in all cellular functions and biological processes, and mutations leading to their disruption are enriched in many diseases. While a number of computational methods to assess the effects of variants on protein-protein binding affinity have been proposed, they are in general limited to the analysis of single point mutations and have been shown to perform poorly on independent test sets. Here, we present mmCSM-PPI, a scalable and effective machine learning model for accurately assessing changes in protein-protein binding affinity caused by single and multiple missense mutations. We expanded our well-established graph-based signatures in order to capture physicochemical and geometrical properties of multiple wild-type residue environments and integrated them with substitution scores and dynamics terms from normal mode analysis. mmCSM-PPI was able to achieve a Pearson's correlation of up to 0.75 (RMSE = 1.64 kcal/mol) under 10-fold cross-validation and 0.70 (RMSE = 2.06 kcal/mol) on a non-redundant blind test, outperforming existing methods. Our method is freely available as a user-friendly web server and API at http://biosig.unimelb.edu.au/mmcsm_ppi.
Affiliation(s)
- Carlos H M Rodrigues
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
|
28
|
Chen J, Wu F, Lin D, Kong W, Cai X, Yang J, Sun X, Cao P. Rational optimization of a human neutralizing antibody of SARS-CoV-2. Comput Biol Med 2021; 135:104550. [PMID: 34147856 PMCID: PMC8196228 DOI: 10.1016/j.compbiomed.2021.104550] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/05/2021] [Revised: 06/02/2021] [Accepted: 06/02/2021] [Indexed: 01/06/2023]
Abstract
SARS-CoV-2 has caused a worldwide epidemic of coronavirus disease 2019 (COVID-19). Antibody drugs present an effective weapon for tens of millions of COVID-19 patients. Antibodies disrupting the interactions between the receptor-binding domain (RBD) of the SARS-CoV-2 S protein and angiotensin-converting enzyme 2 (ACE2) effectively block SARS-CoV-2 entry into host cells. In order to rapidly develop more potent neutralizing antibodies, we utilized virtual scanning mutagenesis and molecular dynamics simulations to optimize the antibody P2B-2F6, isolated from single B cells of SARS-CoV-2-infected patients. Two potent P2B-2F6 mutants, namely H:V106R and H:V106R/H:P107Y, were found to possess higher binding affinities for the RBD of SARS-CoV-2 than others. Polar interactions are preferred near paratope residues 106 and 107 of the heavy chain. The mutations also extend the hydrogen-bonding network formed between the antibody and the RBD. Notably, the optimized antibodies possess potential neutralizing activity against the alarming N501Y variant of SARS-CoV-2. This study provides insights into structure-based optimization of antibodies with higher affinity to the antigen. We hope that our proposed antibody mutants could contribute to the development of improved therapies against COVID-19.
Affiliation(s)
- Jiao Chen
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China; Jiangsu Province Academy of Traditional Chinese Medicine, Nanjing, 210028, China
- Fei Wu
- Engineering Research Center of Modern Preparation Technology of TCM of Ministry of Education, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Dan Lin
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China
- Weikang Kong
- Sir Run Run Hospital, Nanjing Medical University, Nanjing, 211100, China
- Xueting Cai
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China; Jiangsu Province Academy of Traditional Chinese Medicine, Nanjing, 210028, China
- Jie Yang
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China; Jiangsu Province Academy of Traditional Chinese Medicine, Nanjing, 210028, China
- Xiaoyan Sun
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China; Jiangsu Province Academy of Traditional Chinese Medicine, Nanjing, 210028, China
- Peng Cao
- Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Nanjing, 210028, China; Jiangsu Province Academy of Traditional Chinese Medicine, Nanjing, 210028, China; School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing, 210023, China
|
29
|
Portelli S, Barr L, de Sá AG, Pires DE, Ascher DB. Distinguishing between PTEN clinical phenotypes through mutation analysis. Comput Struct Biotechnol J 2021; 19:3097-3109. [PMID: 34141133 PMCID: PMC8180946 DOI: 10.1016/j.csbj.2021.05.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/31/2021] [Revised: 04/29/2021] [Accepted: 05/19/2021] [Indexed: 12/28/2022] Open
Abstract
Phosphatase and tensin homolog deleted on chromosome ten (PTEN) germline mutations are associated with an overarching condition known as PTEN hamartoma tumor syndrome. Clinical phenotypes associated with this syndrome range from macrocephaly and autism spectrum disorder (ASD) to Cowden syndrome, which manifests as multiple noncancerous tumor-like growths (hamartomas), and an increased predisposition to certain cancers. The basis by which mutations lead to these very diverse phenotypic outcomes, however, remains unclear. Here we show that, by considering the molecular consequences of mutations in PTEN on protein structure and function, we can accurately distinguish PTEN mutations exhibiting different phenotypes. Changes in phosphatase activity, protein stability, and intramolecular interactions appeared to be major drivers of clinical phenotype, with cancer-associated variants leading to the most drastic changes, while ASD and non-pathogenic variants were associated with milder and neutral changes, respectively. Importantly, we show via saturation mutagenesis that more than half of variants of unknown significance could be associated with disease phenotypes, while over half of Cowden syndrome mutations likely lead to cancer. These insights can assist in exploring potentially important clinical outcomes delineated by PTEN variation.
Affiliation(s)
- Stephanie Portelli
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Lucy Barr
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Alex G.C. de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Douglas E.V. Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B. Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
|
30
|
Pertseva M, Gao B, Neumeier D, Yermanos A, Reddy ST. Applications of Machine and Deep Learning in Adaptive Immunity. Annu Rev Chem Biomol Eng 2021; 12:39-62. [PMID: 33852352 DOI: 10.1146/annurev-chembioeng-101420-125021] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
Adaptive immunity is mediated by lymphocyte B and T cells, which respectively express a vast and diverse repertoire of B cell and T cell receptors and, in conjunction with peptide antigen presentation through major histocompatibility complexes (MHCs), can recognize and respond to pathogens and diseased cells. In recent years, advances in deep sequencing have led to a massive increase in the amount of adaptive immune receptor repertoire data; additionally, proteomics techniques have led to a wealth of data on peptide-MHC presentation. These large-scale data sets are now making it possible to train machine and deep learning models, which can be used to identify complex and high-dimensional patterns in immune repertoires. This article introduces adaptive immune repertoires and machine and deep learning related to biological sequence data and then summarizes the many applications in this field, which span from predicting the immunological status of a host to the antigen specificity of individual receptors and the engineering of immunotherapeutics.
Affiliation(s)
- Margarita Pertseva
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; Life Science Zurich Graduate School, ETH Zurich and University of Zurich, 8006 Zurich, Switzerland
- Beichen Gao
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Daniel Neumeier
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Alexander Yermanos
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; Department of Pathology and Immunology, University of Geneva, 1205 Geneva, Switzerland; Department of Biology, Institute of Microbiology and Immunology, ETH Zurich, 8093 Zurich, Switzerland
- Sai T Reddy
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
|
31
|
Rodrigues CHM, Pires DEV, Ascher DB. DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci 2020; 30:60-69. [PMID: 32881105 PMCID: PMC7737773 DOI: 10.1002/pro.3942] [Citation(s) in RCA: 239] [Impact Index Per Article: 59.8] [Received: 06/30/2020] [Revised: 08/27/2020] [Accepted: 08/28/2020] [Indexed: 12/11/2022]
Abstract
Predicting the effect of missense variations on protein stability and dynamics is important for understanding their role in diseases, and the link between protein structure and function. Approaches to estimate these changes have been proposed, but most only consider single-point missense variants and a static state of the protein, while those that incorporate dynamics are computationally expensive. Here we present DynaMut2, a web server that combines Normal Mode Analysis (NMA) methods to capture protein motion and our graph-based signatures to represent the wildtype environment to investigate the effects of single and multiple point mutations on protein stability and dynamics. DynaMut2 was able to accurately predict the effects of missense mutations on protein stability, achieving Pearson's correlation of up to 0.72 (RMSE: 1.02 kcal/mol) on single-point and 0.64 (RMSE: 1.80 kcal/mol) on multiple-point missense mutations across 10-fold cross-validation and independent blind tests. For single-point mutations, DynaMut2 achieved comparable performance with other methods when predicting variations in Gibbs Free Energy (ΔΔG) and in melting temperature (ΔTm). We anticipate that our tool will be a valuable resource for protein flexibility analysis and for studying the role of variants in disease. DynaMut2 is freely available as a web server and API at http://biosig.unimelb.edu.au/dynamut2.
Affiliation(s)
- Carlos H M Rodrigues
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia; Department of Biochemistry, University of Cambridge, Cambridge, UK
32
Abstract
Mutations in protein-coding regions can lead to large biological changes and are associated with genetic conditions, including cancers and Mendelian diseases, as well as drug resistance. Although whole genome and exome sequencing help to elucidate potential genotype-phenotype correlations, there is a large gap between the identification of new variants and deciphering their molecular consequences. A comprehensive understanding of these mechanistic consequences is crucial to better understand and treat diseases in a more personalized and effective way. This is particularly relevant considering estimates that over 80% of mutations associated with a disease are incorrectly assumed to be causative. A thorough analysis of potential effects of mutations is required to correctly identify the molecular mechanisms of disease and enable the distinction between disease-causing and non-disease-causing variation within a gene. Here we present an overview of our integrative mutation analysis platform, which focuses on refining the current genotype-phenotype correlation methods by using the wealth of protein structural information.