1
|
Meng F, Zhou N, Hu G, Liu R, Zhang Y, Jing M, Hou Q. A comprehensive overview of recent advances in generative models for antibodies. Comput Struct Biotechnol J 2024; 23:2648-2660. [PMID: 39027650 PMCID: PMC11254834 DOI: 10.1016/j.csbj.2024.06.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2024] [Revised: 06/15/2024] [Accepted: 06/18/2024] [Indexed: 07/20/2024] Open
Abstract
Therapeutic antibodies are an important class of biopharmaceuticals. With the rapid development of deep learning methods and the increasing amount of antibody data, antibody generative models have made great progress recently. They aim to solve the antibody space searching problems and are widely incorporated into the antibody development process. Therefore, a comprehensive introduction to the development methods in this field is imperative. Here, we collected 34 representative antibody generative models published recently and all generative models can be divided into three categories: sequence-generating models, structure-generating models, and hybrid models, based on their principles and algorithms. We further studied their performance and contributions to antibody sequence prediction, structure optimization, and affinity enhancement. Our manuscript will provide a comprehensive overview of the status of antibody generative models and also offer guidance for selecting different approaches.
Collapse
Affiliation(s)
- Fanxu Meng
- College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
| | - Na Zhou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
- National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
| | - Guangchun Hu
- School of Information Science and Engineering, University of Jinan, Jinan 250022, China
| | - Ruotong Liu
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
- National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
| | - Yuanyuan Zhang
- College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
| | - Ming Jing
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250000, China
| | - Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250100, China
- National Institute of Health Data Science of China, Shandong University, Jinan 250100, China
| |
Collapse
|
2
|
Pamonsupornwichit T, Kodchakorn K, Udomwong P, Sornsuwan K, Weechan A, Juntit OA, Nimmanpipug P, Tayapiwatana C. Engineering affinity of humanized ScFv targeting CD147 antibody: A combined approach of mCSM-AB2 and molecular dynamics simulations. J Mol Graph Model 2024; 133:108884. [PMID: 39405982 DOI: 10.1016/j.jmgm.2024.108884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2024] [Revised: 09/24/2024] [Accepted: 10/11/2024] [Indexed: 10/21/2024]
Abstract
This study aims to assess the effectiveness of mCSM-AB2, a graph-based signature machine learning method, for affinity engineering of the humanized single-chain Fv anti-CD147 (HuScFvM6-1B9). In parallel, molecular dynamics (MD) simulations were used to gain valuable insights into the dynamics and affinity of the HuScFvM6-1B9-CD147 complex. The result analysis involved integrating free energy changes calculated from the mCSM-AB2 with binding free energy predictions from MD simulations. The simulated structures of the modified HuScFvM6-1B9-CD147 domain 1 complex from MD simulations were used to highlight critical residues participating in the binding surface. Interestingly, alterations in the pattern of amino acids of HuScFvM6-1B9 at the complementarity determining regions interacting with the 31EDLGS35 epitope were observed, particularly in mutants that lost binding activity. The predicted mutants of HuScFvM6-1B9 were subsequently engineered and expressed in E. coli for subsequent binding property validation. Compared to WT HuScFvM6-1B9, the mutant HuScFvM6-1B9L1:N32Y exhibited a 1.66-fold increase in binding affinity, with a KD of 1.75 × 10-8 M. While mCSM-AB2 demonstrates insignificant improvement in predicting binding affinity enhancements, it excels at predicting negative effects, aligning well with experimental validation. In addition to binding free energies, total entropy was considered to explain the discrepancy between mCSM-AB2 predictions and experimental results. This study provides guidelines and identifies the limitations of mCSM-AB2 and MD simulations in antibody engineering.
Collapse
Affiliation(s)
- Thanathat Pamonsupornwichit
- Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand; Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Kanchanok Kodchakorn
- Office of Research Administration, Chiang Mai University, Chiang Mai, 50200, Thailand; Department of Chemistry, Faculty of Science, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Piyachat Udomwong
- International College of Digital Innovation, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Kanokporn Sornsuwan
- Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand; Office of Research Administration, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Anuwat Weechan
- Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - On-Anong Juntit
- Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand; Office of Research Administration, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Piyarat Nimmanpipug
- Department of Chemistry, Faculty of Science, Chiang Mai University, Chiang Mai, 50200, Thailand.
| | - Chatchai Tayapiwatana
- Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand; Center of Biomolecular Therapy and Diagnostic, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai, 50200, Thailand.
| |
Collapse
|
3
|
Pham LV, Underwood AP, Binderup A, Fahnøe U, Fernandez-Antunez C, Lopez-Mendez B, Ryberg LA, Galli A, Sølund C, Weis N, Ramirez S, Bukh J. Neutralisation resistance of SARS-CoV-2 spike-variants is primarily mediated by synergistic receptor binding domain substitutions. Emerg Microbes Infect 2024; 13:2412643. [PMID: 39392057 PMCID: PMC11562025 DOI: 10.1080/22221751.2024.2412643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2024] [Revised: 09/22/2024] [Accepted: 09/30/2024] [Indexed: 10/12/2024]
Abstract
The evolution of SARS-CoV-2 has led to the emergence of numerous variants of concern (VOCs), marked by changes in the viral spike glycoprotein, the primary target for neutralising antibody (nAb) responses. Emerging VOCs, particularly omicron sub-lineages, show resistance to nAbs induced by prior infection or vaccination. The precise spike protein changes contributing to this resistance remain unclear in infectious cell culture systems. In the present study, a large panel of infectious SARS-CoV-2 mutant viruses, each with spike protein changes found in VOCs, including omicron JN.1 and its derivatives KP.2 and KP.3, was generated using a reverse genetic system. The susceptibility of these viruses to antibody neutralisation was measured using plasma from convalescent and vaccinated individuals. Synergistic roles of combined substitutions in the spike receptor binding domain (RBD) were observed in neutralisation resistance. However, recombinant viruses with the entire spike protein from a specific VOC showed enhanced resistance, indicating that changes outside the RBD are also significant. In silico analyses of spike antibody epitopes suggested that changes in neutralisation could be due to altered antibody binding affinities. Assessing ACE2 usage for entry through anti-ACE2 antibody blocking and ACE2 siRNA revealed that omicron BA.2.86 and JN.1 mutant viruses were less dependent on ACE2 for entry. However, surface plasmon resonance analysis showed increased affinity for ACE2 for both BA.2.86 and JN.1 compared to the ancestral spike. This detailed analysis of specific changes in the SARS-CoV-2 spike enhances understanding of coronavirus evolution, particularly regarding neutralising antibody evasion and ACE2 entry receptor dependence.
Collapse
Affiliation(s)
- Long V. Pham
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Alexander P. Underwood
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Alekxander Binderup
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Ulrik Fahnøe
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Carlota Fernandez-Antunez
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Blanca Lopez-Mendez
- Protein Production and Characterization Platform, Novo Nordisk Foundation Centre for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Line Abildgaard Ryberg
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Andrea Galli
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Christina Sølund
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
| | - Nina Weis
- Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Santseharay Ramirez
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Jens Bukh
- Copenhagen Hepatitis C Program (CO-HEP), Department of Infectious Diseases, Copenhagen University Hospital, Hvidovre, Denmark
- Copenhagen Hepatitis C Program (CO-HEP), Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| |
Collapse
|
4
|
Kumar A K, Rathore RS. Categorization of hotspots into three types - weak, moderate and strong to distinguish protein-protein versus protein-peptide interactions. J Biomol Struct Dyn 2024; 42:9348-9360. [PMID: 37649387 DOI: 10.1080/07391102.2023.2252077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2023] [Accepted: 08/18/2023] [Indexed: 09/01/2023]
Abstract
Protein-protein and protein-peptide interactions (PPI and PPepI) belong to a similar category of interactions, yet seemingly subtle differences exist among them. To characterize differences between protein-protein (PP) and protein-peptide (PPep) interactions, we have focussed on two important classes of residues-hotspot and anchor residues. Using implicit solvation-based free energy calculations, a very large-scale alanine scanning has been performed on benchmark datasets, consisting of over 5700 interface residues. The differences in the two categories are more pronounced, if the data were divided into three distinct types, namely - weak hotspots (having binding free energy loss upon Ala mutation, ΔΔG, ∼2-10 kcal/mol), moderate hotspots (ΔΔG, ∼10-20 kcal/mol) and strong hotspots (ΔΔG ≥ ∼20 kcal/mol). The analysis suggests that for PPI, weak hotspots are predominantly populated by polar and hydrophobic residues. The distribution shifts towards charged and polar residues for moderate hotspot and charged residues (principally Arg) are overwhelmingly present in the strong hotspot. On the other hand, in the PPepI dataset, the distribution shifts from predominantly hydrophobic and polar (in the weak type) to almost similar preference for polar, hydrophobic and charged residues (in moderate type) and finally the charged residue (Arg) and Trp are mostly occupied in the strong type. The preferred anchor residues in both categories are Arg, Tyr and Leu, possessing bulky side chain and which also strike a delicate balance between side chain flexibility and rigidity. The present knowledge should aid in effective design of biologics, by augmentation or disruption of PPIs with peptides or peptidomimetics.Communicated by Ramaswamy H. Sarma.
Collapse
Affiliation(s)
- Kiran Kumar A
- Department of Bioinformatics, School of Earth, Biological and Environmental Sciences, Central University of South Bihar, Gaya, India
| | - R S Rathore
- Department of Bioinformatics, School of Earth, Biological and Environmental Sciences, Central University of South Bihar, Gaya, India
| |
Collapse
|
5
|
Zhou Y, Myung Y, Rodrigues CM, Ascher D. DDMut-PPI: predicting effects of mutations on protein-protein interactions using graph-based deep learning. Nucleic Acids Res 2024; 52:W207-W214. [PMID: 38783112 PMCID: PMC11223791 DOI: 10.1093/nar/gkae412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2024] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024] Open
Abstract
Protein-protein interactions (PPIs) play a vital role in cellular functions and are essential for therapeutic development and understanding diseases. However, current predictive tools often struggle to balance efficiency and precision in predicting the effects of mutations on these complex interactions. To address this, we present DDMut-PPI, a deep learning model that efficiently and accurately predicts changes in PPI binding free energy upon single and multiple point mutations. Building on the robust Siamese network architecture with graph-based signatures from our prior work, DDMut, the DDMut-PPI model was enhanced with a graph convolutional network operated on the protein interaction interface. We used residue-specific embeddings from ProtT5 protein language model as node features, and a variety of molecular interactions as edge features. By integrating evolutionary context with spatial information, this framework enables DDMut-PPI to achieve a robust Pearson correlation of up to 0.75 (root mean squared error: 1.33 kcal/mol) in our evaluations, outperforming most existing methods. Importantly, the model demonstrated consistent performance across mutations that increase or decrease binding affinity. DDMut-PPI offers a significant advancement in the field and will serve as a valuable tool for researchers probing the complexities of protein interactions. DDMut-PPI is freely available as a web server and an application programming interface at https://biosig.lab.uq.edu.au/ddmut_ppi.
Collapse
Affiliation(s)
- Yunzhuo Zhou
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| | - YooChan Myung
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| | - Carlos H M Rodrigues
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
| | - David B Ascher
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| |
Collapse
|
6
|
Chaudhuri D, Majumder S, Datta J, Giri K. Repurposing of therapeutic antibodies against dengue virus envelope protein receptor binding domain. Arch Microbiol 2024; 206:312. [PMID: 38900285 DOI: 10.1007/s00203-024-04039-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2024] [Revised: 06/05/2024] [Accepted: 06/08/2024] [Indexed: 06/21/2024]
Abstract
Dengue virus (DENV) is the leading cause of numerous deaths every year due to its high infectivity. In this study we have tried to target the DENV envelope protein receptor binding domain, the region crucial for binding to host receptors which leads to membrane fusion and entry of the viral genome into the human host cell. We have taken 13 known FDA approved antiviral therapeutic antibodies from therapeutic antibody database and tried to repurpose them against the DENV envelope protein. Based on the humanness analysis, 10 antibodies were selected against the DENV envelope protein. Computational affinity maturation of the 10 selected antibodies was performed to increase their binding affinity and specificity against the DENV envelope protein which ultimately led to 8 mutant antibodies having better binding affinity than the native ones. Molecular Dynamics (MD) simulation shows that, the stability of the complexes involving both the native and mutant antibodies were found to be the same although the binding energy between the protein and the respective antibodies was seen to improve upon computational affinity maturation. Contact analyses show similar robustness of the interaction for both the mutant and native antibodies during complex formation with the DENV envelope protein. This has led to the selection of total 18 antibodies including 10 natural and 8 affinity matured mutants which have a high probability of interacting with the DENV envelope protein. Finally, based on all these analyses along with heated MD simulation, Bamlanivimab, Etesivimab and Tixagevimab with a mutation of residue 100 of the heavy chain from serine to tyrosine were selected as prospective therapeutic antibodies to combat DENV infection. This study may open a new avenue in designing therapeutics to combat Dengue viral infection.
Collapse
Affiliation(s)
- Dwaipayan Chaudhuri
- Department of Life Sciences, Presidency University, 86/1 College Street, Kolkata, 700073, India
| | - Satyabrata Majumder
- Department of Life Sciences, Presidency University, 86/1 College Street, Kolkata, 700073, India
| | - Joyeeta Datta
- Department of Life Sciences, Presidency University, 86/1 College Street, Kolkata, 700073, India
| | - Kalyan Giri
- Department of Life Sciences, Presidency University, 86/1 College Street, Kolkata, 700073, India.
| |
Collapse
|
7
|
Velloso JPL, de Sá AGC, Pires DEV, Ascher DB. Engineering G protein-coupled receptors for stabilization. Protein Sci 2024; 33:e5000. [PMID: 38747401 PMCID: PMC11094779 DOI: 10.1002/pro.5000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2023] [Revised: 03/21/2024] [Accepted: 04/10/2024] [Indexed: 05/19/2024]
Abstract
G protein-coupled receptors (GPCRs) are one of the most important families of targets for drug discovery. One of the limiting steps in the study of GPCRs has been their stability, with significant and time-consuming protein engineering often used to stabilize GPCRs for structural characterization and drug screening. Unfortunately, computational methods developed using globular soluble proteins have translated poorly to the rational engineering of GPCRs. To fill this gap, we propose GPCR-tm, a novel and personalized structurally driven web-based machine learning tool to study the impacts of mutations on GPCR stability. We show that GPCR-tm performs as well as or better than alternative methods, and that it can accurately rank the stability changes of a wide range of mutations occurring in various types of class A GPCRs. GPCR-tm achieved Pearson's correlation coefficients of 0.74 and 0.46 on 10-fold cross-validation and blind test sets, respectively. We observed that the (structural) graph-based signatures were the most important set of features for predicting destabilizing mutations, which points out that these signatures properly describe the changes in the environment where the mutations occur. More specifically, GPCR-tm was able to accurately rank mutations based on their effect on protein stability, guiding their rational stabilization. GPCR-tm is available through a user-friendly web server at https://biosig.lab.uq.edu.au/gpcr_tm/.
Collapse
Affiliation(s)
- João Paulo L. Velloso
- School of Chemistry and Molecular Biosciences, The Australian Centre for EcogenomicsThe University of QueenslandBrisbaneQueenslandAustralia
- Computational Biology and Clinical InformaticsBaker Heart and Diabetes InstituteMelbourneVictoriaAustralia
- Baker Department of Cardiometabolic HealthThe University of MelbourneParkvilleVictoriaAustralia
| | - Alex G. C. de Sá
- School of Chemistry and Molecular Biosciences, The Australian Centre for EcogenomicsThe University of QueenslandBrisbaneQueenslandAustralia
- Computational Biology and Clinical InformaticsBaker Heart and Diabetes InstituteMelbourneVictoriaAustralia
- Baker Department of Cardiometabolic HealthThe University of MelbourneParkvilleVictoriaAustralia
| | - Douglas E. V. Pires
- School of Computing and Information SystemsThe University of MelbourneParkvilleVictoriaAustralia
| | - David B. Ascher
- School of Chemistry and Molecular Biosciences, The Australian Centre for EcogenomicsThe University of QueenslandBrisbaneQueenslandAustralia
- Computational Biology and Clinical InformaticsBaker Heart and Diabetes InstituteMelbourneVictoriaAustralia
- Baker Department of Cardiometabolic HealthThe University of MelbourneParkvilleVictoriaAustralia
| |
Collapse
|
8
|
Jin R, Ye Q, Wang J, Cao Z, Jiang D, Wang T, Kang Y, Xu W, Hsieh CY, Hou T. AttABseq: an attention-based deep learning prediction method for antigen-antibody binding affinity changes based on protein sequences. Brief Bioinform 2024; 25:bbae304. [PMID: 38960407 PMCID: PMC11221889 DOI: 10.1093/bib/bbae304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2023] [Revised: 04/15/2024] [Accepted: 06/11/2024] [Indexed: 07/05/2024] Open
Abstract
The optimization of therapeutic antibodies through traditional techniques, such as candidate screening via hybridoma or phage display, is resource-intensive and time-consuming. In recent years, computational and artificial intelligence-based methods have been actively developed to accelerate and improve the development of therapeutic antibodies. In this study, we developed an end-to-end sequence-based deep learning model, termed AttABseq, for the predictions of the antigen-antibody binding affinity changes connected with antibody mutations. AttABseq is a highly efficient and generic attention-based model by utilizing diverse antigen-antibody complex sequences as the input to predict the binding affinity changes of residue mutations. The assessment on the three benchmark datasets illustrates that AttABseq is 120% more accurate than other sequence-based models in terms of the Pearson correlation coefficient between the predicted and experimental binding affinity changes. Moreover, AttABseq also either outperforms or competes favorably with the structure-based approaches. Furthermore, AttABseq consistently demonstrates robust predictive capabilities across a diverse array of conditions, underscoring its remarkable capacity for generalization across a wide spectrum of antigen-antibody complexes. It imposes no constraints on the quantity of altered residues, rendering it particularly applicable in scenarios where crystallographic structures remain unavailable. The attention-based interpretability analysis indicates that the causal effects of point mutations on antibody-antigen binding affinity changes can be visualized at the residue level, which might assist automated antibody sequence optimization. We believe that AttABseq provides a fiercely competitive answer to therapeutic antibody optimization.
Collapse
Affiliation(s)
- Ruofan Jin
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
- College of Life Science, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Qing Ye
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Jike Wang
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Zheng Cao
- College of Computer Science and Technology, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Dejun Jiang
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Tianyue Wang
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Wanting Xu
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Science, Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Zhejiang University, Yuhangtang Road 866, Hangzhou 310058, Zhejiang, China
| |
Collapse
|
9
|
Sharma D, Rawat P, Greiff V, Janakiraman V, Gromiha MM. Predicting the immune escape of SARS-CoV-2 neutralizing antibodies upon mutation. Biochim Biophys Acta Mol Basis Dis 2024; 1870:166959. [PMID: 37967796 DOI: 10.1016/j.bbadis.2023.166959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 10/25/2023] [Accepted: 11/07/2023] [Indexed: 11/17/2023]
Abstract
COVID-19 has resulted in millions of deaths and severe impact on economies worldwide. Moreover, the emergence of SARS-CoV-2 variants presented significant challenges in controlling the pandemic, particularly their potential to avoid the immune system and evade vaccine immunity. This has led to a growing need for research to predict how mutations in SARS-CoV-2 reduces the ability of antibodies to neutralize the virus. In this study, we assembled a set of 1813 mutations from the interface of SARS-CoV-2 spike protein's receptor binding domain (RBD) and neutralizing antibody complexes and developed a machine learning model to classify high or low escape mutations using interaction energy, inter-residue contacts and predicted binding free energy change. Our approach achieved an Area under the Receiver Operating Characteristics (ROC) Curve (AUC) of 0.91 using the Random Forest classifier on the test dataset with 217 mutations. The model was further utilized to predict the escape mutations on a dataset of 29,165 mutations located at the interface of 83 RBD-neutralizing antibody complexes. A small subset of this dataset was also validated based on available experimental data. We found that top 10 % high escape mutations were dominated by charged to nonpolar mutations whereas low escape mutations were dominated by polar to nonpolar mutations. We believe that the present method will allow prioritization of high/low escape mutations in the context of neutralizing antibodies targeting SARS-CoV-2 RBD region and assist antibody design for current and emerging variants.
Collapse
Affiliation(s)
- Divya Sharma
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
| | - Puneet Rawat
- University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Victor Greiff
- University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Vani Janakiraman
- Infection Biology Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
| | - M Michael Gromiha
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama 226-8501, Japan; Department of Computer Science, National University of Singapore, Singapore.
| |
Collapse
|
10
|
Rana MM, Nguyen DD. Geometric Graph Learning to Predict Changes in Binding Free Energy and Protein Thermodynamic Stability upon Mutation. J Phys Chem Lett 2023; 14:10870-10879. [PMID: 38032742 DOI: 10.1021/acs.jpclett.3c02679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2023]
Abstract
Accurate prediction of binding free energy changes upon mutations is vital for optimizing drugs, designing proteins, understanding genetic diseases, and cost-effective virtual screening. While machine learning methods show promise in this domain, achieving accuracy and generalization across diverse data sets remains a challenge. This study introduces Geometric Graph Learning for Protein-Protein Interactions (GGL-PPI), a novel approach integrating geometric graph representation and machine learning to forecast mutation-induced binding free energy changes. GGL-PPI leverages atom-level graph coloring and multiscale weighted colored geometric subgraphs to capture structural features of biomolecules, demonstrating superior performance on three standard data sets, namely, AB-Bind, SKEMPI 1.0, and SKEMPI 2.0 data sets. The model's efficacy extends to predicting protein thermodynamic stability in a blind test set, providing unbiased predictions for both direct and reverse mutations and showcasing notable generalization. GGL-PPI's precision in predicting changes in binding free energy and stability due to mutations enhances our comprehension of protein complexes, offering valuable insights for drug design endeavors.
Collapse
Affiliation(s)
- Md Masud Rana
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| |
Collapse
|
11
|
Nikam R, Yugandhar K, Gromiha MM. Deep learning-based method for predicting and classifying the binding affinity of protein-protein complexes. BIOCHIMICA ET BIOPHYSICA ACTA. PROTEINS AND PROTEOMICS 2023; 1871:140948. [PMID: 37567456 DOI: 10.1016/j.bbapap.2023.140948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Revised: 08/05/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Protein-protein interactions (PPIs) play a critical role in various biological processes. Accurately estimating the binding affinity of PPIs is essential for understanding the underlying molecular recognition mechanisms. In this study, we employed a deep learning approach to predict the binding affinity (ΔG) of protein-protein complexes. To this end, we compiled a dataset of 903 protein-protein complexes, each with its corresponding experimental binding affinity, which belong to six functional classes. We extracted 8 to 20 non-redundant features from the sequence information as well as the predicted three-dimensional structures using feature selection methods for each protein functional class. Our method showed an overall mean absolute error of 1.05 kcal/mol and a correlation of 0.79 between experimental and predicted ΔG values. Additionally, we evaluated our model for discriminating high and low affinity protein-protein complexes and it achieved an accuracy of 87% with an F1 score of 0.86 using 10-fold cross-validation on the selected features. Our approach presents an efficient tool for studying PPIs and provides crucial insights into the underlying mechanisms of the molecular recognition process. The web server can be freely accessed at https://web.iitm.ac.in/bioinfo2/DeepPPAPred/index.html.
Collapse
Affiliation(s)
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
| | - Kumar Yugandhar
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computational Biology, Cornell University, New York, USA
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore.
| |
Collapse
|
12
|
Yu H, Mao G, Pei Z, Cen J, Meng W, Wang Y, Zhang S, Li S, Xu Q, Sun M, Xiao K. In Vitro Affinity Maturation of Nanobodies against Mpox Virus A29 Protein Based on Computer-Aided Design. Molecules 2023; 28:6838. [PMID: 37836685 PMCID: PMC10574621 DOI: 10.3390/molecules28196838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Revised: 09/20/2023] [Accepted: 09/25/2023] [Indexed: 10/15/2023] Open
Abstract
Mpox virus (MPXV), the most pathogenic zoonotic orthopoxvirus, caused worldwide concern during the SARS-CoV-2 epidemic. Growing evidence suggests that the MPXV surface protein A29 could be a specific diagnostic marker for immunological detection. In this study, a fully synthetic phage display library was screened, revealing two nanobodies (A1 and H8) that specifically recognize A29. Subsequently, an in vitro affinity maturation strategy based on computer-aided design was proposed by building and docking the A29 and A1 three-dimensional structures. Ligand-receptor binding and molecular dynamics simulations were performed to predict binding modes and key residues. Three mutant antibodies were predicted using the platform, increasing the affinity by approximately 10-fold compared with the parental form. These results will facilitate the application of computers in antibody optimization and reduce the cost of antibody development; moreover, the predicted antibodies provide a reference for establishing an immunological response against MPXV.
Collapse
Affiliation(s)
- Haiyang Yu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China;
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Guanchao Mao
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Zhipeng Pei
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Jinfeng Cen
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Wenqi Meng
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Yunqin Wang
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Shanshan Zhang
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Songling Li
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Qingqiang Xu
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Mingxue Sun
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
| | - Kai Xiao
- Lab of Toxicology and Pharmacology, Faculty of Naval Medicine, Naval Medical University, Shanghai 200433, China; (G.M.); (Z.P.); (J.C.); (W.M.); (Y.W.); (S.Z.); (S.L.)
- Marine Biomedical Science and Technology Innovation Platform of Lingang Special Area, Shanghai 201306, China
| |
Collapse
|
13
|
Yue Y, Li S, Wang L, Liu H, Tong HHY, He S. MpbPPI: a multi-task pre-training-based equivariant approach for the prediction of the effect of amino acid mutations on protein-protein interactions. Brief Bioinform 2023; 24:bbad310. [PMID: 37651610 PMCID: PMC10516393 DOI: 10.1093/bib/bbad310] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2023] [Revised: 07/12/2023] [Accepted: 08/04/2023] [Indexed: 09/02/2023] Open
Abstract
The accurate prediction of the effect of amino acid mutations for protein-protein interactions (PPI $\Delta \Delta G$) is a crucial task in protein engineering, as it provides insight into the relevant biological processes underpinning protein binding and provides a basis for further drug discovery. In this study, we propose MpbPPI, a novel multi-task pre-training-based geometric equivariance-preserving framework to predict PPI $\Delta \Delta G$. Pre-training on a strictly screened pre-training dataset is employed to address the scarcity of protein-protein complex structures annotated with PPI $\Delta \Delta G$ values. MpbPPI employs a multi-task pre-training technique, forcing the framework to learn comprehensive backbone and side chain geometric regulations of protein-protein complexes at different scales. After pre-training, MpbPPI can generate high-quality representations capturing the effective geometric characteristics of labeled protein-protein complexes for downstream $\Delta \Delta G$ predictions. MpbPPI serves as a scalable framework supporting different sources of mutant-type (MT) protein-protein complexes for flexible application. Experimental results on four benchmark datasets demonstrate that MpbPPI is a state-of-the-art framework for PPI $\Delta \Delta G$ predictions. The data and source code are available at https://github.com/arantir123/MpbPPI.
Collapse
Affiliation(s)
- Yang Yue
- School of Computer Science from the University of Birmingham, UK
| | - Shu Li
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Lingling Wang
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Huanxiang Liu
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Henry H Y Tong
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Shan He
- School of Computer Science, the University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| |
Collapse
|
14
|
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. FRONTIERS IN BIOINFORMATICS 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] Open
Affiliation(s)
| | - Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
| |
Collapse
|
15
|
Chen J, Woldring DR, Huang F, Huang X, Wei GW. Topological deep learning based deep mutational scanning. Comput Biol Med 2023; 164:107258. [PMID: 37506452 PMCID: PMC10528359 DOI: 10.1016/j.compbiomed.2023.107258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2023] [Revised: 06/28/2023] [Accepted: 07/08/2023] [Indexed: 07/30/2023]
Abstract
High-throughput deep mutational scanning (DMS) experiments have significantly impacted protein engineering, drug discovery, immunology, cancer biology, and evolutionary biology by enabling the systematic understanding of protein functions. However, the mutational space associated with proteins is astronomically large, making it overwhelming for current experimental capabilities. Therefore, alternative methods for DMS are imperative. We propose a topological deep learning (TDL) paradigm to facilitate in silico DMS. We utilize a new topological data analysis (TDA) technique based on the persistent spectral theory, also known as persistent Laplacian, to capture both topological invariants and the homotopic shape evolution of data. To validate our TDL-DMS model, we use SARS-CoV-2 datasets and show excellent accuracy and reliability for binding interface mutations. This finding is significant for SARS-CoV-2 variant forecasting and designing effective antibodies and vaccines. Our proposed model is expected to have a significant impact on drug discovery, vaccine design, precision medicine, and protein engineering.
Collapse
Affiliation(s)
- Jiahui Chen
- Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA
| | - Daniel R Woldring
- Department of Chemical Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Faqing Huang
- Department of Chemistry and Biochemistry, University of Southern Mississippi, Hattiesburg, MS 39406, USA
| | - Xuefei Huang
- Department of Chemistry, Michigan State University, MI 48824, USA; Department of Biomedical Engineering, Michigan State University, East Lansing, MI 48824, USA; The Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
| |
Collapse
|
16
|
Wang G, Liu X, Wang K, Gao Y, Li G, Baptista-Hon DT, Yang XH, Xue K, Tai WH, Jiang Z, Cheng L, Fok M, Lau JYN, Yang S, Lu L, Zhang P, Zhang K. Deep-learning-enabled protein-protein interaction analysis for prediction of SARS-CoV-2 infectivity and variant evolution. Nat Med 2023; 29:2007-2018. [PMID: 37524952 DOI: 10.1038/s41591-023-02483-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Accepted: 06/28/2023] [Indexed: 08/02/2023]
Abstract
Host-pathogen interactions and pathogen evolution are underpinned by protein-protein interactions between viral and host proteins. An understanding of how viral variants affect protein-protein binding is important for predicting viral-host interactions, such as the emergence of new pathogenic SARS-CoV-2 variants. Here we propose an artificial intelligence-based framework called UniBind, in which proteins are represented as a graph at the residue and atom levels. UniBind integrates protein three-dimensional structure and binding affinity and is capable of multi-task learning for heterogeneous biological data integration. In systematic tests on benchmark datasets and further experimental validation, UniBind effectively and scalably predicted the effects of SARS-CoV-2 spike protein variants on their binding affinities to the human ACE2 receptor, as well as to SARS-CoV-2 neutralizing monoclonal antibodies. Furthermore, in a cross-species analysis, UniBind could be applied to predict host susceptibility to SARS-CoV-2 variants and to predict future viral variant evolutionary trends. This in silico approach has the potential to serve as an early warning system for problematic emerging SARS-CoV-2 variants, as well as to facilitate research on protein-protein interactions in general.
Collapse
Affiliation(s)
- Guangyu Wang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China.
| | - Xiaohong Liu
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- UCL Cancer Institute, University College London, London, UK
| | - Kai Wang
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China
| | - Yuanxu Gao
- Guangzhou National Laboratory, Guangzhou, China
| | - Gen Li
- Guangzhou National Laboratory, Guangzhou, China
- Guangzhou Women and Children's Medical Center, Guangzhou, China
| | - Daniel T Baptista-Hon
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Xiaohong Helena Yang
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Kanmin Xue
- Nuffield Laboratory of Ophthalmology, Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Wa Hou Tai
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Zeyu Jiang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Linling Cheng
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Manson Fok
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Johnson Yiu-Nam Lau
- Departments of Biology and Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Shengyong Yang
- State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
| | - Ligong Lu
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Ping Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Kang Zhang
- Instutite for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China.
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China.
- Guangzhou National Laboratory, Guangzhou, China.
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China.
| |
Collapse
|
17
|
Mohseni Behbahani Y, Laine E, Carbone A. Deep Local Analysis deconstructs protein-protein interfaces and accurately estimates binding affinity changes upon mutation. Bioinformatics 2023; 39:i544-i552. [PMID: 37387162 DOI: 10.1093/bioinformatics/btad231] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION The spectacular recent advances in protein and protein complex structure prediction hold promise for reconstructing interactomes at large-scale and residue resolution. Beyond determining the 3D arrangement of interacting partners, modeling approaches should be able to unravel the impact of sequence variations on the strength of the association. RESULTS In this work, we report on Deep Local Analysis, a novel and efficient deep learning framework that relies on a strikingly simple deconstruction of protein interfaces into small locally oriented residue-centered cubes and on 3D convolutions recognizing patterns within cubes. Merely based on the two cubes associated with the wild-type and the mutant residues, DLA accurately estimates the binding affinity change for the associated complexes. It achieves a Pearson correlation coefficient of 0.735 on about 400 mutations on unseen complexes. Its generalization capability on blind datasets of complexes is higher than the state-of-the-art methods. We show that taking into account the evolutionary constraints on residues contributes to predictions. We also discuss the influence of conformational variability on performance. Beyond the predictive power on the effects of mutations, DLA is a general framework for transferring the knowledge gained from the available non-redundant set of complex protein structures to various tasks. For instance, given a single partially masked cube, it recovers the identity and physicochemical class of the central residue. Given an ensemble of cubes representing an interface, it predicts the function of the complex. AVAILABILITY AND IMPLEMENTATION Source code and models are available at http://gitlab.lcqb.upmc.fr/DLA/DLA.git.
Collapse
Affiliation(s)
- Yasser Mohseni Behbahani
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| | - Elodie Laine
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| | - Alessandra Carbone
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| |
Collapse
|
18
|
Grasso D, Galderisi S, Santucci A, Bernini A. Pharmacological Chaperones and Protein Conformational Diseases: Approaches of Computational Structural Biology. Int J Mol Sci 2023; 24:ijms24065819. [PMID: 36982893 PMCID: PMC10054308 DOI: 10.3390/ijms24065819] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2023] [Revised: 03/09/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Whenever a protein fails to fold into its native structure, a profound detrimental effect is likely to occur, and a disease is often developed. Protein conformational disorders arise when proteins adopt abnormal conformations due to a pathological gene variant that turns into gain/loss of function or improper localization/degradation. Pharmacological chaperones are small molecules restoring the correct folding of a protein suitable for treating conformational diseases. Small molecules like these bind poorly folded proteins similarly to physiological chaperones, bridging non-covalent interactions (hydrogen bonds, electrostatic interactions, and van der Waals contacts) loosened or lost due to mutations. Pharmacological chaperone development involves, among other things, structural biology investigation of the target protein and its misfolding and refolding. Such research can take advantage of computational methods at many stages. Here, we present an up-to-date review of the computational structural biology tools and approaches regarding protein stability evaluation, binding pocket discovery and druggability, drug repurposing, and virtual ligand screening. The tools are presented as organized in an ideal workflow oriented at pharmacological chaperones' rational design, also with the treatment of rare diseases in mind.
Collapse
Affiliation(s)
- Daniela Grasso
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Silvia Galderisi
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Annalisa Santucci
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Andrea Bernini
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| |
Collapse
|
19
|
Evolution of tetraspanin antigens in the zoonotic Asian blood fluke Schistosoma japonicum. Parasit Vectors 2023; 16:97. [PMID: 36918965 PMCID: PMC10012309 DOI: 10.1186/s13071-023-05706-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2022] [Accepted: 02/17/2023] [Indexed: 03/16/2023] Open
Abstract
BACKGROUND Despite successful control efforts in China over the past 60 years, zoonotic schistosomiasis caused by Schistosoma japonicum remains a threat with transmission ongoing and the risk of localised resurgences prompting calls for a novel integrated control strategy, with an anti-schistosome vaccine as a core element. Anti-schistosome vaccine development and immunisation attempts in non-human mammalian host species, intended to interrupt transmission, and utilising various antigen targets, have yielded mixed success, with some studies highlighting variation in schistosome antigen coding genes (ACGs) as possible confounders of vaccine efficacy. Thus, robust selection of target ACGs, including assessment of their genetic diversity and antigenic variability, is paramount. Tetraspanins (TSPs), a family of tegument-surface antigens in schistosomes, interact directly with the host's immune system and are promising vaccine candidates. Here, for the first time to our knowledge, diversity in S. japonicum TSPs (SjTSPs) and the impact of diversifying selection and sequence variation on immunogenicity in these protiens were evaluated. METHODS SjTSP sequences, representing parasite populations from seven provinces across China, were gathered by baiting published short-read NGS data and were analysed using in silico methods to measure sequence variation and selection pressures and predict the impact of selection on variation in antigen protein structure, function and antigenic propensity. RESULTS Here, 27 SjTSPs were identified across three subfamilies, highlighting the diversity of TSPs in S. japonicum. Considerable variation was demonstrated for several SjTSPs between geographical regions/provinces, revealing that episodic, diversifying positive selection pressures promote amino acid variation/variability in the large extracellular loop (LEL) domain of certain SjTSPs. Accumulating polymorphisms in the LEL domain of SjTSP-2, -8 and -23 led to altered structural, functional and antibody binding characteristics, which are predicted to impact antibody recognition and possibly blunt the host's ability to respond to infection. Such changes, therefore, appear to represent a mechanism utilised by S. japonicum to evade the host's immune system. CONCLUSION Whilst the genetic and antigenic geographic variability observed amongst certain SjTSPs could present challenges to vaccine development, here we demonstrate conservation amongst SjTSP-1, -13 and -14, revealing their likely improved utility as efficacious vaccine candidates. Importantly, our data highlight that robust evaluation of vaccine target variability in natural parasite populations should be a prerequisite for anti-schistosome vaccine development.
Collapse
|
20
|
PSnpBind-ML: predicting the effect of binding site mutations on protein-ligand binding affinity. J Cheminform 2023; 15:31. [PMID: 36864534 PMCID: PMC9983232 DOI: 10.1186/s13321-023-00701-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Accepted: 02/17/2023] [Indexed: 03/04/2023] Open
Abstract
Protein mutations, especially those which occur in the binding site, play an important role in inter-individual drug response and may alter binding affinity and thus impact the drug's efficacy and side effects. Unfortunately, large-scale experimental screening of ligand-binding against protein variants is still time-consuming and expensive. Alternatively, in silico approaches can play a role in guiding those experiments. Methods ranging from computationally cheaper machine learning (ML) to the more expensive molecular dynamics have been applied to accurately predict the mutation effects. However, these effects have been mostly studied on limited and small datasets, while ideally a large dataset of binding affinity changes due to binding site mutations is needed. In this work, we used the PSnpBind database with six hundred thousand docking experiments to train a machine learning model predicting protein-ligand binding affinity for both wild-type proteins and their variants with a single-point mutation in the binding site. A numerical representation of the protein, binding site, mutation, and ligand information was encoded using 256 features, half of them were manually selected based on domain knowledge. A machine learning approach composed of two regression models is proposed, the first predicting wild-type protein-ligand binding affinity while the second predicting the mutated protein-ligand binding affinity. The best performing models reported an RMSE value within 0.5 [Formula: see text] 0.6 kcal/mol-1 on an independent test set with an R2 value of 0.87 [Formula: see text] 0.90. We report an improvement in the prediction performance compared to several reported models developed for protein-ligand binding affinity prediction. The obtained models can be used as a complementary method in early-stage drug discovery. They can be applied to rapidly obtain a better overview of the ligand binding affinity changes across protein variants carried by people in the population and narrow down the search space where more time-demanding methods can be used to identify potential leads that achieve a better affinity for all protein variants.
Collapse
|
21
|
Aljarf R, Tang S, Pires DEV, Ascher DB. embryoTox: Using Graph-Based Signatures to Predict the Teratogenicity of Small Molecules. J Chem Inf Model 2023; 63:432-441. [PMID: 36595441 DOI: 10.1021/acs.jcim.2c00824] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
Teratogenic drugs can lead to extreme fetal malformation and consequently critically influence the fetus's health, yet the teratogenic risks associated with most approved drugs are unknown. Here, we propose a novel predictive tool, embryoTox, which utilizes a graph-based signature representation of the chemical structure of a small molecule to predict and classify molecules likely to be safe during pregnancy. embryoTox was trained and validated using in vitro bioactivity data of over 700 small molecules with characterized teratogenicity effects. Our final model achieved an area under the receiver operating characteristic curve (AUC) of up to 0.96 on 10-fold cross-validation and 0.82 on nonredundant blind tests, outperforming alternative approaches. We believe that our predictive tool will provide a practical resource for optimizing screening libraries to determine effective and safe molecules to use during pregnancy. To provide a simple and integrated platform to rapidly screen for potential safe molecules and their risk factors, we made embryoTox freely available online at https://biosig.lab.uq.edu.au/embryotox/.
Collapse
Affiliation(s)
- Raghad Aljarf
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
| | - Simon Tang
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
| | - Douglas E V Pires
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
| | - David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia
| |
Collapse
|
22
|
Wang E. Prediction of antibody binding to SARS-CoV-2 RBDs. BIOINFORMATICS ADVANCES 2023; 3:vbac103. [PMID: 36698760 PMCID: PMC9868522 DOI: 10.1093/bioadv/vbac103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Revised: 12/18/2022] [Accepted: 12/31/2022] [Indexed: 01/03/2023]
Abstract
Summary The ability to predict antibody-antigen binding is essential for computational models of antibody affinity maturation and protein design. While most models aim to predict binding for arbitrary antigens and antibodies, the global impact of SARS-CoV-2 on public health and the availability of associated data suggest that a SARS-CoV-2-specific model would be highly beneficial. In this work, we present a neural network model, trained on ∼315 000 datapoints from deep mutational scanning experiments, that predicts escape fractions of SARS-CoV-2 RBDs binding to arbitrary antibodies. The antibody embeddings within the model constitute an effective sequence space, which correlates with the Hamming distance, suggesting that these embeddings may be useful for downstream tasks such as binding prediction. Indeed, the model achieves Spearman correlation coefficients of 0.46 and 0.52 on two held-out test sets. By comparison, correlation coefficients calculated using existing structure and sequence-based models do not exceed 0.28. The correlation coefficient against dissociation constants of antibodies binding to SARS-CoV-2 RBD variants is 0.46. Additionally, the residue-level escapes are highest in the antibody epitope, correlating well with experimentally measured escapes. We further study the effect of antibody chain use, embedding dimension size and feed-forward and convolutional architectures on the model results. Lastly, we find that the inference time of our model is significantly faster than previous models, suggesting that it could be a useful tool for the accurate and rapid prediction of antibodies binding to SARS-CoV-2 RBDs. Availability and implementation The model and associated code are available for download at https://github.com/ericzwang/RBD_AB. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Collapse
Affiliation(s)
- Eric Wang
- To whom correspondence should be addressed.
| |
Collapse
|
23
|
Ascher DB, Kaminskas LM, Myung Y, Pires DEV. Using Graph-Based Signatures to Guide Rational Antibody Engineering. Methods Mol Biol 2023; 2552:375-397. [PMID: 36346604 DOI: 10.1007/978-1-0716-2609-2_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Antibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.
Collapse
Affiliation(s)
- David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Department of Biochemistry, Cambridge University, Cambridge, UK
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
| | - Lisa M Kaminskas
- School of Biological Sciences, University of Queensland, St Lucia, QLD, Australia
| | - Yoochan Myung
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia.
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.
- School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia.
| |
Collapse
|
24
|
Bai Z, Wang J, Li J, Yuan H, Wang P, Zhang M, Feng Y, Cao X, Cao X, Kang G, de Marco A, Huang H. Design of nanobody-based bispecific constructs by in silico affinity maturation and umbrella sampling simulations. Comput Struct Biotechnol J 2022; 21:601-613. [PMID: 36659922 PMCID: PMC9822835 DOI: 10.1016/j.csbj.2022.12.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022] Open
Abstract
Random mutagenesis is the natural opportunity for proteins to evolve and biotechnologically it has been exploited to create diversity and identify variants with improved characteristics in the mutant pools. Rational mutagenesis based on biophysical assumptions and supported by computational power has been proposed as a faster and more predictable strategy to reach the same aim. In this work we confirm that substantial improvements in terms of both affinity and stability of nanobodies can be obtained by using combinations of algorithms, even for binders with already high affinity and elevated thermal stability. Furthermore, in silico approaches allowed the development of an optimized bispecific construct able to bind simultaneously the two clinically relevant antigens TNF-α and IL-23 and, by means of its enhanced avidity, to inhibit effectively the apoptosis of TNF-α-sensitive L929 cells. The results revealed that salt bridges, hydrogen bonds, aromatic-aromatic and cation-pi interactions had a critical role in increasing affinity. We provided a platform for the construction of high-affinity bispecific constructs based on nanobodies that can have relevant applications for the control of all those biological mechanisms in which more than a single antigen must be targeted to increase the treatment effectiveness and avoid resistance mechanisms.
Collapse
Affiliation(s)
- Zixuan Bai
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Jiewen Wang
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China,Institute of Shaoxing, Tianjin University, Zhejiang 312300, China
| | - Jiaqi Li
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China,Institute of Shaoxing, Tianjin University, Zhejiang 312300, China
| | - Haibin Yuan
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Ping Wang
- Tianjin Modern Innovative TCM Technology Co. Ltd., Tianjin, China
| | - Miao Zhang
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China,China Resources Biopharmaceutical Company Limited, Beijing, China
| | - Yuanhang Feng
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Xiangtong Cao
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Xiangan Cao
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China
| | - Guangbo Kang
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China,Institute of Shaoxing, Tianjin University, Zhejiang 312300, China,Corresponding authors at: Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China.
| | - Ario de Marco
- Laboratory for Environmental and Life Sciences, University of Nova Gorica, Nova Gorica, Slovenia,Corresponding author.
| | - He Huang
- Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China,Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin 300072, China,Institute of Shaoxing, Tianjin University, Zhejiang 312300, China,Corresponding authors at: Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300350, China.
| |
Collapse
|
25
|
Boer JC, Pan Q, Holien JK, Nguyen TB, Ascher DB, Plebanski M. A bias of Asparagine to Lysine mutations in SARS-CoV-2 outside the receptor binding domain affects protein flexibility. Front Immunol 2022; 13:954435. [PMID: 36569921 PMCID: PMC9788125 DOI: 10.3389/fimmu.2022.954435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Accepted: 11/14/2022] [Indexed: 12/14/2022] Open
Abstract
Introduction COVID-19 pandemic has been threatening public health and economic development worldwide for over two years. Compared with the original SARS-CoV-2 strain reported in 2019, the Omicron variant (B.1.1.529.1) is more transmissible. This variant has 34 mutations in its Spike protein, 15 of which are present in the Receptor Binding Domain (RBD), facilitating viral internalization via binding to the angiotensin-converting enzyme 2 (ACE2) receptor on endothelial cells as well as promoting increased immune evasion capacity. Methods Herein we compared SARS-CoV-2 proteins (including ORF3a, ORF7, ORF8, Nucleoprotein (N), membrane protein (M) and Spike (S) proteins) from multiple ancestral strains. We included the currently designated original Variant of Concern (VOC) Omicron, its subsequent emerged variants BA.1, BA2, BA3, BA.4, BA.5, the two currently emerging variants BQ.1 and BBX.1, and compared these with the previously circulating VOCs Alpha, Beta, Gamma, and Delta, to better understand the nature and potential impact of Omicron specific mutations. Results Only in Omicron and its subvariants, a bias toward an Asparagine to Lysine (N to K) mutation was evident within the Spike protein, including regions outside the RBD domain, while none of the regions outside the Spike protein domain were characterized by this mutational bias. Computational structural analysis revealed that three of these specific mutations located in the central core region, contribute to a preference for the alteration of conformations of the Spike protein. Several mutations in the RBD which have circulated across most Omicron subvariants were also analysed, and these showed more potential for immune escape. Conclusion This study emphasizes the importance of understanding how specific N to K mutations outside of the RBD region affect SARS-CoV-2 conformational changes and the need for neutralizing antibodies for Omicron to target a subset of conformationally dependent B cell epitopes.
Collapse
Affiliation(s)
- Jennifer C. Boer
- School of Health and Biomedical Science, Royal Melbourne Institute of Technology, Melbourne, VIC, Australia
| | - Qisheng Pan
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
| | - Jessica K. Holien
- School of Science, Royal Melbourne Institute of Technology (RMIT) University, Melbourne, VIC, Australia
| | - Thanh-Binh Nguyen
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
| | - David B. Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
| | - Magdalena Plebanski
- School of Health and Biomedical Science, Royal Melbourne Institute of Technology, Melbourne, VIC, Australia,*Correspondence: Magdalena Plebanski,
| |
Collapse
|
26
|
Robert PA, Akbar R, Frank R, Pavlović M, Widrich M, Snapkov I, Slabodkin A, Chernigovskaya M, Scheffer L, Smorodina E, Rawat P, Mehta BB, Vu MH, Mathisen IF, Prósz A, Abram K, Olar A, Miho E, Haug DTT, Lund-Johansen F, Hochreiter S, Haff IH, Klambauer G, Sandve GK, Greiff V. Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction. NATURE COMPUTATIONAL SCIENCE 2022; 2:845-865. [PMID: 38177393 DOI: 10.1038/s43588-022-00372-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/16/2021] [Accepted: 11/09/2022] [Indexed: 01/06/2024]
Abstract
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
Collapse
Affiliation(s)
- Philippe A Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
| | - Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Robert Frank
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | | | - Michael Widrich
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
| | - Igor Snapkov
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Andrei Slabodkin
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Maria Chernigovskaya
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | | | - Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Puneet Rawat
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Brij Bhushan Mehta
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
| | | | - Aurél Prósz
- Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
| | - Krzysztof Abram
- The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
| | - Alex Olar
- Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
| | - Enkelejda Miho
- Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- aiNET GmbH, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | | | - Sepp Hochreiter
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
| | | | - Günter Klambauer
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
| | | | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
| |
Collapse
|
27
|
Chen J, Qiu Y, Wang R, Wei GW. Persistent Laplacian projected Omicron BA.4 and BA.5 to become new dominating variants. Comput Biol Med 2022; 151:106262. [PMID: 36379191 PMCID: PMC10754203 DOI: 10.1016/j.compbiomed.2022.106262] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Revised: 10/21/2022] [Accepted: 10/30/2022] [Indexed: 11/15/2022]
Abstract
Due to its high transmissibility, Omicron BA.1 ousted the Delta variant to become a dominating variant in late 2021 and was replaced by more transmissible Omicron BA.2 in March 2022. An important question is which new variants will dominate in the future. Topology-based deep learning models have had tremendous success in forecasting emerging variants in the past. However, topology is insensitive to homotopic shape evolution in virus-human protein-protein binding, which is crucial to viral evolution and transmission. This challenge is tackled with persistent Laplacian, which is able to capture both the topological change and homotopic shape evolution of data. Persistent Laplacian-based deep learning models are developed to systematically evaluate variant infectivity. Our comparative analysis of Alpha, Beta, Gamma, Delta, Lambda, Mu, and Omicron BA.1, BA.1.1, BA.2, BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 unveils that Omicron BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 are more contagious than BA.2. In particular, BA.4 and BA.5 are about 36% more infectious than BA.2 and are projected to become new dominant variants by natural selection. Moreover, the proposed models outperform the state-of-the-art methods on three major benchmark datasets for mutation-induced protein-protein binding free energy changes. Our key projection about BA4 and BA.5's dominance made on May 1, 2022 (see arXiv:2205.00532) became a reality in late June 2022.
Collapse
Affiliation(s)
- Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
| |
Collapse
|
28
|
Ye C, Hu W, Gaeta B. Prediction of Antibody-Antigen Binding via Machine Learning: Development of Data Sets and Evaluation of Methods. JMIR BIOINFORMATICS AND BIOTECHNOLOGY 2022; 3:e29404. [PMID: 38935962 PMCID: PMC11135222 DOI: 10.2196/29404] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/06/2021] [Revised: 09/23/2021] [Accepted: 10/18/2022] [Indexed: 06/29/2024]
Abstract
BACKGROUND The mammalian immune system is able to generate antibodies against a huge variety of antigens, including bacteria, viruses, and toxins. The ultradeep DNA sequencing of rearranged immunoglobulin genes has considerable potential in furthering our understanding of the immune response, but it is limited by the lack of a high-throughput, sequence-based method for predicting the antigen(s) that a given immunoglobulin recognizes. OBJECTIVE As a step toward the prediction of antibody-antigen binding from sequence data alone, we aimed to compare a range of machine learning approaches that were applied to a collated data set of antibody-antigen pairs in order to predict antibody-antigen binding from sequence data. METHODS Data for training and testing were extracted from the Protein Data Bank and the Coronavirus Antibody Database, and additional antibody-antigen pair data were generated by using a molecular docking protocol. Several machine learning methods, including the weighted nearest neighbor method, the nearest neighbor method with the BLOSUM62 matrix, and the random forest method, were applied to the problem. RESULTS The final data set contained 1157 antibodies and 57 antigens that were combined in 5041 antibody-antigen pairs. The best performance for the prediction of interactions was obtained by using the nearest neighbor method with the BLOSUM62 matrix, which resulted in around 82% accuracy on the full data set. These results provide a useful frame of reference, as well as protocols and considerations, for machine learning and data set creation in the prediction of antibody-antigen binding. CONCLUSIONS Several machine learning approaches were compared to predict antibody-antigen interaction from protein sequences. Both the data set (in CSV format) and the machine learning program (coded in Python) are freely available for download on GitHub.
Collapse
Affiliation(s)
- Chao Ye
- School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
| | - Wenxing Hu
- Department of Computer Science, School of Information Science and Technology, Tokyo Institute of Technology, Tokyo, Japan
| | - Bruno Gaeta
- School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
| |
Collapse
|
29
|
Iftkhar S, de Sá AGC, Velloso JPL, Aljarf R, Pires DEV, Ascher DB. cardioToxCSM: A Web Server for Predicting Cardiotoxicity of Small Molecules. J Chem Inf Model 2022; 62:4827-4836. [PMID: 36219164 DOI: 10.1021/acs.jcim.2c00822] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
The design of novel, safe, and effective drugs to treat human diseases is a challenging venture, with toxicity being one of the main sources of attrition at later stages of development. Failure due to toxicity incurs a significant increase in costs and time to market, with multiple drugs being withdrawn from the market due to their adverse effects. Cardiotoxicity, for instance, was responsible for the failure of drugs such as fenspiride, propoxyphene, and valdecoxib. While significant effort has been dedicated to mitigate this issue by developing computational approaches that aim to identify molecules likely to be toxic, including quantitative structure-activity relationship models and machine learning methods, current approaches present limited performance and interpretability. To overcome these, we propose a new web-based computational method, cardioToxCSM, which can predict six types of cardiac toxicity outcomes, including arrhythmia, cardiac failure, heart block, hERG toxicity, hypertension, and myocardial infarction, efficiently and accurately. cardioToxCSM was developed using the concept of graph-based signatures, molecular descriptors, toxicophore matchings, and molecular fingerprints, leveraging explainable machine learning, and was validated internally via different cross validation schemes and externally via low-redundancy blind sets. The models presented robust performances with areas under ROC curves of up to 0.898 on 5-fold cross-validation, consistent with metrics on blind tests. Additionally, our models provide interpretation of the predictions by identifying whether substructures that are commonly enriched in toxic compounds were present. We believe cardioToxCSM will provide valuable insight into the potential cardiotoxicity of small molecules early on drug screening efforts. The method is made freely available as a web server at https://biosig.lab.uq.edu.au/cardiotoxcsm.
Collapse
Affiliation(s)
- Saba Iftkhar
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
| | - Alex G C de Sá
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
| | - João P L Velloso
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
| | - Raghad Aljarf
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
| | - Douglas E V Pires
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
| | - David B Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia 4072, Queensland, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
| |
Collapse
|
30
|
Gurusinghe SN, Oppenheimer B, Shifman JM. Cold spots are universal in protein-protein interactions. Protein Sci 2022; 31:e4435. [PMID: 36173158 PMCID: PMC9490803 DOI: 10.1002/pro.4435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Revised: 07/22/2022] [Accepted: 08/26/2022] [Indexed: 12/02/2022]
Abstract
Proteins interact with each other through binding interfaces that differ greatly in size and physico-chemical properties. Within the binding interface, a few residues called hot spots contribute the majority of the binding free energy and are hence irreplaceable. In contrast, cold spots are occupied by suboptimal amino acids, providing possibility for affinity enhancement through mutations. In this study, we identify cold spots due to cavities and unfavorable charge interactions in multiple protein-protein interactions (PPIs). For our cold spot analysis, we first use a small affinity database of PPIs with known structures and affinities and then expand our search to nearly 4000 homo- and heterodimers in the Protein Data Bank (PDB). We observe that cold spots due to cavities are present in nearly all PPIs unrelated to their binding affinity, while unfavorable charge interactions are relatively rare. We also find that most cold spots are located in the periphery of the binding interface, with high-affinity complexes showing fewer centrally located colds spots than low-affinity complexes. A larger number of cold spots is also found in non-cognate interactions compared to their cognate counterparts. Furthermore, our analysis reveals that cold spots are more frequent in homo-dimeric complexes compared to hetero-complexes, likely due to symmetry constraints imposed on sequences of homodimers. Finally, we find that glycines, glutamates, and arginines are the most frequent amino acids appearing at cold spot positions. Our analysis emphasizes the importance of cold spot positions to protein evolution and facilitates protein engineering studies directed at enhancing binding affinity and specificity in a wide range of applications.
Collapse
Affiliation(s)
- Sagara N.S. Gurusinghe
- Department of Biological ChemistryThe Alexander Silberman Institute of Life Sciences, The Hebrew University of JerusalemJerusalemIsrael
| | - Ben Oppenheimer
- Department of Biological ChemistryThe Alexander Silberman Institute of Life Sciences, The Hebrew University of JerusalemJerusalemIsrael
| | - Julia M. Shifman
- Department of Biological ChemistryThe Alexander Silberman Institute of Life Sciences, The Hebrew University of JerusalemJerusalemIsrael
| |
Collapse
|
31
|
Liu X, Feng H, Wu J, Xia K. Hom-Complex-Based Machine Learning (HCML) for the Prediction of Protein-Protein Binding Affinity Changes upon Mutation. J Chem Inf Model 2022; 62:3961-3969. [PMID: 36040839 DOI: 10.1021/acs.jcim.2c00580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) are involved in almost all biological processes in the cell. Understanding protein-protein interactions holds the key for the understanding of biological functions, diseases and the development of therapeutics. Recently, artificial intelligence (AI) models have demonstrated great power in PPIs. However, a key issue for all AI-based PPI models is efficient molecular representations and featurization. Here, we propose Hom-complex-based PPI representation, and Hom-complex-based machine learning models for the prediction of PPI binding affinity changes upon mutation, for the first time. In our model, various Hom complexes Hom(G1, G) can be generated for the graph representation G of protein-protein complex by using different graphs G1, which reveal G1-related inner connections within the graph representation G of protein-protein complex. Further, for a specific graph G1, a series of nested Hom complexes are generated to give a multiscale characterization of the PPIs. Its persistent homology and persistent Euler characteristic are used as molecular descriptors and further combined with the machine learning model, in particular, gradient boosting tree (GBT). We systematically test our model on the two most-commonly used data sets, that is, SKEMPI and AB-Bind. It has been found that our model outperforms all the existing models as far as we know, which demonstrates the great potential of our model for the analysis of PPIs. Our model can be used for the analysis and design of efficient antibodies for SARS-CoV-2.
Collapse
Affiliation(s)
- Xiang Liu
- Chern Institute of Mathematics and LPMC, Nankai University, Tianjin, China, 300071.,Division of Mathematical Sciences, School of Physical and Mathematical Sciences Nanyang Technological University, Singapore 637371
| | - Huitao Feng
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences Nanyang Technological University, Singapore 637371.,Mathematical Science Research Center, Chongqing University of Technology, Chongqing, China, 400054
| | - Jie Wu
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications (BIMSA), Beijing, China,101408
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences Nanyang Technological University, Singapore 637371
| |
Collapse
|
32
|
Pires DEV, Stubbs KA, Mylne JS, Ascher DB. cropCSM: designing safe and potent herbicides with graph-based signatures. Brief Bioinform 2022; 23:bbac042. [PMID: 35211724 PMCID: PMC9155605 DOI: 10.1093/bib/bbac042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2021] [Revised: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 12/11/2022] Open
Abstract
Herbicides have revolutionised weed management, increased crop yields and improved profitability allowing for an increase in worldwide food security. Their widespread use, however, has also led to a rise in resistance and concerns about their environmental impact. Despite the need for potent and safe herbicidal molecules, no herbicide with a new mode of action has reached the market in 30 years. Although development of computational approaches has proven invaluable to guide rational drug discovery pipelines, leading to higher hit rates and lower attrition due to poor toxicity, little has been done in contrast for herbicide design. To fill this gap, we have developed cropCSM, a computational platform to help identify new, potent, nontoxic and environmentally safe herbicides. By using a knowledge-based approach, we identified physicochemical properties and substructures enriched in safe herbicides. By representing the small molecules as a graph, we leveraged these insights to guide the development of predictive models trained and tested on the largest collected data set of molecules with experimentally characterised herbicidal profiles to date (over 4500 compounds). In addition, we developed six new environmental and human toxicity predictors, spanning five different species to assist in molecule prioritisation. cropCSM was able to correctly identify 97% of herbicides currently available commercially, while predicting toxicity profiles with accuracies of up to 92%. We believe cropCSM will be an essential tool for the enrichment of screening libraries and to guide the development of potent and safe herbicides. We have made the method freely available through a user-friendly webserver at http://biosig.unimelb.edu.au/crop_csm.
Collapse
Affiliation(s)
- Douglas E V Pires
- School of Computing and Information Systems at the University of Melbourne
| | - Keith A Stubbs
- School of Molecular Sciences at the University of Western Australia
| | - Joshua S Mylne
- Curtin University and Deputy Director of the Centre for Crop and Disease Management
| | - David B Ascher
- University of Queensland, and head of Computational Biology and Clinical Informatics at the Baker Institute and Systems
| |
Collapse
|
33
|
Myung Y, Pires DEV, Ascher DB. CSM-AB: graph-based antibody-antigen binding affinity prediction and docking scoring function. Bioinformatics 2022; 38:1141-1143. [PMID: 34734992 DOI: 10.1093/bioinformatics/btab762] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2021] [Revised: 10/18/2021] [Accepted: 11/01/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Understanding antibody-antigen interactions is key to improving their binding affinities and specificities. While experimental approaches are fundamental for developing new therapeutics, computational methods can provide quick assessment of binding landscapes, guiding experimental design. Despite this, little effort has been devoted to accurately predicting the binding affinity between antibodies and antigens and to develop tailored docking scoring functions for this type of interaction. Here, we developed CSM-AB, a machine learning method capable of predicting antibody-antigen binding affinity by modelling interaction interfaces as graph-based signatures. RESULTS CSM-AB outperformed alternative methods achieving a Pearson's correlation of up to 0.64 on blind tests. We also show CSM-AB can accurately rank near-native poses, working effectively as a docking scoring function. We believe CSM-AB will be an invaluable tool to assist in the development of new immunotherapies. AVAILABILITY AND IMPLEMENTATION CSM-AB is freely available as a user-friendly web interface and API at http://biosig.unimelb.edu.au/csm_ab/datasets. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Yoochan Myung
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia.,School of Chemistry and Molecular Biosciences, University Of Queensland, St Lucia, QLD, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia.,School of Chemistry and Molecular Biosciences, University Of Queensland, St Lucia, QLD, Australia
| |
Collapse
|
34
|
Chaudhuri D, Majumder S, Datta J, Giri K. Designing of nanobodies against Dengue virus Capsid: a computational affinity maturation approach. J Biomol Struct Dyn 2022; 41:2289-2299. [PMID: 35067204 DOI: 10.1080/07391102.2022.2029773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Dengue virus, an arbovirus, is one of the most prevalent diseases in the tropical environment and leads to huge number of casualties every year. No therapeutics are available till date against the viral disease and the only medications provide symptomatic relief. In this study, we have focused on utilizing conventional nanobodies and repurposing them for Dengue. Computationally affinity matured, best binding nanobodies tagged with constant antibody regions, could be proposed as therapeutics. These could also be applied for drug delivery purposes due to their high specificity against the viral Capsid. Another application of these nanobodies has been thought to utilize them for diagnostic purposes, to use the nanobodies for viral detection from patient samples at the earliest stage using ELISA. This study may open a new avenue for immunologic study in foreseeable future with the usage of the same molecules for multiple purposes. HighlightsNatural nanobodies against viruses were modified for use against Dengue virus Capsid conserved regions.Computational affinity maturation was performed making use of change in binding affinities upon mutating various residues in the complementary determining regions.Docking studies performed to inspect the docking groove, interface analysis and energy calculations.MM/GBSA calculations done to calculate binding free energy of the complex to determine stability of the complex.Communicated by Ramaswamy H. Sarma.
Collapse
Affiliation(s)
| | | | - Joyeeta Datta
- Department of Life Sciences, Presidency University, Kolkata, India
| | - Kalyan Giri
- Department of Life Sciences, Presidency University, Kolkata, India
| |
Collapse
|
35
|
Akbar R, Bashour H, Rawat P, Robert PA, Smorodina E, Cotet TS, Flem-Karlsen K, Frank R, Mehta BB, Vu MH, Zengin T, Gutierrez-Marcos J, Lund-Johansen F, Andersen JT, Greiff V. Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies. MAbs 2022; 14:2008790. [PMID: 35293269 PMCID: PMC8928824 DOI: 10.1080/19420862.2021.2008790] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2021] [Revised: 11/04/2021] [Accepted: 11/17/2021] [Indexed: 12/15/2022] Open
Abstract
Although the therapeutic efficacy and commercial success of monoclonal antibodies (mAbs) are tremendous, the design and discovery of new candidates remain a time and cost-intensive endeavor. In this regard, progress in the generation of data describing antigen binding and developability, computational methodology, and artificial intelligence may pave the way for a new era of in silico on-demand immunotherapeutics design and discovery. Here, we argue that the main necessary machine learning (ML) components for an in silico mAb sequence generator are: understanding of the rules of mAb-antigen binding, capacity to modularly combine mAb design parameters, and algorithms for unconstrained parameter-driven in silico mAb sequence synthesis. We review the current progress toward the realization of these necessary components and discuss the challenges that must be overcome to allow the on-demand ML-based discovery and design of fit-for-purpose mAb therapeutic candidates.
Collapse
Affiliation(s)
- Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Habib Bashour
- School of Life Sciences, University of Warwick, Coventry, UK
| | - Puneet Rawat
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
| | - Philippe A. Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Eva Smorodina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia
| | | | - Karine Flem-Karlsen
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
| | - Robert Frank
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Brij Bhushan Mehta
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Norway
| | - Talip Zengin
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Department of Bioinformatics, Mugla Sitki Kocman University, Turkey
| | | | | | - Jan Terje Andersen
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
| | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| |
Collapse
|
36
|
Nguyen TB, Pires DEV, Ascher DB. CSM-carbohydrate: protein-carbohydrate binding affinity prediction and docking scoring function. Brief Bioinform 2021; 23:6457169. [PMID: 34882232 DOI: 10.1093/bib/bbab512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Revised: 11/06/2021] [Accepted: 11/08/2021] [Indexed: 12/29/2022] Open
Abstract
Protein-carbohydrate interactions are crucial for many cellular processes but can be challenging to biologically characterise. To improve our understanding and ability to model these molecular interactions, we used a carefully curated set of 370 protein-carbohydrate complexes with experimental structural and biophysical data in order to train and validate a new tool, cutoff scanning matrix (CSM)-carbohydrate, using machine learning algorithms to accurately predict their binding affinity and rank docking poses as a scoring function. Information on both protein and carbohydrate complementarity, in terms of shape and chemistry, was captured using graph-based structural signatures. Across both training and independent test sets, we achieved comparable Pearson's correlations of 0.72 under cross-validation [root mean square error (RMSE) of 1.58 Kcal/mol] and 0.67 on the independent test (RMSE of 1.72 Kcal/mol), providing confidence in the generalisability and robustness of the final model. Similar performance was obtained across mono-, di- and oligosaccharides, further highlighting the applicability of this approach to the study of larger complexes. We show CSM-carbohydrate significantly outperformed previous approaches and have implemented our method and make all data freely available through both a user-friendly web interface and application programming interface, to facilitate programmatic access at http://biosig.unimelb.edu.au/csm_carbohydrate/. We believe CSM-carbohydrate will be an invaluable tool for helping assess docking poses and the effects of mutations on protein-carbohydrate affinity, unravelling important aspects that drive binding recognition.
Collapse
Affiliation(s)
- Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia.,Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
Collapse
|
37
|
Nguyen TB, Myung Y, de Sá AGC, Pires DEV, Ascher DB. mmCSM-NA: accurately predicting effects of single and multiple mutations on protein-nucleic acid binding affinity. NAR Genom Bioinform 2021; 3:lqab109. [PMID: 34805992 PMCID: PMC8600011 DOI: 10.1093/nargab/lqab109] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2021] [Revised: 09/20/2021] [Accepted: 10/27/2021] [Indexed: 02/02/2023] Open
Abstract
While protein-nucleic acid interactions are pivotal for many crucial biological processes, limited experimental data has made the development of computational approaches to characterise these interactions a challenge. Consequently, most approaches to understand the effects of missense mutations on protein-nucleic acid affinity have focused on single-point mutations and have presented a limited performance on independent data sets. To overcome this, we have curated the largest dataset of experimentally measured effects of mutations on nucleic acid binding affinity to date, encompassing 856 single-point mutations and 141 multiple-point mutations across 155 experimentally solved complexes. This was used in combination with an optimized version of our graph-based signatures to develop mmCSM-NA (http://biosig.unimelb.edu.au/mmcsm_na), the first scalable method capable of quantitatively and accurately predicting the effects of multiple-point mutations on nucleic acid binding affinities. mmCSM-NA obtained a Pearson's correlation of up to 0.67 (RMSE of 1.06 Kcal/mol) on single-point mutations under cross-validation, and up to 0.65 on independent non-redundant datasets of multiple-point mutations (RMSE of 1.12 kcal/mol), outperforming similar tools. mmCSM-NA is freely available as an easy-to-use web-server and API. We believe it will be an invaluable tool to shed light on the role of mutations affecting protein-nucleic acid interactions in diseases.
Collapse
Affiliation(s)
- Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Yoochan Myung
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Alex G C de Sá
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
Collapse
|
38
|
Rodrigues CHM, Pires DEV, Ascher DB. pdCSM-PPI: Using Graph-Based Signatures to Identify Protein-Protein Interaction Inhibitors. J Chem Inf Model 2021; 61:5438-5445. [PMID: 34719929 DOI: 10.1021/acs.jcim.1c01135] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions are promising sites for development of selective drugs; however, they have generally been viewed as challenging targets. Molecules targeting protein-protein interactions tend to be larger and more lipophilic than other drug-like molecules, mimicking the properties of interacting interfaces. Here, we propose a machine learning approach that uses a graph-based representation of small molecules to guide identification of inhibitors modulating protein-protein interactions, pdCSM-PPI. This approach was applied to 21 different PPI targets. We developed interaction-specific models that were able to accurately identify active compounds achieving MCC and F1 scores up to 1, and Pearson's correlations up to 0.87, outperforming previous approaches. Using insights from these individual models, we developed a generic protein-protein interaction modulator predictive model, which accurately predicted IC50 with a Pearson's correlation of 0.64 on a low redundancy blind test. Importantly, we were able to accurately identify active from inactive compounds, achieving an AUC of 0.77 and sensitivity and specificity of 76% and 78%, respectively. We believe pdCSM-PPI will be an important tool to help guide more efficient screening of new PPI inhibitors; it is freely available as an easy-to-use web server and API at http://biosig.unimelb.edu.au/pdcsm_ppi.
Collapse
Affiliation(s)
- Carlos H M Rodrigues
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
| | - Douglas E V Pires
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
| | - David B Ascher
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
| |
Collapse
|
39
|
da Silva BM, Myung Y, Ascher DB, Pires DEV. epitope3D: a machine learning method for conformational B-cell epitope prediction. Brief Bioinform 2021; 23:6407730. [PMID: 34676398 DOI: 10.1093/bib/bbab423] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Revised: 08/25/2021] [Accepted: 09/14/2021] [Indexed: 11/13/2022] Open
Abstract
The ability to identify antigenic determinants of pathogens, or epitopes, is fundamental to guide rational vaccine development and immunotherapies, which are particularly relevant for rapid pandemic response. A range of computational tools has been developed over the past two decades to assist in epitope prediction; however, they have presented limited performance and generalization, particularly for the identification of conformational B-cell epitopes. Here, we present epitope3D, a novel scalable machine learning method capable of accurately identifying conformational epitopes trained and evaluated on the largest curated epitope data set to date. Our method uses the concept of graph-based signatures to model epitope and non-epitope regions as graphs and extract distance patterns that are used as evidence to train and test predictive models. We show epitope3D outperforms available alternative approaches, achieving Mathew's Correlation Coefficient and F1-scores of 0.55 and 0.57 on cross-validation and 0.45 and 0.36 during independent blind tests, respectively.
Collapse
Affiliation(s)
- Bruna Moreira da Silva
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - YooChan Myung
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia.,Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| |
Collapse
|
40
|
Liu X, Luo Y, Li P, Song S, Peng J. Deep geometric representations for modeling effects of mutations on protein-protein binding affinity. PLoS Comput Biol 2021; 17:e1009284. [PMID: 34347784 PMCID: PMC8366979 DOI: 10.1371/journal.pcbi.1009284] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Revised: 08/16/2021] [Accepted: 07/17/2021] [Indexed: 11/19/2022] Open
Abstract
Modeling the impact of amino acid mutations on protein-protein interaction plays a crucial role in protein engineering and drug design. In this study, we develop GeoPPI, a novel structure-based deep-learning framework to predict the change of binding affinity upon mutations. Based on the three-dimensional structure of a protein, GeoPPI first learns a geometric representation that encodes topology features of the protein structure via a self-supervised learning scheme. These representations are then used as features for training gradient-boosting trees to predict the changes of protein-protein binding affinity upon mutations. We find that GeoPPI is able to learn meaningful features that characterize interactions between atoms in protein structures. In addition, through extensive experiments, we show that GeoPPI achieves new state-of-the-art performance in predicting the binding affinity changes upon both single- and multi-point mutations on six benchmark datasets. Moreover, we show that GeoPPI can accurately estimate the difference of binding affinities between a few recently identified SARS-CoV-2 antibodies and the receptor-binding domain (RBD) of the S protein. These results demonstrate the potential of GeoPPI as a powerful and useful computational tool in protein design and engineering. Our code and datasets are available at: https://github.com/Liuxg16/GeoPPI. Estimating the binding affinities of protein-protein interactions (PPIs) is crucial to understand protein function and design new functional proteins. Since the experimental measurement in wet-labs is labor-intensive and time-consuming, fast and accurate in silico approaches have received much attention. Although considerable efforts have been made in this direction, predicting the effects of mutations on the protein-protein binding affinity is still a challenging research problem. In this work, we introduce GeoPPI, a novel computational approach that uses deep geometric representations of protein complexes to predict the effects of mutations on the binding affinity. The geometric representations are first learned via a self-supervised learning scheme and then integrated with gradient-boosting trees to accomplish the prediction. We find that the learned representations encode meaningful patterns underlying the interactions between atoms in protein structures. Also, extensive tests on major benchmark datasets show that GeoPPI has made an important improvement over the existing methods in predicting the effects of mutations on the binding affinity.
Collapse
Affiliation(s)
- Xianggen Liu
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
| | - Yunan Luo
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
| | - Pengyong Li
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
| | - Sen Song
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
- * E-mail: (JP); (SS)
| | - Jian Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- * E-mail: (JP); (SS)
| |
Collapse
|
41
|
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 331] [Impact Index Per Article: 82.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug designing and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost impose a hurdle and challenges that impact drug design and discovery. Further, complex and big data from genomics, proteomics, microarray data, and clinical trials also impose an obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physiochemical activity. Evidence from the past strengthens the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery process, which will eventually impact mankind. The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. Artificial intelligence is referred to as superset comprising machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. The artificial neural network, deep neural network, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of drug design and development process, such as from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive, monitoring drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physiochemical properties, prediction of toxicity, and identification of mode of action.
Collapse
Affiliation(s)
- Rohan Gupta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
| | - Devesh Srivastava
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
| | - Mehar Sahu
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
| | - Swati Tiwari
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
| | - Rashmi K Ambasta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
| | - Pravir Kumar
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India.
| |
Collapse
|
42
|
Tunstall T, Phelan J, Eccleston C, Clark TG, Furnham N. Structural and Genomic Insights Into Pyrazinamide Resistance in Mycobacterium tuberculosis Underlie Differences Between Ancient and Modern Lineages. Front Mol Biosci 2021; 8:619403. [PMID: 34422898 PMCID: PMC8372558 DOI: 10.3389/fmolb.2021.619403] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2020] [Accepted: 04/14/2021] [Indexed: 11/30/2022] Open
Abstract
Resistance to drugs used to treat tuberculosis disease (TB) continues to remain a public health burden, with missense point mutations in the underlying Mycobacterium tuberculosis bacteria described for nearly all anti-TB drugs. The post-genomics era along with advances in computational and structural biology provide opportunities to understand the interrelationships between the genetic basis and the structural consequences of M. tuberculosis mutations linked to drug resistance. Pyrazinamide (PZA) is a crucial first line antibiotic currently used in TB treatment regimens. The mutational promiscuity exhibited by the pncA gene (target for PZA) necessitates computational approaches to investigate the genetic and structural basis for PZA resistance development. We analysed 424 missense point mutations linked to PZA resistance derived from ∼35K M. tuberculosis clinical isolates sourced globally, which comprised the four main M. tuberculosis lineages (Lineage 1-4). Mutations were annotated to reflect their association with PZA resistance. Genomic measures (minor allele frequency and odds ratio), structural features (surface area, residue depth and hydrophobicity) and biophysical effects (change in stability and ligand affinity) of point mutations on pncA protein stability and ligand affinity were assessed. Missense point mutations within pncA were distributed throughout the gene, with the majority (>80%) of mutations with a destabilising effect on protomer stability and on ligand affinity. Active site residues involved in PZA binding were associated with multiple point mutations highlighting mutational diversity due to selection pressures at these functionally important sites. There were weak associations between genomic measures and biophysical effect of mutations. However, mutations associated with PZA resistance showed statistically significant differences between structural features (surface area and residue depth), but not hydrophobicity score for mutational sites. Most interestingly M. tuberculosis lineage 1 (ancient lineage) exhibited a distinct protein stability profile for mutations associated with PZA resistance, compared to modern lineages.
Collapse
Affiliation(s)
- Tanushree Tunstall
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Jody Phelan
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Charlotte Eccleston
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Taane G. Clark
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Nicholas Furnham
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| |
Collapse
|
43
|
Rodrigues CHM, Pires DEV, Ascher DB. mmCSM-PPI: predicting the effects of multiple point mutations on protein-protein interactions. Nucleic Acids Res 2021; 49:W417-W424. [PMID: 33893812 PMCID: PMC8262703 DOI: 10.1093/nar/gkab273] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 03/18/2021] [Accepted: 04/15/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions play a crucial role in all cellular functions and biological processes and mutations leading to their disruption are enriched in many diseases. While a number of computational methods to assess the effects of variants on protein-protein binding affinity have been proposed, they are in general limited to the analysis of single point mutations and have been shown to perform poorly on independent test sets. Here, we present mmCSM-PPI, a scalable and effective machine learning model for accurately assessing changes in protein-protein binding affinity caused by single and multiple missense mutations. We expanded our well-established graph-based signatures in order to capture physicochemical and geometrical properties of multiple wild-type residue environments and integrated them with substitution scores and dynamics terms from normal mode analysis. mmCSM-PPI was able to achieve a Pearson's correlation of up to 0.75 (RMSE = 1.64 kcal/mol) under 10-fold cross-validation and 0.70 (RMSE = 2.06 kcal/mol) on a non-redundant blind test, outperforming existing methods. Our method is freely available as a user-friendly and easy-to-use web server and API at http://biosig.unimelb.edu.au/mmcsm_ppi.
Collapse
Affiliation(s)
- Carlos H M Rodrigues
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
Collapse
|
44
|
Al-Jarf R, de Sá AGC, Pires DEV, Ascher DB. pdCSM-cancer: Using Graph-Based Signatures to Identify Small Molecules with Anticancer Properties. J Chem Inf Model 2021; 61:3314-3322. [PMID: 34213323 PMCID: PMC8317153 DOI: 10.1021/acs.jcim.1c00168] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
![]()
The development of
new, effective, and safe drugs to treat cancer
remains a challenging and time-consuming task due to limited hit rates,
restraining subsequent development efforts. Despite the impressive
progress of quantitative structure–activity relationship and
machine learning-based models that have been developed to predict
molecule pharmacodynamics and bioactivity, they have had mixed success
at identifying compounds with anticancer properties against multiple
cell lines. Here, we have developed a novel predictive tool, pdCSM-cancer,
which uses a graph-based signature representation of the chemical
structure of a small molecule in order to accurately predict molecules
likely to be active against one or multiple cancer cell lines. pdCSM-cancer
represents the most comprehensive anticancer bioactivity prediction
platform developed till date, comprising trained and validated models
on experimental data of the growth inhibition concentration (GI50%)
effects, including over 18,000 compounds, on 9 tumor types and 74
distinct cancer cell lines. Across 10-fold cross-validation, it achieved
Pearson’s correlation coefficients of up to 0.74 and comparable
performance of up to 0.67 across independent, non-redundant blind
tests. Leveraging the insights from these cell line-specific models,
we developed a generic predictive model to identify molecules active
in at least 60 cell lines. Our final model achieved an area under
the receiver operating characteristic curve (AUC) of up to 0.94 on
10-fold cross-validation and up to 0.94 on independent non-redundant
blind tests, outperforming alternative approaches. We believe that
our predictive tool will provide a valuable resource to optimizing
and enriching screening libraries for the identification of effective
and safe anticancer molecules. To provide a simple and integrated
platform to rapidly screen for potential biologically active molecules
with favorable anticancer properties, we made pdCSM-cancer freely
available online at http://biosig.unimelb.edu.au/pdcsm_cancer.
Collapse
Affiliation(s)
- Raghad Al-Jarf
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia
| | - Alex G C de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville 3052, Victoria, Australia
| | - David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Parkville 3052, Victoria, Australia.,Systems and Computational Biology, Bio21 Institute, University of Melbourne, Parkville 3052, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne 3004, Victoria, Australia.,Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Parkville 3010, Victoria, Australia.,Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, United Kingdom
| |
Collapse
|
45
|
Mei LC, Hao GF, Yang GF. Computational methods for predicting hotspots at protein-RNA interfaces. WILEY INTERDISCIPLINARY REVIEWS-RNA 2021; 13:e1675. [PMID: 34080311 DOI: 10.1002/wrna.1675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/11/2021] [Revised: 05/13/2021] [Accepted: 05/14/2021] [Indexed: 11/10/2022]
Abstract
Protein-RNA interactions play essential roles in many critical biological events. A comprehensive understanding of the mechanisms underlying these interactions is helpful when studying cellular activities and therapeutic applications. Hotspots are a small portion of residues contributing much toward protein-RNA binding affinity. In pharmaceutical research, the hotspot residues are seen as the best option for designing small molecules to target proteins of therapeutic interest. With the accumulation of experimental data about protein-RNA interactions, computational methods have been produced for hotspot prediction on a large scale. In this review, we first present an overview of the existing databases for protein-RNA binding data. Furthermore, we outline the most adopted computational methods for hotspots prediction in protein-RNA interactions. Finally, we discuss the applications of hotspot prediction. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications RNA Methods > RNA Analyses In Vitro and In Silico.
Collapse
Affiliation(s)
- Long-Can Mei
- Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, China.,International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, China
| | - Ge-Fei Hao
- Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, China.,International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, China.,State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Research and Development Center for Fine Chemicals, Guizhou University, Guiyang, China
| | - Guang-Fu Yang
- Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, China.,International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, China.,Collaborative Innovation Center of Chemical Science and Engineering, Tianjin, China
| |
Collapse
|
46
|
Portelli S, Barr L, de Sá AG, Pires DE, Ascher DB. Distinguishing between PTEN clinical phenotypes through mutation analysis. Comput Struct Biotechnol J 2021; 19:3097-3109. [PMID: 34141133 PMCID: PMC8180946 DOI: 10.1016/j.csbj.2021.05.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Revised: 04/29/2021] [Accepted: 05/19/2021] [Indexed: 12/28/2022] Open
Abstract
Phosphate and tensin homolog on chromosome ten (PTEN) germline mutations are associated with an overarching condition known as PTEN hamartoma tumor syndrome. Clinical phenotypes associated with this syndrome range from macrocephaly and autism spectrum disorder to Cowden syndrome, which manifests as multiple noncancerous tumor-like growths (hamartomas), and an increased predisposition to certain cancers. It is unclear, however, the basis by which mutations might lead to these very diverse phenotypic outcomes. Here we show that, by considering the molecular consequences of mutations in PTEN on protein structure and function, we can accurately distinguish PTEN mutations exhibiting different phenotypes. Changes in phosphatase activity, protein stability, and intramolecular interactions appeared to be major drivers of clinical phenotype, with cancer-associated variants leading to the most drastic changes, while ASD and non-pathogenic variants associated with more mild and neutral changes, respectively. Importantly, we show via saturation mutagenesis that more than half of variants of unknown significance could be associated with disease phenotypes, while over half of Cowden syndrome mutations likely lead to cancer. These insights can assist in exploring potentially important clinical outcomes delineated by PTEN variation.
Collapse
Affiliation(s)
- Stephanie Portelli
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Lucy Barr
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Alex G.C. de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
| | - Douglas E.V. Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B. Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, United States
| |
Collapse
|
47
|
Guest JD, Vreven T, Zhou J, Moal I, Jeliazkov JR, Gray JJ, Weng Z, Pierce BG. An expanded benchmark for antibody-antigen docking and affinity prediction reveals insights into antibody recognition determinants. Structure 2021; 29:606-621.e5. [PMID: 33539768 DOI: 10.1016/j.str.2021.01.005] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2020] [Revised: 11/15/2020] [Accepted: 01/11/2021] [Indexed: 01/04/2023]
Abstract
Accurate predictive modeling of antibody-antigen complex structures and structure-based antibody design remain major challenges in computational biology, with implications for biotherapeutics, immunity, and vaccines. Through a systematic search for high-resolution structures of antibody-antigen complexes and unbound antibody and antigen structures, in conjunction with identification of experimentally determined binding affinities, we have assembled a non-redundant set of test cases for antibody-antigen docking and affinity prediction. This benchmark more than doubles the number of antibody-antigen complexes and corresponding affinities available in our previous benchmarks, providing an unprecedented view of the determinants of antibody recognition and insights into molecular flexibility. Initial assessments of docking and affinity prediction tools highlight the challenges posed by this diverse set of cases, which includes camelid nanobodies, therapeutic monoclonal antibodies, and broadly neutralizing antibodies targeting viral glycoproteins. This dataset will enable development of advanced predictive modeling and design methods for this therapeutically relevant class of protein-protein interactions.
Collapse
Affiliation(s)
- Johnathan D Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| | - Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Jing Zhou
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Iain Moal
- Computational Sciences, GlaxoSmithKline Research and Development, Stevenage SG1 2NY, UK
| | - Jeliazko R Jeliazkov
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA.
| |
Collapse
|
48
|
Khan MAAK, Turjya RR, Islam ABMMK. Computational engineering the binding affinity of Adalimumab monoclonal antibody for designing potential biosimilar candidate. J Mol Graph Model 2021; 102:107774. [DOI: 10.1016/j.jmgm.2020.107774] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2020] [Revised: 08/15/2020] [Accepted: 10/05/2020] [Indexed: 12/26/2022]
|
49
|
Predicting antibody affinity changes upon mutations by combining multiple predictors. Sci Rep 2020; 10:19533. [PMID: 33177627 PMCID: PMC7658247 DOI: 10.1038/s41598-020-76369-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2020] [Accepted: 10/19/2020] [Indexed: 02/07/2023] Open
Abstract
Antibodies are proteins working in our immune system with high affinity and specificity for target antigens, making them excellent tools for both biotherapeutic and bioengineering applications. The prediction of antibody affinity changes upon mutations (\documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$${{\Delta \Delta {\mathrm{G}}}}_{\mathrm{binding}}$$\end{document}ΔΔGbinding) is important for antibody engineering. Numerous computational methods have been proposed based on different approaches including molecular mechanics and machine learning. However, the accuracy by each individual predictor is not enough for efficient antibody development. In this study, we develop a new prediction method by combining multiple predictors based on machine learning. Our method was tested on the SiPMAB database, evaluating the Pearson’s correlation coefficient between predicted and experimental \documentclass[12pt]{minimal}
\usepackage{amsmath}
\usepackage{wasysym}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsbsy}
\usepackage{mathrsfs}
\usepackage{upgreek}
\setlength{\oddsidemargin}{-69pt}
\begin{document}$${{\Delta \Delta {\mathrm{G}}}}_{\mathrm{binding}}$$\end{document}ΔΔGbinding. Our method achieved higher accuracy (R = 0.69) than previous molecular mechanics or machine-learning based methods (R = 0.59) and the previous method using the average of multiple predictors (R = 0.64). Feature importance analysis indicated that the improved accuracy was obtained by combining predictors with different importance, which have different protocols for calculating energies and for generating mutant and unbound state structures. This study demonstrates that machine learning is a powerful framework for combining different approaches to predict antibody affinity changes.
Collapse
|
50
|
Pires DEV, Rodrigues CHM, Ascher DB. mCSM-membrane: predicting the effects of mutations on transmembrane proteins. Nucleic Acids Res 2020; 48:W147-W153. [PMID: 32469063 PMCID: PMC7319563 DOI: 10.1093/nar/gkaa416] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Revised: 05/04/2020] [Accepted: 05/28/2020] [Indexed: 12/17/2022] Open
Abstract
Significant efforts have been invested into understanding and predicting the molecular consequences of mutations in protein coding regions, however nearly all approaches have been developed using globular, soluble proteins. These methods have been shown to poorly translate to studying the effects of mutations in membrane proteins. To fill this gap, here we report, mCSM-membrane, a user-friendly web server that can be used to analyse the impacts of mutations on membrane protein stability and the likelihood of them being disease associated. mCSM-membrane derives from our well-established mutation modelling approach that uses graph-based signatures to model protein geometry and physicochemical properties for supervised learning. Our stability predictor achieved correlations of up to 0.72 and 0.67 (on cross validation and blind tests, respectively), while our pathogenicity predictor achieved a Matthew's Correlation Coefficient (MCC) of up to 0.77 and 0.73, outperforming previously described methods in both predicting changes in stability and in identifying pathogenic variants. mCSM-membrane will be an invaluable and dedicated resource for investigating the effects of single-point mutations on membrane proteins through a freely available, user friendly web server at http://biosig.unimelb.edu.au/mcsm_membrane.
Collapse
Affiliation(s)
- Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, Victoria 3004, Australia.,Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, 3052, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville, VIC, 3052, Australia
| | - Carlos H M Rodrigues
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, Victoria 3004, Australia.,Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, 3052, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, Victoria 3004, Australia.,Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, 3052, Australia.,Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
| |
Collapse
|