1
Vardaxis I, Simovski B, Anzar I, Stratford R, Clancy T. Deep learning of antibody epitopes using positional permutation vectors. Comput Struct Biotechnol J 2024; 23:2695-2707. [PMID: 39035832 PMCID: PMC11260035 DOI: 10.1016/j.csbj.2024.06.005] [Received: 04/02/2024] [Revised: 06/04/2024] [Accepted: 06/04/2024] [Indexed: 07/23/2024]
Abstract
Background: The accurate computational prediction of B cell epitopes can vastly reduce the cost and time required for identifying potential epitope candidates for the design of vaccines and immunodiagnostics. However, current computational tools for B cell epitope prediction perform poorly and are not fit for purpose, leaving enormous room for improvement and a need for superior prediction strategies.
Results: Here we propose a novel approach that improves B cell epitope prediction by encoding epitopes as binary positional permutation vectors that represent the position and structural properties of the amino acids within a protein antigen sequence that interact with an antibody. This approach supersedes the traditional method of defining epitopes as scores per amino acid on a protein sequence, where each score reflects that amino acid's predicted probability of partaking in a B cell epitope-antibody interaction. In addition to defining epitopes as binary positional permutation vectors, the approach also uses the 3D macrostructure features of the unbound protein structures, and in turn uses these features to train another deep learning model on the corresponding antibody-bound protein 3D structures. This enables the algorithm to learn the key structural and physicochemical features of the unbound protein and embedded epitope that initiate the antibody binding process, helping to eliminate "induced fit" biases in the training data. We demonstrate that the strategy predicts B cell epitopes with improved accuracy compared to the existing tools. Additionally, we show that this approach reliably identifies the majority of experimentally verified epitopes on the spike protein of SARS-CoV-2 not seen by the model during training and generalizes robustly to dissimilar data not seen by the model during training.
Conclusions: With the approach described herein, a primary protein sequence and a query positional permutation vector encoding a putative epitope are sufficient to predict B cell epitopes in a reliable manner, potentially advancing the use of computational prediction of B cell epitopes in biomedical research applications.
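As a rough illustration of the encoding idea in this abstract, the sketch below marks epitope residues as 1 and all other residues as 0 along an antigen sequence. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the 0-based indexing, the example sequence, and the chosen positions are all hypothetical, and the structural properties that the abstract says the vectors also capture are not modelled here.

```python
def epitope_to_binary_vector(sequence, epitope_positions):
    """Return a 0/1 list as long as the sequence.

    A 1 marks a residue assumed to contact the antibody.
    Positions are 0-based indices (an assumption of this sketch).
    """
    n = len(sequence)
    for p in epitope_positions:
        if not 0 <= p < n:
            raise ValueError(f"position {p} is outside the sequence (length {n})")
    marked = set(epitope_positions)
    return [1 if i in marked else 0 for i in range(n)]


# Hypothetical 10-residue antigen with a putative epitope at positions 2, 3 and 7.
antigen = "MKTAYIAKQR"
vector = epitope_to_binary_vector(antigen, [2, 3, 7])
print(vector)  # [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
```

A query vector of this kind, paired with the primary sequence, is the sort of input the Conclusions describe; how the published model consumes it is not specified in the abstract.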
Affiliation(s)
- Ioannis Vardaxis
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Boris Simovski
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Irantzu Anzar
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Richard Stratford
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Trevor Clancy
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
  - Department of Vaccine Informatics, Institute for Tropical Medicine, Nagasaki University, Japan
2
Wang X, Gao X, Fan X, Huai Z, Zhang G, Yao M, Wang T, Huang X, Lai L. WUREN: Whole-modal union representation for epitope prediction. Comput Struct Biotechnol J 2024; 23:2122-2131. [PMID: 38817963 PMCID: PMC11137340 DOI: 10.1016/j.csbj.2024.05.023] [Received: 01/26/2024] [Revised: 05/14/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024]
Abstract
B-cell epitope identification plays a vital role in the development of vaccines, therapies, and diagnostic tools. Currently, molecular docking tools for B-cell epitope prediction are heavily influenced by empirical parameters and require significant computational resources, making it challenging to meet large-scale prediction demands. When predicting epitopes from antigen-antibody complexes, current artificial intelligence algorithms cannot make accurate predictions because of insufficient protein feature representations, indicating that a novel algorithm is needed for efficient protein information extraction. In this paper, we introduce a multimodal model called WUREN (Whole-modal Union Representation for Epitope predictioN), which effectively combines sequence, graph, and structural features. It achieved AUC-PR scores of 0.213 and 0.193 on the solved structures and AlphaFold-generated structures, respectively, for the independent test proteins selected from the DiscoTope3 benchmark. Our findings indicate that WUREN is an efficient feature extraction model for protein complexes, with generalizable potential for application in the development of protein-based drugs. Moreover, the streamlined framework of WUREN could be readily extended to model similar biomolecules, such as nucleic acids, carbohydrates, and lipids.
Affiliation(s)
- Xuezhe Fan
  - XtalPi Innovation Center, Beijing, China
- Zhe Huai
  - XtalPi Innovation Center, Beijing, China
- Lipeng Lai
  - XtalPi Innovation Center, Beijing, China
3
AlJarf R, Rodrigues CHM, Myung Y, Pires DEV, Ascher DB. piscesCSM: prediction of anticancer synergistic drug combinations. J Cheminform 2024; 16:81. [PMID: 39030592 PMCID: PMC11264925 DOI: 10.1186/s13321-024-00859-4] [Received: 08/20/2023] [Accepted: 05/12/2024] [Indexed: 07/21/2024]
Abstract
While drug combination therapies are of great importance, particularly in cancer treatment, identifying novel synergistic drug combinations has been a challenging venture. Computational methods have emerged in this context as a promising tool for prioritizing drug combinations for further evaluation, though they have presented limited performance, utility, and interpretability. Here, we propose a novel predictive tool, piscesCSM, that leverages graph-based representations to model small molecule chemical structures to accurately predict drug combinations with favourable anticancer synergistic effects against one or multiple cancer cell lines. Leveraging these insights, we developed a general supervised machine learning model to guide the prediction of anticancer synergistic drug combinations in over 30 cell lines. It achieved an area under the receiver operating characteristic curve (AUROC) of up to 0.89 on independent non-redundant blind tests, outperforming state-of-the-art approaches on both large-scale oncology screening data and an independent test set generated by AstraZeneca (with more than a 16% improvement in predictive accuracy). Moreover, by exploring the interpretability of our approach, we found that simple physicochemical properties and graph-based signatures are predictive of chemotherapy synergism. To provide a simple and integrated platform to rapidly screen potential candidate pairs with favourable synergistic anticancer effects, we made piscesCSM freely available online at https://biosig.lab.uq.edu.au/piscescsm/ as a web server and API. We believe that our predictive tool will provide a valuable resource for optimizing and augmenting combinatorial screening libraries to identify effective and safe synergistic anticancer drug combinations. 
SCIENTIFIC CONTRIBUTION: This work proposes piscesCSM, a machine-learning-based framework that relies on well-established graph-based representations of small molecules to identify synergistic drug combinations with better predictive accuracy. Our model, piscesCSM, shows that combining physicochemical properties with graph-based signatures can outperform current architectures on classification prediction tasks. Furthermore, implementing our tool as a web server offers a user-friendly platform for researchers to screen for potential synergistic drug combinations with favorable anticancer effects against one or multiple cancer cell lines.
Affiliation(s)
- Raghad AlJarf
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Carlos H M Rodrigues
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Yoochan Myung
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Douglas E V Pires
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
- David B Ascher
  - Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, VIC, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
4
Pegoraro M, Dominé C, Rodolà E, Veličković P, Deac A. Geometric epitope and paratope prediction. BIOINFORMATICS (OXFORD, ENGLAND) 2024; 40:btae405. [PMID: 38984742 PMCID: PMC11245313 DOI: 10.1093/bioinformatics/btae405] [Received: 12/16/2023] [Revised: 05/14/2024] [Accepted: 07/09/2024] [Indexed: 07/11/2024]
Abstract
MOTIVATION: Identifying the binding sites of antibodies is essential for developing vaccines and synthetic antibodies. In this article, we investigate the optimal representation for predicting the binding sites in the two molecules and emphasize the importance of geometric information.
RESULTS: Specifically, we compare different geometric deep learning methods applied to proteins' inner (I-GEP) and outer (O-GEP) structures. We incorporate 3D coordinates and spectral geometric descriptors as input features to fully leverage the geometric information. Our research suggests that different geometric representations are useful for different tasks. Surface-based models are more efficient in predicting the binding of the epitope, while graph models are better in paratope prediction, both achieving significant performance improvements. Moreover, we analyze the impact of structural changes in antibodies and antigens resulting from conformational rearrangements or reconstruction errors. Through this investigation, we showcase the robustness of geometric deep learning methods and spectral geometric descriptors to such perturbations.
AVAILABILITY AND IMPLEMENTATION: The Python code for the models, together with the data and the processing pipeline, is open-source and available at https://github.com/Marco-Peg/GEP.
Affiliation(s)
- Marco Pegoraro
  - Department of Computer Science, Sapienza University of Rome, 00185, Italy
- Clémentine Dominé
  - Gatsby Computational Neuroscience Unit, University College London, W1T 4JG, United Kingdom
- Emanuele Rodolà
  - Department of Computer Science, Sapienza University of Rome, 00185, Italy
- Andreea Deac
  - Département d'informatique et de recherche opérationnelle, Université de Montréal, QC H2S 3H1, Canada
5
Kaur B, Karnwal A, Bansal A, Malik T. An Immunoinformatic-Based In Silico Identification on the Creation of a Multiepitope-Based Vaccination Against the Nipah Virus. BIOMED RESEARCH INTERNATIONAL 2024; 2024:4066641. [PMID: 38962403 PMCID: PMC11221950 DOI: 10.1155/2024/4066641] [Received: 03/07/2024] [Revised: 05/30/2024] [Accepted: 06/01/2024] [Indexed: 07/05/2024]
Abstract
Zoonotic viruses pose significant threats to public health. Nipah virus (NiV) is an emerging virus transmitted from bats to humans. NiV causes severe encephalitis and acute respiratory distress syndrome, with fatality rates ranging from 40% to 75%. The disease first emerged in Malaysia in 1998-1999 and later in Bangladesh, Cambodia, Timor-Leste, Indonesia, Singapore, Papua New Guinea, Vietnam, Thailand, India, and other South and Southeast Asian nations. Currently, no specific vaccines or antiviral drugs are available. The potential advantages of epitope-based vaccines include their ability to elicit specific immune responses while minimizing potential side effects. The epitopes have been identified from the conserved region of viral proteins obtained from the UniProt database. The selection of conserved epitopes involves analyzing the genetic sequences of various viral strains. The present study identified two B cell epitopes, seven cytotoxic T lymphocyte (CTL) epitopes, and seven helper T lymphocyte (HTL) epitope interactions from the NiV proteomic inventory. The antigenic and physiological properties of retrieved protein were analyzed using online servers ToxinPred, VaxiJen v2.0, and AllerTOP. The final vaccine candidate has a total combined coverage range of 80.53%. The tertiary structure of the constructed vaccine was optimized, and its stability was confirmed with the help of molecular simulation. Molecular docking was performed to check the binding affinity and binding energy of the constructed vaccine with TLR-3 and TLR-5. Codon optimization was performed in the constructed vaccine within the Escherichia coli K12 strain, to eliminate the danger of codon bias. However, these findings require further validation to assess their effectiveness and safety.
The development of vaccines and therapeutic approaches for virus infection is an ongoing area of research, and it may take time before effective interventions are available for clinical use.
Affiliation(s)
- Beant Kaur
  - School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab 144411, India
- Arun Karnwal
  - School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab 144411, India
- Anu Bansal
  - School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab 144411, India
- Tabarak Malik
  - Department of Biomedical Sciences, Institute of Health, Jimma University, Jimma, Ethiopia
6
Yang Y, He X, Li F, He S, Liu M, Li M, Xia F, Su W, Liu G. Animal-derived food allergen: A review on the available crystal structure and new insights into structural epitope. Compr Rev Food Sci Food Saf 2024; 23:e13340. [PMID: 38778570 DOI: 10.1111/1541-4337.13340] [Received: 08/19/2023] [Accepted: 03/19/2024] [Indexed: 05/25/2024]
Abstract
Immunoglobulin E (IgE)-mediated food allergy is a rapidly growing public health problem. The interaction between allergens and IgE is at the core of the allergic response. One of the best ways to understand this interaction is through structural characterization. This review focuses on animal-derived food allergens, overviews allergen structures determined by X-ray crystallography, presents an update on IgE conformational epitopes, and explores the structural features of these epitopes. The structural determinants of allergenicity and cross-reactivity are also discussed. Animal-derived food allergens are classified into limited protein families according to structural features, with the calcium-binding protein and actin-binding protein families dominating. Progress in epitope characterization has provided useful information on the structural properties of the IgE recognition region. The data reveal that epitopes are located in relatively protruding areas with negative surface electrostatic potential. Ligand binding and disulfide bonds are two intrinsic characteristics that influence protein structure and impact allergenicity. Shared structures, local motifs, and shared epitopes are factors that lead to cross-reactivity. The structural properties of epitope regions and structural determinants of allergenicity and cross-reactivity may provide directions for the prevention, diagnosis, and treatment of food allergies. Experimentally determined structures, especially those of antigen-antibody complexes, remain limited, and the identification of epitopes continues to be a bottleneck in the study of animal-derived food allergens. A combination of traditional immunological techniques and emerging bioinformatics technology will revolutionize how protein interactions are characterized.
Affiliation(s)
- Yang Yang
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
  - College of Environment and Public Health, Xiamen Huaxia University, Xiamen, Fujian, China
- Xinrong He
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
- Fajie Li
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
- Shaogui He
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory, Xiamen, Fujian, China
- Meng Liu
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
  - College of Marine Biology, Xiamen Ocean Vocational College, Xiamen, Fujian, China
- Mengsi Li
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
  - School of Food Engineering, Zhangzhou Institute of Technology, Zhangzhou, Fujian, China
- Fei Xia
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
- Wenjin Su
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
- Guangming Liu
  - College of Ocean Food and Biological Engineering, Jimei University, Xiamen, Fujian, China
7
Kumar N, Tripathi S, Sharma N, Patiyal S, Devi NL, Raghava GPS. A method for predicting linear and conformational B-cell epitopes in an antigen from its primary sequence. Comput Biol Med 2024; 170:108083. [PMID: 38295479 DOI: 10.1016/j.compbiomed.2024.108083] [Received: 06/21/2023] [Revised: 12/26/2023] [Accepted: 01/27/2024] [Indexed: 02/02/2024]
Abstract
B cells are an essential component of the immune system and play a vital role in the immune response against pathogenic infection by producing antibodies. Existing methods predict either linear or conformational B-cell epitopes in an antigen. In this study, a single method was developed for predicting both types (linear/conformational) of B-cell epitopes. The dataset used in this study contains 3875 B-cell epitopes and 3996 non-B-cell epitopes, where B-cell epitopes consist of both linear and conformational B-cell epitopes. Our primary analysis indicates that certain residues (like Asp, Glu, Lys, and Asn) are more prominent in B-cell epitopes. We developed machine-learning based methods using different types of sequence composition and achieved the highest AUROC of 0.80 using dipeptide composition. In addition, models were developed on selected features, but no further improvement was observed. Our similarity-based method implemented using BLAST shows a high probability of correct prediction with poor sensitivity. Finally, we developed a hybrid model that combines alignment-free (dipeptide based random forest model) and alignment-based (BLAST-based similarity) models. Our hybrid model attained a maximum AUROC of 0.83 with an MCC of 0.49 on the independent dataset. Our hybrid model performs better than existing methods on an independent dataset used in this study. All models were trained and tested on 80% of the data using a cross-validation technique, and the final model was evaluated on 20% of the data, called an independent or validation dataset. A web server and standalone package named "CLBTope" has been developed for predicting, designing, and scanning B-cell epitopes in an antigen sequence; it is available at https://webs.iiitd.edu.in/raghava/clbtope/.
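The dipeptide composition feature named in this abstract can be sketched as follows. This is an illustrative sketch only, assuming the common definition (the fraction of each of the 400 ordered amino-acid pairs among all adjacent pairs in a sequence); the function name and the choice to skip pairs containing non-standard residues are assumptions, not details taken from CLBTope.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature indices are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide fractions for one sequence.

    Adjacent pairs containing a non-standard residue are skipped
    (an assumption of this sketch).
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# "ACAC" has three adjacent pairs: AC, CA, AC.
features = dipeptide_composition("ACAC")
print(len(features))                       # 400
print(features[DIPEPTIDES.index("AC")])    # 2/3
print(features[DIPEPTIDES.index("CA")])    # 1/3
```

A fixed-length vector like this could then be passed to any standard classifier; the abstract reports a random forest for the alignment-free model.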
Affiliation(s)
- Nishant Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Sadhana Tripathi
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Naorem Leimarembi Devi
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
8
Høie MH, Gade FS, Johansen J, Würtzen C, Winther O, Nielsen M, Marcatili P. DiscoTope-3.0: improved B-cell epitope prediction using inverse folding latent representations. Front Immunol 2024; 15:1322712. [PMID: 38390326 PMCID: PMC10882062 DOI: 10.3389/fimmu.2024.1322712] [Received: 10/16/2023] [Accepted: 01/08/2024] [Indexed: 02/24/2024]
Abstract
Accurate computational identification of B-cell epitopes is crucial for the development of vaccines, therapies, and diagnostic tools. However, current structure-based prediction methods face limitations due to the dependency on experimentally solved structures. Here, we introduce DiscoTope-3.0, a markedly improved B-cell epitope prediction tool that innovatively employs inverse folding structure representations and a positive-unlabelled learning strategy, and is adapted for both solved and predicted structures. Our tool demonstrates a considerable improvement in performance over existing methods, accurately predicting linear and conformational epitopes across multiple independent datasets. Most notably, DiscoTope-3.0 maintains high predictive performance across solved, relaxed and predicted structures, alleviating the need for experimental structures and extending the general applicability of accurate B-cell epitope prediction by 3 orders of magnitude. DiscoTope-3.0 is made widely accessible on two web servers, processing over 100 structures per submission, and as a downloadable package. In addition, the servers interface with RCSB and AlphaFoldDB, facilitating large-scale prediction across over 200 million cataloged proteins. DiscoTope-3.0 is available at: https://services.healthtech.dtu.dk/service.php?DiscoTope-3.0.
Affiliation(s)
- Magnus Haraldson Høie
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Frederik Steensgaard Gade
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Julie Maria Johansen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Charlotte Würtzen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Ole Winther
  - Section for Cognitive Systems, DTU Compute, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
  - Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen, Denmark
  - Department of Biology, Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Morten Nielsen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
- Paolo Marcatili
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark
9
Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024; 9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024]
Abstract
Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.
Affiliation(s)
- Barbara Bravi
  - Department of Mathematics, Imperial College London, London, SW7 2AZ, UK
10
Farriol-Duran R, López-Aladid R, Porta-Pardo E, Torres A, Fernández-Barat L. Brewpitopes: a pipeline to refine B-cell epitope predictions during public health emergencies. Front Immunol 2023; 14:1278534. [PMID: 38124749 PMCID: PMC10730938 DOI: 10.3389/fimmu.2023.1278534] [Received: 08/16/2023] [Accepted: 11/14/2023] [Indexed: 12/23/2023]
Abstract
The application of B-cell epitope identification to develop therapeutic antibodies and vaccine candidates is well established. However, the validation of epitopes is time-consuming and resource-intensive. To alleviate this, in recent years, multiple computational predictors have been developed in the immunoinformatics community. Brewpitopes is a pipeline that curates bioinformatic B-cell epitope predictions obtained by integrating different state-of-the-art tools. We used additional computational predictors to account for subcellular location, glycosylation status, and surface accessibility of the predicted epitopes. The implementation of these sets of rational filters optimizes in vivo antibody recognition properties of the candidate epitopes. To validate Brewpitopes, we performed a proteome-wide analysis of SARS-CoV-2 with a particular focus on S protein and its variants of concern. In the S protein, we obtained a fivefold enrichment in terms of predicted neutralization versus the epitopes identified by individual tools. We analyzed epitope landscape changes caused by mutations in the S protein of new viral variants that were linked to observed immune escape evidence in specific strains. In addition, we identified a set of epitopes with neutralizing potential in four SARS-CoV-2 proteins (R1AB, R1A, AP3A, and ORF9C). These epitopes and antigenic proteins are conserved targets for viral neutralization studies. In summary, Brewpitopes is a powerful pipeline that refines B-cell epitope bioinformatic predictions during public health emergencies in a high-throughput capacity to facilitate the optimization of experimental validation of therapeutic antibodies and candidate vaccines.
Affiliation(s)
- Ruben López-Aladid
  - CELLEX Research Laboratories, CibeRes (Centro de Investigación Biomédica en Red de Enfermedades Respiratorias), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
  - Pneumology Department, Hospital Clínic, Barcelona, Spain
- Eduard Porta-Pardo
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
- Antoni Torres
  - CELLEX Research Laboratories, CibeRes (Centro de Investigación Biomédica en Red de Enfermedades Respiratorias), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
  - Pneumology Department, Hospital Clínic, Barcelona, Spain
- Laia Fernández-Barat
  - CELLEX Research Laboratories, CibeRes (Centro de Investigación Biomédica en Red de Enfermedades Respiratorias), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
  - Pneumology Department, Hospital Clínic, Barcelona, Spain
11
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
|
12
|
Angaitkar P, Janghel RR, Sahu TP. DL-TCNN: Deep Learning-based Temporal Convolutional Neural Network for prediction of conformational B-cell epitopes. 3 Biotech 2023; 13:297. [PMID: 37575599 PMCID: PMC10412510 DOI: 10.1007/s13205-023-03716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 07/24/2023] [Indexed: 08/15/2023] Open
Abstract
Prediction of conformational B-cell epitopes (CBCE) is an essential phase for vaccine design, drug invention, and accurate disease diagnosis. Many laboratorial and computational approaches have been developed to predict CBCE. However, laboratorial experiments are costly and time consuming, leading to the popularity of Machine Learning (ML)-based computational methods. Although ML methods have succeeded in many domains, achieving higher accuracy in CBCE prediction remains a challenge. To overcome this drawback and consider the limitations of ML methods, this paper proposes a novel DL-based framework for CBCE prediction, leveraging the capabilities of deep learning in the medical domain. The proposed model is named Deep Learning-based Temporal Convolutional Neural Network (DL-TCNN), which hybridizes empirical hyper-tuned 1D-CNN and TCN. TCN is an architecture that employs causal convolutions and dilations, adapting well to sequential input with extensive receptive fields. To train the proposed model, physicochemical features are firstly extracted from antigen sequences. Next, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. Finally, the proposed DL-TCNN is employed for the prediction of CBCE. The model's performance is evaluated and validated on a benchmark antigen-antibody dataset. The DL-TCNN achieves 94.44% accuracy, and 0.989 AUC score for the training dataset, 78.53% accuracy, and 0.661 AUC score for the validation dataset; and 85.10% accuracy, 0.855 AUC score for the testing dataset. The proposed model outperforms all the existing CBCE methods.
Affiliation(s)
- Pratik Angaitkar
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Rekh Ram Janghel
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Tirath Prasad Sahu
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
|
13
|
Guarra F, Colombo G. Computational Methods in Immunology and Vaccinology: Design and Development of Antibodies and Immunogens. J Chem Theory Comput 2023; 19:5315-5333. [PMID: 37527403 PMCID: PMC10448727 DOI: 10.1021/acs.jctc.3c00513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Indexed: 08/03/2023]
Abstract
The design of new biomolecules able to harness immune mechanisms for the treatment of diseases is a prime challenge for computational and simulative approaches. For instance, in recent years, antibodies have emerged as an important class of therapeutics against a spectrum of pathologies. In cancer, immune-inspired approaches are witnessing a surge thanks to a better understanding of tumor-associated antigens and the mechanisms of their engagement or evasion from the human immune system. Here, we provide a summary of the main state-of-the-art computational approaches that are used to design antibodies and antigens, and in parallel, we review key methodologies for epitope identification for both B- and T-cell mediated responses. A special focus is devoted to the description of structure- and physics-based models, privileged over purely sequence-based approaches. We discuss the implications of novel methods in engineering biomolecules with tailored immunological properties for possible therapeutic uses. Finally, we highlight the extraordinary challenges and opportunities presented by the possible integration of structure- and physics-based methods with emerging Artificial Intelligence technologies for the prediction and design of novel antigens, epitopes, and antibodies.
Affiliation(s)
- Federica Guarra
- Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
- Giorgio Colombo
- Department of Chemistry, University of Pavia, Via Taramelli 12, 27100 Pavia, Italy
|
14
|
Zeng X, Bai G, Sun C, Ma B. Recent Progress in Antibody Epitope Prediction. Antibodies (Basel) 2023; 12:52. [PMID: 37606436 PMCID: PMC10443277 DOI: 10.3390/antib12030052] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2023] [Revised: 07/31/2023] [Accepted: 08/03/2023] [Indexed: 08/23/2023] Open
Abstract
Recent progress in epitope prediction has shown promising results in the development of vaccines and therapeutics against various diseases. However, the overall accuracy and success rate need to be improved greatly to gain practical application significance, especially conformational epitope prediction. In this review, we examined the general features of antibody-antigen recognition, highlighting the conformation selection mechanism in flexible antibody-antigen binding. We recently highlighted the success and warning signs of antibody epitope predictions, including linear and conformation epitope predictions. While deep learning-based models gradually outperform traditional feature-based machine learning, sequence and structure features still provide insight into antibody-antigen recognition problems.
Affiliation(s)
- Xincheng Zeng
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Ganggang Bai
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Chuance Sun
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Digiwiser Biological, Inc., Shanghai 200131, China
|
15
|
Li T, Li Y, Zhu X, He Y, Wu Y, Ying T, Xie Z. Artificial intelligence in cancer immunotherapy: Applications in neoantigen recognition, antibody design and immunotherapy response prediction. Semin Cancer Biol 2023; 91:50-69. [PMID: 36870459 DOI: 10.1016/j.semcancer.2023.02.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Revised: 02/13/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Cancer immunotherapy is a method of controlling and eliminating tumors by reactivating the body's cancer-immunity cycle and restoring its antitumor immune response. The increased availability of data, combined with advancements in high-performance computing and innovative artificial intelligence (AI) technology, has resulted in a rise in the use of AI in oncology research. State-of-the-art AI models for functional classification and prediction in immunotherapy research are increasingly used to support laboratory-based experiments. This review offers a glimpse of the current AI applications in immunotherapy, including neoantigen recognition, antibody design, and prediction of immunotherapy response. Advancing in this direction will result in more robust predictive models for developing better targets, drugs, and treatments, and these advancements will eventually make their way into the clinical setting, pushing AI forward in the field of precision oncology.
Affiliation(s)
- Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yupeng Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Xiaoyi Zhu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
- Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yanling Wu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
- Tianlei Ying
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
|
16
|
Parkinson J, Hard R, Wang W. The RESP AI model accelerates the identification of tight-binding antibodies. Nat Commun 2023; 14:454. [PMID: 36709319 PMCID: PMC9884274 DOI: 10.1038/s41467-023-36028-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 07/31/2022] [Accepted: 01/13/2023] [Indexed: 01/30/2023] Open
Abstract
High-affinity antibodies are often identified through directed evolution, which may require many iterations of mutagenesis and selection to find an optimal candidate. Deep learning techniques hold the potential to accelerate this process but the existing methods cannot provide the confidence interval or uncertainty needed to assess the reliability of the predictions. Here we present a pipeline called RESP for efficient identification of high affinity antibodies. We develop a learned representation trained on over 3 million human B-cell receptor sequences to encode antibody sequences. We then develop a variational Bayesian neural network to perform ordinal regression on a set of the directed evolution sequences binned by off-rate and quantify their likelihood to be tight binders against an antigen. Importantly, this model can assess sequences not present in the directed evolution library and thus greatly expand the search space to uncover the best sequences for experimental evaluation. We demonstrate the power of this pipeline by achieving a 17-fold improvement in the KD of the PD-L1 antibody Atezolizumab and this success illustrates the potential of RESP in facilitating general antibody development.
Affiliation(s)
- Jonathan Parkinson
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, 92093-0359, USA
- Ryan Hard
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, 92093-0359, USA
- Wei Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, 92093-0359, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, 92093-0359, USA
|
17
|
Cia G, Pucci F, Rooman M. Critical review of conformational B-cell epitope prediction methods. Brief Bioinform 2023; 24:6972295. [PMID: 36611255 DOI: 10.1093/bib/bbac567] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/29/2022] [Revised: 11/17/2022] [Accepted: 11/19/2022] [Indexed: 01/09/2023] Open
Abstract
Accurate in silico prediction of conformational B-cell epitopes would lead to major improvements in disease diagnostics, drug design and vaccine development. A variety of computational methods, mainly based on machine learning approaches, have been developed in the last decades to tackle this challenging problem. Here, we rigorously benchmarked nine state-of-the-art conformational B-cell epitope prediction webservers, including generic and antibody-specific methods, on a dataset of over 250 antibody-antigen structures. The results of our assessment and statistical analyses show that all the methods achieve very low performances, and some do not perform better than randomly generated patches of surface residues. In addition, we also found that commonly used consensus strategies that combine the results from multiple webservers are at best only marginally better than random. Finally, we applied all the predictors to the SARS-CoV-2 spike protein as an independent case study, and showed that they perform poorly in general, which largely recapitulates our benchmarking conclusions. We hope that these results will lead to greater caution when using these tools until the biases and issues that limit current methods have been addressed, promote the use of state-of-the-art evaluation methodologies in future publications and suggest new strategies to improve the performance of conformational B-cell epitope prediction methods.
Affiliation(s)
- Gabriel Cia
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, F. Roosevelt Avenue, 1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, Triumph Boulevard, 1050, Brussels, Belgium
|
18
|
Waury K, Willemse EAJ, Vanmechelen E, Zetterberg H, Teunissen CE, Abeln S. Bioinformatics tools and data resources for assay development of fluid protein biomarkers. Biomark Res 2022; 10:83. [DOI: 10.1186/s40364-022-00425-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Accepted: 10/25/2022] [Indexed: 11/16/2022] Open
Abstract
Fluid protein biomarkers are important tools in clinical research and health care to support diagnosis and to monitor patients. Especially within the field of dementia, novel biomarkers could address the current challenges of providing an early diagnosis and of selecting trial participants. While the great potential of fluid biomarkers is recognized, their implementation in routine clinical use has been slow. One major obstacle is the often unsuccessful translation of biomarker candidates from explorative high-throughput techniques to sensitive antibody-based immunoassays. In this review, we propose the incorporation of bioinformatics into the workflow of novel immunoassay development to overcome this bottleneck and thus facilitate the development of novel biomarkers towards clinical laboratory practice. Due to the rapid progress within the field of bioinformatics many freely available and easy-to-use tools and data resources exist which can aid the researcher at various stages. Current prediction methods and databases can support the selection of suitable biomarker candidates, as well as the choice of appropriate commercial affinity reagents. Additionally, we examine methods that can determine or predict the epitope - an antibody’s binding region on its antigen - and can help to make an informed choice on the immunogenic peptide used for novel antibody production. Selected use cases for biomarker candidates help illustrate the application and interpretation of the introduced tools.
|
19
|
Wang Y, Tang H, Gao C, Ge M, Li Z, Dong Z, Zhao L. Flexibility-aware graph model for accurate epitope identification. Comput Biol Med 2022; 149:106064. [DOI: 10.1016/j.compbiomed.2022.106064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 08/05/2022] [Accepted: 08/27/2022] [Indexed: 11/25/2022]
|
20
|
Kakkanas A, Karamichali E, Koufogeorgou EI, Kotsakis SD, Georgopoulou U, Foka P. Targeting the YXXΦ Motifs of the SARS Coronaviruses 1 and 2 ORF3a Peptides by In Silico Analysis to Predict Novel Virus-Host Interactions. Biomolecules 2022; 12:1052. [PMID: 36008946 PMCID: PMC9405953 DOI: 10.3390/biom12081052] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/27/2022] [Revised: 07/21/2022] [Accepted: 07/25/2022] [Indexed: 02/08/2023] Open
Abstract
The emerging SARS-CoV and SARS-CoV-2 belong to the family of "common cold" RNA coronaviruses, and they are responsible for the 2003 epidemic and the current pandemic with over 6.3 M deaths worldwide. The ORF3a gene is conserved in both viruses and codes for the accessory protein ORF3a, with unclear functions, possibly related to viral virulence and pathogenesis. The tyrosine-based YXXΦ motif (Φ: bulky hydrophobic residue, L/I/M/V/F) was originally discovered to mediate clathrin-dependent endocytosis of membrane-spanning proteins. Many viruses employ the YXXΦ motif to achieve efficient receptor-guided internalisation in host cells, maintain the structural integrity of their capsids and enhance viral replication. Importantly, this motif has been recently identified on the ORF3a proteins of SARS-CoV and SARS-CoV-2. Given that the ORF3a aa sequence is not fully conserved between the two SARS viruses, we aimed to map in silico structural differences and putative sequence-driven alterations of regulatory elements within and adjacent to the YXXΦ motifs that could predict variations in ORF3a functions. Using robust bioinformatics tools, we investigated the presence of relevant post-translational modifications and the YXXΦ motif involvement in protein-protein interactions. Our study suggests that the predicted YXXΦ-related features may confer specific, yet to be discovered, functions to ORF3a proteins, significant to the new virus and related to enhanced propagation, host immune regulation and virulence.
Affiliation(s)
- Athanassios Kakkanas
- Laboratory of Molecular Virology, Hellenic Pasteur Institute, 115-21 Athens, Greece
- Eirini Karamichali
- Laboratory of Molecular Virology, Hellenic Pasteur Institute, 115-21 Athens, Greece
- Efthymia Ioanna Koufogeorgou
- Laboratory of Molecular Virology, Hellenic Pasteur Institute, 115-21 Athens, Greece
- Stathis D. Kotsakis
- Laboratory of Bacteriology, Hellenic Pasteur Institute, 115-21 Athens, Greece
- Urania Georgopoulou
- Laboratory of Molecular Virology, Hellenic Pasteur Institute, 115-21 Athens, Greece
- Pelagia Foka
- Laboratory of Molecular Virology, Hellenic Pasteur Institute, 115-21 Athens, Greece
|
21
|
Wilman W, Wróbel S, Bielska W, Deszynski P, Dudzic P, Jaszczyszyn I, Kaniewski J, Młokosiewicz J, Rouyan A, Satława T, Kumar S, Greiff V, Krawczyk K. Machine-designed biotherapeutics: opportunities, feasibility and advantages of deep learning in computational antibody discovery. Brief Bioinform 2022; 23:6643456. [PMID: 35830864 PMCID: PMC9294429 DOI: 10.1093/bib/bbac267] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/14/2022] [Revised: 05/09/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
Antibodies are versatile molecular binders with an established and growing role as therapeutics. Computational approaches to developing and designing these molecules are being increasingly used to complement traditional lab-based processes. Nowadays, in silico methods fill multiple elements of the discovery stage, such as characterizing antibody–antigen interactions and identifying developability liabilities. Recently, computational methods tackling such problems have begun to follow machine learning paradigms, in many cases deep learning specifically. This paradigm shift offers improvements in established areas such as structure or binding prediction and opens up new possibilities such as language-based modeling of antibody repertoires or machine-learning-based generation of novel sequences. In this review, we critically examine the recent developments in (deep) machine learning approaches to therapeutic antibody design with implications for fully computational antibody design.
|
22
|
Caoili SEC. Comprehending B-Cell Epitope Prediction to Develop Vaccines and Immunodiagnostics. Front Immunol 2022; 13:908459. [PMID: 35874755 PMCID: PMC9300992 DOI: 10.3389/fimmu.2022.908459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 06/13/2022] [Indexed: 11/18/2022] Open
|
23
|
Rezende PM, Xavier JS, Ascher DB, Fernandes GR, Pires DEV. Evaluating hierarchical machine learning approaches to classify biological databases. Brief Bioinform 2022; 23:6611916. [PMID: 35724625 PMCID: PMC9310517 DOI: 10.1093/bib/bbac216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 04/29/2022] [Accepted: 05/09/2022] [Indexed: 12/04/2022] Open
Abstract
The rate of biological data generation has increased dramatically in recent years, which has driven the importance of databases as a resource to guide innovation and the generation of biological insights. Given the complexity and scale of these databases, automatic data classification is often required. Biological data sets are often hierarchical in nature, with varying degrees of complexity, imposing different challenges to train, test and validate accurate and generalizable classification models. While some approaches to classify hierarchical data have been proposed, no guidelines regarding their utility, applicability and limitations have been explored or implemented. These include ‘Local’ approaches considering the hierarchy, building models per level or node, and ‘Global’ hierarchical classification, using a flat classification approach. To fill this gap, here we have systematically contrasted the performance of ‘Local per Level’ and ‘Local per Node’ approaches with a ‘Global’ approach applied to two different hierarchical datasets: BioLip and CATH. The results show how different components of hierarchical data sets, such as variation coefficient and prediction by depth, can guide the choice of appropriate classification schemes. Finally, we provide guidelines to support this process when embarking on a hierarchical classification task, which will help optimize computational resources and predictive performance.
Affiliation(s)
- Pâmela M Rezende
- Universidade Federal de Minas Gerais; Instituto René Rachou, Fundação Oswaldo Cruz; Stilingue Inteligência Artificial
- Joicymara S Xavier
- Universidade Federal de Minas Gerais; Instituto René Rachou, Fundação Oswaldo Cruz; Institute of Agricultural Sciences, Universidade Federal dos Vales do Jequitinhonha e Mucuri
- David B Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland; Systems and Computational Biology, Bio 21 Institute, University of Melbourne; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute
- Douglas E V Pires
- Systems and Computational Biology, Bio 21 Institute, University of Melbourne; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute; School of Computing and Information Systems, University of Melbourne
|
24
|
Hummer AM, Abanades B, Deane CM. Advances in computational structure-based antibody design. Curr Opin Struct Biol 2022; 74:102379. [PMID: 35490649 DOI: 10.1016/j.sbi.2022.102379] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 12/13/2021] [Revised: 02/28/2022] [Accepted: 03/17/2022] [Indexed: 12/12/2022]
Abstract
Antibodies are currently the most important class of biotherapeutics and are used to treat numerous diseases. Recent advances in computational methods are ushering in a new era of antibody design, driven in part by accurate structure prediction. Previously, structure-based antibody design has been limited to a relatively small number of cases where accurate structures or models of both the target antigen and antibody were available. As we move towards a time where it is possible to accurately model most antibodies and antigens, and to reliably predict their binding site, there is vast potential for true computational antibody design. In this review, we describe the latest methods that promise to launch a paradigm shift towards entirely in silico structure-based antibody design.
Affiliation(s)
- Alissa M Hummer
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK. https://twitter.com/@AlissaHummer
- Brennan Abanades
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK. https://twitter.com/@brennanaba
- Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
|