1
Weckbecker M, Anžel A, Yang Z, Hattab G. Interpretable molecular encodings and representations for machine learning tasks. Comput Struct Biotechnol J 2024; 23:2326-2336. [PMID: 38867722 PMCID: PMC11167246 DOI: 10.1016/j.csbj.2024.05.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 05/13/2024] [Accepted: 05/19/2024] [Indexed: 06/14/2024] Open
Abstract
Molecular encodings and their usage in machine learning models have demonstrated significant breakthroughs in biomedical applications, particularly in the classification of peptides and proteins. To this end, we propose a new encoding method: Interpretable Carbon-based Array of Neighborhoods (iCAN). Designed to address machine learning models' need for more structured and less flexible input, it captures the neighborhoods of carbon atoms in a counting array and improves the utility of the resulting encodings for machine learning models. The iCAN method provides interpretable molecular encodings and representations, enabling the comparison of molecular neighborhoods, identification of repeating patterns, and visualization of relevance heat maps for a given data set. When reproducing a large biomedical peptide classification study, it outperforms its predecessor encoding. When extended to proteins, it outperforms a lead structure-based encoding on 71% of the data sets. Our method offers interpretable encodings that can be applied to all organic molecules, including exotic amino acids, cyclic peptides, and larger proteins, making it highly versatile across various domains and data sets. This work establishes a promising new direction for machine learning in peptide and protein classification in biomedicine and healthcare, potentially accelerating advances in drug discovery and disease diagnosis.
Affiliation(s)
- Moritz Weckbecker
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Aleksandar Anžel
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Zewen Yang
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
- Georges Hattab
  - Center for Artificial Intelligence in Public Health Research (ZKI-PH), Robert Koch Institute, Nordufer 20, Berlin, 13353, Berlin, Germany
  - Department of Mathematics and Computer Science, Freie Universität, Arnimallee 14, Berlin, 14195, Berlin, Germany
2
Raoufi Z, Abdollahi S. Vaccination with OprB porin, and its epitopes offers protection against A. baumannii infections in mice. Int Immunopharmacol 2024; 141:112972. [PMID: 39186832 DOI: 10.1016/j.intimp.2024.112972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Revised: 08/15/2024] [Accepted: 08/16/2024] [Indexed: 08/28/2024]
Abstract
A. baumannii is a deadly antimicrobial-resistant pathogen that acquires drug resistance through different mechanisms. Therefore, it is necessary to investigate all its virulence factors and design effective vaccines against it. For this purpose, OprB, an outer membrane porin, was investigated in this study, and its secondary and tertiary structures, physicochemical properties, and B- and T-cell epitopes were determined. The vaccine potential of this protein and of its linear, non-continuous, and chimeric epitopes was also analyzed in vivo. Based on the results, two surface epitopes and one non-continuous epitope were identified. Surface contiguous epitopes were produced recombinantly, and non-continuous epitope sequences were synthesized and then produced. The chimeric epitope was also produced via the SOE-PCR technique. Active and passive immunization of mice with the whole OprB protein, the non-continuous epitope, the contiguous epitopes, two epitopes in chimeric form, or a mixture of two purified epitopes significantly increased the survival level and total IgG titer compared with non-vaccinated mice or mice vaccinated with an internal fragment. The bacterial load in the lung, liver, kidney, and spleen of the immunized mice was much lower than in the control groups, and the levels of the cytokines TNF-α, IFN-γ, and IL-6 were also lower in these groups and similar to those of naive mice. In addition, the subunit vaccines showed acceptable safety and, owing to their minimal cross-activity, are much safer to use.
Affiliation(s)
- Zeinab Raoufi
  - Department of Biology, Faculty of Basic Science, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran
- Sajad Abdollahi
  - Department of Biology, Faculty of Basic Science, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran
3
Zhang G, Kuang X, Zhang Y, Liu Y, Su Z, Zhang T, Wu Y. Machine-learning-based structural analysis of interactions between antibodies and antigens. Biosystems 2024; 243:105264. [PMID: 38964652 DOI: 10.1016/j.biosystems.2024.105264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 06/21/2024] [Accepted: 07/01/2024] [Indexed: 07/06/2024]
Abstract
Computational analysis of paratope-epitope interactions between antibodies and their corresponding antigens can facilitate our understanding of the molecular mechanism underlying humoral immunity and boost the design of new therapeutics for many diseases. The recent breakthrough in artificial intelligence has made it possible to predict protein-protein interactions and model their structures. Unfortunately, detecting antigen-binding sites associated with a specific antibody is still a challenging problem. To tackle this challenge, we implemented a deep learning model to characterize interaction patterns between antibodies and their corresponding antigens. With high accuracy, our model can distinguish between antibody-antigen complexes and other types of protein-protein complexes. More intriguingly, we can identify antigens from other common protein binding regions with an accuracy of higher than 70% even if we only have the epitope information. This indicates that antigens have distinct features on their surface that antibodies can recognize. Additionally, our model was unable to predict the partnerships between antibodies and their particular antigens. This result suggests that one antigen may be targeted by more than one antibody and that antibodies may bind to previously unidentified proteins. Taken together, our results support the precision of antibody-antigen interactions while also suggesting positive future progress in the prediction of specific pairing.
Affiliation(s)
- Grace Zhang
  - Staples High School, 70 North Avenue, Westport, CT, 06880, USA
- Xiaohan Kuang
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Yuhao Zhang
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Yunchao Liu
  - Department of Computer Science, Vanderbilt University, 1400 18th Ave S, Nashville, TN, 37212, USA
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212, USA
- Tom Zhang
  - California Institute of Technology, 1200 East California Boulevard, Pasadena, CA, 91125, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY, 10461, USA
4
Chen TY, Ho YJ, Ko FY, Wu PY, Chang CJ, Ho SY. Multi-epitope vaccine design of African swine fever virus considering T cell and B cell immunogenicity. AMB Express 2024; 14:95. [PMID: 39215890 PMCID: PMC11365882 DOI: 10.1186/s13568-024-01749-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 08/13/2024] [Indexed: 09/04/2024] Open
Abstract
T and B cell activation are equally important in triggering and orchestrating adaptive host responses when designing multi-epitope African swine fever virus (ASFV) vaccines. However, few design methods have considered the trade-off between T and B cell immunogenicity when identifying promising ASFV epitopes. This work proposed a novel Pareto front-based ASFV screening method, PFAS, to identify promising epitopes for designing multi-epitope vaccines utilizing five ASFV Georgia 2007/1 sequences. To accurately predict T cell immunogenicity, four scoring methods were used to estimate T cell activation in its four stages: proteasomal cleavage probability, transporter associated with antigen processing (TAP) transport efficiency, major histocompatibility complex (MHC) class I binding affinity, and CD8+ cytotoxic T cell immunogenicity. PFAS ranked promising epitopes using a Pareto front method considering T and B cell immunogenicity. The coefficient of determination between the Pareto ranks of multi-epitope vaccines and survival days of swine vaccinations was R² = 0.95. Consequently, PFAS scored complete epitope profiles and identified 72 promising top-ranked epitopes, including 46 CD2v epitopes, two p30 epitopes, 10 p72 epitopes, and 14 pp220 epitopes. PFAS is the first method to use the Pareto front approach to identify promising epitopes while maximizing both T and B cell immunogenicity. The top-ranked promising epitopes can be cost-effectively validated in vitro. The Pareto front approach can be adaptively applied to various epitope predictors for bacterial, viral and cancer vaccine development. The MATLAB code of the Pareto front method is available at https://github.com/NYCU-ICLAB/PFAS.
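To illustrate the general idea of ranking epitopes by Pareto fronts over two objectives, the following is a minimal Python sketch. It is not the authors' MATLAB implementation: the function names, the epitope labels, and the scores are hypothetical, and it assumes each epitope already has one T cell score and one B cell score where higher is better.

```python
from typing import Dict, Tuple

# Hypothetical input: epitope name -> (T cell score, B cell score), higher is better.
Scores = Dict[str, Tuple[float, float]]


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """Return True if a is at least as good as b on both objectives
    and strictly better on at least one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_ranks(scores: Scores) -> Dict[str, int]:
    """Assign rank 1 to the non-dominated epitopes, remove them,
    assign rank 2 to the next non-dominated set, and so on."""
    remaining = dict(scores)
    ranks: Dict[str, int] = {}
    rank = 1
    while remaining:
        front = [
            name
            for name, s in remaining.items()
            if not any(dominates(other, s) for other in remaining.values())
        ]
        for name in front:
            ranks[name] = rank
            del remaining[name]
        rank += 1
    return ranks


# Made-up scores, for illustration only.
example = {
    "ep1": (0.9, 0.2),
    "ep2": (0.5, 0.8),
    "ep3": (0.4, 0.4),
    "ep4": (0.1, 0.1),
}
ranks = pareto_ranks(example)
print(ranks)
```

In this toy input, ep1 and ep2 trade off against each other and share the first front, while ep3 and ep4 are dominated and fall into later fronts.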
Affiliation(s)
- Ting-Yu Chen
  - Institute of Molecular Medicine and Bioengineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yann-Jen Ho
  - Department of Life Science, National Chung Hsing University, Taichung, Taiwan
  - Department of Life Science, Genome and Systems Biology Degree Program, National Taiwan University, Taipei, Taiwan
- Fang-Yu Ko
  - Department of Life Science, National Chung Hsing University, Taichung, Taiwan
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Pei-Yin Wu
  - Reber Genetics Co., Ltd., 13F, No. 160, Sec. 6, Minquan E. Rd., Neihu Dist, Taipei, 114, Taiwan
- Chia-Jung Chang
  - Reber Genetics Co., Ltd., 13F, No. 160, Sec. 6, Minquan E. Rd., Neihu Dist, Taipei, 114, Taiwan
- Shinn-Ying Ho
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
  - Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
  - Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, Taiwan
  - College of Health Sciences, Kaohsiung Medical University, Kaohsiung, Taiwan
5
Yang J, Lv Y, Zhu Y, Song J, Zhu M, Wu C, Fu Y, Zhao W, Zhao Y. Optimizing sheep B-cell epitopes in Echinococcus granulosus recombinant antigen P29 for vaccine development. Front Immunol 2024; 15:1451538. [PMID: 39206186 PMCID: PMC11349700 DOI: 10.3389/fimmu.2024.1451538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 08/01/2024] [Indexed: 09/04/2024] Open
Abstract
Background: Echinococcus granulosus causes a widespread zoonotic parasitic disease that significantly impacts human health and livestock development; however, no vaccine is currently available for humans. Our preliminary studies indicate that recombinant antigen P29 (rEg.P29) is a promising vaccine candidate. Methods: Sheep were immunized with rEg.P29, and venous blood was collected at various time points. Serum was isolated, and the presence of specific antibodies was detected using ELISA. We designed and synthesized a total of 45 B cell monopeptides covering rEg.P29 using the overlap method. ELISA was employed to assess the serum antibodies of the immunized sheep for recognition of these overlapping peptides, leading to the preliminary identification of B cell epitopes. Utilizing these identified epitopes, new single peptides were designed, synthesized, and used to optimize and confirm B-cell epitopes. Results: rEg.P29 effectively induces a sustained antibody response in sheep, particularly characterized by high and stable levels of IgG. Eight B-cell epitopes were identified, which were mainly distributed in three regions of rEg.P29. Finally, three B cell epitopes were identified and optimized: rEg.P29(71-90), rEg.P29(151-175), and rEg.P29(211-235). These optimized epitopes were well recognized by antibodies in sheep and mice, and the efficacy of these three epitopes significantly increased when they were linked in tandem. Conclusion: Three B-cell epitopes were identified and optimized, and the efficacy of these epitopes was significantly enhanced by tandem connection, which indicates the feasibility of tandem peptide vaccine research. This lays a solid foundation for the development of an epitope peptide vaccine for Echinococcus granulosus.
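As an illustration of how overlapping peptides can tile a protein sequence, here is a minimal Python sketch of a sliding-window design. The abstract does not state the peptide length or overlap that was used, so the `length` and `step` parameters, the function name, and the toy sequence below are all assumptions made purely for demonstration.

```python
from typing import List, Tuple


def overlapping_peptides(sequence: str, length: int, step: int) -> List[Tuple[int, int, str]]:
    """Tile a sequence with windows of `length` residues that start every
    `step` residues. Returns (start, end, peptide) with 1-based inclusive
    positions. A final window is anchored to the C-terminus so that the
    tail of the sequence is always covered."""
    if length <= 0 or step <= 0 or step > length:
        raise ValueError("require 0 < step <= length so that windows leave no gaps")
    if not sequence:
        return []
    if len(sequence) <= length:
        return [(1, len(sequence), sequence)]

    peptides = []
    start = 0
    while start + length < len(sequence):
        peptides.append((start + 1, start + length, sequence[start:start + length]))
        start += step

    # Last window ends exactly at the final residue.
    last = len(sequence) - length
    peptides.append((last + 1, len(sequence), sequence[last:]))
    return peptides


# Toy 20-letter string standing in for a protein sequence (not rEg.P29).
toy = "ABCDEFGHIJKLMNOPQRST"
peptides = overlapping_peptides(toy, length=10, step=5)
for start, end, pep in peptides:
    print(start, end, pep)
```

Adjacent windows here share `length - step` residues, so an epitope that straddles a window boundary still appears intact in at least one peptide as long as it is no longer than the overlap plus one residue.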
Affiliation(s)
- Jihui Yang
  - Center of Scientific Technology, Ningxia Medical University, Yinchuan, China
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
- Yongxue Lv
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
  - School of Basic Medicine, Ningxia Medical University, Yinchuan, China
- Yazhou Zhu
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
  - School of Basic Medicine, Ningxia Medical University, Yinchuan, China
- Jiahui Song
  - Center of Scientific Technology, Ningxia Medical University, Yinchuan, China
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
- Mingxing Zhu
  - Center of Scientific Technology, Ningxia Medical University, Yinchuan, China
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
- Changyou Wu
  - Institute of Immunology, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Yong Fu
  - Qinghai Academy of Animal Sciences and Veterinary Medicine, Qinghai University, Xining, China
- Wei Zhao
  - Center of Scientific Technology, Ningxia Medical University, Yinchuan, China
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
- Yinqi Zhao
  - Center of Scientific Technology, Ningxia Medical University, Yinchuan, China
  - Ningxia Key Laboratory of Prevention and Treatment of Common Infectious Diseases, Ningxia Medical University, Yinchuan, China
6
Zhang X, Wang H, Sun C. BiSpec Pairwise AI: guiding the selection of bispecific antibody target combinations with pairwise learning and GPT augmentation. J Cancer Res Clin Oncol 2024; 150:237. [PMID: 38713378 PMCID: PMC11076393 DOI: 10.1007/s00432-024-05740-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 04/03/2024] [Indexed: 05/08/2024]
Abstract
PURPOSE: Bispecific antibodies (BsAbs), capable of targeting two antigens simultaneously, represent a significant advancement by employing dual mechanisms of action for tumor suppression. However, how to pair targets to develop effective and safe bispecific drugs is a major challenge for pharmaceutical companies. METHODS: Using machine learning models, we refined the biological characteristics of currently approved or in clinical development BsAbs and analyzed hundreds of membrane proteins as bispecific targets to predict the likelihood of successful drug development for various target combinations. Moreover, to enhance the interpretability of prediction results in bispecific target combination, we combined machine learning models with Large Language Models (LLMs). Through a Retrieval-Augmented Generation (RAG) approach, we supplement each pair of bispecific targets' machine learning prediction with important features and rationales, generating interpretable analytical reports. RESULTS: In this study, the XGBoost model with pairwise learning was employed to predict the druggability of BsAbs. By analyzing extensive data on BsAbs and designing features from perspectives such as target activity, safety, cell type specificity, pathway mechanism, and gene embedding representation, our model is able to predict target combinations of BsAbs with high market potential. Specifically, we integrated XGBoost with the GPT model to discuss the efficacy of each bispecific target pair, thereby aiding the decision-making for drug developers. CONCLUSION: The novelty of this study lies in the integration of machine learning and GPT techniques to provide a novel framework for the design of BsAbs drugs. This holistic approach not only improves prediction accuracy, but also enhances the interpretability and innovativeness of drug design.
Affiliation(s)
- Xin Zhang
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
  - School of Medicine, Nankai University, Tianjin, 300071, China
- Huiyu Wang
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
- Chunyun Sun
  - Beijing Engineering Research Center of Protein and Antibody, Sinocelltech Ltd., Beijing, 100176, China
7
Mortazavi B, Molaei A, Fard NA. Multi-epitope vaccines, from design to expression; an in silico approach. Hum Immunol 2024; 85:110804. [PMID: 38658216 DOI: 10.1016/j.humimm.2024.110804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 04/02/2024] [Accepted: 04/15/2024] [Indexed: 04/26/2024]
Abstract
The development of vaccines against a wide range of infectious diseases and pathogens often relies on multi-epitope strategies that can effectively stimulate both humoral and cellular immunity. Immunoinformatics tools play a pivotal role in designing such vaccines, enhancing immune response potential, and minimizing the risk of failure. This review presents a comprehensive overview of practical tools for epitope prediction and the associated immune responses. These immunoinformatics tools facilitate the selection of epitopes based on parameters such as antigenicity, absence of toxic and allergenic sequences, secondary and tertiary structures, sequence conservation, and population coverage. The chosen epitopes can be tailored for B-cells or T-cells, both of which require further assessments covered in this study. We offer a range of suitable linkers that effectively separate cytotoxic T lymphocyte and helper T lymphocyte epitopes while preserving their functionality. Additionally, we identify various adjuvants for specific purposes. We delve into the evaluation of MHC-epitope interactions, MHC clusters, and the simulation of final constructs through molecular docking techniques. We provide diverse linkers and adjuvants optimized for epitope functions to bolster immune responses through epitope attachment. By leveraging these comprehensive tools, the development of multi-epitope vaccines holds the promise of robust immunity and a significant reduction in experimental costs.
Affiliation(s)
- Behnam Mortazavi
  - Department of Systems Biotechnology, Faculty of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Ali Molaei
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Najaf Allahyari Fard
  - Department of Systems Biotechnology, Faculty of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
8
Kumar N, Tripathi S, Sharma N, Patiyal S, Devi NL, Raghava GPS. A method for predicting linear and conformational B-cell epitopes in an antigen from its primary sequence. Comput Biol Med 2024; 170:108083. [PMID: 38295479 DOI: 10.1016/j.compbiomed.2024.108083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 12/26/2023] [Accepted: 01/27/2024] [Indexed: 02/02/2024]
Abstract
B cells are an essential component of the immune system and play a vital role in the immune response against pathogenic infection by producing antibodies. Existing methods predict either linear or conformational B-cell epitopes in an antigen. In this study, a single method was developed for predicting both types (linear/conformational) of B-cell epitopes. The dataset used in this study contains 3875 B-cell epitopes and 3996 non-B-cell epitopes, where B-cell epitopes consist of both linear and conformational B-cell epitopes. Our primary analysis indicates that certain residues (like Asp, Glu, Lys, and Asn) are more prominent in B-cell epitopes. We developed machine-learning based methods using different types of sequence composition and achieved the highest AUROC of 0.80 using dipeptide composition. In addition, models were developed on selected features, but no further improvement was observed. Our similarity-based method implemented using BLAST shows a high probability of correct prediction with poor sensitivity. Finally, we developed a hybrid model that combines alignment-free (dipeptide based random forest model) and alignment-based (BLAST-based similarity) models. Our hybrid model attained a maximum AUROC of 0.83 with an MCC of 0.49 on the independent dataset. Our hybrid model performs better than existing methods on an independent dataset used in this study. All models were trained and tested on 80% of the data using a cross-validation technique, and the final model was evaluated on the remaining 20% of the data, called an independent or validation dataset. A webserver and standalone package named "CLBTope" has been developed for predicting, designing, and scanning B-cell epitopes in an antigen sequence, available at https://webs.iiitd.edu.in/raghava/clbtope/.
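For readers unfamiliar with dipeptide composition, the following Python sketch shows one common way to compute it: the fraction of each of the 400 ordered pairs of standard amino acids among the adjacent residue pairs of a sequence. This is not the CLBTope code; the function name, the choice to skip pairs containing non-standard residues, and the example peptide are assumptions for illustration.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def dipeptide_composition(seq: str) -> List[float]:
    """Return a 400-dimensional vector whose entries are the fractions of
    adjacent residue pairs in `seq` equal to each dipeptide. Pairs that
    contain a non-standard letter are skipped (an assumption of this sketch)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Made-up peptide: its adjacent pairs are DE, EK, KD, DE, EK.
features = dipeptide_composition("DEKDEK")
print(len(features), features[DIPEPTIDES.index("DE")])
```

A vector like `features` could then be passed to any standard classifier, such as a random forest; the fixed length of 400 is what makes sequences of different lengths comparable.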
Affiliation(s)
- Nishant Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Sadhana Tripathi
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Naorem Leimarembi Devi
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
9
Habib A, Liang Y, Xu X, Zhu N, Xie J. Immunoinformatic Identification of Multiple Epitopes of gp120 Protein of HIV-1 to Enhance the Immune Response against HIV-1 Infection. Int J Mol Sci 2024; 25:2432. [PMID: 38397105 PMCID: PMC10889372 DOI: 10.3390/ijms25042432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 02/10/2024] [Accepted: 02/13/2024] [Indexed: 02/25/2024] Open
Abstract
Acquired Immunodeficiency Syndrome is caused by the Human Immunodeficiency Virus (HIV), and a significant number of fatalities occur annually. There is a dire need to develop an effective vaccine against HIV-1. Understanding the structural proteins of viruses helps in designing a vaccine based on immunogenic peptides. In the current experiment, we identified gp120 epitopes using bioinformatic epitope prediction tools, molecular docking, and MD simulations. The Gb-1 peptide was considered an adjuvant. Consecutive sequences of GTG, GSG, GGTGG, and GGGGS linkers were used to bind the B cell, Cytotoxic T Lymphocytes (CTL), and Helper T Lymphocytes (HTL) epitopes. The final vaccine construct consisted of 315 amino acids and is expected to be a recombinant protein of approximately 35.49 kDa. Based on docking experiments, molecular dynamics simulations, and tertiary structure validation, the analysis of the modeled protein indicates that it possesses a stable structure and can interact with Toll-like receptors. The analysis demonstrates that the proposed vaccine can provoke an immunological response by activating T and B cells, as well as stimulating the release of IgA and IgG antibodies. This vaccine shows potential for HIV-1 prophylaxis. The in-silico design suggests that multiple-epitope constructs can be used as potentially effective immunogens for HIV-1 vaccine development.
Affiliation(s)
- Arslan Habib
  - Laboratory of Molecular Immunology, State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200433, China
- Yulai Liang
  - Laboratory of Molecular Immunology, State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200433, China
- Xinyi Xu
  - Laboratory of Molecular Immunology, State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200433, China
- Naishuo Zhu
  - Laboratory of Molecular Immunology, State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200433, China
  - Institute of Biomedical Sciences, School of Life Sciences, Fudan University, Shanghai 200438, China
- Jun Xie
  - Laboratory of Molecular Immunology, State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200433, China
10
Liu F, Yuan C, Chen H, Yang F. Prediction of linear B-cell epitopes based on protein sequence features and BERT embeddings. Sci Rep 2024; 14:2464. [PMID: 38291341 PMCID: PMC10828400 DOI: 10.1038/s41598-024-53028-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 01/26/2024] [Indexed: 02/01/2024] Open
Abstract
Linear B-cell epitopes (BCEs) play a key role in the development of peptide vaccines and immunodiagnostic reagents. Therefore, the accurate identification of linear BCEs is of great importance in the prevention of infectious diseases and the diagnosis of related diseases. The experimental methods used to identify BCEs are both expensive and time-consuming, and they do not meet the demand for identification in large-scale protein sequence data. As a result, there is a need to develop an efficient and accurate computational method to rapidly identify linear BCE sequences. In this work, we developed the new linear BCE prediction method LBCE-BERT. This method is based on peptide chain sequence information and embedding information from the natural language model BERT, using an XGBoost classifier. The model was trained on three benchmark datasets for hyperparameter selection and was subsequently evaluated on several test datasets. The results indicate that our proposed method outperforms others in terms of AUROC and accuracy. The LBCE-BERT model is publicly available at: https://github.com/Lfang111/LBCE-BERT.
Affiliation(s)
- Fang Liu
  - School of Humanistic Medicine, Anhui Medical University, Hefei, 230032, Anhui, China
- ChengCheng Yuan
  - School of Biomedical Engineering, Anhui Medical University, Hefei, 230030, Anhui, China
- Haoqiang Chen
  - School of Humanistic Medicine, Anhui Medical University, Hefei, 230032, Anhui, China
- Fei Yang
  - School of Biomedical Engineering, Anhui Medical University, Hefei, 230030, Anhui, China
11
Zhang G, Su Z, Zhang T, Wu Y. Machine-learning-based Structural Analysis of Interactions between Antibodies and Antigens. bioRxiv 2023:2023.12.06.570397. [PMID: 38106177 PMCID: PMC10723427 DOI: 10.1101/2023.12.06.570397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Computational analysis of paratope-epitope interactions between antibodies and their corresponding antigens can facilitate our understanding of the molecular mechanism underlying humoral immunity and boost the design of new therapeutics for many diseases. The recent breakthrough in artificial intelligence has made it possible to predict protein-protein interactions and model their structures. Unfortunately, detecting antigen-binding sites associated with a specific antibody is still a challenging problem. To tackle this challenge, we implemented a deep learning model to characterize interaction patterns between antibodies and their corresponding antigens. With high accuracy, our model can distinguish between antibody-antigen complexes and other types of protein-protein complexes. More intriguingly, we can identify antigens from other common protein binding regions with an accuracy of higher than 70% even if we only have the epitope information. This indicates that antigens have distinct features on their surface that antibodies can recognize. Additionally, our model was unable to predict the partnerships between antibodies and their particular antigens. This result suggests that one antigen may be targeted by more than one antibody and that antibodies may bind to previously unidentified proteins. Taken together, our results support the precision of antibody-antigen interactions while also suggesting positive future progress in the prediction of specific pairing.
Affiliation(s)
- Grace Zhang
  - Staples High School, 70 North Avenue, Westport, CT 06880
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN, 37212
- Tom Zhang
  - California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461
12
Karagöz IK, Kaya M, Rückert R, Bozman N, Kaya V, Bayram H, Yıldırım M. A bioinformatic analysis: Previous allergen exposure may support anti-SARS-CoV-2 immune response. Comput Biol Chem 2023; 107:107961. [PMID: 37788543 DOI: 10.1016/j.compbiolchem.2023.107961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Revised: 09/17/2023] [Accepted: 09/18/2023] [Indexed: 10/05/2023]
Abstract
COVID-19, caused by infection with SARS-CoV-2, has become a global health problem due to significant mortality rates; the exact pathophysiological mechanism remains uncertain. Articles reporting patient data are quite heterogeneous and have several limitations. Surviving patients develop a CD4 and CD8 T-cell response to SARS-CoV-2 during COVID-19. Interestingly, pre-existing virus-reactive T cells have been found in patients who were not previously infected, suggesting some form of cross-reactivity or immunological mimicry. To better understand this phenomenon, we performed a bioinformatic study that aimed to identify antigenic structures that may explain the presence of such "reactive" T cells, which may support or modulate the immune response to SARS-CoV-2 infections. Seven different common environmental allergen epitopes identical to the SARS-CoV-2 S protein were identified that share affinity to 8 MHC-I-specific epitope regions. Pollen showed the greatest similarity with the S protein epitope. In the epitope similarity analysis between the S protein and MHC-II / T helper epitopes, the highest similarity was determined for mites. When the S protein that stimulates B cells and identical epitope antigens were examined, the most common allergens were hornbeam and wheat. The high epitope similarity observed between the allergens examined and the S protein epitopes suggests that these allergens may be a reason for pre-existing SARS-CoV-2-reactive T cells in previously non-infected subjects and that such previous exposure may affect the course of the disease in COVID-19 infection. It remains to be determined whether such pre-existing SARS-CoV-2-reactive cells can support the clearance of the virus or whether they, in contrast, may even aggravate the disease course. (Table 4, Ref 54).
Affiliation(s)
- Isıl Kutluturk Karagöz
- Umraniye Trn. And Rch. Hospital, Division of Ophthalmology, Istanbul, Turkey; Yıldız Technical University, Bioengineering Department, Istanbul, Turkey.
- Nazli Bozman
- Gaziantep University Arts and Science Faculty Department of Biology, Gaziantep, Turkey
- Vildan Kaya
- Medstar Antalya Hospital, Division of Radiation Oncology, Antalya, Turkey
- Halim Bayram
- Dr. Ersin Arslan Trn. And Rch Hospital, Division of Infection Diseases, Gaziantep, Turkey
- Mustafa Yıldırım
- Sanko University, School of Medicine, Internal Diseases, Division of Oncology, Gaziantep, Turkey
|
13
|
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
|
14
|
Abdollahi S, Raoufi Z. A novel vaccine candidate against A. baumannii based on a new OmpW family protein (OmpW2); structural characterization, antigenicity and epitope investigation, and in-vivo analysis. Microb Pathog 2023; 183:106317. [PMID: 37611777 DOI: 10.1016/j.micpath.2023.106317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 06/06/2023] [Accepted: 08/20/2023] [Indexed: 08/25/2023]
Abstract
A. baumannii is an MDR pathogen whose mortality rate in hospitalized patients has recently been increased by SARS-CoV-2. Investigating its virulence factors and designing a vaccine against this bacterium therefore seem critical. In this regard, the OmpW2 protein was structurally characterized in this study, and its B- and T-cell epitopes were mapped with bioinformatic tools. In-vivo analyses were employed to verify the immunogenicity of this protein and its selected epitopes. The results indicated that OmpW2 is a conserved virulent antigen, not toxic for the host, and not similar to the human or mouse proteome. A putative interaction between OmpW2 and a Fe-S-cluster redox enzyme was detected. Based on the results, OmpW2 belongs to the OmpW superfamily, and eight beta sheets were predicted in its tight beta-barrel structure. Several exposed epitopes were detected in the OmpW2 sequence and structure, and a potential sub-unit vaccine was generated based on the epitopes. The ELISA results indicated that after the second booster vaccination of BALB/c mice with the whole OmpW2 protein or its sub-unit fragment, the IgG titer rose significantly (p < 0.05). The mortality rate and the bacterial burden in the lung, liver, kidney, and spleen of both passively and actively immunized mice were significantly decreased (p ≤ 0.001). In-vivo experiments confirmed that the OmpW2 whole protein and its sub-unit fragment induce the host immune system and can be applied to design a commercial vaccine or diagnostic kit.
Affiliation(s)
- Sajad Abdollahi
- Department of Biology, Faculty of Basic Science, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran.
- Zeinab Raoufi
- Department of Biology, Faculty of Basic Science, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran
|
15
|
Angaitkar P, Aljrees T, Kumar Pandey S, Kumar A, Janghel RR, Sahu TP, Singh KU, Singh T. Inferring linear-B cell epitopes using 2-step metaheuristic variant-feature selection using genetic algorithm. Sci Rep 2023; 13:14593. [PMID: 37670007 PMCID: PMC10480427 DOI: 10.1038/s41598-023-41179-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 08/23/2023] [Indexed: 09/07/2023] Open
Abstract
Linear-B cell epitopes (LBCE) play a vital role in vaccine design; thus, efficiently detecting them from protein sequences is of primary importance. These epitopes consist of amino acids arranged in continuous or discontinuous patterns. Vaccines employ attenuated viruses and purified antigens. LBCE stimulate humoral immunity in the body, where B and T cells target circulating infections. To predict LBCE, the underlying protein sequences undergo a process of feature extraction, feature selection, and classification. Various system models have been proposed for this purpose, but their classification accuracy is only moderate. To enhance the accuracy of LBCE classification, this paper presents a novel 2-step metaheuristic variant-feature selection method that combines a linear support vector classifier (LSVC) with a Modified Genetic Algorithm (MGA). The feature selection model employs mono-peptide, dipeptide, and tripeptide features, focusing on the most diverse ones. The selected features are fed into a machine learning (ML)-based parallel ensemble classifier. The ensemble classifier combines correctly classified instances from various classifiers, including k-Nearest Neighbor (kNN), random forest (RF), logistic regression (LR), and support vector machine (SVM). The ensemble classifier achieved an accuracy of 99.3%, which is superior to that of the most recent state-of-the-art models for linear B-cell epitope classification. As a consequence, the entire system model can now be utilised effectively in real-time clinical settings.
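The mono-peptide, dipeptide, and tripeptide features named in this abstract are commonly computed as k-mer composition frequencies. The following is a minimal sketch under that assumption; the function names `kmer_composition` and `composition_vector` are hypothetical and are not taken from the paper, and the sketch does not reproduce the authors' feature selection or ensemble steps.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k):
    """Relative frequency of every possible k-mer over the 20 standard residues.

    k=1 gives mono-peptide (amino acid) composition, k=2 dipeptide
    composition, and k=3 tripeptide composition.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    windows = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing non-standard residues (e.g. 'X') are skipped.
    valid = [w for w in windows if all(c in AMINO_ACIDS for c in w)]
    counts = Counter(valid)
    total = len(valid)
    all_kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def composition_vector(sequence, ks=(1, 2, 3)):
    """Concatenate the k-mer compositions for each k into one feature vector."""
    vector = []
    for k in ks:
        vector.extend(kmer_composition(sequence, k).values())
    return vector


if __name__ == "__main__":
    features = composition_vector("ACDA")
    print(len(features))  # 20 + 400 + 8000 features
```

With the default `ks`, every peptide maps to a fixed-length vector of 20 + 400 + 8000 entries regardless of its length, which is what makes such encodings convenient inputs for a feature selection step.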
Affiliation(s)
- Pratik Angaitkar
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, 492010, Chhattisgarh, India
- Turki Aljrees
- College of Computer Science and Engineering, University of Hafr Al Batin, 39524, Hafar Al Batin, Saudi Arabia
- Saroj Kumar Pandey
- Department of Computer Engineering & Applications, GLA University, Mathura, India
- Ankit Kumar
- Department of Computer Engineering & Applications, GLA University, Mathura, India.
- Rekh Ram Janghel
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, 492010, Chhattisgarh, India
- Tirath Prasad Sahu
- Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, 492010, Chhattisgarh, India
- Teekam Singh
- Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, 248002, Uttarakhand, India
|
16
|
Zeng X, Bai G, Sun C, Ma B. Recent Progress in Antibody Epitope Prediction. Antibodies (Basel) 2023; 12:52. [PMID: 37606436 PMCID: PMC10443277 DOI: 10.3390/antib12030052] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2023] [Revised: 07/31/2023] [Accepted: 08/03/2023] [Indexed: 08/23/2023] Open
Abstract
Recent progress in epitope prediction has shown promising results in the development of vaccines and therapeutics against various diseases. However, the overall accuracy and success rate need to be improved greatly to be of practical significance, especially for conformational epitope prediction. In this review, we examined the general features of antibody-antigen recognition, highlighting the conformation selection mechanism in flexible antibody-antigen binding. We recently highlighted the success and warning signs of antibody epitope predictions, including linear and conformational epitope predictions. While deep learning-based models gradually outperform traditional feature-based machine learning, sequence and structure features still provide insight into antibody-antigen recognition problems.
Affiliation(s)
- Xincheng Zeng
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; (X.Z.); (C.S.)
- Ganggang Bai
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; (X.Z.); (C.S.)
- Chuance Sun
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; (X.Z.); (C.S.)
- Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; (X.Z.); (C.S.)
- Shanghai Digiwiser Biological, Inc., Shanghai 200131, China
|
17
|
Tanveerul Hassan M, Tayara H, To Chong K. Meta-IL4: An Ensemble Learning Approach for IL-4-Inducing Peptide Prediction. Methods 2023:S1046-2023(23)00113-5. [PMID: 37454743 DOI: 10.1016/j.ymeth.2023.07.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 10/18/2022] [Revised: 03/25/2023] [Accepted: 07/10/2023] [Indexed: 07/18/2023] Open
Abstract
The cytokine interleukin-4 (IL-4) plays an important role in our immune system. IL-4 leads the way in the differentiation of naïve T-helper 0 cells (Th0) to T-helper 2 cells (Th2). The Th2 responses are characterized by the release of IL-4. CD4+ T cells produce the cytokine IL-4 in response to exogenous parasites. IL-4 has a critical role in the growth of CD8+ cells, inflammation, and responses of T-cells. We propose an ensemble model for the prediction of IL-4 inducing peptides. Four feature encodings were extracted to build an efficient predictor: pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, quasi-sequence-order, and Shannon entropy. We developed an ensemble learning model fusion of random forest, extreme gradient boost, light gradient boosting machine, and extra tree classifier in the first layer, and a Gaussian process classifier as a meta classifier in the second layer. The outcome of the benchmarking testing dataset, with a Matthews correlation coefficient of 0.793, showed that the meta-model (Meta-IL4) outperformed individual classifiers. The highest accuracy achieved by the Meta-IL4 model is 90.70%. These findings suggest that peptides that induce IL-4 can be predicted with reasonable accuracy. These models could aid in the development of peptides that trigger the appropriate Th2 response.
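The abstract reports both a Matthews correlation coefficient (MCC) and an accuracy. As a brief illustration of how the two metrics relate to a binary confusion matrix, the sketch below computes both from raw counts. The counts in the example are invented for illustration and are not the paper's data; the function names are hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    counts = dict(tp=45, tn=45, fp=5, fn=5)
    print(accuracy(**counts))
    print(matthews_corrcoef(**counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes are imbalanced.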
Affiliation(s)
- Mir Tanveerul Hassan
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, South Korea
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju, South Korea.
- Kil To Chong
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, South Korea; Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, South Korea.
|
18
|
Shawan MMAK, Sharma AR, Halder SK, Arian TA, Shuvo MN, Sarker SR, Hasan MA. Advances in Computational and Bioinformatics Tools and Databases for Designing and Developing a Multi-Epitope-Based Peptide Vaccine. Int J Pept Res Ther 2023; 29:60. [PMID: 37251529 PMCID: PMC10203685 DOI: 10.1007/s10989-023-10535-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 05/11/2023] [Indexed: 05/31/2023]
Abstract
A vaccine is defined as a biologic preparation that trains the immune system, boosts immunity, and protects against a deadly microbial infection. Vaccines have been used for centuries to combat a variety of contagious illnesses by reducing the disease burden as well as eradicating the disease. Since infectious disease pandemics are a recurring global threat, vaccination has emerged as one of the most promising tools to save millions of lives and reduce infection rates. The World Health Organization reports that immunization protects three million individuals annually. Currently, multi-epitope-based peptide vaccines are a unique concept in vaccine formulation. Epitope-based peptide vaccines utilize small fragments of proteins or peptides (parts of the pathogen), called epitopes, that trigger an adequate immune response against a particular pathogen. However, conventional vaccine design and development techniques are too cumbersome, expensive, and time-consuming. With recent advances in the disciplines of bioinformatics, immunoinformatics, and vaccinomics, vaccine science has entered a new era with a modern, impressive, and more realistic paradigm for designing and developing next-generation strong immunogens. In silico design and development of a safe and novel vaccine construct involves knowledge of reverse vaccinology, various vaccine databases, and high-throughput techniques. The computational tools and techniques directly associated with vaccine research are extremely effective, economical, precise, robust, and safe for human use. Many vaccine candidates have entered clinical trials rapidly and become available ahead of schedule. In light of this, the present article provides researchers with up-to-date information on various approaches, protocols, and databases for the computational design and development of potent multi-epitope-based peptide vaccines, which can help them tailor vaccines more rapidly and cost-effectively.
Affiliation(s)
- Mohammad Mahfuz Ali Khan Shawan
- Department of Biochemistry and Molecular Biology, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
- Ashish Ranjan Sharma
- Institute for Skeletal Aging & Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252 Gangwon-do Republic of Korea
- Sajal Kumar Halder
- Department of Biochemistry and Molecular Biology, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
- Tawsif Al Arian
- Department of Pharmacy, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
- Md. Nazmussakib Shuvo
- Department of Botany, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
- Satya Ranjan Sarker
- Department of Biotechnology and Genetic Engineering, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
- Md. Ashraful Hasan
- Department of Biochemistry and Molecular Biology, Faculty of Biological Sciences, Jahangirnagar University, Savar, Dhaka, 1342 Bangladesh
|
19
|
Gu M, Jiao J, Liu S, Zhao W, Ge Z, Cai K, Xu L, He D, Zhang X, Qi X, Jiang W, Zhang P, Wang X, Hu S, Liu X. Monoclonal antibody targeting a novel linear epitope on nucleoprotein confers pan-reactivity to influenza A virus. Appl Microbiol Biotechnol 2023; 107:2437-2450. [PMID: 36820898 PMCID: PMC9947902 DOI: 10.1007/s00253-023-12433-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 01/20/2023] [Accepted: 02/08/2023] [Indexed: 02/24/2023]
Abstract
Nucleoprotein (NP) functions crucially in the replicative cycle of influenza A virus (IAV) by forming the ribonucleoprotein complex together with the PB2, PB1, and PA proteins. Owing to its high conservation, NP ranks as one of the hot targets for the design of universal diagnostic reagents and antiviral drugs for IAV. Here, we report an anti-NP murine monoclonal antibody (mAb), 5F10, prepared by the traditional lymphocyte hybridoma technique using a clade 2.3.4.4 H5N1 subtype avian influenza virus as the immunogen. The specificity of mAb 5F10 for the NP protein was confirmed by immunofluorescence assay and western blotting, and mAb 5F10 could be used in immunoprecipitation and immunohistochemistry assays. Importantly, mAb 5F10 possessed broad-spectrum reactivity against H1 to H11 subtypes of avian influenza viruses, including various HA clades of the H5Nx subtype. In addition, mAb 5F10 also showed good affinity for H1N1 and H3N2 subtype influenza viruses of swine and human origin. Furthermore, the antigenic epitope recognized by mAb 5F10 was identified, through screening of a phage display peptide library, as the conserved amino acid motif 81EHPSA85 in the second flexible loop region of the NP protein. Collectively, mAb 5F10, which recognizes a novel universal NP linear B-cell epitope of IAV of diverse origins and subtypes, will be a powerful tool for NP protein-based structural, functional, and mechanistic studies, as well as for the development of detection methods and universal vaccines for IAV. KEY POINTS: • A broad-spectrum mAb against various subtypes and sources of IAV was developed • The mAb possessed good reactivity in IFA, western blot, IP, and IHC assays • The mAb targeted a novel conserved linear B-cell epitope involving 81EHPSA85 on NP protein.
Affiliation(s)
- Min Gu
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, 225009 Jiangsu China
- Jiangsu Key Laboratory of Zoonoses, Yangzhou University, Yangzhou, 225009 Jiangsu China
- Jun Jiao
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Suhan Liu
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Wanchen Zhao
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Zhichuang Ge
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Kairui Cai
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Lijun Xu
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Dongchang He
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Xinyu Zhang
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Xian Qi
- Jiangsu Provincial Center for Disease Control and Prevention, Nanjing, 210009 China
- Wenming Jiang
- China Animal Health and Epidemiology Center, Qingdao, 266032 China
- Pinghu Zhang
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, 225009 Jiangsu China
- Jiangsu Key Laboratory of Zoonoses, Yangzhou University, Yangzhou, 225009 Jiangsu China
- Xiaoquan Wang
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, 225009 Jiangsu China
- Jiangsu Key Laboratory of Zoonoses, Yangzhou University, Yangzhou, 225009 Jiangsu China
- Shunlin Hu
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, 225009 Jiangsu China
- Jiangsu Key Laboratory of Zoonoses, Yangzhou University, Yangzhou, 225009 Jiangsu China
- Xiufan Liu
- Animal Infectious Diseases Laboratory, College of Veterinary Medicine, Yangzhou University, 48 East Wenhui Road, Yangzhou, 225009 Jiangsu China
- Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou, 225009 Jiangsu China
- Jiangsu Key Laboratory of Zoonoses, Yangzhou University, Yangzhou, 225009 Jiangsu China
|
20
|
Malik A, Shoombuatong W, Kim CB, Manavalan B. GPApred: The first computational predictor for identifying proteins with LPXTG-like motif using sequence-based optimal features. Int J Biol Macromol 2023; 229:529-538. [PMID: 36596370 DOI: 10.1016/j.ijbiomac.2022.12.315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 12/19/2022] [Accepted: 12/28/2022] [Indexed: 01/02/2023]
Abstract
The cell surface proteins of gram-positive bacteria are involved in many important biological functions, including the infection of host cells. Owing to their virulent nature, these proteins are also considered strong candidates for potential drug or vaccine targets. Among the various cell surface proteins of gram-positive bacteria, LPXTG-like proteins form a major class. These proteins have a highly conserved C-terminal cell wall sorting signal, which consists of an LPXTG sequence motif, a hydrophobic domain, and a positively charged tail. These surface proteins are targeted to the cell envelope by a sortase enzyme via transpeptidation. A variety of LPXTG-like proteins have been experimentally characterized; however, their number in public databases has increased owing to extensive bacterial genome sequencing without proper annotation. In the absence of experimental characterization, identifying and annotating these sequences is extremely challenging. Therefore, in this study, we developed the first machine learning-based predictor called GPApred, which can identify LPXTG-like proteins from their primary sequences. Using a newly constructed benchmark dataset, we explored different classifiers and five feature encodings and their hybrids. Optimal features were derived using the recursive feature elimination method, and these features were then trained using a support vector machine algorithm. The performance of different models was evaluated using independent datasets, and a final model (GPApred) was selected based on consistency during cross-validation and independent assessment. GPApred can be an effective tool for predicting LPXTG-like sequences and can be further employed for functional characterization or drug targeting. Availability: https://procarb.org/gpapred/.
Affiliation(s)
- Adeel Malik
- Institute of Intelligence Informatics Technology, Sangmyung University, Seoul 03016, Republic of Korea
- Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Chang-Bae Kim
- Department of Biotechnology, Sangmyung University, Seoul 03016, Republic of Korea.
- Balachandran Manavalan
- Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.
|
21
|
Liu Y, Liu Y, Wang S, Zhu X. LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings. Interdiscip Sci 2023; 15:293-305. [PMID: 36646842 DOI: 10.1007/s12539-023-00549-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 12/28/2022] [Accepted: 01/03/2023] [Indexed: 01/18/2023]
Abstract
Accurately detecting linear B-cell epitopes (BCEs) is of great importance in vaccine design, immunodiagnostic testing, antibody production, and disease prevention and treatment. Wet-lab experiments for determining linear BCEs are both expensive and laborious, and cannot meet the recognition needs of modern massive protein sequence data. Computational methods, in contrast, can identify linear BCEs efficiently and at low cost. Although several computational methods are available, their performance is still not satisfactory. Thus, we propose a new method, LBCE-XGB, to predict linear BCEs based on the XGBoost algorithm. To represent the biological information concealed in peptide sequences, the embeddings of the residues were obtained from a pre-trained domain-specific BERT model. In addition, five other types of attributes, including amino acid composition and amino acid antigenicity scale, were also extracted. The best feature combination was determined according to the cross-validation results. Compared with models developed with other deep learning and machine learning algorithms, LBCE-XGB achieves the top performance, with an AUROC of 0.845 for fivefold cross-validation. The results on the independent test set show that our model attains an AUROC of 0.838, which is substantially higher than that of other state-of-the-art methods. The outcomes indicate that BERT representations could be an effective feature in predicting linear BCEs, and we believe that LBCE-XGB could be a useful tool for detecting linear B-cell epitopes with high accuracy and low cost.
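Because the abstract reports its results as AUROC values, a short sketch of what that metric measures may be useful. The function below computes AUROC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counting one half. The function name `auroc` and the toy labels and scores are hypothetical and are not data from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = epitope, 0 = non-epitope).
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: three of the four positive/negative pairs are ordered correctly.
    print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise form is quadratic in the number of examples; it is chosen here for readability rather than speed, and a rank-based formulation gives the same value for large data sets.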
Affiliation(s)
- Yufeng Liu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Yinbo Liu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Shuyu Wang
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China
- Xiaolei Zhu
- School of Sciences, Anhui Agricultural University, Hefei, 230036, Anhui, China.
|
22
|
Qi Y, Zheng P, Huang G. DeepLBCEPred: A Bi-LSTM and multi-scale CNN-based deep learning method for predicting linear B-cell epitopes. Front Microbiol 2023; 14:1117027. [PMID: 36910218 PMCID: PMC9992402 DOI: 10.3389/fmicb.2023.1117027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 01/17/2023] [Indexed: 02/24/2023] Open
Abstract
The epitope is the site where antigens and antibodies interact and is vital to understanding the immune system. Experimental identification of linear B-cell epitopes (BCEs) is expensive, labor-intensive, and low-throughput. Although a few computational methods have been proposed to address this challenge, there is still a long way to go before practical application. We proposed a deep learning method called DeepLBCEPred for predicting linear BCEs, which consists of bi-directional long short-term memory (Bi-LSTM), feed-forward attention, and multi-scale convolutional neural networks (CNNs). We extensively tested the performance of DeepLBCEPred through cross-validation on the training dataset and independent tests on two testing datasets. The empirical results showed that DeepLBCEPred obtained state-of-the-art performance. We also investigated the contribution of different deep learning elements to recognizing linear BCEs. In addition, we have developed a user-friendly web application for linear BCE prediction, which is freely available to all scientific researchers at: http://www.biolscience.cn/DeepLBCEPred/.
Affiliation(s)
- Yue Qi
- School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
| | - Peijie Zheng
- School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
| | - Guohua Huang
- School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
| |
|
23
|
Zheng D, Liang S, Zhang C. B-Cell Epitope Predictions Using Computational Methods. Methods Mol Biol 2023; 2552:239-254. [PMID: 36346595 DOI: 10.1007/978-1-0716-2609-2_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Identifying protein antigenic epitopes that are recognizable by antibodies is a key step in immunologic research. This type of research has broad medical applications, such as new immunodiagnostic reagent discovery, vaccine design, and antibody design. However, due to the countless possibilities of potential epitopes, the experimental search through trial and error would be too costly and time-consuming to be practical. To facilitate this process and improve its efficiency, computational methods were developed to predict both linear epitopes and discontinuous antigenic epitopes. For linear B-cell epitope prediction, many methods were developed, including PREDITOP, PEOPLE, BEPITOPE, BepiPred, COBEpro, ABCpred, AAP, BCPred, BayesB, BEOracle/BROracle, BEST, LBEEP, DRREP, iBCE-EL, SVMTriP, etc. For the more challenging yet important task of discontinuous epitope prediction, methods were also developed, including CEP, DiscoTope, PEPITO, ElliPro, SEPPA, EPITOPIA, PEASE, EpiPred, SEPIa, EPCES, EPSVR, etc. In this chapter, we will discuss computational methods for B-cell epitope predictions of both linear and discontinuous epitopes. SVMTriP and EPCES/EPSVR, the most successful among the methods for each type of prediction, will be used as model methods to detail the standard protocols. For linear epitope prediction, SVMTriP was reported to achieve a sensitivity of 80.1% and a precision of 55.2% with a fivefold cross-validation based on a large dataset, yielding an AUC of 0.702. For discontinuous or conformational B-cell epitope prediction, EPCES and EPSVR were both benchmarked by a curated independent test dataset in which all antigens had no complex structures with the antibody. The epitopes identified by these methods were later independently validated by various biochemical experiments. For these three model methods, webservers and all datasets are publicly available at http://sysbio.unl.edu/SVMTriP, http://sysbio.unl.edu/EPCES/, and http://sysbio.unl.edu/EPSVR/.
Affiliation(s)
- Dandan Zheng
- Department of Radiation Oncology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
| | - Shide Liang
- Department of Research and Development, Bio-Thera Solutions, Guangzhou, China.
| | - Chi Zhang
- School of Biological Sciences, University of Nebraska, Lincoln, NE, USA.
| |
|
24
|
Xu Z, Ismanto HS, Zhou H, Saputri DS, Sugihara F, Standley DM. Advances in antibody discovery from human BCR repertoires. Front Bioinform 2022; 2:1044975. [PMID: 36338807 PMCID: PMC9631452 DOI: 10.3389/fbinf.2022.1044975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022] Open
Abstract
Antibodies make up an important and growing class of compounds used for the diagnosis or treatment of disease. While traditional antibody discovery utilized immunization of animals to generate lead compounds, technological innovations have made it possible to search for antibodies targeting a given antigen within the repertoires of B cells in humans. Here we group these innovations into four broad categories: cell sorting allows the collection of cells enriched in specificity to one or more antigens; BCR sequencing can be performed on bulk mRNA, genomic DNA, or paired (heavy-light) mRNA; BCR repertoire analysis generally involves clustering BCRs into specificity groups or more in-depth modeling of antibody-antigen interactions, such as antibody-specific epitope predictions; validation of antibody-antigen interactions requires expression of antibodies, followed by antigen binding assays or epitope mapping. Together with innovations in deep learning, these technologies will contribute to the future discovery of diagnostic and therapeutic antibodies directly from humans.
Affiliation(s)
- Zichang Xu
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
| | - Hendra S. Ismanto
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
| | - Hao Zhou
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
| | - Dianita S. Saputri
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
| | - Fuminori Sugihara
- Core Instrumentation Facility, Immunology Frontier Research Center, Osaka University, Suita, Japan
| | - Daron M. Standley
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Department Systems Immunology, Immunology Frontier Research Center, Osaka University, Suita, Japan
| |
|
25
|
Islam SI, Sanjida S, Ahmed SS, Almehmadi M, Allahyani M, Aljuaid A, Alsaiari AA, Halawi M. Core Proteomics and Immunoinformatic Approaches to Design a Multiepitope Reverse Vaccine Candidate against Chagas Disease. Vaccines (Basel) 2022; 10:1669. [PMID: 36298534 PMCID: PMC9607777 DOI: 10.3390/vaccines10101669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 09/23/2022] [Accepted: 10/02/2022] [Indexed: 11/05/2022] Open
Abstract
Chagas disease is a tropical ailment indigenous to South America and caused by the protozoan parasite Trypanosoma cruzi, which has serious health consequences globally. Insect vectors transmit the parasite and, due to the lack of vaccine availability and limited treatment options, we implemented an integrated core proteomics analysis to design a reverse vaccine candidate based on immune epitopes for disease control. Firstly, T. cruzi core proteomics was used to identify immunodominant epitopes. Therefore, we designed the vaccine sequence to be non-allergic, antigenic, immunogenic, and to have better solubility. After predicting the tertiary structure, docking and molecular dynamics simulation (MDS) were performed with TLR4, MHC-I, and MHC-II receptors to discover the binding affinities. The final vaccine design demonstrated significant hydrogen bond interactions upon docking with TLR4, MHC-I, and MHC-II receptors. This indicated the efficacy of the vaccine candidate. A server-based immune simulation approach was generated to predict the efficacy. Significant structural compactness and binding stability were found based on MDS. Finally, by optimizing codons on Escherichia coli K12, a high GC content and CAI value were obtained, which were then incorporated into the cloning vector pET2+ (a). Thus, the developed vaccine sequence may be a viable therapy option for Chagas disease.
Affiliation(s)
- Sk Injamamul Islam
- The International Graduate Program of Veterinary Science and Technology (VST), Department of Veterinary Microbiology, Faculty of Veterinary Science and Technology, Chulalongkorn University, Bangkok 10330, Thailand
| | - Saloa Sanjida
- Department of Environmental Science and Technology, Faculty of Applied Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
| | - Sheikh Sunzid Ahmed
- Department of Botany, Faculty of Biological Sciences, University of Dhaka, Dhaka 1000, Bangladesh
| | - Mazen Almehmadi
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia
| | - Mamdouh Allahyani
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia
| | - Abdulelah Aljuaid
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia
| | - Ahad Amer Alsaiari
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia
| | - Mustafa Halawi
- Department of Medical Laboratory Technology, College of Applied Medical Sciences, Jazan University, Jazan 54943, Saudi Arabia
| |
|
26
|
Xu H, Zhao Z. NetBCE: An Interpretable Deep Neural Network for Accurate Prediction of Linear B-cell Epitopes. Genomics Proteomics Bioinformatics 2022; 20:1002-1012. [PMID: 36526218 PMCID: PMC10025766 DOI: 10.1016/j.gpb.2022.11.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/05/2022] [Revised: 10/27/2022] [Accepted: 11/11/2022] [Indexed: 12/15/2022]
Abstract
Identification of B-cell epitopes (BCEs) plays an essential role in the development of peptide vaccines and immuno-diagnostic reagents, as well as antibody design and production. In this work, we generated a large benchmark dataset comprising 124,879 experimentally supported linear epitope-containing regions in 3567 protein clusters from over 1.3 million B cell assays. Analysis of this curated dataset showed large pathogen diversity covering 176 different families. The accuracy in linear BCE prediction was found to vary strongly with different features, while all sequence-derived and structural features were informative. To search for more efficient and interpretable feature representations, a ten-layer deep learning framework for linear BCE prediction, namely NetBCE, was developed. NetBCE achieved high accuracy and robust performance, with an average area under the curve (AUC) value of 0.8455 in five-fold cross-validation, by automatically learning the informative classification features. NetBCE substantially outperformed the conventional machine learning algorithms and other tools, with more than 22.06% improvement in AUC value compared to other tools on an independent dataset. Investigation of the output of important network modules in NetBCE showed that epitopes and non-epitopes tended to be presented in distinct regions with efficient feature representation along the network layer hierarchy. NetBCE is freely available at https://github.com/bsml320/NetBCE.
Affiliation(s)
- Haodong Xu
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.
| |
|
27
|
Shashkova TI, Umerenkov D, Salnikov M, Strashnov PV, Konstantinova AV, Lebed I, Shcherbinin DN, Asatryan MN, Kardymon OL, Ivanisenko NV. SEMA: Antigen B-cell conformational epitope prediction using deep transfer learning. Front Immunol 2022; 13:960985. [PMID: 36189325 PMCID: PMC9523212 DOI: 10.3389/fimmu.2022.960985] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 06/03/2022] [Accepted: 08/23/2022] [Indexed: 11/13/2022] Open
Abstract
One of the primary tasks in vaccine design and development of immunotherapeutic drugs is to predict conformational B-cell epitopes corresponding to primary antibody binding sites within the antigen tertiary structure. To date, multiple approaches have been developed to address this issue. However, for a wide range of antigens their accuracy is limited. In this paper, we applied the transfer learning approach using pretrained deep learning models to develop a model that predicts conformational B-cell epitopes based on the primary antigen sequence and tertiary structure. A pretrained protein language model, ESM-1v, and an inverse folding model, ESM-IF1, were fine-tuned to quantitatively predict antibody-antigen interaction features and distinguish between epitope and non-epitope residues. The resulting model called SEMA demonstrated the best performance on an independent test set with ROC AUC of 0.76 compared to peer-reviewed tools. We show that SEMA can quantitatively rank the immunodominant regions within the SARS-CoV-2 RBD domain. SEMA is available at https://github.com/AIRI-Institute/SEMAi and the web-interface http://sema.airi.net.
Affiliation(s)
| | | | | | | | | | - Ivan Lebed
- AI Center Block Services, Sber, Moscow, Russia
| | - Dmitriy N. Shcherbinin
- Federal Research Centre of Epidemiology and Microbiology named after Honorary Academician N. F. Gamaleya, Ministry of Health, Moscow, Russia
| | - Marina N. Asatryan
- Federal Research Centre of Epidemiology and Microbiology named after Honorary Academician N. F. Gamaleya, Ministry of Health, Moscow, Russia
| | | | - Nikita V. Ivanisenko
- Artificial Intelligence Research Institute, Moscow, Russia
- Laboratory of Computational Proteomics, Institute of Cytology and Genetics Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- *Correspondence: Nikita V. Ivanisenko
| |
|
28
|
Sun Q, Huang Z, Yang S, Li Y, Ma Y, Yang F, Zhang Y, Xu F. Bioinformatics-based SARS-CoV-2 epitopes design and the impact of Spike protein mutants on epitope humoral immunities. Immunobiology 2022; 227:152287. [PMID: 36244092 PMCID: PMC9516880 DOI: 10.1016/j.imbio.2022.152287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 09/10/2022] [Accepted: 09/26/2022] [Indexed: 11/15/2022]
Abstract
Background: Epitope selection is the key to peptide vaccine development. Bioinformatics tools can efficiently improve the screening of antigenic epitopes and help to choose the right ones. Objective: To predict, synthesize, and test peptide epitopes on the spike protein and to assess the effect of mutations on epitope humoral immunity, thus providing clues for the design and development of epitope peptide vaccines against SARS-CoV-2. Methods: Bioinformatics servers and immunological tools were used to identify the helper T lymphocyte, cytotoxic T lymphocyte, and linear B lymphocyte epitopes on the S protein of SARS-CoV-2. Physicochemical properties of candidate epitopes were analyzed using IEDB, VaxiJen, and AllerTOP online software. Three candidate epitopes were synthesized and their antigenic responses were evaluated by binding antibody detection. Results: A total of 20 antigenic, non-toxic and non-allergenic candidate epitopes were identified from 1502 epitopes, including 6 helper T-cell epitopes, 13 cytotoxic T-cell epitopes, and 1 linear B cell epitope. After immunization of rabbits with antigen containing candidate epitopes S206-221, S403-425, and S1157-1170, the binding titers of serum antibody to the corresponding peptide, S protein, and receptor-binding domain protein were (415044, 2582, 209.3), (852819, 45238, 457767) and (357897, 10528, 13.79), respectively. The binding titers to Omicron S protein were 642, 12,878 and 7750, respectively, showing that the N211L, DEL212 and K417N mutations reduce antibody binding activity. Conclusions: Bioinformatic methods are effective in peptide epitope design. Certain mutations of the Omicron variant would lead to the loss of antibody affinity to Omicron S protein.
Affiliation(s)
- Qi Sun
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Zhuanqing Huang
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Sen Yang
- Medical School of Chinese PLA, Beijing, China; Chinese People's Armed Police Force Hospital of Beijing, Beijing, China
| | - Yuanyuan Li
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Yue Ma
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Fei Yang
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Ying Zhang
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China
| | - Fenghua Xu
- Pharmaceutical Sciences Research Division, Department of Pharmacy, Medical Supplies Centre of PLA General Hospital, Beijing, China.
| |
|
29
|
Sahu TK, Meher PK, Choudhury NK, Rao AR. A comparative analysis of amino acid encoding schemes for the prediction of flexible length linear B-cell epitopes. Brief Bioinform 2022; 23:6673853. [PMID: 35998895 DOI: 10.1093/bib/bbac356] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/23/2022] [Revised: 07/06/2022] [Accepted: 07/30/2022] [Indexed: 11/12/2022] Open
Abstract
Linear B-cell epitopes have a prominent role in the development of peptide-based vaccines and disease diagnosis. High variability in the length of these epitopes is a major reason for low accuracy in their prediction. Most of the B-cell epitope prediction methods considered fixed length of epitope sequences and achieved good accuracy. Though a number of tools are available for the prediction of flexible length linear B-cell epitopes with reasonable accuracy, further improvement in the prediction performance is still expected. Thus, here we made an attempt to analyze the performance of machine learning approaches (MLA) with 18 different amino acid encoding schemes in the prediction of flexible length linear B-cell epitopes. We considered B-cell epitope sequences of variable lengths (11-56 amino acids) from well-established public resources. The performances of machine learning algorithms with the encoded epitope sequence datasets were evaluated. Besides, the feasible combinations of encoding schemes were also explored and analyzed. The results revealed that amino-acid composition (AC) and distribution component of composition-transition-distribution encoding schemes are suitable for heterogeneous epitope data, whereas amino-acid-anchoring-pair-composition (APC), dipeptide-composition and amino-acids-pair-propensity-scale (APP) are more appropriate for homogeneous data. Further, two combinations of peptide encoding schemes, i.e. APC + AC and APC + APP with random forest classifier were identified to have improved performance over the state-of-the-art tools for flexible length linear B-cell epitope prediction. The study also revealed better performance of random forest over other considered MLAs in the prediction of flexible length linear B-cell epitopes.
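Two of the encoding schemes named in this abstract, amino-acid composition and dipeptide composition, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the residue ordering, and the example peptides are assumptions made for illustration. The point it shows is that both encodings map a peptide of any length to a fixed-length vector, which is what makes them usable for flexible-length epitopes.

```python
from itertools import product

# The 20 standard amino acids in a fixed order (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard amino acid in seq (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Non-standard residues count toward the length but get no slot of their own.
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among overlapping pairs (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(dp) / len(pairs) for dp in DIPEPTIDES]


# Hypothetical peptides of different lengths yield vectors of equal length.
short_vec = amino_acid_composition("ACDEFGHIKLM")
long_vec = amino_acid_composition("ACDEFGHIKLMNPQRSTVWYACDEFG")
```

Concatenating such vectors is one simple way to form a combined encoding before passing it to a classifier such as a random forest.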
Affiliation(s)
- Tanmaya Kumar Sahu
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India; ICAR-National Bureau of Plant Genetic Resources, New Delhi, India
| | | | | | - Atmakuri Ramakrishna Rao
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India; Indian Council of Agricultural Research (ICAR), New Delhi, India
| |
|
30
|
Considering epitopes conservity in targeting SARS-CoV-2 mutations in variants: a novel immunoinformatics approach to vaccine design. Sci Rep 2022; 12:14017. [PMID: 35982065 PMCID: PMC9386201 DOI: 10.1038/s41598-022-18152-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/10/2022] [Accepted: 08/05/2022] [Indexed: 11/08/2022] Open
Abstract
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has gained mutations at an alarming rate in the past years. Developing mutations can increase the virus's pathogenicity and virulence, reduce the efficacy of vaccines and antibody neutralization, and even challenge adaptive immunity. It is therefore essential to identify conserved epitopes (with fewer mutations) with appropriate antigenicity across different variants, so that the variants can be targeted by an appropriate vaccine design. A total of 3369 SARS-CoV-2 genomes were collected from the Global Initiative on Sharing Avian Influenza Data (GISAID). Then, mutations in the immunodominant regions (IDRs), immune epitope database (IEDB) epitopes, and predicted epitopes were calculated. Next, an epitope conservity score was derived from the total number of events (mutations) and the number of mutated sites in each epitope; these criteria were weighted by Shannon entropy and then combined by the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). The epitopes were plotted by TOPSIS conservity score and antigenicity score. The results demonstrate that almost all epitopes and IDRs of various lengths have gained different numbers of mutations at different sites. Our two-step calculation for conservity recommends only 8 IDRs, 14 IEDB epitopes, and 10 predicted epitopes among all epitopes. The selected ones have higher conservity and higher immunogenicity. This method is an open-source multi-criteria decision-making platform, which provides a scientific approach to selecting epitopes with appropriate conservity and immunogenicity against ever-changing viruses.
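The entropy-weighted TOPSIS ranking described in this abstract can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: the function names, the vector normalization, and the example counts are assumptions. Both criteria (mutation events and mutated sites) are treated as cost criteria, so fewer mutations give a higher score.

```python
import math


def entropy_weights(matrix):
    """Shannon-entropy weights for the columns of a non-negative decision matrix."""
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / math.log(m)  # scales each column entropy into [0, 1]; needs m >= 2
    divergences = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        entropy = -k * sum((v / total) * math.log(v / total) for v in col if v > 0)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    # Fall back to equal weights if no column carries any contrast.
    return [d / s for d in divergences] if s > 0 else [1.0 / n] * n


def topsis(matrix, weights, benefit):
    """Closeness of each row to the ideal solution (1.0 = best, 0.0 = worst)."""
    n = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    ideal, anti = [], []
    for j in range(n):
        col = [row[j] for row in weighted]
        best, worst = (max(col), min(col)) if benefit[j] else (min(col), max(col))
        ideal.append(best)
        anti.append(worst)
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, ideal), math.dist(row, anti)
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom else 0.0)
    return scores


# Hypothetical counts per epitope: [mutation events, mutated sites].
counts = [[1, 1], [5, 4], [10, 8]]
weights = entropy_weights(counts)
scores = topsis(counts, weights, benefit=[False, False])
```

The resulting scores could then be plotted against an antigenicity score, as the abstract describes, to pick epitopes that rank well on both.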
|
31
|
Prediction of B cell epitopes in proteins using a novel sequence similarity-based method. Sci Rep 2022; 12:13739. [PMID: 35962028 PMCID: PMC9374694 DOI: 10.1038/s41598-022-18021-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/10/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022] Open
Abstract
Prediction of B cell epitopes that can replace the antigen for antibody production and detection is of great interest for research and the biotech industry. Here, we developed a novel BLAST-based method to predict linear B cell epitopes. To that end, we generated a BLAST-formatted database upon a dataset of 62,730 known linear B cell epitope sequences and considered as a B cell epitope any peptide sequence producing ungapped BLAST hits to this database with identity ≥ 80% and length ≥ 8. We examined B cell epitope predictions by this method in tenfold cross-validations in which we considered various types of non-B cell epitopes, including 62,730 peptide sequences with verified negative B cell assays. As a result, we obtained values of accuracy, specificity and sensitivity of 72.54 ± 0.27%, 81.59 ± 0.37% and 63.49 ± 0.43%, respectively. In an independent dataset incorporating 503 B cell epitopes, this method reached accuracy, specificity and sensitivity of 74.85%, 99.20% and 50.50%, respectively, outperforming state-of-the-art methods to predict linear B cell epitopes. We implemented this BLAST-based approach to predict B cell epitopes at http://imath.med.ucm.es/bepiblast.
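The decision rule stated in this abstract (ungapped hit, identity ≥ 80%, length ≥ 8) can be expressed as a small filter. The sketch below assumes that hits are available in BLAST's standard 12-column tabular output, which the abstract does not specify; the function names and the example rows are hypothetical.

```python
def is_epitope_hit(pident, length, gapopen, min_identity=80.0, min_length=8):
    """Apply the abstract's thresholds to a single BLAST hit."""
    return gapopen == 0 and pident >= min_identity and length >= min_length


def predicted_epitope_queries(tabular_lines):
    """Return IDs of query peptides with at least one qualifying hit.

    Assumes the default tabular column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        qseqid = fields[0]
        pident, length, gapopen = float(fields[2]), int(fields[3]), int(fields[5])
        if is_epitope_hit(pident, length, gapopen):
            hits.add(qseqid)
    return hits


# Hypothetical rows: pep1 qualifies, pep2 is too short, pep3 contains a gap.
example = [
    "pep1\tepi42\t87.5\t8\t1\t0\t1\t8\t3\t10\t0.5\t20.0",
    "pep2\tepi07\t100.0\t6\t0\t0\t1\t6\t1\t6\t2.0\t15.0",
    "pep3\tepi13\t90.0\t10\t0\t1\t1\t10\t1\t11\t0.1\t22.0",
]
predicted = predicted_epitope_queries(example)
```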
|
32
|
Shi H, Zhang S, Li X. R5hmCFDV: computational identification of RNA 5-hydroxymethylcytosine based on deep feature fusion and deep voting. Brief Bioinform 2022; 23:6658858. [PMID: 35945157 DOI: 10.1093/bib/bbac341] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/15/2022] [Revised: 07/17/2022] [Accepted: 07/25/2022] [Indexed: 11/13/2022] Open
Abstract
RNA 5-hydroxymethylcytosine (5hmC) is a kind of RNA modification that is related to the life activities of many organisms. Studying its distribution is very important for revealing its biological function. Previously, high-throughput sequencing was used to identify 5hmC, but it is expensive and inefficient. Therefore, machine learning is used to identify 5hmC sites. Here, we design a model called R5hmCFDV, which is divided into three main parts: feature representation, feature fusion and classification. (i) Pseudo dinucleotide composition, dinucleotide binary profile and frequency, natural vector and physicochemical property are used to extract features from four aspects: nucleotide composition, coding, natural language and physical and chemical properties. (ii) To strengthen the relevance of features, we construct a novel feature fusion method. First, the attention mechanism is employed to process the four single features, stitch them together and feed them to the convolution layer. After that, the output data are processed by BiGRU and BiLSTM, respectively. Finally, the features of these two parts are fused by the multiply function. (iii) We design the deep voting algorithm for classification by imitating the soft voting mechanism in the Python package. The base classifiers comprise a deep neural network (DNN), a convolutional neural network (CNN) and an improved gated recurrent unit (GRU). Following the principle of soft voting, weights are assigned to the predicted probabilities of the three classifiers. The predicted probability values are multiplied by the corresponding weights and then summed to obtain the final prediction results. We use 10-fold cross-validation to evaluate the model, and the evaluation indicators are significantly improved. The prediction accuracy on the two datasets is as high as 95.41% and 93.50%, respectively. This demonstrates the stronger competitiveness and generalization performance of our model. In addition, all datasets and source codes can be found at https://github.com/HongyanShi026/R5hmCFDV.
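The weighted soft-voting step described in this abstract (multiply each classifier's predicted probabilities by its weight, then sum) can be sketched in a few lines. The function name, the weights, and the probabilities below are hypothetical; the abstract gives neither the actual weights nor the package it imitates.

```python
def soft_vote(probabilities, weights):
    """Combine per-classifier class probabilities by a weighted sum.

    probabilities: one list of class probabilities per base classifier.
    weights: one weight per base classifier.
    Returns the combined per-class scores and the index of the winning class.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    n_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return combined, label


# Hypothetical outputs of three base classifiers for one site: [P(negative), P(positive)].
probs = [[0.4, 0.6], [0.7, 0.3], [0.2, 0.8]]
combined, label = soft_vote(probs, weights=[0.5, 0.3, 0.2])
```

When the weights sum to 1, the combined scores remain a valid probability distribution over the classes.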
Affiliation(s)
- Hongyan Shi
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, P. R. China
| | - Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, P. R. China
| | - Xinjie Li
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, P. R. China
| |
|
33
|
Islam SI, Mou MJ, Sanjida S. Application of reverse vaccinology to design a multi-epitope subunit vaccine against a new strain of Aeromonas veronii. J Genet Eng Biotechnol 2022; 20:118. [PMID: 35939149 PMCID: PMC9358925 DOI: 10.1186/s43141-022-00391-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Accepted: 07/04/2022] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Aeromonas veronii is one of the most common pathogens of freshwater fishes and causes sepsis and ulcers. An increasing number of cases show that it is a significant zoonotic and aquatic agent. Epidemiological studies have shown that A. veronii virulence and drug tolerance have both increased over the last few years. Cadaverine reverse transporter (CadB) and maltoporin (LamB protein) contribute to the virulence of A. veronii TH0426. The TH0426 strain is currently causing severe cases in fish species, and its resistance to therapeutics has been increasing. Despite these devastating complications, there is still no effective cure or vaccine for this strain of A. veronii. RESULTS: An immunoinformatic method was used to generate an epitope-based vaccine against this pathogen. The immunodominant epitopes were identified using the CadB and LamB proteins of A. veronii. The final vaccine sequence was designed to be immunogenic and non-allergenic and to have better solubility. Molecular dynamics simulation revealed significant binding stability and structural compactness. Finally, using Escherichia coli K12 as a model, codon optimization yielded ideal GC content and a higher CAI value, and the optimized sequence was then inserted into the cloning vector pET2+ (a). CONCLUSION: Altogether, our outcomes imply that the proposed peptide vaccine might be a good option for A. veronii TH0426 prophylaxis.
Affiliation(s)
- Sk Injamamul Islam
- Department of Fisheries and Marine Bioscience, Faculty of Biological Science, Jashore University of Science and Technology, Jashore, 7408, Bangladesh.
- Center of Excellence in Fish Infectious Diseases (CE FID), Department of Veterinary Microbiology, Faculty of Veterinary Science, Chulalongkorn University, Bangkok, 10330, Thailand.
- The International Graduate Program of Veterinary Science and Technology (VST), Department of Veterinary Microbiology, Faculty of Veterinary Science and Technology, Chulalongkorn University, Bangkok, 10330, Thailand.
| | - Moslema Jahan Mou
- Department of Genetic Engineering and Biotechnology, Faculty of Life and Earth Science, University of Rajshahi, Rajshahi, Bangladesh
| | - Saloa Sanjida
- Department of Environmental Science and Technology, Faculty of Applied Science and Technology, Jashore University of Science and Technology, Jashore, 7408, Bangladesh
| |
|
34
|
Yin R, Zhu X, Zeng M, Wu P, Li M, Kwoh CK. A framework for predicting variable-length epitopes of human-adapted viruses using machine learning methods. Brief Bioinform 2022; 23:6645487. [PMID: 35849093 DOI: 10.1093/bib/bbac281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 06/16/2022] [Accepted: 06/17/2022] [Indexed: 11/14/2022] Open
Abstract
The coronavirus disease 2019 pandemic has alerted people to the threat posed by viruses. Vaccination is the most effective way to prevent the disease from spreading. The interaction between antibodies and antigens clears the infectious organisms from the host. Identifying B-cell epitopes is critical in vaccine design, development of disease diagnostics and antibody production. However, traditional experimental methods to determine epitopes are time-consuming and expensive, and the predictive performance of existing in silico methods is not satisfactory. This paper develops a general framework to predict variable-length linear B-cell epitopes specific to human-adapted viruses with machine learning approaches based on the ProtVec representation of peptides and the physicochemical properties of amino acids. QR decomposition is incorporated during the embedding process, which enables our models to handle variable-length sequences. Experimental results on large immune epitope datasets show that our proposed model's performance is superior to the state-of-the-art methods in terms of AUROC (0.827) and AUPR (0.831) on the testing set. Moreover, sequence analysis also provides the viral category of the corresponding predicted epitopes with high precision. Therefore, this framework is shown to reliably identify linear B-cell epitopes of human-adapted viruses given protein sequences and could provide assistance for potential future pandemics and epidemics.
Affiliation(s)
- Rui Yin
- Department of Biomedical Informatics, Harvard Medical School, Boston, USA
- Xianghe Zhu
- Department of Statistics, University of Oxford, Oxford, UK
- Min Zeng
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Pengfei Wu
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
35
Bioinformatics, Computational Informatics, and Modeling Approaches to the Design of mRNA COVID-19 Vaccine Candidates. Computation 2022. [DOI: 10.3390/computation10070117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/22/2022]
Abstract
This article is devoted to applying bioinformatics and immunoinformatics approaches for the development of a multi-epitope mRNA vaccine against the spike glycoproteins of circulating SARS-CoV-2 variants in selected African countries. The study’s relevance is dictated by the fact that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) began its global threat at the end of 2019 and since then has had a devastating impact on the whole world. Measures to reduce threats from the pandemic include social restrictions, restrictions on international travel, and vaccine development. In most cases, vaccine development depends on the spike glycoprotein, which serves as a medium for its entry into host cells. Although several variants of SARS-CoV-2 have emerged from mutations crossing continental boundaries, about 6000 delta variants have been reported along the coast of more than 20 countries in Africa, with South Africa accounting for the highest percentage. This also applies to the omicron variant of the SARS-CoV-2 virus in South Africa. The authors suggest that bioinformatics and immunoinformatics approaches be used to develop a multi-epitope mRNA vaccine against the spike glycoproteins of circulating SARS-CoV-2 variants in selected African countries. Various immunoinformatics tools have been used to predict T- and B-lymphocyte epitopes. The epitopes were further subjected to multiple evaluations to select epitopes that could elicit a sustained immunological response. The candidate vaccine consisted of seven epitopes, a highly immunogenic adjuvant, an MHC I-targeting domain (MITD), a signal peptide, and linkers. The molecular weight (MW) was predicted to be 223.1 kDa, well above the acceptable threshold of 110 kDa on an excellent vaccine candidate. In addition, the results showed that the candidate vaccine was antigenic, non-allergenic, non-toxic, thermostable, and hydrophilic. 
The vaccine candidate has good population coverage, with the highest coverage in East Africa (80.44%) followed by South Africa (77.23%). West Africa and North Africa have 76.65% and 76.13%, respectively, while Central Africa (75.64%) has the lowest coverage. Among the seven epitopes, no mutations were observed in 100 randomly selected SARS-CoV-2 spike glycoproteins in the study area. Evaluation of the secondary structure of the vaccine constructs revealed a stabilized structure showing 36.44% alpha-helices, 20.45% extended strands, and 33.38% random coils. Molecular docking of the vaccine with TLR4 showed that the simulated vaccine has a high binding affinity for TLR4, reflecting its ability to stimulate the innate and adaptive immune response.
36
Use of Integrated Core Proteomics, Immuno-Informatics, and In Silico Approaches to Design a Multiepitope Vaccine against Zoonotic Pathogen Edwardsiella tarda. Appl Microbiol 2022. [DOI: 10.3390/applmicrobiol2020031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Multidrug-resistant Edwardsiella tarda has been reported as the main causative agent of massive fish mortality. The pathogen is well known for causing hemorrhagic septicemia in fish and has been linked to gastrointestinal infections in humans. Formalin-inactivated Edwardsiella vaccination has previously been found to be ineffective in aquaculture species. Therefore, based on E. tarda’s integrated core complete sequenced genomes, the study aimed to design a subunit vaccine based on T and B cell epitopes employing an immunoinformatics approach. Initially, the top immunodominant and antigenic epitopes were predicted from the core complete sequenced genomes of E. tarda, and the vaccine was designed using linkers and an adjuvant. In addition, the vaccine 3D structure was predicted and refined, and molecular docking was performed to analyze the interacting residues between the vaccine and TLR5, MHC-I, and MHC-II, respectively. The final vaccine constructs demonstrated strong hydrogen bond interactions. Molecular dynamics simulation of the vaccine-TLR5 receptor complex showed stable structural binding and compactness. Furthermore, codon optimization with E. coli as a model organism yielded optimal GC content and CAI value, and the optimized sequence was subsequently cloned in vector pET2+ (a). Overall, the findings of the study imply that the designed epitope vaccine might be a good option for prophylaxis against E. tarda.
37
Aslam S, Ashfaq UA, Zia T, Aslam N, Alrumaihi F, Shahid F, Noor F, Qasim M. Proteome based mapping and reverse vaccinology techniques to contrive multi-epitope based subunit vaccine (MEBSV) against Streptococcus pyogenes. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases 2022; 100:105259. [PMID: 35231667 DOI: 10.1016/j.meegid.2022.105259] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/10/2021] [Revised: 12/01/2021] [Accepted: 02/23/2022] [Indexed: 06/14/2023]
Abstract
Streptococcus pyogenes is a root cause of human infections such as pharyngitis, tonsillitis, scarlet fever, impetigo, and respiratory tract infections. About 11 million individuals in the US suffer from pharyngitis every year. Unfortunately, no vaccine against S. pyogenes is available yet. The purpose of this study is to create a multiepitope-based subunit vaccine (MEBSV) targeting the top four highly antigenic proteins of S. pyogenes by using a combination of immunological techniques and molecular docking to tackle group A streptococcal (GAS) infections. T-cell (HTL & CTL), B-cell, and IFN-γ epitopes of the target proteins were predicted, and epitopes with high antigenicity were selected for subsequent research. To design the final vaccine, 5 LBL, 9 CTL, and 4 HTL epitopes were joined by the KK, AAY, and GPGPG linkers. To enhance the immune response, an adjuvant (cholera enterotoxin subunit B) was attached to the N-terminus of the vaccine through an EAAAK linker. With the addition of adjuvants and linkers, the construct size was 421 amino acids. IFN-γ and B-cell epitopes illustrate that the modeled construct is optimized for cell-mediated immune or humoral responses. The developed MEBSV structure was assessed to be highly antigenic, non-toxic, and non-allergenic. Moreover, disulphide engineering further enhanced the stability of the final vaccine protein. Molecular docking of the MEBSV with toll-like receptor 4 (TLR4) was conducted to check the vaccine's compatibility with the receptor. In addition, in silico cloning was carried out to validate the credibility and proper expression of the vaccine construct. These findings suggest that the multi-epitope vaccine might be a potential immunogen against group A streptococcus infections, but further experimental testing is required to validate this study.
Affiliation(s)
- Sidra Aslam
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
- Usman Ali Ashfaq
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
- Tuba Zia
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
- Nosheen Aslam
- Department of Biochemistry, Government College University Faisalabad, Pakistan
- Faris Alrumaihi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
- Farah Shahid
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
- Fatima Noor
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
- Muhammad Qasim
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Pakistan
38
Development of Anticancer Peptides Using Artificial Intelligence and Combinational Therapy for Cancer Therapeutics. Pharmaceutics 2022; 14:pharmaceutics14050997. [PMID: 35631583 PMCID: PMC9147327 DOI: 10.3390/pharmaceutics14050997] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/29/2022] [Revised: 04/28/2022] [Accepted: 05/04/2022] [Indexed: 01/27/2023] Open
Abstract
Cancer is a group of diseases causing abnormal cell growth, altering the genome, and invading or spreading to other parts of the body. Among therapeutic peptide drugs, anticancer peptides (ACPs) have been considered for targeting and killing cancer cells because cancer cells have unique characteristics, such as a high negative charge and an abundance of microvilli in the cell membrane, compared with normal cells. ACPs have several advantages, such as high specificity, cost-effectiveness, low immunogenicity, minimal toxicity, and high tolerance under normal physiological conditions. However, the development and identification of ACPs are time-consuming and expensive in traditional wet-lab-based approaches. Thus, applying artificial intelligence to these approaches can save time and reduce the cost of identifying candidate ACPs. Recently, machine learning (ML), deep learning (DL), and hybrid learning (ML combined with DL) have been applied to the development of ACPs without experimental analysis, owing to advances in computer power and big data from the power system. Additionally, we suggest that combination therapy with classical approaches and ACPs might be one of the impactful approaches to increase the efficiency of cancer therapy.
39
Shishir TA, Jannat T, Naser IB. An in-silico study of the mutation-associated effects on the spike protein of SARS-CoV-2, Omicron variant. PLoS One 2022; 17:e0266844. [PMID: 35446879 PMCID: PMC9022835 DOI: 10.1371/journal.pone.0266844] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/24/2022] [Accepted: 03/28/2022] [Indexed: 01/16/2023] Open
Abstract
The emergence of Omicron (B.1.1.529), a new Variant of Concern in the COVID-19 pandemic, while accompanied by the ongoing Delta variant infection, has once again fueled fears of a new infection wave and global health concern. In the Omicron variant, the receptor-binding domain (RBD) of its spike glycoprotein is heavily mutated, a feature critical for the transmission rate of the virus by interacting with hACE2. In this study, we used a combination of conventional and advanced neural network-based in silico approaches to predict how these mutations would affect the spike protein. The results demonstrated a decrease in the electrostatic potentials of residues corresponding to receptor recognition sites, an increase in the alkalinity of the protein, a change in hydrophobicity, variations in functional residues, and an increase in the percentage of alpha-helix structure. Moreover, several mutations were found to modulate the immunologic properties of the potential epitopes predicted from the spike protein. Our next step was to predict the structural changes of the spike and their effect on its interaction with the hACE2. The results revealed that the RBD of the Omicron variant had a higher affinity than the reference. Moreover, all-atom molecular dynamics simulations concluded that the RBD of the Omicron variant exhibits a more dispersed interaction network since mutations resulted in an increased number of hydrophobic interactions and hydrogen bonds with hACE2.
Affiliation(s)
- Tushar Ahmed Shishir
- Department of Mathematics and Natural Sciences, BRAC University, Dhaka, Bangladesh
- Rangamati General Hospital, Chattogram, Bangladesh
- Taslimun Jannat
- Department of Mathematics and Natural Sciences, BRAC University, Dhaka, Bangladesh
- Iftekhar Bin Naser
- Department of Mathematics and Natural Sciences, BRAC University, Dhaka, Bangladesh
40
Islam SI, Mou MJ, Sanjida S, Tariq M, Nasir S, Mahfuj S. Designing a novel mRNA vaccine against Vibrio harveyi infection in fish: an immunoinformatics approach. Genomics Inform 2022; 20:e11. [PMID: 35399010 PMCID: PMC9002004 DOI: 10.5808/gi.21065] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/01/2021] [Accepted: 01/07/2022] [Indexed: 11/20/2022] Open
Abstract
Vibrio harveyi belongs to the family Vibrionaceae of class Gammaproteobacteria. Around 12 Vibrio species can cause gastroenteritis (gastrointestinal illness) in humans. A large number of bacterial particles can be found in the infected cells, which may cause death. Despite these devastating complications, there is still no cure or vaccine for the bacteria. As a result, we used an immunoinformatics approach to develop a multi-epitope vaccine against the most pathogenic hemolysin gene of V. harveyi. The immunodominant T- and B-cell epitopes were identified using the hemolysin protein. After thorough testing, we developed a vaccine employing three types of epitopes: cytotoxic T-lymphocyte, helper T-lymphocyte, and linear B-lymphocyte epitopes. The vaccine was designed to be antigenic, immunogenic, and non-allergenic, as well as to have better solubility. Molecular dynamics simulation revealed significant structural stiffness and binding stability. In addition, computer-generated immune simulation indicated that the vaccination might elicit immune reactions. With Escherichia coli K12 as a model, codon optimization yielded ideal GC content and a higher codon adaptation index value, and the optimized sequence was then inserted into the cloning vector pET2+ (a). Altogether, our experiment implies that the proposed peptide vaccine might be a good option for vibriosis prophylaxis.
Affiliation(s)
- Sk Injamamul Islam
- Department of Fisheries and Marine Bioscience, Faculty of Biological Science, Jashore University of Science and Technology, Jashore 7408, Bangladesh; Chulalongkorn University, Department of Veterinary Microbiology, Faculty of Veterinary Science and Technology, Bangkok 10330, Thailand
- Moslema Jahan Mou
- Department of Genetic Engineering & Biotechnology, Faculty of Earth and Life Science, University of Rajshahi, Rajshahi 6205, Bangladesh
- Saloa Sanjida
- Department of Environmental Science and Technology, Faculty of Applied Science and Technology, Jashore University of Science and Technology, Jashore 7408, Bangladesh
- Muhammad Tariq
- Department of Biotechnology, Faculty of Biological Sciences, University of Malakand, Chakdara 18800, Pakistan
- Saad Nasir
- Department of Clinical Medicine and Surgery, Faculty of Veterinary Medicine, University of Veterinary and Animal Sciences, Lahore 54000, Pakistan
- Sarower Mahfuj
- Department of Fisheries and Marine Bioscience, Faculty of Biological Science, Jashore University of Science and Technology, Jashore 7408, Bangladesh
41
Kumar A, Sharma P, Arun A, Meena LS. Development of peptide vaccine candidate using highly antigenic PE-PGRS family proteins to stimulate the host immune response against Mycobacterium tuberculosis H37Rv: an immuno-informatics approach. J Biomol Struct Dyn 2022; 41:3382-3404. [PMID: 35293852 DOI: 10.1080/07391102.2022.2048079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Tuberculosis (TB) is a fast-spreading, transmissible disease caused by Mycobacterium tuberculosis (M. tuberculosis). M. tuberculosis has a high death rate in its endemic regions due to a lack of appropriate treatment and preventative measures. We have used a vaccinomics strategy to create an effective multi-epitope vaccine against M. tuberculosis. The proteins with the highest antigenicity were utilised to predict cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and linear B-lymphocyte (LBL) epitopes. The CTL and HTL epitopes covered 99.97% of the population. Seven epitopes each of CTL, HTL, and LBL were ultimately selected and utilised to develop a multi-epitope vaccine. A vaccine design was developed by combining these epitopes with suitable linkers and the LprG adjuvant. The vaccine chimera was found to be highly immunogenic, non-allergenic, and non-toxic. To ensure better expression within the Escherichia coli K12 (E. coli K12) host system, codon adaptation and in silico cloning were carried out. Following that, various validation studies were conducted, including molecular docking, molecular dynamics simulation, and immunological simulation, all of which indicated that the designed vaccine would be stable in the biological environment and effective against M. tuberculosis infection. The immune simulation revealed higher levels of T-cell and B-cell activity, which corresponded to the actual immune response. Exposure simulations were repeated several times, resulting in increased clonal selection and faster antigen clearance. These results suggest that, if the proposed vaccine chimera were tested both in vitro and in vivo, it could be a viable treatment and preventive strategy for TB. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ajit Kumar
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
- Priyanka Sharma
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Akanksha Arun
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
- Laxman S Meena
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
42
Bukhari SNH, Jain A, Haq E, Mehbodniya A, Webber J. Machine Learning Techniques for the Prediction of B-Cell and T-Cell Epitopes as Potential Vaccine Targets with a Specific Focus on SARS-CoV-2 Pathogen: A Review. Pathogens 2022; 11:146. [PMID: 35215090 PMCID: PMC8879824 DOI: 10.3390/pathogens11020146] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 12/13/2021] [Revised: 01/19/2022] [Accepted: 01/21/2022] [Indexed: 02/01/2023] Open
Abstract
Only the part of an antigen (a protein molecule found on the surface of a pathogen) that is composed of epitopes specific to T and B cells is recognized by the human immune system (HIS). Identification of epitopes is considered critical for designing an epitope-based peptide vaccine (EBPV). Although there are a number of vaccine types, EBPVs have received less attention thus far. It is important to mention that EBPVs have a great deal of untapped potential for boosting vaccination safety: they are less expensive and take a short time to produce. Thus, in order to quickly contain global pandemics such as the ongoing outbreak of coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), as well as epidemics and endemics, EBPVs are considered promising vaccine types. The high mutation rate of SARS-CoV-2 has posed a great challenge to public health worldwide because either the composition of existing vaccines has to be changed or a new vaccine has to be developed to protect against its different variants. In such scenarios, time being the critical factor, EBPVs can be a promising alternative. To design an effective and viable EBPV against different strains of a pathogen, it is important to identify the putative T- and B-cell epitopes. Using the wet-lab experimental approach to identify these epitopes is time-consuming and costly because the experimental screening of a vast number of potential epitope candidates is required. Fortunately, various available machine learning (ML)-based prediction methods have reduced the burden related to the epitope mapping process by decreasing the potential epitope candidate list for experimental trials. Moreover, these methods are also cost-effective, scalable, and fast. This paper presents a systematic review of various state-of-the-art and relevant ML-based methods and tools for predicting T- and B-cell epitopes.
Special emphasis is placed on highlighting and analyzing various models for predicting epitopes of SARS-CoV-2, the causative agent of COVID-19. Based on the various methods and tools discussed, future research directions for epitope prediction are presented.
Affiliation(s)
- Syed Nisar Hussain Bukhari
- University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Amit Jain
- University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Ehtishamul Haq
- Department of Biotechnology, University of Kashmir, Srinagar 190006, India
- Abolfazl Mehbodniya
- Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, Kuwait City 20185145, Kuwait
- Julian Webber
- Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
43
Manavalan B, Basith S, Lee G. Comparative analysis of machine learning-based approaches for identifying therapeutic peptides targeting SARS-CoV-2. Brief Bioinform 2022; 23:bbab412. [PMID: 34595489 PMCID: PMC8500067 DOI: 10.1093/bib/bbab412] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/01/2021] [Revised: 08/27/2021] [Accepted: 09/07/2021] [Indexed: 01/08/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19) has impacted public health as well as societal and economic well-being. In the last two decades, various prediction algorithms and tools have been developed for predicting antiviral peptides (AVPs). The current COVID-19 pandemic has underscored the need to develop more efficient and accurate machine learning (ML)-based prediction algorithms for the rapid identification of therapeutic peptides against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Several peptide-based ML approaches, including anti-coronavirus peptides (ACVPs), IL-6 inducing epitopes and other epitopes targeting SARS-CoV-2, have been implemented in COVID-19 therapeutics. Owing to the growing interest in the COVID-19 field, it is crucial to systematically compare the existing ML algorithms based on their performances. Accordingly, we comprehensively evaluated the state-of-the-art IL-6 and AVP predictors against coronaviruses in terms of core algorithms, feature encoding schemes, performance evaluation metrics and software usability. A comprehensive performance assessment was then conducted to evaluate the robustness and scalability of the existing predictors using well-constructed independent validation datasets. Additionally, we discussed the advantages and disadvantages of the existing methods, providing useful insights into the development of novel computational tools for characterizing and identifying epitopes or ACVPs. The insights gained from this review are anticipated to provide critical guidance to the scientific community in the rapid design and development of accurate and efficient next-generation in silico tools against SARS-CoV-2.
Affiliation(s)
- Shaherin Basith
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
- Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon 16499, Korea
44
A vaccine built from potential immunogenic pieces derived from the SARS-CoV-2 spike glycoprotein: A computational approximation. J Immunol Methods 2022; 502:113216. [PMID: 35007561 PMCID: PMC8739792 DOI: 10.1016/j.jim.2022.113216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2020] [Revised: 11/20/2021] [Accepted: 01/03/2022] [Indexed: 11/21/2022]
Abstract
Coronavirus Disease 2019 (COVID-19) represents a new global threat demanding a multidisciplinary effort to fight its etiological agent, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this regard, immunoinformatics may help predict prominent immunogenic regions from critical SARS-CoV-2 structural proteins, such as the spike (S) glycoprotein, for their use in prophylactic or therapeutic interventions against this highly pathogenic betacoronavirus. Accordingly, in this study, an integrated immunoinformatics approach was applied to identify cytotoxic T cell (CTC), T helper cell (THC), and linear B cell (BC) epitopes from the S glycoprotein in an attempt to design a high-quality multi-epitope vaccine. The best CTC, THC, and BC epitopes showed high viral antigenicity and a lack of allergenic or toxic residues, and the CTC and THC epitopes showed suitable interactions with HLA class I (HLA-I) and HLA class II (HLA-II) molecules, respectively. Remarkably, the SARS-CoV-2 receptor-binding domain (RBD) and its receptor-binding motif (RBM) harbour several potential epitopes. The structure prediction, refinement, and validation data indicate that the multi-epitope vaccine has an appropriate conformation and stability. Four conformational epitopes and efficient binding between Toll-like receptor 4 (TLR4) and the vaccine model were observed. Importantly, the population coverage analysis showed that the multi-epitope vaccine could be used globally. Notably, computer-based simulations suggest that the vaccine model has a robust potential to evoke and maximize both immune effector responses and immunological memory to SARS-CoV-2. Further research is needed to comply with the mandatory international guidelines for human vaccine formulations.
45
Tarrahimofrad H, Rahimnahal S, Zamani J, Jahangirian E, Aminzadeh S. Designing a multi-epitope vaccine to provoke the robust immune response against influenza A H7N9. Sci Rep 2021; 11:24485. [PMID: 34966175 PMCID: PMC8716528 DOI: 10.1038/s41598-021-03932-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/09/2021] [Accepted: 12/13/2021] [Indexed: 12/12/2022] Open
Abstract
A new strain of Influenza A Virus (IAV), so-called "H7N9 Avian Influenza", is the first strain of this virus in which humans are infected by an N9 subtype of influenza virus. Although continuous human-to-human transmission has not been reported, the occurrence of various H7N9-associated epidemics and the lack of production of strong antibodies against H7N9 in humans warn of the potential for H7N9 to become a new pandemic. Therefore, the need for effective vaccination against H7N9 as a life-threatening viral pathogen has become a major concern. The current study reports the design of a multi-epitope vaccine against the Hemagglutinin (HA) and Neuraminidase (NA) proteins of H7N9 Influenza A virus by prediction of Cytotoxic T lymphocyte (CTL), Helper T lymphocyte (HTL), IFN-γ and B-cell epitopes. Human β-defensin-3 (HβD-3) and the pan HLA DR-binding epitope (PADRE) sequence were considered as adjuvants. EAAAK, AAY, GPGPG, HEYGAEALERAG, KK and RVRR linkers were used to connect the epitopes. The final construct contained 777 amino acids and is expected to be a recombinant protein of about 86.38 kDa with antigenic and non-allergenic properties after expression. Analysis of the modeled protein based on tertiary structure validation, docking studies, and molecular dynamics simulation results such as root-mean-square deviation (RMSD), gyration, root-mean-square fluctuation (RMSF) and Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) showed that this protein is a stable construct capable of interacting with Toll-like receptor 7 (TLR7), TLR8 and the m826 antibody. Analysis of the obtained data demonstrates that the suggested vaccine has the potential to induce the immune response by stimulating T and B cells, and may be usable for prevention purposes against Avian Influenza A (H7N9).
Affiliation(s)
- Hossein Tarrahimofrad
- Bioprocess Engineering Group, Institute of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Somayyeh Rahimnahal
- Department of Animal Science, Faculty of Agriculture, Ilam University, Ilam, Iran
- Javad Zamani
- Bioprocess Engineering Group, Institute of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Ehsan Jahangirian
- Bioprocess Engineering Group, Institute of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Saeed Aminzadeh
- Bioprocess Engineering Group, Institute of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
46
Wang Y, Mai G, Zou M, Long H, Chen YQ, Sun L, Tian D, Zhao Y, Jiang G, Cao Z, Du X. Heavy chain sequence-based classifier for the specificity of human antibodies. Brief Bioinform 2021; 23:6483065. [PMID: 34953464 DOI: 10.1093/bib/bbab516] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/04/2021] [Revised: 10/07/2021] [Accepted: 11/12/2021] [Indexed: 11/13/2022] Open
Abstract
Antibodies specifically bind to antigens and are an essential part of the immune system. Hence, antibodies are powerful tools in research and diagnostics. High-throughput sequencing technologies have promoted comprehensive profiling of the immune repertoire, which has resulted in large amounts of antibody sequences that remain to be further analyzed. In this study, antibodies were downloaded from IMGT/LIGM-DB and Sequence Read Archive databases. Contributing features from antibody heavy chains were formulated as numerical inputs and fed into an ensemble machine learning classifier to classify the antigen specificity of six classes of antibodies, namely anti-HIV-1, anti-influenza virus, anti-pneumococcal polysaccharide, anti-citrullinated protein, anti-tetanus toxoid and anti-hepatitis B virus. The classifier was validated using cross-validation and a testing dataset. The ensemble classifier achieved a macro-average area under the receiver operating characteristic curve (AUC) of 0.9246 from the 10-fold cross-validation, and 0.9264 for the testing dataset. Among the contributing features, the contribution of the complementarity-determining regions was 53.1% and that of framework regions was 46.9%, and the amino acid mutation rates occupied the first and second ranks among the top five contributing features. The classifier and insights provided in this study could promote the mechanistic study, isolation and utilization of potential therapeutic antibodies.
Affiliation(s)
- Yaqi Wang
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Guoqin Mai
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Min Zou
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Haoyu Long
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Yao-Qing Chen
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Litao Sun
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Dechao Tian
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Yang Zhao
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Guozhi Jiang
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Zicheng Cao
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China
- Xiangjun Du
- School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China; School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou 510275, P.R. China; Key Laboratory of Tropical Disease Control, Ministry of Education, Sun Yat-sen University, Guangzhou 510030, P.R. China
|
47
|
Malik A, Subramaniyam S, Kim CB, Manavalan B. SortPred: The first machine learning based predictor to identify bacterial sortases and their classes using sequence-derived information. Comput Struct Biotechnol J 2021; 20:165-174. [PMID: 34976319 PMCID: PMC8703055 DOI: 10.1016/j.csbj.2021.12.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/01/2021] [Revised: 12/08/2021] [Accepted: 12/09/2021] [Indexed: 12/12/2022]
Abstract
Sortase enzymes are cysteine transpeptidases that embellish the surface of Gram-positive bacteria with various proteins, thereby allowing these microorganisms to interact with their neighboring environment. It is known that several of their substrates can cause pathological implications, so researchers have focused on the development of sortase inhibitors. Currently, six different classes of sortases (A-F) are recognized. However, with the extensive application of bacterial genome sequencing projects, the number of potential sortases in the public databases has exploded, presenting considerable challenges in annotating these sequences. It is very laborious and time-consuming to characterize these sortase classes experimentally. Therefore, this study developed the first machine-learning-based two-layer predictor called SortPred, where the first layer predicts the sortase from the given sequence and the second layer predicts their class from the predicted sortase. To develop SortPred, we constructed an original benchmarking dataset and investigated 31 feature descriptors, primarily on five feature encoding algorithms. Afterward, a random forest classifier was trained on each of these descriptors and its robustness was evaluated with an independent dataset. Finally, we selected the final model independently for both layers depending on the performance consistency between cross-validation and independent evaluation. SortPred is expected to be an effective tool for identifying bacterial sortases, which in turn may aid in designing sortase inhibitors and exploring their functions. The SortPred webserver and a standalone version are freely accessible at: https://procarb.org/sortpred.
Affiliation(s)
- Adeel Malik
- Institute of Intelligence Informatics Technology, Sangmyung University, Seoul 03016, Republic of Korea
- Chang-Bae Kim
- Department of Biotechnology, Sangmyung University, Seoul 03016, Republic of Korea
|
48
|
Ashford J, Reis-Cunha J, Lobo I, Lobo F, Campelo F. Organism-specific training improves performance of linear B-cell epitope prediction. Bioinformatics 2021; 37:4826-4834. [PMID: 34289025 PMCID: PMC8665745 DOI: 10.1093/bioinformatics/btab536] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/08/2021] [Revised: 07/01/2021] [Accepted: 07/19/2021] [Indexed: 12/25/2022]
Abstract
MOTIVATION: In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous datasets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance.
RESULTS: This article shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics when compared to models trained with heterogeneous and hybrid data, and with a variety of widely used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens.
AVAILABILITY AND IMPLEMENTATION: The data underlying this article, as well as the full reproducibility scripts, are available at https://github.com/fcampelo/OrgSpec-paper. The R package that implements the organism-specific pipeline functions is available at https://github.com/fcampelo/epitopes.
SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.
Affiliation(s)
- Jodie Ashford
- Department of Computer Science, College of Engineering and Physical Sciences, Aston University, Birmingham B4 7ET, UK
- João Reis-Cunha
- Department of Preventive Veterinary Medicine, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- Igor Lobo
- Graduate Program in Genetics, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- Francisco Lobo
- Department of General Biology, Universidade Federal de Minas Gerais, Belo Horizonte 31270-901, Brazil
- Felipe Campelo
- Department of Computer Science, College of Engineering and Physical Sciences, Aston University, Birmingham B4 7ET, UK
|
49
|
Sami SA, Marma KKS, Mahmud S, Khan MAN, Albogami S, El-Shehawi AM, Rakib A, Chakraborty A, Mohiuddin M, Dhama K, Uddin MMN, Hossain MK, Tallei TE, Emran TB. Designing of a Multi-epitope Vaccine against the Structural Proteins of Marburg Virus Exploiting the Immunoinformatics Approach. ACS OMEGA 2021; 6:32043-32071. [PMID: 34870027 PMCID: PMC8638006 DOI: 10.1021/acsomega.1c04817] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 09/01/2021] [Accepted: 11/10/2021] [Indexed: 05/08/2023]
Abstract
Marburg virus disease (MVD) caused by the Marburg virus (MARV) generally appears with flu-like symptoms and leads to severe hemorrhagic fever. It spreads via direct contact with infected individuals or animals. Despite being considered to be less threatening in terms of appearances and the number of infected patients, the high fatality rate of this pathogenic virus is a major concern. Until now, no vaccine has been developed to combat this deadly virus. Therefore, vaccination for this virus is necessary to reduce its mortality. Our current investigation focuses on the design and formulation of a multi-epitope vaccine based on the structural proteins of MARV employing immunoinformatics approaches. The screening of potential T-cell and B-cell epitopes from the seven structural proteins of MARV was carried out through specific selection parameters. Afterward, we compiled the shortlisted epitopes by attaching them to an appropriate adjuvant and linkers. Population coverage analysis, conservancy analysis, and MHC cluster analysis of the shortlisted epitopes were satisfactory. Importantly, physicochemical characteristics, human homology assessment, and structure validation of the vaccine construct delineated convenient outcomes. We implemented disulfide bond engineering to stabilize the tertiary or quaternary interactions. Furthermore, stability and physical movements of the vaccine protein were explored using normal-mode analysis. The immune simulation study of the vaccine complexes also exhibited significant results. Additionally, the protein-protein docking and molecular dynamics simulation of the final construct exhibited a higher affinity toward toll-like receptor-4 (TLR4). 
From simulation trajectories, multiple descriptors, namely, root mean square deviations (RMSD), radius of gyration (Rg), root mean square fluctuations (RMSF), solvent-accessible surface area (SASA), and hydrogen bonds, have been taken into account to demonstrate the inflexible and rigid nature of receptor molecules and the constructed vaccine. Inclusively, our findings suggested the vaccine constructs' ability to regulate promising immune responses against MARV pathogenesis.
Affiliation(s)
- Saad Ahmed Sami
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Kay Kay Shain Marma
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Shafi Mahmud
- Microbiology Laboratory, Bioinformatics Division, Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi 6205, Bangladesh
- Md. Asif Nadim Khan
- Department of Biochemistry and Molecular Biology, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Sarah Albogami
- Department of Biotechnology, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Ahmed M. El-Shehawi
- Department of Biotechnology, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Ahmed Rakib
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Agnila Chakraborty
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Mostafah Mohiuddin
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Kuldeep Dhama
- Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, Uttar Pradesh 243122, India
- Mir Muhammad Nasir Uddin
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Mohammed Kamrul Hossain
- Department of Pharmacy, Faculty of Biological Sciences, University of Chittagong, Chittagong 4331, Bangladesh
- Trina Ekawati Tallei
- Department of Biology, Faculty of Mathematics and Natural Sciences, Sam Ratulangi University, Manado, North Sulawesi 95115, Indonesia
- Talha Bin Emran
- Department of Pharmacy, BGC Trust University Bangladesh, Chittagong 4381, Bangladesh
|
50
|
Identification of Oocyst-Driven Toxoplasma gondii Infections in Humans and Animals through Stage-Specific Serology-Current Status and Future Perspectives. Microorganisms 2021; 9:microorganisms9112346. [PMID: 34835471 PMCID: PMC8618849 DOI: 10.3390/microorganisms9112346] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/04/2021] [Revised: 11/08/2021] [Accepted: 11/09/2021] [Indexed: 11/17/2022]
Abstract
The apicomplexan zoonotic parasite Toxoplasma gondii has three infective stages: sporozoites in sporulated oocysts, which are shed in unsporulated form into the environment by infected felids; tissue cysts containing bradyzoites, and fast replicating tachyzoites that are responsible for acute toxoplasmosis. The contribution of oocysts to infections in both humans and animals is understudied despite being highly relevant. Only a few diagnostic antigens have been described to be capable of discriminating which parasite stage has caused an infection. Here we provide an extensive overview of the antigens and serological assays used to detect oocyst-driven infections in humans and animals according to the literature. In addition, we critically discuss the possibility to exploit the increasing knowledge of the T. gondii genome and the various 'omics datasets available, by applying predictive algorithms, for the identification of new oocyst-specific proteins for diagnostic purposes. Finally, we propose a workflow for how such antigens and assays based on them should be evaluated to ensure reproducible and robust results.
|