1
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
2
Angaitkar P, Janghel RR, Sahu TP. DL-TCNN: Deep Learning-based Temporal Convolutional Neural Network for prediction of conformational B-cell epitopes. 3 Biotech 2023; 13:297. [PMID: 37575599 PMCID: PMC10412510 DOI: 10.1007/s13205-023-03716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 07/24/2023] [Indexed: 08/15/2023]
Abstract
Prediction of conformational B-cell epitopes (CBCE) is an essential phase for vaccine design, drug invention, and accurate disease diagnosis. Many laboratory and computational approaches have been developed to predict CBCE. However, laboratory experiments are costly and time-consuming, leading to the popularity of Machine Learning (ML)-based computational methods. Although ML methods have succeeded in many domains, achieving higher accuracy in CBCE prediction remains a challenge. To overcome this drawback and consider the limitations of ML methods, this paper proposes a novel DL-based framework for CBCE prediction, leveraging the capabilities of deep learning in the medical domain. The proposed model is named Deep Learning-based Temporal Convolutional Neural Network (DL-TCNN), which hybridizes an empirically hyper-tuned 1D-CNN and a TCN. TCN is an architecture that employs causal convolutions and dilations, adapting well to sequential input with extensive receptive fields. To train the proposed model, physicochemical features are first extracted from antigen sequences. Next, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. Finally, the proposed DL-TCNN is employed for the prediction of CBCE. The model's performance is evaluated and validated on a benchmark antigen-antibody dataset. The DL-TCNN achieves 94.44% accuracy and an AUC of 0.989 on the training dataset, 78.53% accuracy and an AUC of 0.661 on the validation dataset, and 85.10% accuracy and an AUC of 0.855 on the testing dataset. The proposed model outperforms all the existing CBCE methods.
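The abstract describes the TCN as using causal convolutions with dilations. As a minimal illustration of that idea only (not the authors' DL-TCNN implementation), the sketch below applies a single causal, dilated 1D filter in plain Python. The function names `causal_dilated_conv1d` and `receptive_field` are hypothetical, and the receptive-field formula assumes one convolution per layer.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """Causal 1D convolution: output[t] uses only signal[t], signal[t - d], signal[t - 2d], ...

    Positions before the start of the sequence are treated as zeros
    (implicit left padding), so no future value ever leaks into output[t].
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal layers, assuming one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = causal_dilated_conv1d(x, kernel=[1.0, 1.0], dilation=2)  # y[t] = x[t] + x[t - 2]
print(y)                                # [1.0, 2.0, 4.0, 6.0, 8.0]
print(receptive_field(2, [1, 2, 4]))    # 8
```

Doubling the dilation at each layer is what lets a shallow stack cover a long stretch of the input sequence, which is the "extensive receptive fields" property mentioned above.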
Affiliation(s)
- Pratik Angaitkar
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Rekh Ram Janghel
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Tirath Prasad Sahu
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
3
Desta IT, Kotelnikov S, Jones G, Ghani U, Abyzov M, Kholodov Y, Standley DM, Beglov D, Vajda S, Kozakov D. The ClusPro AbEMap web server for the prediction of antibody epitopes. Nat Protoc 2023; 18:1814-1840. [PMID: 37188806 PMCID: PMC10898366 DOI: 10.1038/s41596-023-00826-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2021] [Accepted: 01/19/2023] [Indexed: 05/17/2023]
Abstract
Antibodies play an important role in the immune system by binding to molecules called antigens at their respective epitopes. These interfaces or epitopes are structural entities determined by the interactions between an antibody and an antigen, making them ideal systems to analyze by using docking programs. Since the advent of high-throughput antibody sequencing, the ability to perform epitope mapping using only the sequence of the antibody has become a high priority. ClusPro, a leading protein-protein docking server, together with its template-based modeling version, ClusPro-TBM, have been re-purposed to map epitopes for specific antibody-antigen interactions by using the Antibody Epitope Mapping server (AbEMap). ClusPro-AbEMap offers three different modes for users depending on the information available on the antibody as follows: (i) X-ray structure, (ii) computational/predicted model of the structure or (iii) only the amino acid sequence. The AbEMap server presents a likelihood score for each antigen residue of being part of the epitope. We provide detailed information on the server's capabilities for the three options and discuss how to obtain the best results. In light of the recent introduction of AlphaFold2 (AF2), we also show how one of the modes allows users to use their AF2-generated antibody models as input. The protocol describes the relative advantages of the server compared to other epitope-mapping tools, its limitations and potential areas of improvement. The server may take 45-90 min depending on the size of the proteins.
Affiliation(s)
- Israel T Desta
  - Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Sergei Kotelnikov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- George Jones
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- Usman Ghani
  - Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Daron M Standley
  - Department of Genome Informatics, Osaka University, Osaka, Japan
  - Center for Infectious Disease Education and Research, Osaka University, Osaka, Japan
- Dmitri Beglov
  - Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Dima Kozakov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
4
Desta IT, Kotelnikov S, Jones G, Ghani U, Abyzov M, Kholodov Y, Standley DM, Sabitova M, Beglov D, Vajda S, Kozakov D. Mapping of antibody epitopes based on docking and homology modeling. Proteins 2023; 91:171-182. [PMID: 36088633 PMCID: PMC9822860 DOI: 10.1002/prot.26420] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/25/2022] [Revised: 08/25/2022] [Accepted: 09/06/2022] [Indexed: 01/11/2023]
Abstract
Antibodies are key proteins produced by the immune system to target pathogen proteins termed antigens via specific binding to surface regions called epitopes. Given an antigen and the sequence of an antibody, knowledge of the epitope is critical for the discovery and development of antibody-based therapeutics. In this work, we present a computational protocol that uses template-based modeling and docking to predict epitope residues. This protocol is implemented in three major steps. First, a template-based modeling approach is used to build the antibody structures. We tested several options, including generation of models using AlphaFold2. Second, each antibody model is docked to the antigen using the fast Fourier transform (FFT) based docking program PIPER. Attention is given to optimally selecting the docking energy parameters depending on the input data. In particular, the van der Waals energy terms are reduced for modeled antibodies relative to X-ray structures. Finally, a ranking of antigen surface residues is produced. The ranking relies on the docking results, that is, how often the residue appears in the docking poses' interface, and also on the energy favorability of the docking pose in question. The method, called PIPER-Map, has been tested on a widely used antibody-antigen docking benchmark. The results show that PIPER-Map improves upon the existing epitope prediction methods. An interesting observation is that epitope prediction accuracy starting from antibody sequence alone does not significantly differ from that of starting from an unbound (i.e., separately crystallized) antibody structure.
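The final ranking step combines how often a residue appears in docking-pose interfaces with how favorable each pose's energy is. The abstract does not give the scoring formula, so the sketch below is only one plausible reading: it weights each pose with a Boltzmann-style factor (an assumption made here, not the published PIPER-Map score) and sums the weights of the poses whose interface contains the residue. The function name `rank_residues`, the `temperature` parameter, and the residue labels are all hypothetical.

```python
import math
from collections import defaultdict


def rank_residues(poses, temperature=1.0):
    """Rank antigen residues from docking poses.

    poses: list of (energy, interface_residues); lower energy means more favorable.
    Each pose gets an assumed Boltzmann-style weight exp(-(E - E_min) / T);
    a residue's score is the normalized weight of all poses containing it.
    """
    if not poses:
        return []
    e_min = min(energy for energy, _ in poses)
    weights = [math.exp(-(energy - e_min) / temperature) for energy, _ in poses]
    total = sum(weights)
    scores = defaultdict(float)
    for weight, (_, residues) in zip(weights, poses):
        for residue in set(residues):  # count each residue once per pose
            scores[residue] += weight / total
    # Highest score first; ties broken by residue label for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical poses: two low-energy poses share residue A46.
poses = [
    (-10.0, ["A45", "A46"]),
    (-10.0, ["A46", "A47"]),
    (-8.0, ["A90"]),
]
for residue, score in rank_residues(poses):
    print(residue, round(score, 3))
```

With these made-up inputs, A46 ranks first because it appears in both low-energy poses, while A90 ranks last because it appears only in the least favorable pose.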
Affiliation(s)
- Israel T. Desta
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Sergei Kotelnikov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, USA
- George Jones
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, USA
- Usman Ghani
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Daron M. Standley
  - Department of Genome Informatics, Osaka University, Osaka, 565-0871, Japan
  - Center for Infectious Disease Education and Research, Osaka University, Osaka, 565-0871, Japan
- Maria Sabitova
  - Department of Mathematics, CUNY Queens College, Flushing, NY 11367, USA
- Dmitri Beglov
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Dima Kozakov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, USA
5
Ambrosetti F, Jandova Z, Bonvin AMJJ. Information-Driven Antibody-Antigen Modelling with HADDOCK. Methods Mol Biol 2023; 2552:267-282. [PMID: 36346597 DOI: 10.1007/978-1-0716-2609-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
In recent years, the therapeutic use of antibodies has seen huge growth, due to their inherent properties and technological advances in the methods used to study and characterize them. Effective design and engineering of antibodies for therapeutic purposes are heavily dependent on knowledge of the structural principles that regulate antibody-antigen interactions. Several experimental techniques such as X-ray crystallography, cryo-electron microscopy, NMR, or mutagenesis analysis can be applied, but these are usually expensive and time-consuming. Therefore, computational approaches like molecular docking may offer a valuable alternative for the characterization of antibody-antigen complexes. Here we describe a protocol for the prediction of the 3D structure of antibody-antigen complexes using the integrative modelling platform HADDOCK. The protocol consists of (1) the identification of the antibody residues belonging to the hypervariable loops, which are known to be crucial for binding and can be used to guide the docking, and (2) the detailed steps to perform docking with the HADDOCK 2.4 webserver following different strategies depending on the availability of information about epitope residues.
Affiliation(s)
- Francesco Ambrosetti
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands
- Zuzana Jandova
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands
- Alexandre M J J Bonvin
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands
6
Depetris RS, Lu D, Polonskaya Z, Zhang Z, Luna X, Tankard A, Kolahi P, Drummond M, Williams C, Ebert MCCJC, Patel JP, Poyurovsky MV. Functional antibody characterization via direct structural analysis and information-driven protein-protein docking. Proteins 2021; 90:919-935. [PMID: 34773424 PMCID: PMC9544432 DOI: 10.1002/prot.26280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Revised: 08/28/2021] [Accepted: 11/07/2021] [Indexed: 12/02/2022]
Abstract
Detailed description of the mechanism of action of the therapeutic antibodies is essential for the functional characterization and future optimization of potential clinical agents. We recently developed KD035, a fully human antibody targeting vascular endothelial growth factor receptor 2 (VEGFR2). KD035 blocked VEGF‐A, and VEGF‐C‐mediated VEGFR2 activation, as demonstrated by the in vitro binding and competition assays and functional cellular assays. Here, we report a computational model of the complex between the variable fragment of KD035 (KD035(Fv)) and the domains 2 and 3 of the extracellular portion of VEGFR2 (VEGFR2(D2‐3)). Our modeling was guided by a priori experimental information including the X‐ray structures of KD035 and related antibodies, binding assays, target domain mapping and comparison of KD035 affinity for VEGFR2 from different species. The accuracy of the model was assessed by molecular dynamics simulations, and subsequently validated by mutagenesis and binding analysis. Importantly, the steps followed during the generation of this model can set a precedent for future in silico efforts aimed at the accurate description of the antibody–antigen and more broadly protein–protein complexes.
Affiliation(s)
- Dan Lu
  - Kadmon Corporation, LLC, New York, New York, USA
- Zhikai Zhang
  - Kadmon Corporation, LLC, New York, New York, USA
- Xenia Luna
  - Kadmon Corporation, LLC, New York, New York, USA
- Pegah Kolahi
  - Kadmon Corporation, LLC, New York, New York, USA
7
da Silva BM, Myung Y, Ascher DB, Pires DEV. epitope3D: a machine learning method for conformational B-cell epitope prediction. Brief Bioinform 2021; 23:6407730. [PMID: 34676398 DOI: 10.1093/bib/bbab423] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/07/2021] [Revised: 08/25/2021] [Accepted: 09/14/2021] [Indexed: 11/13/2022]
Abstract
The ability to identify antigenic determinants of pathogens, or epitopes, is fundamental to guide rational vaccine development and immunotherapies, which are particularly relevant for rapid pandemic response. A range of computational tools has been developed over the past two decades to assist in epitope prediction; however, they have presented limited performance and generalization, particularly for the identification of conformational B-cell epitopes. Here, we present epitope3D, a novel scalable machine learning method capable of accurately identifying conformational epitopes, trained and evaluated on the largest curated epitope data set to date. Our method uses the concept of graph-based signatures to model epitope and non-epitope regions as graphs and extract distance patterns that are used as evidence to train and test predictive models. We show epitope3D outperforms available alternative approaches, achieving Matthews correlation coefficient and F1-scores of 0.55 and 0.57 on cross-validation and 0.45 and 0.36 during independent blind tests, respectively.
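For readers unfamiliar with the two metrics reported above, the following sketch computes the Matthews correlation coefficient (MCC) and the F1-score from binary per-residue labels using their standard definitions. The labels are made up for illustration and have no connection to the epitope3D data; the function names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = epitope and 0 = non-epitope."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1-score, the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative labels only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(round(mcc(tp, tn, fp, fn), 3))  # 0.467
print(round(f1(tp, fp, fn), 3))       # 0.667
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it informative on imbalanced data such as epitope versus non-epitope residues.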
Affiliation(s)
- Bruna Moreira da Silva
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- YooChan Myung
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- David B Ascher
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - Baker Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
  - Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
- Douglas E V Pires
  - Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
  - Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
8
Immunoinformatics aided design of peptide-based vaccines against ebolaviruses. VITAMINS AND HORMONES 2021; 117:157-187. [PMID: 34420579 DOI: 10.1016/bs.vh.2021.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Ebolaviruses are at the forefront of emerging viruses and present a very perceptible threat to global peace and harmony. The last decade alone accounts for more than 90% of the total lives claimed by Ebola virus disease since it was first identified in 1976. Owing to the multiple host immune evasion methods employed by the virus and the limitations of traditional vaccine development approaches, finding a globally effective and reliable countermeasure against Ebola virus remains a challenge. Highly conserved peptide fragments that belong to critical viral proteins and contain multiple epitopes with the capacity to interact with a wide array of HLA molecules present a viable solution. Immunoinformatics, or computational immunology, enables rapid screening and shortlisting of plausible epitopes with a high immunogenic potential, thus supporting expeditious elucidation of efficacious vaccine candidates. In light of the above facts, we describe in this chapter a computational methodology for the identification of potent peptide vaccine candidates against human-infecting viruses. By applying this stringent methodology, we were able to identify multiple immunogenic ebolavirus peptide fragments which, after verification in animal models, might be considered as part of a future synthetic Ebola vaccine.
9
Reverse vaccinology approach for the identifications of potential vaccine candidates against Salmonella. Int J Med Microbiol 2021; 311:151508. [PMID: 34182206 DOI: 10.1016/j.ijmm.2021.151508] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/30/2019] [Revised: 03/14/2021] [Accepted: 04/15/2021] [Indexed: 12/26/2022]
Abstract
Salmonella is a leading foodborne pathogen that causes intestinal and systemic diseases across the world. Vaccination is the most effective protection against Salmonella, but the identification and design of an effective broad-spectrum vaccine is still a great challenge because of the many serotypes of Salmonella. Reverse vaccinology is a new tool to discover and design vaccine antigens, combining human immunology, structural biology and computational biology with microbial genomics. In this study, an in silico reverse vaccinology approach was established to screen appropriate immunogen targets by calculating the immunogenicity score of 583 non-redundant outer membrane and secreted proteins of Salmonella. Among the 100 top-scoring proteins, 15 representative antigens were selected randomly. After applying a sequence conservation test, four proteins (FliK, BcsZ, FhuA and FepA) remained as potential vaccine candidates for in vivo evaluation of immunogenicity and immunoprotection. All four candidates were capable of triggering an immune response and stimulating the production of antiserum in mice. Furthermore, top-ranked proteins including FliK and BcsZ provided wide antigenic coverage among the multiple serotypes of Salmonella. In the S. Typhimurium LT2 challenge model, mice immunized with FliK and BcsZ showed a high relative percentage survival (RPS) of 52.74% and 64.71%, respectively. In conclusion, this study constructed an in silico pipeline able to successfully pre-screen vaccine targets characterized by high immunogenicity and protective immunity. We show that reverse vaccinology allowed screening of appropriate broad-spectrum vaccines for Salmonella.
10
Conformational epitope matching and prediction based on protein surface spiral features. BMC Genomics 2021; 22:116. [PMID: 34058977 PMCID: PMC8165135 DOI: 10.1186/s12864-020-07303-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/02/2020] [Accepted: 12/04/2020] [Indexed: 01/20/2023]
Abstract
Background: A conformational epitope (CE) is composed of neighboring amino acid residues located on an antigenic protein surface structure. CEs bind their complementary paratopes in B-cell receptors and/or antibodies. An effective and efficient prediction tool for CE analysis is critical for the development of immunology-related applications, such as vaccine design and disease diagnosis. Results: We propose a novel method consisting of two sequential modules: matching and prediction. The matching module includes two main approaches. The first approach is a complete sequence search (CSS) that applies BLAST to align the sequence with all known antigen sequences. Fragments with high epitope sequence identities are identified and the predicted residues are annotated on the query structure. The second approach is a spiral vector search (SVS) that adopts a novel surface spiral feature vector for large-scale surface patch detection when queried against a comprehensive epitope database. The prediction module also contains two proposed subsystems. The first system is based on knowledge-based energy and geometrical neighboring residue contents, and the second system adopts combinatorial features, including amino acid contents and physicochemical characteristics, to formulate corresponding geometric spiral vectors and compare them with all spiral vectors from known CEs. An integrated testing dataset was generated for method evaluation, and our two searching methods effectively identified all epitope regions. The prediction results show that our proposed method outperforms previously published systems in terms of sensitivity, specificity, positive predictive value, and accuracy. Conclusions: The proposed method significantly improves the performance of traditional epitope prediction. Matching followed by prediction is an efficient and effective approach compared to predicting directly on specific surfaces containing antigenic characteristics.
11
Solihah B, Azhari A, Musdholifah A. Enhancement of conformational B-cell epitope prediction using CluSMOTE. PeerJ Comput Sci 2020; 6:e275. [PMID: 33816926 PMCID: PMC7924438 DOI: 10.7717/peerj-cs.275] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/12/2019] [Accepted: 04/15/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND: A conformational B-cell epitope is one of the main components of vaccine design. It contains separate segments in its sequence, which are spatially close in the antigen chain. The availability of Ag-Ab complex data in the Protein Data Bank allows for the development of predictive methods. Several epitope prediction models have also been developed, including learning-based methods. However, the performance of these models is still not optimal. The main problem in learning-based prediction models is class imbalance. METHODS: This study proposes CluSMOTE, which is a combination of a cluster-based undersampling method and the Synthetic Minority Oversampling Technique. The approach is used to generate additional sample data to ensure that the dataset of conformational epitopes is balanced. The Hierarchical DBSCAN algorithm is performed to identify the clusters in the majority class. Some randomly selected data are taken from each cluster, considering the oversampling degree, and combined with the minority class data. The balanced data are utilized as the training dataset to develop a conformational epitope prediction model. Furthermore, two binary classification methods, Support Vector Machine and Decision Tree, are separately used to develop prediction models and to evaluate the performance of CluSMOTE in predicting conformational B-cell epitopes. The experiment is focused on determining the best parameters for optimal CluSMOTE. Two independent datasets are used to compare the proposed prediction model with state-of-the-art methods. The first and the second datasets represent general protein antigens and glycoprotein antigens, respectively. RESULT: The experimental result shows that the CluSMOTE Decision Tree outperformed the Support Vector Machine in terms of AUC and Gmean as performance measurements. The mean AUC of the CluSMOTE Decision Tree on the Kringelum and the SEPPA 3 test sets is 0.83 and 0.766, respectively. This shows that the CluSMOTE Decision Tree is better than other methods on general protein antigens, though comparable with SEPPA 3 on glycoprotein antigens.
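The oversampling half of CluSMOTE relies on SMOTE, which creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below shows only that interpolation step in plain Python; it is not the authors' code, it omits the cluster-based undersampling of the majority class, and the function name `smote` and its parameters are hypothetical.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length numeric tuples (needs at least two points).
    For each synthetic sample, pick a random minority point x, pick one of its
    k nearest minority neighbors nb, and return x + u * (nb - x), u in [0, 1).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # Nearest minority neighbors of x by squared Euclidean distance, excluding x itself.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        nb = rng.choice(others[:k])
        u = rng.random()
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy 2D feature vectors standing in for epitope residues (illustrative only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for point in smote(minority, n_synthetic=3):
    print(point)
```

Every synthetic point lies on a line segment between two existing minority points, so the method fills in the minority region rather than duplicating samples.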
Affiliation(s)
- Binti Solihah
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
  - Department of Informatics Engineering, Universitas Trisakti, Grogol, Jakarta Barat, Indonesia
- Azhari Azhari
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
- Aina Musdholifah
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
12
Zhou C, Chen Z, Zhang L, Yan D, Mao T, Tang K, Qiu T, Cao Z. SEPPA 3.0-enhanced spatial epitope prediction enabling glycoprotein antigens. Nucleic Acids Res 2020; 47:W388-W394. [PMID: 31114919 PMCID: PMC6602482 DOI: 10.1093/nar/gkz413] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 03/06/2019] [Revised: 04/25/2019] [Accepted: 05/05/2019] [Indexed: 01/19/2023]
Abstract
B-cell epitope information is critical to immune therapy and vaccine design. Protein epitopes can be significantly affected by glycosylation, yet no method has considered this until now. Based on previous versions of Spatial Epitope Prediction of Protein Antigens (SEPPA), we here present an enhanced tool, SEPPA 3.0, enabling glycoprotein antigens. Parameters were updated based on the latest and largest dataset. Then, additional micro-environmental features of glycosylation triangles and glycosylation-related amino acid indexes were added as important classifiers, coupled with final calibration based on neighboring antigenicity. The logistic regression model was retained, as in SEPPA 2.0. An AUC value of 0.794 was obtained through 10-fold cross-validation on internal validation. Independent testing on general protein antigens resulted in an AUC of 0.740 with a BA (balanced accuracy) of 0.657 as the baseline of SEPPA 3.0. Most importantly, when tested on independent glycoprotein antigens only, SEPPA 3.0 gave an AUC of 0.749 and a BA of 0.665, the top performance among peers. As the first server enabling accurate epitope prediction for glycoproteins, SEPPA 3.0 shows significant advantages over popular peers on both general protein and glycoprotein antigens. It can be accessed at http://bidd2.nus.edu.sg/SEPPA3/ or at http://www.badd-cao.net/seppa3/index.html. Batch query is supported.
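The AUC and balanced accuracy (BA) figures quoted above follow standard definitions, which the sketch below illustrates on made-up per-residue scores. It does not reproduce SEPPA 3.0 or its data; the function names and the 0.5 threshold are assumptions chosen for the example.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def balanced_accuracy(scores, labels, threshold=0.5):
    """Mean of sensitivity and specificity at the given score threshold."""
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("balanced accuracy needs both classes")
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    return 0.5 * (tp / n_pos + tn / n_neg)


# Illustrative per-residue epitope scores (1 = epitope residue).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(auc(scores, labels), 3))                # 0.889
print(round(balanced_accuracy(scores, labels), 3))  # 0.667
```

AUC is threshold-free, whereas BA depends on the chosen cutoff, which is why the two are usually reported together.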
Affiliation(s)
- Chen Zhou
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Zikun Chen
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Lu Zhang
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Deyu Yan
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Tiantian Mao
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Kailin Tang
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Tianyi Qiu
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Shanghai Public Health Clinical Center, Fudan University, Shanghai 200433, China
- Zhiwei Cao
  - Shanghai 10th People's Hospital & School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
13
|
Cho SH, Lee KM, Kim CH, Kim SS. Construction of a Lectin-Glycan Interaction Network from Enterohemorrhagic Escherichia coli Strains by Multi-omics Analysis. Int J Mol Sci 2020; 21:ijms21082681. [PMID: 32290560 PMCID: PMC7215717 DOI: 10.3390/ijms21082681] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/11/2020] [Revised: 04/04/2020] [Accepted: 04/07/2020] [Indexed: 11/17/2022] Open
Abstract
Enterohemorrhagic Escherichia coli (EHEC) causes hemorrhagic colitis and hemolytic uremic syndrome. EHEC infection begins with bacterial adherence to the host intestine via lectin-like adhesins that bind to the intestinal wall. However, EHEC-related lectin–glycan interactions (LGIs) remain unknown. Here, we conducted a genome-wide investigation of putative adhesins to construct an LGI network. We performed microarray-based transcriptomic and proteomic analyses with E. coli EDL933. Using PSORTb-based analysis, potential outer-membrane-embedded adhesins were predicted from the annotated genes of 318 strains. Predicted proteins were classified using TMHMM v2.0, SignalP v5.0, and LipoP v1.0. Functional and protein–protein interaction analyses were performed using InterProScan and String databases, respectively. Structural information of lectin candidate proteins was predicted using Iterative Threading ASSEmbly Refinement (I-TASSER) and Spatial Epitope Prediction of Protein Antigens (SEPPA) tools based on 3D structure and B-cell epitopes. Pathway analysis returned 42,227 Gene Ontology terms; we then selected 2585 lectin candidate proteins by multi-omics analysis and performed homology modeling and B-cell epitope analysis. We predicted a total of 24,400 outer-membrane-embedded proteins from the genome of 318 strains and integrated multi-omics information into the genomic information of the proteins. Our integrated multi-omics data will provide a useful resource for the construction of LGI networks of E. coli.
Affiliation(s)
- Seung-Hak Cho
  - Division of Bacterial Disease Research, Center for Infectious Disease Research, Korea National Institute of Health, Cheongju, Chungchungbuk-do 28160, Korea; (S.-H.C.); (K.M.L.)
- Kang Mo Lee
  - Division of Bacterial Disease Research, Center for Infectious Disease Research, Korea National Institute of Health, Cheongju, Chungchungbuk-do 28160, Korea; (S.-H.C.); (K.M.L.)
- Cheorl-Ho Kim
  - Glycobiology Unit, Department of Biological Science, Sungkyunkwan University and Samsung Advanced Institute for Health Science and Technology (SAIHST), Suwon, Gyeonggi-do 16419, Korea
  - Correspondence: (C.-H.K.); (S.S.K.); Tel.: +82-031-290-7002 (C.-H.K.); +82-043-719-8400 (S.S.K.); Fax: +82-043-719-8402 (S.S.K.)
- Sung Soon Kim
  - Division of Bacterial Disease Research, Center for Infectious Disease Research, Korea National Institute of Health, Cheongju, Chungchungbuk-do 28160, Korea; (S.-H.C.); (K.M.L.)
  - Correspondence: (C.-H.K.); (S.S.K.); Tel.: +82-031-290-7002 (C.-H.K.); +82-043-719-8400 (S.S.K.); Fax: +82-043-719-8402 (S.S.K.)
|
14
|
Application of Meta Learning to B-Cell Conformational Epitope Prediction. Methods Mol Biol 2020. [PMID: 32162268 DOI: 10.1007/978-1-0716-0389-5_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
One of the major challenges in the field of vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting distinct search strategies. In this chapter, we propose different ensemble meta-learning approaches for epitope prediction based on stacked, cascade generalizations, and meta decision trees. Through meta learning, we expect a meta learner to be able to integrate multiple prediction models and outperform the single best-performing model. The objective of this chapter is twofold: (1) to promote the complementary predictive strengths in different prediction tools and (2) to introduce computational models to exploit the synergy among various prediction tools. Our primary goal is not to develop any particular classifier for B-cell epitope prediction, but to advocate the feasibility of meta learning to epitope prediction. With the flexibility of meta learning, the researcher can construct various meta classification hierarchies that are applicable to epitope prediction in different protein domains.
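Stacked generalization, one of the meta-learning schemes named in the abstract above, can be sketched in a few lines. The sketch below assumes a logistic-regression meta learner trained by plain gradient descent on the scores of two base predictors; the base tools, their scores, the hyperparameters, and all function names are hypothetical and are not taken from the chapter.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over base-predictor scores (batch gradient descent)."""
    n_features = len(base_scores[0])
    n = len(labels)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(base_scores, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_meta(weights, bias, x):
    """Combined epitope probability for one residue given its base-predictor scores."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical scores from two base tools for six residues; only the first
# tool separates epitope (1) from non-epitope (0) residues in this toy set.
toy_scores = [[0.9, 0.4], [0.8, 0.6], [0.7, 0.5],
              [0.2, 0.5], [0.3, 0.6], [0.1, 0.4]]
toy_labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit_meta_learner(toy_scores, toy_labels)
```

In practice the meta learner would be trained on base-predictor outputs obtained on held-out data, so that it does not simply learn how well each base tool memorized its own training set.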
|
15
|
Ambrosetti F, Jiménez-García B, Roel-Touris J, Bonvin AMJJ. Modeling Antibody-Antigen Complexes by Information-Driven Docking. Structure 2019; 28:119-129.e2. [PMID: 31727476 DOI: 10.1016/j.str.2019.10.011] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 03/22/2019] [Revised: 07/03/2019] [Accepted: 10/18/2019] [Indexed: 10/25/2022]
Abstract
Antibodies are Y-shaped proteins essential for immune response. Their capability to recognize antigens with high specificity makes them excellent therapeutic targets. Understanding the structural basis of antibody-antigen interactions is therefore crucial for improving our ability to design efficient biological drugs. Computational approaches such as molecular docking are providing a valuable and fast alternative to experimental structural characterization for these complexes. We investigate here how information about complementarity-determining regions and binding epitopes can be used to drive the modeling process, and present a comparative study of four different docking software suites (ClusPro, LightDock, ZDOCK, and HADDOCK) providing specific options for antibody-antigen modeling. Their performance on a dataset of 16 complexes is reported. HADDOCK, which includes information to drive the docking, is shown to perform best in terms of both success rate and quality of the generated models in both the presence and absence of information about the epitope on the antigen.
Affiliation(s)
- Francesco Ambrosetti
  - Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00184 Rome, Italy; Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Brian Jiménez-García
  - Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Jorge Roel-Touris
  - Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Alexandre M J J Bonvin
  - Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
|
16
|
Zhao L, Wu S, Jiang J, Li W, Luo J, Li J. Novel overlapping subgraph clustering for the detection of antigen epitopes. Bioinformatics 2019; 34:2061-2068. [PMID: 29409062 DOI: 10.1093/bioinformatics/bty051] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/06/2017] [Accepted: 02/01/2018] [Indexed: 11/12/2022] Open
Abstract
Motivation: Antigens that contain overlapping epitopes have been occasionally reported. As current algorithms mainly take a one-antigen-one-epitope approach to the prediction of epitopes, they cannot accurately detect these multiple and overlapping epitopes, or even the multiple and separated epitopes existing in some other antigens.
Results: We introduce a novel subgraph clustering algorithm for more accurate detection of epitopes. This algorithm takes graph partitions as seeds, and expands the seeds to merge overlapping subgraphs based on the term frequency-inverse document frequency (TF-IDF) featured similarity. Then, the merged subgraphs are each classified as an epitope or non-epitope. Tests of our algorithm were conducted on three newly collected datasets of antigens. In the first dataset, each antigen contains only a single epitope; in the second, each antigen contains only multiple and separated epitopes; and in the third, each antigen contains overlapping epitopes. The prediction performance of our algorithm is significantly better than that of the state-of-the-art methods. The lifts of the averaged F-scores over the best existing methods are 60, 75 and 22% for the single epitope detection, the multiple and separated epitopes detection, and the overlapping epitopes detection, respectively.
Availability and implementation: The source code is available at github.com/lzhlab/glep/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Liang Zhao
  - Department of Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Hubei, China; Department of Computer Science, School of Computing and Electronic Information, Guangxi University, Nanning, China
- Shaogui Wu
  - Department of Computer Science, School of Computing and Electronic Information, Guangxi University, Nanning, China
- Jiawen Jiang
  - Department of Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Hubei, China
- Wencui Li
  - Department of Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Hubei, China
- Jie Luo
  - Department of Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Hubei, China
- Jinyan Li
  - Department of Data Science, Advanced Analytics Institute, Faculty of Engineering and IT, University of Technology Sydney, Broadway, NSW 2007, Australia
|
17
|
Pourseif MM, Yousefpour M, Aminianfar M, Moghaddam G, Nematollahi A. A multi-method and structure-based in silico vaccine designing against Echinococcus granulosus through investigating enolase protein. Bioimpacts 2019; 9:131-144. [PMID: 31508329 PMCID: PMC6726745 DOI: 10.15171/bi.2019.18] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 10/29/2018] [Revised: 11/27/2018] [Accepted: 12/04/2018] [Indexed: 12/24/2022]
Abstract
Introduction: Hydatid disease is a ubiquitous parasitic zoonotic disease, which causes different medical, economic and serious public health problems in some parts of the world. The causal organism is a multi-stage parasite named Echinococcus granulosus whose life cycle is dependent on two types of mammalian hosts viz definitive and intermediate hosts.
Methods: In this study, enolase, as a key functional enzyme in the metabolism of E. granulosus (EgEnolase), was targeted through a comprehensive in silico modeling analysis and designing a host-specific multi-epitope vaccine. Three-dimensional (3D) structure of enolase was modeled using MODELLER v9.18 software. The B-cell epitopes (BEs) were predicted based on the multi-method approach and via some authentic online predictors. ClusPro v2.0 server was used for docking-based T-helper epitope prediction. The 3D structure of the vaccine was modeled using the RaptorX server. The designed vaccine was evaluated for its immunogenicity, physicochemical properties, and allergenicity. The codon optimization of the vaccine sequence was performed based on the codon usage table of E. coli K12. Finally, the energy minimization and molecular docking were implemented for simulating the vaccine binding affinity to the TLR-2 and TLR-4 and the complex stability.
Results: The designed multi-epitope vaccine was found to induce anti-EgEnolase immunity which may have the potential to prevent the survival and proliferation of E. granulosus into the definitive host.
Conclusion: Based on the results, this step-by-step immunoinformatics approach could be considered a rational platform for designing vaccines against such multi-stage parasites. Furthermore, this multi-epitope vaccine is proposed as a promising preventive anti-echinococcosis agent.
Affiliation(s)
- Mohammad Mostafa Pourseif
  - Department of Physiology, Faculty of Medicine, AJA University of Medical Sciences, Tehran, Iran; Infectious Diseases and Tropical Medicine Research Center (IDTMRC), Department of Aerospace and Subaquatic Medicine, AJA University of Medical Sciences, Tehran, Iran
- Mitra Yousefpour
  - Department of Physiology, Faculty of Medicine, AJA University of Medical Sciences, Tehran, Iran
- Mohammad Aminianfar
  - Infectious Diseases and Tropical Medicine Research Center (IDTMRC), Department of Aerospace and Subaquatic Medicine, AJA University of Medical Sciences, Tehran, Iran
- Gholamali Moghaddam
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Ahmad Nematollahi
  - Department of Pathobiology, Veterinary College, University of Tabriz, Tabriz, Iran
|
18
|
Manavalan B, Govindaraj RG, Shin TH, Kim MO, Lee G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front Immunol 2018; 9:1695. [PMID: 30100904 PMCID: PMC6072840 DOI: 10.3389/fimmu.2018.01695] [Citation(s) in RCA: 108] [Impact Index Per Article: 18.0] [Received: 04/23/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022] Open
Abstract
Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which use, respectively, a combination of physicochemical properties (PCP) and amino acid composition and a combination of dipeptide and PCP as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first identifies peptide sequences as BCEs or non-BCEs, while the second gives users the option of mining potential BCEs from protein sequences.
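The performance figures above are given as Matthews correlation coefficients. As a minimal sketch of how that metric follows from a binary confusion matrix (standard definition; the function name and the convention of returning 0.0 when a marginal is empty are assumptions made here, not details from the paper):

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0.0 when any row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it informative when BCE and non-BCE classes differ in size.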
Affiliation(s)
- Rajiv Gandhi Govindaraj
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Tae Hwan Shin
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
  - Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
  - Department of Physiology, Ajou University School of Medicine, Suwon, South Korea; Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
|
19
|
Usmani SS, Kumar R, Bhalla S, Kumar V, Raghava GPS. In Silico Tools and Databases for Designing Peptide-Based Vaccine and Drugs. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2018; 112:221-263. [PMID: 29680238 DOI: 10.1016/bs.apcsb.2018.01.006] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 12/12/2022]
Abstract
Conventional approaches to drug screening and vaccine design are slow and demand patience, sustained effort, high cost, and additional manpower. Screening and experimentally validating thousands of molecules for a specific therapeutic property has never been an easy task. Similarly, the traditional way of vaccination involves administration of either the whole or an attenuated pathogen, which raises toxicity and safety issues. The emergence of sequencing and recombinant DNA technology led to the concept of epitope-based vaccination, i.e., small peptides (epitopes) can stimulate a specific immune response. The advent of bioinformatics has proved a useful adjunct to vaccine and drug design. Genomic study of pathogens helps to identify and analyze protective epitopes. A number of in silico tools have been developed in the last two decades to design immunotherapies as well as peptide-based drugs. These tools have proved to be a catalyst in drug and vaccine design. This review surveys therapeutic peptide databases as well as in silico tools developed for designing peptide-based vaccines and drugs.
Affiliation(s)
- Salman Sadullah Usmani
  - Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Rajesh Kumar
  - Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Sherry Bhalla
  - Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Vinod Kumar
  - Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Gajendra P S Raghava
  - Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
|
20
|
Pourseif MM, Moghaddam G, Daghighkia H, Nematollahi A, Omidi Y. A novel B- and helper T-cell epitopes-based prophylactic vaccine against Echinococcus granulosus. Bioimpacts 2017; 8:39-52. [PMID: 29713601 PMCID: PMC5915707 DOI: 10.15171/bi.2018.06] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/06/2017] [Revised: 12/02/2017] [Accepted: 12/03/2017] [Indexed: 12/17/2022]
Abstract
Introduction:
In this study, we targeted the worm stage of Echinococcus granulosus to design a novel multi-epitope B- and helper T-cell based vaccine construct for immunization of dogs against this multi-host parasite.
Methods:
The vaccine was designed based on the local Eg14-3-3 antigen (Ag). DNA samples were extracted from the protoscoleces of the infected sheep’s liver, and then subjected to the polymerase chain reaction (PCR) with 14-3-3 specific forward and reverse primers. For the vaccine design, several in silico steps were undertaken. The three-dimensional (3D) structure of the local Eg14-3-3 Ag was modeled with EasyModeller software. The protein modeling accuracy was then analyzed via various validation assays. Potential transmembrane helices, signal peptides, post-translational modifications and allergenicity of Eg14-3-3 were evaluated as preliminary measures for B-cell epitope (BE) prediction. Using many web-servers, a well-designed process was carried out for improved prediction of BEs. High-ranked linear and conformational BEs were utilized for engineering the final vaccine construct. Possible T-helper epitopes (TEs) were identified by molecular docking between 13-mer fragments of the Eg14-3-3 Ag and two highly frequent dog class II MHC alleles (i.e., DLA-DRB1*01101 and DRB1*01501). The epitope coverage was evaluated by Shannon’s variability plot.
Results:
The final designed construct was analyzed based on different physicochemical properties, and was then codon-optimized for high-level expression in Escherichia coli K12. This minigene construct is the first dog-specific epitopic vaccine construct that is established based on TEs with high binding affinity to canine MHC alleles.
Conclusion:
This in silico study is the first part of a multi-antigenic vaccine design effort that presents a novel dog-specific vaccine against E. granulosus. Here, we present key data on the step-by-step methodologies used for designing this de novo vaccine, which is under comprehensive in vivo investigation.
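The Methods of this abstract mention a Shannon variability plot for assessing epitope coverage. As a rough sketch of the quantity such a plot typically shows, assuming per-column Shannon entropy over an already aligned set of equal-length sequences (the abstract does not name the tool or its settings, and the function name and toy sequences below are hypothetical):

```python
import math
from collections import Counter


def shannon_entropy_per_column(aligned_seqs):
    """Shannon entropy (in bits) of the residue distribution at each alignment column."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs)
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical three-column alignment: fully conserved, two-state, and fully variable.
toy_alignment = ["ACD", "ACE", "AGF", "AGG"]
toy_entropy = shannon_entropy_per_column(toy_alignment)
```

Columns with entropy near zero are conserved across isolates, so epitopes that fall in such positions would be expected to cover more variants.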
Affiliation(s)
- Mohammad M Pourseif
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran; Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Gholamali Moghaddam
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Hossein Daghighkia
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Ahmad Nematollahi
  - Department of Pathobiology, Veterinary College, University of Tabriz, Tabriz, Iran
- Yadollah Omidi
  - Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran; Department of Pharmaceutics, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
|
21
|
Dhanda SK, Usmani SS, Agrawal P, Nagpal G, Gautam A, Raghava GPS. Novel in silico tools for designing peptide-based subunit vaccines and immunotherapeutics. Brief Bioinform 2017; 18:467-478. [PMID: 27016393 DOI: 10.1093/bib/bbw025] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 11/30/2015] [Indexed: 12/19/2022] Open
Abstract
The conventional approach for designing vaccine against a particular disease involves stimulation of the immune system using the whole pathogen responsible for the disease. In the post-genomic era, a major challenge is to identify antigenic regions or epitopes that can stimulate different arms of the immune system. In the past two decades, numerous methods and databases have been developed for designing vaccine or immunotherapy against various pathogen-causing diseases. This review describes various computational resources important for designing subunit vaccines or epitope-based immunotherapy. First, different immunological databases are described that maintain epitopes, antigens and vaccine targets. This is followed by in silico tools used for predicting linear and conformational B-cell epitopes required for activating humoral immunity. Finally, information on T-cell epitope prediction methods is provided that includes indirect methods like prediction of Major Histocompatibility Complex and transporter-associated protein binders. Different studies for validating the predicted epitopes are also examined critically. This review enlists novel in silico resources and tools available for predicting humoral and cell-mediated immune potential. These predicted epitopes could be used for designing epitope-based vaccines or immunotherapy as they may activate the adaptive immunity. Authors emphasized the need to develop tools for the prediction of adjuvants to activate innate and adaptive immune system simultaneously. In addition, attention has also been given to novel prediction methods to predict general therapeutic properties of peptides like half-life, cytotoxicity and immune toxicity.
|
22
|
Qiu J, Qiu T, Huang Y, Cao Z. Identifying the Epitope Regions of Therapeutic Antibodies Based on Structure Descriptors. Int J Mol Sci 2017; 18:E2457. [PMID: 29186775 PMCID: PMC5751102 DOI: 10.3390/ijms18122457] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/19/2017] [Revised: 11/13/2017] [Accepted: 11/13/2017] [Indexed: 11/16/2022] Open
Abstract
Therapeutic antibodies are widely used for disease detection and specific treatments. However, as an exogenous protein, these antibodies can be detected by the human immune system and elicit a response that can lead to serious illnesses. Therapeutic antibodies can be engineered through antibody humanization, which aims to maintain the specificity and biological function of the original antibodies, and reduce immunogenicity. However, the antibody drug effect is synchronously reduced as more exogenous parts are replaced by human antibodies. Hence, a major challenge in this area is to precisely detect the epitope regions in immunogenic antibodies and guide point mutations of exogenous antibodies to balance both humanization level and drug effect. In this article, the latest dataset of immunoglobulin complexes was collected from protein data bank (PDB) to discover the spatial features of immunogenic antibody. Furthermore, a series of structure descriptors were generated to characterize and distinguish epitope residues from non-immunogenic regions. Finally, a computational model was established based on structure descriptors, and results indicated that this model has the potential to precisely predict the epitope regions of therapeutic antibodies. With rapid accumulation of immunoglobulin complexes, this methodology could be used to improve and guide future antibody humanization and potential clinical applications.
Affiliation(s)
- Jingxuan Qiu
  - School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; (J.Q.); (Y.H.)
  - School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Tianyi Qiu
  - The Institute of Biomedical Sciences, Fudan University, Shanghai 200433, China
- Yin Huang
  - School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; (J.Q.); (Y.H.)
- Zhiwei Cao
  - School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; (J.Q.); (Y.H.)
|
23
|
Pourseif MM, Moghaddam G, Naghili B, Saeedi N, Parvizpour S, Nematollahi A, Omidi Y. A novel in silico minigene vaccine based on CD4 + T-helper and B-cell epitopes of EG95 isolates for vaccination against cystic echinococcosis. Comput Biol Chem 2017; 72:150-163. [PMID: 29195784 DOI: 10.1016/j.compbiolchem.2017.11.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/24/2017] [Revised: 11/20/2017] [Accepted: 11/21/2017] [Indexed: 01/03/2023]
Abstract
The EG95 oncospheral antigen plays a crucial role in Echinococcus granulosus pathogenicity. Considering the diversity of the antigen among different EG95 isolates, it seems to be an ideal antigen for designing a universal multivalent minigene vaccine, a so-called multi-epitope vaccine. This is the first in silico study to design a construct for the development of a global EG95-based hydatid vaccine against E. granulosus in intermediate hosts. After antigen sequence selection, the three-dimensional structure of EG95 was modeled and multilaterally validated. Preliminary parameters for B-cell epitope prediction were assessed, such as possible transmembrane helices, signal peptides, post-translational modifications and allergenicity. The high-ranked linear and conformational B-cell epitopes derived from several online web-servers (e.g., ElliPro, BepiPred v1.0, BcePred, ABCpred, SVMTrip, IEDB algorithms, SEPPA v2.0 and Discotope v2.0) were utilized for multiple sequence alignment and then for engineering the vaccine construct. T-helper based epitopes were predicted by molecular docking between the highly frequent ovar class II allele (Ovar-DRB1*1202) and hexadecamer fragments of the EG95 protein. Using immune-informatics tools, we formulated the first EG95-based minigene vaccine based on T-helper epitopes with high binding affinity to the ovar MHC allele. This designed construct was analyzed for different physicochemical properties. It was also codon-optimized for high-level expression in Escherichia coli K12. Taken together, we propose the present in silico vaccine constructs as a promising platform for the generation of broadly protective vaccines for species- and genus-specific immunization of the natural hosts of the parasite.
Affiliation(s)
- Mohammad M Pourseif
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran; Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Gholamali Moghaddam
  - Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Behrouz Naghili
  - Research Center for Infectious and Tropical Diseases, Tabriz University of Medical Sciences, Tabriz, Iran
- Nazli Saeedi
  - Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Sepideh Parvizpour
  - Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Ahmad Nematollahi
  - Department of Pathobiology, Veterinary College, University of Tabriz, Tabriz, Iran
- Yadollah Omidi
  - Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran; School of Advanced Biomedical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran; Department of Pharmaceutics, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
|
24
|
Ren J, Song J, Ellis J, Li J. Staged heterogeneity learning to identify conformational B-cell epitopes from antigen sequences. BMC Genomics 2017; 18:113. [PMID: 28361709 PMCID: PMC5374683 DOI: 10.1186/s12864-017-3493-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022] Open
Abstract
Background: The broad heterogeneity of antigen-antibody interactions brings tremendous challenges to the design of a widely applicable learning algorithm to identify conformational B-cell epitopes. Besides the intrinsic heterogeneity introduced by diverse species, extra heterogeneity can also be introduced by various data sources, adding another layer of complexity and further confounding the research.
Results: This work proposed a staged heterogeneity learning method, which learns both characteristics and heterogeneity of data in a phased manner. The method was applied to identify antigenic residues of heterogenous conformational B-cell epitopes based on antigen sequences. In the first stage, the model learns the general epitope patterns of each kind of propensity from a large data set containing computationally defined epitopes. In the second stage, the model learns the heterogenous complementarity of these propensities from a relatively small guided data set containing experimentally determined epitopes. Moreover, we designed an algorithm to cluster the predicted individual antigenic residues into conformational B-cell epitopes so as to provide strong potential for real-world applications, such as vaccine development. With heterogeneity well learnt, the transferability of the prediction model was remarkably improved to handle new data with a high level of heterogeneity. The model has been tested on two data sets with experimentally determined epitopes, and on a data set with computationally defined epitopes. This proposed sequence-based method achieved outstanding performance, about twice that of existing methods, including the sequence-based predictor CBTOPE and three other structure-based predictors.
Conclusions: The proposed method uses only antigen sequence information, and thus has much broader applications.
Affiliation(s)
- Jing Ren: Advanced Analytics Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia; College of Computer, National University of Defense Technology, Changsha, 410073, China
- Jiangning Song: Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia; Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- John Ellis: School of Life Sciences, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Jinyan Li: Advanced Analytics Institute and Centre for Health Technologies, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia
|
25
|
Abstract
The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and for epitope prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.
|
26
|
Incorporating structure context of HA protein to improve antigenicity calculation for influenza virus A/H3N2. Sci Rep 2016; 6:31156. [PMID: 27498613 PMCID: PMC4976332 DOI: 10.1038/srep31156] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/04/2016] [Accepted: 07/11/2016] [Indexed: 11/25/2022]
Abstract
The rapid and consistent mutation of influenza requires frequent evaluation of antigenicity variation among newly emerged strains, and several in silico methods have been reported to facilitate these assays. In this paper, we designed a structure-based antigenicity scoring model rather than a sequence-based one of the kind published previously. Protein structural context was adopted to derive the antigenicity-dominant positions, as well as the physicochemical change of the local micro-environment in correlation with antigenicity change. A position-specific scoring matrix (PSSM) profile and the local environmental change over these positions were then integrated to predict the antigenicity variance. Independent testing showed a high accuracy of 0.875 and a sensitivity of 0.986, with a significant ability to discover antigenic-escaping strains. When this model was applied to historical data, global and regional antigenic drift events were successfully detected. Furthermore, two well-known vaccine failure events were clearly suggested. Therefore, this structure-context model may be particularly useful for identifying vaccine strains that are likely to fail, in addition to suggesting potential new vaccine strains.
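The accuracy and sensitivity figures quoted in this abstract follow the standard confusion-matrix definitions. The sketch below shows those two formulas only; the counts are hypothetical values chosen for illustration and are not taken from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of actual positives (e.g. true antigenic variants) that are detected."""
    return tp / (tp + fn)


# Hypothetical counts, for illustration only (not the paper's data).
tp, tn, fp, fn = 8, 7, 3, 2

acc = accuracy(tp, tn, fp, fn)   # 15 correct out of 20
sens = sensitivity(tp, fn)       # 8 detected out of 10 positives
print(f"accuracy={acc:.3f} sensitivity={sens:.3f}")
```

A model can combine very high sensitivity with lower accuracy when most of its errors are false positives, which is the pattern the reported values (0.986 versus 0.875) suggest.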
|
27
|
Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022]
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.
|
28
|
Thakur A, Rajput A, Kumar M. MSLVP: prediction of multiple subcellular localization of viral proteins using a support vector machine. Mol Biosyst 2016; 12:2572-86. [DOI: 10.1039/c6mb00241b] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/16/2023]
Abstract
Knowledge of the subcellular location (SCL) of viral proteins in the host cell is important for understanding their function in depth.
Affiliation(s)
- Anamika Thakur: Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh-160036, India
- Akanksha Rajput: Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh-160036, India
- Manoj Kumar: Bioinformatics Centre, Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh-160036, India
|
29
|
Ren J, Liu Q, Ellis J, Li J. Positive-unlabeled learning for the prediction of conformational B-cell epitopes. BMC Bioinformatics 2015; 16 Suppl 18:S12. [PMID: 26681157 PMCID: PMC4682424 DOI: 10.1186/1471-2105-16-s18-s12] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 02/08/2023]
Abstract
Background: The incomplete ground truth of training data of B-cell epitopes is a demanding issue in computational epitope prediction. The challenge is that only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues are unlabeled. As some of these uncertain residues can possibly be grouped to form novel but currently unknown epitopes, it is misguided to uniformly classify all the unlabeled residues as negative training data following the traditional supervised learning scheme. Results: We propose a positive-unlabeled learning algorithm to address this problem. The key idea is to distinguish between epitope-likely residues and reliable negative residues in unlabeled data. The method has two steps: (1) identify reliable negative residues using a weighted SVM with a high recall; and (2) construct a classification model on the positive residues and the reliable negative residues. Complex-based 10-fold cross-validation was conducted to show that this method outperforms the commonly used predictors DiscoTope 2.0, ElliPro and SEPPA 2.0 in every aspect. We conducted four case studies, in which the approach was tested on antigens of West Nile virus, dihydrofolate reductase, beta-lactamase, and two Ebola antigens whose epitopes are currently unknown. All the results were assessed on a newly established data set of antigen structures not bound by antibodies, instead of on antibody-bound antigen structures. These bound structures may contain unfair binding information, such as bound-state B-factors and protrusion index, which could exaggerate the epitope prediction performance. Source code is available on request.
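The two-step positive-unlabeled scheme described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: a weighted logistic regression trained by gradient descent stands in for the weighted SVM, the single feature per residue is a hypothetical score, and the toy data, the `positive_weight` value and the thresholding rule (the lowest score among known positives, so that recall on them stays at 100%) are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_weighted_logistic(X, y, sample_weight, lr=0.5, steps=500):
    """Weighted logistic regression fitted by plain gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    total = sum(sample_weight)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi, si in zip(X, y, sample_weight):
            err = si * (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi)
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / total for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b


def score(model, x):
    """Estimated probability that residue x is antigenic."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def pu_two_step(positives, unlabeled, positive_weight=5.0):
    # Step 1: treat unlabeled residues as provisional negatives, but weight
    # the known positives heavily so that the model keeps high recall on them.
    X = positives + unlabeled
    y = [1] * len(positives) + [0] * len(unlabeled)
    weights = [positive_weight] * len(positives) + [1.0] * len(unlabeled)
    step1 = train_weighted_logistic(X, y, weights)
    # Unlabeled residues scoring below every known positive are kept as
    # "reliable negatives"; the rest are left out as epitope-likely.
    threshold = min(score(step1, x) for x in positives)
    reliable = [i for i, x in enumerate(unlabeled) if score(step1, x) < threshold]
    # Step 2: train the final classifier on positives vs. reliable negatives.
    negatives = [unlabeled[i] for i in reliable]
    X2 = positives + negatives
    y2 = [1] * len(positives) + [0] * len(negatives)
    final = train_weighted_logistic(X2, y2, [1.0] * len(X2))
    return reliable, final


# Hypothetical one-feature toy data: the first two unlabeled residues are
# hidden positives, the remaining five are true negatives.
positives = [[2.0], [2.2], [1.8], [2.5]]
unlabeled = [[2.1], [1.9], [-2.0], [-1.8], [-2.2], [-2.5], [-1.5]]

reliable_idx, final_model = pu_two_step(positives, unlabeled)
print("reliable negative indices:", reliable_idx)
print("hidden positives flagged:", [score(final_model, x) > 0.5 for x in unlabeled[:2]])
```

On this toy input the hidden positives are excluded from the reliable-negative set in step 1, so the final classifier is not trained to reject them. Real residue features are multidimensional and noisy, so the threshold would normally be tuned rather than fixed at the minimum positive score.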
|
30
|
Bari FD, Parida S, Asfor AS, Haydon DT, Reeve R, Paton DJ, Mahapatra M. Prediction and characterization of novel epitopes of serotype A foot-and-mouth disease viruses circulating in East Africa using site-directed mutagenesis. J Gen Virol 2015; 96:1033-1041. [PMID: 25614587 PMCID: PMC4631058 DOI: 10.1099/vir.0.000051] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/15/2014] [Accepted: 01/12/2015] [Indexed: 02/06/2023]
Abstract
Epitopes on the surface of the foot-and-mouth disease virus (FMDV) capsid have been identified by monoclonal antibody (mAb) escape mutant studies leading to the designation of four antigenic sites in serotype A FMDV. Previous work focused on viruses isolated mainly from Asia, Europe and Latin America. In this study we report on the prediction of epitopes in African serotype A FMDVs and testing of selected epitopes using reverse genetics. Twenty-four capsid amino acid residues were predicted to be of antigenic significance by analysing the capsid sequences (n = 56) using in silico methods, and six residues by correlating capsid sequence with serum-virus neutralization data. The predicted residues were distributed on the surface-exposed capsid regions, VP1-VP3. The significance of residue changes at eight of the predicted epitopes was tested by site-directed mutagenesis using a cDNA clone resulting in the generation of 12 mutant viruses involving seven sites. The effect of the amino acid substitutions on the antigenic nature of the virus was assessed by virus neutralization (VN) test. Mutations at four different positions, namely VP1-43, VP1-45, VP2-191 and VP3-132, led to significant reduction in VN titre (P value = 0.05, 0.05, 0.001 and 0.05, respectively). This is the first time, to our knowledge, that the antigenic regions encompassing amino acids VP1-43 to -45 (equivalent to antigenic site 3 in serotype O), VP2-191 and VP3-132 have been predicted as epitopes and evaluated serologically for serotype A FMDVs. This identifies novel capsid epitopes of recently circulating serotype A FMDVs in East Africa.
Affiliation(s)
- Fufa Dawo Bari: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK
- Satya Parida: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK
- Amin S. Asfor: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK
- Daniel T. Haydon: Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, G12 8QQ, UK
- Richard Reeve: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK; Boyd Orr Centre for Population and Ecosystem Health, Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, G12 8QQ, UK
- David J. Paton: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK
- Mana Mahapatra: The Pirbright Institute, Ash Road, Woking, Surrey, GU24 0NF, UK
|
31
|
Abstract
Vaccination has a proven record as one of the most effective medical approaches to prevent the spread of infectious diseases. Traditional vaccine approaches involve the administration of whole killed or weakened microorganisms to stimulate protective immune responses. Such approaches deliver many microbial components, some of which contribute to protective immunity, and assist in guiding the type of immune response that is elicited. Despite their impeccable record, these approaches have failed to yield vaccines for many important infectious organisms. This has prompted a move towards more defined vaccines ('subunit vaccines'), where individual protective components are administered. This unit provides an overview of the components that are used for the development of modern vaccines including: an introduction to different vaccine types (whole organism, protein/peptide, polysaccharide, conjugate, and DNA vaccines); techniques for identifying subunit antigens; vaccine delivery systems; and immunostimulatory agents ('adjuvants'), which are fundamental for the development of effective subunit vaccines.
|
32
|
Hu YJ, Lin SC, Lin YL, Lin KH, You SN. A meta-learning approach for B-cell conformational epitope prediction. BMC Bioinformatics 2014; 15:378. [PMID: 25403375 PMCID: PMC4237749 DOI: 10.1186/s12859-014-0378-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/11/2014] [Accepted: 11/05/2014] [Indexed: 12/11/2022]
Abstract
Background: One of the major challenges in the field of vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting distinct search strategies. We propose a meta-learning approach for epitope prediction based on stacked and cascade generalizations. Through meta learning, we expect a meta learner to be able to integrate multiple prediction models and outperform the single best-performing model. The objective of this study is twofold: (1) to analyze the complementary predictive strengths in different prediction tools, and (2) to introduce a generic computational model to exploit the synergy among various prediction tools. Our primary goal is not to develop any particular classifier for B-cell epitope prediction, but to advocate the feasibility of meta learning for epitope prediction. With the flexibility of meta learning, the researcher can construct various meta classification hierarchies that are applicable to epitope prediction in different protein domains. Results: We developed the hierarchical meta-learning architectures based on stacked and cascade generalizations. The bottom level of the hierarchy consisted of four conformational and four linear epitope prediction tools that served as the base learners. To perform consistent and unbiased comparisons, we tested the meta-learning method on an independent set of antigen proteins that were not used previously to train the base epitope prediction tools. In addition, we conducted correlation and ablation studies of the base learners in the meta-learning model. Low correlation among the predictions of the base learners suggested that the eight base learners had complementary predictive capabilities. The ablation analysis indicated that the eight base learners differentially interacted and contributed to the final meta model. The results of the independent test demonstrated that the meta-learning approach markedly outperformed the single best-performing epitope predictor. Conclusions: Computational B-cell epitope prediction tools exhibit several differences that affect their performances when predicting epitopic regions in protein antigens. The proposed meta-learning approach for epitope prediction combines multiple prediction tools by integrating their complementary predictive strengths. Our experimental results demonstrate the superior performance of the combined approach in comparison with single epitope predictors. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0378-y) contains supplementary material, which is available to authorized users.
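The stacked generalization idea in this abstract, in which a meta learner is trained on the outputs of several base predictors, can be sketched as below. This is only an illustration under stated assumptions: the two base tools and their per-residue scores are hypothetical, the meta learner is a plain logistic regression rather than whatever classifier the authors used, and accuracy is measured on the toy training residues instead of an independent test set.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_scores, labels, lr=1.0, steps=3000):
    """Logistic-regression meta learner fitted by gradient descent.

    base_scores[i][j] is the score that base tool j assigns to residue i.
    """
    n, k = len(base_scores), len(base_scores[0])
    w, b = [0.0] * k, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(base_scores, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def meta_predict(model, x):
    """Combine the base-tool scores for one residue into a single call."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


def accuracy(preds, labels):
    return sum(int(p) == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical per-residue scores from two base tools (1 = epitope residue).
# Each tool misses a different true epitope residue at a 0.5 cutoff.
tool_a = [0.9, 0.8, 0.45, 0.1, 0.2, 0.1]
tool_b = [0.45, 0.9, 0.8, 0.2, 0.1, 0.1]
labels = [1, 1, 1, 0, 0, 0]

base_scores = [list(pair) for pair in zip(tool_a, tool_b)]
meta_model = fit_meta_learner(base_scores, labels)

acc_a = accuracy([s >= 0.5 for s in tool_a], labels)
acc_b = accuracy([s >= 0.5 for s in tool_b], labels)
acc_meta = accuracy([meta_predict(meta_model, x) for x in base_scores], labels)
print(f"tool A={acc_a:.2f} tool B={acc_b:.2f} meta={acc_meta:.2f}")
```

Because the two hypothetical tools err on different residues, the combined score separates the classes where neither tool does alone, which mirrors the complementarity argument made in the abstract. A faithful evaluation would fit the meta learner on one set of antigens and report accuracy on antigens unseen by both the base tools and the meta learner.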
Affiliation(s)
- Yuh-Jyh Hu: Department of Computer Science, National Chiao Tung University, 1001 University Rd, Hsinchu, Taiwan
|