1
Vardaxis I, Simovski B, Anzar I, Stratford R, Clancy T. Deep learning of antibody epitopes using positional permutation vectors. Comput Struct Biotechnol J 2024; 23:2695-2707. [PMID: 39035832 PMCID: PMC11260035 DOI: 10.1016/j.csbj.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 06/04/2024] [Accepted: 06/04/2024] [Indexed: 07/23/2024] Open
Abstract
Background: The accurate computational prediction of B cell epitopes can vastly reduce the cost and time required to identify potential epitope candidates for the design of vaccines and immunodiagnostics. However, current computational tools for B cell epitope prediction perform poorly and are not fit for purpose, so there remains considerable room for improvement and a need for better prediction strategies. Results: Here we propose a novel approach that improves B cell epitope prediction by encoding epitopes as binary positional permutation vectors that represent the position and structural properties of the amino acids within a protein antigen sequence that interact with an antibody. This approach supersedes the traditional method of defining epitopes as scores per amino acid on a protein sequence, where each score reflects that amino acid's predicted probability of taking part in a B cell epitope-antibody interaction. In addition to defining epitopes as binary positional permutation vectors, the approach also uses the 3D macrostructure features of the unbound protein structures, and in turn uses these features to train another deep learning model on the corresponding antibody-bound protein 3D structures. This enables the algorithm to learn the key structural and physicochemical features of the unbound protein and embedded epitope that initiate the antibody binding process, helping to eliminate "induced fit" biases in the training data. We demonstrate that the strategy predicts B cell epitopes with improved accuracy compared to existing tools. Additionally, we show that this approach reliably identifies the majority of experimentally verified epitopes on the spike protein of SARS-CoV-2 not seen by the model during training and generalizes robustly to dissimilar data not seen during training.
Conclusions: With the approach described herein, a primary protein sequence and a query positional permutation vector encoding a putative epitope are sufficient to predict B cell epitopes reliably, potentially advancing the use of computational B cell epitope prediction in biomedical research applications.
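To make the idea of a binary positional vector concrete, here is a minimal sketch. It only marks which residue positions belong to a putative epitope; the published encoding also carries structural properties, which are not reproduced here. The function names, the 1-based indexing, and the toy sequence are assumptions made for illustration.

```python
from typing import Iterable, List


def encode_epitope(sequence: str, epitope_positions: Iterable[int]) -> List[int]:
    """Return a 0/1 vector with one entry per residue of `sequence`.

    `epitope_positions` are 1-based residue indices assumed to contact the
    antibody. Illustrative encoding only, not the published one.
    """
    vector = [0] * len(sequence)
    for pos in epitope_positions:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        vector[pos - 1] = 1
    return vector


def decode_epitope(sequence: str, vector: List[int]) -> List[str]:
    """List the residues flagged in `vector`, labelled like 'K2'."""
    return [f"{aa}{i + 1}" for i, (aa, bit) in enumerate(zip(sequence, vector)) if bit]


toy_antigen = "MKTAYIAKQR"  # made-up sequence, not a real antigen
toy_vector = encode_epitope(toy_antigen, [2, 3, 8])
toy_residues = decode_epitope(toy_antigen, toy_vector)
```

A query vector of this kind, paired with the primary sequence, is the sort of input the abstract describes; how the model scores it is outside the scope of this sketch.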
Affiliation(s)
- Ioannis Vardaxis
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Boris Simovski
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Irantzu Anzar
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Richard Stratford
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
- Trevor Clancy
  - NEC OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, Oslo 0379, Norway
  - Department of Vaccine Informatics, Institute for Tropical Medicine, Nagasaki University, Japan
2
Douradinha B. Computational strategies in Klebsiella pneumoniae vaccine design: navigating the landscape of in silico insights. Biotechnol Adv 2024; 76:108437. [PMID: 39216613 DOI: 10.1016/j.biotechadv.2024.108437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 07/07/2024] [Accepted: 08/25/2024] [Indexed: 09/04/2024]
Abstract
The emergence of multidrug-resistant Klebsiella pneumoniae poses a grave threat to global public health, necessitating urgent strategies for vaccine development. In this context, computational tools have emerged as indispensable assets, offering unprecedented insights into klebsiellal biology and facilitating the design of effective vaccines. Here, a review of the application of computational methods in the development of K. pneumoniae vaccines is presented, elucidating the transformative impact of in silico approaches. Through a systematic exploration of bioinformatics, structural biology, and immunoinformatics techniques, the complex landscape of K. pneumoniae pathogenesis and antigenicity was unravelled. Key insights into virulence factors, antigen discovery, and immune response mechanisms are discussed, highlighting the pivotal role of computational tools in accelerating vaccine development efforts. Advancements in epitope prediction, antigen selection, and vaccine design optimisation are examined, highlighting the potential of in silico approaches to update vaccine development pipelines. Furthermore, challenges and future directions in leveraging computational tools to combat K. pneumoniae are discussed, emphasizing the importance of multidisciplinary collaboration and data integration. This review provides a comprehensive overview of the current state of computational contributions to K. pneumoniae vaccine development, offering insights into innovative strategies for addressing this urgent global health challenge.
3
Angaitkar P, Janghel RR, Sahu TP. DL-TCNN: Deep Learning-based Temporal Convolutional Neural Network for prediction of conformational B-cell epitopes. 3 Biotech 2023; 13:297. [PMID: 37575599 PMCID: PMC10412510 DOI: 10.1007/s13205-023-03716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Accepted: 07/24/2023] [Indexed: 08/15/2023] Open
Abstract
Prediction of conformational B-cell epitopes (CBCE) is an essential step in vaccine design, drug discovery, and accurate disease diagnosis. Many laboratory and computational approaches have been developed to predict CBCE. However, laboratory experiments are costly and time-consuming, which has led to the popularity of Machine Learning (ML)-based computational methods. Although ML methods have succeeded in many domains, achieving higher accuracy in CBCE prediction remains a challenge. To address this drawback and the limitations of ML methods, this paper proposes a novel DL-based framework for CBCE prediction, leveraging the capabilities of deep learning in the medical domain. The proposed model is named Deep Learning-based Temporal Convolutional Neural Network (DL-TCNN), which hybridizes an empirically hyper-tuned 1D-CNN and a TCN. TCN is an architecture that employs causal convolutions and dilations, adapting well to sequential input with extensive receptive fields. To train the proposed model, physicochemical features are first extracted from antigen sequences. Next, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. Finally, the proposed DL-TCNN is employed for the prediction of CBCE. The model's performance is evaluated and validated on a benchmark antigen-antibody dataset. The DL-TCNN achieves 94.44% accuracy and 0.989 AUC on the training dataset; 78.53% accuracy and 0.661 AUC on the validation dataset; and 85.10% accuracy and 0.855 AUC on the testing dataset. The proposed model outperforms all the existing CBCE methods.
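The causal, dilated convolution that a TCN relies on can be sketched in a few lines. This is a plain-Python illustration of the operation only, assuming zero padding on the left; it is not the DL-TCNN architecture, and the function names are hypothetical.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Inputs before the start of the sequence are treated as 0, so the output
    at step t never depends on steps after t (the "causal" property).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += weight * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input steps one output can see after stacking the layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

Stacking layers with dilations 1, 2, 4 and a kernel of size 2 gives a receptive field of 8 steps, which is why dilation lets a shallow network cover long stretches of an antigen sequence.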
Affiliation(s)
- Pratik Angaitkar
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Rekh Ram Janghel
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
- Tirath Prasad Sahu
  - Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India
4
Qi Y, Zheng P, Huang G. DeepLBCEPred: A Bi-LSTM and multi-scale CNN-based deep learning method for predicting linear B-cell epitopes. Front Microbiol 2023; 14:1117027. [PMID: 36910218 PMCID: PMC9992402 DOI: 10.3389/fmicb.2023.1117027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 01/17/2023] [Indexed: 02/24/2023] Open
Abstract
The epitope is the site where antigens and antibodies interact and is vital to understanding the immune system. Experimental identification of linear B-cell epitopes (BCEs) is expensive, labor-intensive, and low-throughput. Although a few computational methods have been proposed to address this challenge, there is still a long way to go before practical application. We propose a deep learning method called DeepLBCEPred for predicting linear BCEs, which consists of bi-directional long short-term memory (Bi-LSTM), feed-forward attention, and multi-scale convolutional neural networks (CNNs). We extensively tested the performance of DeepLBCEPred through cross-validation on the training dataset and independent tests on two testing datasets. The empirical results showed that DeepLBCEPred obtained state-of-the-art performance. We also investigated the contribution of the different deep learning elements to recognizing linear BCEs. In addition, we have developed a user-friendly web application for linear BCE prediction, which is freely available to all scientific researchers at: http://www.biolscience.cn/DeepLBCEPred/.
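As a rough illustration of what an attention layer over recurrent outputs does, the sketch below pools a list of per-position hidden states into one context vector using softmax weights. It assumes a purely linear scoring function with hypothetical weights `score_weights`; the feed-forward attention in DeepLBCEPred may use a different scoring network, so treat this as a simplification.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]], score_weights: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for a sequence of states.

    Each state gets a scalar score (dot product with `score_weights`); the
    context vector is the softmax-weighted sum of the states.
    """
    scores = [sum(w * h for w, h in zip(score_weights, state)) for state in hidden_states]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * state[d] for a, state in zip(alphas, hidden_states)) for d in range(dim)
    ]
    return alphas, context
```

With all-zero scoring weights the layer reduces to mean pooling; learned weights let the model emphasize the positions most informative for the epitope call.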
Affiliation(s)
- Yue Qi
  - School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
- Peijie Zheng
  - School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
- Guohua Huang
  - School of Information Engineering, Shaoyang University, Shaoyang, Hunan, China
5
Lu S, Li Y, Ma Q, Nan X, Zhang S. A Structure-Based B-cell Epitope Prediction Model Through Combing Local and Global Features. Front Immunol 2022; 13:890943. [PMID: 35844532 PMCID: PMC9283778 DOI: 10.3389/fimmu.2022.890943] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/07/2022] [Accepted: 05/23/2022] [Indexed: 11/24/2022] Open
Abstract
B-cell epitopes (BCEs) are a set of specific sites on the surface of an antigen that bind to an antibody produced by a B cell. The recognition of BCEs is a major challenge for drug design and vaccine development. Compared with experimental methods, computational approaches have strong potential for BCE prediction at much lower cost. However, most current methods focus on local information around the target residue without taking the global information of the whole antigen sequence into consideration. We propose a novel deep learning method that combines local and global features for BCE prediction. In our model, two parallel modules are built to extract local and global features from the antigen separately. For local features, we use Graph Convolutional Networks (GCNs) to capture information from the spatial neighbors of a target residue. For global features, Attention-Based Bidirectional Long Short-Term Memory (Att-BLSTM) networks are applied to extract information from the whole antigen sequence. The local and global features are then combined to predict BCEs. The experiments show that the proposed method achieves superior performance over state-of-the-art BCE prediction methods on benchmark datasets. We also compare performance with and without global features. The experimental results show that global features play an important role in BCE prediction. Our detailed case study on BCE prediction for the SARS-CoV-2 receptor binding domain confirms that our method is effective for predicting and clustering true BCEs.
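The "spatial neighbors" idea behind the local module can be sketched as a contact graph plus one round of neighbor averaging. This is only the aggregation step that a graph convolution builds on, with no learned weights or nonlinearity; the 10 Å cutoff, the function names, and the toy coordinates are assumptions for illustration.

```python
import math
from typing import List, Sequence


def build_contact_graph(
    coords: Sequence[Sequence[float]], cutoff: float = 10.0
) -> List[List[int]]:
    """Link residues whose representative atoms lie within `cutoff` of each other."""
    n = len(coords)
    neighbors: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def mean_aggregate(
    features: Sequence[Sequence[float]], neighbors: Sequence[Sequence[int]]
) -> List[List[float]]:
    """Replace each residue's features by the mean over itself and its neighbors."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [i] + list(nbrs)
        dim = len(features[i])
        out.append([sum(features[g][d] for g in group) / len(group) for d in range(dim)])
    return out
```

A residue with no neighbors within the cutoff keeps its own features, so isolated surface residues are not diluted by distant parts of the structure.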
Affiliation(s)
- Shuai Lu
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
- Yuguang Li
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
- Qiang Ma
  - School of Life Sciences, Zhengzhou University, Zhengzhou, China
- Xiaofei Nan
  - School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, China
  - *Correspondence: Xiaofei Nan; Shoutao Zhang
- Shoutao Zhang
  - School of Life Sciences, Zhengzhou University, Zhengzhou, China
  - Longhu Laboratory of Advanced Immunology, Zhengzhou, China
  - *Correspondence: Xiaofei Nan; Shoutao Zhang
6
Comprehensive Linear Epitope Prediction System for Host Specificity in Nodaviridae. Viruses 2022; 14:v14071357. [PMID: 35891339 PMCID: PMC9319239 DOI: 10.3390/v14071357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 06/15/2022] [Accepted: 06/20/2022] [Indexed: 02/01/2023] Open
Abstract
Background: Nodaviridae infection is one of the leading causes of death in commercial fish. Although many vaccines against this virus family have been developed, their efficacies are relatively low. Nodaviridae are categorized into three subfamilies: alphanodavirus (infects insects), betanodavirus (infects fish), and gammanodavirus (infects prawns). These three subfamilies possess host-specific characteristics that could be used to identify effective linear epitopes (LEs). Methodology: A multi-expert system using five existing LE prediction servers was established to obtain initial LE candidates. Based on the different clustered pathogen groups, both conserved and exclusive LEs among the Nodaviridae family could be identified. The advantages of undocumented cross infection among the different host species for the Nodaviridae family were applied to re-evaluate the impact of LE prediction. The surface structural characteristics of the identified conserved and unique LEs were confirmed through 3D structural analysis, and the concept of surface patches was proposed to analyze the spatial characteristics and physicochemical propensities of the predicted segments. In addition, an intelligent classifier based on the Immune Epitope Database (IEDB) dataset was used to review the predicted segments, and enzyme-linked immunosorbent assays (ELISAs) were performed to identify host-specific LEs. Principal findings: We predicted 29 LEs for Nodaviridae. The analysis of the surface patches showed common tendencies regarding shape, curvedness, and PH features for the predicted LEs. Among them, five predicted exclusive LEs for fish species were selected and synthesized, and the corresponding ELISAs for antigenic feature analysis were examined. Conclusion: Five identified LEs possessed antigenicity and host specificity for grouper fish. We demonstrate that the proposed method provides an effective approach for in silico LE prediction prior to vaccine development and is especially powerful for analyzing antigen sequences with exclusive features among clustered antigen groups.
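One simple way to combine several linear-epitope predictors into initial candidates is per-residue voting. The sketch below assumes each server's output has already been converted to a 0/1 call per residue; the vote threshold, minimum segment length, and function name are illustrative assumptions, not the rules used by the multi-expert system in the abstract.

```python
from typing import List, Sequence, Tuple


def consensus_segments(
    predictions: Sequence[Sequence[int]], min_votes: int = 3, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) segments backed by enough predictors.

    `predictions` holds one 0/1 list per predictor, all of equal length.
    `min_votes` must be at least 1.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence length")
    votes = [sum(p[i] for p in predictions) for i in range(length)]
    segments = []
    start = None
    # A trailing 0 acts as a sentinel that closes a segment ending at the last residue.
    for i, count in enumerate(votes + [0]):
        if count >= min_votes and start is None:
            start = i
        elif count < min_votes and start is not None:
            if i - start >= min_length:
                segments.append((start + 1, i))
            start = None
    return segments
```

Raising `min_votes` trades sensitivity for agreement between tools, and `min_length` filters out isolated residues that are too short to synthesize as peptides.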
7
Mohammadzadeh R, Soleimanpour S, Pishdadian A, Farsiani H. Designing and development of epitope-based vaccines against Helicobacter pylori. Crit Rev Microbiol 2021; 48:489-512. [PMID: 34559599 DOI: 10.1080/1040841x.2021.1979934] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
Abstract
Helicobacter pylori infection is the principal cause of serious diseases (e.g. gastric cancer and peptic ulcers). Antibiotic therapy is an inadequate strategy for H. pylori eradication, which makes vaccination a necessary approach. Despite the large number of vaccine candidates, vaccines currently in clinical trials have shown poor efficacy, which makes vaccination extremely challenging. Remarkable advancements in immunology and pathogen biology have provided an opportunity to develop various epitope-based vaccines. The fusion of suitable antigens involved in different aspects of H. pylori colonization and pathogenesis with peptide linkers and built-in adjuvants produces epitope-based vaccines with excellent therapeutic efficacy and negligible adverse effects. Difficulties with the in vitro culture of H. pylori, high genetic variation, and unfavourable immune responses against weak epitopes in the complete antigen are major drawbacks of current vaccine strategies that epitope-based vaccines may overcome. The advantages of epitope-based vaccines include a lower biohazard risk, precise formulation design, savings in time and cost, and induction of maximum immunity with minimum adverse effects. The present article is a comprehensive review of strategies for designing and developing epitope-based vaccines, providing insights into innovative vaccination against H. pylori.
Affiliation(s)
- Roghayeh Mohammadzadeh
  - Antimicrobial Resistance Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Department of Microbiology and Virology, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Saman Soleimanpour
  - Antimicrobial Resistance Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Reference Tuberculosis Laboratory, Mashhad University of Medical Sciences, Mashhad, Iran
- Abbas Pishdadian
  - Department of Immunology, School of Medicine, Zabol University of Medical Sciences, Zabol, Iran
- Hadi Farsiani
  - Antimicrobial Resistance Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Department of Microbiology and Virology, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
8
Conformational epitope matching and prediction based on protein surface spiral features. BMC Genomics 2021; 22:116. [PMID: 34058977 PMCID: PMC8165135 DOI: 10.1186/s12864-020-07303-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/02/2020] [Accepted: 12/04/2020] [Indexed: 01/20/2023] Open
Abstract
Background: A conformational epitope (CE) is composed of neighboring amino acid residues located on an antigenic protein surface structure. CEs bind their complementary paratopes in B-cell receptors and/or antibodies. An effective and efficient prediction tool for CE analysis is critical for the development of immunology-related applications, such as vaccine design and disease diagnosis. Results: We propose a novel method consisting of two sequential modules: matching and prediction. The matching module includes two main approaches. The first approach is a complete sequence search (CSS) that applies BLAST to align the sequence with all known antigen sequences. Fragments with high epitope sequence identities are identified and the predicted residues are annotated on the query structure. The second approach is a spiral vector search (SVS) that adopts a novel surface spiral feature vector for large-scale surface patch detection when queried against a comprehensive epitope database. The prediction module also contains two proposed subsystems. The first system is based on knowledge-based energy and geometrical neighboring residue contents, and the second system adopts combinatorial features, including amino acid contents and physicochemical characteristics, to formulate corresponding geometric spiral vectors and compare them with all spiral vectors from known CEs. An integrated testing dataset was generated for method evaluation, and our two searching methods effectively identified all epitope regions. The prediction results show that our proposed method outperforms previously published systems in terms of sensitivity, specificity, positive predictive value, and accuracy. Conclusions: The proposed method significantly improves the performance of traditional epitope prediction. Matching followed by prediction is an efficient and effective approach compared to predicting directly on specific surfaces containing antigenic characteristics.
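The four evaluation measures named in the abstract (sensitivity, specificity, positive predictive value, accuracy) follow directly from per-residue confusion counts. The helper below is a minimal sketch with hypothetical names; returning 0.0 when a denominator is empty is a convention chosen here, not something stated in the source.

```python
from typing import Dict, Sequence


def epitope_metrics(true_labels: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Compute per-residue metrics from 0/1 truth and 0/1 predictions."""
    tp = sum(1 for t, p in zip(true_labels, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(true_labels, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(true_labels, predicted) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),   # fraction of true epitope residues found
        "specificity": ratio(tn, tn + fp),   # fraction of non-epitope residues rejected
        "ppv": ratio(tp, tp + fp),           # fraction of predicted residues that are real
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }
```

Because epitope residues are usually a small minority of the antigen surface, accuracy alone can look high for a predictor that finds few epitopes, which is why the abstract reports all four measures.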
9
Hou Q, Stringer B, Waury K, Capel H, Haydarlou R, Xue F, Abeln S, Heringa J, Feenstra KA. SeRenDIP-CE: Sequence-based Interface Prediction for Conformational Epitopes. Bioinformatics 2021; 37:3421-3427. [PMID: 33974039 PMCID: PMC8136078 DOI: 10.1093/bioinformatics/btab321] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/19/2020] [Revised: 03/26/2021] [Accepted: 04/26/2021] [Indexed: 11/21/2022] Open
Abstract
Motivation: Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen’s epitope region, a special type of protein–protein interaction (PPI) interface. The ubiquitous availability of sequence data allows us to predict epitopes from sequence in order to focus time-consuming wet-lab experiments on the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. Results: We collected and curated a high-quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching AUC 0.694. Further training on the combined heterodimer and epitope datasets improves our final predictor to AUC 0.703 on the epitope test set. This is better than the best state-of-the-art sequence-based epitope predictor, BepiPred-2.0. On one solved antibody–antigen structure of the COVID-19 virus spike receptor binding domain, our predictor reaches AUC 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and requires only a single antigen sequence as input, making the method immediately applicable in a wide range of biomedical and biomolecular research. Availability and implementation: Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. Supplementary information: Supplementary data are available at Bioinformatics online.
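The AUC-ROC values quoted above can be read as the probability that a randomly chosen epitope residue receives a higher score than a randomly chosen non-epitope residue. The sketch below computes it by direct pairwise comparison, counting ties as half; it is an O(n²) illustration with a hypothetical function name, not the evaluation code used for SeRenDIP-CE.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported values of roughly 0.67 to 0.78 in context.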
Affiliation(s)
- Qingzhen Hou
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China
  - National Institute of Health Data Science of China, Shandong University, Shandong 250002, P. R. China
- Bas Stringer
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
- Katharina Waury
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
- Henriette Capel
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
- Reza Haydarlou
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
- Fuzhong Xue
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong 250002, P. R. China
  - National Institute of Health Data Science of China, Shandong University, Shandong 250002, P. R. China
- Sanne Abeln
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
- Jaap Heringa
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
  - AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
- K Anton Feenstra
  - IBIVU - Center for Integrative Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam 1081HV, The Netherlands
  - AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam
10
Choi Y, Jeong S, Choi JM, Ndong C, Griswold KE, Bailey-Kellogg C, Kim HS. Computer-guided binding mode identification and affinity improvement of an LRR protein binder without structure determination. PLoS Comput Biol 2020; 16:e1008150. [PMID: 32866140 PMCID: PMC7485979 DOI: 10.1371/journal.pcbi.1008150] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/18/2020] [Revised: 09/11/2020] [Accepted: 07/14/2020] [Indexed: 12/24/2022] Open
Abstract
Precise binding mode identification and subsequent affinity improvement without structure determination remain a challenge in the development of therapeutic proteins. However, relevant experimental techniques are generally quite costly, and purely computational methods have been unreliable. Here, we show that integrated computational and experimental epitope localization followed by full-atom energy minimization can yield an accurate complex model structure which ultimately enables effective affinity improvement and redesign of binding specificity. As proof-of-concept, we used a leucine-rich repeat (LRR) protein binder, called a repebody (Rb), that specifically recognizes human IgG1 (hIgG1). We performed computationally-guided identification of the Rb:hIgG1 binding mode and leveraged the resulting model to reengineer the Rb so as to significantly increase its binding affinity for hIgG1 as well as redesign its specificity toward multiple IgGs from other species. Experimental structure determination verified that our Rb:hIgG1 model closely matched the co-crystal structure. Using a benchmark of other LRR protein complexes, we further demonstrated that the present approach may be broadly applicable to proteins undergoing relatively small conformational changes upon target binding.
Affiliation(s)
- Yoonjoo Choi
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Korea
- Sukyo Jeong
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Korea
- Jung-Min Choi
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Korea
- Christian Ndong
  - Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, United States of America
- Karl E. Griswold
  - Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, United States of America
  - Norris Cotton Cancer Center at Dartmouth, Lebanon, New Hampshire, United States of America
  - Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire, United States of America
- Chris Bailey-Kellogg
  - Department of Computer Science, Dartmouth College, Hanover, New Hampshire, United States of America
- Hak-Sung Kim
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon, Korea
11
Parvizpour S, Pourseif MM, Razmara J, Rafi MA, Omidi Y. Epitope-based vaccine design: a comprehensive overview of bioinformatics approaches. Drug Discov Today 2020; 25:1034-1042. [DOI: 10.1016/j.drudis.2020.03.006] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 10/13/2019] [Revised: 01/12/2020] [Accepted: 03/06/2020] [Indexed: 12/26/2022]
12
Solihah B, Azhari A, Musdholifah A. Enhancement of conformational B-cell epitope prediction using CluSMOTE. PeerJ Comput Sci 2020; 6:e275. [PMID: 33816926 PMCID: PMC7924438 DOI: 10.7717/peerj-cs.275] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/12/2019] [Accepted: 04/15/2020] [Indexed: 06/12/2023]
Abstract
BACKGROUND: A conformational B-cell epitope is one of the main components of vaccine design. It contains separate segments in its sequence, which are spatially close in the antigen chain. The availability of Ag-Ab complex data in the Protein Data Bank allows for the development of predictive methods. Several epitope prediction models have also been developed, including learning-based methods. However, the performance of these models is still not optimal. The main problem in learning-based prediction models is class imbalance. METHODS: This study proposes CluSMOTE, which combines a cluster-based undersampling method with the Synthetic Minority Oversampling Technique. The approach is used to generate additional sample data so that the conformational epitope dataset is balanced. The Hierarchical DBSCAN algorithm is used to identify clusters in the majority class. A random subset of the data is taken from each cluster, according to the oversampling degree, and combined with the minority class data. The balanced data are used as the training dataset to develop a conformational epitope predictor. Furthermore, two binary classification methods, Support Vector Machine and Decision Tree, are used separately to build prediction models and to evaluate the performance of CluSMOTE in predicting conformational B-cell epitopes. The experiment focuses on determining the best parameters for CluSMOTE. Two independent datasets are used to compare the proposed prediction model with state-of-the-art methods. The first and second datasets represent general protein antigens and glycoprotein antigens, respectively. RESULT: The experimental results show that the CluSMOTE Decision Tree outperformed the Support Vector Machine in terms of AUC and Gmean. The mean AUCs of the CluSMOTE Decision Tree on the Kringelum and SEPPA 3 test sets are 0.83 and 0.766, respectively. This shows that the CluSMOTE Decision Tree is better than other methods on general protein antigens, though comparable with SEPPA 3 on glycoprotein antigens.
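The oversampling half of this pipeline can be sketched as plain SMOTE-style interpolation: each synthetic point lies on the segment between a minority sample and one of its nearest minority neighbors. The cluster-based undersampling with Hierarchical DBSCAN is not shown. Function names, the default `k`, and the fixed seed are assumptions for illustration, and `g_mean` uses the usual definition as the geometric mean of sensitivity and specificity.

```python
import math
import random
from typing import List, Sequence, Tuple


def smote_samples(
    minority: Sequence[Sequence[float]], n_new: int, k: int = 2, seed: int = 0
) -> List[Tuple[float, ...]]:
    """Generate `n_new` synthetic minority points by neighbor interpolation.

    Requires at least two minority samples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [p for j, p in enumerate(minority) if j != base_idx]
        # Keep the k nearest minority neighbors of the chosen base point.
        nearest = sorted(others, key=lambda p: math.dist(base, p))[:k]
        partner = rng.choice(nearest)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, partner)))
    return synthetic


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)
```

G-mean drops to zero if either class is ignored entirely, which makes it a more informative measure than accuracy on imbalanced epitope data.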
Affiliation(s)
- Binti Solihah
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
  - Department of Informatics Engineering, Universitas Trisakti, Grogol, Jakarta Barat, Indonesia
- Azhari Azhari
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
- Aina Musdholifah
  - Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
13
Application of Meta Learning to B-Cell Conformational Epitope Prediction. Methods Mol Biol 2020. [PMID: 32162268 DOI: 10.1007/978-1-0716-0389-5_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
One of the major challenges in the field of vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting distinct search strategies. In this chapter, we propose different ensemble meta-learning approaches for epitope prediction based on stacked generalization, cascade generalization, and meta decision trees. Through meta learning, we expect a meta learner to be able to integrate multiple prediction models and outperform the single best-performing model. The objective of this chapter is twofold: (1) to promote the complementary predictive strengths of different prediction tools and (2) to introduce computational models that exploit the synergy among various prediction tools. Our primary goal is not to develop any particular classifier for B-cell epitope prediction, but to advocate the feasibility of meta learning for epitope prediction. With the flexibility of meta learning, the researcher can construct various meta classification hierarchies that are applicable to epitope prediction in different protein domains.
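A toy version of stacked generalization helps show why a meta learner can beat simple voting: it learns, from labelled residues, which pattern of base-tool outputs tends to mean "epitope". The sketch below uses a lookup table as the meta learner and falls back to majority vote for unseen patterns. This is an assumed simplification for illustration, not one of the stacking, cascade, or meta-decision-tree models from the chapter.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence, Tuple


def fit_meta_table(
    base_outputs: Sequence[Sequence[int]], labels: Sequence[int]
) -> Dict[Tuple[int, ...], int]:
    """Map each observed pattern of base predictions to its most common label."""
    counts = defaultdict(Counter)
    for outputs, label in zip(base_outputs, labels):
        counts[tuple(outputs)][label] += 1
    return {pattern: c.most_common(1)[0][0] for pattern, c in counts.items()}


def predict_meta(table: Dict[Tuple[int, ...], int], outputs: Sequence[int]) -> int:
    """Predict with the learned table; fall back to majority vote if unseen."""
    pattern = tuple(outputs)
    if pattern in table:
        return table[pattern]
    return int(sum(pattern) * 2 > len(pattern))
```

If the first tool alone is reliable on the training residues, the table learns to trust the pattern `(1, 0, 0)` even though a majority vote over the three tools would reject it.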
Collapse
|
14
|
Javadi Mamaghani A, Fathollahi A, Spotin A, Ranjbar MM, Barati M, Aghamolaie S, Karimi M, Taghipour N, Ashrafi M, Tabaei SJS. Candidate antigenic epitopes for vaccination and diagnosis strategies of Toxoplasma gondii infection: A review. Microb Pathog 2019; 137:103788. [DOI: 10.1016/j.micpath.2019.103788] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/15/2019] [Revised: 09/05/2019] [Accepted: 10/08/2019] [Indexed: 12/28/2022]
|
15
|
Beltrán Lissabet JF, Herrera Belén L, Farias JG. TTAgP 1.0: A computational tool for the specific prediction of tumor T cell antigens. Comput Biol Chem 2019; 83:107103. [DOI: 10.1016/j.compbiolchem.2019.107103] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/18/2019] [Revised: 06/20/2019] [Accepted: 08/10/2019] [Indexed: 01/27/2023]
|
16
|
Designing and Modeling of Multi-epitope Proteins for Diagnosis of Toxocara canis Infection. Int J Pept Res Ther 2019. [DOI: 10.1007/s10989-019-09940-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/16/2023]
|
17
|
Demolombe V, de Brevern AG, Felicori L, NGuyen C, Machado de Avila RA, Valera L, Jardin-Watelet B, Lavigne G, Lebreton A, Molina F, Moreau V. PEPOP 2.0: new approaches to mimic non-continuous epitopes. BMC Bioinformatics 2019; 20:387. [PMID: 31296178 PMCID: PMC6625012 DOI: 10.1186/s12859-019-2867-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/05/2018] [Accepted: 04/30/2019] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Bioinformatics methods are helpful to identify new molecules for diagnostic or therapeutic applications. For example, the use of peptides capable of mimicking binding sites has several benefits in replacing a protein which is difficult to produce, or toxic. Using peptides is less expensive. Peptides are easier to manipulate, and can be used as drugs. Continuous epitopes predicted by bioinformatics tools are commonly used and these sequential epitopes are used as is in further experiments. Numerous discontinuous epitope predictors have been developed but only two bioinformatics tools have been proposed so far to predict peptide sequences: Superficial and PEPOP 2.0. PEPOP 2.0 can generate series of peptide sequences that can replace continuous or discontinuous epitopes in their interaction with their cognate antibody. RESULTS We have developed an improved version of PEPOP (PEPOP 2.0) dedicated to answering experimentalists' need for a tool able to handle proteins and to turn them into peptides. The PEPOP 2.0 web site has been reorganized by peptide prediction category and is therefore better suited to experimental designs. Since the first version of PEPOP, 32 new methods of peptide design were developed. In total, PEPOP 2.0 proposes 35 methods, of which 34 deal specifically with discontinuous epitopes, the most represented epitope type in nature. CONCLUSION Through the presentation of its user-friendly, well-structured new web site conceived in close proximity to experimentalists, we report original methods that show how PEPOP 2.0 can assist biologists in dealing with discontinuous epitopes.
Affiliation(s)
- Vincent Demolombe
- BPMP, CNRS, INRA, Montpellier SupAgro, Univ Montpellier, Montpellier, France
| | - Alexandre G de Brevern
- INSERM UMR-S 1134, DSIMB, F-75739, Paris, France.,Univ Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR 1134, F-75739, Paris, France.,INTS, F-75739, Paris, France.,Laboratoire d'Excellence GR-Ex, F75737, Paris, France
| | - Liza Felicori
- Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Christophe NGuyen
- Sys2Diag UMR 9005 CNRS/ALCEDIAG, Complex System Modeling and Engineering for Diagnosis, Cap delta/Parc Euromédecine, 1682 rue de la Valsière CS 61003, 34184, Montpellier Cedex 4, France
| | - Ricardo Andrez Machado de Avila
- Programa de Pós-Graduação em Ciências da Saúde, Universidade do Extremo Sul Catarinense, Criciúma, Santa Catarina, 88806-000, Brazil
| | - Lionel Valera
- Bio-Rad Laboratories, 1682 Rue de la Valsière CS 61003, 34184, Montpellier CEDEX 04, France
| | | | | | - Aurélien Lebreton
- Service d'hématologie biologique, CHU Clermont-Ferrand, Clermont-Ferrand, France
| | - Franck Molina
- Sys2Diag UMR 9005 CNRS/ALCEDIAG, Complex System Modeling and Engineering for Diagnosis, Cap delta/Parc Euromédecine, 1682 rue de la Valsière CS 61003, 34184, Montpellier Cedex 4, France
| | - Violaine Moreau
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Univ Montpellier, 29, route de Navacelles, 34090, Montpellier, France.
| |
|
18
|
Xiong Y, Qiao Y, Kihara D, Zhang HY, Zhu X, Wei DQ. Survey of Machine Learning Techniques for Prediction of the Isoform Specificity of Cytochrome P450 Substrates. Curr Drug Metab 2019; 20:229-235. [PMID: 30338736 DOI: 10.2174/1389200219666181019094526] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/19/2018] [Revised: 08/05/2018] [Accepted: 08/06/2018] [Indexed: 12/23/2022]
Abstract
Background: Determination or prediction of the Absorption, Distribution, Metabolism, and Excretion (ADME) properties of drug candidates and drug-induced toxicity plays crucial roles in drug discovery and development. Metabolism is one of the most complicated pharmacokinetic properties to be understood and predicted. However, experimental determination of the substrate binding, selectivity, sites and rates of metabolism is time- and resource-consuming. In the phase I metabolism of foreign compounds (i.e., most drugs), cytochrome P450 enzymes play a key role. To help develop drugs with proper ADME properties, computational models are highly desired to predict the ADME properties of drug candidates, particularly for drugs binding to cytochrome P450. Objective: This narrative review aims to briefly summarize machine learning techniques used in the prediction of the cytochrome P450 isoform specificity of drug candidates. Results: Both single-label and multi-label classification methods have demonstrated good performance on modelling and prediction of the isoform specificity of substrates based on their quantitative descriptors. Conclusion: This review provides a guide for researchers to develop machine learning-based methods to predict the cytochrome P450 isoform specificity of drug candidates.
Affiliation(s)
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanhua Qiao
- School of Life Sciences, Anhui University, Hefei, Anhui 230601, China
| | - Daisuke Kihara
- Department of Biological Science, Purdue University, West Lafayette, IN 47907, United States
| | - Hui-Yuan Zhang
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xiaolei Zhu
- School of Life Sciences, Anhui University, Hefei, Anhui 230601, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
|
19
|
Sun P, Guo S, Sun J, Tan L, Lu C, Ma Z. Advances in In-silico B-cell Epitope Prediction. Curr Top Med Chem 2019; 19:105-115. [PMID: 30499399 DOI: 10.2174/1568026619666181130111827] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 05/24/2018] [Revised: 07/27/2018] [Accepted: 08/09/2018] [Indexed: 01/25/2023]
Abstract
Identification of B-cell epitopes in target antigens is one of the most crucial steps for epitope-based vaccine development, immunodiagnostic tests, antibody production, and disease diagnosis and therapy. Experimental methods for B-cell epitope mapping are time consuming, costly and labor intensive; in the meantime, various in-silico methods have been proposed to predict both linear and conformational B-cell epitopes. The accurate identification of B-cell epitopes presents major challenges for immunoinformaticians. In this paper, we have comprehensively reviewed in-silico methods for B-cell epitope identification. The aim of this review is to stimulate the development of better tools which could improve the identification of B-cell epitopes, and further the development of therapeutic antibodies and diagnostic tools.
Affiliation(s)
- Pingping Sun
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Sijia Guo
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Jiahang Sun
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Liming Tan
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Chang Lu
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China.,Key Laboratory of Intelligent Information Processing of Jilin University, Northeast Normal University, Changchun 130117, China.,Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
| |
|
20
|
Abstract
Background: B-cell epitope prediction is an essential tool for a variety of immunological studies. For identifying such epitopes, several computational predictors have been proposed in the past 10 years. Objective: In this review, we summarized the representative computational approaches developed for the identification of linear B-cell epitopes. Methods: We mainly discuss the datasets, feature extraction methods and classification methods used in the previous work. Results: The performance of the existing methods was not very satisfying, and so more effective approaches should be proposed by considering the structural information of proteins. Conclusion: We consider existing challenges and future perspectives for developing reliable methods for predicting linear B-cell epitopes.
Affiliation(s)
- Cangzhi Jia
- School of Science, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
| | - Hongyan Gong
- School of Science, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
| | - Yan Zhu
- School of Science, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
| | - Yixia Shi
- Department of Mathematics and Statistics, Lingnan Normal University, Zhanjiang, China
| |
|
21
|
Abstract
With the rise in novel infectious agents and disease pandemics, a new era of vaccine discovery is necessary. To address this, the new field of immunomics is described, which is synergistically powered by integrating bioinformatics methodologies with technological advances in biology and high-throughput instrumentation. By incorporating biological data from immunology and molecular biology with current genomics and proteomics, immunomics is geared to deliver an insight into immune function, optimal stimulation of immune responses and precise mapping and rational selection of immune targets that cover antigenic diversity. These efforts are expected to contribute towards the development of a new generation of vaccines, tailored to both the genetic make-up of the human population and of the pathogen. Vaccine technologies are also being explored for prevention or control of non-communicable diseases.
|
22
|
Zhang W, Yue X, Tang G, Wu W, Huang F, Zhang X. SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions. PLoS Comput Biol 2018; 14:e1006616. [PMID: 30533006 PMCID: PMC6331124 DOI: 10.1371/journal.pcbi.1006616] [Citation(s) in RCA: 93] [Impact Index Per Article: 15.5] [Received: 06/19/2018] [Revised: 01/14/2019] [Accepted: 11/02/2018] [Indexed: 01/12/2023] Open
Abstract
LncRNA-protein interactions play important roles in post-transcriptional gene regulation, poly-adenylation, splicing and translation. Identification of lncRNA-protein interactions helps to understand lncRNA-related activities. Existing computational methods utilize multiple lncRNA features or multiple protein features to predict lncRNA-protein interactions, but features are not available for all lncRNAs or proteins; most existing methods are not capable of predicting interacting proteins (or lncRNAs) for new lncRNAs (or proteins) that have no known interactions. In this paper, we propose the sequence-based feature projection ensemble learning method, “SFPEL-LPI”, to predict lncRNA-protein interactions. First, SFPEL-LPI extracts lncRNA sequence-based features and protein sequence-based features. Second, SFPEL-LPI calculates multiple lncRNA-lncRNA similarities and protein-protein similarities by using lncRNA sequences, protein sequences and known lncRNA-protein interactions. Then, SFPEL-LPI combines multiple similarities and multiple features with a feature projection ensemble learning framework. In computational experiments, SFPEL-LPI accurately predicts lncRNA-protein associations and outperforms other state-of-the-art methods. More importantly, SFPEL-LPI can be applied to new lncRNAs (or proteins). The case studies demonstrate that our method can discover novel lncRNA-protein interactions, which are confirmed by the literature. Finally, we construct a user-friendly web server, available at http://www.bioinfotech.cn/SFPEL-LPI/.
LncRNA-protein interactions play important roles in post-transcriptional gene regulation, poly-adenylation, splicing and translation. Identification of lncRNA-protein interactions helps to understand lncRNA-related activities. In this paper, we propose a novel computational method “SFPEL-LPI” to predict lncRNA-protein interactions. SFPEL-LPI makes use of lncRNA sequences, protein sequences and known lncRNA-protein associations to extract features and calculate similarities for lncRNAs and proteins, and then combines them with a feature projection ensemble learning framework. SFPEL-LPI can predict unobserved interactions between lncRNAs and proteins, and can also make predictions for new lncRNAs (or proteins), which have no interactions with any proteins (or lncRNAs). SFPEL-LPI achieves high accuracy on the benchmark dataset when evaluated by five-fold cross validation, and outperforms state-of-the-art methods. The case studies demonstrate that SFPEL-LPI can discover novel associations, which are confirmed by the literature. To facilitate lncRNA-protein interaction prediction, we develop a user-friendly web server, available at http://www.bioinfotech.cn/SFPEL-LPI/.
Affiliation(s)
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- School of Computer Science, Wuhan University, Wuhan, China
| | - Xiang Yue
- Department of Computer Science and Engineering, The Ohio State University, Columbus, United States of America
| | - Guifeng Tang
- School of Computer Science, Wuhan University, Wuhan, China
| | - Wenjian Wu
- Electronic Information School, Wuhan University, Wuhan, China
| | - Feng Huang
- School of Computer Science, Wuhan University, Wuhan, China
| | - Xining Zhang
- School of Computer Science, Wuhan University, Wuhan, China
| |
|
23
|
An Y, Wang J, Li C, Leier A, Marquez-Lago T, Wilksch J, Zhang Y, Webb GI, Song J, Lithgow T. Comprehensive assessment and performance improvement of effector protein predictors for bacterial secretion systems III, IV and VI. Brief Bioinform 2018; 19:148-161. [PMID: 27777222 DOI: 10.1093/bib/bbw100] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 07/30/2016] [Indexed: 11/15/2022] Open
Abstract
Bacterial effector proteins secreted by various protein secretion systems play crucial roles in host-pathogen interactions. In this context, computational tools capable of accurately predicting effector proteins of the various types of bacterial secretion systems are highly desirable. Existing computational approaches use different machine learning (ML) techniques and heterogeneous features derived from protein sequences and/or structural information. These predictors differ not only in the ML methods used but also in the curated data sets used, the feature selection and their prediction performance. Here, we provide a comprehensive survey and benchmarking of currently available tools for the prediction of effector proteins of bacterial types III, IV and VI secretion systems (T3SS, T4SS and T6SS, respectively). We review core algorithms, feature selection techniques, tool availability and applicability and evaluate the prediction performance based on carefully curated independent test data sets. In an effort to improve predictive performance, we constructed three ensemble models based on ML algorithms by integrating the output of all individual predictors reviewed. Our benchmarks demonstrate that these ensemble models outperform all the reviewed tools for the prediction of effector proteins of T3SS and T4SS. The webserver of the proposed ensemble methods for T3SS and T4SS effector protein prediction is freely available at http://tbooster.erc.monash.edu/index.jsp. We anticipate that this survey will serve as a useful guide for interested users and that the new ensemble predictors will stimulate research into host-pathogen relationships and inspire the development of new bioinformatics tools for predicting effector proteins of T3SS, T4SS and T6SS.
|
24
|
Computational B-cell epitope identification and production of neutralizing murine antibodies against Atroxlysin-I. Sci Rep 2018; 8:14904. [PMID: 30297733 PMCID: PMC6175905 DOI: 10.1038/s41598-018-33298-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/26/2018] [Accepted: 09/03/2018] [Indexed: 11/08/2022] Open
Abstract
Epitope identification is essential for developing effective antibodies that can detect and neutralize bioactive proteins. Computational prediction is a valuable and time-saving alternative for experimental identification. Current computational methods for epitope prediction are underused and undervalued due to their high false positive rate. In this work, we targeted common properties of linear B-cell epitopes identified in an individual protein class (metalloendopeptidases) and introduced an alternative method to reduce the false positive rate and increase accuracy, proposing to restrict predictive models to a single specific protein class. For this purpose, curated epitope sequences from metalloendopeptidases were transformed into frame-shifted Kmers (3 to 15 amino acid residues long). These Kmers were decomposed into a matrix of biochemical attributes and used to train a decision tree classifier. The resulting prediction model showed a lower false positive rate and greater area under the curve when compared to state-of-the-art methods. Our predictions were used for synthesizing peptides mimicking the predicted epitopes for immunization of mice. A predicted linear epitope that was previously undetected by an experimental immunoassay was able to induce neutralizing-antibody production in mice. Therefore, we present an improved prediction alternative and show that computationally identified epitopes can go undetected during experimental mapping.
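The K-mer transformation described in this abstract can be illustrated with a minimal Python sketch. It assumes that "frame-shifted" means sliding a window of each length from 3 to 15 residues along the epitope one position at a time; the abstract does not spell this out, and the function name and example sequence below are hypothetical.

```python
def frame_shifted_kmers(sequence, k_min=3, k_max=15):
    """Return (start, kmer) pairs for every contiguous k-mer of length k_min..k_max.

    Assumption: each frame shift moves the window by one residue.
    """
    kmers = []
    longest = min(k_max, len(sequence))
    for k in range(k_min, longest + 1):
        for start in range(len(sequence) - k + 1):
            kmers.append((start, sequence[start:start + k]))
    return kmers


# Hypothetical 9-residue epitope, used only for illustration.
epitope = "ACDEFGHIK"
kmers = frame_shifted_kmers(epitope)
print(len(kmers))   # 28 k-mers: 7 of length 3, 6 of length 4, ..., 1 of length 9
print(kmers[:3])    # [(0, 'ACD'), (1, 'CDE'), (2, 'DEF')]
```

In the study, each K-mer would then be decomposed into biochemical attributes before training the decision tree classifier; that step depends on the authors' attribute tables and is not sketched here.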
|
25
|
Wang A, Li N, Zhou J, Chen Y, Jiang M, Qi Y, Liu H, Liu Y, Liu D, Zhao J, Wang Y, Zhang G. Mapping the B cell epitopes within the major capsid protein L1 of human papillomavirus type 16. Int J Biol Macromol 2018; 118:1354-1361. [DOI: 10.1016/j.ijbiomac.2018.06.094] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/18/2018] [Revised: 04/13/2018] [Accepted: 06/20/2018] [Indexed: 10/28/2022]
|
26
|
Zhang W, Yue X, Huang F, Liu R, Chen Y, Ruan C. Predicting drug-disease associations and their therapeutic function based on the drug-disease association bipartite network. Methods 2018; 145:51-59. [DOI: 10.1016/j.ymeth.2018.06.001] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 01/31/2018] [Revised: 05/15/2018] [Accepted: 06/01/2018] [Indexed: 02/01/2023] Open
|
27
|
Manavalan B, Govindaraj RG, Shin TH, Kim MO, Lee G. iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction. Front Immunol 2018; 9:1695. [PMID: 30100904 PMCID: PMC6072840 DOI: 10.3389/fimmu.2018.01695] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 04/23/2018] [Accepted: 07/10/2018] [Indexed: 11/13/2022] Open
Abstract
Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which, respectively, use a combination of physicochemical properties (PCP) and amino acid composition and a combination of dipeptide and PCP as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first identifies peptide sequences as BCEs or non-BCEs, while the second is aimed at providing users with the option of mining potential BCEs from protein sequences.
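For readers unfamiliar with the Matthews correlation coefficient reported in this abstract, the following is a minimal Python sketch of the standard binary-classification formula computed from confusion-matrix counts. The counts in the example are hypothetical and are not taken from the iBCE-EL study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for an epitope / non-epitope classifier.
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
print(round(mcc, 3))  # 0.704
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why it is often preferred over accuracy on data sets where the two classes differ in size.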
Affiliation(s)
| | - Rajiv Gandhi Govindaraj
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
| | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea.,Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
| | - Myeong Ok Kim
- Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea.,Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
| |
|
28
|
Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm. Int J Mol Sci 2018; 19:ijms19020467. [PMID: 29401735 PMCID: PMC5855689 DOI: 10.3390/ijms19020467] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 12/08/2017] [Revised: 01/22/2018] [Accepted: 01/30/2018] [Indexed: 01/10/2023] Open
Abstract
Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs on the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted a systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathways in which the target protein of a drug was involved, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.
|
29
|
Abstract
The increasing number of protein structures with uncharacterized function necessitates the development of in silico prediction methods for functional annotation of proteins. In this chapter, different kinds of computational approaches for predicting DNA-binding residues on the surface of DNA-binding proteins are briefly introduced, and the merits and limitations of these methods are discussed. This chapter focuses on the structure-based approaches and mainly discusses the framework of machine learning methods as applied to the DNA-binding prediction task.
|
30
|
Zhang W, Zhu X, Fu Y, Tsuji J, Weng Z. Predicting human splicing branchpoints by combining sequence-derived features and multi-label learning methods. BMC Bioinformatics 2017; 18:464. [PMID: 29219070 PMCID: PMC5773893 DOI: 10.1186/s12859-017-1875-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022] Open
Abstract
Background Alternative splicing is a critical process in the coding of a single gene, which removes introns and joins exons, and splicing branchpoints are indicators of alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order to guide wet experiments, we develop computational methods to predict human splicing branchpoints. Results Considering the fact that an intron may have multiple branchpoints, we formulate branchpoint prediction as a multi-label learning problem, and attempt to predict branchpoint sites from intron sequences. First, we investigate a variety of intron sequence-derived features, such as sparse profile, dinucleotide profile, position weight matrix profile, Markov motif profile and polypyrimidine tract profile. Second, we consider several multi-label learning methods: partial least squares regression, canonical correlation analysis and regularized canonical correlation analysis, and use them as the basic classification engines. Third, we propose two ensemble learning schemes which integrate different features and different classifiers to build ensemble learning systems for branchpoint prediction. One is the genetic algorithm-based weighted average ensemble method; the other is the logistic regression-based ensemble method. Conclusions In the computational experiments, the two ensemble learning methods outperform benchmark branchpoint prediction methods, and can produce high-accuracy results on the benchmark dataset.
Affiliation(s)
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, 430072, China.
| | - Xiaopeng Zhu
- School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213, USA
| | - Yu Fu
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| | - Junko Tsuji
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| |
|
31
|
Quantitative prediction of drug side effects based on drug-related features. Interdiscip Sci 2017; 9:434-444. [DOI: 10.1007/s12539-017-0236-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 10/26/2016] [Revised: 04/29/2017] [Accepted: 05/03/2017] [Indexed: 01/07/2023]
|
32
|
Ren J, Song J, Ellis J, Li J. Staged heterogeneity learning to identify conformational B-cell epitopes from antigen sequences. BMC Genomics 2017; 18:113. [PMID: 28361709 PMCID: PMC5374683 DOI: 10.1186/s12864-017-3493-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022] Open
Abstract
Background The broad heterogeneity of antigen-antibody interactions brings tremendous challenges to the design of a widely applicable learning algorithm to identify conformational B-cell epitopes. Besides the intrinsic heterogeneity introduced by diverse species, extra heterogeneity can also be introduced by various data sources, adding another layer of complexity and further confounding the research. Results This work proposed a staged heterogeneity learning method, which learns both characteristics and heterogeneity of data in a phased manner. The method was applied to identify antigenic residues of heterogenous conformational B-cell epitopes based on antigen sequences. In the first stage, the model learns the general epitope patterns of each kind of propensity from a large data set containing computationally defined epitopes. In the second stage, the model learns the heterogenous complementarity of these propensities from a relatively small guided data set containing experimentally determined epitopes. Moreover, we designed an algorithm to cluster the predicted individual antigenic residues into conformational B-cell epitopes so as to provide strong potential for real-world applications, such as vaccine development. With heterogeneity well learnt, the transferability of the prediction model was remarkably improved to handle new data with a high level of heterogeneity. The model has been tested on two data sets with experimentally determined epitopes, and on a data set with computationally defined epitopes. This proposed sequence-based method achieved outstanding performance - about twice that of existing methods, including the sequence-based predictor CBTOPE and three other structure-based predictors. Conclusions The proposed method uses only antigen sequence information, and thus has much broader applications.
Affiliation(s)
- Jing Ren
- Advanced Analytics Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia; College of Computer, National University of Defense Technology, Changsha, 410073, China
| | - Jiangning Song
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia; Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - John Ellis
- School of Life Sciences, University of Technology Sydney, Ultimo, NSW 2007, Australia
| | - Jinyan Li
- Advanced Analytics Institute and Centre for Health Technologies, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia.
|
33
|
Dalkas GA, Rooman M. SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence. BMC Bioinformatics 2017; 18:95. [PMID: 28183272 PMCID: PMC5301386 DOI: 10.1186/s12859-017-1528-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 10/20/2016] [Accepted: 02/06/2017] [Indexed: 11/12/2022]
Abstract
BACKGROUND The identification of immunogenic regions on the surface of antigens, which are able to be recognized by antibodies and to trigger an immune response, is a major challenge for the design of new and effective vaccines. The prediction of such regions through computational immunology techniques is a challenging goal, which will ultimately lead to a drastic limitation of the experimental tests required to validate their efficiency. However, current methods are far from being sufficiently reliable and/or applicable on a large scale. RESULTS We developed SEPIa, a B-cell epitope predictor from the protein sequence, which is sufficiently fast to be applicable on a large scale. The originality of SEPIa lies in the combination of two classifiers, a naïve Bayesian and a random forest classifier, through a voting algorithm that exploits the advantages of both. It is based on 13 sequence-based features, whose values in a 9-residue sequence window are compiled to predict the epitope/non-epitope state of the central residue. The features are related to the type of amino acid, its conservation in homologous proteins, and its tendency of being exposed to the solvent, soluble, flexible, and disordered. The highest signal is obtained from statistical amino acid preferences, but all 13 features contribute non-negligibly in the predictor. SEPIa's average prediction accuracy is limited, with an AUC score (area under the receiver operating characteristic curve) that reaches 0.65 both in 10-fold cross-validation and on an independent test set. It is nevertheless slightly higher than that of other methods evaluated on the same test set. CONCLUSIONS SEPIa was applied to a test protein whose epitopes are known, human β2 adrenergic G-protein-coupled receptor, with promising results. Although the actual AUC score is rather low, many of the predicted epitopes cluster together and overlap the experimental epitope region. 
The reasons underlying the limitations of SEPIa and of all other B-cell epitope predictors are discussed.
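The windowing step described in this abstract can be sketched as follows. This is a minimal illustration, not SEPIa's code: the function name `residue_windows`, the padding character `X`, and the decision to pad the sequence ends are assumptions. The abstract states only that feature values are compiled over a 9-residue window to predict the state of the central residue.

```python
def residue_windows(sequence, width=9, pad="X"):
    """Return one fixed-width window per residue, centred on that residue.

    Sequence ends are padded with `pad` so that every residue, including
    the first and last, sits at the centre of its own window (an assumption;
    the abstract does not say how the termini are handled).
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a central residue")
    half = width // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + width] for i in range(len(sequence))]


windows = residue_windows("ACDEFGHIKL")
# Each window would then be turned into per-position feature values
# (amino acid type, conservation, predicted exposure, and so on) and passed
# to the classifiers; that part is not reproduced here.
```

Padding keeps the number of windows equal to the number of residues, which makes it straightforward to map predictions back onto the sequence.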
Affiliation(s)
- Georgios A. Dalkas
- BioModeling, BioInformatics & BioProcesses (3BIO), Université Libre de Bruxelles (ULB), CP 165/61, 50 Roosevelt Ave, 1050 Brussels, Belgium
- Present address: Institute of Mechanical, Process & Energy Engineering, Heriot-Watt University, Edinburgh, EH14 4AS UK
| | - Marianne Rooman
- BioModeling, BioInformatics & BioProcesses (3BIO), Université Libre de Bruxelles (ULB), CP 165/61, 50 Roosevelt Ave, 1050 Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, CP 263, Triumph Bld, 1050 Brussels, Belgium
|
34
|
Li D, Luo L, Zhang W, Liu F, Luo F. A genetic algorithm-based weighted ensemble method for predicting transposon-derived piRNAs. BMC Bioinformatics 2016; 17:329. [PMID: 27578422 PMCID: PMC5006569 DOI: 10.1186/s12859-016-1206-3] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 03/18/2016] [Accepted: 08/24/2016] [Indexed: 02/05/2023]
Abstract
BACKGROUND Predicting piwi-interacting RNA (piRNA) is an important topic in the study of small non-coding RNAs, and provides clues for understanding the mechanism of gamete generation. Several machine learning approaches have been proposed for piRNA prediction, but there is still room for improvement. RESULTS In this paper, we develop a genetic algorithm-based weighted ensemble method for predicting transposon-derived piRNAs. We construct datasets for three species: Human, Mouse and Drosophila. For each species, we compile a balanced dataset and an imbalanced dataset, and thus obtain six datasets to build and evaluate prediction models. In the computational experiments, the genetic algorithm-based weighted ensemble method achieves 10-fold cross-validation AUC of 0.932, 0.937 and 0.995 on the balanced Human, Mouse and Drosophila datasets, respectively, and AUC of 0.935, 0.939 and 0.996 on the imbalanced datasets of the three species. Further, we use the prediction models trained on the Mouse dataset to identify piRNAs of other species, and the models demonstrate good performance in cross-species prediction. CONCLUSIONS Compared with other state-of-the-art methods, our method achieves better performance. In conclusion, the proposed method is promising for transposon-derived piRNA prediction. The source code and datasets are available at https://github.com/zw9977129/piRNAPredictor .
Affiliation(s)
- Dingfang Li
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072 China
| | - Longqiang Luo
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072 China
| | - Wen Zhang
- State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072 China
- School of Computer, Wuhan University, Wuhan, 430072 China
| | - Feng Liu
- International School of Software, Wuhan University, Wuhan, 430072 China
| | - Fei Luo
- State Key Lab of Software Engineering, Wuhan University, Wuhan, 430072 China
- School of Computer, Wuhan University, Wuhan, 430072 China
|
35
|
Patro R, Norel R, Prill RJ, Saez-Rodriguez J, Lorenz P, Steinbeck F, Ziems B, Luštrek M, Barbarini N, Tiengo A, Bellazzi R, Thiesen HJ, Stolovitzky G, Kingsford C. A computational method for designing diverse linear epitopes including citrullinated peptides with desired binding affinities to intravenous immunoglobulin. BMC Bioinformatics 2016; 17:155. [PMID: 27059896 PMCID: PMC4826543 DOI: 10.1186/s12859-016-1008-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/23/2015] [Accepted: 03/31/2016] [Indexed: 01/20/2023]
Abstract
BACKGROUND Understanding the interactions between antibodies and the linear epitopes that they recognize is an important task in the study of immunological diseases. We present a novel computational method for the design of linear epitopes of specified binding affinity to Intravenous Immunoglobulin (IVIg). RESULTS We show that the method, called Pythia-design, can accurately design peptides with both high binding affinity and low binding affinity to IVIg. To show this, we experimentally constructed and tested the computationally constructed designs. We further show experimentally that these designed peptides are more accurate than those produced by a recent method for the same task. Pythia-design is based on combining random walks with an ensemble of probabilistic support vector machine (SVM) classifiers, and we show that it produces a diverse set of designed peptides, an important property for developing robust sets of candidates for construction. We show that by combining Pythia-design and the method of (PloS ONE 6(8):23616, 2011), we are able to produce an even more accurate collection of designed peptides. Analysis of the experimental validation of Pythia-design peptides indicates that binding of IVIg is favored by epitopes that contain tryptophan and cysteine. CONCLUSIONS Our method, Pythia-design, is able to generate a diverse set of binding and non-binding peptides, and its designs have been experimentally shown to be accurate.
Affiliation(s)
- Rob Patro
- Department of Computer Science, Stony Brook, NY, USA
| | - Raquel Norel
- IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
| | - Robert J. Prill
- IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
| | - Julio Saez-Rodriguez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, UK
| | - Peter Lorenz
- Institute of Immunology, University of Rostock, Rostock, Germany
| | - Felix Steinbeck
- Institute of Immunology, University of Rostock, Rostock, Germany
- Gesellschaft für Individualisierte Medizin (IndyMed) mbH, Rostock, Germany
| | - Bjoern Ziems
- Gesellschaft für Individualisierte Medizin (IndyMed) mbH, Rostock, Germany
| | - Mitja Luštrek
- Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
| | - Nicola Barbarini
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Alessandra Tiengo
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Riccardo Bellazzi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Hans-Jürgen Thiesen
- Institute of Immunology, University of Rostock, Rostock, Germany
- Gesellschaft für Individualisierte Medizin (IndyMed) mbH, Rostock, Germany
| | | | - Carl Kingsford
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
36
|
Chen X, Liu X, Ren X, Li X, Wang L, Zang W. Discovery of human posterior head 20 (hPH20) and homo sapiens sperm acrosome associated 1 (hSPACA1) immunocontraceptive epitopes and their effects on fertility in male and female mice. Reprod Fertil Dev 2016; 28:416-27. [DOI: 10.1071/rd14134] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/13/2014] [Accepted: 06/13/2014] [Indexed: 12/30/2022]
Abstract
The key goals of immunocontraception research are to obtain full contraceptive effects using vaccines administered to both males and females. Current research concerning human anti-sperm contraceptive vaccines is focused on delineating infertility-related epitopes to avoid autoimmune disease. We constructed phage-display peptide libraries to select epitope peptides derived from human posterior head 20 (hPH20) and homo sapiens sperm acrosome associated 1 (hSPACA1) using sera collected from infertile women harbouring anti-sperm antibodies. Following five rounds of selection, positive colonies were reconfirmed for reactivity with the immunoinfertile sera. We biopanned and analysed the chemical properties of four epitope peptides, named P82, Sa6, Sa37 and Sa76. Synthetic peptides were made and coupled to either bovine serum albumin (BSA) or ovalbumin. We used the BSA-conjugated peptides to immunise BALB/c mice and examined the effects on fertility in female and male mice. The synthetic peptides generated a sperm-specific antibody response in female and male mice that caused a contraceptive state. The immunocontraceptive effect was reversible and, with the disappearance of peptide-specific antibodies, there was complete restoration of fertility. Vaccinations using P82, Sa6 and Sa76 peptides resulted in no apparent side effects. Thus, it is efficient and practical to identify epitope peptide candidates by phage display. These peptides may find clinical application in the specific diagnosis and treatment of male and female infertility and contraceptive vaccine development.
|
37
|
Eberhardt M, Lai X, Tomar N, Gupta S, Schmeck B, Steinkasserer A, Schuler G, Vera J. Third-Kind Encounters in Biomedicine: Immunology Meets Mathematics and Informatics to Become Quantitative and Predictive. Methods Mol Biol 2016; 1386:135-179. [PMID: 26677184 DOI: 10.1007/978-1-4939-3283-2_9] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 06/05/2023]
Abstract
The understanding of the immune response is right now at the center of biomedical research. There are growing expectations that immune-based interventions will in the midterm provide new, personalized, and targeted therapeutic options for many severe and highly prevalent diseases, from aggressive cancers to infectious and autoimmune diseases. To this end, immunology should surpass its current descriptive and phenomenological nature, and become quantitative, and thereby predictive. Immunology is an ideal field for deploying the tools, methodologies, and philosophy of systems biology, an approach that combines quantitative experimental data, computational biology, and mathematical modeling. This is because, from an organism-wide perspective, immunity is a biological system of systems, a paradigmatic instance of a multi-scale system. At the molecular scale, the critical phenotypic responses of immune cells are governed by large biochemical networks, enriched in nested regulatory motifs such as feedback and feedforward loops. This network complexity confers on them the capacity for highly nonlinear behavior, including remarkable examples of homeostasis, ultra-sensitivity, hysteresis, and bistability. Moving up from the cellular level, different immune cell populations communicate with each other by direct physical contact or by receiving and secreting signaling molecules such as cytokines. Moreover, the interaction of the immune system with its potential targets (e.g., pathogens or tumor cells) is far from simple, as it involves a number of attack and counterattack mechanisms that ultimately constitute a tightly regulated multi-feedback loop system.
From a more practical perspective, this means that today's immunologists face an ever-increasing challenge of integrating massive quantities of data from multiple platforms. In this chapter, we support the idea that the analysis of the immune system demands the use of systems-level approaches to ensure success in the search for more effective and personalized immune-based therapies.
Affiliation(s)
- Martin Eberhardt
- Laboratory of Systems Tumor Immunology, Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
| | - Xin Lai
- Laboratory of Systems Tumor Immunology, Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
| | - Namrata Tomar
- Laboratory of Systems Tumor Immunology, Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
- Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
| | - Shailendra Gupta
- Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
| | - Bernd Schmeck
- Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Marburg, Philipps University, Marburg, Germany
- Systems Biology Platform, Institute for Lung Research/iLung, German Center for Lung Research, Universities of Giessen and Marburg Lung Centre, Philipps University Marburg, Marburg, Germany
| | - Alexander Steinkasserer
- Department of Immune Modulation at the Department of Dermatology, University Hospital Erlangen, Erlangen, Germany
| | - Gerold Schuler
- Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
| | - Julio Vera
- Laboratory of Systems Tumor Immunology, Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany.
- Department of Dermatology, University Hospital Erlangen and Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany.
|
38
|
|
39
|
Ren J, Liu Q, Ellis J, Li J. Positive-unlabeled learning for the prediction of conformational B-cell epitopes. BMC Bioinformatics 2015; 16 Suppl 18:S12. [PMID: 26681157 PMCID: PMC4682424 DOI: 10.1186/1471-2105-16-s18-s12] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 02/08/2023]
Abstract
Background The incomplete ground truth of training data of B-cell epitopes is a demanding issue in computational epitope prediction. The challenge is that only a small fraction of the surface residues of an antigen are confirmed as antigenic residues (positive training data); the remaining residues are unlabeled. As some of these uncertain residues can possibly be grouped to form novel but currently unknown epitopes, it is misguided to uniformly classify all the unlabeled residues as negative training data following the traditional supervised learning scheme. Results We propose a positive-unlabeled learning algorithm to address this problem. The key idea is to distinguish between epitope-likely residues and reliable negative residues in unlabeled data. The method has two steps: (1) identify reliable negative residues using a weighted SVM with a high recall; and (2) construct a classification model on the positive residues and the reliable negative residues. Complex-based 10-fold cross-validation was conducted to show that this method outperforms the commonly used predictors DiscoTope 2.0, ElliPro and SEPPA 2.0 in every aspect. We conducted four case studies, in which the approach was tested on antigens of West Nile virus, dihydrofolate reductase, beta-lactamase, and two Ebola antigens whose epitopes are currently unknown. All the results were assessed on a newly established data set of antigen structures not bound by antibodies, instead of on antibody-bound antigen structures. These bound structures may contain unfair binding information such as bound-state B-factors and protrusion index, which could exaggerate the epitope prediction performance. Source code is available on request.
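The two-step scheme in this abstract can be sketched in a few lines. The sketch below is only an illustration of the control flow: it swaps the weighted SVM for a nearest-centroid scorer so that it runs with the standard library alone, and the names `two_step_pu` and `keep_fraction`, as well as the toy feature vectors, are hypothetical and not taken from the paper.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score(x, pos_c, neg_c):
    """Higher means closer to the positive centroid than to the negative one."""
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)


def two_step_pu(positives, unlabeled, keep_fraction=0.5):
    # Step 1: provisionally treat every unlabeled residue as negative,
    # score the unlabeled residues, and keep only the least positive-like
    # ones as "reliable negatives". (Stand-in for the weighted SVM.)
    pos_c = centroid(positives)
    unl_c = centroid(unlabeled)
    ranked = sorted(unlabeled, key=lambda x: score(x, pos_c, unl_c))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    reliable_neg = ranked[:n_keep]

    # Step 2: build the final classifier from positives and reliable negatives only.
    neg_c = centroid(reliable_neg)

    def classify(x):
        return score(x, pos_c, neg_c) > 0

    return classify, reliable_neg


# Toy 2-D feature vectors; the third unlabeled point resembles the positives.
positives = [[1.0, 1.0], [0.9, 1.1]]
unlabeled = [[0.0, 0.0], [0.1, -0.1], [1.0, 0.9], [0.05, 0.1]]
classify, reliable_neg = two_step_pu(positives, unlabeled)
```

In this toy run the unlabeled point near the positives is left out of the reliable negatives, so it does not pull the final decision boundary toward the positive class. That is the motivation the abstract gives for not labeling every unlabeled residue as negative.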
|
40
|
Zheng W, Ruan J, Hu G, Wang K, Hanlon M, Gao J. Analysis of Conformational B-Cell Epitopes in the Antibody-Antigen Complex Using the Depth Function and the Convex Hull. PLoS One 2015; 10:e0134835. [PMID: 26244562 PMCID: PMC4526569 DOI: 10.1371/journal.pone.0134835] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/27/2015] [Accepted: 07/14/2015] [Indexed: 01/05/2023]
Abstract
The prediction of conformational B-cell epitopes plays an important role in immunoinformatics. Several computational methods have been proposed on the basis of discrimination determined by the solvent-accessible surface between epitopes and non-epitopes, but the performance of existing methods is far from satisfactory. In this paper, depth functions and the k-th surface convex hull are used to analyze epitopes and exposed non-epitopes. On each layer of the protein, we compute relative solvent accessibility and four different types of depth functions, i.e., Chakravarty depth, DPX, half-sphere exposure and half space depth, to analyze the location of epitopes on different layers of the proteins. We found that conformational B-cell epitopes are rich in charged residues Asp, Glu, Lys, Arg, His; aliphatic residues Gly, Pro; non-charged residues Asn, Gln; and aromatic residue Tyr. Conformational B-cell epitopes are rich in coils. Conservation of epitopes is not significantly lower than that of exposed non-epitopes. The average depths (obtained by four methods) for epitopes are significantly lower than those of non-epitopes on the surface using the Wilcoxon rank sum test. Epitopes are more likely to be located in the outer layer of the convex hull of a protein. On the benchmark dataset, the cumulative 10th convex hull covers 84.6% of exposed residues on the protein surface area, and nearly 95% of epitope sites. These findings may be helpful in building a predictor for epitopes.
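The idea of successive convex hull layers can be illustrated by peeling hulls off a point set one at a time. The sketch below is a simplified 2-D analogue under stated assumptions: the paper works with 3-D protein structures, and the function names `convex_hull` and `hull_layers` are hypothetical. It uses Andrew's monotone chain algorithm for each hull.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order.

    Points lying on a hull edge between two vertices are not returned.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_layers(points):
    """Peel convex hulls repeatedly: layer 1 is the outer hull, layer 2 the
    hull of what remains, and so on until no points are left."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


# Four corners of a square plus one interior point.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
layers = hull_layers(points)
```

With layers indexed this way, the "cumulative k-th hull" in the abstract would correspond to the union of the first k layers. That reading is an interpretation of the abstract, not a statement of the authors' exact definition.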
Affiliation(s)
- Wei Zheng
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Jishou Ruan
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, People’s Republic of China
| | - Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
| | - Michelle Hanlon
- Department of Physical Sciences, Grant MacEwan University, Alberta, Canada
| | - Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People’s Republic of China
|
41
|
Jafarpour S, Ayat H, Ahadi AM. Design and Antigenic Epitopes Prediction of a New Trial Recombinant Multiepitopic Rotaviral Vaccine: In Silico Analyses. Viral Immunol 2015; 28:325-30. [PMID: 25965449 PMCID: PMC4507124 DOI: 10.1089/vim.2014.0152] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
Rotavirus is the major etiologic factor of severe diarrheal disease. Natural infection provides protection against subsequent rotavirus infection and diarrhea. This research presents a new vaccine designed based on computational models. In this study, three types of epitopes (linear, conformational, and combinational) are considered in a proposed model protein. Several studies on rotavirus vaccines have shown that VP6 and VP4 proteins are good candidates for vaccine production. In the present study, a fusion protein was designed as a new generation of rotavirus vaccines by bioinformatics analyses. This model-based study using the ABCpred, BCPREDS, Bcepred, and Ellipro web servers showed that the peptide presented in this article has the necessary properties to act as a vaccine. Prediction of linear B-cell epitopes of peptides is helpful to investigate whether these peptides are able to activate humoral immunity.
Affiliation(s)
- Sima Jafarpour
- Department of Genetics, Faculty of Science, Shahrekord University, Shahrekord, Iran
| | - Hoda Ayat
- Department of Genetics, Faculty of Science, Shahrekord University, Shahrekord, Iran
| | - Ali Mohammad Ahadi
- Department of Genetics, Faculty of Science, Shahrekord University, Shahrekord, Iran
|
42
|
Van Regenmortel MHV. Specificity, polyspecificity, and heterospecificity of antibody-antigen recognition. J Mol Recognit 2015; 27:627-39. [PMID: 25277087 DOI: 10.1002/jmr.2394] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 04/24/2014] [Revised: 05/14/2014] [Accepted: 05/15/2014] [Indexed: 11/09/2022]
Abstract
The concept of antibody specificity is analyzed and shown to reside in the ability of an antibody to discriminate between two antigens. Initially, antibody specificity was attributed to sequence differences in complementarity determining regions (CDRs), but as increasing numbers of crystallographic antibody-antigen complexes were elucidated, specificity was analyzed in terms of six antigen-binding regions (ABRs) that only roughly correspond to CDRs. It was found that each ABR differs significantly in its amino acid composition and tends to bind different types of amino acids at the surface of proteins. In spite of these differences, the combined preference of the six ABRs does not allow epitopes to be distinguished from the rest of the protein surface. These findings explain the poor success of past and newly proposed methods for predicting protein epitopes. Antibody polyspecificity refers to the ability of one antibody to bind a large variety of epitopes in different antigens, and this property explains how the immune system develops an antibody repertoire that is able to recognize every antigen the system is likely to encounter. Antibody heterospecificity arises when an antibody reacts better with another antigen than with the one used to raise the antibody. As a result, an antibody may sometimes appear to have been elicited by an antigen with which it is unable to react. The implications of antibody polyspecificity and heterospecificity in vaccine development are pointed out.
Affiliation(s)
- Marc H V Van Regenmortel
- Wallenberg Research Center, Stellenbosch Institute for Advanced Study, Stellenbosch University, Stellenbosch, South Africa
|
43
|
Zhang W, Niu Y, Zou H, Luo L, Liu Q, Wu W. Accurate prediction of immunogenic T-cell epitopes from epitope sequences using the genetic algorithm-based ensemble learning. PLoS One 2015; 10:e0128194. [PMID: 26020952 PMCID: PMC4447411 DOI: 10.1371/journal.pone.0128194] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 12/30/2014] [Accepted: 04/24/2015] [Indexed: 11/19/2022]
Abstract
Background T-cell epitopes play an important role in the T-cell immune response, and they are critical components in epitope-based vaccine design. Immunogenicity is the ability to trigger an immune response. The accurate prediction of immunogenic T-cell epitopes is significant for designing useful vaccines and understanding the immune system. Methods In this paper, we attempt to differentiate immunogenic epitopes from non-immunogenic epitopes based on their primary structures. First, we explore a variety of sequence-derived features and analyze their relationship with epitope immunogenicity. To effectively utilize various features, a genetic algorithm (GA)-based ensemble method is proposed to determine the optimal feature subset and develop a high-accuracy ensemble model. In the GA optimization, a chromosome represents a feature subset in the search space. For each feature subset, the selected features are utilized to construct the base predictors, and an ensemble model is developed by taking the average of outputs from the base predictors. The objective of the GA is to search for the optimal feature subset, which leads to the ensemble model with the best cross-validation AUC (area under ROC curve) on the training set. Results Two datasets named ‘IMMA2’ and ‘PAAQD’ are adopted as the benchmark datasets. Compared with the state-of-the-art methods POPI, POPISK, PAAQD and our previous method, the GA-based ensemble method produces much better performance, achieving an AUC score of 0.846 on the IMMA2 dataset and an AUC score of 0.829 on the PAAQD dataset. The statistical analysis demonstrates that the performance improvements of the GA-based ensemble method are statistically significant. Conclusions The proposed method is a promising tool for predicting immunogenic epitopes. The source code and datasets are available in S1 File.
Affiliation(s)
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, 430072, China
- Research Institute of Shenzhen, Wuhan University, Shenzhen, 518057, China
| | - Yanqing Niu
- School of Mathematics and Statistics, South-central University for Nationalities, Wuhan, 430074, China
| | - Hua Zou
- School of Computer, Wuhan University, Wuhan, 430072, China
| | - Longqiang Luo
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
| | - Qianchao Liu
- School of Computer, Wuhan University, Wuhan, 430072, China
| | - Weijian Wu
- School of Computer, Wuhan University, Wuhan, 430072, China
|
44
|
Hu YJ, Lin SC, Lin YL, Lin KH, You SN. A meta-learning approach for B-cell conformational epitope prediction. BMC Bioinformatics 2014; 15:378. [PMID: 25403375 PMCID: PMC4237749 DOI: 10.1186/s12859-014-0378-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/11/2014] [Accepted: 11/05/2014] [Indexed: 12/11/2022]
Abstract
Background One of the major challenges in the field of vaccine design is identifying B-cell epitopes in continuously evolving viruses. Various tools have been developed to predict linear or conformational epitopes, each relying on different physicochemical properties and adopting distinct search strategies. We propose a meta-learning approach for epitope prediction based on stacked and cascade generalizations. Through meta learning, we expect a meta learner to be able to integrate multiple prediction models and outperform the single best-performing model. The objective of this study is twofold: (1) to analyze the complementary predictive strengths in different prediction tools, and (2) to introduce a generic computational model to exploit the synergy among various prediction tools. Our primary goal is not to develop any particular classifier for B-cell epitope prediction, but to advocate the feasibility of applying meta learning to epitope prediction. With the flexibility of meta learning, the researcher can construct various meta classification hierarchies that are applicable to epitope prediction in different protein domains. Results We developed hierarchical meta-learning architectures based on stacked and cascade generalizations. The bottom level of the hierarchy consisted of four conformational and four linear epitope prediction tools that served as the base learners. To perform consistent and unbiased comparisons, we tested the meta-learning method on an independent set of antigen proteins that were not used previously to train the base epitope prediction tools. In addition, we conducted correlation and ablation studies of the base learners in the meta-learning model. Low correlation among the predictions of the base learners suggested that the eight base learners had complementary predictive capabilities. The ablation analysis indicated that the eight base learners differentially interacted and contributed to the final meta model.
The results of the independent test demonstrated that the meta-learning approach markedly outperformed the single best-performing epitope predictor. Conclusions Computational B-cell epitope prediction tools exhibit several differences that affect their performances when predicting epitopic regions in protein antigens. The proposed meta-learning approach for epitope prediction combines multiple prediction tools by integrating their complementary predictive strengths. Our experimental results demonstrate the superior performance of the combined approach in comparison with single epitope predictors. Electronic supplementary material The online version of this article (doi:10.1186/s12859-014-0378-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yuh-Jyh Hu
- Department of Computer Science, National Chiao Tung University, 1001 University Rd, Hsinchu, Taiwan.
|
45
|
Mahdavi M, Keyhanfar M, Jafarian A, Mohabatkar H, Rabbani M. Immunization with a novel chimeric peptide representing B and T cell epitopes from HER2 extracellular domain (HER2 ECD) for breast cancer. Tumour Biol 2014; 35:12049-57. [DOI: 10.1007/s13277-014-2503-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/19/2014] [Accepted: 08/13/2014] [Indexed: 11/24/2022] Open
|
46
|
A Hadoop-based method to predict potential effective drug combination. BioMed Research International 2014; 2014:196858. [PMID: 25147789 PMCID: PMC4134802 DOI: 10.1155/2014/196858] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/31/2014] [Revised: 07/05/2014] [Accepted: 07/15/2014] [Indexed: 12/28/2022]
Abstract
Combination drugs that act on multiple targets simultaneously are promising candidates for combating complex diseases because of their improved efficacy and reduced side effects. However, exhaustive screening of all possible drug combinations is extremely time-consuming and impractical. Here, we present a novel Hadoop-based approach to predict drug combinations that takes advantage of the MapReduce programming model, which improves the scalability of the prediction algorithm. By integrating the gene expression data of multiple drugs, we implemented data preprocessing together with support vector machine and naïve Bayesian classifiers on Hadoop for the prediction of drug combinations. The experimental results suggest that our Hadoop-based model achieves much higher efficiency in the big data processing steps with satisfactory performance. We believe that our proposed approach can help accelerate the prediction of potentially effective drug combinations as the number of possible combinations grows exponentially. The source code and datasets are available upon request.
|
47
|
Zhang J, Zhao X, Sun P, Gao B, Ma Z. Conformational B-cell epitopes prediction from sequences using cost-sensitive ensemble classifiers and spatial clustering. BioMed Research International 2014; 2014:689219. [PMID: 25045691 PMCID: PMC4083607 DOI: 10.1155/2014/689219] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 01/03/2014] [Revised: 05/02/2014] [Accepted: 05/10/2014] [Indexed: 12/20/2022]
Abstract
B-cell epitopes are regions of the antigen surface that can be recognized by specific antibodies and elicit an immune response. Identifying the epitopes of a given antigen chain has vital applications in vaccine and drug research. Experimental identification of B-cell epitopes is time-consuming and resource-intensive, so it can benefit from computational approaches. In this paper, a novel cost-sensitive ensemble algorithm is proposed for predicting the antigenic determinant residues, and a spatial clustering algorithm is then adopted to identify the potential epitopes. First, we explore various discriminative features from primary sequences. Second, a cost-sensitive ensemble scheme is introduced to deal with the imbalanced learning problem. Third, we adopt a spatial clustering algorithm to determine which residues may form the epitopes. Based on these strategies, a new predictor, called CBEP (conformational B-cell epitopes prediction), is proposed in this study. CBEP achieves good prediction performance, with mean AUC scores of 0.721 and 0.703 on two benchmark datasets (bound and unbound) using leave-one-out cross-validation (LOOCV). Compared with previous prediction tools, CBEP produces higher sensitivity and comparable specificity. A web server named CBEP, which implements the proposed method, is available for academic use.
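The AUC figures quoted in this abstract measure how well per-residue scores rank epitope residues above non-epitope residues. The sketch below shows the standard pairwise (Mann-Whitney) definition of the AUC on invented labels and scores. It is not CBEP itself, and the function name `roc_auc` and the toy data are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    the fraction of (positive, negative) pairs in which the positive
    residue receives the higher score, counting ties as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = epitope residue, 0 = non-epitope residue.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.90, 0.40, 0.35, 0.80, 0.10]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

Because the AUC depends only on the ranking of scores, it is unaffected by the ratio of epitope to non-epitope residues. This makes it a common choice for imbalanced problems such as the one the cost-sensitive scheme addresses.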
Affiliation(s)
- Jian Zhang
- School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Xiaowei Zhao
- School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Pingping Sun
- School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- The Engineering Laboratory for Drug-Gene and Protein Screening, Northeast Normal University, Changchun 1300117, China
- Bo Gao
- School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
- Zhiqiang Ma
- School of Computer Science and Information Technology, Northeast Normal University, Changchun 1300117, China
|
48
|
Qi T, Qiu T, Zhang Q, Tang K, Fan Y, Qiu J, Wu D, Zhang W, Chen Y, Gao J, Zhu R, Cao Z. SEPPA 2.0--more refined server to predict spatial epitope considering species of immune host and subcellular localization of protein antigen. Nucleic Acids Res 2014; 42:W59-63. [PMID: 24838566 PMCID: PMC4086087 DOI: 10.1093/nar/gku395] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022] Open
Abstract
Spatial Epitope Prediction server for Protein Antigens (SEPPA) has received extensive feedback since it was published in 2009. In this improved version, the relative ASA preference of the unit patch and a consolidated amino acid index were added as classification parameters, in addition to the previously reported unit-triangle propensity and clustering coefficient. A logistic regression model was then adopted in place of the previous simple additive one. Most importantly, the subcellular localization of the protein antigen and the species of the immune host were fully taken into account to improve prediction. The results show that an AUC of 0.745 (5-fold cross-validation) is roughly the baseline performance when, as in all the other tools, no such differentiation is made. Specifying the subcellular localization of the protein antigen and the species of the immune host generally raises the AUC; for secretory proteins immunized into mouse, the AUC reaches 0.823. In this version, the false positive rate has also been largely decreased. As the first method to consider the subcellular localization of the protein antigen and the species of the immune host, SEPPA 2.0 shows clear advantages over other popular servers such as SEPPA, PEPITO, DiscoTope-2, B-pred, Bpredictor and Epitopia in supporting more specific biological needs. SEPPA 2.0 can be accessed at http://badd.tongji.edu.cn/seppa/. Batch query is also supported.
Affiliation(s)
- Tao Qi
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Tianyi Qiu
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Qingchen Zhang
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Kailin Tang
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Institute for Advanced Study of Translational Medicine, Tongji University, Shanghai 200092, China
- Yangyang Fan
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Jingxuan Qiu
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Dingfeng Wu
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Wei Zhang
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Yanan Chen
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Jun Gao
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Ruixin Zhu
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Zhiwei Cao
- School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Shanghai Center for Bioinformation and Technology, 1278 Keyuan Road, Shanghai 201203, China
|
49
|
Zheng W, Zhang C, Hanlon M, Ruan J, Gao J. An ensemble method for prediction of conformational B-cell epitopes from antigen sequences. Comput Biol Chem 2014; 49:51-8. [DOI: 10.1016/j.compbiolchem.2014.02.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 10/07/2013] [Revised: 01/26/2014] [Accepted: 02/10/2014] [Indexed: 12/12/2022]
|
50
|
Evans MC, Phung P, Paquet AC, Parikh A, Petropoulos CJ, Wrin T, Haddad M. Predicting HIV-1 broadly neutralizing antibody epitope networks using neutralization titers and a novel computational method. BMC Bioinformatics 2014; 15:77. [PMID: 24646213 PMCID: PMC3999910 DOI: 10.1186/1471-2105-15-77] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/18/2013] [Accepted: 03/03/2014] [Indexed: 11/26/2022] Open
Abstract
Background Recent efforts in HIV-1 vaccine design have focused on immunogens that evoke potent neutralizing antibody responses to a broad spectrum of viruses circulating worldwide. However, the development of effective vaccines will depend on the identification and characterization of the neutralizing antibodies and their epitopes. We developed bioinformatics methods to predict epitope networks and antigenic determinants using structural information, as well as corresponding genotypes and phenotypes generated by a highly sensitive and reproducible neutralization assay. A total of 282 clonal envelope sequences from a multiclade panel of HIV-1 viruses were tested in viral neutralization assays with an array of broadly neutralizing monoclonal antibodies (mAbs: b12, PG9, PG16, PGT121-128, PGT130-131, PGT135-137, PGT141-145, and PGV04). We correlated IC50 titers with the envelope sequences, and used this information to predict antibody epitope networks. Structural patches were defined as amino acid groups based on solvent accessibility, radius, atomic depth, and interaction networks within 3D envelope models. We applied a boosted algorithm consisting of multiple machine-learning and statistical models to evaluate these patches as possible antibody epitope regions, as evidenced by strong correlations with the neutralization response for each antibody. Results We identified patch clusters with significant correlation to IC50 titers as sites that impact neutralization sensitivity and therefore are potentially part of the antibody binding sites. Predicted epitope networks were mostly located within the variable loops of the envelope glycoprotein (gp120), particularly in V1/V2. Site-directed mutagenesis experiments involving residues identified as epitope networks across multiple mAbs confirmed association of these residues with loss or gain of neutralization sensitivity. 
Conclusions Computational methods were implemented to rapidly survey protein structures and predict epitope networks associated with response to individual monoclonal antibodies, which resulted in the identification and deeper understanding of immunological hotspots targeted by broadly neutralizing HIV-1 antibodies.
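The core step of relating IC50 titers to sequence-derived patch features can be sketched with a plain Pearson correlation. This is a deliberately simplified stand-in for the boosted multi-model algorithm mentioned in the abstract, not a reproduction of it. The patch names, the binary variant indicators, and the log-transformed titers below are all invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log10 IC50 titers of one antibody against six viruses
# (higher value = more resistant virus).
log_ic50 = [-1.0, -0.8, 1.2, 1.0, -0.9, 1.1]

# Hypothetical per-virus indicators: 1 if the virus carries a variant
# residue inside the given structural patch, 0 otherwise.
patch_variants = {
    "patch_A": [0, 0, 1, 1, 0, 1],
    "patch_B": [1, 0, 1, 0, 0, 1],
}

# Rank patches by the strength of their association with neutralization.
patch_correlation = {
    name: pearson(indicator, log_ic50)
    for name, indicator in patch_variants.items()
}
ranked_patches = sorted(
    patch_correlation, key=lambda name: abs(patch_correlation[name]), reverse=True
)
print(ranked_patches, patch_correlation)
```

In this toy data, variation in `patch_A` tracks resistance almost perfectly, so it would be prioritized as a candidate epitope region. A real analysis would also need multiple-testing control and would have to account for the correlation structure between related viruses.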
Affiliation(s)
- Mojgan Haddad
- Monogram Biosciences Inc, 345 Oyster Point Blvd, South San Francisco, CA 94080, USA.
|