1
Zhang X, Wu J, Baeza J, Gu K, Zheng Y, Chen S, Zhou Z. DeepTAP: An RNN-based method of TAP-binding peptide prediction in the selection of tumor neoantigens. Comput Biol Med 2023; 164:107247. [PMID: 37454505 DOI: 10.1016/j.compbiomed.2023.107247] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/31/2023] [Accepted: 07/07/2023] [Indexed: 07/18/2023]
Abstract
The transport of peptides from the cytoplasm to the endoplasmic reticulum (ER) by transporters associated with antigen processing (TAP) is a critical step in the intracellular presentation of cytotoxic T lymphocyte (CTL) epitopes. The development and application of computational methods, especially deep learning methods and new neural network strategies that can automatically learn feature representations with limited knowledge, provide an opportunity to develop fast and efficient methods to identify TAP-binding peptides. Herein, this study presents a comprehensive analysis of TAP-binding peptide sequences to derive TAP-binding motifs and preferences for N-terminal and C-terminal amino acids. A novel recurrent neural network (RNN)-based method called DeepTAP, using bidirectional gated recurrent unit (BiGRU), was developed for the accurate prediction of TAP-binding peptides. Our results demonstrated that DeepTAP achieves an optimal balance between prediction precision and false positives, outperforming other baseline models. Furthermore, DeepTAP significantly improves the prediction accuracy of high-confidence neoantigens, especially the top-ranked ones, making it a valuable tool for researchers studying antigen presentation processes and T-cell epitope screening. DeepTAP is freely available at https://github.com/zjupgx/deeptap and https://pgx.zju.edu.cn/deeptap.
Affiliation(s)
- Xue Zhang
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Jingcheng Wu
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Joseph Baeza
- Biology Program, Iowa State University, Ames, IA, 50011, USA
- Katie Gu
- Biology Program, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Yichun Zheng
- The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, 322000, China
- Shuqing Chen
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Zhan Zhou
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China; The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu, 322000, China; Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University, Hangzhou, 310018, China; Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 310058, China
2
Moura RRD, Agrelli A, Santos-Silva CA, Silva N, Assunção BR, Brandão L, Benko-Iseppon AM, Crovella S. Immunoinformatic approach to assess SARS-CoV-2 protein S epitopes recognised by the most frequent MHC-I alleles in the Brazilian population. J Clin Pathol 2021; 74:528-532. [PMID: 32759312 PMCID: PMC7409971 DOI: 10.1136/jclinpath-2020-206946] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/14/2020] [Revised: 07/21/2020] [Accepted: 07/23/2020] [Indexed: 12/23/2022]
Abstract
AIMS: Brazil is currently one of the epicentres of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, and new therapies are needed to face it. In the context of the specific immune response against the virus, a correlation between Major Histocompatibility Complex Class I (MHC-I) and the severity of the disease in patients with COVID-19 has been suggested. To better understand the biology of the infection and the immune response against the virus in the Brazilian population, we analysed SARS-CoV-2 protein S peptides using in silico methods to identify epitopes able to elicit an immune response mediated by the most frequent MHC-I alleles. METHODS: We searched for the most frequent Human Leukocyte Antigen (HLA)-A, HLA-B and HLA-C alleles in the Brazilian population, excluding the genetic isolates; we then performed molecular modelling for unsolved structures, MHC-I binding affinity and antigenicity prediction, peptide docking and molecular dynamics of the best-fitted MHC-I/protein S complexes. RESULTS: We identified 24 immunogenic epitopes in the SARS-CoV-2 protein S that could interact with 17 different MHC-I alleles (namely, HLA-A*01:01; HLA-A*02:01; HLA-A*11:01; HLA-A*24:02; HLA-A*68:01; HLA-A*23:01; HLA-A*26:01; HLA-A*30:02; HLA-A*31:01; HLA-B*07:02; HLA-B*51:01; HLA-B*35:01; HLA-B*44:02; HLA-B*35:03; HLA-C*05:01; HLA-C*07:01 and HLA-C*15:02) in the Brazilian population. CONCLUSIONS: Bearing in mind the intrinsic limitations of in silico analysis (mainly the differences between the real and the Protein Data Bank (PDB) structures, and the accuracy of the methods for simulating proteasome cleavage), we identified 24 epitopes able to interact with 17 of the most frequent MHC-I alleles in the Brazilian population; these could be useful for developing vaccine strategies against SARS-CoV-2.
Affiliation(s)
- Ronald Rodrigues de Moura
- Department of Advanced Diagnostics, IRCCS Materno Infantile Burlo Garofolo, Trieste, Friuli Venezia Giulia, Italy
- Almerinda Agrelli
- Department of Pathology, Federal University of Pernambuco, Recife, Brazil
- Natália Silva
- Department of Pathology, Federal University of Pernambuco, Recife, Brazil
- Lucas Brandão
- Department of Pathology, Federal University of Pernambuco, Recife, Brazil
- Sergio Crovella
- Department of Advanced Diagnostics, IRCCS Materno Infantile Burlo Garofolo, Trieste, Friuli Venezia Giulia, Italy
3
Li Z, Miao Q, Yan F, Meng Y, Zhou P. Machine Learning in Quantitative Protein–peptide Affinity Prediction: Implications for Therapeutic Peptide Design. Curr Drug Metab 2019; 20:170-176. [DOI: 10.2174/1389200219666181012151944] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 10/01/2017] [Revised: 11/07/2017] [Accepted: 08/20/2018] [Indexed: 01/03/2023]
Abstract
Background: Protein–peptide recognition plays an essential role in the orchestration and regulation of cell signaling networks; it is estimated to be responsible for up to 40% of biological interaction events in the human interactome and has recently been recognized as a new and attractive druggable target for drug development and disease intervention. Methods: We present a systematic review on the application of machine learning techniques in the quantitative modeling and prediction of protein–peptide binding affinity, particularly focusing on its implications for therapeutic peptide design. We also briefly introduce the physical quantities used to characterize protein–peptide affinity and attempt to extend the content of generalized machine learning methods. Results: Existing issues and future perspectives on the statistical modeling and regression prediction of protein–peptide binding affinity are discussed. Conclusion: There is still a long way to go before general, reliable and efficient machine learning-based protein–peptide affinity predictors are established.
Affiliation(s)
- Zhongyan Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Qingqing Miao
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Fugang Yan
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Yang Meng
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
4
Liu YY, Sha W, Xu S, Gui XW, Xia L, Ji P, Wang S, Zhao GP, Zhang X, Chen Y, Wang Y. Identification of HLA-A2-Restricted Mycobacterial Lipoprotein Z Peptides Recognized by T Cells From Patients With Active Tuberculosis Infection. Front Microbiol 2018; 9:3131. [PMID: 30622521 PMCID: PMC6308912 DOI: 10.3389/fmicb.2018.03131] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/26/2018] [Accepted: 12/04/2018] [Indexed: 12/14/2022]
Abstract
Identification of HLA-restricted peptides derived from mycobacterial antigens that are endowed with high affinity and strong antigenicity is not only of interest in tuberculosis (TB) diagnostics and treatment efficacy evaluation, but might also provide potential candidates for the development of therapeutic vaccines against drug-resistant TB. Our previous work demonstrated that lipoprotein Z (LppZ) displayed high immunogenicity and antigenicity in active TB patients. In the present study, ten HLA-A2-restricted LppZ peptides (LppZp1-10) were predicted by bioinformatics, among which LppZp7 and LppZp10 were verified to possess high affinity to HLA-A2 molecules using T2 cell-based affinity binding assay. Moreover, results from ELISpot assay showed that both LppZp7 and LppZp10 peptides were able to induce more IFN-γ producing cells upon ex vivo stimulation of PBMC from HLA-A2+ active TB (ATB) patients as compared to those from healthy controls (HCs). Also, the numbers of LppZp7 and LppZp10-specific IFN-γ producing cells exhibited positive correlations with those of ESAT-6 peptide (E6p) or CFP-10 peptide (C10p) in ATB. Interestingly, stimulation with LppZp7/p10 mixture was able to induce higher intracellular expression of IFN-γ and IL-2 cytokines in CD8+ and CD4+ T cells from ATB as compared to HC, associated with lower expression of TNF-α in both CD8+ and CD4+ T cells. Taken together, HLA-A2-restricted LppZp7 and LppZp10 peptides display high immunoreactivity in HLA-matched ATB patients demonstrated by high responsiveness in both CD8+ and CD4+ T cells. With the ability to induce strong antigen-specific cellular responses, LppZp7 and LppZp10 are of potential value for the future applications in the prevention and control of TB.
Affiliation(s)
- Yuan-Yong Liu
- School of Life Science and Technology, Changchun University of Science and Technology, Changchun, China; Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Wei Sha
- Clinic and Research Center of Tuberculosis, Shanghai Key Laboratory of Tuberculosis, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Shiqiang Xu
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Xu-Wei Gui
- Clinic and Research Center of Tuberculosis, Shanghai Key Laboratory of Tuberculosis, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- Liliang Xia
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ping Ji
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shujun Wang
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Guo-Ping Zhao
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, China
- Xiao Zhang
- School of Life Science and Technology, Changchun University of Science and Technology, Changchun, China
- Yingying Chen
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ying Wang
- Department of Microbiology and Immunology, Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, China
5
Cortes-Ciriano I, van Westen GJ, Lenselink EB, Murrell DS, Bender A, Malliavin T. Proteochemometric modeling in a Bayesian framework. J Cheminform 2014; 6:35. [PMID: 25045403 PMCID: PMC4083135 DOI: 10.1186/1758-2946-6-35] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 03/05/2014] [Accepted: 06/18/2014] [Indexed: 11/10/2022]
Abstract
Proteochemometrics (PCM) is an approach for bioactivity predictive modeling which models the relationship between protein and chemical information. Gaussian Processes (GP), based on Bayesian inference, provide the most objective estimation of the uncertainty of the predictions, thus permitting the evaluation of the applicability domain (AD) of the model. Furthermore, the experimental error on bioactivity measurements can be used as input for this probabilistic model. In this study, we apply GP implemented with a panel of kernels on three various (and multispecies) PCM datasets. The first dataset consisted of information from 8 human and rat adenosine receptors with 10,999 small molecule ligands and their binding affinity. The second consisted of the catalytic activity of four dengue virus NS3 proteases on 56 small peptides. Finally, we have gathered bioactivity information of small molecule ligands on 91 aminergic GPCRs from 9 different species, leading to a dataset of 24,593 datapoints with a matrix completeness of only 2.43%. GP models trained on these datasets are statistically sound, at the same level of statistical significance as Support Vector Machines (SVM), with R₀² values on the external dataset ranging from 0.68 to 0.92, and RMSEP values close to the experimental error. Furthermore, the best GP models obtained with the normalized polynomial and radial kernels provide intervals of confidence for the predictions in agreement with the cumulative Gaussian distribution. GP models were also interpreted on the basis of individual targets and of ligand descriptors. In the dengue dataset, the model interpretation in terms of the amino-acid positions in the tetra-peptide ligands gave biologically meaningful results.
Affiliation(s)
- Isidro Cortes-Ciriano
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3825; Département de Biologie Structurale et Chimie
- Gerard Jp van Westen
- ChEMBL Group, European Molecular Biology Laboratory European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
- Eelke Bart Lenselink
- Division of Medicinal Chemistry, Leiden Academic Center for Drug Research, Leiden, The Netherlands
- Daniel S Murrell
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Andreas Bender
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Thérèse Malliavin
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3825; Département de Biologie Structurale et Chimie
6
Karpenko LI, Bazhan SI, Antonets DV, Belyakov IM. Novel approaches in polyepitope T-cell vaccine development against HIV-1. Expert Rev Vaccines 2013; 13:155-73. [PMID: 24308576 DOI: 10.1586/14760584.2014.861748] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
The RV144 clinical trial was modestly effective in preventing HIV infection. New alternative approaches are needed to design improved HIV-1 vaccines and their delivery strategies. One of these approaches is the construction of synthetic polyepitope HIV-1 immunogens using protective T- and B-cell epitopes that can induce broadly neutralizing antibodies and responses of cytotoxic (CD8(+) CTL) and helper (CD4(+) Th) T-lymphocytes. This approach seems promising for designing a new generation of vaccines against HIV-1: in theory it makes it possible to cope with HIV-1 antigenic variability, focuses immune responses on protective determinants, and allows any components that can induce autoantibodies or antibodies enhancing HIV-1 infectivity to be excluded from the vaccine. Herein, the authors focus on the construction and rational design of polyepitope T-cell HIV-1 immunogens and their delivery, including: advantages and disadvantages of existing T-cell epitope prediction methods; features of the organization of polyepitope immunogens that can generate high-level CD8(+) and CD4(+) T-lymphocyte responses; strategies to optimize efficient processing, presentation and immunogenicity of polyepitope constructs; original software to design polyepitope immunogens; and delivery vectors as well as mucosal strategies of vaccination. This new knowledge may bring us one step closer to developing an effective T-cell vaccine against HIV-1, other chronic viral infections and cancer.
Affiliation(s)
- Larisa I Karpenko
- State Research Center of Virology and Biotechnology "Vector", Koltsovo, Novosibirsk region, 630559, Russia
7
Antonets DV, Bazhan SI. PolyCTLDesigner: a computational tool for constructing polyepitope T-cell antigens. BMC Res Notes 2013; 6:407. [PMID: 24107711 PMCID: PMC3853014 DOI: 10.1186/1756-0500-6-407] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 02/14/2013] [Accepted: 09/24/2013] [Indexed: 11/10/2022]
Abstract
Background: Construction of artificial polyepitope antigens is one of the most promising strategies for developing more efficient and safer vaccines evoking T-cell immune responses. Epitope rearrangements and the use of certain spacer sequences have been shown to greatly influence the immunogenicity of polyepitope constructs. However, despite numerous efforts towards constructing and evaluating artificial polyepitope immunogens, and despite the numerous computational methods elaborated to date for predicting T-cell epitopes, TAP-binding peptides and antigen processing, only a few computational tools have so far been developed for the rational design of polyepitope antigens. Findings: Here we present PolyCTLDesigner, a program intended for constructing polyepitope immunogens. Given a set of either known or predicted T-cell epitopes, the program selects N-terminal flanking sequences for each epitope to optimize its binding to TAP (if necessary) and joins the resulting oligopeptides into a polyepitope in a way that provides efficient liberation of potential epitopes by proteasomal and/or immunoproteasomal processing. It also tries to minimize the number of non-target junctional epitopes resulting from the artificial juxtaposition of target epitopes within the polyepitope. For constructing polyepitopes, PolyCTLDesigner utilizes known amino acid patterns of TAP-binding and proteasomal/immunoproteasomal cleavage specificity together with genetic algorithm and graph theory approaches. The program was implemented in the Python programming language and can be used either interactively or through scripting, which allows users familiar with Python to create custom pipelines. Conclusions: The developed software realizes a rational approach to designing poly-CTL-epitope antigens and can be used to develop new candidate polyepitope vaccines. The current version of PolyCTLDesigner is integrated with our TEpredict program for predicting T-cell epitopes, and thus it can be used not only for constructing polyepitope antigens based on preselected sets of T-cell epitopes, but also for predicting cytotoxic and helper T-cell epitopes within selected protein antigens. PolyCTLDesigner is freely available from the project’s web site: http://tepredict.sourceforge.net/PolyCTLDesigner.html.
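The idea of junctional epitopes mentioned in this abstract can be illustrated with a short sketch. This is not PolyCTLDesigner's code or API: the function name `junctional_peptides`, the 9-residue window and the two example peptide strings are assumptions made purely for illustration. The sketch lists every window that contains residues from both sides of a junction between two directly joined epitopes; in a real design workflow these candidates would then be scored with an epitope predictor, and spacers or a different epitope order chosen to reduce unwanted hits.

```python
def junctional_peptides(left, right, k=9):
    """Return all k-mers of left+right that contain residues from both sides.

    Hypothetical helper for illustration only; not part of any published tool.
    """
    joined = left + right
    boundary = len(left)  # index of the first residue contributed by `right`
    # A window [start, start + k) spans the junction when it starts before the
    # boundary and ends after it, and it must also fit inside the joined string.
    first = max(0, boundary - k + 1)
    last = min(boundary - 1, len(joined) - k)
    return [joined[start:start + k] for start in range(first, last + 1)]


if __name__ == "__main__":
    # Example strings used only as inputs; any two peptide sequences would do.
    for peptide in junctional_peptides("SIINFEKL", "GILGFVFTL"):
        print(peptide)
```

Inserting a spacer between the two epitopes changes which windows arise, which is one reason spacer choice and epitope order matter when junctional epitopes are to be minimized.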
Affiliation(s)
- Denis V Antonets
- State Research Center of Virology and Biotechnology "Vector", Koltsovo, Novosibirsk Region, Russian Federation.
8
Abawajy J, Kelarev A, Chowdhury M, Stranieri A, Jelinek HF. Predicting cardiac autonomic neuropathy category for diabetic data with missing values. Comput Biol Med 2013; 43:1328-33. [PMID: 24034723 DOI: 10.1016/j.compbiomed.2013.07.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 11/24/2012] [Revised: 07/02/2013] [Accepted: 07/04/2013] [Indexed: 10/26/2022]
Abstract
Cardiovascular autonomic neuropathy (CAN) is a serious and well known complication of diabetes. Previous articles circumvented the problem of missing values in CAN data by deleting all records and fields with missing values and applying classifiers trained on different sets of features that were complete. Most of them also added alternative features to compensate for the deleted ones. Here we introduce and investigate a new method for classifying CAN data with missing values. In contrast to all previous papers, our new method does not delete attributes with missing values, does not use classifiers, and does not add features. Instead it is based on regression and meta-regression combined with the Ewing formula for identifying the classes of CAN. This is the first article using the Ewing formula and regression to classify CAN. We carried out extensive experiments to determine the best combination of regression and meta-regression techniques for classifying CAN data with missing values. The best outcomes have been obtained by the additive regression meta-learner based on M5Rules and combined with the Ewing formula. It has achieved the best accuracy of 99.78% for two classes of CAN, and 98.98% for three classes of CAN. These outcomes are substantially better than previous results obtained in the literature by deleting all missing attributes and applying traditional classifiers to different sets of features without regression. Another advantage of our method is that it does not require practitioners to perform more tests collecting additional alternative features.
Affiliation(s)
- Jemal Abawajy
- School of Information Technology, Deakin University, 221 Burwood Hwy, VIC 3125, Australia
9
Biomacromolecular quantitative structure–activity relationship (BioQSAR): a proof-of-concept study on the modeling, prediction and interpretation of protein–protein binding affinity. J Comput Aided Mol Des 2013; 27:67-78. [DOI: 10.1007/s10822-012-9625-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Received: 07/30/2012] [Accepted: 12/12/2012] [Indexed: 01/22/2023]