1
Collesano L, Łuksza M, Lässig M. Energy landscapes of peptide-MHC binding. PLoS Comput Biol 2024; 20:e1012380. [PMID: 39226310 PMCID: PMC11398667 DOI: 10.1371/journal.pcbi.1012380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 09/13/2024] [Accepted: 07/31/2024] [Indexed: 09/05/2024]
Abstract
Molecules of the Major Histocompatibility Complex (MHC) present short protein fragments on the cell surface, an important step in T cell immune recognition. MHC-I molecules process peptides from intracellular proteins; MHC-II molecules act in antigen-presenting cells and present peptides derived from extracellular proteins. Here we show that the sequence-dependent energy landscapes of MHC-peptide binding encode class-specific nonlinearities (epistasis). MHC-I has a smooth landscape with global epistasis; the binding energy is a simple deformation of an underlying linear trait. This form of epistasis enhances the discrimination between strong-binding peptides. In contrast, MHC-II has a rugged landscape with idiosyncratic epistasis: binding depends on detailed amino acid combinations at multiple positions of the peptide sequence. The form of epistasis affects the learning of energy landscapes from training data. For MHC-I, a low-complexity problem, we derive a simple matrix model of binding energies that outperforms current models trained by machine learning. For MHC-II, higher complexity prevents learning by simple regression methods. Epistasis also affects the energy and fitness effects of mutations in antigen-derived peptides (epitopes). In MHC-I, large-effect mutations occur predominantly in anchor positions of strong-binding epitopes. In MHC-II, large effects depend on the background epitope sequence but are broadly distributed over the epitope, generating a bigger target for escape mutations due to loss of presentation. Together, our analysis shows how an energy landscape of protein-protein binding constrains the target of escape mutations from T cell immunity, linking the complexity of the molecular interactions to the dynamics of adaptive immune response.
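The global epistasis described in this abstract, a binding energy that is a nonlinear deformation of an underlying additive trait, can be illustrated with a minimal Python sketch. The matrix `ADDITIVE`, the three-letter alphabet, and the saturating function `deform` are hypothetical choices made for illustration only; they are not the model or the parameters derived in the paper.

```python
import math

# Hypothetical additive energy matrix (arbitrary units), one dict per position.
# A realistic model would cover all 20 amino acids and every peptide position.
ADDITIVE = [
    {"A": 0.0, "L": -1.5, "K": 1.0},
    {"A": 0.0, "L": -0.5, "K": 0.5},
    {"A": 0.0, "L": -2.0, "K": 1.5},
]

def linear_trait(peptide):
    """Sum of independent per-position contributions (no epistasis)."""
    return sum(ADDITIVE[i][aa] for i, aa in enumerate(peptide))

def deform(x):
    """Assumed monotonic nonlinearity: close to x for x << 0, saturating near 0 for x >> 0."""
    return -math.log1p(math.exp(-x))

def binding_energy(peptide):
    """Binding energy as a deformation of the linear trait (global epistasis)."""
    return deform(linear_trait(peptide))

def pairwise_epistasis(energy, wild_type, single_a, single_b, double):
    """Deviation of the double mutant from the sum of the two single-mutant effects."""
    return energy(double) - energy(single_a) - energy(single_b) + energy(wild_type)

linear_eps = pairwise_epistasis(linear_trait, "AAA", "LAA", "AAL", "LAL")
global_eps = pairwise_epistasis(binding_energy, "AAA", "LAA", "AAL", "LAL")
print(linear_eps, global_eps)
```

In this toy setting the additive trait shows no pairwise epistasis, while the deformed energy does, even though no explicit pair terms were introduced.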
Affiliation(s)
- Laura Collesano
- Institute for Biological Physics, University of Cologne, Cologne, Germany
- Marta Łuksza
- Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Cologne, Germany
2
Vincenzi M, Mercurio FA, Leone M. Virtual Screening of Peptide Libraries: The Search for Peptide-Based Therapeutics Using Computational Tools. Int J Mol Sci 2024; 25:1798. [PMID: 38339078 PMCID: PMC10855943 DOI: 10.3390/ijms25031798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024]
Abstract
Over the last few decades, we have witnessed growing interest from both academic and industrial laboratories in peptides as possible therapeutics. Bioactive peptides have a high potential to treat various diseases with specificity and biological safety. Compared to small molecules, peptides represent better candidates as inhibitors (or general modulators) of key protein-protein interactions. In fact, undruggable proteins containing large and smooth surfaces can be more easily targeted with the conformational plasticity of peptides. The discovery of bioactive peptides, working against disease-relevant protein targets, generally requires the high-throughput screening of large libraries, and in silico approaches are highly exploited for their low-cost incidence and efficiency. The present review reports on the potential challenges linked to the employment of peptides as therapeutics and describes computational approaches, mainly structure-based virtual screening (SBVS), to support the identification of novel peptides for therapeutic implementations. Cutting-edge SBVS strategies are reviewed along with examples of applications focused on diverse classes of bioactive peptides (i.e., anticancer, antimicrobial/antiviral peptides, peptides blocking amyloid fiber formation).
Affiliation(s)
- Marilisa Leone
- Institute of Biostructures and Bioimaging, Via Pietro Castellino 111, 80131 Naples, Italy; (M.V.); (F.A.M.)
3
Pelaez-Prestel HF, Fernandez SA, Ballesteros-Sanabria L, Reche PA. Prediction of TAP Transport of Peptides with Variable Length Using TAPREG. Methods Mol Biol 2023; 2673:227-235. [PMID: 37258918 DOI: 10.1007/978-1-0716-3239-0_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
CD8 T cells recognize short peptides, most frequently of nine residues, presented by class I major histocompatibility complex (MHC I) molecules on the cell surface of antigen-presenting cells. These epitope peptides are loaded onto MHC I molecules in the endoplasmic reticulum, to which they are shuttled from the cytosol by the transporter associated with antigen processing (TAP), either as such or as N-terminally extended precursors of up to 16 residues. In this chapter, we describe the use of TAPREG, a tool for predicting TAP binding affinity that has been enhanced to identify potential CD8 T cell epitope precursors transported by TAP. TAPREG is available for free public use at http://imed.med.ucm.es/Tools/tapreg/.
Affiliation(s)
- Hector F Pelaez-Prestel
- School of Medicine, Department of Immunology, Complutense University of Madrid, Madrid, Spain
- Sara Alonso Fernandez
- School of Medicine, Department of Immunology, Complutense University of Madrid, Madrid, Spain
- Pedro A Reche
- School of Medicine, Department of Immunology, Complutense University of Madrid, Madrid, Spain
4
Zhou P, Liu Q, Wu T, Miao Q, Shang S, Wang H, Chen Z, Wang S, Wang H. Systematic Comparison and Comprehensive Evaluation of 80 Amino Acid Descriptors in Peptide QSAR Modeling. J Chem Inf Model 2021; 61:1718-1731. [DOI: 10.1021/acs.jcim.0c01370] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Indexed: 02/07/2023]
Affiliation(s)
- Peng Zhou
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Qian Liu
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Ting Wu
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Qingqing Miao
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Shuyong Shang
- College of Chemistry and Life Science, Chengdu Normal University, Chengdu 611130, China
- Heyi Wang
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Zheng Chen
- Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Shaozhou Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
- Heyan Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
5
Li Z, Miao Q, Yan F, Meng Y, Zhou P. Machine Learning in Quantitative Protein–peptide Affinity Prediction: Implications for Therapeutic Peptide Design. Curr Drug Metab 2019; 20:170-176. [DOI: 10.2174/1389200219666181012151944] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 10/01/2017] [Revised: 11/07/2017] [Accepted: 08/20/2018] [Indexed: 01/03/2023]
Abstract
Background: Protein–peptide recognition plays an essential role in the orchestration and regulation of cell signaling networks. It is estimated to be responsible for up to 40% of biological interaction events in the human interactome and has recently been recognized as a new and attractive druggable target for drug development and disease intervention. Methods: We present a systematic review of the application of machine learning techniques in the quantitative modeling and prediction of protein–peptide binding affinity, focusing in particular on its implications for therapeutic peptide design. We also briefly introduce the physical quantities used to characterize protein–peptide affinity and attempt to extend the content of generalized machine learning methods. Results: Existing issues and future perspectives on the statistical modeling and regression prediction of protein–peptide binding affinity are discussed. Conclusion: There is still a long way to go before the establishment of general, reliable and efficient machine learning-based protein–peptide affinity predictors.
Affiliation(s)
- Zhongyan Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Qingqing Miao
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Fugang Yan
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Yang Meng
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
- Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
6
Quantitative prediction of peptide binding affinity by using hybrid fuzzy support vector regression. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.01.024] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
7
Li B, Zheng X, Hu C, Cao Y. Human Papillomavirus Genome-Wide Identification of T-Cell Epitopes for Peptide Vaccine Development Against Cervical Cancer: An Integration of Computational Analysis and Experimental Assay. J Comput Biol 2015; 22:962-74. [PMID: 26418056 DOI: 10.1089/cmb.2014.0287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/18/2022]
Affiliation(s)
- Bo Li
- Department of Obstetrics and Gynecology, Anhui Medical University, Hefei, China
- Xianfang Zheng
- Department of Obstetrics and Gynecology, Chaohu Hospital of Anhui Medical University, Chaohu, China
- Chuancui Hu
- Department of Obstetrics and Gynecology, Chaohu Hospital of Anhui Medical University, Chaohu, China
- Yunxia Cao
- Department of Obstetrics and Gynecology, Anhui Medical University, Hefei, China
8
Computational prediction of broadly neutralizing HIV-1 antibody epitopes from neutralization activity data. PLoS One 2013; 8:e80562. [PMID: 24312481 PMCID: PMC3846483 DOI: 10.1371/journal.pone.0080562] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 06/03/2013] [Accepted: 10/03/2013] [Indexed: 11/19/2022]
Abstract
Broadly neutralizing monoclonal antibodies effective against the majority of circulating isolates of HIV-1 have been isolated from a small number of infected individuals. Definition of the conformational epitopes on the HIV spike to which these antibodies bind is of great value in defining targets for vaccine and drug design. Drawing on techniques from compressed sensing and information theory, we developed a computational methodology to predict key residues constituting the conformational epitopes on the viral spike from cross-clade neutralization activity data. Our approach does not require the availability of structural information for either the antibody or antigen. Predictions of the conformational epitopes of ten broadly neutralizing HIV-1 antibodies are shown to be in good agreement with new and existing experimental data. Our findings suggest that our approach offers a means to accelerate epitope identification for diverse pathogenic antigens.
9
Luo F, Gao Y, Zhu Y, Liu J. Integrating peptides' sequence and energy of contact residues information improves prediction of peptide and HLA-I binding with unknown alleles. BMC Bioinformatics 2013; 14 Suppl 8:S1. [PMID: 23815611 PMCID: PMC3654895 DOI: 10.1186/1471-2105-14-s8-s1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
Abstract
Background: The HLA (human leukocyte antigen) class I molecules are encoded by a large family of genes and are characterized by high polymorphism. The number of registered HLA-I molecules now exceeds 3000. Slight differences in the amino acid sequences of HLAs make them bind to different sets of peptides. In the past decades, although many methods have been proposed to predict the binding between peptides and HLA-I molecules and have achieved good performance, most of the experimental data they use is limited to HLAs with a small number of alleles. Thus they tend to obtain high prediction accuracy only for data with similar alleles. Because the peptides and HLAs together determine the binding, it is necessary to consider both contributions simultaneously. Results: By taking into account the features of the peptide sequence and the energy of contact residues, this paper proposes a method based on an artificial neural network to predict the binding of peptides and HLA-I even when the HLAs' potential alleles are unknown. Two experiments, in the allele-specific and super-type cases respectively, are performed to validate the method. In the first case, we collect 14 HLA-A and 14 HLA-B molecules from the Bjoern Peters dataset, and compare our method with ARB, SMM, NetMHC and 16 other online methods. Our method obtains the best average AUC (area under the ROC curve) value, 0.909. In the second case, we use leave-one-out cross-validation on MHC-peptide binding data that has different alleles but shares a common super-type. Compared to gold standard methods like NetMHC and NetMHCpan, our method again achieves the best average AUC value, 0.847. Conclusions: Our method achieves satisfactory results. Whether it is tested on HLA-I with a single definite gene or with a super-type gene locus, it obtains better classification accuracy. In particular, when the training set is small, our method still works better than the other methods in the comparison. Therefore, we conclude that by combining the peptides' information, the HLA amino acid residues' interaction information and the contact energy, our method can improve prediction of peptide-HLA-I binding even when no prior experimental dataset is available for HLAs with various alleles.
Affiliation(s)
- Fei Luo
- School of Computer, Wuhan University, Wuhan, Hubei, China
10
Koch CP, Pillong M, Hiss JA, Schneider G. Computational Resources for MHC Ligand Identification. Mol Inform 2013; 32:326-36. [PMID: 27481589 DOI: 10.1002/minf.201300042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/09/2012] [Accepted: 04/04/2013] [Indexed: 01/16/2023]
Abstract
Advances in the high-throughput determination of functional modulators of major histocompatibility complex (MHC) and improved computational predictions of MHC ligands have rendered the rational design of immunomodulatory peptides feasible. Proteome-derived peptides and 'reverse vaccinology' by computational means will play a driving role in future vaccine design. Here we review the molecular mechanisms of the MHC mediated immune response, present the computational approaches that have emerged in this area of biotechnology, and provide an overview of publicly available computational resources for predicting and designing new peptidic MHC ligands.
Affiliation(s)
- Christian P Koch
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Max Pillong
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Jan A Hiss
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Gisbert Schneider
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
11
Doytchinova I, Petkov P, Dimitrov I, Atanasova M, Flower DR. HLA-DP2 binding prediction by molecular dynamics simulations. Protein Sci 2011; 20:1918-28. [PMID: 21898654 DOI: 10.1002/pro.732] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 05/17/2011] [Revised: 08/16/2011] [Accepted: 08/21/2011] [Indexed: 11/11/2022]
Abstract
Major histocompatibility complex (MHC) II proteins bind peptide fragments derived from pathogen antigens and present them at the cell surface for recognition by T cells. MHC proteins are divided into Class I and Class II. Human MHC Class II alleles are grouped into three loci: HLA-DP, HLA-DQ, and HLA-DR. They are involved in many autoimmune diseases. In contrast to HLA-DR and HLA-DQ proteins, the X-ray structure of the HLA-DP2 protein has been solved only recently. In this study, we have used structure-based molecular dynamics simulation to derive a tool for rapid and accurate virtual screening for the prediction of HLA-DP2-peptide binding. A combinatorial library of 247 peptides was built using the "single amino acid substitution" approach and docked into the HLA-DP2 binding site. The complexes were simulated for 1 ns and the short-range interaction energies (Lennard-Jones and Coulomb) were used as binding scores after normalization. The normalized values were collected into quantitative matrices (QMs) and their predictive abilities were validated on a large external test set. The validation shows that the best-performing QM consisted of Lennard-Jones energies normalized over all positions for anchor residues only plus cross terms between anchor residues.
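Scoring with a quantitative matrix that has anchor-only terms plus cross terms between anchor residues can be sketched in a few lines of Python. The anchor positions, matrix entries, and cross-term values below are invented for illustration and assume already-normalized energies; they are not the values reported in the study.

```python
# Hypothetical zero-based anchor positions in a 9-residue binding core.
ANCHORS = (0, 5)

# Hypothetical normalized per-position scores for the anchor positions only.
QM = {
    0: {"F": -1.0, "A": 0.0},
    5: {"L": -0.8, "A": 0.0},
}

# Hypothetical cross terms for specific residue pairs at the two anchors.
CROSS = {("F", "L"): -0.3}

def qm_score(peptide):
    """Sum the anchor contributions, then add the anchor-anchor cross term if one is defined."""
    single = sum(QM[pos].get(peptide[pos], 0.0) for pos in ANCHORS)
    pair = tuple(peptide[pos] for pos in ANCHORS)
    return single + CROSS.get(pair, 0.0)

print(qm_score("FAAAALAAA"))
print(qm_score("AAAAAAAAA"))
```

Residues missing from the toy matrix default to a score of zero here, which is a simplification; a full matrix would hold an entry for every amino acid at each scored position.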
Affiliation(s)
- Irini Doytchinova
- School of Pharmacy, Medical University of Sofia, Sofia 1000, Bulgaria.
12
Bi J, Song R, Yang H, Li B, Fan J, Liu Z, Long C. Stepwise identification of HLA-A*0201-restricted CD8+ T-cell epitope peptides from herpes simplex virus type 1 genome boosted by a StepRank scheme. Biopolymers 2011; 96:328-39. [PMID: 21072852 DOI: 10.1002/bip.21564] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]
Abstract
Identification of immunodominant epitopes is the first step in the rational design of peptide vaccines aimed at T-cell immunity. To date, however, accurately and efficiently predicting the potent epitope peptides from a large pool of candidates remains a great challenge. In this study, a method that we named StepRank has been developed for the reliable and rapid prediction of binding capabilities/affinities between proteins and genome-wide peptides. In this procedure, instead of the single strategy used in most traditional epitope identification algorithms, four steps with different purposes and thus different computational demands are employed in turn to screen the large-scale peptide candidates that are normally generated from, for example, a pathogenic genome. Steps 1 and 2 aim at the qualitative exclusion of typical nonbinders by using an empirical rule and a linear statistical approach, while steps 3 and 4 focus on the quantitative examination and prediction of the interaction energy profile and binding affinity of a peptide to the target protein via quantitative structure-activity relationship (QSAR) and structure-based free energy analysis. We exemplify this method through its application to binding predictions for the peptide segments derived from the 76 known open reading frames (ORFs) of the herpes simplex virus type 1 (HSV-1) genome, with or without affinity to the human major histocompatibility complex class I (MHC I) molecule HLA-A*0201, and find that the predictive results are well compatible with the classical anchor residue theory and perfectly match the extended motif pattern of MHC I-binding peptides. The putative epitopes are further confirmed by comparisons with 11 experimentally measured HLA-A*0201-restricted peptides from the HSV-1 glycoproteins D and K. We expect that this well-designed scheme can be applied in the computational screening of other viral genomes as well.
Affiliation(s)
- Jianjun Bi
- Department of Dermatology, General Hospital of Guangzhou Military Command of PLA, Guangzhou, China
13
Rivera CG, Rosca EV, Pandey NB, Koskimaki JE, Bader JS, Popel AS. Novel peptide-specific quantitative structure-activity relationship (QSAR) analysis applied to collagen IV peptides with antiangiogenic activity. J Med Chem 2011; 54:6492-500. [PMID: 21866962 DOI: 10.1021/jm200114f] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Angiogenesis is the growth of new blood vessels from existing vasculature. Excessive vascularization is associated with a number of diseases including cancer. Antiangiogenic therapies have the potential to stunt cancer progression. Peptides derived from type IV collagen are potent inhibitors of angiogenesis. We wanted to gain a better understanding of collagen IV structure-activity relationships using a ligand-based approach. We developed novel peptide-specific QSAR models to study the activity of the peptides in endothelial cell proliferation, migration, and adhesion inhibition assays. We found that the models produced quantitatively accurate predictions of activity and provided insight into collagen IV derived peptide structure-activity relationships.
Affiliation(s)
- Corban G Rivera
- Department of Biomedical Engineering, 613 Traylor Building, Johns Hopkins University, 720 Rutland Avenue, Baltimore, Maryland 21205, United States.
14
Liao WWP, Arthur JW. Predicting peptide binding to Major Histocompatibility Complex molecules. Autoimmun Rev 2011; 10:469-73. [PMID: 21333759 DOI: 10.1016/j.autrev.2011.02.003] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 01/28/2011] [Accepted: 02/09/2011] [Indexed: 12/29/2022]
Abstract
The Major Histocompatibility Complex (MHC) constitutes an important part of the human immune system. During infection, pathogenic proteins are processed into peptide fragments by the antigen processing machinery. These peptides bind to MHC molecules and the MHC-peptide complex is then transported to the cell membrane where it elicits an immune response via T-cell binding. Understanding the molecular mechanism of this process will greatly assist in determining the aetiology of various diseases and in the design of effective drugs. One of the most challenging aspects of this area of research is understanding the specificity and sensitivity of the binding process. An empirical approach to the problem is unfeasible as there are over 512 billion potential binding peptides for each MHC molecule. Computational approaches offer the promise of predicting peptide binding, thus dramatically reducing the number of peptides proceeding to experimental verification. Various bioinformatic approaches have been developed to predict whether or not a particular peptide will bind to a particular MHC allele. Currently, peptide binding prediction methods can be categorised into three major groups: motif- and scoring matrix-based methods, artificial intelligence- (AI-) based methods, and structure-based methods. The first two are sequence-based approaches and are generally based on common sequence motifs in peptides known to bind to MHC molecules. The structure-based approach concerns the structural features and the distribution of energy between the binding peptide and the MHC molecule. Although knowledge of the molecular structure of the MHC molecules is expected to lead to better predictions of peptide binding, the development of structure-based methods has been relatively slow compared to sequence-based methods. Comparisons of various methods showed that the best sequence-based methods significantly outperform structure-based methods. 
This may be improved by producing more structures and binding data desperately needed by many alleles, especially class II molecules. On the other hand, the large number of verification methods and indicators used by structure-based studies hinders critical evaluation of the methods. Adopting commonly used assessment procedures can demonstrate the relative performance of structure-based methods in a straightforward comparison with other methods. This review provides an overview of current methods for predicting peptide binding to the MHC, with a focus on structure-based methods, and explores the potential for future development in this area.
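The figure of roughly 512 billion potential binding peptides quoted in this abstract is consistent with counting every possible nonamer over the 20 standard amino acids. That reading is an assumption, since the abstract does not state how the number was obtained, but the arithmetic is quick to check:

```python
# Assumed interpretation: 20 standard amino acids, peptides of length 9.
AMINO_ACIDS = 20
PEPTIDE_LENGTH = 9

n_peptides = AMINO_ACIDS ** PEPTIDE_LENGTH
print(f"{n_peptides:,}")  # 512,000,000,000
```

Longer peptides, such as those bound by class II molecules, would push the count higher still.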
Affiliation(s)
- Webber W P Liao
- Discipline of Medicine, Central Clinical School, University of Sydney, NSW, 2006, Australia
15
Tian F, Zhang C, Fan X, Yang X, Wang X, Liang H. Predicting the Flexibility Profile of Ribosomal RNAs. Mol Inform 2010; 29:707-15. [PMID: 27464014 DOI: 10.1002/minf.201000092] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/17/2010] [Accepted: 09/28/2010] [Indexed: 11/06/2022]
Abstract
Flexibility in biomolecules is an important determinant of biological functionality, which can be measured quantitatively by atomic Debye-Waller factor or B-factor. Although numerous works have been addressed on theoretical and computational studies of the B-factor profiles of proteins, the methods used for predicting B-factor values of nucleic acids, especially the complicated ribosomal RNAs (rRNAs), which are very functionally similar to proteins in providing matrix structures and in catalyzing biochemical reactions, still remain unexploited. In this article, we present a quantitative structure-flexibility relationship (QSFR) study with the aim at the quantitative prediction of rRNA B-factor based on primary sequences (sequence-based) and advanced structures (structure-based) by using both linear and nonlinear machine learning approaches, including partial least squares regression (PLS), least squares support vector machine (LSSVM), and Gaussian process (GP). By rigorously examining the performance and reliability of constructed statistical models and by comparing our models in detail to those developed previously for protein B-factors, we demonstrate that (i) rRNA B-factors could be predicted at a similar level of accuracy with that of protein, (ii) a structure-based approach performed much better as compared to sequence-based methods in modeling of rRNA B-factors, and (iii) rRNA flexibility is primarily governed by the local features of nonbonding potential landscapes, such as electrostatic and van der Waals forces.
Affiliation(s)
- Feifei Tian
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- College of Bioengineering, Chongqing University, Chongqing 400044, China
- Chun Zhang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xia Fan
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xue Yang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Xi Wang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
- Huaping Liang
- State Key Laboratory of Trauma, Burns and Combined Injury, Research Institute of Surgery, Daping Hospital, The Third Military Medical University, Chongqing 400042, China; phone: +86 23 68757411, fax: +86 23 68757404
16
Hu L, Ai Z, Liu P, Xiong Q, Min M, Lan C, Wang J, Fan L, Chen D. Predicting the binding affinity of epitope-peptides with HLA-A*0201 by encoding atom-pair non-covalent interaction information between receptor and ligands. Chem Biol Drug Des 2010; 75:597-606. [PMID: 20565476 DOI: 10.1111/j.1747-0285.2010.00975.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
Abstract
A structure-based method was used to characterize the non-covalent interactions of HLA-A*0201 with its peptide ligands. In this procedure, protein and peptide atoms were classified into 16 types in terms of their chemical property and local environment, and a 16 x 16 matrix was then defined to describe the interaction mode of 256 atom pairs between the receptor and ligand in a complex structure. Three biologically relevant chemical forces, namely electrostatic, van der Waals, and hydrophobic potentials, were separately calculated for each element of the matrix to yield 768 structural descriptors encoding the detailed information about the non-covalent interactions involved in protein-peptide binding. We employed this method to perform a quantitative structure-activity relationship (QSAR) study of a data panel consisting of 419 nonapeptides with known binding affinities to the HLA-A*0201 protein. Several QSAR models were constructed using partial least squares regression (PLS) with or without genetic algorithm (GA) variable selection, and these models were validated rigorously and investigated systematically by using an external test set and one-way analysis of variance. Results show that diverse properties make significant contributions to HLA-A*0201-peptide binding. In particular, the hydrophobicity and electrostatic properties at the anchor residues of peptides confer significant specificity and stability on the bound complexes.
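The descriptor count in this abstract follows from 16 x 16 = 256 atom-type pairs times three potentials, giving 768 features. The Python sketch below shows one way such a vector could be laid out; the function name, the flattening order, and the example energies are assumptions for illustration and are not taken from the paper.

```python
N_ATOM_TYPES = 16
POTENTIALS = ("electrostatic", "van_der_waals", "hydrophobic")

def descriptor_vector(contacts):
    """Accumulate per-contact energies into a flat 16 x 16 x 3 = 768 element vector.

    contacts: iterable of (receptor_type, ligand_type, {potential_name: energy}),
    with atom types given as integers in the range 0..15.
    """
    n_pairs = N_ATOM_TYPES * N_ATOM_TYPES
    vector = [0.0] * (n_pairs * len(POTENTIALS))
    for receptor_type, ligand_type, energies in contacts:
        pair_index = receptor_type * N_ATOM_TYPES + ligand_type
        for k, name in enumerate(POTENTIALS):
            # Assumed layout: one block of 256 pair entries per potential.
            vector[k * n_pairs + pair_index] += energies.get(name, 0.0)
    return vector

# Two hypothetical atom-pair contacts with made-up energies.
example = [
    (0, 3, {"electrostatic": -0.4, "van_der_waals": -0.1}),
    (0, 3, {"electrostatic": -0.2, "hydrophobic": -0.3}),
]
features = descriptor_vector(example)
print(len(features))
```

A vector of this shape for each of the 419 peptides would form the input matrix for a regression method such as PLS.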
Affiliation(s)
- Lu Hu
- Department of Gastroenterology, Daping hospital, The Third Military Medical University, Chongqing, China
17
Diez-Rivero CM, Chenlo B, Zuluaga P, Reche PA. Quantitative modeling of peptide binding to TAP using support vector machine. Proteins 2010; 78:63-72. [PMID: 19705485 DOI: 10.1002/prot.22535] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 11/06/2022]
Abstract
The transport of peptides to the endoplasmic reticulum by the transporter associated with antigen processing (TAP) is a necessary step towards determining CD8 T cell epitopes. In this work, we have studied the predictive performance of support vector machine models trained on single residue positions and residue combinations drawn from a large dataset consisting of 613 nonamer peptides of known affinity to TAP. Predictive performance of these TAP affinity models was evaluated under 10-fold cross-validation experiments and measured using Pearson's correlation coefficients (R(p)). Our results show that every peptide position (P1-P9) contributes to TAP binding (minimum R(p) of 0.26 +/- 0.11 was achieved by a model trained on the P6 residue), although the largest contributions to binding correspond to the C-terminal end (R(p) = 0.68 +/- 0.06) and the P1 (R(p) = 0.51 +/- 0.09) and P2 (0.57 +/- 0.08) residues of the peptide. Training the models on additional peptide residues generally improved their predictive performance and a maximum correlation (R(p) = 0.89 +/- 0.03) was achieved by a model trained on the full-length sequences or a residue selection consisting of the first 5 N- and last 3 C-terminal residues of the peptides included in the training set. A system for predicting the binding affinity of peptides to TAP using the methods described here is readily available for free public use at http://imed.med.ucm.es/Tools/tapreg/.
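The evaluation protocol in this abstract (Pearson's correlation under 10-fold cross-validation, reported as a value plus or minus a spread) can be sketched as follows. The sketch assumes the reported spread is the standard deviation across folds, which the abstract does not state, and it substitutes a one-feature least-squares line for the support vector machine; `fit_line`, `predict_line`, and the synthetic data are illustrative only.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kfold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_rp(X, y, fit, predict, k=10, seed=0):
    """Return (mean, sample SD) of Pearson's R_p over the k held-out folds."""
    scores = []
    for test in kfold_indices(len(y), k, seed):
        held = set(test)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        scores.append(pearson(preds, [y[i] for i in test]))
    mean = sum(scores) / k
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (k - 1))
    return mean, sd

# Toy stand-in for an affinity model: a least-squares line on one feature.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

rng = random.Random(1)
X = [rng.uniform(0.0, 1.0) for _ in range(100)]   # synthetic feature values
y = [2.0 * x + rng.gauss(0.0, 0.1) for x in X]    # synthetic "affinities"
mean_rp, sd_rp = cross_validated_rp(X, y, fit_line, predict_line)
print(f"R_p = {mean_rp:.2f} +/- {sd_rp:.2f}")
```

Because `fit` and `predict` are passed in, the same harness could wrap any regression model, including one trained on a single residue position or on a subset of positions.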
Affiliation(s)
- Carmen M Diez-Rivero
- Laboratorio de Inmuno Medicina, Departamento de Microbiología I-Immunología, Facultad de Medicina, Universidad Complutense, Madrid, Spain
18
Li Y, Yang Y, He P, Yang Q. QM/MM Study of Epitope Peptides Binding to HLA-A*0201: The Roles of Anchor Residues and Water. Chem Biol Drug Des 2009; 74:611-8. [DOI: 10.1111/j.1747-0285.2009.00896.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
19
Walshe VA, Hattotuwagama CK, Doytchinova IA, Wong M, Macdonald IK, Mulder A, Claas FHJ, Pellegrino P, Turner J, Williams I, Turnbull EL, Borrow P, Flower DR. Integrating in silico and in vitro analysis of peptide binding affinity to HLA-Cw*0102: a bioinformatic approach to the prediction of new epitopes. PLoS One 2009; 4:e8095. [PMID: 19956609 PMCID: PMC2779488 DOI: 10.1371/journal.pone.0008095] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/19/2009] [Accepted: 11/03/2009] [Indexed: 11/24/2022] Open
Abstract
Background Predictive models of peptide-Major Histocompatibility Complex (MHC) binding affinity are important components of modern computational immunovaccinology. Here, we describe the development and deployment of a reliable peptide-binding prediction method for a previously poorly-characterized human MHC class I allele, HLA-Cw*0102. Methodology/Findings Using an in-house, flow cytometry-based MHC stabilization assay we generated novel peptide binding data, from which we derived a precise two-dimensional quantitative structure-activity relationship (2D-QSAR) binding model. This allowed us to explore the peptide specificity of HLA-Cw*0102 molecule in detail. We used this model to design peptides optimized for HLA-Cw*0102-binding. Experimental analysis showed these peptides to have high binding affinities for the HLA-Cw*0102 molecule. As a functional validation of our approach, we also predicted HLA-Cw*0102-binding peptides within the HIV-1 genome, identifying a set of potent binding peptides. The most affine of these binding peptides was subsequently determined to be an epitope recognized in a subset of HLA-Cw*0102-positive individuals chronically infected with HIV-1. Conclusions/Significance A functionally-validated in silico-in vitro approach to the reliable and efficient prediction of peptide binding to a previously uncharacterized human MHC allele HLA-Cw*0102 was developed. This technique is generally applicable to all T cell epitope identification problems in immunology and vaccinology.
Affiliation(s)
- Valerie A. Walshe
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
- MaiLee Wong
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
- Isabel K. Macdonald
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
- Arend Mulder
- Department of Immunohaematology and Blood Transfusion, Leiden University Medical Centre, Leiden, The Netherlands
- Frans H. J. Claas
- Department of Immunohaematology and Blood Transfusion, Leiden University Medical Centre, Leiden, The Netherlands
- Pierre Pellegrino
- Centre for Sexual Health and HIV Research, Royal Free and University College London Medical School and Camden Primary Care Trust, London, United Kingdom
- Jo Turner
- Centre for Sexual Health and HIV Research, Royal Free and University College London Medical School and Camden Primary Care Trust, London, United Kingdom
- Ian Williams
- Centre for Sexual Health and HIV Research, Royal Free and University College London Medical School and Camden Primary Care Trust, London, United Kingdom
- Emma L. Turnbull
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
- Persephone Borrow
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
- Darren R. Flower
- The Jenner Institute, University of Oxford, Compton, Berkshire, United Kingdom
20
Toussaint NC, Kohlbacher O. Towards in silico design of epitope-based vaccines. Expert Opin Drug Discov 2009; 4:1047-60. [PMID: 23480396 DOI: 10.1517/17460440903242283] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 01/07/2023]
Abstract
BACKGROUND Epitope-based vaccines (EVs) make use of immunogenic peptides (epitopes) to trigger an immune response. Due to their manifold advantages, EVs have recently been attracting growing interest. The success of an EV is determined by the choice of epitopes used as a basis. However, the experimental discovery of candidate epitopes is expensive in terms of time and money. Furthermore, for the final choice of epitopes various immunological requirements have to be considered. METHODS Numerous in silico approaches exist that can guide the design of EVs. In particular, computational methods for MHC binding prediction have already become standard tools in immunology. Apart from binding prediction and prediction of antigen processing, methods for epitope design and selection have been suggested. We review these in silico approaches for epitope discovery and selection along with their strengths and weaknesses. Finally, we discuss some of the obvious problems in the design of EVs. CONCLUSION State-of-the-art in silico approaches to MHC binding prediction yield high accuracies. However, a more thorough understanding of the underlying biological processes and significant amounts of experimental data will be required for the validation and improvement of in silico approaches to the remaining aspects of EV design.
Affiliation(s)
- Nora C Toussaint
- Eberhard Karls University, Center for Bioinformatics Tübingen, Division for Simulation of Biological Systems, 72076 Tübingen, Germany. +49 7071 2970458; +49 7071 295152
21
Rao X, Costa AICAF, van Baarle D, Kesmir C. A comparative study of HLA binding affinity and ligand diversity: implications for generating immunodominant CD8+ T cell responses. J Immunol 2009; 182:1526-32. [PMID: 19155500 DOI: 10.4049/jimmunol.182.3.1526] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 12/21/2022]
Abstract
Conventional CD8(+) T cell responses against intracellular infectious agents are initiated upon recognition of pathogen-derived peptides presented at the cell surface of infected cells in the context of MHC class I molecules. Among the major MHC class I loci, HLA-B is the swiftest evolving and the most polymorphic locus. Additionally, responses restricted by HLA-B molecules tend to be dominant, and most associations with susceptibility or protection against infectious diseases have been assigned to HLA-B alleles. To assess whether the differences in responses mediated via two major HLA class I loci, HLA-B and HLA-A, may already begin at the Ag presentation level, we have analyzed the diversity and binding affinity of their peptide repertoire by making use of curated pathogen-derived epitope data retrieved from the Immune Epitope Database and Analysis Resource, as well as in silico predicted epitopes. In contrast to our expectations, HLA-B alleles were found to have a less diverse peptide repertoire, which points toward a more restricted binding motif, and the respective average peptide binding affinity was shown to be lower than that of HLA-A-restricted epitopes. This unexpected observation gives rise to new hypotheses concerning the mechanisms underlying immunodominance of CD8(+) T cell responses.
Affiliation(s)
- Xiangyu Rao
- Department of Theoretical Biology/Bioinformatics, Utrecht University, Utrecht, The Netherlands
22
Tian F, Yang L, Lv F, Yang Q, Zhou P. In silico quantitative prediction of peptides binding affinity to human MHC molecule: an intuitive quantitative structure-activity relationship approach. Amino Acids 2008; 36:535-54. [PMID: 18575802 DOI: 10.1007/s00726-008-0116-8] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Received: 05/20/2008] [Accepted: 06/02/2008] [Indexed: 10/21/2022]
Abstract
In this paper, we have handpicked 23 kinds of electronic properties, 37 kinds of steric properties, 54 kinds of hydrophobic properties and 5 kinds of hydrogen bond properties from thousands of amino acid structural and property parameters. Principal component analysis (PCA) was applied on these parameters and thus ten score vectors involving significant nonbonding properties of 20 coded amino acids were yielded, called the divided physicochemical property scores (DPPS) of amino acids. The DPPS descriptor was then used to characterize the structures of 152 HLA-A*0201-restricted CTL epitopes, and significant variables being responsible for the binding affinities were selected by genetic algorithm, and a quantitative structure-activity relationship (QSAR) model by partial least square was established to predict the peptide-HLA-A*0201 molecule interactions. Statistical analysis on the resulted DPPS-based QSAR models were consistent well with experimental exhibits and molecular graphics display. Diversified properties of the different residues in binding peptides may contribute remarkable effect to the interactions between the HLA-A*0201 molecule and its peptide ligands. Particularly, hydrophobicity and hydrogen bond of anchor residues of peptides may have a significant contribution to the interactions. The results showed that DPPS can well represent the structural characteristics of the antigenic peptides and is a promising approach to predict the affinities of peptide binding to HLA-A*0201 in an efficient and intuitive way. We expect that this physical-principle based method can be applied to other protein-peptide interactions as well.
Affiliation(s)
- F Tian
- Research Institute of Surgery, Daping Hospital, Third Military Medical University, Chongqing, China
23
Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships. Methods Mol Biol 2008. [PMID: 18450004 DOI: 10.1007/978-1-60327-118-9_16] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5]
Abstract
Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative, classification methods, but these are now giving way to quantitative regression methods. We review three methods--a 2D-QSAR additive-partial least squares (PLS) and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method--which can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third method is an iterative self-consistent (ISC) PLS-based additive method, which is a recently developed extension to the additive method for the affinity prediction of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology, useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*6801, HLA-A*6802, HLA-B*3501, H2-K(k), H2-K(b), H2-D(b) HLA-DRB1*0101, HLA-DRB1*0401, HLA-DRB1*0701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we show a step-by-step guide into predicting the reliability and the resulting models to represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). 
The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online at http://www.jenner.ac.uk/MHCPred.
24
Ivanciuc O, Braun W. Robust quantitative modeling of peptide binding affinities for MHC molecules using physical-chemical descriptors. Protein Pept Lett 2008; 14:903-16. [PMID: 18045233 DOI: 10.2174/092986607782110257] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Major histocompatibility complex (MHC) molecules bind short peptides resulting from intracellular processing of foreign and self proteins, and present them on the cell surface for recognition by T-cell receptors. We propose a new robust approach to quantitatively model the binding affinities of MHC molecules by quantitative structure-activity relationships (QSAR) that use the physical-chemical amino acid descriptors E1-E5. These QSAR models are robust, sequence-based, and can be used as a fast and reliable filter to predict the MHC binding affinity for large protein databases.
Affiliation(s)
- Ovidiu Ivanciuc
- Sealy Center for Structural Biology and Molecular Biophysics, Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Boulevard, Galveston, Texas 77555-0857, USA
25
Todman SJ, Halling-Brown MD, Davies MN, Flower DR, Kayikci M, Moss DS. Toward the atomistic simulation of T cell epitopes. J Mol Graph Model 2008; 26:957-61. [PMID: 17766153 DOI: 10.1016/j.jmgm.2007.07.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 05/17/2007] [Revised: 07/25/2007] [Accepted: 07/25/2007] [Indexed: 01/01/2023]
Abstract
Epitopes mediated by T cells lie at the heart of the adaptive immune response and form the essential nucleus of anti-tumour peptide or epitope-based vaccines. Antigenic T cell epitopes are mediated by major histocompatibility complex (MHC) molecules, which present them to T cell receptors. Calculating the affinity between a given MHC molecule and an antigenic peptide using experimental approaches is both difficult and time consuming, thus various computational methods have been developed for this purpose. A server has been developed to allow a structural approach to the problem by generating specific MHC:peptide complex structures and providing configuration files to run molecular modelling simulations upon them. A system has been produced which allows the automated construction of MHC:peptide structure files and the corresponding configuration files required to execute a molecular dynamics simulation using NAMD. The system has been made available through a web-based front end and stand-alone scripts. Previous attempts at structural prediction of MHC:peptide affinity have been limited due to the paucity of structures and the computational expense in running large scale molecular dynamics simulations. The MHCsim server (http://igrid-ext.cryst.bbk.ac.uk/MHCsim) allows the user to rapidly generate any desired MHC:peptide complex and will facilitate molecular modelling simulation of MHC complexes on an unprecedented scale.
Affiliation(s)
- Sarah J Todman
- Department of Crystallography, University of London, Birkbeck College, Malet Street, London WC1E 7HX, United Kingdom
26
Doytchinova IA, Flower DR. Predicting class I major histocompatibility complex (MHC) binders using multivariate statistics: comparison of discriminant analysis and multiple linear regression. J Chem Inf Model 2007; 47:234-8. [PMID: 17238269 DOI: 10.1021/ci600318z] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Abstract
The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
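A quantitative matrix of the kind compared in this abstract scores a peptide by summing one contribution per position and residue. The sketch below shows that scoring step and the arithmetic behind the external-test figure (135 of 160 binders). Every matrix value, the baseline constant, and the function names are invented for illustration; they are not the published matrix.

```python
# Toy quantitative matrix: 0-based position -> residue -> contribution.
# All numbers are hypothetical; residues not listed contribute 0.
TOY_MATRIX = {
    1: {"L": 0.8, "M": 0.5},  # peptide position 2
    8: {"V": 0.7, "L": 0.4},  # peptide position 9 (C-terminus)
}
TOY_CONSTANT = 6.0  # hypothetical baseline score

def matrix_score(peptide, matrix=TOY_MATRIX, constant=TOY_CONSTANT):
    """Additive score: baseline plus one matrix entry per position."""
    if len(peptide) != 9:
        raise ValueError("this toy matrix is defined for nonamers only")
    return constant + sum(
        matrix.get(pos, {}).get(aa, 0.0) for pos, aa in enumerate(peptide)
    )

def percent_recognized(n_recognized, n_total):
    """Share of known binders recovered by a model, as a percentage."""
    return 100.0 * n_recognized / n_total

print(matrix_score("GLFGGGGGV"))             # baseline + 0.8 + 0.7
print(round(percent_recognized(135, 160)))   # 84, matching the abstract
```

Whether the entries come from discriminant analysis, multiple linear regression, or a combination of the two, the scoring step stays the same; only the numbers in the matrix change.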
Affiliation(s)
- Irini A Doytchinova
- Faculty of Pharmacy, Medical University of Sofia, 2 Dunav st., 1000 Sofia, Bulgaria.
27
Holm L, Frech K, Dzhambazov B, Holmdahl R, Kihlberg J, Linusson A. Quantitative Structure−Activity Relationship of Peptides Binding to the Class II Major Histocompatibility Complex Molecule Aq Associated with Autoimmune Arthritis. J Med Chem 2007; 50:2049-59. [PMID: 17425295 DOI: 10.1021/jm061209b] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 01/08/2023]
Abstract
Presentation of (glyco)peptides by the class II major histocompatibility complex molecule Aq to T cells plays a central role in collagen-induced arthritis, an animal model for the autoimmune disease rheumatoid arthritis. A peptide library was designed using statistical molecular design in amino acid space in which five positions in the minimal mouse collagen type II binding epitope CII260-267 were varied. A substantially reduced peptide library of 24 peptides with diverse and representative molecular characteristics was selected, synthesized, and evaluated for the binding strength to Aq. A multivariate QSAR model was established by correlating calculated descriptors, compressed to their principal properties, with the binding data using partial least-square regression. The model was successfully validated by an external test set. Interpretation of the model provided a molecular property binding motif for peptides interacting with Aq. The information may be useful in future research directed toward new treatments of rheumatoid arthritis.
Affiliation(s)
- Lotta Holm
- Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden
28
Pissurlenkar R, Malde A, Khedkar S, Coutinho E. Encoding Type and Position in Peptide QSAR: Application to Peptides Binding to Class I MHC Molecule HLA-A*0201. QSAR Comb Sci 2007. [DOI: 10.1002/qsar.200530184] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
29
Winkler DA, Burden FR. Nonlinear predictive modeling of MHC class II-peptide binding using Bayesian neural networks. Methods Mol Biol 2007; 409:365-77. [PMID: 18450015 DOI: 10.1007/978-1-60327-118-9_27] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/02/2022]
Abstract
Methods for predicting the binding affinity of peptides to the MHC have become more sophisticated in the past 5-10 years. It is possible to use computational quantitative structure-activity methods to build models of peptide affinity that are truly predictive. Two of the most useful methods for building models are Bayesian regularized neural networks for continuous or discrete (categorical) data and support vector machines (SVMs) for discrete data. We illustrate the application of Bayesian regularized neural networks to modeling MHC class II-binding affinity of peptides. Training data comprised sequences and binding data for nonamer (nine amino acid) peptides. Peptides were characterized by mathematical representations of several types. Independent test data comprised sequences and binding data for peptides of length ≤ 25. We also internally validated the models by using 30% of the data in an internal test set. We obtained robust models, with near-identical statistics for multiple training runs. We determined how predictive our models were using statistical tests and area under the receiver operating characteristic (ROC) graphs (A(ROC)). Some mathematical representations of the peptides were more efficient than others and were able to generalize to unknown peptides outside of the training space. Bayesian neural networks are robust, efficient "universal approximators" that are well able to tackle the difficult problem of correctly predicting the MHC class II-binding activities of a majority of the test set peptides.
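The area under the ROC curve used as a performance measure here has a simple rank interpretation: it is the probability that a randomly chosen binder receives a higher score than a randomly chosen non-binder, with ties counted as one half. The sketch below computes it that way; the scores and labels are made-up toy values, not data from the chapter.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, is_binder in zip(scores, labels) if is_binder]
    neg = [s for s, is_binder in zip(scores, labels) if not is_binder]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Toy predictions: three binders (label 1) and two non-binders (label 0).
toy_scores = [0.9, 0.4, 0.35, 0.6, 0.5]
toy_labels = [1, 1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))  # 5 of 6 binder/non-binder pairs ranked correctly
```

The pairwise loop is quadratic in the number of peptides, which is fine for a small test set; a rank-based formula gives the same value faster on large ones.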
Affiliation(s)
- David A Winkler
- Centre for Complexity in Drug Discovery, CSIRO Molecular and Health Technologies, Clayton, Australia.
30
Wan J, Liu W, Xu Q, Ren Y, Flower DR, Li T. SVRMHC prediction server for MHC-binding peptides. BMC Bioinformatics 2006; 7:463. [PMID: 17059589 PMCID: PMC1626489 DOI: 10.1186/1471-2105-7-463] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Received: 07/07/2006] [Accepted: 10/23/2006] [Indexed: 11/24/2022] Open
Abstract
Background The binding between antigenic peptides (epitopes) and the MHC molecule is a key step in the cellular immune response. Accurate in silico prediction of epitope-MHC binding affinity can greatly expedite epitope screening by reducing costs and experimental effort. Results Recently, we demonstrated the appealing performance of SVRMHC, an SVR-based quantitative modeling method for peptide-MHC interactions, when applied to three mouse class I MHC molecules. Subsequently, we have greatly extended the construction of SVRMHC models and have established such models for more than 40 class I and class II MHC molecules. Here we present the SVRMHC web server for predicting peptide-MHC binding affinities using these models. Benchmarked percentile scores are provided for all predictions. The larger number of SVRMHC models available allowed for an updated evaluation of the performance of the SVRMHC method compared to other well-known linear modeling methods. Conclusion SVRMHC is an accurate and easy-to-use prediction server for epitope-MHC binding with significant coverage of MHC molecules. We believe it will prove to be a valuable resource for T cell epitope researchers.
Affiliation(s)
- Ji Wan
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Wen Liu
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Qiqi Xu
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Yongliang Ren
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Darren R Flower
- The Jenner Institute, University of Oxford, Compton, Berkshire RG20 7NN, UK
- Tongbin Li
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
31
Hattotuwagama CK, Toseland CP, Guan P, Taylor DJ, Hemsley SL, Doytchinova IA, Flower DR. Toward prediction of class II mouse major histocompatibility complex peptide binding affinity: in silico bioinformatic evaluation using partial least squares, a robust multivariate statistical technique. J Chem Inf Model 2006; 46:1491-502. [PMID: 16711768 DOI: 10.1021/ci050380d] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
Abstract
The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide-MHC binding affinity. The ISC-PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide-MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms--q2, SEP, and NC--ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. 
The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
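The abstract above reports both cross-validated (q2) and non-cross-validated (r2) statistics. The sketch below shows how the two differ, using the common definition q2 = 1 - PRESS / SS_total with leave-one-out predictions; that definition is an assumption here, since the abstract does not spell it out. A one-descriptor least-squares line stands in for the PLS model, and the data are synthetic.

```python
def fit_line(x, y):
    """Ordinary least-squares line; a stand-in for PLS, not the paper's model."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def r2_and_q2(x, y):
    """Return (r2, q2): fitted vs. leave-one-out explained variance."""
    n = len(y)
    my = sum(y) / n
    ss_total = sum((b - my) ** 2 for b in y)

    # r2: residuals of the model fitted on all points.
    slope, intercept = fit_line(x, y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))

    # q2: each point is predicted by a model that never saw it (PRESS).
    press = 0.0
    for i in range(n):
        s, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + c)) ** 2

    return 1.0 - ss_res / ss_total, 1.0 - press / ss_total

# Synthetic descriptor values and "pIC50-like" responses.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2, q2 = r2_and_q2(toy_x, toy_y)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

For a least-squares fit q2 never exceeds r2, so a large gap between the two is a common warning sign of overfitting.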
32
Liu W, Meng X, Xu Q, Flower DR, Li T. Quantitative prediction of mouse class I MHC peptide binding affinity using support vector machine regression (SVR) models. BMC Bioinformatics 2006; 7:182. [PMID: 16579851 PMCID: PMC1513606 DOI: 10.1186/1471-2105-7-182] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.7] [Received: 12/22/2005] [Accepted: 03/31/2006] [Indexed: 11/20/2022] Open
Abstract
Background The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and the MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders" or as "strong binders" and "weak binders", recent methods seek to make predictions about precise binding affinities. Results We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that out-performed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to out-perform several prominent methods in identifying strongly binding peptides. Conclusion As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with not inconsiderable future potential.
Affiliation(s)
- Wen Liu
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Xiangshan Meng
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Qiqi Xu
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
- Darren R Flower
- The Jenner Institute, University of Oxford, Compton, Berkshire RG20 7NN, UK
- Tongbin Li
- Department of Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| |
Collapse
|
33
|
Doytchinova IA, Guan P, Flower DR. EpiJen: a server for multistep T cell epitope prediction. BMC Bioinformatics 2006; 7:131. [PMID: 16533401 PMCID: PMC1421443 DOI: 10.1186/1471-2105-7-131] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.0] [Received: 12/05/2005] [Accepted: 03/13/2006] [Indexed: 11/19/2022]
Abstract
BACKGROUND: The main processing pathway for MHC class I ligands involves degradation of proteins by the proteasome, followed by transport of products by the transporter associated with antigen processing (TAP) to the endoplasmic reticulum (ER), where peptides are bound by MHC class I molecules, and then presented on the cell surface by MHCs. The whole process is modeled here using an integrated approach, which we call EpiJen. EpiJen is based on quantitative matrices, derived by the additive method, and applied successively to select epitopes. EpiJen is available free online. RESULTS: To identify epitopes, a source protein is passed through four steps: proteasome cleavage, TAP transport, MHC binding and epitope selection. At each stage, different proportions of non-epitopes are eliminated. The final set of peptides represents no more than 5% of the whole protein sequence and will contain 85% of the true epitopes, as indicated by external validation. Compared to other integrated methods (NetCTL, WAPP and SMM), EpiJen performs best, predicting 61 of the 99 HIV epitopes used in this study. CONCLUSION: EpiJen is a reliable multi-step algorithm for T cell epitope prediction, which belongs to the next generation of in silico T cell epitope identification methods. These methods aim to reduce subsequent experimental work by improving the success rate of epitope prediction.
Affiliation(s)
- Pingping Guan
  - Edward Jenner Institute for Vaccine Research, Compton, RG20 7NN, UK
- Darren R Flower
  - Edward Jenner Institute for Vaccine Research, Compton, RG20 7NN, UK
34
Doytchinova IA, Flower DR. Class I T-cell epitope prediction: improvements using a combination of proteasome cleavage, TAP affinity, and MHC binding. Mol Immunol 2006; 43:2037-44. [PMID: 16524630 DOI: 10.1016/j.molimm.2005.12.013] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 09/27/2005] [Revised: 11/03/2005] [Accepted: 12/23/2005] [Indexed: 01/03/2023]
Abstract
Cleavage by the proteasome is responsible for generating the C terminus of T-cell epitopes. Modeling the process of proteasome cleavage as part of a multi-step algorithm for T-cell epitope prediction will reduce the number of non-binders and increase the overall accuracy of the predictive algorithm. Quantitative matrix-based models for prediction of the proteasome cleavage sites in a protein were developed using a training set of 489 naturally processed T-cell epitopes (nonamer peptides) associated with HLA-A and HLA-B molecules. The models were validated using an external test set of 227 T-cell epitopes. The performance of the models was good, identifying 76% of the C-termini correctly. The best model of proteasome cleavage was incorporated as the first step in a three-step algorithm for T-cell epitope prediction, where subsequent steps predicted TAP affinity and MHC binding using previously derived models.
35
Doytchinova IA, Flower DR. Modeling the Peptide−T Cell Receptor Interaction by the Comparative Molecular Similarity Indices Analysis−Soft Independent Modeling of Class Analogy Technique. J Med Chem 2006; 49:2193-9. [PMID: 16570915 DOI: 10.1021/jm050876m] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
A set of 38 epitopes and 183 non-epitopes, which bind to alleles of the HLA-A3 supertype, was subjected to a combination of comparative molecular similarity indices analysis (CoMSIA) and soft independent modeling of class analogy (SIMCA). During the process of T cell recognition, T cell receptors (TCR) interact with the central section of the bound nonamer peptide; thus only positions 4-8 were considered in the study. The derived model distinguished 82% of the epitopes and 73% of the non-epitopes after cross-validation in five groups. The overall preference from the model is for polar amino acids with high electron density and the ability to form hydrogen bonds. These so-called "aggressive" amino acids are flanked by small-sized residues, which enable such residues to protrude from the binding cleft and take an active role in TCR-mediated T cell recognition. Combinations of "aggressive" and "passive" amino acids in the middle part of epitopes constitute a putative TCR binding motif.
Affiliation(s)
- Irini A Doytchinova
  - The Jenner Institute, University of Oxford, Compton, Berkshire RG20 7NN, United Kingdom
36
Guan P, Doytchinova IA, Walshe VA, Borrow P, Flower DR. Analysis of peptide-protein binding using amino acid descriptors: prediction and experimental verification for human histocompatibility complex HLA-A0201. J Med Chem 2005; 48:7418-25. [PMID: 16279801 DOI: 10.1021/jm0505258] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Amino acid descriptors are often used in quantitative structure-activity relationship (QSAR) analysis of proteins and peptides. In the present study, descriptors were used to characterize peptides binding to the human MHC allele HLA-A0201. Two sets of amino acid descriptors were chosen: 93 descriptors taken from the amino acid descriptor database AAindex and the z descriptors defined by Wold and Sandberg. Variable selection techniques (SIMCA, genetic algorithm, and GOLPE) were applied to remove redundant descriptors. Our results indicate that QSAR models generated using five z descriptors had the highest predictivity and explained variance (q2 between 0.6 and 0.7 and r2 between 0.6 and 0.9). Further to the QSAR analysis, 15 peptides were synthesized and tested using a T2 stabilization assay. All peptides bound to HLA-A0201 well, and four peptides were identified as high-affinity binders.
Affiliation(s)
- Pingping Guan
  - Edward Jenner Institute for Vaccine Research, Compton, Berkshire RG20 7NN, UK.
37
Burden FR, Winkler DA. Predictive Bayesian neural network models of MHC class II peptide binding. J Mol Graph Model 2005; 23:481-9. [PMID: 15878832 DOI: 10.1016/j.jmgm.2005.03.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 10/01/2004] [Accepted: 03/18/2005] [Indexed: 11/20/2022]
Abstract
We used Bayesian regularized neural networks to model data on the MHC class II-binding affinity of peptides. Training data consisted of sequences and binding data for nonamer (nine amino acid) peptides. Independent test data consisted of sequences and binding data for peptides of length ≤25. We assumed that MHC class II-binding activity of peptides depends only on the highest ranked embedded nonamer and that reverse sequences of active nonamers are inactive. We also internally validated the models by using 30% of the training data in an internal test set. We obtained robust models, with near identical statistics for multiple training runs. We determined how predictive our models were using statistical tests and area under the Receiver Operating Characteristic (ROC) graphs (A(ROC)). Most models gave training A(ROC) values close to 1.0 and test set A(ROC) values >0.8. We also used both amino acid indicator variables (bin20) and property-based descriptors to generate models for MHC class II-binding of peptides. The property-based descriptors were more parsimonious than the indicator variable descriptors, making them applicable to larger peptides, and their design makes them able to generalize to unknown peptides outside of the training space. None of the external test data sets contained any of the nonamer sequences in the training sets. Consequently, the models attempted to predict the activity of truly unknown peptides not encountered in the training sets. Our models were well able to tackle the difficult problem of correctly predicting the MHC class II-binding activities of a majority of the test set peptides. Exceptions to the assumption that nonamer motif activities were invariant to the peptide in which they were embedded, together with the limited coverage of the test data, and the fuzziness of the classification procedure, are likely explanations for some misclassifications.
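The "highest ranked embedded nonamer" assumption in this abstract can be illustrated with a minimal sketch. This is not the authors' Bayesian neural network: `score_nonamer` is a hypothetical stand-in for any trained nonamer scorer, and `toy_score` (the fraction of hydrophobic residues) is an assumption made purely for demonstration.

```python
from typing import Callable, Tuple

CORE_LEN = 9  # nonamer length, as stated in the abstract


def best_embedded_nonamer(
    peptide: str, score_nonamer: Callable[[str], float]
) -> Tuple[str, float]:
    """Return the highest-scoring 9-residue window of `peptide` and its score."""
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than a nonamer")
    # Enumerate every contiguous 9-residue window of the longer peptide.
    windows = [peptide[i:i + CORE_LEN] for i in range(len(peptide) - CORE_LEN + 1)]
    # The peptide's predicted activity is that of its best window.
    best = max(windows, key=score_nonamer)
    return best, score_nonamer(best)


def toy_score(nonamer: str) -> float:
    """Hypothetical scorer (illustration only): fraction of hydrophobic residues."""
    return sum(res in "AVILMFWY" for res in nonamer) / len(nonamer)
```

In use, any callable that maps a nonamer to a number can replace `toy_score`; the peptide-level prediction is simply the maximum over its windows.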
38
Doytchinova IA, Walshe V, Borrow P, Flower DR. Towards the chemometric dissection of peptide–HLA-A*0201 binding affinity: comparison of local and global QSAR models. J Comput Aided Mol Des 2005; 19:203-12. [PMID: 16059672 DOI: 10.1007/s10822-005-3993-x] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 11/22/2004] [Accepted: 03/15/2005] [Indexed: 10/25/2022]
Abstract
The affinities of 177 nonameric peptides binding to the HLA-A*0201 molecule were measured using a FACS-based MHC stabilisation assay and analysed using chemometrics. Their structures were described by global and local descriptors, QSAR models were derived by genetic algorithm, stepwise regression and PLS. The global molecular descriptors included molecular connectivity chi indices, kappa shape indices, E-state indices, molecular properties like molecular weight and log P, and three-dimensional descriptors like polarizability, surface area and volume. The local descriptors were of two types. The first used a binary string to indicate the presence of each amino acid type at each position of the peptide. The second was also position-dependent but used five z-scales to describe the main physicochemical properties of the amino acids forming the peptides. The models were developed using a representative training set of 131 peptides and validated using an independent test set of 46 peptides. It was found that the global descriptors could not explain the variance in the training set nor predict the affinities of the test set accurately. Both types of local descriptors gave QSAR models with better explained variance and predictive ability. The results suggest that, in their interactions with the MHC molecule, the peptide acts as a complicated ensemble of multiple amino acids mutually potentiating each other.
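The first type of local descriptor mentioned in this abstract, a binary string marking which amino acid occupies each position, might be sketched as follows. The residue ordering in `AMINO_ACIDS` and the function name `indicator_encoding` are assumptions for illustration; the paper's actual column layout is not given here.

```python
from typing import List

# 20 standard residues; this particular ordering is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def indicator_encoding(peptide: str) -> List[int]:
    """Encode a peptide as a flat 0/1 vector with 20 bits per position."""
    bits: List[int] = []
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        # Exactly one of the 20 bits is set for each position.
        bits.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return bits
```

Under this scheme a nonamer becomes a vector of 9 × 20 = 180 bits, of which exactly 9 are set.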
Affiliation(s)
- Irini A Doytchinova
  - Edward Jenner Institute for Vaccine Research, RG20 7NN, Compton, Berkshire, UK
39
Toseland CP, Clayton DJ, McSparron H, Hemsley SL, Blythe MJ, Paine K, Doytchinova IA, Guan P, Hattotuwagama CK, Flower DR. AntiJen: a quantitative immunology database integrating functional, thermodynamic, kinetic, biophysical, and cellular data. Immunome Res 2005; 1:4. [PMID: 16305757 PMCID: PMC1289288 DOI: 10.1186/1745-7580-1-4] [Citation(s) in RCA: 141] [Impact Index Per Article: 7.4] [Received: 06/17/2005] [Accepted: 10/06/2005] [Indexed: 11/30/2022]
Abstract
AntiJen is a database system focused on the integration of kinetic, thermodynamic, functional, and cellular data within the context of immunology and vaccinology. Compared to its progenitor JenPep, the interface has been completely rewritten and redesigned and now offers a wider variety of search methods, including a nucleotide and a peptide BLAST search. In terms of data archived, AntiJen has a richer and more complete breadth, depth, and scope, and this has seen the database increase to over 31,000 entries. AntiJen provides the most complete and up-to-date dataset of its kind. While AntiJen v2.0 retains a focus on both T cell and B cell epitopes, its greatest novelty is the archiving of continuous quantitative data on a variety of immunological molecular interactions. This includes thermodynamic and kinetic measures of peptide binding to TAP and the Major Histocompatibility Complex (MHC), peptide-MHC complexes binding to T cell receptors, antibodies binding to protein antigens and general immunological protein-protein interactions. The database also contains quantitative specificity data from position-specific peptide libraries and biophysical data, in the form of diffusion co-efficients and cell surface copy numbers, on MHCs and other immunological molecules. The uses of AntiJen include the design of vaccines and diagnostics, such as tetramers, and other laboratory reagents, as well as helping parameterize the bioinformatic or mathematical in silico modeling of the immune system. The database is accessible from the URL: .
Affiliation(s)
- Christopher P Toseland
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Debra J Clayton
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Helen McSparron
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Shelley L Hemsley
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Martin J Blythe
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Kelly Paine
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Irini A Doytchinova
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Pingping Guan
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Channa K Hattotuwagama
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
- Darren R Flower
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN, UK
40
Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics 2005; 6:132. [PMID: 15927070 PMCID: PMC1173087 DOI: 10.1186/1471-2105-6-132] [Citation(s) in RCA: 382] [Impact Index Per Article: 20.1] [Received: 01/21/2005] [Accepted: 05/31/2005] [Indexed: 12/12/2022]
Abstract
Background: Many processes in molecular biology involve the recognition of short sequences of nucleic or amino acids, such as the binding of immunogenic peptides to major histocompatibility complex (MHC) molecules. From experimental data, a model of the sequence specificity of these processes can be constructed, such as a sequence motif, a scoring matrix or an artificial neural network. The purpose of these models is two-fold. First, they can provide a summary of experimental results, allowing for a deeper understanding of the mechanisms involved in sequence recognition. Second, such models can be used to predict the experimental outcome for yet untested sequences. In the past we reported the development of a method to generate such models called the Stabilized Matrix Method (SMM). This method has been successfully applied to predicting peptide binding to MHC molecules, peptide transport by the transporter associated with antigen presentation (TAP) and proteasomal cleavage of protein sequences. Results: Herein we report the implementation of the SMM algorithm as a publicly available software package. Specific features determining the type of problems the method is most appropriate for are discussed. Advantageous features of the package are: (1) the output generated is easy to interpret, (2) input and output are both quantitative, (3) specific computational strategies to handle experimental noise are built in, (4) the algorithm is designed to effectively handle bounded experimental data, (5) experimental data from randomized peptide libraries and conventional peptides can easily be combined, and (6) it is possible to incorporate pair interactions between positions of a sequence. Conclusion: Making the SMM method publicly available enables bioinformaticians and experimental biologists to easily access it, to compare its performance to other prediction methods, and to extend it to other applications.
Affiliation(s)
- Bjoern Peters
  - La Jolla Institute for Allergy and Immunology, 3030 Bunker Hill Street, Suite 326, San Diego, CA 92109, USA
- Alessandro Sette
  - La Jolla Institute for Allergy and Immunology, 3030 Bunker Hill Street, Suite 326, San Diego, CA 92109, USA
41
Doytchinova IA, Guan P, Flower DR. Quantitative structure-activity relationships and the prediction of MHC supermotifs. Methods 2005; 34:444-53. [PMID: 15542370 DOI: 10.1016/j.ymeth.2004.06.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Accepted: 06/21/2004] [Indexed: 10/26/2022]
Abstract
The underlying assumption in quantitative structure-activity relationship (QSAR) methodology is that related chemical structures exhibit related biological activities. We review here two QSAR methods in terms of their applicability for human MHC supermotif definition. Supermotifs are motifs that characterise binding to more than one allele. Supermotif definition is the initial in silico step of epitope-based vaccine design. The first QSAR method we review here, the additive method, is based on the assumption that the binding affinity of a peptide depends on contributions from both amino acids and the interactions between them. The second method is a 3D-QSAR method: comparative molecular similarity indices analysis (CoMSIA). Both methods were applied to 771 peptides binding to 9 HLA alleles. Five of the alleles (A*0201, A*0202, A*0203, A*0206 and A*6802) belong to the HLA-A2 superfamily and the other four (A*0301, A*1101, A*3101 and A*6801) to the HLA-A3 superfamily. For each superfamily, supermotifs defined by the two QSAR methods agree closely and are supported by many experimental data.
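The additive assumption described in this abstract, a score built from per-residue contributions plus interaction terms, can be sketched roughly as below. This is an illustrative toy, not the published model: the function name, the dictionary layout, the restriction to adjacent-pair interactions, and all numeric values are assumptions.

```python
from typing import Dict, Tuple

# (position, residue) -> contribution of that residue at that position
PositionTerms = Dict[Tuple[int, str], float]
# (position, residue, next residue) -> contribution of an adjacent pair
PairTerms = Dict[Tuple[int, str, str], float]


def additive_score(
    peptide: str,
    const: float,
    position_terms: PositionTerms,
    pair_terms: PairTerms,
) -> float:
    """Sum a constant, single-residue terms and adjacent-pair terms."""
    score = const
    # Independent contribution of each residue at its position.
    for i, residue in enumerate(peptide):
        score += position_terms.get((i, residue), 0.0)
    # Interaction contribution of each pair of neighbouring residues
    # (restricting to neighbours is an assumption of this sketch).
    for i in range(len(peptide) - 1):
        score += pair_terms.get((i, peptide[i], peptide[i + 1]), 0.0)
    return score
```

Terms absent from the dictionaries contribute zero, so setting `pair_terms` to an empty dictionary reduces the sketch to a plain position-specific scoring matrix.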
Affiliation(s)
- Irini A Doytchinova
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire RG20 7NN, UK
42
Wan S, Coveney P, Flower DR. Large-scale molecular dynamics simulations of HLA-A*0201 complexed with a tumor-specific antigenic peptide: can the alpha3 and beta2m domains be neglected? J Comput Chem 2004; 25:1803-13. [PMID: 15386470 DOI: 10.1002/jcc.20100] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Abstract
Large-scale massively parallel molecular dynamics (MD) simulations of the human class I major histocompatibility complex (MHC) protein HLA-A*0201 bound to a decameric tumor-specific antigenic peptide GVYDGREHTV were performed using a scalable MD code on high-performance computing platforms. Such computational capabilities put us in reach of simulations of various scales and complexities. The supercomputing resources available for this study allow us to compare directly differences in the behavior of very large molecular models; in this case, the entire extracellular portion of the peptide-MHC complex vs. the isolated peptide binding domain. Comparison of the results from the partial and the whole system simulations indicates that the peptide is less tightly bound in the partial system than in the whole system. From a detailed study of conformations, solvent-accessible surface area, the nature of the water network structure, and the binding energies, we conclude that, when considering the conformation of the alpha1-alpha2 domain, the alpha3 and beta2m domains cannot be neglected.
Affiliation(s)
- Shunzhou Wan
  - Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, WC1H 0AJ, UK
43
Doytchinova I, Hemsley S, Flower DR. Transporter Associated with Antigen Processing Preselection of Peptides Binding to the MHC: A Bioinformatic Evaluation. J Immunol 2004; 173:6813-9. [PMID: 15557175 DOI: 10.4049/jimmunol.173.11.6813] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 01/25/2023]
Abstract
TAP is responsible for the transit of peptides from the cytosol to the lumen of the endoplasmic reticulum. In an immunological context, this event is followed by the binding of peptides to MHC molecules before export to the cell surface and recognition by T cells. Because TAP transport precedes MHC binding, TAP preferences may make a significant contribution to epitope selection. To assess the impact of this preselection, we have developed a scoring function for TAP affinity prediction using the additive method, have used it to analyze and extend the TAP binding motif, and have evaluated how well this model acts as a preselection step in predicting MHC binding peptides. To distinguish between MHC alleles that are exclusively dependent on TAP and those exhibiting only a partial dependence on TAP, two sets of MHC binding peptides were examined: HLA-A*0201 was selected as a representative of partially TAP-dependent HLA alleles, and HLA-A*0301 represented fully TAP-dependent HLA alleles. TAP preselection has a greater impact on TAP-dependent alleles than on TAP-independent alleles. The reduction in the number of nonbinders varied from 10% (TAP-independent) to 33% (TAP-dependent), suggesting that TAP preselection is an important component in the successful in silico prediction of T cell epitopes.
Affiliation(s)
- Irini Doytchinova
  - Edward Jenner Institute for Vaccine Research, Compton, Berkshire, United Kingdom
44
Doytchinova IA, Walshe VA, Jones NA, Gloster SE, Borrow P, Flower DR. Coupling in silico and in vitro analysis of peptide-MHC binding: a bioinformatic approach enabling prediction of superbinding peptides and anchorless epitopes. J Immunol 2004; 172:7495-502. [PMID: 15187128 DOI: 10.4049/jimmunol.172.12.7495] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022]
Abstract
The ability to define and manipulate the interaction of peptides with MHC molecules has immense immunological utility, with applications in epitope identification, vaccine design, and immunomodulation. However, the methods currently available for prediction of peptide-MHC binding are far from ideal. We recently described the application of a bioinformatic prediction method based on quantitative structure-affinity relationship methods to peptide-MHC binding. In this study we demonstrate the predictivity and utility of this approach. We determined the binding affinities of a set of 90 nonamer peptides for the MHC class I allele HLA-A*0201 using an in-house, FACS-based, MHC stabilization assay, and from these data we derived an additive quantitative structure-affinity relationship model for peptide interaction with the HLA-A*0201 molecule. Using this model we then designed a series of high affinity HLA-A2-binding peptides. Experimental analysis revealed that all these peptides showed high binding affinities to the HLA-A*0201 molecule, significantly higher than the highest previously recorded. In addition, by the use of systematic substitution at principal anchor positions 2 and 9, we showed that high binding peptides are tolerant to a wide range of nonpreferred amino acids. Our results support a model in which the affinity of peptide binding to MHC is determined by the interactions of amino acids at multiple positions with the MHC molecule and may be enhanced by enthalpic cooperativity between these component interactions.
Affiliation(s)
- Irini A Doytchinova
  - Edward Jenner Institute for Vaccine Research-Compton, High Street, Berkshire, Compton RG20 7NN, United Kingdom
45
Hattotuwagama CK, Guan P, Doytchinova IA, Zygouri C, Flower DR. Quantitative online prediction of peptide binding to the major histocompatibility complex. J Mol Graph Model 2004; 22:195-207. [PMID: 14629978 DOI: 10.1016/s1093-3263(03)00160-8] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
With its implications for vaccine discovery, the accurate prediction of T cell epitopes is one of the key aspirations of computational vaccinology. We have developed a robust multivariate statistical method, based on partial least squares, for the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the principal checkpoint on the antigen presentation pathway. As a service to the immunobiology community, we have made a Perl implementation of the method available via a World Wide Web server. We call this server MHCPred. Access to the server is freely available from the URL: http://www.jenner.ac.uk/MHCPred. We have exemplified our method with a model for peptides binding to the common human MHC molecule HLA-B*3501.
46
47
Abstract
As torrents of new data now emerge from microbial genomics, bioinformatic prediction of immunogenic epitopes remains challenging but vital. In silico methods often produce paradoxically inconsistent results: good prediction rates on certain test sets but not others. The inherent complexity of immune presentation and recognition processes complicates epitope prediction. Two encouraging developments - data driven artificial intelligence sequence-based methods for epitope prediction and molecular modeling methods based on three-dimensional protein structures - offer hope for the future.
Affiliation(s)
- Darren R Flower
  - Edward Jenner Institute for Vaccine Research, Compton, RG20 7NN, Berkshire, UK.
48
Zhao B, Mathura VS, Rajaseger G, Moochhala S, Sakharkar MK, Kangueane P. A novel MHCp binding prediction model. Hum Immunol 2003; 64:1123-43. [PMID: 14630395 DOI: 10.1016/j.humimm.2003.08.343] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 12/28/2022]
Abstract
Many statistical and molecular mechanics models have been developed and tested for major histocompatibility complex peptide (MHCp) binding predictions during the last decade. The statistical model prediction using pooled peptide sequence data and three-dimensional modeling prediction by molecular mechanics calculations have been assessed for efficiency and human leukocyte antigen diversity coverage. We describe a novel predictive model using information gleaned from 29 human MHCp crystal structures. The validation for the new model is performed using four different sets of data: (1) MHCp crystal structures, (2) peptides with known IC(50) binding values, (3) peptides tested positive by tetramer staining, (4) peptides with known binding information at the MHCBN database. The model produces high prediction efficiencies (average 60%) with good sensitivity (approximately 50%-73%) and specificity (52%-58%) values. The average positive predictive value of the model is 89%, while the average negative predictive value is only 18%. The efficiency is very high in predicting binders and very low in predicting nonbinders. This model is superior to many existing methods because of its potential application to any given MHC allele whose sequence is clearly defined.
Affiliation(s)
- Bing Zhao
  - School of Mechanical and Production Engineering, Nanyang Centre for Supercomputing and Visualization, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639 798, Republic of Singapore
49
McSparron H, Blythe MJ, Zygouri C, Doytchinova IA, Flower DR. JenPep: a novel computational information resource for immunobiology and vaccinology. J Chem Inf Comput Sci 2003; 43:1276-87. [PMID: 12870921 DOI: 10.1021/ci030461e] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]
Abstract
JenPep is a relational database containing a compendium of thermodynamic binding data for the interaction of peptides with a range of important immunological molecules: the major histocompatibility complex, TAP transporter, and T cell receptor. The database also includes annotated lists of B cell and T cell epitopes. Version 2.0 of the database is implemented in a bespoke PostgreSQL database system and is fully searchable online via a Perl/HTML interface (URL: http://www.jenner.ac.uk/JenPep).
Affiliation(s)
- Helen McSparron
  - Edward Jenner Institute for Vaccine Research, Compton, Berkshire, UK RG20 7NN
50
Guan P, Doytchinova IA, Zygouri C, Flower DR. MHCPred: A server for quantitative prediction of peptide-MHC binding. Nucleic Acids Res 2003; 31:3621-4. [PMID: 12824380 PMCID: PMC168917 DOI: 10.1093/nar/gkg510] [Citation(s) in RCA: 204] [Impact Index Per Article: 9.7] [Indexed: 11/14/2022]
Abstract
Accurate T-cell epitope prediction is a principal objective of computational vaccinology. As a service to the immunology and vaccinology communities at large, we have implemented, as a server on the World Wide Web, a partial least squares-based multivariate statistical approach to the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the key checkpoint on the antigen presentation pathway within adaptive cellular immunity. MHCPred implements robust statistical models for both Class I alleles (HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3301, HLA-A*6801, HLA-A*6802 and HLA-B*3501) and Class II alleles (HLA-DRB*0101, HLA-DRB*0401 and HLA-DRB*0701). MHCPred is available from the URL: http://www.jenner.ac.uk/MHCPred.
Affiliation(s)
- Pingping Guan
  - Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire RG20 7NN, UK