1
Kuri P, Goswami P. Current Update on Rotavirus in-Silico Multiepitope Vaccine Design. ACS Omega 2023; 8:190-207. [PMID: 36643547 PMCID: PMC9835168 DOI: 10.1021/acsomega.2c07213] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/09/2022] [Accepted: 12/14/2022] [Indexed: 06/06/2023]
Abstract
Rotavirus gastroenteritis is one of the leading causes of pediatric morbidity and mortality worldwide in infants and under-five populations. The World Health Organization (WHO) recommended global incorporation of the rotavirus vaccine in national immunization programs to alleviate the burden of the disease. Implementation of the rotavirus vaccination in certain regions of the world brought about a significant and consistent reduction of rotavirus-associated hospitalizations. However, the efficacy of licensed vaccines remains suboptimal in low-income countries where the incidences of rotavirus gastroenteritis continue to happen unabated. The problem of low efficacy of currently licensed oral rotavirus vaccines in low-income countries necessitates continuous exploration, design, and development of new rotavirus vaccines. Traditional vaccine development is a complex, expensive, labor-intensive, and time-consuming process. Reverse vaccinology essentially utilizes the genome and proteome information on pathogens and has opened new avenues for in-silico multiepitope vaccine design for a plethora of pathogens, promising time reduction in the complete vaccine development pipeline by complementing the traditional vaccinology approach. A substantial number of reviews on licensed rotavirus vaccines and those under evaluation are already available in the literature. However, a collective account of rotavirus in-silico vaccines is lacking in the literature, and such an account may further fuel the interest of researchers to use reverse vaccinology to expedite the vaccine development process. Therefore, the main focus of this review is to summarize the research endeavors undertaken for the design and development of rotavirus vaccines by the reverse vaccinology approach utilizing the tools of immunoinformatics.
2
Keller GLJ, Weiss LI, Baker BM. Physicochemical Heuristics for Identifying High Fidelity, Near-Native Structural Models of Peptide/MHC Complexes. Front Immunol 2022; 13:887759. [PMID: 35547730 PMCID: PMC9084917 DOI: 10.3389/fimmu.2022.887759] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/02/2022] [Accepted: 03/29/2022] [Indexed: 11/13/2022]
Abstract
There is long-standing interest in accurately modeling the structural features of peptides bound and presented by class I MHC proteins. This interest has grown with the advent of rapid genome sequencing and the prospect of personalized, peptide-based cancer vaccines, as well as the development of molecular and cellular therapeutics based on T cell receptor recognition of peptide-MHC. However, while the speed and accessibility of peptide-MHC modeling has improved substantially over the years, improvements in accuracy have been modest. Accuracy is crucial in peptide-MHC modeling, as T cell receptors are highly sensitive to peptide conformation and capturing fine details is therefore necessary for useful models. Studying nonameric peptides presented by the common class I MHC protein HLA-A*02:01, here we addressed a key question common to modern modeling efforts: from a set of models (or decoys) generated through conformational sampling, which is best? We found that the common strategy of decoy selection by lowest energy can lead to substantial errors in predicted structures. We therefore adopted a data-driven approach and trained functions capable of predicting near native decoys with exceptionally high accuracy. Although our implementation is limited to nonamer/HLA-A*02:01 complexes, our results serve as an important proof of concept from which improvements can be made and, given the significance of HLA-A*02:01 and its preference for nonameric peptides, should have immediate utility in select immunotherapeutic and other efforts for which structural information would be advantageous.
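The decoy-selection question raised in this abstract can be illustrated with a toy sketch. Everything below is hypothetical: the decoy names, energies, and RMSD values are invented for illustration, and the code does not reproduce the authors' trained functions. It only shows how picking the lowest-energy decoy can differ from picking the decoy closest to the native structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoy:
    name: str
    energy: float          # hypothetical model energy (lower = more favourable)
    rmsd_to_native: float  # hypothetical RMSD to the experimental structure, in angstroms


def select_lowest_energy(decoys):
    """Common strategy: keep the decoy with the lowest energy."""
    return min(decoys, key=lambda d: d.energy)


def select_nearest_native(decoys):
    """Ideal choice (only knowable when the native structure is available)."""
    return min(decoys, key=lambda d: d.rmsd_to_native)


# Invented values: the lowest-energy decoy is not the most native-like one.
DECOYS = [
    Decoy("decoy_a", energy=-310.2, rmsd_to_native=2.4),
    Decoy("decoy_b", energy=-305.7, rmsd_to_native=0.6),
    Decoy("decoy_c", energy=-298.1, rmsd_to_native=1.1),
]

picked = select_lowest_energy(DECOYS)
best = select_nearest_native(DECOYS)
print(f"lowest energy: {picked.name}, nearest native: {best.name}")
```

In practice the native structure is unknown at prediction time, which is why a separate scoring function for ranking decoys is needed.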
Affiliation(s)
- Grant L J Keller
  - Department of Chemistry & Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
- Laura I Weiss
  - Department of Chemistry & Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
- Brian M Baker
  - Department of Chemistry & Biochemistry and the Harper Cancer Research Institute, University of Notre Dame, Notre Dame, IN, United States
3
Chen D, Li Y. PredMHC: An Effective Predictor of Major Histocompatibility Complex Using Mixed Features. Front Genet 2022; 13:875112. [PMID: 35547252 PMCID: PMC9081368 DOI: 10.3389/fgene.2022.875112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2022] [Accepted: 03/07/2022] [Indexed: 12/03/2022]
Abstract
The major histocompatibility complex (MHC) is a large locus on vertebrate DNA that contains a tightly linked set of polymorphic genes encoding cell surface proteins essential for the adaptive immune system. The groups of proteins encoded in the MHC play an important role in the adaptive immune system. Therefore, the accurate identification of the MHC is necessary to understand its role in the adaptive immune system. An effective predictor called PredMHC is established in this study to identify the MHC from protein sequences. Firstly, PredMHC encoded a protein sequence with mixed features including 188D, APAAC, KSCTriad, CKSAAGP, and PAAC. Secondly, three classifiers including SGD, SMO, and random forest were trained on the mixed features of the protein sequence. Finally, the prediction result was obtained by the voting of the three classifiers. The experimental results of the 10-fold cross-validation test in the training dataset showed that PredMHC can obtain 91.69% accuracy. Experimental results on comparison with other features, classifiers, and existing methods showed the effectiveness of PredMHC in predicting the MHC.
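The final voting step described above can be sketched as follows. This is a minimal illustration that assumes hard (majority) voting over binary labels; the abstract does not state whether hard or soft voting was used. The function names and the example predictions are hypothetical stand-ins for the outputs of the SGD, SMO, and random forest classifiers.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most classifiers for one sample."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def vote_samples(per_classifier_predictions):
    """Combine predictions sample by sample.

    per_classifier_predictions: one list of labels per classifier,
    all of the same length (one label per sample).
    """
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical predictions for four protein sequences (1 = MHC, 0 = non-MHC).
sgd_pred = [1, 0, 1, 1]
smo_pred = [1, 1, 0, 1]
rf_pred = [0, 0, 1, 1]

ensemble = vote_samples([sgd_pred, smo_pred, rf_pred])
print(ensemble)  # [1, 0, 1, 1]
```

With three classifiers and two classes a tie cannot occur, so no tie-breaking rule is needed in this setting.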
Affiliation(s)
- Dong Chen
  - College of Electrical and Information Engineering, Quzhou University, Quzhou, China
- Yanjuan Li
  - College of Electrical and Information Engineering, Quzhou University, Quzhou, China
4
Bukhari SNH, Jain A, Haq E, Mehbodniya A, Webber J. Machine Learning Techniques for the Prediction of B-Cell and T-Cell Epitopes as Potential Vaccine Targets with a Specific Focus on SARS-CoV-2 Pathogen: A Review. Pathogens 2022; 11:146. [PMID: 35215090 PMCID: PMC8879824 DOI: 10.3390/pathogens11020146] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 12/13/2021] [Revised: 01/19/2022] [Accepted: 01/21/2022] [Indexed: 02/01/2023]
Abstract
Only the part of an antigen (a protein molecule found on the surface of a pathogen) that is composed of epitopes specific to T and B cells is recognized by the human immune system (HIS). Identification of epitopes is considered critical for designing an epitope-based peptide vaccine (EBPV). Although there are a number of vaccine types, EBPVs have received less attention thus far. EBPVs have a great deal of untapped potential for boosting vaccination safety; they are also less expensive and take less time to produce. Thus, in order to quickly contain global pandemics such as the ongoing outbreak of coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), as well as epidemics and endemics, EBPVs are considered promising vaccine types. The high mutation rate of SARS-CoV-2 has posed a great challenge to public health worldwide because either the composition of existing vaccines has to be changed or a new vaccine has to be developed to protect against its different variants. In such scenarios, time being the critical factor, EBPVs can be a promising alternative. To design an effective and viable EBPV against different strains of a pathogen, it is important to identify the putative T- and B-cell epitopes. Using the wet-lab experimental approach to identify these epitopes is time-consuming and costly because the experimental screening of a vast number of potential epitope candidates is required. Fortunately, various available machine learning (ML)-based prediction methods have reduced the burden related to the epitope mapping process by decreasing the potential epitope candidate list for experimental trials. Moreover, these methods are also cost-effective, scalable, and fast. This paper presents a systematic review of various state-of-the-art and relevant ML-based methods and tools for predicting T- and B-cell epitopes. Special emphasis is placed on highlighting and analyzing various models for predicting epitopes of SARS-CoV-2, the causative agent of COVID-19. Based on the various methods and tools discussed, future research directions for epitope prediction are presented.
Affiliation(s)
- Syed Nisar Hussain Bukhari
  - University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Amit Jain
  - University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Ehtishamul Haq
  - Department of Biotechnology, University of Kashmir, Srinagar 190006, India
- Abolfazl Mehbodniya
  - Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, Kuwait City 20185145, Kuwait
- Julian Webber
  - Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
5
Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.). Cells 2021; 10:3372. [PMID: 34943880 PMCID: PMC8699225 DOI: 10.3390/cells10123372] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/19/2021] [Revised: 11/19/2021] [Accepted: 11/24/2021] [Indexed: 12/27/2022]
Abstract
Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that low to moderate prediction accuracy values (<70%) were obtained in complex traits, such as yield and abiotic stress resistance. There is a need to increase the prediction value in order to employ GS in breeding programs. In this paper we reviewed different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data affiliated with alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and genome-wide association studies (GWAS) of marker-trait association scores based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato, analyzed 13 phenotypic traits, and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
6
Selvaraj C, Chandra I, Singh SK. Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries. Mol Divers 2021; 26:1893-1913. [PMID: 34686947 PMCID: PMC8536481 DOI: 10.1007/s11030-021-10326-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 04/05/2021] [Accepted: 09/24/2021] [Indexed: 12/27/2022]
Abstract
The global spread of COVID-19 has raised the importance of pharmaceutical drug development as a difficult and highly active research area. Developing new drug molecules to overcome any disease is a costly and lengthy process, but the process continues uninterrupted. A critical consideration in drug design is to use the available data resources to find new and novel leads. Once the drug target is identified, several interdisciplinary areas work together with artificial intelligence (AI) and machine learning (ML) methods to get enriched drugs. These AI and ML methods are applied in every step of computer-aided drug design, and integrating them results in a high success rate for hit compounds. In addition, the integration of AI and ML with high-dimensional data, together with its powerful capacity, has taken the field a step forward. Predicting clinical trial outcomes with AI/ML-integrated models could further decrease clinical trial costs while also improving the success rate. In this review, we discuss the backend of AI and ML methods in supporting computer-aided drug design, along with the challenges and opportunities for the pharmaceutical industry. AI- and ML-based prediction from the available information or data is applied to high-throughput virtual screening. With this integration of AI and ML, the success rate of hit identification has gained momentum, providing novel drugs.
Affiliation(s)
- Chandrabose Selvaraj
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Ishwar Chandra
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Sanjeev Kumar Singh
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
7
Jiang L, Yu H, Li J, Tang J, Guo Y, Guo F. Predicting MHC class I binder: existing approaches and a novel recurrent neural network solution. Brief Bioinform 2021; 22:6299205. [PMID: 34131696 DOI: 10.1093/bib/bbab216] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/05/2021] [Revised: 05/14/2021] [Accepted: 05/17/2021] [Indexed: 01/04/2023]
Abstract
Major histocompatibility complex (MHC) possesses important research value in the treatment of complex human diseases. A plethora of computational tools has been developed to predict MHC class I binders. Here, we comprehensively reviewed 27 up-to-date MHC I binding prediction tools developed over the last decade, thoroughly evaluating feature representation methods, prediction algorithms and model training strategies on a benchmark dataset from Immune Epitope Database. A common limitation was identified during the review that all existing tools can only handle a fixed peptide sequence length. To overcome this limitation, we developed a bilateral and variable long short-term memory (BVLSTM)-based approach, named BVLSTM-MHC. It is the first variable-length MHC class I binding predictor. In comparison to the 10 mainstream prediction tools on an independent validation dataset, BVLSTM-MHC achieved the best performance in six out of eight evaluated metrics. A web server based on the BVLSTM-MHC model was developed to enable accurate and efficient MHC class I binder prediction in human, mouse, macaque and chimpanzee.
Affiliation(s)
- Limin Jiang
  - Comprehensive cancer center, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Hui Yu
  - Comprehensive cancer center, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Jiawei Li
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jijun Tang
  - Department of Computer Science, University of South Carolina, SC, USA
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yan Guo
  - Comprehensive cancer center, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, China
8
Predicting MHC I restricted T cell epitopes in mice with NAP-CNB, a novel online tool. Sci Rep 2021; 11:10780. [PMID: 34031450 PMCID: PMC8144223 DOI: 10.1038/s41598-021-89927-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/05/2021] [Accepted: 04/27/2021] [Indexed: 12/05/2022]
Abstract
Lack of a dedicated integrated pipeline for neoantigen discovery in mice hinders cancer immunotherapy research. Novel sequential approaches through recurrent neural networks can improve the accuracy of T-cell epitope binding affinity predictions in mice, and a simplified variant selection process can reduce operational requirements. We have developed a web server tool (NAP-CNB) for a full and automatic pipeline based on recurrent neural networks, to predict putative neoantigens from tumoral RNA sequencing reads. The developed software can estimate H-2 peptide ligands, with an AUC comparable or superior to state-of-the-art methods, directly from tumor samples. As a proof-of-concept, we used the B16 melanoma model to test the system’s predictive capabilities, and we report its putative neoantigens. NAP-CNB web server is freely available at http://biocomp.cnb.csic.es/NeoantigensApp/ with scripts and datasets accessible through the download section.
9
Zinsli LV, Stierlin N, Loessner MJ, Schmelcher M. Deimmunization of protein therapeutics - Recent advances in experimental and computational epitope prediction and deletion. Comput Struct Biotechnol J 2020; 19:315-329. [PMID: 33425259 PMCID: PMC7779837 DOI: 10.1016/j.csbj.2020.12.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 10/02/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/11/2022]
Abstract
Biotherapeutics, and antimicrobial proteins in particular, are of increasing interest for human medicine. An important challenge in the development of such therapeutics is their potential immunogenicity, which can induce production of anti-drug-antibodies, resulting in altered pharmacokinetics, reduced efficacy, and potentially severe anaphylactic or hypersensitivity reactions. For this reason, the development and application of effective deimmunization methods for protein drugs is of utmost importance. Deimmunization may be achieved by unspecific shielding approaches, which include PEGylation, fusion to polypeptides (e.g., XTEN or PAS), reductive methylation, glycosylation, and polysialylation. Alternatively, the identification of epitopes for T cells or B cells and their subsequent deletion through site-directed mutagenesis represent promising deimmunization strategies and can be accomplished through either experimental or computational approaches. This review highlights the most recent advances and current challenges in the deimmunization of protein therapeutics, with a special focus on computational epitope prediction and deletion tools.
Key Words
- ABR, Antigen-binding region
- ADA, Anti-drug antibody
- ANN, Artificial neural network
- APC, Antigen-presenting cell
- Anti-drug-antibody
- B cell epitope
- BCR, B cell receptor
- Bab, Binding antibody
- CDR, Complementarity determining region
- CRISPR, Clustered regularly interspaced short palindromic repeats
- DC, Dendritic cell
- ELP, Elastin-like polypeptide
- EPO, Erythropoietin
- ER, Endoplasmatic reticulum
- GLK, Gelatin-like protein
- HAP, Homo-amino-acid polymer
- HLA, Human leukocyte antigen
- HMM, Hidden Markov model
- IL, Interleukin
- Ig, Immunoglobulin
- Immunogenicity
- LPS, Lipopolysaccharide
- MHC, Major histocompatibility complex
- NMR, Nuclear magnetic resonance
- Nab, Neutralizing antibody
- PAMP, Pathogen-associated molecular pattern
- PAS, Polypeptide composed of proline, alanine, and/or serine
- PBMC, Peripheral blood mononuclear cell
- PD, Pharmacodynamics
- PEG, Polyethylene glycol
- PK, Pharmacokinetics
- PRR, Pattern recognition receptor
- PSA, Sialic acid polymers
- Protein therapeutic
- RNN, Recurrent artificial neural network
- SVM, Support vector machine
- T cell epitope
- TAP, Transporter associated with antigen processing
- TCR, T cell receptor
- TLR, Toll-like receptor
- XTEN, “Xtended” recombinant polypeptide
Affiliation(s)
- Léa V. Zinsli
  - Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
- Noël Stierlin
  - Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
- Martin J. Loessner
  - Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
- Mathias Schmelcher
  - Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
10
Li Q, Zhou W, Wang D, Wang S, Li Q. Prediction of Anticancer Peptides Using a Low-Dimensional Feature Model. Front Bioeng Biotechnol 2020; 8:892. [PMID: 32903381 PMCID: PMC7434836 DOI: 10.3389/fbioe.2020.00892] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/25/2020] [Accepted: 07/10/2020] [Indexed: 01/09/2023]
Abstract
Cancer is still a severe health problem globally. Cancer therapy traditionally involves the use of radiotherapy or anticancer drugs to kill cancer cells, but these methods are quite expensive and have side effects that can cause great harm to patients. With the discovery of anticancer peptides (ACPs), significant progress has been achieved in the therapy of tumors. Therefore, it is invaluable to accurately identify anticancer peptides. Although biochemical experiments can accomplish this task, they are expensive and time-consuming. To promote the application of anticancer peptides in cancer therapy, machine learning can be used to recognize anticancer peptides by extracting their feature vectors. Nevertheless, machine learning models trained on high-dimensional features usually perform poorly in practice. To address this problem, this paper puts forward a 19-dimensional feature model based on anticancer peptide sequences, which has lower dimensionality and better performance than some existing methods. In addition, this paper also separated out a model with a low number of dimensions and acceptable performance. The few features identified in this study may represent the important features of anticancer peptides.
Affiliation(s)
- Qingwen Li
  - College of Animal Science and Technology, Northeast Agricultural University, Harbin, China
- Wenyang Zhou
  - Center for Bioinformatics, School of Life Sciences and Technology, Harbin Institute of Technology, Harbin, China
- Donghua Wang
  - Department of General Surgery, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
- Sui Wang
  - Key Laboratory of Soybean Biology in Chinese Ministry of Education, Northeast Agricultural University, Harbin, China
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
- Qingyuan Li
  - Forestry and Fruit Tree Research Institute, Wuhan Academy of Agricultural Sciences, Wuhan, China
11
Abstract
Immunoinformatics is a discipline that applies methods of computer science to study and model the immune system. A fundamental question addressed by immunoinformatics is how to understand the rules of antigen presentation by MHC molecules to T cells, a process that is central to adaptive immune responses to infections and cancer. In the modern era of personalized medicine, the ability to model and predict which antigens can be presented by MHC is key to manipulating the immune system and designing strategies for therapeutic intervention. Since the MHC is both polygenic and extremely polymorphic, each individual possesses a personalized set of MHC molecules with different peptide-binding specificities, and collectively they present a unique individualized peptide imprint of the ongoing protein metabolism. Mapping all MHC allotypes is an enormous undertaking that cannot be achieved without a strong bioinformatics component. Computational tools for the prediction of peptide-MHC binding have thus become essential in most pipelines for T cell epitope discovery and an inescapable component of vaccine and cancer research. Here, we describe the development of several such tools, from pioneering efforts to the current state-of-the-art methods, that have allowed for accurate predictions of peptide binding of all MHC molecules, even including those that have not yet been characterized experimentally.
Affiliation(s)
- Morten Nielsen
  - Department of Health Technology, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, CP 1650 San Martin, Buenos Aires, Argentina
- Massimo Andreatta
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, CP 1650 San Martin, Buenos Aires, Argentina
- Bjoern Peters
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, California 92037, USA
  - Department of Medicine, University of California, San Diego, La Jolla, California 92093, USA
- Søren Buus
  - Department of Immunology and Microbiology, Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark
12
Javadi A, Khamesipour A, Monajemi F, Ghazisaeedi M. Computational Modeling and Analysis to Predict Intracellular Parasite Epitope Characteristics Using Random Forest Technique. Iranian Journal of Public Health 2020; 49:125-133. [PMID: 32309231 PMCID: PMC7152625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2018] [Accepted: 10/12/2018] [Indexed: 11/23/2022]
Abstract
BACKGROUND: In a new approach, computational methods are used to design and evaluate vaccines. The aim of the current study was to develop a computational tool to predict epitope candidate vaccines to be tested in experimental models. METHODS: This study was conducted in the School of Allied Medical Sciences, and Center for Research and Training in Skin Diseases and Leprosy, Tehran University of Medical Sciences, Tehran, Iran in 2018. Random forest (RF), a classification method, was used to design a computer-based tool to predict immunogenic peptides. Data collected from the IEDB, UniProt, and AAindex databases were used. Overall, 1,264 collected records were used and divided into three parts: 70% of the data was used to train, 15% to validate, and 15% to test the model. Five-fold cross-validation was used to find optimal hyperparameters of the model. Common performance metrics were used to evaluate the developed model. RESULTS: Twenty-seven features were identified as more important using the RF predictor model and were used to predict the class of peptides. The RF model improved performance in comparison with the other predictor models (AUC±SE: 0.925±0.029). Using the developed RF model helps to identify the most likely epitopes for further experimental studies. CONCLUSION: The developed random forest model is able to more accurately predict the immunogenic peptides of intracellular parasites.
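The data partitioning described in the METHODS can be sketched as below. This is a minimal illustration using only the Python standard library; the rounding rule (truncate, with the remainder going to the test set), the fixed seed, and the choice to run five-fold cross-validation on the training portion are assumptions that the abstract does not specify. The function names are hypothetical.

```python
import random


def split_indices(n_records, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle record indices and split them into train/validation/test parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    n_train = int(train_frac * n_records)  # assumption: truncate fractional counts
    n_val = int(val_frac * n_records)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]       # remainder goes to the test set
    return train, val, test


def k_fold(indices, k=5):
    """Yield (fit, held_out) index lists for k-fold cross-validation."""
    base, extra = divmod(len(indices), k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        held_out = indices[start:start + size]
        fit = indices[:start] + indices[start + size:]
        yield fit, held_out
        start += size


train, val, test = split_indices(1264)
print(len(train), len(val), len(test))  # 884 189 191

fold_sizes = [len(held_out) for _fit, held_out in k_fold(train, k=5)]
print(fold_sizes)  # [177, 177, 177, 177, 176]
```

Each candidate hyperparameter setting would be scored on the five held-out folds, and the validation and test parts stay untouched until the final evaluation.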
Affiliation(s)
- Amir JAVADI
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
  - Department of Medical Social Sciences, Faculty of Medicine, Qazvin University of Medical Sciences, Qazvin, Iran
- Ali KHAMESIPOUR
  - Center for Research and Training in Skin Diseases and Leprosy, Tehran University of Medical Sciences, Tehran, Iran
- Marjan GHAZISAEEDI
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
13
Abolhassani Targhi MV, Asgari Jafarabadi G, Aminafshar M, Emam Jomeh Kashan N. Comparison of non-parametric methods in genomic evaluation of discrete traits. Gene Reports 2019. [DOI: 10.1016/j.genrep.2019.100379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
14
Frimpong A, Kusi KA, Ofori MF, Ndifon W. Novel Strategies for Malaria Vaccine Design. Front Immunol 2018; 9:2769. [PMID: 30555463 PMCID: PMC6281765 DOI: 10.3389/fimmu.2018.02769] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 08/27/2018] [Accepted: 11/12/2018] [Indexed: 12/19/2022]
Abstract
The quest for a licensed effective vaccine against malaria remains a global priority. Even though classical vaccine design strategies have been successful for some viral and bacterial pathogens, little success has been achieved for Plasmodium falciparum, which causes the deadliest form of malaria due to its diversity and ability to evade host immune responses. Nevertheless, recent advances in vaccinology through high throughput discovery of immune correlates of protection, lymphocyte repertoire sequencing and structural design of immunogens, provide a comprehensive approach to identifying and designing a highly efficacious vaccine for malaria. In this review, we discuss novel vaccine approaches that can be employed in malaria vaccine design.
Affiliation(s)
- Augustina Frimpong
  - Department of Biochemistry, Cell and Molecular Biology, West African Centre for Cell Biology of Infectious Pathogens, College of Basic and Applied Sciences, University of Ghana, Accra, Ghana; Immunology Department, College of Health Sciences, Noguchi Memorial Institute for Medical Research, University of Ghana, Accra, Ghana; African Institute for Mathematical Sciences, Cape Coast, Ghana
- Kwadwo Asamoah Kusi
  - Department of Biochemistry, Cell and Molecular Biology, West African Centre for Cell Biology of Infectious Pathogens, College of Basic and Applied Sciences, University of Ghana, Accra, Ghana; Immunology Department, College of Health Sciences, Noguchi Memorial Institute for Medical Research, University of Ghana, Accra, Ghana
- Michael Fokuo Ofori
  - Department of Biochemistry, Cell and Molecular Biology, West African Centre for Cell Biology of Infectious Pathogens, College of Basic and Applied Sciences, University of Ghana, Accra, Ghana; Immunology Department, College of Health Sciences, Noguchi Memorial Institute for Medical Research, University of Ghana, Accra, Ghana
- Wilfred Ndifon
  - African Institute for Mathematical Sciences, Cape Coast, Ghana; African Institute for Mathematical Sciences, University of Stellenbosch, Cape Town, South Africa
|
15
|
Sarac F, Seker H, Bouridane A. Exploration of unsupervised feature selection methods to predict chronological age of individuals by utilising CpG dinucleotics from whole blood. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2018; 2017:3652-3655. [PMID: 29060690 DOI: 10.1109/embc.2017.8037649] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
Identifying the age of individuals from epigenetic biomarkers can reveal vital information for criminal investigation, disease prevention, and extension of life. DNA methylation changes are highly associated with chronological age and the process of disease development. Computational methods such as clustering, feature selection, and regression can be utilised to construct a quantitative model of aging. In this study, we utilised 473034 CpG biomarkers from the whole blood of 656 individuals aged 19 to 101 to construct predictive models, and we treat the development of this age-predictive model as an extremely high-dimensional regression problem, a type of problem that is relatively understudied. Unlike semi-supervised and supervised feature selection methods, unsupervised feature selection methods are generally good at removing irrelevant features that can act as noise. In this study, along with the entire feature set, four different unsupervised feature selection methods (USFSMs) are therefore considered for the quantitative prediction of human ages. Since USFSMs are independent of any predictive method, support vector regression is then used to evaluate the prediction performance of the unsupervised feature selection methods. We propose a novel k-means-based unsupervised feature selection method to predict human ages by utilising CpG dinucleotides. Experimental results validate the effectiveness of the proposed method: the optimum number of CpG dinucleotides is found to be only 41, which corresponds to only 0.0087% of the entire feature space. To the best of our knowledge, this is the first study that presents an exploration and comprehensive comparison of USFSMs in very high-dimensional regression problems, particularly in the epigenetic biomedical domain for the prediction of chronological age from changes in DNA methylation.
|
16
|
Johnson ZI, Jones JD, Mukherjee A, Ren D, Feghali-Bostwick C, Conley YP, Yates CC. Novel classification for global gene signature model for predicting severity of systemic sclerosis. PLoS One 2018; 13:e0199314. [PMID: 29924864 PMCID: PMC6010260 DOI: 10.1371/journal.pone.0199314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2018] [Accepted: 06/05/2018] [Indexed: 11/25/2022] [Open Access]
Abstract
Progression of systemic scleroderma (SSc), a chronic connective tissue disease that causes a fibrotic phenotype, is highly heterogeneous amongst patients and difficult to accurately diagnose. To meet this clinical need, we developed a novel three-layer classification model, which analyses gene expression profiles from SSc skin biopsies to diagnose SSc severity. Two SSc skin biopsy microarray datasets were obtained from Gene Expression Omnibus. The skin scores obtained from the original papers were used to further categorize the data into subgroups of low (<18) and high (≥18) severity. Data was pre-processed for normalization, background correction, centering and scaling. A two-layered cross-validation scheme was employed to objectively evaluate the performance of classification models of unobserved data. Three classification models were used: support vector machine, random forest, and naive Bayes in combination with feature selection methods to improve performance accuracy. For both input datasets, random forest classifier combined with correlation-based feature selection (CFS) method and naive Bayes combined with CFS or support vector machine based recursive feature elimination method yielded the best results. Additionally, we performed a principal component analysis to show that low and high severity groups are readily separable by gene expression signatures. Ultimately, we found that our novel classification prediction model produced global gene signatures that significantly correlated with skin scores. This study represents the first report comparing the performance of various classification prediction models for gene signatures from SSc patients, using current clinical diagnostic factors. In summary, our three-classification model system is a powerful tool for elucidating gene signatures from SSc skin biopsies and can also be used to develop a prognostic gene signature for SSc and other fibrotic disorders.
Affiliation(s)
- Zariel I. Johnson
  - Department of Health Promotions and Development, University of Pittsburgh School of Nursing, Pittsburgh, PA, United States of America
- Jacqueline D. Jones
  - Department of Biological & Environmental Sciences, Troy University, Troy, AL, United States of America
- Angana Mukherjee
  - Department of Biological & Environmental Sciences, Troy University, Troy, AL, United States of America
- Dianxu Ren
  - Health and Community Systems, University of Pittsburgh School of Nursing, Pittsburgh, PA, United States of America
- Carol Feghali-Bostwick
  - Department of Rheumatology & Immunology, University of South Carolina, Charleston, SC, United States of America
- Yvette P. Conley
  - Department of Health Promotions and Development, University of Pittsburgh School of Nursing, Pittsburgh, PA, United States of America
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, United States of America
- Cecelia C. Yates
  - Department of Health Promotions and Development, University of Pittsburgh School of Nursing, Pittsburgh, PA, United States of America
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States of America
|
17
|
Kasnavi SA, Aminafshar M, Shariati MM, Emam Jomeh Kashan N, Honarvar M. The effect of kernel selection on genome wide prediction of discrete traits by Support Vector Machine. GENE REPORTS 2018. [DOI: 10.1016/j.genrep.2018.04.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/17/2022]
|
18
|
Ho L, Legere M, Li T, Levine S, Hao K, Valcarcel B, Pasinetti GM. Autonomic Nervous System Dysfunctions as a Basis for a Predictive Model of Risk of Neurological Disorders in Subjects with Prior History of Traumatic Brain Injury: Implications in Alzheimer's Disease. J Alzheimers Dis 2018; 56:305-315. [PMID: 27911325 DOI: 10.3233/jad-160948] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/15/2022]
Abstract
Autonomic dysfunction is very common in patients with dementia, and its presence might also help in differential diagnosis among dementia subtypes. Various central nervous system structures affected in Alzheimer's disease (AD) are also implicated in the central autonomic nervous system (ANS) regulation. For example, deficits in central cholinergic function in AD could likely lead to autonomic dysfunction. We recently developed a simple, readily applicable evaluation for monitoring ANS disturbances in response to traumatic brain injury (TBI). This ability to monitor TBI allows for the possible detection and targeted prevention of long-term, detrimental brain responses caused by TBI that lead to neurodegenerative diseases such as AD. We randomly selected and extracted de-identified medical record information from subjects who have been assessed using the ANS evaluation protocol. Using machine learning strategies in the analysis of information from individual as well as a combination of ANS evaluation protocol components, we identified a novel prediction model that is effective in correctly segregating between cases with or without a documented history of TBI exposure. Results from our study support the hypothesis that trauma-induced ANS dysfunctions may contribute to clinical TBI features. Because autonomic dysfunction is very common in AD patients it is possible that TBI may also contribute to AD and/or other forms of dementia through these novel mechanisms. This study provides a novel prediction model to physiologically assess the likelihood of subjects with prior history of TBI to develop clinical TBI complications, such as AD.
Affiliation(s)
- Lap Ho
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Geriatric Research, Education & Clinical Center, James J. Peters Veterans Affairs Medical Center, Bronx, NY, USA
- Samara Levine
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ke Hao
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Breanna Valcarcel
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Giulio M Pasinetti
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Geriatric Research, Education & Clinical Center, James J. Peters Veterans Affairs Medical Center, Bronx, NY, USA
|
19
|
Han Y, Kim D. Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction. BMC Bioinformatics 2017; 18:585. [PMID: 29281985 PMCID: PMC5745637 DOI: 10.1186/s12859-017-1997-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 09/17/2017] [Accepted: 12/12/2017] [Indexed: 11/10/2022] [Open Access]
Abstract
Background: Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process, and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training on large amounts of experimental data. However, many machine-learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. The deep convolutional neural network (DCNN) is a deep learning method inspired by the visual recognition process of the animal brain, and it is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions are encoded into image-like array (ILA) data, a DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that the DCNN is able not only to reliably predict peptide-MHC binding, but also to sensitively detect locally-clustered interactions.
Results: Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools on the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for the HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server that provides user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. The ConvMHC web server is accessible via http://jumong.kaist.ac.kr:8080/convmhc.
Conclusions: We developed a novel method for peptide-HLA-I binding prediction using a DCNN trained on ILA data that encode peptide binding data, and demonstrated the reliable performance of the DCNN in nonapeptide binding prediction through independent evaluation on the latest IEDB benchmark datasets. Our approach can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
Affiliation(s)
- Youngmahn Han
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea; Department of Convergence Technology Research, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Dongsup Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
|
20
|
Fundamentals and Methods for T- and B-Cell Epitope Prediction. J Immunol Res 2017; 2017:2680160. [PMID: 29445754 PMCID: PMC5763123 DOI: 10.1155/2017/2680160] [Citation(s) in RCA: 284] [Impact Index Per Article: 40.6] [Received: 07/27/2017] [Revised: 11/22/2017] [Accepted: 11/27/2017] [Indexed: 12/25/2022] [Open Access]
Abstract
Adaptive immunity is mediated by T- and B-cells, which are immune cells capable of developing pathogen-specific memory that confers immunological protection. Memory and effector functions of B- and T-cells are predicated on the recognition through specialized receptors of specific targets (antigens) in pathogens. More specifically, B- and T-cells recognize portions within their cognate antigens known as epitopes. There is great interest in identifying epitopes in antigens for a number of practical reasons, including understanding disease etiology, immune monitoring, developing diagnosis assays, and designing epitope-based vaccines. Epitope identification is costly and time-consuming as it requires experimental screening of large arrays of potential epitope candidates. Fortunately, researchers have developed in silico prediction methods that dramatically reduce the burden associated with epitope mapping by decreasing the list of potential epitope candidates for experimental testing. Here, we analyze aspects of antigen recognition by T- and B-cells that are relevant for epitope prediction. Subsequently, we provide a systematic and inclusive review of the most relevant B- and T-cell epitope prediction methods and tools, paying particular attention to their foundations.
|
21
|
Uslan V, Seker H. Binding affinity prediction of S. cerevisiae 14-3-3 and GYF peptide-recognition domains using support vector regression. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2016; 2016:3445-3448. [PMID: 28269042 DOI: 10.1109/embc.2016.7591469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Proteins interact with other proteins and bio-molecules to carry out biological processes in a cell. Computational models help in understanding the complex biochemical processes that happen throughout the life of a cell. Domain-mediated protein interaction with peptides is one such complex problem in bioinformatics that requires computational predictive models to identify meaningful bindings. In this study, domain-peptide binding affinity prediction models are proposed based on support vector regression. The proposed models are applied to the yeast bmh 14-3-3 and syh GYF peptide-recognition domains. The cross-validated results on the domain-peptide binding affinity data sets show that the predictive performance of the support vector based models is efficient.
|
22
|
Uslan V, Seker H. The quantitative prediction of HLA-B*2705 peptide binding affinities using Support Vector Regression to gain insights into its role for the Spondyloarthropathies. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2016; 2015:7651-4. [PMID: 26738064 DOI: 10.1109/embc.2015.7320164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
Computational methods are increasingly utilised in many immunoinformatics problems, such as the prediction of the binding affinity of peptides. Such peptides could provide valuable insight into drug design and development, for example of vaccines. Moreover, they can be used to diagnose diseases. The presence of the human class I MHC allele HLA-B*2705 is strongly hypothesised to lead to spondyloarthropathies. In this paper, Support Vector Regression is used to predict the binding affinity of peptides, with the aid of experimentally determined peptide-MHC binding affinities of 222 peptides to HLA-B*2705, to gain more insight into this problematic disease. The results yield a correlation coefficient as high as 0.65, and the SVR-based predictive models can be considered a useful tool for predicting the binding affinities of newly discovered peptides.
|
23
|
Luo H, Ye H, Ng HW, Shi L, Tong W, Mendrick DL, Hong H. Machine Learning Methods for Predicting HLA-Peptide Binding Activity. Bioinform Biol Insights 2015; 9:21-9. [PMID: 26512199 PMCID: PMC4603527 DOI: 10.4137/bbi.s29466] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 06/23/2015] [Revised: 07/30/2015] [Accepted: 08/02/2015] [Indexed: 11/23/2022] [Open Access]
Abstract
As the major histocompatibility complexes in humans, the human leukocyte antigens (HLAs) have the important function of presenting antigen peptides to T-cell receptors for immunological recognition and responses. Interpreting and predicting HLA–peptide binding is important for studying T-cell epitopes, immune reactions, and the mechanisms of adverse drug reactions. We review different types of machine learning methods and tools that have been used for HLA–peptide binding prediction. We also summarize the descriptors on which the HLA–peptide binding prediction models have been constructed and discuss the limitations and challenges of the current methods. Lastly, we give a future perspective on HLA–peptide binding prediction methods based on network analysis.
Affiliation(s)
- Heng Luo
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA; University of Arkansas at Little Rock/University of Arkansas for Medical Sciences Bioinformatics Graduate Program, Little Rock, AR, USA
- Hao Ye
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Hui Wen Ng
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Leming Shi
  - Center for Pharmacogenomics, School of Pharmacy, Fudan University, Shanghai, China
- Weida Tong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Donna L Mendrick
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
|
24
|
Zhang W, Niu Y, Zou H, Luo L, Liu Q, Wu W. Accurate prediction of immunogenic T-cell epitopes from epitope sequences using the genetic algorithm-based ensemble learning. PLoS One 2015; 10:e0128194. [PMID: 26020952 PMCID: PMC4447411 DOI: 10.1371/journal.pone.0128194] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 12/30/2014] [Accepted: 04/24/2015] [Indexed: 11/19/2022] [Open Access]
Abstract
Background: T-cell epitopes play an important role in the T-cell immune response, and they are critical components in epitope-based vaccine design. Immunogenicity is the ability to trigger an immune response. The accurate prediction of immunogenic T-cell epitopes is significant for designing useful vaccines and understanding the immune system.
Methods: In this paper, we attempt to differentiate immunogenic epitopes from non-immunogenic epitopes based on their primary structures. First, we explore a variety of sequence-derived features and analyze their relationship with epitope immunogenicity. To utilize the various features effectively, a genetic algorithm (GA)-based ensemble method is proposed to determine the optimal feature subset and develop a high-accuracy ensemble model. In the GA optimization, a chromosome represents a feature subset in the search space. For each feature subset, the selected features are utilized to construct the base predictors, and an ensemble model is developed by taking the average of the outputs from the base predictors. The objective of the GA is to search for the optimal feature subset, which leads to the ensemble model with the best cross-validation AUC (area under the ROC curve) on the training set.
Results: Two datasets named ‘IMMA2’ and ‘PAAQD’ are adopted as the benchmark datasets. Compared with the state-of-the-art methods POPI, POPISK, PAAQD and our previous method, the GA-based ensemble method produces much better performance, achieving an AUC score of 0.846 on the IMMA2 dataset and an AUC score of 0.829 on the PAAQD dataset. The statistical analysis demonstrates that the performance improvements of the GA-based ensemble method are statistically significant.
Conclusions: The proposed method is a promising tool for predicting immunogenic epitopes. The source codes and datasets are available in S1 File.
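As a rough illustration of the ensemble step described in this abstract, the following Python sketch averages the outputs of the base predictors selected by a 0/1 mask (the "chromosome") and scores the resulting ensemble by AUC. It is not the authors' code: the function names, the toy scores, and the exhaustive search are all assumptions made for illustration. In particular, the exhaustive search stands in for the genetic algorithm, and the AUC is computed directly on the given labels rather than under cross-validation.

```python
from itertools import product


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def ensemble_average(base_outputs, mask):
    """Average the outputs of the base predictors whose mask bit is 1."""
    chosen = [out for out, keep in zip(base_outputs, mask) if keep]
    if not chosen:
        raise ValueError("mask selects no base predictor")
    return [sum(column) / len(chosen) for column in zip(*chosen)]


def best_mask(base_outputs, labels):
    """Exhaustive stand-in for the GA: try every non-empty mask, keep the best AUC."""
    best, best_score = None, -1.0
    for mask in product([0, 1], repeat=len(base_outputs)):
        if not any(mask):
            continue  # an empty subset has no ensemble output
        score = auc(ensemble_average(base_outputs, mask), labels)
        if score > best_score:  # strict '>' keeps the first mask reaching the maximum
            best, best_score = mask, score
    return best, best_score


# Hypothetical toy data: one list of scores per base predictor (one per feature).
labels = [1, 1, 0, 0]
base_outputs = [
    [0.9, 0.4, 0.5, 0.1],
    [0.2, 0.8, 0.3, 0.6],
    [0.5, 0.5, 0.5, 0.5],
]
mask, score = best_mask(base_outputs, labels)
print(mask, score)
```

Exhaustive search is only feasible for a handful of features; with many candidate features the number of masks grows as 2^n, which is why a heuristic search such as a GA is used in practice.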
Affiliation(s)
- Wen Zhang
  - School of Computer, Wuhan University, Wuhan, 430072, China
  - Research Institute of Shenzhen, Wuhan University, Shenzhen, 518057, China
- Yanqing Niu
  - School of Mathematics and Statistics, South-central University for Nationalities, Wuhan, 430074, China
- Hua Zou
  - School of Computer, Wuhan University, Wuhan, 430072, China
- Longqiang Luo
  - School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
- Qianchao Liu
  - School of Computer, Wuhan University, Wuhan, 430072, China
- Weijian Wu
  - School of Computer, Wuhan University, Wuhan, 430072, China
|
25
|
Aguilar-Bonavides C, Sanchez-Arias R, Lanzas C. Accurate prediction of major histocompatibility complex class II epitopes by sparse representation via ℓ1-minimization. BioData Min 2014; 7:23. [PMID: 25392716 PMCID: PMC4225598 DOI: 10.1186/1756-0381-7-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2014] [Accepted: 10/25/2014] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: The major histocompatibility complex (MHC) is responsible for presenting antigens (epitopes) on the surface of antigen-presenting cells (APCs). When pathogen-derived epitopes are presented by MHC class II on an APC surface, T cells may be able to trigger a specific immune response. Prediction of MHC-II epitopes is particularly challenging because the open binding cleft of the MHC-II molecule allows epitopes to bind beyond the peptide binding groove; therefore, the molecule is capable of accommodating peptides of variable length. Among the methods proposed to predict MHC-II epitopes, artificial neural networks (ANNs) and support vector machines (SVMs) are the most effective. We propose a novel classification algorithm to predict MHC-II epitopes, called sparse representation via ℓ1-minimization.
RESULTS: We obtained a collection of experimentally confirmed MHC-II epitopes from the Immune Epitope Database and Analysis Resource (IEDB) and applied our ℓ1-minimization algorithm. To benchmark the performance of our proposed algorithm, we compared our predictions against an SVM classifier. We measured sensitivity, specificity and accuracy; then we used Receiver Operating Characteristic (ROC) analysis to evaluate the performance of our method. The prediction performance of the ℓ1-minimization algorithm on MHC-II epitopes was generally comparable and, in some cases, superior to the standard SVM classification method, and it overcame the lack of robustness of other methods with respect to outliers. While our method consistently favoured DPPS encoding with the alleles tested, SVM showed slightly better accuracy when "11-factor" encoding was used.
CONCLUSIONS: ℓ1-minimization has similar accuracy to SVM and has additional advantages, such as overcoming the lack of robustness with respect to outliers. With ℓ1-minimization, no model selection dependency is involved.
Affiliation(s)
- Clemente Aguilar-Bonavides
  - National Institute for Mathematical and Biological Synthesis, University of Tennessee, 37996-3410 Knoxville, TN, USA
- Reinaldo Sanchez-Arias
  - Department of Applied Mathematics, Wentworth Institute of Technology, 02115 Boston, MA, USA
- Cristina Lanzas
  - National Institute for Mathematical and Biological Synthesis, University of Tennessee, 37996-3410 Knoxville, TN, USA; Department of Biomedical and Diagnostic Sciences, University of Tennessee, 37996-3410 Knoxville, TN, USA
|
26
|
Vaishnav RA, Liu R, Chapman J, Roberts AM, Ye H, Rebolledo-Mendez JD, Tabira T, Fitzpatrick AH, Achiron A, Running MP, Friedland RP. Aquaporin 4 molecular mimicry and implications for neuromyelitis optica. J Neuroimmunol 2013; 260:92-8. [PMID: 23664693 PMCID: PMC3682654 DOI: 10.1016/j.jneuroim.2013.04.015] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 02/27/2013] [Revised: 04/10/2013] [Accepted: 04/11/2013] [Indexed: 12/31/2022]
Abstract
Neuromyelitis optica (NMO) is associated with antibodies to aquaporin 4 (AQP4). We hypothesized that antibodies to AQP4 can be triggered by exposure to environmental proteins. We compared human AQP4 to plant and bacterial proteins to investigate the occurrence of significantly similar structures and sequences. High similarity to a known epitope for NMO-IgG, AQP4(207-232), was observed for corn ZmTIP4-1. NMO and non-NMO sera were assessed for reactivity to AQP4(207-232) and the corn peptide. NMO patient serum showed reactivity to both peptides as well as to plant tissue. These findings warrant further investigation into the role of the environment in NMO etiology.
Affiliation(s)
- Radhika A. Vaishnav
  - Department of Neurology, University of Louisville, KY, USA
  - Department of Physiology and Biophysics, University of Louisville, KY, USA
- Ruolan Liu
  - Department of Neurology, University of Louisville, KY, USA
- Joab Chapman
  - Department of Neurology, Sheba Medical Center, Sackler Faculty of Medicine, Tel Aviv University, Israel
- Andrew M. Roberts
  - Department of Physiology and Biophysics, University of Louisville, KY, USA
- Hong Ye
  - Department of Pharmacology, University of Louisville, KY, USA
- Takeshi Tabira
  - Department of Diagnosis, Prevention, and Treatment of Dementia, Graduate School of Juntendo University, Tokyo, Japan
- Anat Achiron
  - Department of Neurology, Sheba Medical Center, Sackler Faculty of Medicine, Tel Aviv University, Israel
- Robert P. Friedland
  - Department of Neurology, University of Louisville, KY, USA
  - Department of Biochemistry, University of Louisville, KY, USA
|
27
|
Luo F, Gao Y, Zhu Y, Liu J. Integrating peptides' sequence and energy of contact residues information improves prediction of peptide and HLA-I binding with unknown alleles. BMC Bioinformatics 2013; 14 Suppl 8:S1. [PMID: 23815611 PMCID: PMC3654895 DOI: 10.1186/1471-2105-14-s8-s1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022] [Open Access]
Abstract
Background: HLA (human leukocyte antigen) class I molecules are encoded by a large family of genes and are characterized by high polymorphism. The number of registered HLA-I molecules has now exceeded 3000. Slight differences in the amino acid sequences of HLAs make them bind to different sets of peptides. Over the past decades, many methods have been proposed to predict the binding between peptides and HLA-I molecules, and they have achieved good performance, but most of the experimental data they use is limited to HLAs with a small number of alleles. They therefore tend to obtain high prediction accuracy only for data with similar alleles. Because the peptides and the HLAs together determine the binding, it is necessary to consider their contributions jointly.
Results: By taking into account the features of the peptide sequence and the energy of contact residues, this paper proposes a method based on an artificial neural network to predict the binding of peptides and HLA-I even when the HLAs' potential alleles are unknown. Two experiments, in the allele-specific and super-type cases respectively, are performed to validate our method. In the first case, we collect 14 HLA-A and 14 HLA-B molecules from the Bjoern Peters dataset and compare our method with ARB, SMM, NetMHC and 16 other online methods. Our method obtains the best average AUC (area under the ROC curve) value of 0.909. In the second case, we use leave-one-out cross-validation on MHC-peptide binding data that have different alleles but share a common super-type. Compared with gold-standard methods such as NetMHC and NetMHCpan, our method again achieves the best average AUC value of 0.847.
Conclusions: Our method achieves satisfactory results. Whether it is tested on HLA-I with a single definite gene or with a super-type gene locus, it achieves better classification accuracy. In particular, when the training set is small, our method still works better than the other methods in the comparison. We therefore conclude that, by combining the peptides' information, the interaction information of the HLAs' amino acid residues, and the contact energy, our method can improve prediction of peptide-HLA-I binding even when no prior experimental dataset is available for HLAs with various alleles.
Affiliation(s)
- Fei Luo
  - School of Computer, Wuhan University, Wuhan, Hubei, China
|
28
|
Koch CP, Pillong M, Hiss JA, Schneider G. Computational Resources for MHC Ligand Identification. Mol Inform 2013; 32:326-36. [PMID: 27481589 DOI: 10.1002/minf.201300042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/09/2012] [Accepted: 04/04/2013] [Indexed: 01/16/2023]
Abstract
Advances in the high-throughput determination of functional modulators of major histocompatibility complex (MHC) and improved computational predictions of MHC ligands have rendered the rational design of immunomodulatory peptides feasible. Proteome-derived peptides and 'reverse vaccinology' by computational means will play a driving role in future vaccine design. Here we review the molecular mechanisms of the MHC mediated immune response, present the computational approaches that have emerged in this area of biotechnology, and provide an overview of publicly available computational resources for predicting and designing new peptidic MHC ligands.
Affiliation(s)
- Christian P Koch
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Max Pillong
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Jan A Hiss
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
- Gisbert Schneider
- ETH Zürich, Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
|
29
|
Patronov A, Doytchinova I. T-cell epitope vaccine design by immunoinformatics. Open Biol 2013; 3:120139. [PMID: 23303307 PMCID: PMC3603454 DOI: 10.1098/rsob.120139] [Citation(s) in RCA: 255] [Impact Index Per Article: 23.2] [Received: 09/22/2012] [Accepted: 12/11/2012] [Indexed: 01/08/2023]
Abstract
Vaccination is generally considered to be the most effective method of preventing infectious diseases. All vaccinations work by presenting a foreign antigen to the immune system in order to evoke an immune response. The active agent of a vaccine may be intact but inactivated ('attenuated') forms of the causative pathogens (bacteria or viruses), or purified components of the pathogen that have been found to be highly immunogenic. The increased understanding of antigen recognition at molecular level has resulted in the development of rationally designed peptide vaccines. The concept of peptide vaccines is based on identification and chemical synthesis of B-cell and T-cell epitopes which are immunodominant and can induce specific immune responses. The accelerating growth of bioinformatics techniques and applications along with the substantial amount of experimental data has given rise to a new field, called immunoinformatics. Immunoinformatics is a branch of bioinformatics dealing with in silico analysis and modelling of immunological data and problems. Different sequence- and structure-based immunoinformatics methods are reviewed in the paper.
Affiliation(s)
- Irini Doytchinova
- Department of Chemistry, Faculty of Pharmacy, Medical University of Sofia, Sofia, Bulgaria
|
30
|
Comparison of the predicted population coverage of tuberculosis vaccine candidates Ag85B-ESAT-6, Ag85B-TB10.4, and Mtb72f via a bioinformatics approach. PLoS One 2012; 7:e40882. [PMID: 22815851 PMCID: PMC3398899 DOI: 10.1371/journal.pone.0040882] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 10/27/2011] [Accepted: 06/15/2012] [Indexed: 11/24/2022]
Abstract
The Bacille-Calmette Guérin (BCG) vaccine does not provide consistent protection against adult pulmonary tuberculosis (TB) worldwide. As novel TB vaccine candidates advance in studies and clinical trials, it will be critically important to evaluate their global coverage by assessing the impact of host and pathogen variability on vaccine efficacy. In this study, we focus on the impact that host genetic variability may have on the protective effect of TB vaccine candidates Ag85B-ESAT-6, Ag85B-TB10.4, and Mtb72f. We use open-source epitope binding prediction programs to evaluate the binding of vaccine epitopes to Class I HLA (A, B, and C) and Class II HLA (DRB1) alleles. Our findings suggest that Mtb72f may be less consistently protective than either Ag85B-ESAT-6 or Ag85B-TB10.4 in populations with a high TB burden, while Ag85B-TB10.4 may provide the most consistent protection. The findings of this study highlight the utility of bioinformatics as a tool for evaluating vaccine candidates before the costly stages of clinical trials and informing the development of new vaccines with the broadest possible population coverage.
|
31
|
Grading amino acid properties increased accuracies of single point mutation on protein stability prediction. BMC Bioinformatics 2012; 13:44. [PMID: 22435732 PMCID: PMC3820156 DOI: 10.1186/1471-2105-13-44] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/08/2011] [Accepted: 03/22/2012] [Indexed: 11/23/2022]
Abstract
Background Protein stability can be affected by point mutations introduced into the protein. Current sequence-based encoding schemes for machine learning prediction of protein stability include sparse encoding and amino acid property encoding. Property encoding schemes employ physicochemical information about the mutated protein environment; however, they become complex when many properties are combined in one scheme. This complexity introduces noise that affects the accuracy of machine learning algorithms. To overcome the problem, we describe a new encoding scheme that grades the twenty amino acids into groups according to their specific property values.
Results We employed three predefined values, 0.1, 0.5, and 0.9, to represent the 'weak', 'middle', and 'strong' groups for each amino acid property, and introduced two thresholds for each property to assign each of the twenty amino acids to one of the three groups according to its property value. Each amino acid can therefore take only one of three predefined values, rather than one of twenty different values, for each property. The complexity and noise in the encoding schemes were reduced in this way. An average accuracy improvement of more than 7% was found for the graded amino acid property encoding schemes in 20-fold cross-validation. The overall accuracy of our method is more than 72% on the independent test sets, starting from sequence information, with three-state prediction definitions.
Conclusions Grading the numeric values of amino acid properties can reduce the noise and complexity of the input information. It is in accordance with biochemical concepts for amino acid properties and at the same time simplifies the input data. The idea of graded property encoding schemes may be applied to protein-related predictions with machine learning approaches.
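The grading step described in this abstract can be illustrated with a minimal Python sketch. The three levels 0.1, 0.5 and 0.9 come from the abstract; everything else is an assumption made for illustration: the Kyte-Doolittle hydropathy scale stands in for "an amino acid property", and the thresholds -1.0 and 1.0 are hypothetical, not the values used by the authors.

```python
# Kyte-Doolittle hydropathy values, used here only as an example property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# The three predefined levels named in the abstract.
WEAK, MIDDLE, STRONG = 0.1, 0.5, 0.9


def grade(value, low, high):
    """Map a raw property value to one of the three graded levels."""
    if value < low:
        return WEAK
    if value < high:
        return MIDDLE
    return STRONG


def encode(sequence, scale=HYDROPATHY, low=-1.0, high=1.0):
    """Encode a sequence as graded values; low/high are hypothetical thresholds."""
    return [grade(scale[residue], low, high) for residue in sequence]


print(encode("IRG"))  # [0.9, 0.1, 0.5]
```

With these assumed thresholds, isoleucine (4.5) falls in the 'strong' group, arginine (-4.5) in the 'weak' group and glycine (-0.4) in the 'middle' group, so each residue contributes one of three values per property instead of one of twenty.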
|
32
|
Song J, Tan H, Wang M, Webb GI, Akutsu T. TANGLE: two-level support vector regression approach for protein backbone torsion angle prediction from primary sequences. PLoS One 2012; 7:e30361. [PMID: 22319565 PMCID: PMC3271071 DOI: 10.1371/journal.pone.0030361] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 10/05/2011] [Accepted: 12/14/2011] [Indexed: 12/29/2022]
Abstract
Protein backbone torsion angles (Phi) and (Psi) involve two rotation angles rotating around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angle from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% respectively lower than that using one of the state-of-the-art prediction tools ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on the amino acid-specific basis, with the p-value<1.46e-147 and 7.97e-150, respectively by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/.
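The mean absolute error reported above is an error over angles, which are periodic. The abstract does not say how the authors handle the wrap-around at ±180°, so the following Python sketch is only one plausible reading: it assumes angles in degrees and measures each error along the shorter arc of the circle. The function names are illustrative.

```python
def angular_error(predicted, observed):
    """Absolute difference between two angles in degrees, taken along the shorter arc."""
    diff = abs(predicted - observed) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_error(predicted, observed):
    """Mean of the per-residue angular errors for two equal-length lists of angles."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(angular_error(p, o) for p, o in zip(predicted, observed))
    return total / len(predicted)


# 179 and -179 degrees are only 2 degrees apart on the circle, not 358.
print(mean_absolute_error([179.0, -60.0], [-179.0, -50.0]))  # 6.0
```

Without the wrap-around step, the first pair would contribute 358° and the MAE would be badly inflated for residues whose angles lie near ±180°.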
Affiliation(s)
- Jiangning Song
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, Victoria, Australia
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
- * E-mail: (JS); (GIW); (TA)
- Hao Tan
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, Victoria, Australia
- Mingjun Wang
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
- Geoffrey I. Webb
- Faculty of Information Technology, Monash University, Melbourne, Victoria, Australia
- * E-mail: (JS); (GIW); (TA)
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
- * E-mail: (JS); (GIW); (TA)
|
33
|
Tung CW, Ziehm M, Kämper A, Kohlbacher O, Ho SY. POPISK: T-cell reactivity prediction using support vector machines and string kernels. BMC Bioinformatics 2011; 12:446. [PMID: 22085524 PMCID: PMC3228774 DOI: 10.1186/1471-2105-12-446] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Received: 07/26/2011] [Accepted: 11/15/2011] [Indexed: 02/03/2023]
Abstract
Background Accurate prediction of peptide immunogenicity and characterization of relation between peptide sequences and peptide immunogenicity will be greatly helpful for vaccine designs and understanding of the immune system. In contrast to the prediction of antigen processing and presentation pathway, the prediction of subsequent T-cell reactivity is a much harder topic. Previous studies of identifying T-cell receptor (TCR) recognition positions were based on small-scale analyses using only a few peptides and concluded different recognition positions such as positions 4, 6 and 8 of peptides with length 9. Large-scale analyses are necessary to better characterize the effect of peptide sequence variations on T-cell reactivity and design predictors of a peptide's T-cell reactivity (and thus immunogenicity). The identification and characterization of important positions influencing T-cell reactivity will provide insights into the underlying mechanism of immunogenicity. Results This work establishes a large dataset by collecting immunogenicity data from three major immunology databases. In order to consider the effect of MHC restriction, peptides are classified by their associated MHC alleles. Subsequently, a computational method (named POPISK) using support vector machine with a weighted degree string kernel is proposed to predict T-cell reactivity and identify important recognition positions. POPISK yields a mean 10-fold cross-validation accuracy of 68% in predicting T-cell reactivity of HLA-A2-binding peptides. POPISK is capable of predicting immunogenicity with scores that can also correctly predict the change in T-cell reactivity related to point mutations in epitopes reported in previous studies using crystal structures. Thorough analyses of the prediction results identify the important positions 4, 6, 8 and 9, and yield insights into the molecular basis for TCR recognition. 
Finally, we relate this finding to physicochemical properties and structural features of the MHC-peptide-TCR interaction. Conclusions A computational method POPISK is proposed to predict immunogenicity with scores which are useful for predicting immunogenicity changes made by single-residue modifications. The web server of POPISK is freely available at http://iclab.life.nctu.edu.tw/POPISK.
Affiliation(s)
- Chun-Wei Tung
- School of Pharmacy, Kaohsiung Medical University, Kaohsiung 807, Taiwan
|
34
|
Tiriveedhi V, Sarma NJ, Subramanian V, Fleming TP, Gillanders WE, Mohanakumar T. Identification of HLA-A24-restricted CD8(+) cytotoxic T-cell epitopes derived from mammaglobin-A, a human breast cancer-associated antigen. Hum Immunol 2011; 73:11-6. [PMID: 22074997 DOI: 10.1016/j.humimm.2011.10.017] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 07/29/2011] [Revised: 07/29/2011] [Accepted: 10/12/2011] [Indexed: 01/20/2023]
Abstract
Human breast cancer-associated antigen, mammaglobin-A (Mam-A), potentially offers a novel therapeutic target as a breast cancer vaccine. In this study, we define the CD8(+) cytotoxic T lymphocyte (CTL) response to Mam-A-derived candidate epitopes presented in the context of HLA-A24 (A*2402). HLA-A24 has a frequency of 72% in Japanese, 27% in Asian Indian, and 18% in Caucasian populations. Using a human leukocyte antigen (HLA)-binding prediction algorithm we identified 7 HLA-A24-restricted Mam-A-derived candidate epitopes (MAA24.1-7). Membrane stabilization studies with TAP-deficient T2 cells transfected with HLA-A2402 (T2.A24) indicated that MAA24.2 (CYAGSGCPL) and MAA24.4 (ETLSNVEVF) have the highest HLA-A24 binding affinity. Further, 2 CD8(+) CTL cell lines generated in vitro against T2.A24 cells individually loaded with Mam-A-derived candidate epitopes demonstrated significant cytotoxic activity against MAA24.2 and MAA24.4. In addition, the same CD8(+) CTL lines lysed the HLA-A24(+)/Mam-A(+) stable transfected human breast cancer cell lines AU565 and MDA-MB-361. However, these CTLs had no cytotoxicity against HLA-A24(-)/Mam-A(+) and HLA-A24(+)/Mam-A(-) breast cancer cell lines. In summary, our results define HLA-A24-restricted, Mam-A-derived, CD8(+) CTL epitopes that can potentially be employed for Mam-A-based breast cancer vaccine therapy to breast cancer patients with HLA-A24 phenotype.
Affiliation(s)
- Venkataswarup Tiriveedhi
- Department of Surgery, Washington University School of Medicine, Saint Louis, Missouri 63110, USA
|
35
|
Patronov A, Dimitrov I, Flower DR, Doytchinova I. Peptide binding prediction for the human class II MHC allele HLA-DP2: a molecular docking approach. BMC STRUCTURAL BIOLOGY 2011; 11:32. [PMID: 21752305 PMCID: PMC3146810 DOI: 10.1186/1472-6807-11-32] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 03/08/2011] [Accepted: 07/14/2011] [Indexed: 12/04/2022]
Abstract
Background MHC class II proteins bind oligopeptide fragments derived from proteolysis of pathogen antigens, presenting them at the cell surface for recognition by CD4+ T cells. Human MHC class II alleles are grouped into three loci: HLA-DP, HLA-DQ and HLA-DR. In contrast to HLA-DR and HLA-DQ, HLA-DP proteins have not been studied extensively, as they have been viewed as less important in immune responses than DRs and DQs. However, it is now known that HLA-DP alleles are associated with many autoimmune diseases. Quite recently, the X-ray structure of the HLA-DP2 molecule (DPA*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain has been determined. In the present study, we applied a validated molecular docking protocol to a library of 247 modelled peptide-DP2 complexes, seeking to assess the contribution made by each of the 20 naturally occurring amino acids at each of the nine binding core peptide positions and the four flanking residues (two on each side). Results The free binding energies (FBEs) derived from the docking experiments were normalized on a position-dependent (npp) and on an overall basis (nap), and two docking score-based quantitative matrices (DS-QMs) were derived: QMnpp and QMnap. They reveal the amino acid preferences at each of the 13 positions considered in the study. Apart from the leading role of anchor positions p1 and p6, the binding to HLA-DP2 depends on the preferences at p2. No effect of the flanking residues was found on the peptide binding predictions to DP2, although all four of them show strong preferences for particular amino acids. The predictive ability of the DS-QMs was tested using a set of 457 known binders to HLA-DP2, originating from 24 proteins. The sensitivities of the predictions at five different thresholds (5%, 10%, 15%, 20% and 25%) were calculated and compared to the predictions made by the NetMHCII and IEDB servers. Analysis of the DS-QMs indicated an improvement in performance. 
Additionally, DS-QMs identified the binding cores of several known DP2 binders. Conclusions The molecular docking protocol, as applied to a combinatorial library of peptides, models the peptide-HLA-DP2 protein interaction effectively, generating reliable predictions in a quantitative assessment. The method is structure-based and does not require extensive experimental sequence-based data. Thus, it is universal and can be applied to model any peptide - protein interaction.
Affiliation(s)
- Atanas Patronov
- Rebirth, Hannover Biomedical Research School, Carl-Neuberg strasse 1, Hannover, Germany
|
36
|
EL-Manzalawy Y, Dobbs D, Honavar V. Predicting MHC-II binding affinity using multiple instance regression. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:1067-1079. [PMID: 20855923 PMCID: PMC3400677 DOI: 10.1109/tcbb.2010.94] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/29/2023]
Abstract
Reliably predicting the ability of antigen peptides to bind to major histocompatibility complex class II (MHC-II) molecules is an essential step in developing new vaccines. Uncovering the amino acid sequence correlates of the binding affinity of MHC-II binding peptides is important for understanding pathogenesis and immune response. The task of predicting MHC-II binding peptides is complicated by the significant variability in their length. Most existing computational methods for predicting MHC-II binding peptides focus on identifying a nine amino acids core region in each binding peptide. We formulate the problems of qualitatively and quantitatively predicting flexible length MHC-II peptides as multiple instance learning and multiple instance regression problems, respectively. Based on this formulation, we introduce MHCMIR, a novel method for predicting MHC-II binding affinity using multiple instance regression. We present results of experiments using several benchmark data sets that show that MHCMIR is competitive with the state-of-the-art methods for predicting MHC-II binding peptides. An online web server that implements the MHCMIR method for MHC-II binding affinity prediction is freely accessible at http://ailab.cs.iastate.edu/mhcmir.
Affiliation(s)
- Yasser EL-Manzalawy
- Department of Systems and Computers Engineering, Al-Azhar University, Cairo, Egypt.
|
37
|
Liao WWP, Arthur JW. Predicting peptide binding to Major Histocompatibility Complex molecules. Autoimmun Rev 2011; 10:469-73. [PMID: 21333759 DOI: 10.1016/j.autrev.2011.02.003] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 01/28/2011] [Accepted: 02/09/2011] [Indexed: 12/29/2022]
Abstract
The Major Histocompatibility Complex (MHC) constitutes an important part of the human immune system. During infection, pathogenic proteins are processed into peptide fragments by the antigen processing machinery. These peptides bind to MHC molecules and the MHC-peptide complex is then transported to the cell membrane where it elicits an immune response via T-cell binding. Understanding the molecular mechanism of this process will greatly assist in determining the aetiology of various diseases and in the design of effective drugs. One of the most challenging aspects of this area of research is understanding the specificity and sensitivity of the binding process. An empirical approach to the problem is unfeasible as there are over 512 billion potential binding peptides for each MHC molecule. Computational approaches offer the promise of predicting peptide binding, thus dramatically reducing the number of peptides proceeding to experimental verification. Various bioinformatic approaches have been developed to predict whether or not a particular peptide will bind to a particular MHC allele. Currently, peptide binding prediction methods can be categorised into three major groups: motif- and scoring matrix-based methods, artificial intelligence- (AI-) based methods, and structure-based methods. The first two are sequence-based approaches and are generally based on common sequence motifs in peptides known to bind to MHC molecules. The structure-based approach concerns the structural features and the distribution of energy between the binding peptide and the MHC molecule. Although knowledge of the molecular structure of the MHC molecules is expected to lead to better predictions of peptide binding, the development of structure-based methods has been relatively slow compared to sequence-based methods. Comparisons of various methods showed that the best sequence-based methods significantly outperform structure-based methods. 
This may be improved by producing more structures and binding data desperately needed by many alleles, especially class II molecules. On the other hand, the large number of verification methods and indicators used by structure-based studies hinders critical evaluation of the methods. Adopting commonly used assessment procedures can demonstrate the relative performance of structure-based methods in a straightforward comparison with other methods. This review provides an overview of current methods for predicting peptide binding to the MHC, with a focus on structure-based methods, and explores the potential for future development in this area.
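The figure of "over 512 billion potential binding peptides" quoted in this abstract matches the number of distinct 9-residue peptides that can be built from the 20 standard amino acids. The abstract does not show the derivation, so the peptide length of 9 in this short Python check is an assumption, chosen because it reproduces the quoted number.

```python
AMINO_ACIDS = 20      # standard amino acids
PEPTIDE_LENGTH = 9    # assumed length; a typical MHC class I ligand or class II binding core

# Each of the 9 positions can independently hold any of the 20 residues.
candidates = AMINO_ACIDS ** PEPTIDE_LENGTH
print(f"{candidates:,}")  # 512,000,000,000
```

Testing every candidate experimentally for even a single allele is therefore impractical, which is the motivation the abstract gives for computational pre-screening.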
Affiliation(s)
- Webber W P Liao
- Discipline of Medicine, Central Clinical School, University of Sydney, NSW, 2006, Australia
|
38
|
Shao X, Tan CSH, Voss C, Li SSC, Deng N, Bader GD. A regression framework incorporating quantitative and negative interaction data improves quantitative prediction of PDZ domain-peptide interaction from primary sequence. Bioinformatics 2010; 27:383-90. [PMID: 21127034 PMCID: PMC3031032 DOI: 10.1093/bioinformatics/btq657] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022]
Abstract
Motivation: Predicting protein interactions involving peptide recognition domains is essential for understanding the many important biological processes they mediate. It is important to consider the binding strength of these interactions to help us construct more biologically relevant protein interaction networks that consider cellular context and competition between potential binders. Results: We developed a novel regression framework that considers both positive (quantitative) and negative (qualitative) interaction data available for mouse PDZ domains to quantitatively predict interactions between PDZ domains, a large peptide recognition domain family, and their peptide ligands using primary sequence information. First, we show that it is possible to learn from existing quantitative and negative interaction data to infer the relative binding strength of interactions involving previously unseen PDZ domains and/or peptides given their primary sequence. Performance was measured using cross-validated hold out testing and testing with previously unseen PDZ domain–peptide interactions. Second, we find that incorporating negative data improves quantitative interaction prediction. Third, we show that sequence similarity is an important prediction performance determinant, which suggests that experimentally collecting additional quantitative interaction data for underrepresented PDZ domain subfamilies will improve prediction. Availability and Implementation: The Matlab code for our SemiSVR predictor and all data used here are available at http://baderlab.org/Data/PDZAffinity. Contact:gary.bader@utoronto.ca; dengnaiyang@cau.edu.cn Supplementary information:Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiaojian Shao
- Department of Applied Mathematics, College of Science, China Agricultural University, Beijing, 100083, China
|
39
|
King CA, Bradley P. Structure-based prediction of protein-peptide specificity in Rosetta. Proteins 2010; 78:3437-49. [PMID: 20954182 DOI: 10.1002/prot.22851] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 05/07/2010] [Revised: 07/16/2010] [Accepted: 07/28/2010] [Indexed: 01/03/2023]
Abstract
Protein-peptide interactions mediate many of the connections in intracellular signaling networks. A generalized computational framework for atomically precise modeling of protein-peptide specificity may allow for predicting molecular interactions, anticipating the effects of drugs and genetic mutations, and redesigning molecules for new interactions. We have developed an extensible, general algorithm for structure-based prediction of protein-peptide specificity as part of the Rosetta molecular modeling package. The algorithm is not restricted to any one peptide-binding domain family and, at minimum, does not require an experimentally characterized structure of the target protein nor any information about sequence specificity; although known structural data can be incorporated when available to improve performance. We demonstrate substantial success in specificity prediction across a diverse set of peptide-binding proteins, and show how performance is affected when incorporating varying degrees of input structural data. We also illustrate how structure-based approaches can provide atomic-level insight into mechanisms of peptide recognition and can predict the effects of point mutations on peptide specificity. Shortcomings and artifacts of our benchmark predictions are explained and limits on the generality of the method are explored. This work provides a promising foundation upon which further development of completely generalized, de novo prediction of peptide specificity may progress.
Affiliation(s)
- Christopher A King
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA.
|
40
|
Zhang W, Liu J, Niu Y. Quantitative prediction of MHC-II binding affinity using particle swarm optimization. Artif Intell Med 2010; 50:127-32. [PMID: 20541921 DOI: 10.1016/j.artmed.2010.05.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 04/21/2009] [Revised: 03/31/2010] [Accepted: 05/12/2010] [Indexed: 01/13/2023]
Abstract
OBJECTIVE Helper T-cell epitopes (Th epitopes) are the basic units which activate helper T-cell's immune response, and they are helpful for understanding the immune mechanism and developing vaccines. Peptide and major histocompatibility complex class II (MHC-II) binding is an important prerequisite event for helper T-cell immune response, and the binding peptides are usually recognized as Th epitopes, therefore we can identify Th epitopes by predicting MHC-II binding peptides. Recently, instead of differentiating the peptides as binder or non-binder, researchers are more interested in predicting binding affinities between MHC-II molecules and peptides. METHODOLOGY Motivated by the collective search strategy of the particle swarm optimization algorithm (PSO), a method was developed to make the direct prediction of peptide binding affinity. In our paper, PSO was utilized to search for the optimal position-specific scoring matrices (PSSM) from the experimentally derived allele-related peptides, and then the prediction models were constructed based on the matrices. Moreover, we evaluated several factors influencing the binding affinity, including peptide length and flanking residue length, and incorporated them into our models. RESULTS The performance of our models was evaluated on three MHC-II alleles from AntiJen database and 14 MHC-II alleles from IEDB database. When compared to the existing popular quantitative methods such as MHCPred, SVRMHC, ARB and SMM-align, our method can give out better performance in terms of correlation coefficient (r) and area under ROC curve (AUC). In addition, the results demonstrated that the performance of models was further improved by incorporating the global length information, achieving average AUC value of 0.7534 and average r value of 0.4707. CONCLUSIONS Quantitative prediction of MHC-II binding affinity can be modeled as an optimization problem. 
Our PSO based method can find the optimal PSSM, which will then be used for identifying the binding cores and scoring the binding affinities of the peptides. The experiment results show that our method is promising for the prediction of MHC-II binding affinity.
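The way a position-specific scoring matrix is used to locate a binding core, as described in this abstract, can be sketched in Python as a sliding-window search. This is only an illustration of the scoring step: the particle swarm search that the authors use to optimize the matrix is not reproduced, the 9-residue core length is the usual MHC-II convention rather than something stated here, and the toy matrix values and function names are invented.

```python
CORE_LENGTH = 9  # assumed binding-core length


def score_core(core, pssm):
    """Sum the matrix entries for each residue at its position; unknown residues score 0."""
    return sum(pssm[position].get(residue, 0.0) for position, residue in enumerate(core))


def best_core(peptide, pssm):
    """Slide a 9-residue window along the peptide and return (start, core, score) of the best one."""
    if len(peptide) < CORE_LENGTH:
        raise ValueError("peptide is shorter than the binding core")
    windows = [
        (start, peptide[start:start + CORE_LENGTH])
        for start in range(len(peptide) - CORE_LENGTH + 1)
    ]
    start, core = max(windows, key=lambda window: score_core(window[1], pssm))
    return start, core, score_core(core, pssm)


# Toy matrix with invented weights: one dict of residue scores per core position.
toy_pssm = [{} for _ in range(CORE_LENGTH)]
toy_pssm[0] = {"F": 2.0, "Y": 1.5}  # preference at core position 1
toy_pssm[5] = {"S": 1.0}            # preference at core position 6

print(best_core("AAFKLMNSAGGG", toy_pssm))  # (2, 'FKLMNSAGG', 3.0)
```

In the toy example the window starting at index 2 places F at position 1 and S at position 6, so it outscores the other three windows; the residues outside the chosen window are the flanking residues whose length the abstract mentions as an additional model input.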
Affiliation(s)
- Wen Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, People's Republic of China.
|
41
|
Ilias Basha H, Tiriveedhi V, Fleming TP, Gillanders WE, Mohanakumar T. Identification of immunodominant HLA-B7-restricted CD8+ cytotoxic T cell epitopes derived from mammaglobin-A expressed on human breast cancers. Breast Cancer Res Treat 2010; 127:81-9. [PMID: 20544273 DOI: 10.1007/s10549-010-0975-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 04/16/2010] [Accepted: 05/29/2010] [Indexed: 01/07/2023]
Abstract
Mammaglobin-A (MGBA), a 10-kD protein, is overexpressed in 80% of primary and metastatic human breast cancers. Breast cancer patients demonstrate high frequencies of CD8(+) cytotoxic T lymphocytes (CTL) specific to MGBA. Defining CD8(+) CTL responses to HLA class I-restricted MGBA-derived epitopes assumes significance in the context of our ongoing efforts to clinically translate vaccine strategies targeting MGBA for prevention and/or treatment of human breast cancers. In this study, we define the CD8(+) CTL response to MGBA-derived candidate epitopes presented in the context of HLA-B7, which has a frequency of 17.7% in Caucasian and 15.5% in African American populations. We identified seven MGBA-derived candidate epitopes with high predicted binding scores for HLA-B7 using a computer algorithm. Membrane stabilization studies with TAP-deficient T2 cells transfected with HLA-B7 indicated that MGBA B7.3 (VSKTEYKEL), B7.6 (KLLMVLMLA), B7.7 (NPQVSKTEY), and B7.1 (YAGSGCPLL) have the highest HLA-B7 binding affinities. Further, two CD8(+) CTL cell lines generated in vitro against T2.B7 cells individually loaded with MGBA-derived candidate epitopes showed significant cytotoxic activity against MGBA B7.1, B7.3, B7.6, and B7.7. In addition, the same CD8(+) CTL lines lysed the HLA-B7(+)/MGBA(+) human breast cancer cell line DU-4475 but had no significant cytotoxicity against HLA-B7(-) or MGBA(-) breast cancer cell lines. Cold-target inhibition studies strongly suggest that MGBA B7.3 is an immunodominant epitope. In summary, our results define HLA-B7-restricted, MGBA-derived, CD8(+) CTL epitopes with all of the necessary features for developing novel vaccine strategies for HLA-B7-expressing breast cancer patients.
Affiliation(s)
- Haseeb Ilias Basha
- Department of Surgery, Washington University School of Medicine, Box 8109, 3328 CSRB, 660 South Euclid Avenue, Saint Louis, MO 63110, USA
|
42
|
Innovative bioinformatic approaches for developing peptide-based vaccines against hypervariable viruses. Immunol Cell Biol 2010; 89:81-9. [PMID: 20458336 DOI: 10.1038/icb.2010.65] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 12/16/2022]
Abstract
The application of the fields of pharmacogenomics and pharmacogenetics to vaccine design has been recently labeled 'vaccinomics'. This newly named area of vaccine research, heavily intertwined with bioinformatics, seems to be leading the charge in developing novel vaccines for currently unmet medical needs against hypervariable viruses such as human immunodeficiency virus (HIV), hepatitis C and emerging avian and swine influenza. Some of the more recent bioinformatic approaches in the area of vaccine research include the use of epitope determination and prediction algorithms for exploring the use of peptide epitopes as vaccine immunogens. This paper briefly discusses and explores some current uses of bioinformatics in vaccine design toward the pursuit of peptide vaccines for hypervariable viruses. The various informatics and vaccine design strategies attempted by other groups toward hypervariable viruses will also be briefly examined, along with the strategy used by our group in the design and synthesis of peptide immunogens for candidate HIV and influenza vaccines.
|
43
|
Liu J, Li QJ, Zhang W. A novel Locally Linear Embedding and Wavelet Transform based encoding method for prediction of MHC-II binding affinity. Interdiscip Sci 2010; 2:145-50. [PMID: 20640782 DOI: 10.1007/s12539-010-0075-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/01/2009] [Revised: 10/19/2009] [Accepted: 10/24/2009] [Indexed: 11/29/2022]
Abstract
The binding between peptides and MHC molecules is an important event in cellular immunity against pathogens. The binding peptides are recognized as epitopes, which are useful for epitope-based vaccine design. Accurate prediction of MHC-II binding peptides has long been a challenge in bioinformatics. Recently, most researchers have become interested in predicting binding affinity instead of categorizing peptides as "binders" or "non-binders". In this paper, we introduced a novel encoding scheme based on Locally Linear Embedding (LLE) and Wavelet Transform (WT), in which important amino acid properties were first selected from all properties described in the AAindex database by using LLE, and the amino acids of the peptides were then replaced with these novel properties. Further, WT was applied to extract the frequency attributes of the resulting numerical sequences, so that the peptides were transformed into vectors of uniform length. Finally, Support Vector Regression (SVR) was used to build quantitative prediction models from these numerical vectors. When applied to the 16 datasets from the IEDB database, our encoding scheme consistently outperformed other encoding schemes, indicating that it is an effective tool for the prediction of MHC-II binding affinity.
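The idea of turning variable-length peptides into fixed-length numeric vectors with a wavelet transform can be sketched as below. This is only an illustration under several assumptions, not the encoding from the paper: a single Kyte-Doolittle hydropathy value per residue stands in for the LLE-derived properties, the signal is zero-padded to a fixed length, an orthonormal Haar transform is used, and only the coarsest coefficients are kept. The names `haar_dwt` and `encode_peptide` and the parameters `padded_len` and `n_keep` are hypothetical.

```python
import math
from typing import List

# Kyte-Doolittle hydropathy scale, used here as a stand-in residue property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_dwt(signal: List[float]) -> List[float]:
    """Full orthonormal Haar decomposition of a power-of-two-length signal.

    Output layout: [approximation, coarsest detail, ..., finest details].
    """
    coeffs = list(signal)
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    root2 = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / root2 for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / root2 for i in range(half)]
        coeffs[:n] = approx + detail
        n = half
    return coeffs


def encode_peptide(peptide: str, padded_len: int = 32, n_keep: int = 8) -> List[float]:
    """Map residues to numbers, zero-pad, transform, and keep the coarsest coefficients."""
    if len(peptide) > padded_len:
        raise ValueError("peptide is longer than the padded length")
    values = [HYDROPATHY[residue] for residue in peptide]
    values += [0.0] * (padded_len - len(values))
    return haar_dwt(values)[:n_keep]


# Two peptides of different lengths yield vectors of the same length.
print(len(encode_peptide("KLLMVLMLA")), len(encode_peptide("AAFAAAAAAALAA")))
```

Vectors produced this way could then be passed to any regression model; the paper uses SVR for that step.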
Affiliation(s)
- Juan Liu
- School of Computer Science, Wuhan University, Wuhan, China.
|
44
|
McNamara LA, He Y, Yang Z. Using epitope predictions to evaluate efficacy and population coverage of the Mtb72f vaccine for tuberculosis. BMC Immunol 2010; 11:18. [PMID: 20353587 PMCID: PMC2862017 DOI: 10.1186/1471-2172-11-18] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 11/13/2009] [Accepted: 03/30/2010] [Indexed: 03/05/2023] Open
Abstract
Background The Mtb72f subunit vaccine for tuberculosis, currently in clinical trials, is hoped to provide improved protection compared to the current BCG vaccine. It is not clear, however, whether Mtb72f would be equally protective in the different human populations suffering from a high burden of tuberculosis. Previous work by Hebert and colleagues demonstrated that the PPE18 protein of Mtb72f had significant variability in a sample of clinical M. tuberculosis isolates. However, whether this variation might impact the efficacy of Mtb72f in the context of the microbial and host immune system interactions remained to be determined. The present study assesses Mtb72f's predicted efficacy in people with different DRB1 genotypes to predict whether the vaccine will protect against diverse clinical strains of M. tuberculosis in a diverse host population. Results We evaluated the binding of epitopes in the vaccine to different alleles of the human DRB1 Class II MHC protein using freely available epitope prediction programs and compared protein sequences from clinical isolates to the sequences included in the Mtb72f vaccine. This analysis predicted that the Mtb72f vaccine would be less effective for several DRB1 genotypes, due either to limited vaccine epitope binding to the DRB1 proteins or to binding primarily by unconserved PPE18 epitopes. Furthermore, we found that these less-protective DRB1 alleles are found at a very high frequency in several populations with a high burden of tuberculosis. Conclusion Although the Mtb72f vaccine candidate has shown promise in animal and clinical trials thus far, it may not be optimally effective in some genotypic backgrounds. Due to variation in both M. tuberculosis protein sequences and epitope-binding capabilities of different HLA alleles, certain human populations with a high burden of tuberculosis may not be optimally protected by the Mtb72f vaccine. 
The efficacy of the Mtb72f vaccine should be further examined in these particular populations to determine whether additional protective measures might be necessary for these regions.
Affiliation(s)
- Lucy A McNamara
- Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109, USA
|
45
|
|
46
|
Song J, Tan H, Shen H, Mahmood K, Boyd SE, Webb GI, Akutsu T, Whisstock JC. Cascleave: towards more accurate prediction of caspase substrate cleavage sites. Bioinformatics 2010; 26:752-60. [PMID: 20130033 DOI: 10.1093/bioinformatics/btq043] [Citation(s) in RCA: 132] [Impact Index Per Article: 9.4] [Indexed: 01/21/2023]
Abstract
MOTIVATION The caspase family of cysteine proteases plays essential roles in key biological processes such as programmed cell death, differentiation, proliferation, necrosis and inflammation. The complete repertoire of caspase substrates remains to be fully characterized. Accordingly, systematic computational screening studies of caspase substrate cleavage sites may provide insight into the substrate specificity of caspases and further facilitate the discovery of putative novel substrates. RESULTS In this article we develop an approach (termed Cascleave) to predict both classical (i.e. following a P(1) Asp) and non-typical caspase cleavage sites. When using local sequence-derived profiles, Cascleave successfully predicted 82.2% of the known substrate cleavage sites, with a Matthews correlation coefficient (MCC) of 0.667. We found that prediction performance could be further improved by incorporating information such as predicted solvent accessibility and whether a cleavage sequence lies in a region that is most likely natively unstructured. Novel bi-profile Bayesian signatures were found to significantly improve the prediction performance and yielded the best performance, with an overall accuracy of 87.6% and an MCC of 0.747, which is higher than that of published methods that essentially rely on amino acid sequence alone. It is anticipated that Cascleave will be a powerful tool for predicting novel substrate cleavage sites of caspases and for shedding new light on the unknown caspase-substrate interactivity relationship. AVAILABILITY http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/Cascleave/ CONTACT jiangning.song@med.monash.edu.au; takutsu@kuicr.kyoto-u.ac.jp; james.whisstock@med.monash.edu.au SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
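For readers unfamiliar with the two metrics quoted in this abstract, the sketch below computes accuracy and the Matthews correlation coefficient from a binary confusion matrix using the standard definitions. The counts in the example are hypothetical and are not taken from the Cascleave paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 40 true cleavage sites found, 10 missed,
# 45 non-sites correctly rejected, 5 false alarms.
print(accuracy(40, 45, 5, 10))
print(round(matthews_corrcoef(40, 45, 5, 10), 3))
```

Unlike accuracy, MCC stays informative when cleavage sites are much rarer than non-sites, which is why the two are usually reported together.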
Affiliation(s)
- Jiangning Song
- Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
|
47
|
Chen G, Zuo Z, Zhu Q, Hong A, Zhou X, Gao X, Li T. Qualitative and quantitative analysis of peptide microarray binding experiments using SVM-PEPARRAY. Methods Mol Biol 2010; 570:403-11. [PMID: 19649609 DOI: 10.1007/978-1-60327-394-7_23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/16/2023]
Abstract
A main objective of analyzing peptide array-based binding experiments is to uncover the relationship between a peptide sequence and the binding outcome. Limited by the peptide array technologies available for applications, few attempts have been made to construct qualitative or quantitative models that depict the relationship between peptide sequence and binding strength in peptide microarray-based binding studies. There is, however, a long history of similar modeling efforts based on low-throughput binding data in the areas of T-cell epitope screening and kinase substrate mapping. The pressing need in peptide array applications and the success of modeling efforts in related fields have prompted us to develop SVM-PEPARRAY, a Web-based program capable of constructing qualitative and quantitative models from peptide microarray binding datasets using support vector machine (SVM) modeling methods. We expect that such modeling analysis will allow researchers to quickly extract sequence-based biological information from improved peptide array binding results and provide more precise and accurate information about the biological systems investigated.
Affiliation(s)
- Gang Chen
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
|
48
|
Song J, Tan H, Mahmood K, Law RHP, Buckle AM, Webb GI, Akutsu T, Whisstock JC. Prodepth: predict residue depth by support vector regression approach from protein sequences only. PLoS One 2009; 4:e7072. [PMID: 19759917 PMCID: PMC2742725 DOI: 10.1371/journal.pone.0007072] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 04/14/2009] [Accepted: 08/20/2009] [Indexed: 11/24/2022] Open
Abstract
Residue depth (RD) is a solvent exposure measure that complements the information provided by conventional accessible surface area (ASA) and describes to what extent a residue is buried in the protein structure space. Previous studies have established that RD is correlated with several protein properties, such as protein stability, residue conservation and amino acid types. Accurate prediction of RD has many potentially important applications in the field of structural bioinformatics, for example, facilitating the identification of functionally important residues, or residues in the folding nucleus, or enzyme active sites from sequence information. In this work, we introduce an efficient approach that uses support vector regression to quantify the relationship between RD and protein sequence. We systematically investigated eight different sequence encoding schemes including both local and global sequence characteristics and examined their respective prediction performances. For the objective evaluation of our approach, we used 5-fold cross-validation to assess the prediction accuracies and showed that the overall best performance could be achieved with a correlation coefficient (CC) of 0.71 between the observed and predicted RD values and a root mean square error (RMSE) of 1.74, after incorporating the relevant multiple sequence features. The results suggest that residue depth could be reliably predicted solely from protein primary sequences: local sequence environments are the major determinants, while global sequence features could influence the prediction performance marginally. We highlight two examples as a comparison in order to illustrate the applicability of this approach. We also discuss the potential implications of this new structural parameter in the field of protein structure prediction and homology modeling. This method might prove to be a powerful tool for sequence analysis.
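The evaluation protocol described above (5-fold cross-validation scored by correlation coefficient and RMSE) can be sketched as follows. This is a generic illustration, not the Prodepth code: the function names are hypothetical, the folds are contiguous and unshuffled for simplicity, and the regression model itself is left out.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def pearson_cc(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_splits(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    start = 0
    for fold in range(k):
        # Spread any remainder over the first folds so sizes differ by at most one.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Each sample appears in exactly one test fold.
for train, test in k_fold_splits(10, k=5):
    print(test)
```

In practice a model would be trained on each `train` set and its predictions on the matching `test` set pooled before computing `pearson_cc` and `rmse`; shuffling the indices first is usually advisable when samples are ordered.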
Affiliation(s)
- Jiangning Song
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
- * E-mail: (JS); (JCW)
- Hao Tan
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Khalid Mahmood
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- ARC Centre of Excellence for Structural and Functional Microbial Genomics, Monash University, Clayton, Melbourne, Victoria, Australia
- Ruby H. P. Law
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Ashley M. Buckle
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Geoffrey I. Webb
- Faculty of Information Technology, Monash University, Clayton, Melbourne, Victoria, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
- James C. Whisstock
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- ARC Centre of Excellence for Structural and Functional Microbial Genomics, Monash University, Clayton, Melbourne, Victoria, Australia
- * E-mail: (JS); (JCW)
|
49
|
Gaussian process: an alternative approach for QSAM modeling of peptides. Amino Acids 2009; 38:199-212. [DOI: 10.1007/s00726-008-0228-1] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 08/01/2008] [Accepted: 12/18/2008] [Indexed: 10/21/2022]
|
50
|
Davies MN, Flower DR. Computational Vaccinology. Bioinformatics for Immunomics 2009. [PMCID: PMC7121138 DOI: 10.1007/978-1-4419-0540-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|