1
Hu X, Li J, Liu T. Alg-MFDL: A multi-feature deep learning framework for allergenic proteins prediction. Anal Biochem 2025; 697:115701. [PMID: 39481588 DOI: 10.1016/j.ab.2024.115701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Revised: 10/26/2024] [Accepted: 10/28/2024] [Indexed: 11/02/2024]
Abstract
The escalating global incidence of allergy illustrates the growing impact of allergic disease on global health. Allergens are small molecule antigens that trigger allergic reactions. A widely recognized strategy for allergy prevention involves identifying allergens and avoiding re-exposure. However, laboratory methods for identifying allergenic proteins are often time-consuming and resource-intensive. There is a crucial need to establish efficient and reliable computational approaches for the identification of allergenic proteins. In this study, we developed a novel allergenic protein predictor named Alg-MFDL, which integrates pre-trained protein language models (PLMs) and traditional handcrafted features to achieve a more complete protein representation. First, we compared the performance of eight pre-trained PLMs from ProtTrans and ESM-2 and selected the best-performing one from each of the two groups. In addition, we evaluated the performance of three handcrafted features and different combinations of them to select the optimal feature or feature combination. Then, these three protein representations were fused and used as inputs to train a convolutional neural network (CNN). Finally, independent validation was performed on benchmark datasets to evaluate the performance of Alg-MFDL. As a result, Alg-MFDL achieved an accuracy of 0.973, a precision of 0.996, a sensitivity of 0.951, and an F1 value of 0.973, outperforming most current state-of-the-art (SOTA) methods across all key metrics. We anticipate that the proposed model will be a useful tool for predicting allergenic proteins.
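The reported F1 value can be checked against the reported precision and sensitivity, since F1 is the harmonic mean of the two. The sketch below is only an arithmetic check on the figures quoted in the abstract; the function name is illustrative and nothing here reproduces the authors' code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (recall is also called sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract above.
reported_precision = 0.996
reported_sensitivity = 0.951

f1 = f1_from_precision_recall(reported_precision, reported_sensitivity)
print(round(f1, 3))  # 0.973, matching the reported F1 value
```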
Affiliation(s)
- Xiang Hu
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
- Jingyi Li
  - AIEN Institute, Shanghai Ocean University, Shanghai, 201306, China
- Taigang Liu
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China.
2
Kumar SD, Park J, Radhakrishnan NK, Aryal YP, Jeong GH, Pyo IH, Ganbaatar B, Lee CW, Yang S, Shin Y, Subramaniyam S, Lim YJ, Kim SH, Lee S, Shin SY, Cho SJ. Novel Leech Antimicrobial Peptides, Hirunipins: Real-Time 3D Monitoring of Antimicrobial and Antibiofilm Mechanisms Using Optical Diffraction Tomography. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2025:e2409803. [PMID: 39792785 DOI: 10.1002/advs.202409803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2024] [Revised: 12/13/2024] [Indexed: 01/12/2025]
Abstract
Antimicrobial peptides (AMPs) are promising agents for treating antibiotic-resistant bacterial infections. Although discovering novel AMPs is crucial for combating multidrug-resistant bacteria and biofilm-related infections, their clinical potential relies on precise, real-time evaluation of efficacy, toxicity, and mechanisms. Optical diffraction tomography (ODT), a label-free imaging technology, enables real-time visualization of bacterial morphological changes, membrane damage, and biofilm formation over time. Here, a computational analysis of the leech transcriptome, using an advanced AI-based peptide screening strategy together with ODT, is employed to identify potential AMPs. Among the 19 potential AMPs identified, hirunipin 2 demonstrates potent antibacterial activity, low mammalian cytotoxicity, and minimal hemolytic effects. It demonstrates efficacy comparable to melittin, resistance to physiological salts and human serum, and a low likelihood of inducing bacterial resistance. Microscopy and 3D-ODT confirm its disruption of bacterial membranes and intracellular aggregation, leading to cell death. Notably, hirunipin 2 effectively inhibits biofilm formation, eradicates preformed biofilms, and synergizes with antibiotics against multidrug-resistant Acinetobacter baumannii (MDRAB) by enhancing membrane permeability. Additionally, hirunipin 2 significantly suppresses pro-inflammatory cytokine expression in LPS-stimulated macrophages, highlighting its anti-inflammatory properties. These findings highlight hirunipin 2 as a strong candidate for developing novel antibacterial, anti-inflammatory, and antibiofilm therapies, particularly against multidrug-resistant bacterial infections.
Affiliation(s)
- S Dinesh Kumar
  - Department of Cellular & Molecular Medicine, School of Medicine, Chosun University, Gwangju, 61452, Republic of Korea
- Jeongwon Park
  - Gwangju Center, Korea Basic Science Institute (KBSI), Gwangju, 61751, Republic of Korea
  - Department of Animal Science, Chonnam National University, Gwangju, 61186, South Korea
- Naveen Kumar Radhakrishnan
  - Department of Biomedical Sciences, Graduate School, Chosun University, Gwangju, 61452, Republic of Korea
- Yam Prasad Aryal
  - Department of Biological Sciences and Biotechnology, College of Natural Sciences, Chungbuk National University, Cheongju, Chungbuk, 28644, Republic of Korea
- Geon-Hwi Jeong
  - Department of Biological Sciences and Biotechnology, College of Natural Sciences, Chungbuk National University, Cheongju, Chungbuk, 28644, Republic of Korea
- In-Hyeok Pyo
  - Department of Biological Sciences and Biotechnology, College of Natural Sciences, Chungbuk National University, Cheongju, Chungbuk, 28644, Republic of Korea
- Byambasuren Ganbaatar
  - Department of Chemistry, Chonnam National University, Gwangju, 61186, Republic of Korea
- Chul Won Lee
  - Department of Chemistry, Chonnam National University, Gwangju, 61186, Republic of Korea
- Sungtae Yang
  - Institute of Well-Aging Medicare & CSU G-LAMP Project Group, Chosun University, Gwangju, 61452, Republic of Korea
- Younhee Shin
  - Research and Development Center, Insilicogen Inc, Yongin-si, Gyeonggi-do, 16954, Republic of Korea
- Yu-Jin Lim
  - Research and Development Center, Insilicogen Inc, Yongin-si, Gyeonggi-do, 16954, Republic of Korea
- Sung-Hak Kim
  - Department of Animal Science, Chonnam National University, Gwangju, 61186, South Korea
- Seongsoo Lee
  - Gwangju Center, Korea Basic Science Institute (KBSI), Gwangju, 61751, Republic of Korea
  - Department of Bio-Analysis Science, University of Science & Technology, Daejeon, 34113, Republic of Korea
  - Department of Systems Biotechnology, Chung-Ang University, Anseong, 17546, Republic of Korea
  - Department of Life Science, Hanyang University, Seoul, 04763, Republic of Korea
- Song Yub Shin
  - Department of Cellular & Molecular Medicine, School of Medicine, Chosun University, Gwangju, 61452, Republic of Korea
- Sung-Jin Cho
  - Department of Biological Sciences and Biotechnology, College of Natural Sciences, Chungbuk National University, Cheongju, Chungbuk, 28644, Republic of Korea
3
Zhang L, Liu T. PreAlgPro: Prediction of allergenic proteins with pre-trained protein language model and efficient neutral network. Int J Biol Macromol 2024; 280:135762. [PMID: 39322150 DOI: 10.1016/j.ijbiomac.2024.135762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 09/03/2024] [Accepted: 09/16/2024] [Indexed: 09/27/2024]
Abstract
Allergy is a prevalent phenomenon, involving allergens such as nuts and milk. Avoiding exposure to allergens is the most effective preventive measure against allergic reactions. However, current homology-based methods for identifying allergenic proteins encounter challenges when dealing with non-homologous data. Traditional machine learning approaches rely on manually extracted features, which lack important protein functional characteristics, including evolutionary information. Consequently, there is still considerable room for improvement in existing methods. In this study, we present PreAlgPro, a method for identifying allergenic proteins based on pre-trained protein language models and deep learning techniques. Specifically, we employed the ProtT5 model to extract protein embedding features, replacing the manual feature extraction step. Furthermore, we devised an Attention-CNN neural network architecture to identify potential features that contribute to the classification of allergenic proteins. The performance of our model was evaluated on four independent test sets, and the experimental results demonstrate that PreAlgPro surpasses existing state-of-the-art methods. Additionally, we collected allergenic protein samples to validate the robustness of the model and conducted an analysis of model interpretability.
Affiliation(s)
- Lingrong Zhang
  - College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
- Taigang Liu
  - College of Information Technology, Shanghai Ocean University, Shanghai 201306, China.
4
Rodríguez Longarela N, Paredes Ramos M, López Vilariño JM. Bioinformatics tools for the study of bioactive peptides from vegetal sources: evolution and future perspectives. Crit Rev Food Sci Nutr 2024:1-20. [PMID: 38907628 DOI: 10.1080/10408398.2024.2367571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
Bioactive peptides from vegetal sources have been shown to have functional properties such as anti-inflammatory, antioxidant, antihypertensive, or antidiabetic capacity. For this reason, they have been proposed as an interesting and promising alternative to improve human health. In recent years, numerous advances in bioinformatics for in silico prediction have sped up the discovery of bioactive peptides and reduced the associated costs when classical and bioinformatics discovery are used in an integrated approach. This review aims to provide an overview of the evolution, limitations, and latest advances in the field of bioinformatics and computational tools, and specifically to offer a critical and comprehensive insight into the computational techniques used to study the interaction mechanisms that explain plant bioactive peptide functionality. In particular, molecular docking is considered key to explaining the different functionalities that have been previously identified. The assumptions needed to simplify such a highly complex environment imply a degree of uncertainty, so results can only be confirmed and validated by in vitro or in vivo studies; however, the combination of databases, software, and bioinformatics applications with the classical approach has become a promising procedure for the study of bioactive peptides.
5
Du Z, Xu Y, Liu C, Li Y. pLM4Alg: Protein Language Model-Based Predictors for Allergenic Proteins and Peptides. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2024; 72:752-760. [PMID: 38113537 DOI: 10.1021/acs.jafc.3c07143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]
Abstract
The rising prevalence of allergy demands efficient and accurate bioinformatic tools to expedite allergen identification and risk assessment while also reducing the expense and time of wet-lab experiments. Recently, pretrained protein language models (pLMs) have successfully predicted protein structure and function. However, to the best of our knowledge, they have not been used for predicting allergenic proteins/peptides. Therefore, this study aims to develop robust models for allergenic protein/peptide prediction using five pLMs of varying sizes and systematically assess their performance through fine-tuning with a convolutional neural network. The developed pLM4Alg models have achieved state-of-the-art performance, with accuracy, Matthews correlation coefficient, and area under the curve of 93.4-95.1%, 0.869-0.902, and 0.981-0.990, respectively. Moreover, pLM4Alg is the first model capable of handling prediction tasks involving residue-missed sequences and sequences containing nonstandard amino acid residues. To facilitate easy access, a user-friendly web server (https://f6wxpfd3sh.us-east-1.awsapprunner.com) has been established. pLM4Alg is expected to become the leading machine learning-based prediction model for allergenic peptides and proteins. Its collaboration with other predictors holds great promise for accelerating allergy research.
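For readers less familiar with the Matthews correlation coefficient (MCC) quoted alongside accuracy, the sketch below computes both from a binary confusion matrix using the standard definitions. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the pLM4Alg study.

```python
import math


def accuracy_and_mcc(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (accuracy, MCC) for a binary confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts: 100 allergens and 100 non-allergens, 10 errors in each class.
acc, mcc = accuracy_and_mcc(tp=90, fp=10, tn=90, fn=10)
print(acc, mcc)
```

Unlike accuracy, MCC stays informative when the two classes are imbalanced, which is one reason it is commonly reported next to accuracy and AUC.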
Affiliation(s)
- Zhenjiao Du
  - Department of Grain Science and Industry, Kansas State University, Manhattan, Kansas 66506, United States
- Yixiang Xu
  - Healthy Processed Foods Research Unit, Western Regional Research Center, USDA-ARS, Albany, California 94710, United States
- Changqi Liu
  - School of Exercise and Nutritional Sciences, San Diego State University, San Diego, California 92182, United States
- Yonghui Li
  - Department of Grain Science and Industry, Kansas State University, Manhattan, Kansas 66506, United States
6
Li Y, Sackett PW, Nielsen M, Barra C. NetAllergen, a random forest model integrating MHC-II presentation propensity for improved allergenicity prediction. BIOINFORMATICS ADVANCES 2023; 3:vbad151. [PMID: 37901344 PMCID: PMC10603389 DOI: 10.1093/bioadv/vbad151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Revised: 09/28/2023] [Accepted: 10/13/2023] [Indexed: 10/31/2023]
Abstract
Motivation: Allergy is a pathological immune reaction towards innocuous protein antigens. Although only a narrow fraction of plant or animal proteins induce allergy, atopic disorders affect millions of children and adults and cost billions in healthcare systems worldwide. In silico predictors can aid in the development of more innocuous food sources. Previous allergenicity predictors used sequence similarity, common structural domains, and amino acid physicochemical features. However, these predictors strongly rely on sequence similarity to known allergens and fail to predict protein allergenicity accurately when similarity diminishes. Results: To overcome these limitations, we collected allergens from AllergenOnline, a curated database of IgE-inducing allergens, carefully removed allergen redundancy with a novel protein partitioning pipeline, and developed a new allergen prediction method, introducing MHC presentation propensity as a novel feature. NetAllergen outperformed a sequence similarity-based BLAST baseline approach, and previous allergenicity predictor AlgPred 2 when similarity to known allergens is limited. Availability and implementation: The web service NetAllergen and the datasets are available at https://services.healthtech.dtu.dk/services/NetAllergen-1.0/.
Affiliation(s)
- Yuchen Li
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Copenhagen 2800, Denmark
- Peter Wad Sackett
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Copenhagen 2800, Denmark
- Morten Nielsen
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Copenhagen 2800, Denmark
  - Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, San Martin 1650, Argentina
- Carolina Barra
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Copenhagen 2800, Denmark
7
Mason K, Davies J, Ruutu M. Immunoglobulin E-specific allergens against leaf in serum of dogs with clinical features of grass leaf allergy. Vet Dermatol 2023; 34:393-403. [PMID: 37190989 DOI: 10.1111/vde.13166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 08/05/2022] [Accepted: 02/19/2023] [Indexed: 05/17/2023]
Abstract
BACKGROUND Grass leaf has been suspected of causing immunoglobulin (Ig)E-mediated immediate hypersensitivity reactions in humans and dogs. However, most studies in this area are case-control studies without in vitro data showing the involvement of IgE in the reaction. Laboratory studies have demonstrated the reactivity to a 50-55 kDa protein with clinical signs immediately after contact with grass leaf material. The clinical findings of dogs with atopic-like dermatitis immediately after contact with grass leaf material suggest the involvement of grass leaves as the allergen source. OBJECTIVES This study was designed to test the IgE-reactivity of grass leaf proteins in dogs with clinical signs and positive scratch test results against grass leaf material. MATERIALS AND METHODS The serum of 41 patients with a history of allergy and suspected to grass leaf material was immunoblotted against grass leaf extracts from five suspected grass species. The IgE-positive blots were separated with 2D gel electrophoresis and analysed with mass spectrometry (MS). Commercially supplied proteins were used to validate immunoblot activity. RESULTS The serum of 25 dogs diagnosed with grass dermatitis had positive IgE-specific immunoblot against one or more grass leaf extracts. The MS data indicated a reactive band at 55 kDa to be beta-amylase or RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) large subunit (RbLS). All tested dog sera showed IgE-reactivity with beta-amylase and some with RbLS. CONCLUSIONS AND CLINICAL RELEVANCE Canines with clinical signs of grass-related dermatitis had IgE-reactivity against grass leaf proteins. Serum IgE-reactivity to beta-amylase and RuBisCO large subunit may indicate that these proteins act as allergens, possibly causing pruritus and skin lesions.
Affiliation(s)
- Ken Mason
  - Animal Allergy and Dermatology Service, Slacks Creek, Queensland, Australia
  - School of Veterinary Science, University of Queensland, Gatton, Queensland, Australia
- Janet Davies
  - School of Biomedical Sciences, Queensland University of Technology, South Brisbane, Queensland, Australia
- Merja Ruutu
  - Animal Allergy and Dermatology Service, Slacks Creek, Queensland, Australia
  - School of Veterinary Science, University of Queensland, Gatton, Queensland, Australia
8
Goto K, Tamehiro N, Yoshida T, Hanada H, Sakuma T, Adachi R, Kondo K, Takeuchi I. Novel Machine Learning Method AllerStat Identifies Statistically Significant Allergen-Specific Patterns in Protein Sequences. J Biol Chem 2023; 299:104733. [PMID: 37086787 DOI: 10.1016/j.jbc.2023.104733] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2022] [Revised: 04/12/2023] [Accepted: 04/18/2023] [Indexed: 04/24/2023]
Abstract
Cutting-edge technologies such as genome editing and synthetic biology allow us to produce novel foods and functional proteins. However, their toxicity and allergenicity must be accurately evaluated. It is known that specific amino acid sequences make some proteins allergenic, but many of these sequences remain uncharacterized. In this study, we introduce a data-driven approach and a machine-learning (ML) method to find undiscovered allergen-specific patterns (ASPs) among amino acid sequences. The proposed method enables an exhaustive search for amino acid subsequences whose frequencies are statistically significantly higher in allergenic proteins. As a proof of concept, we created a database containing 21,154 proteins for which the presence or absence of allergic reactions is already known, and applied the proposed method to the database. The ASPs detected in this proof-of-concept study were consistent with known biological findings, and the allergenicity prediction performance using the detected ASPs was higher than that of extant approaches, indicating that this method may be useful in evaluating the utility of synthetic foods and proteins.
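The idea of searching for subsequences that occur significantly more often in allergenic proteins can be illustrated with a deliberately simplified stand-in: enumerate fixed-length k-mers and apply a one-sided Fisher's exact test to each. This is not the AllerStat algorithm; the sequences are toy data, the function names are hypothetical, and no multiple-testing correction is applied, which a real exhaustive search would require.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) under the hypergeometric null for the 2x2 table [[a, b], [c, d]].

    Rows: allergenic / non-allergenic. Columns: contains k-mer / does not.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x)
               for x in range(a, min(row1, col1) + 1))
    return tail / comb(n, col1)


def enriched_kmers(allergens, non_allergens, k=3, alpha=0.05):
    """Return {k-mer: p-value} for k-mers over-represented among allergens."""
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    pos_sets = [kmers(s) for s in allergens]
    neg_sets = [kmers(s) for s in non_allergens]
    hits = {}
    for kmer in sorted(set().union(*pos_sets)):
        a = sum(kmer in s for s in pos_sets)   # allergens containing the k-mer
        c = sum(kmer in s for s in neg_sets)   # non-allergens containing it
        p = fisher_one_sided(a, len(pos_sets) - a, c, len(neg_sets) - c)
        if p < alpha:
            hits[kmer] = p
    return hits


# Toy sequences (hypothetical): "KVL" appears in every "allergen" and in no control.
toy_allergens = ["MKVLAAG", "GKVLTTS", "AKVLPQR", "TTKVLMM"]
toy_controls = ["MSTNPQR", "GGHHTTS", "PQRSTNA", "LLMMNNA"]
hits = enriched_kmers(toy_allergens, toy_controls)
print(hits)
```

With four sequences per class, only a pattern present in all allergens and absent from all controls reaches p < 0.05 (p = 1/70), which shows why large databases are needed for this kind of search.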
Affiliation(s)
- Kento Goto
  - Department of Computer Science, Nagoya Institute of Technology. Gokiso-cho, Showa-ku, Nagoya, Aichi, 466-8555, Japan
- Norimasa Tamehiro
  - Division of Biochemistry, National Institute of Health Sciences. 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa, 210-9501, Japan
- Takumi Yoshida
  - Department of Computer Science, Nagoya Institute of Technology. Gokiso-cho, Showa-ku, Nagoya, Aichi, 466-8555, Japan
- Hiroyuki Hanada
  - Center for Advanced Intelligence Project, RIKEN. 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan
- Takuto Sakuma
  - Department of Computer Science, Nagoya Institute of Technology. Gokiso-cho, Showa-ku, Nagoya, Aichi, 466-8555, Japan
- Reiko Adachi
  - Division of Biochemistry, National Institute of Health Sciences. 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa, 210-9501, Japan
- Kazunari Kondo
  - Division of Biochemistry, National Institute of Health Sciences. 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa, 210-9501, Japan.
- Ichiro Takeuchi
  - Graduate School of Engineering, Nagoya University. Furo-cho, Chikusa-ku, Nagoya, 464-8603, Japan
  - Center for Advanced Intelligence Project, RIKEN. 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan.
9
Japanese Regulatory Framework and Approach for Genome-edited Foods Based on Latest Scientific Findings. FOOD SAFETY (TOKYO, JAPAN) 2022; 10:113-128. [PMID: 36619008 PMCID: PMC9789915 DOI: 10.14252/foodsafetyfscj.d-21-00016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/01/2021] [Accepted: 12/01/2022] [Indexed: 12/24/2022]
Abstract
The food supply system is facing important challenges and its sustainability has to be considered. Genome-editing technology, which accelerates the development of new varieties, could be used to achieve sustainable development goals, thereby protecting the environment and ensuring the stable production of food for an increasing global population. The most widely used genome-editing tool, CRISPR/Cas9, is easy to use, affordable, and versatile. Foods produced by genome-editing technologies have been developed worldwide to create novel traits. In the first half of the review, the latest scientific findings on genome-editing technologies are summarized, and the technical challenges in genome sequence analysis are clarified. CRISPR/Cas9 has versatile alternative techniques, such as base editors and prime editors. Genome sequencing technology has developed rapidly in recent years. However, it is still difficult to detect large deletions and structural variations. Long-read sequencing technology would solve this challenge. In the second part, the regulatory framework and approach for genome-edited foods are introduced. The four government ministries, including the Ministry of Environment, the Ministry of Agriculture, Forestry and Fisheries, and the Ministry of Health, Labour and Welfare (MHLW), started to discuss how the regulation should be implemented in 2019. The SDN-1 technique is excluded from the current genetically modified organism (GMO) regulation. The Japanese regulatory framework includes pre-submission consultation and submission of a notification form. In the last part of this review, transparency of the regulatory framework and consumer confidence are described. Since maintaining consumer trust is vital, transparency of the regulatory framework is key for consumers. Information on the notification process for approved genome-edited foods is made public immediately.
This review will help regulators build regulatory frameworks and lead to harmonization of the framework between countries.
10
Mora-Cross M, Morales-Carmiol A, Chen-Huang T, Barquero-Pérez M. Essential Biodiversity Variables: extracting plant phenological data from specimen labels using machine learning. RESEARCH IDEAS AND OUTCOMES 2022. [DOI: 10.3897/rio.8.e86012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Essential Biodiversity Variables (EBVs) make it possible to evaluate and monitor the state of biodiversity over time at different spatial scales. Their development is led by the Group on Earth Observations Biodiversity Observation Network (GEO BON) to harmonize, consolidate and standardize biodiversity data from varied biodiversity sources. This document presents a mechanism to obtain baseline data to feed the Species Traits Variable Phenology or other biodiversity indicators by extracting species characters and structure names from morphological descriptions of specimens and classifying such descriptions using machine learning (ML).
A workflow that performs Named Entity Recognition (NER) and classification of morphological descriptions using ML algorithms was evaluated with excellent results. It was implemented using Python, Pytorch, Scikit-Learn, Pomegranate, Python-crfsuite, and other libraries and applied to 106,804 herbarium records from the National Biodiversity Institute of Costa Rica (INBio). The text classification results were strong (F1 score between 96% and 99%) using three traditional ML methods: Multinomial Naive Bayes (NB), Linear Support Vector Classification (SVC), and Logistic Regression (LR). Furthermore, results for extracting names of species morphological structures (e.g., leaves, trichomes, flowers, petals, sepals) and character names (e.g., length, width, pigmentation patterns, and smell) using NER algorithms were competitive (F1 score between 95% and 98%) using Hidden Markov Models (HMM), Conditional Random Fields (CRFs), and Bidirectional Long Short Term Memory Networks with CRF (BI-LSTM-CRF).
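To make the text classification step concrete, here is a dependency-free sketch of a multinomial Naive Bayes classifier with Laplace smoothing over whitespace tokens. The abstract states that the authors used Scikit-Learn; this hand-written version only illustrates the technique. The four specimen descriptions and the "flowering"/"fruiting" labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"vocab": vocab, "priors": {}, "likelihoods": {}}
    for label, count in class_counts.items():
        total = sum(word_counts[label].values())
        model["priors"][label] = math.log(count / len(docs))
        model["likelihoods"][label] = {
            w: math.log((word_counts[label][w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest posterior log score; unknown tokens are skipped."""
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    scores = {
        label: prior + sum(model["likelihoods"][label][t] for t in tokens)
        for label, prior in model["priors"].items()
    }
    return max(scores, key=scores.get)


# Hypothetical label descriptions and phenology classes.
docs = [
    "flowers white petals fragrant",
    "petals yellow sepals green flowers",
    "fruits red ripe berries",
    "immature green fruits capsules",
]
labels = ["flowering", "flowering", "fruiting", "fruiting"]
model = train_multinomial_nb(docs, labels)
print(predict(model, "white flowers with yellow petals"))
print(predict(model, "ripe red fruits"))
```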
11
Sharma N, Patiyal S, Dhall A, Devi NL, Raghava GPS. ChAlPred: A web server for prediction of allergenicity of chemical compounds. Comput Biol Med 2021; 136:104746. [PMID: 34388468 DOI: 10.1016/j.compbiomed.2021.104746] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/26/2021] [Revised: 08/04/2021] [Accepted: 08/04/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND Allergy is an abrupt reaction of the immune system that may occur after exposure to allergens such as proteins, peptides, or chemicals. In the past, various methods have been developed for predicting the allergenicity of proteins and peptides. In contrast, there is no method that can predict the allergenic potential of chemicals. In this paper, we describe ChAlPred, a method developed for predicting chemical allergens as well as for designing chemical analogs with desired allergenicity. METHOD In this study, we used 403 allergenic and 1074 non-allergenic chemical compounds obtained from the IEDB database. The PaDEL software was used to compute the molecular descriptors of the chemical compounds to develop different prediction models. All the models were trained and tested on the 80% training data and evaluated on the 20% validation data using the 2D, 3D and FP descriptors. RESULTS In this study, we developed different prediction models using several machine learning approaches. The Random Forest based model developed using hybrid descriptors performed best, achieving a maximum accuracy of 83.39% and an AUC of 0.93 on the validation dataset. The fingerprint analysis of the dataset indicates that certain chemical fingerprints, including PubChemFP129 and GraphFP1014, are more abundant in allergens. We also predicted the allergenicity potential of FDA-approved drugs using our best model and identified drugs causing allergic symptoms (e.g., Cefuroxime, Spironolactone, Tioconazole). Our results agreed with the allergenicity of these drugs reported in the literature. CONCLUSIONS To aid the research community, we developed a smart-device compatible web server, ChAlPred (https://webs.iiitd.edu.in/raghava/chalpred/), that allows users to predict and design chemicals with allergenic properties.
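The 80%/20% partition described in the METHOD section can be sketched as follows, using the class sizes quoted in the abstract. Whether the authors stratified the split by class is not stated, so stratification, the random seed, and the function name are all assumptions made for illustration.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices into train/validation sets, preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)  # floor of the class share
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


# Class sizes from the abstract: 403 allergenic (1) and 1074 non-allergenic (0) compounds.
labels = [1] * 403 + [0] * 1074
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 1181 296
```

Stratifying keeps the roughly 27% share of allergenic compounds the same in both partitions, which matters for a dataset this imbalanced.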
Affiliation(s)
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Naorem Leimarembi Devi
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India.
12
Wang L, Niu D, Zhao X, Wang X, Hao M, Che H. A Comparative Analysis of Novel Deep Learning and Ensemble Learning Models to Predict the Allergenicity of Food Proteins. Foods 2021; 10:809. [PMID: 33918556 PMCID: PMC8069377 DOI: 10.3390/foods10040809] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/17/2021] [Revised: 04/02/2021] [Accepted: 04/06/2021] [Indexed: 11/16/2022]
Abstract
Traditional food allergen identification relies mainly on in vivo and in vitro experiments, which are often slow and costly. Artificial intelligence (AI)-driven rapid food allergen identification addresses some of these drawbacks and is becoming an efficient auxiliary tool. To overcome the lower accuracy of traditional machine learning models in predicting the allergenicity of food proteins, this work introduces a deep learning model (a transformer with a self-attention mechanism) and ensemble learning models (represented by Light Gradient Boosting Machine (LightGBM) and eXtreme Gradient Boosting (XGBoost)). To highlight the advantages of the proposed method, the study also selected various commonly used machine learning models as baseline classifiers. The results of 5-fold cross-validation showed that the area under the receiver operating characteristic curve (AUC) of the deep model was the highest (0.9578), better than the ensemble learning and baseline algorithms. However, the deep model needs to be pre-trained, and its training time is the longest. Comparing the characteristics of the transformer model and the boosting models shows that each has its own advantages, which provides novel clues and inspiration for the rapid prediction of food allergens in the future.
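The AUC figure quoted above has a simple probabilistic reading: it is the chance that a randomly chosen allergen receives a higher score than a randomly chosen non-allergen, with ties counted as half. The sketch below computes it by direct pairwise comparison on hypothetical scores; in a 5-fold cross-validation this would be evaluated on each held-out fold and averaged. It is not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for six proteins (1 = allergen, 0 = non-allergen).
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
auc = roc_auc(y_true, y_score)
print(auc)
```

The pairwise form is quadratic in the number of samples, so it suits illustration; rank-based implementations are preferable for large test sets.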
Affiliation(s)
- Liyang Wang
  - Key Laboratory of Precision Nutrition and Food Quality, The Ministry of Education, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
- Dantong Niu
  - College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Xinjie Zhao
  - College of Humanities and Development Studies, China Agricultural University, Beijing 100083, China
- Xiaoya Wang
  - Key Laboratory of Precision Nutrition and Food Quality, The Ministry of Education, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
- Mengzhen Hao
  - Key Laboratory of Precision Nutrition and Food Quality, The Ministry of Education, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
- Huilian Che
  - Key Laboratory of Precision Nutrition and Food Quality, The Ministry of Education, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
13
|
Barati M, Javanmardi F, Mousavi Jazayeri SMH, Jabbari M, Rahmani J, Barati F, Nickho H, Davoodi SH, Roshanravan N, Mousavi Khaneghah A. Techniques, perspectives, and challenges of bioactive peptide generation: A comprehensive systematic review. Compr Rev Food Sci Food Saf 2020; 19:1488-1520. [PMID: 33337080 DOI: 10.1111/1541-4337.12578] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 10/26/2019] [Revised: 04/03/2020] [Accepted: 04/27/2020] [Indexed: 12/14/2022]
Abstract
Owing to their digestion-resistant and absorbable structures, bioactive peptides (BPs) can induce notable biological effects in living organisms. This study provides an overview of the available methods for BP generation through a systematic review of articles published up to April 2019. The PubMed and Scopus databases were screened to retrieve the related publications. According to the results, although the characterization of BPs has mainly been performed using enzymatic and microbial in vitro methods, these cannot be considered suitable techniques for further stimulation of digestion in the gastrointestinal tract. Therefore, new in vivo and in silico approaches for BP identification should be developed to overcome the obstacles of the current methods. The purpose of this review was to compile the recent analytical methods applied to studying various aspects of food-derived biopeptides, with emphasis on generation in vitro, in vivo, and in silico.
Affiliation(s)
- Meisam Barati
  - Student Research Committee, Department of Cellular and Molecular Nutrition, Faculty of Nutrition and Food Technology, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Fardin Javanmardi
  - Department of Food Science and Technology, Faculty of Nutrition and Food Technology, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Masoumeh Jabbari
  - Department of Community Nutrition, Faculty of Nutrition and Food Technology, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Jamal Rahmani
  - Department of Community Nutrition, Faculty of Nutrition and Food Technology, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Farzaneh Barati
  - Department of Biotechnology, Faculty of Biological Sciences, Alzahra University, Tehran, Iran
- Hamid Nickho
  - Immunology Research Center, Iran University of Medical Sciences, Tehran, Iran
  - Department of Immunology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Sayed Hossein Davoodi
  - Department of Clinical Nutrition and Dietetic, National Institute and Faculty of Nutrition and Food Technology; Cancer Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Neda Roshanravan
  - Cardiovascular Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Amin Mousavi Khaneghah
  - Department of Food Science, Faculty of Food Engineering, University of Campinas (UNICAMP), São Paulo, Brazil
|
14
|
Sharma N, Patiyal S, Dhall A, Pande A, Arora C, Raghava GPS. AlgPred 2.0: an improved method for predicting allergenic proteins and mapping of IgE epitopes. Brief Bioinform 2020; 22:5985292. [PMID: 33201237 DOI: 10.1093/bib/bbaa294] [Citation(s) in RCA: 123] [Impact Index Per Article: 24.6] [Received: 06/21/2020] [Revised: 10/02/2020] [Accepted: 10/05/2020] [Indexed: 12/22/2022]
Abstract
AlgPred 2.0 is a web server developed for predicting allergenic proteins and allergenic regions in a protein. It is an updated version of AlgPred, which was developed in 2006. The dataset used for training, testing and validation consists of 10 075 allergens and 10 075 non-allergens. In addition, 10 451 experimentally validated immunoglobulin E (IgE) epitopes were used to identify antigenic regions in a protein. All models were trained on 80% of the data, called the training dataset, and their performance was evaluated using 5-fold cross-validation. The performance of the final model trained on the training dataset was evaluated on the remaining 20% of the data, called the validation dataset; no two proteins in any two sets have more than 40% similarity. First, a Basic Local Alignment Search Tool (BLAST) search was performed against the dataset, and allergens were predicted based on the level of similarity with known allergens. Second, IgE epitopes obtained from the IEDB database were searched in the dataset to predict allergens based on their presence in a protein. Third, motif-based approaches such as multiple EM for motif elicitation/motif alignment and search tool were used to predict allergens. Fourth, allergen prediction models were developed using a wide range of machine learning techniques. Finally, an ensemble approach was used to predict allergenic proteins by combining the prediction scores of the different approaches. Our best model achieved an area under the receiver operating characteristic curve of 0.98 and a Matthews correlation coefficient of 0.85 on the validation dataset. The AlgPred 2.0 web server allows the prediction of allergens, mapping of IgE epitopes, motif search and BLAST search (https://webs.iiitd.edu.in/raghava/algpred2/).
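The Matthews correlation coefficient named in this abstract is computed from the four confusion-matrix counts. The sketch below uses the standard definition; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts; returns 0.0 when undefined."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention used here: a degenerate matrix (an empty row or column) scores 0.
        return 0.0
    return numerator / denominator


# Hypothetical counts for a balanced validation set of 200 proteins.
print(matthews_corrcoef(tp=90, tn=95, fp=5, fn=10))
```

Unlike accuracy, the coefficient stays near zero for a classifier that ignores its input, even when the classes are imbalanced.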
Affiliation(s)
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Akshara Pande
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Chakit Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
|
15
|
Easton A, Gao S, Lawton SP, Bennuru S, Khan A, Dahlstrom E, Oliveira RG, Kepha S, Porcella SF, Webster J, Anderson R, Grigg ME, Davis RE, Wang J, Nutman TB. Molecular evidence of hybridization between pig and human Ascaris indicates an interbred species complex infecting humans. eLife 2020; 9:e61562. [PMID: 33155980 PMCID: PMC7647404 DOI: 10.7554/elife.61562] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 07/29/2020] [Accepted: 10/19/2020] [Indexed: 02/06/2023]
Abstract
Human ascariasis is a major neglected tropical disease caused by the nematode Ascaris lumbricoides. We report a 296 megabase (Mb) reference-quality genome comprised of 17,902 protein-coding genes derived from a single, representative Ascaris worm. An additional 68 worms were collected from 60 human hosts in Kenyan villages where pig husbandry is rare. Notably, the majority of these worms (63/68) possessed mitochondrial genomes that clustered closer to the pig parasite Ascaris suum than to A. lumbricoides. Comparative phylogenomic analyses identified over 11 million nuclear-encoded SNPs but just two distinct genetic types that had recombined across the genomes analyzed. The nuclear genomes had extensive heterozygosity, and all samples existed as genetic mosaics with either A. suum-like or A. lumbricoides-like inheritance patterns supporting a highly interbred Ascaris species genetic complex. As no barriers appear to exist for anthroponotic transmission of these 'hybrid' worms, a one-health approach to control the spread of human ascariasis will be necessary.
Affiliation(s)
- Alice Easton
  - Helminth Immunology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, United States
  - Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Shenghan Gao
  - Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, United States
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Scott P Lawton
  - Epidemiology Research Unit (ERU) Department of Veterinary and Animal Sciences, Northern Faculty, Scotland’s Rural College (SRUC), Inverness, United Kingdom
- Sasisekhar Bennuru
  - Helminth Immunology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, United States
- Asis Khan
  - Molecular Parasitology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, United States
- Eric Dahlstrom
  - Genomics Unit, Research Technologies Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, United States
- Rita G Oliveira
  - Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Stella Kepha
  - London School of Tropical Medicine and Hygiene, London, United Kingdom
- Stephen F Porcella
  - Genomics Unit, Research Technologies Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, United States
- Joanne Webster
  - Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
  - Royal Veterinary College, University of London, Department of Pathobiology and Population Sciences, Hertfordshire, United Kingdom
- Roy Anderson
  - Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Michael E Grigg
  - Molecular Parasitology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, United States
- Richard E Davis
  - Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, United States
- Jianbin Wang
  - Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, United States
  - Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, United States
- Thomas B Nutman
  - Helminth Immunology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, United States
|
16
|
Xie S, Braga-Neto UM. On the Bias of Precision Estimation Under Separate Sampling. Cancer Inform 2019; 18:1176935119860822. [PMID: 31360060 PMCID: PMC6636226 DOI: 10.1177/1176935119860822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2019] [Accepted: 06/02/2019] [Indexed: 11/29/2022]
Abstract
Observational case-control studies for biomarker discovery in cancer studies often collect data that are sampled separately from the case and control populations. We present an analysis of the bias in the estimation of the precision of classifiers designed on separately sampled data. The analysis consists of both theoretical and numerical results, which show that classifier precision estimates can display strong bias under separate sampling, with the bias magnitude depending on the difference between the true case prevalence in the population and the sample prevalence in the data. We show that this bias is systematic in the sense that it cannot be reduced by increasing sample size. If information about the true case prevalence is available from public health records, then a modified precision estimator that uses the known prevalence displays smaller bias, which can in fact be reduced to zero as sample size increases under regularity conditions on the classification algorithm. The accuracy of the theoretical analysis and the performance of the precision estimators under separate sampling are confirmed by numerical experiments using synthetic and real data from published observational case-control studies. The results with real data confirmed that under separately sampled data, the usual estimator produces larger, i.e., more optimistic, precision estimates than the estimator using the true prevalence value.
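The dependence of precision on prevalence described in this abstract can be illustrated with Bayes' rule. The sketch below is not necessarily the estimator defined in the paper: it contrasts the usual sample precision with a prevalence-adjusted value built from the sample sensitivity and false-positive rate. The function names and counts are hypothetical.

```python
def naive_precision(tp, fp):
    """Usual estimator: fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


def prevalence_adjusted_precision(tp, fn, fp, tn, prevalence):
    """Precision via Bayes' rule, using a known population prevalence."""
    sensitivity = tp / (tp + fn)            # estimated from the case sample
    false_positive_rate = fp / (fp + tn)    # estimated from the control sample
    true_positive_mass = sensitivity * prevalence
    false_positive_mass = false_positive_rate * (1.0 - prevalence)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Hypothetical case-control sample with 100 cases and 100 controls.
counts = dict(tp=90, fn=10, fp=10, tn=90)
print(naive_precision(counts["tp"], counts["fp"]))
print(prevalence_adjusted_precision(**counts, prevalence=0.10))
```

With these counts the sample prevalence is 50%. If the assumed population prevalence is lower, the adjusted value falls below the naive one, which matches the optimistic bias the abstract describes.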
Affiliation(s)
- Shuilian Xie
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
- Ulisses M Braga-Neto
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
|
17
|
The ABA392/pET30a protein of Pasteurella multocida provoked mucosal immunity against HS disease in a rat model. Microb Pathog 2019; 128:90-96. [DOI: 10.1016/j.micpath.2018.12.042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/02/2018] [Revised: 12/20/2018] [Accepted: 12/20/2018] [Indexed: 01/09/2023]
|
18
|
Dong X, Chaisiri K, Xia D, Armstrong SD, Fang Y, Donnelly MJ, Kadowaki T, McGarry JW, Darby AC, Makepeace BL. Genomes of trombidid mites reveal novel predicted allergens and laterally transferred genes associated with secondary metabolism. Gigascience 2018; 7:5160133. [PMID: 30445460 PMCID: PMC6275457 DOI: 10.1093/gigascience/giy127] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/05/2018] [Accepted: 10/18/2018] [Indexed: 12/21/2022]
Abstract
Background: Trombidid mites have a unique life cycle in which only the larval stage is ectoparasitic. In the superfamily Trombiculoidea ("chiggers"), the larvae feed preferentially on vertebrates, including humans. Species in the genus Leptotrombidium are vectors of a potentially fatal bacterial infection, scrub typhus, that affects 1 million people annually. Moreover, chiggers can cause pruritic dermatitis (trombiculiasis) in humans and domesticated animals. In the Trombidioidea (velvet mites), the larvae feed on other arthropods and are potential biological control agents for agricultural pests. Here, we present the first trombidid mite genomes, obtained both for a chigger, Leptotrombidium deliense, and for a velvet mite, Dinothrombium tinctorium. Results: Sequencing was performed using Illumina technology. A 180 Mb draft assembly for D. tinctorium was generated from two paired-end libraries and one mate-pair library using a single adult specimen. For L. deliense, a lower-coverage draft assembly (117 Mb) was obtained using pooled, engorged larvae with a single paired-end library. Remarkably, both genomes exhibited evidence of ancient lateral gene transfer from soil-derived bacteria or fungi. The transferred genes confer functions that are rare in animals, including terpene and carotenoid synthesis. Thirty-seven allergenic protein families were predicted in the L. deliense genome, of which nine were unique. Preliminary proteomic analyses identified several of these putative allergens in larvae. Conclusions: Trombidid mite genomes appear to be more dynamic than those of other acariform mites. A priority for future research is to determine the biological function of terpene synthesis in this taxon and its potential for exploitation in disease control.
Affiliation(s)
- Xiaofeng Dong
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
  - School of Life Sciences, Jiangsu Normal University, Xuzhou 221116, China
  - Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom
- Kittipong Chaisiri
  - Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom
  - Faculty of Tropical Medicine, Mahidol University, Ratchathewi Bangkok 10400, Thailand
- Dong Xia
  - Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom
  - The Royal Veterinary College, London NW1 0TU, United Kingdom
- Stuart D Armstrong
  - Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom
- Yongxiang Fang
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Martin J Donnelly
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool L3 5QA, United Kingdom
- Tatsuhiko Kadowaki
  - Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China
- John W McGarry
  - Institute of Veterinary Science, University of Liverpool, Liverpool L3 5RP, United Kingdom
- Alistair C Darby
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Benjamin L Makepeace
  - Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom
|
19
|
Investigation of immunogenic properties of Hemolin from silkworm, Bombyx mori as carrier protein: an immunoinformatic approach. Sci Rep 2018; 8:6957. [PMID: 29725106 PMCID: PMC5934409 DOI: 10.1038/s41598-018-25374-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 12/29/2017] [Accepted: 04/20/2018] [Indexed: 11/08/2022]
Abstract
Infectious diseases are the major cause of high mortality among infants and geriatric patients. Vaccines are the only weapon in our arsenal to defend ourselves against innumerable infectious diseases. Although myriad vaccines are available, countless people still die of microbial infections. Subunit vaccination is an effective strategy for vaccine development that combines a highly immunogenic carrier protein with highly antigenic but non-immunogenic antigens (haptens). In this study, we attempted to use immunoinformatic tools for carrier protein development. Immunogenic mediators (T-cell, B-cell, IFN-γ epitopes) and physicochemical properties of the hemolin protein of the silkworm, Bombyx mori, were studied. Hemolin was found to be non-allergic and highly antigenic in nature. The refined tertiary structure of modelled hemolin was docked against TLR3 and the TLR4-MD2 complex. A molecular dynamics study emphasized the stable microscopic interaction between hemolin and TLRs. In silico cloning and codon optimization were carried out for effective expression of hemolin in an E. coli expression system. The overall presence of Cytotoxic T Lymphocytes (CTL), Humoral T Lymphocytes (HTL), and IFN-γ epitopes with high antigenicity depicts the potential of hemolin as a good candidate for a carrier protein.
|
20
|
Distinguishing allergens from non-allergenic homologues using Physical-Chemical Property (PCP) motifs. Mol Immunol 2018; 99:1-8. [PMID: 29627609 DOI: 10.1016/j.molimm.2018.03.022] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/08/2018] [Revised: 03/22/2018] [Accepted: 03/27/2018] [Indexed: 02/07/2023]
Abstract
Quantitative guidelines to distinguish allergenic proteins from related, but non-allergenic ones are urgently needed for regulatory agencies, biotech companies and physicians. In a previous study, we found that allergenic proteins populate a relatively small number of protein families, as characterized by the Pfam database. However, these families also contain non-allergenic proteins, meaning that allergenic determinants must lie within more discrete regions of the sequence. Thus, new methods are needed to discriminate allergenic proteins within those families. Physical-Chemical Properties (PCP)-motifs specific for allergens within a Pfam class were determined for 17 highly populated protein domains. A novel scoring method based on PCP-motifs that characterize known allergenic proteins within these families was developed, and validated for those domains. The motif scores distinguished sequences of allergens from a large selection of 80,000 randomly selected non-allergenic sequences. The motif scores for the birch pollen allergen (Bet v 1) family, which also contains related fruit and nut allergens, correlated better than global sequence similarities with clinically observed cross-reactivities among those allergens. Further, we demonstrated that the average scores of allergen specific motifs for allergenic profilins are significantly different from the scores of non-allergenic profilins. Several of the selective motifs coincide with experimentally determined IgE epitopes of allergenic profilins. The motifs also discriminated allergenic pectate lyases, including Jun a 1 from mountain cedar pollen, from similar proteins in the human microbiome, which can be assumed to be non-allergens. The latter lacked key motifs characteristic of the known allergens, some of which correlate with known IgE binding sites.
|
21
|
Bragin AO, Sokolov VS, Demenkov PS, Ivanisenko TV, Bragina EY, Matushkin YG, Ivanisenko VA. Prediction of Bacterial and Archaeal Allergenicity with AllPred Program. Mol Biol 2018. [DOI: 10.1134/s0026893317050041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
22
|
Negahdaripour M, Nezafat N, Eslami M, Ghoshoon MB, Shoolian E, Najafipour S, Morowvat MH, Dehshahri A, Erfani N, Ghasemi Y. Structural vaccinology considerations for in silico designing of a multi-epitope vaccine. Infect Genet Evol 2018; 58:96-109. [DOI: 10.1016/j.meegid.2017.12.008] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 11/07/2017] [Revised: 12/05/2017] [Accepted: 12/11/2017] [Indexed: 01/26/2023]
|
23
|
Asad Y, Ahmad S, Rungrotmongkol T, Ranaghan KE, Azam SS. Immuno-informatics driven proteome-wide investigation revealed novel peptide-based vaccine targets against emerging multiple drug resistant Providencia stuartii. J Mol Graph Model 2018; 80:238-250. [PMID: 29414043 DOI: 10.1016/j.jmgm.2018.01.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 10/05/2017] [Revised: 12/22/2017] [Accepted: 01/15/2018] [Indexed: 11/22/2022]
Abstract
The bacterium Providencia stuartii is associated with urinary tract infections and is the most common cause of purple urine bag syndrome. The increasing multi-drug resistance shown by the pathogen and the lack of licensed vaccines make treatment of infections caused by P. stuartii challenging. As vaccinology data on the pathogen are scarce, this work introduces an in silico proteome-based Reverse Vaccinology (RV) protocol, combined with subtractive proteomics, to screen potential vaccine candidates against P. stuartii. The analysis identified three potential vaccine candidates for designing broad-spectrum and strain-specific peptide vaccines: FimD4, FimD6, and FimD8. These proteins are essential for pathogen survival, localized in the outer membrane, virulent, and antigenic in nature. Immunoproteomic tools mapped surface-exposed and non-allergenic 9mer B-cell derived T-cell antigenic epitopes for the proteins. The epitopes also show stable and rich interactions with the most predominant HLA allele (DRB1*0101) in the human population. Metabolic pathway annotation of the proteins indicated that the fimbrial biogenesis outer membrane usher protein (FimD6) is the most suitable candidate for vaccine design because of its involvement in several significant pathways. These pathways include the bacterial secretion system, two-component system, β-lactam resistance, and cationic antimicrobial peptide pathways. The predicted epitopes may provide a basis for designing a peptide-based vaccine against P. stuartii.
Affiliation(s)
- Yelda Asad
  - Computational Biology Lab, National Center for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan
- Sajjad Ahmad
  - Computational Biology Lab, National Center for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan
- Thanyada Rungrotmongkol
  - Biocatalyst and Environmental Biotechnology Research unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
  - Ph.D. Program in Bioinformatics and Computational Biology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Kara E Ranaghan
  - Centre for Computational Chemistry, University of Bristol, Bristol, United Kingdom
- Syed Sikander Azam
  - Computational Biology Lab, National Center for Bioinformatics, Quaid-i-Azam University, Islamabad, Pakistan
  - Biocatalyst and Environmental Biotechnology Research unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
|
24
|
Hossain R, Yasmin T, Hosen MI, Nabi AHMN. In silico identification of potential epitopes present in human adenovirus proteins for vaccine design and of putative drugs for treatment against viral infection. J Immunol Methods 2018; 455:55-70. [PMID: 29371093 DOI: 10.1016/j.jim.2018.01.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/07/2017] [Revised: 01/18/2018] [Accepted: 01/18/2018] [Indexed: 12/15/2022]
Abstract
In silico approaches that use computational biology to design the most probable epitopes and/or drug targets have given an edge in foreseeing active components for the treatment of many infectious diseases. This study aims to identify the most probable epitopes from the fiber, hexon and penton base proteins, as well as probable drug targets, to prevent and to cure adenovirus infection, respectively. After protein sequences were retrieved, the following analyses were performed: selection pressure; prediction of continuous/discontinuous B cell epitopes along with their antigenicity, immunogenicity and allergenicity; and prediction of T cell epitopes along with their population coverage and degree of conservancy. Of the three proteins, the fiber protein underwent the highest degree of selection pressure. Five peptides from the fiber C-5, hexon C-5 and D-8, and penton base B-3 and C-5 proteins were considered the best potential B cell epitopes. Further analyses revealed that peptides present in the fiber C-5, hexon C-5, and penton base B-3 and C-5 proteins fulfilled the criteria of surface accessibility, hydrophilicity, flexibility, antigenicity and beta turn. Several regions of the proteins were identified as discontinuous B cell epitopes. Interestingly, a peptide in the 692-699 region of hexon C-5 and six amino acids at positions 100, 102, 105, 108, 112 and 114 of the penton base B-3 protein were recognized as both continuous and discontinuous B cell epitopes. Of all the predicted T cell epitopes, three nonamers from the hexon C-5, D-8 and penton base C-5 proteins may elicit a strong immune response by activating both humoral and cellular immunity, as they were found to overlap with B cell epitopic peptides. Considering its non-allergenicity, conservancy and population coverage, "SGYDPYYTY" of hexon protein C-5 was further validated using an in silico docking study of its interaction with the HLA allele.
This study also demonstrated the possibility of compounds such as 3-(azepan-1-ium-1-yl) propane-1-sulfonate and E-5842 acting as inhibitors of the penton base and hexon proteins, and these could be more effective drugs against the virus than the current ones. Therefore, further in vitro and animal model experiments using these predicted epitopes and compounds may pave the way for newer and more effective treatment approaches against adenovirus infection.
Affiliation(s)
- Rafeka Hossain
  - Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- Tahirah Yasmin
  - Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- Md Ismail Hosen
  - Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
- A H M Nurun Nabi
  - Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka 1000, Bangladesh
|
25
|
Negahdaripour M, Eslami M, Nezafat N, Hajighahramani N, Ghoshoon MB, Shoolian E, Dehshahri A, Erfani N, Morowvat MH, Ghasemi Y. A novel HPV prophylactic peptide vaccine, designed by immunoinformatics and structural vaccinology approaches. Infect Genet Evol 2017; 54:402-416. [DOI: 10.1016/j.meegid.2017.08.002] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 06/24/2017] [Revised: 07/19/2017] [Accepted: 08/01/2017] [Indexed: 12/19/2022]
|
26
|
Vishnu US, Sankarasubramanian J, Gunasekaran P, Rajendhran J. Identification of potential antigens from non-classically secreted proteins and designing novel multitope peptide vaccine candidate against Brucella melitensis through reverse vaccinology and immunoinformatics approach. Infect Genet Evol 2017; 55:151-158. [PMID: 28919551 DOI: 10.1016/j.meegid.2017.09.015] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/03/2017] [Revised: 09/01/2017] [Accepted: 09/13/2017] [Indexed: 12/31/2022]
Abstract
Brucella melitensis is an intracellular pathogen that resides in the professional and non-professional phagocytes of the host, causing the zoonotic disease brucellosis. The stealthy nature of Brucella makes it highly pathogenic, and it is hard to eliminate the bacteria completely from the infected host. To date, no licensed vaccines are available for human brucellosis. In this study, we identified potential antigens for vaccine development from non-classically secreted proteins through a reverse vaccinology approach. Based on the systemic screening of non-classically secreted proteins of B. melitensis 16M, we identified nine proteins as potential vaccine candidates. Among these, Omp31 and Omp22 are known immunogens, and their roles in the virulence of Brucella are known. The roles of the other proteins in pathogenesis are yet to be studied. From the nine proteins, we identified six novel antigenic epitopes that can elicit both B-cell and T-cell immune responses. Among the nine proteins, the epitopes were predicted from Omp31 immunogenic protein precursor, Omp22 protein precursor, extracellular serine protease, hypothetical membrane-associated protein, and iron-regulated outer membrane protein FrpB. Further, we designed a multitope vaccine using Omp31 immunogenic protein precursor, Omp22 protein precursor, extracellular serine protease, iron-regulated outer membrane protein FrpB, hypothetical membrane-associated protein, and the LPS-assembly protein LptD and polysaccharide export protein identified in the previous study. Epitopes were joined using amino acid linkers such as EAAAK and GPGPG. Cholera toxin subunit B, the nontoxic part of cholera toxin, was used as an adjuvant and was linked to the N-terminus of the multitope vaccine candidate. The designed vaccine candidate was modeled and validated, and its physicochemical properties were analyzed. Results revealed that the vaccine candidate is soluble, stable, non-allergenic, and antigenic, and that 87% of the residues of the designed vaccine candidate are located in the favored region. In conclusion, the computational analysis showed that the newly designed multitope protein could be used to develop a promising vaccine for human brucellosis.
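The assembly step described in this abstract (an adjuvant at the N-terminus, epitopes joined by short linkers) can be illustrated with a minimal Python sketch. The abstract does not say which linker sits at which junction, so the layout below (EAAAK after the adjuvant, GPGPG between epitopes) is an assumption, the function name `assemble_multitope` is hypothetical, and the sequences are placeholders rather than the real cholera toxin subunit B or Brucella epitopes.

```python
from typing import Sequence


def assemble_multitope(
    adjuvant: str,
    epitopes: Sequence[str],
    adjuvant_linker: str = "EAAAK",  # assumed: linker between adjuvant and first epitope
    epitope_linker: str = "GPGPG",   # assumed: linker between consecutive epitopes
) -> str:
    """Concatenate an N-terminal adjuvant and linker-joined epitopes into one sequence."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return adjuvant + adjuvant_linker + epitope_linker.join(epitopes)


# Placeholder strings only: NOT the real adjuvant or epitope sequences.
construct = assemble_multitope("MADJ", ["AAAY", "KKLV", "WTRS"])
print(construct)
```

Keeping the linkers as parameters makes it easy to try a different junction layout, since the source only names the linkers as examples.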
Affiliation(s)
- Udayakumar S Vishnu
- Department of Genetics, School of Biological Sciences, Madurai Kamaraj University, Madurai 625021, Tamil Nadu, India
- Jagadesan Sankarasubramanian
- Department of Genetics, School of Biological Sciences, Madurai Kamaraj University, Madurai 625021, Tamil Nadu, India
- Jeyaprakash Rajendhran
- Department of Genetics, School of Biological Sciences, Madurai Kamaraj University, Madurai 625021, Tamil Nadu, India
|
27
|
Meena M, Gupta SK, Swapnil P, Zehra A, Dubey MK, Upadhyay RS. Alternaria Toxins: Potential Virulence Factors and Genes Related to Pathogenesis. Front Microbiol 2017; 8:1451. [PMID: 28848500 PMCID: PMC5550700 DOI: 10.3389/fmicb.2017.01451] [Citation(s) in RCA: 118] [Impact Index Per Article: 14.8] [Received: 05/19/2017] [Accepted: 07/18/2017] [Indexed: 01/04/2023]
Abstract
Alternaria is an important fungal genus to study because of its range of lifestyles, from saprophytic to endophytic, and because it is a very successful pathogen that causes disease in a number of economically important crops. Alternaria species have been well characterized for the production of different host-specific toxins (HSTs) and non-host-specific toxins (nHSTs), which depend upon their physiological and morphological stages. The pathogenicity of Alternaria species depends on host susceptibility or resistance as well as on the quantitative production of HSTs and nHSTs. Chemically, these toxins are low-molecular-weight secondary metabolites (SMs). The toxins mainly affect different parts of the cell, such as the mitochondria, chloroplasts, plasma membrane, Golgi complex, and nucleus. Alternaria species produce several nHSTs such as brefeldin A, tenuazonic acid, tentoxin, and zinniol. HSTs, which act at very low concentrations, affect only certain plant varieties or genotypes and play a role in determining the host range of specificity of plant pathogens. The commonly known HSTs are the AAL-, AK-, AM-, AF-, ACR-, and ACT-toxins, which are named for their host specificity and are classified into different family groups. The HSTs are differentiated on the basis of bio-statistical and other molecular analyses. All these toxins have different modes of action, biochemical reactions, and signaling mechanisms by which they cause disease. Different species of Alternaria produce toxins that have biochemical and genetic effects on the fungus itself as well as on host cells and tissues. The genes responsible for the production of HSTs are found on conditionally dispensable chromosomes (CDCs), which have been well characterized. Bio-statistical methods such as basic local alignment search tool (BLAST) data analysis, used for the annotation of predicted genes and pathogenicity-related genes, may yield surprising insights now and in the future.
Affiliation(s)
- Mukesh Meena
- Department of Botany, Institute of Science, Banaras Hindu University, Varanasi, India
|
28
|
Zhang C, Zhang Y, Wang Z, Chen S, Luo Y. Production and identification of antioxidant and angiotensin-converting enzyme inhibition and dipeptidyl peptidase IV inhibitory peptides from bighead carp (Hypophthalmichthys nobilis) muscle hydrolysate. J Funct Foods 2017. [DOI: 10.1016/j.jff.2017.05.032] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 11/26/2022]
|
29
|
Nezafat N, Eslami M, Negahdaripour M, Rahbar MR, Ghasemi Y. Designing an efficient multi-epitope oral vaccine against Helicobacter pylori using immunoinformatics and structural vaccinology approaches. Mol Biosyst 2017; 13:699-713. [PMID: 28194462 DOI: 10.1039/c6mb00772d] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Indexed: 01/01/2023]
Abstract
Helicobacter pylori is a cunning bacterium that can live in the stomachs of many people without causing any symptoms but can gradually lead to gastric cancer. Because of various obstacles related to anti-H. pylori antibiotic therapy, the development of an anti-H. pylori vaccine has recently attracted more attention. In this study, different immunoinformatics and computational vaccinology approaches were employed to design an efficient multi-epitope oral vaccine against H. pylori. Our multi-epitope vaccine is composed of heat labile enterotoxin IIc B (LT-IIc), which is used as a mucosal adjuvant to enhance vaccine immunogenicity for oral immunization; cartilage oligomeric matrix protein (COMP), to increase vaccine stability in the acidic pH of the gut; one experimentally protective antigen, OipA; two hypothetical protective antigens, HP0487 and HP0906; and the "CTGKSC" peptide motif, which targets epithelial microfold cells (M cells) to enhance vaccine uptake from the gut barrier. All of these segments were joined to each other by suitable linkers. The vaccine construct was modeled, validated, and refined by different programs to achieve a high-quality 3D structure. The resulting high-quality model was used for conformational B-cell epitope selection and for docking analyses with toll-like receptor 2 (TLR2). Moreover, molecular dynamics studies demonstrated that the protein-TLR2 docked model was stable during the simulation time. We believe that our vaccine candidate can induce mucosal sIgA and IgG antibodies and Th1/Th2/Th17-mediated protective immunity, which are crucial for eradicating H. pylori infection. In sum, the computational results suggest that our newly designed vaccine could serve as a promising anti-H. pylori vaccine candidate.
Affiliation(s)
- Navid Nezafat
- Pharmaceutical Science Research Center, Shiraz University of Medical Science, Shiraz, Iran
- Mahboobeh Eslami
- Pharmaceutical Science Research Center, Shiraz University of Medical Science, Shiraz, Iran
- Manica Negahdaripour
- Pharmaceutical Science Research Center, Shiraz University of Medical Science, Shiraz, Iran and Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Mohammad Reza Rahbar
- Pharmaceutical Science Research Center, Shiraz University of Medical Science, Shiraz, Iran and Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Younes Ghasemi
- Pharmaceutical Science Research Center, Shiraz University of Medical Science, Shiraz, Iran and Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran and Department of Medical Biotechnology, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran
|
30
|
Yasmin T, Akter S, Debnath M, Ebihara A, Nakagawa T, Nabi AHMN. In silico proposition to predict cluster of B- and T-cell epitopes for the usefulness of vaccine design from invasive, virulent and membrane associated proteins of C. jejuni. In Silico Pharmacol 2016; 4:5. [PMID: 27376537 PMCID: PMC4932005 DOI: 10.1186/s40203-016-0020-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/11/2016] [Accepted: 06/22/2016] [Indexed: 11/25/2022]
Abstract
Purpose: Campylobacter jejuni is one of the leading causes of bacterial diarrheal illness worldwide. This study aims to identify specific epitopes for designing peptide vaccine(s) against C. jejuni by targeting invasive, virulent and membrane-associated proteins such as FlaA, Cia, CadF, PEB1, PEB3 and MOMP. Methods: In the present study, various immunoinformatics approaches were applied to design a potential epitope-based vaccine against C. jejuni. The tools include BepiPred, ABCpred, the Immune Epitope Database (IEDB) resource portal, and AutoDock Vina. Results: The peptides “EINKN”, “TGSRLN”, “KSNPDI” and “LDENGCE”, from the FlaA, MOMP, PEB3 and CadF proteins respectively, were found to be the most promising B-cell epitopes, while the peptides “FRINTNVAA”, “NYFEGNLDM”, “YKYSPKLNF”, “YQDAIGLLV”, “FRNNIVAFV” and “LIMPVFHEL”, from Fla, CadF, MOMP, PEB1A, PEB3 and Cia respectively, might elicit cell-mediated immunity, and “IFYTTGSRL” from the MOMP protein might elicit both humoral and cell-mediated immunity. All these potential peptide epitopes showed almost 80–100% conservancy across different strains of C. jejuni, with population coverage ranging from 22–60%. These peptide epitopes were further assessed as probable vaccine candidates by their binding to specific HLA alleles using an in silico docking technique. Conclusion: Based on the present study, it could be concluded that these predicted epitopes might be used to design a vaccine against C. jejuni and could thus be validated in model hosts to verify their efficacy as a vaccine.
Affiliation(s)
- Tahirah Yasmin
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka-1000, Bangladesh
- Salma Akter
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka-1000, Bangladesh
- Mouly Debnath
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka-1000, Bangladesh
- Akio Ebihara
- Laboratory of Applied Biochemistry, Faculty of Applied Biological Sciences, Gifu University, 1-1 Yanagido, Gifu, 501-1193, Japan
- Tsutomu Nakagawa
- Laboratory of Applied Biochemistry, Faculty of Applied Biological Sciences, Gifu University, 1-1 Yanagido, Gifu, 501-1193, Japan
- A H M Nurun Nabi
- Department of Biochemistry and Molecular Biology, University of Dhaka, Dhaka-1000, Bangladesh
|
31
|
Li L, Luo Q, Xiao W, Li J, Zhou S, Li Y, Zheng X, Yang H. A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features. J Bioinform Comput Biol 2016; 15:1650025. [PMID: 27411307 DOI: 10.1142/s0219720016500256] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 02/05/2023]
Abstract
Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transportation, organelle localization, and function, and therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed that integrates the PSI-BLAST profile, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews correlation coefficient of 0.9773 on a benchmark dataset. The result indicates the efficiency and accuracy of our method in predicting palmitoylation sites from protein sequences.
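The two metrics reported in this abstract, accuracy and the Matthews correlation coefficient (MCC), are both derived from a binary confusion matrix. The Python sketch below shows the standard formulas. The confusion-matrix counts are invented for illustration and are not the paper's data, and the function names are hypothetical.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Illustrative counts only (assumed, not taken from the cited study).
tp, tn, fp, fn = 90, 85, 15, 10
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy={acc:.3f}  MCC={mcc:.3f}")
```

MCC is often reported alongside accuracy for site-prediction tasks because it stays informative when positive and negative classes are imbalanced.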
Affiliation(s)
- Liqi Li
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Qifa Luo
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Weidong Xiao
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Jinhui Li
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Shiwen Zhou
- National Drug Clinical Trial Institution, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Yongsheng Li
- Institute of Cancer, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
- Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
- Hua Yang
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
|
32
|
Kim IW, Lee JH, Subramaniyam S, Yun EY, Kim I, Park J, Hwang JS. De Novo Transcriptome Analysis and Detection of Antimicrobial Peptides of the American Cockroach Periplaneta americana (Linnaeus). PLoS One 2016; 11:e0155304. [PMID: 27167617 PMCID: PMC4864078 DOI: 10.1371/journal.pone.0155304] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 08/06/2015] [Accepted: 04/27/2016] [Indexed: 01/13/2023]
Abstract
Cockroaches are surrogate hosts for microbes that cause many human diseases. In spite of their generally destructive nature, cockroaches have recently been found to harbor potentially beneficial and medically useful substances such as drugs and allergens. However, genomic information for the American cockroach (Periplaneta americana) is currently unavailable; therefore, transcriptome and gene expression profiling is needed as an important resource to better understand the fundamental biological mechanisms of this species, which would be particularly useful for the selection of novel antimicrobial peptides. Thus, we performed de novo transcriptome analysis of P. americana that were or were not immunized with Escherichia coli. Using an Illumina HiSeq sequencer, we generated a total of 9.5 Gb of sequences, which were assembled into 85,984 contigs and functionally annotated using Basic Local Alignment Search Tool (BLAST), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) database terms. Finally, using an in silico antimicrobial peptide prediction method, 86 antimicrobial peptide candidates were predicted from the transcriptome, and 21 of these peptides were experimentally validated for their antimicrobial activity against yeast and Gram-positive and Gram-negative bacteria by a radial diffusion assay. Notably, 11 peptides showed strong antimicrobial activities against these organisms and displayed little or no cytotoxic effect in the hemolysis and cell viability assays. This work provides prerequisite baseline data for the identification and development of novel antimicrobial peptides, which is expected to provide a better understanding of the phenomenon of innate immunity in similar species.
Affiliation(s)
- In-Woo Kim
- Department of Agricultural Biology, National Institute of Agricultural Sciences, Rural Development Administration, Wanju, Republic of Korea
- College of Agriculture & Life Sciences, Chonnam National University, Gwangju, Republic of Korea
- Joon Ha Lee
- Department of Agricultural Biology, National Institute of Agricultural Sciences, Rural Development Administration, Wanju, Republic of Korea
- Eun-Young Yun
- Department of Agricultural Biology, National Institute of Agricultural Sciences, Rural Development Administration, Wanju, Republic of Korea
- Iksoo Kim
- College of Agriculture & Life Sciences, Chonnam National University, Gwangju, Republic of Korea
- Junhyung Park
- Insilicogen, Inc., Giheung-gu, Yongin-si, Gyeonggi-do, Republic of Korea
- Jae Sam Hwang
- Department of Agricultural Biology, National Institute of Agricultural Sciences, Rural Development Administration, Wanju, Republic of Korea
|
34
|
Predicting cancerlectins by the optimal g-gap dipeptides. Sci Rep 2015; 5:16964. [PMID: 26648527 PMCID: PMC4673586 DOI: 10.1038/srep16964] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 05/30/2015] [Accepted: 10/22/2015] [Indexed: 12/14/2022]
Abstract
Cancerlectins play a key role in the process of tumor cell differentiation. Fully understanding the function of cancerlectins is therefore important because it sheds light on future directions for cancer therapy. However, traditional wet-lab experimental methods are costly and time-consuming. It is highly desirable to develop an effective and efficient computational tool to identify cancerlectins. In this study, we developed a sequence-based method to discriminate between cancerlectins and non-cancerlectins. Analysis of variance (ANOVA) was used to choose the optimal feature set derived from the g-gap dipeptide composition. The jackknife cross-validated results showed that the proposed method achieved an accuracy of 75.19%, which is superior to other published methods. For the convenience of other researchers, an online web server, CaLecPred, was established and can be freely accessed at http://lin.uestc.edu.cn/server/CalecPred. We believe that CaLecPred is a powerful tool to study cancerlectins and to guide the related experimental validations.
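The g-gap dipeptide composition named in this abstract can be sketched in a few lines of Python. For a gap g, it counts residue pairs at positions i and i+g+1 and normalizes by the number of pairs, giving a 400-dimensional feature vector. This is only an illustrative sketch: the function name is hypothetical, skipping pairs that contain non-standard residues is an assumption, and the ANOVA-based feature selection used in the study is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(sequence: str, g: int) -> dict[str, float]:
    """Return the frequency of each residue pair separated by g positions."""
    if g < 0:
        raise ValueError("g must be non-negative")
    seq = sequence.upper()
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=2)}
    total = 0
    # Pair residue i with residue i + g + 1 (g = 0 gives adjacent dipeptides).
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # assumption: ignore pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in counts}
    return {pair: n / total for pair, n in counts.items()}


features = g_gap_dipeptide_composition("ACAC", g=1)
print(features["AA"], features["CC"])
```

Varying g yields a family of feature sets, from which a selection method can then pick the most discriminative pairs.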
|
35
|
Garino C, Coïsson JD, Arlorio M. In silico allergenicity prediction of several lipid transfer proteins. Comput Biol Chem 2015; 60:32-42. [PMID: 26643760 DOI: 10.1016/j.compbiolchem.2015.11.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/07/2015] [Revised: 11/04/2015] [Accepted: 11/10/2015] [Indexed: 10/22/2022]
Abstract
Non-specific lipid transfer proteins (nsLTPs) are common allergens and are particularly widespread within the plant kingdom. They have a highly conserved three-dimensional structure that generates strong cross-reactivity among the members of this family. In recent years, several web tools for predicting the allergenicity of new molecules based on their homology with known allergens have been released, and guidelines to assess the potential allergenicity of proteins through bioinformatics have been established. Although such tools are still only partially reliable, they can provide important indications when other kinds of molecular characterization are lacking. The potential allergenicity of 28 amino acid sequences of LTP homologs, either retrieved from the UniProt database or deduced in silico from the corresponding EST coding sequence, was predicted using 7 publicly available web tools. Moreover, their degree of similarity to their closest known LTP allergens was calculated in order to evaluate their potential cross-reactivity. Finally, all sequences were studied for their degree of identity with the peach allergen Pru p 3, considering the regions involved in the formation of its known conformational IgE-binding epitope. Most of the analyzed sequences displayed a high probability of being allergenic according to all the software employed. The analyzed LTPs from bell pepper, cassava, mango, mungbean and soybean showed high homology (>70%) with some known allergenic LTPs, suggesting a potential risk of cross-reactivity for sensitized individuals. Other LTPs, such as those from canola, cassava, mango, mungbean, papaya or persimmon, displayed a high degree of identity with Pru p 3 within the consensus sequence responsible for the formation, at the three-dimensional level, of its major conformational epitope. Recent studies have highlighted that, in patients mono-sensitized to peach LTP, IgE levels seem directly proportional to the chance of developing cross-reactivity to LTPs from non-Rosaceae foods, and that this chance increases the more similar the protein is to Pru p 3. These proteins should therefore be given special consideration in future studies aimed at evaluating the risk of cross-allergenicity in highly sensitized individuals.
Affiliation(s)
- Cristiano Garino
- Dipartimento di Scienze del Farmaco & Drug and Food Biotechnology (DFB) Center, Università del Piemonte Orientale "A. Avogadro", largo Donegani 2, 28100 Novara, Italy
- Jean Daniel Coïsson
- Dipartimento di Scienze del Farmaco & Drug and Food Biotechnology (DFB) Center, Università del Piemonte Orientale "A. Avogadro", largo Donegani 2, 28100 Novara, Italy
- Marco Arlorio
- Dipartimento di Scienze del Farmaco & Drug and Food Biotechnology (DFB) Center, Università del Piemonte Orientale "A. Avogadro", largo Donegani 2, 28100 Novara, Italy
|
36
|
Dang HX, Pryor B, Peever T, Lawrence CB. The Alternaria genomes database: a comprehensive resource for a fungal genus comprised of saprophytes, plant pathogens, and allergenic species. BMC Genomics 2015; 16:239. [PMID: 25887485 PMCID: PMC4387663 DOI: 10.1186/s12864-015-1430-7] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 01/20/2015] [Accepted: 03/02/2015] [Indexed: 12/19/2022]
Abstract
Background: Alternaria is considered one of the most common saprophytic fungal genera on the planet. It comprises many species that exhibit a necrotrophic phytopathogenic lifestyle. Several species are clinically associated with allergic respiratory disorders, although they are rarely found to cause invasive infections in humans. Finally, Alternaria spp. are among the best-known producers of diverse fungal secondary metabolites, especially toxins. Description: We have recently sequenced and annotated the genomes of 25 Alternaria spp., including but not limited to many necrotrophic plant pathogens such as A. brassicicola (a pathogen of Brassicaceous crops like cabbage and canola) and A. solani (a major pathogen of Solanaceous plants like potato and tomato), and several saprophytes that cause allergy in humans, such as A. alternata isolates. These genomes were annotated and compared. Multiple genetic differences were found in the context of plant and human pathogenicity, notably the pro-inflammatory potential of A. alternata. The Alternaria genomes database was built to provide a public platform to access the whole genome sequences, genome annotations, and comparative genomics data of these species. Genome annotation and comparison were performed using a pipeline that integrated multiple computational and comparative genomics tools. Alternaria genome sequences, together with their annotation and comparison data, were ported to Ensembl database schemas using a self-developed tool (EnsImport). Collectively, data are currently hosted using a customized installation of the Ensembl genome browser platform. Conclusion: Recent efforts in fungal genome sequencing have facilitated studies of the molecular basis of fungal pathogenicity as a whole system. The Alternaria genomes database provides a comprehensive resource of genomics and comparative data for an important saprophytic and plant/human pathogenic fungal genus. The database will be updated regularly with new genomes when they become available. The Alternaria genomes database is freely available for non-profit use at http://alternaria.vbi.vt.edu.
Affiliation(s)
- Ha X Dang
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, 24061, USA.
- Current address: Department of Internal Medicine, Division of Oncology, and The Genome Institute, Washington University School of Medicine, St. Louis, MO, 63110, USA.
- Barry Pryor
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, 85721, USA.
- Tobin Peever
- Department of Plant Pathology, Washington State University, Pullman, Washington, 99164, USA.
- Christopher B Lawrence
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, 24061, USA.
- Department of Plant Pathology, Washington State University, Pullman, Washington, 99164, USA.
|