1
Kwon H, Du Z, Li Y. AlphaFold 2-based stacking model for protein solubility prediction and its transferability on seed storage proteins. Int J Biol Macromol 2024; 278:134601. [PMID: 39137857 DOI: 10.1016/j.ijbiomac.2024.134601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 07/29/2024] [Accepted: 08/07/2024] [Indexed: 08/15/2024]
Abstract
Accurate protein solubility prediction is crucial for screening suitable candidates for food applications. Existing models often rely only on sequences, overlooking important structural details. In this study, a regression model for protein solubility was developed using both the sequences and the predicted structures of 2983 E. coli proteins. Sequence-level and structure-level properties of the proteins were extracted bioinformatically and fed to a multilayer perceptron (MLP). In addition, residue-level features and contact maps were used to construct a graph convolutional network (GCN). The out-of-fold predictions of the two models were combined and fed into multiple meta-regressors to create a stacking model. The stacking model with a support vector regressor (SVR) achieved R² of 0.502 and 0.468 on the test and external validation datasets, respectively, outperforming existing regression models. Its improved performance over its base models indicates that the stacking model captured the strengths of those base models as well as the significance of the different features used. Furthermore, the model's transferability was indirectly validated on a dataset of seed storage proteins using the Osborne definition, as well as in a case study using molecular dynamics simulation, showing potential for application beyond microbial proteins to food- and agriculture-related ones.
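The stacking step described in this abstract (out-of-fold predictions from two base models fed to a meta-regressor) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two toy base models (a mean predictor and a one-variable least-squares line) stand in for the MLP and GCN, a two-weight linear least-squares fit stands in for the SVR meta-regressor, and the data are synthetic. It is not the authors' implementation.

```python
from statistics import mean


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_mean(xs, ys):
    """Toy base model A (stand-in): always predicts the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy base model B (stand-in): least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: a * x + (my - a * mx)


def out_of_fold(fit, xs, ys, k):
    """Predict every sample with a model that did not see it in training."""
    oof = [0.0] * len(xs)
    for fold in kfold_indices(len(xs), k):
        held = set(fold)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(tr_x, tr_y)
        for i in fold:
            oof[i] = model(xs[i])
    return oof


def fit_meta(a, b, ys):
    """Meta-regressor (stand-in for SVR): weights for y ~ wa*a + wb*b,
    obtained by solving the 2x2 normal equations with Cramer's rule."""
    saa = sum(p * p for p in a)
    sbb = sum(q * q for q in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * y for p, y in zip(a, ys))
    sby = sum(q * y for q, y in zip(b, ys))
    det = saa * sbb - sab * sab
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det


# Synthetic, exactly linear data: the line model should receive all the weight.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
oof_a = out_of_fold(fit_mean, xs, ys, k=5)
oof_b = out_of_fold(fit_line, xs, ys, k=5)
w_a, w_b = fit_meta(oof_a, oof_b, ys)
```

The meta-regressor is trained only on out-of-fold predictions so that it does not learn from base-model outputs on samples those models were fitted to.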
Affiliation(s)
- Hyukjin Kwon
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
- Zhenjiao Du
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
- Yonghui Li
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA.
2
Miao Y, Sun Z, Lin C, Gu H, Ma C, Liang Y, Wang G. DeePhafier: a phage lifestyle classifier using a multilayer self-attention neural network combining protein information. Brief Bioinform 2024; 25:bbae377. [PMID: 39110476 PMCID: PMC11304974 DOI: 10.1093/bib/bbae377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Revised: 07/04/2024] [Accepted: 07/19/2024] [Indexed: 08/10/2024]
Abstract
Bacteriophages are viruses that infect bacterial cells. They are the most diverse biological entities on earth and play important roles in the microbiome. According to their lifestyle, phages can be divided into virulent phages and temperate phages. Classifying virulent and temperate phages is crucial for further understanding of phage-host interactions. Although several methods have been designed for phage lifestyle classification, they consider either sequence features or gene features alone, leading to low accuracy. A new computational method, DeePhafier, is proposed to improve classification performance on phage lifestyle. Built from several multilayer self-attention neural networks and a global self-attention neural network, and combined with protein features from the position-specific scoring matrix, DeePhafier improves classification accuracy and outperforms two benchmark methods. The accuracy of DeePhafier on five-fold cross-validation is as high as 87.54% for sequences longer than 2000 bp.
Affiliation(s)
- Yan Miao
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Harbin, 150040, Heilongjiang, China
- Zhenyuan Sun
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Harbin, 150040, Heilongjiang, China
- Chen Lin
  - National Institute for Data Science in Health and Medicine, Xiamen University, No. 4221 Xiangannan Road, Xiamen, 361102, Fujian, China
- Haoran Gu
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Harbin, 150040, Heilongjiang, China
- Chenjing Ma
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Harbin, 150040, Heilongjiang, China
- Yingjian Liang
  - Key Laboratory of Hepatosplenic Surgery, Ministry of Education, Department of General Surgery, the First Affiliated Hospital of Harbin Medical University, No. 23 Postal Street, Harbin, 150007, Heilongjiang, China
- Guohua Wang
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Harbin, 150040, Heilongjiang, China
3
Sodani M, Misra CS, Nigam G, Fatima Z, Kulkarni S, Rath D. MSMEG_0311 is a conserved essential polar protein involved in mycobacterium cell wall metabolism. Int J Biol Macromol 2024; 260:129583. [PMID: 38242409 DOI: 10.1016/j.ijbiomac.2024.129583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024]
Abstract
Cell wall synthesis and cell division are two closely linked pathways in a bacterial cell which distinctly influence the growth and survival of a bacterium. This requires an appreciable coordination between the two processes, more so, in case of mycobacteria with an intricate multi-layered cell wall structure. In this study, we investigated a conserved gene cluster using CRISPR-Cas12 based gene silencing technology to show that knockdown of most of the genes in this cluster leads to growth defects. Investigating conserved genes is important as they likely perform vital cellular functions and the functional insights on such genes can be extended to other mycobacterial species. We characterised one of the genes in the locus, MSMEG_0311. The repression of this gene not only imparts severe growth defect but also changes colony morphology. We demonstrate that the protein preferentially localises to the polar region and investigate its influence on the polar growth of the bacillus. A combination of permeability and drug susceptibility assay strongly suggests a cell wall associated function of this gene which is also corroborated by transcriptomic analysis of the knockdown where a number of cell wall associated genes, particularly iniA and sigF regulon get altered. Considering the gene is highly conserved across mycobacterial species and appears to be essential for growth, it may serve as a potential drug target.
Affiliation(s)
- Megha Sodani
  - Radiation Medicine Centre, Medical Group, Bhabha Atomic Research Centre, Mumbai 400085, Maharashtra, India; Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai 400094, Maharashtra, India
- Chitra S Misra
  - Applied Genomics Section, Bio-Science Group, Bhabha Atomic Research Centre, Mumbai 400085, Maharashtra, India
- Gaurav Nigam
  - Amity Institute of Biotechnology, Amity University Haryana, Gurugram, India
- Zeeshan Fatima
  - Amity Institute of Biotechnology, Amity University Haryana, Gurugram, India; Department of Laboratory Medicine, Faculty of Applied Medical Sciences, University of Bisha, Bisha, Saudi Arabia
- Savita Kulkarni
  - Radiation Medicine Centre, Medical Group, Bhabha Atomic Research Centre, Mumbai 400085, Maharashtra, India; Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai 400094, Maharashtra, India.
- Devashish Rath
  - Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai 400094, Maharashtra, India; Applied Genomics Section, Bio-Science Group, Bhabha Atomic Research Centre, Mumbai 400085, Maharashtra, India.
4
Medha, Joshi H, Sharma S, Sharma M. Elucidating the function of hypothetical PE_PGRS45 protein of Mycobacterium tuberculosis as an oxido-reductase: a potential target for drug repurposing for the treatment of tuberculosis. J Biomol Struct Dyn 2023; 41:10009-10025. [PMID: 36448553 DOI: 10.1080/07391102.2022.2151514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 11/19/2022] [Indexed: 06/17/2023]
Abstract
Mycobacterium tuberculosis (Mtb) encodes a total of 67 PE_PGRS proteins, and the definite functions of many of them are still unknown. This study reports the PE_PGRS45 (Rv2615c) protein from Mtb as an NADPH-dependent oxido-reductase with substrate specificity for fatty acyl Coenzyme A. Computational studies predicted PE_PGRS45 to be an integral membrane protein of Mtb. Expression of PE_PGRS45 in non-pathogenic Mycobacterium smegmatis, which does not possess PE_PGRS genes, confirmed its membrane localization. The protein was observed to have an NADPH binding motif. Experimental validation confirmed its NADPH-dependent oxido-reductase activity (Km = 34.85 ± 9.478 μM, Vmax = 96.77 ± 7.184 nmol/min/mg of protein). Therefore, its potential to be targeted by the first-line anti-tubercular drug Isoniazid (INH) was investigated. INH was predicted to bind within the active site of PE_PGRS45, and experiments validated its inhibitory effect on the oxido-reductase activity of PE_PGRS45, with an IC50/Ki value of 5.66 μM. Mtb is resistant to first-line drugs including INH. Therefore, to address the problem of drug-resistant TB, docking and molecular dynamics (MD) simulation studies were performed between PE_PGRS45 and three drugs (Entacapone, Tolcapone and Verapamil) that are used in the treatment of Parkinson's disease and hypertension. PE_PGRS45 bound the three drugs with similar or better affinity in comparison to INH. Additionally, INH and these drugs bound within the same active site of PE_PGRS45. This study found that Mtb's PE_PGRS45 protein has oxido-reductase activity and could be targeted by drugs that can be repurposed for TB treatment. Further in-vitro and in-vivo validation will aid in drug-resistant TB treatment.
HIGHLIGHTS
- In-silico and in-vitro studies of hypothetical protein PE_PGRS45 (Rv2615c) of Mycobacterium tuberculosis (Mtb) reveal it to be an integral membrane protein
- PE_PGRS45 protein has substrate specificity for fatty acyl Coenzyme A (fatty acyl CoA) and possesses NADPH dependent oxido-reductase activity
- Docking and simulation studies revealed that first line anti-tubercular drug Isoniazid (INH) and other drugs with anti-TB property have strong affinity for PE_PGRS45 protein
- Oxido-reductase activity of PE_PGRS45 protein is inhibited by INH
- PE_PGRS45 protein could be targeted by drugs that can be repurposed for TB treatment

Communicated by Ramaswamy H. Sarma.
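To make the reported kinetic constants concrete, the following sketch evaluates the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), with the Km and Vmax values quoted in the abstract. That the enzyme follows simple Michaelis-Menten kinetics is an assumption here (the abstract reports only Km and Vmax), and the function name and default arguments are hypothetical.

```python
def michaelis_menten(s_um, vmax=96.77, km_um=34.85):
    """Initial rate (nmol/min/mg) at substrate concentration s_um (in uM).

    Defaults are the Km and Vmax reported in the abstract; the rate law
    itself is assumed, not taken from the paper.
    """
    if s_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * s_um / (km_um + s_um)


# At [S] = Km the rate is half of Vmax by definition of Km.
half_max_rate = michaelis_menten(34.85)
```

Under this model, raising the substrate concentration to ten times Km brings the rate to about 91% of Vmax, since 10 / (1 + 10) ≈ 0.909.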
Affiliation(s)
- Medha
  - DSKC Bio Discovery Lab and Department of Zoology, Miranda House, University of Delhi, New Delhi, India
- Hemant Joshi
  - Laboratory of Molecular Biology and Genetic Engineering, School of Biotechnology, Jawaharlal Nehru University, New Delhi, India
- Sadhna Sharma
  - DSKC Bio Discovery Lab and Department of Zoology, Miranda House, University of Delhi, New Delhi, India
- Monika Sharma
  - DSKC Bio Discovery Lab and Department of Zoology, Miranda House, University of Delhi, New Delhi, India
5
Nayak SS, Sethi G, Ramadas K. Design of multi-epitope based vaccine against Mycobacterium tuberculosis: a subtractive proteomics and reverse vaccinology based immunoinformatics approach. J Biomol Struct Dyn 2023; 41:14116-14134. [PMID: 36775659 DOI: 10.1080/07391102.2023.2178511] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/24/2022] [Accepted: 02/02/2023] [Indexed: 02/14/2023]
Abstract
Tuberculosis is an airborne transmissible disease caused by Mycobacterium tuberculosis that affects millions of people worldwide. There is still no single comprehensive therapy or preventative available for this lethal illness. The currently available vaccine, BCG, is ineffective at preventing adult pulmonary TB and the reactivation of latent tuberculosis. Therefore, this investigation was intended to design a new multi-epitope vaccine that can address the existing problems. A subtractive proteomics approach was implemented to prioritize essential, virulence, druggable, and antigenic proteins as suitable vaccine candidates. Furthermore, a reverse vaccinology-based immunoinformatics technique was employed to identify potential B-cell, helper T lymphocyte (HTL), and cytotoxic T lymphocyte (CTL) epitopes from the target proteins. An immune-stimulating adjuvant, linkers, and PADRE (Pan HLA-DR epitopes) amino acid sequences, along with the selected epitopes, were used to construct a chimeric multi-epitope vaccine. Molecular docking and normal mode analysis (NMA) were carried out to evaluate the binding mode of the designed vaccine with different immunogenic receptors (MHC-I, MHC-II, and TLR4). In addition, MD simulation, followed by an essential dynamics study and MMPBSA analysis, was carried out to understand the dynamics and stability of the complexes. In-silico cloning was performed using E. coli as an expression system to express the designed vaccine. Finally, the immune simulation study predicted that the designed vaccine could induce a significant immune response through elevation of different immunoglobulins in the host. However, there is an imperative need for experimental validation of the designed vaccine in animal models to confirm its effectiveness and safety.

HIGHLIGHTS
- Multi-epitope based vaccine was designed against Mycobacterium tuberculosis using subtractive proteomics and Immunoinformatics approach.
- The vaccine was found to be antigenic, non-allergenic, immunogenic, and stable based on in-silico prediction.
- Population coverage analysis of the proposed vaccine predicts an effective response in the world population.
- The molecular docking, MD simulation, and MM-PBSA study confirm the stable interaction of the vaccine with immunogenic receptors.
- In silico cloning and immune simulation of the vaccine demonstrated its successful expression in E. coli and induction of immune response in the host.

Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Guneswar Sethi
  - Department of Bioinformatics, Pondicherry University, Pondicherry, India
- Krishna Ramadas
  - Department of Bioinformatics, Pondicherry University, Pondicherry, India
6
Pande A, Patiyal S, Lathwal A, Arora C, Kaur D, Dhall A, Mishra G, Kaur H, Sharma N, Jain S, Usmani SS, Agrawal P, Kumar R, Kumar V, Raghava GPS. Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. J Comput Biol 2023; 30:204-222. [PMID: 36251780 DOI: 10.1089/cmb.2022.0241] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Indexed: 01/24/2023]
Abstract
In the last three decades, a wide range of protein features have been discovered to annotate a protein. Numerous attempts have been made to integrate these features in a software package/platform so that the user may compute a wide range of features from a single source. To complement the existing methods, we developed a method, Pfeature, for computing a wide range of protein features. Pfeature can compute more than 200,000 features required for predicting the overall function of a protein, residue-level annotation of a protein, and the function of chemically modified peptides. It has six major modules, namely, composition, binary profiles, evolutionary information, structural features, patterns, and model building. The composition module computes most of the existing compositional features, plus novel features. The binary profile of amino acid sequences makes it possible to compute the fraction of each type of residue as well as its position. The evolutionary information module computes evolutionary information of a protein in the form of a position-specific scoring matrix profile generated using Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), which is suited to the annotation of a protein and its residues. A structural module was developed for computing structural features/descriptors from the tertiary structure of a protein. These features are suitable for predicting the therapeutic potential of a protein containing non-natural or chemically modified residues. The model-building module implements various machine learning techniques for developing classification and regression models as well as for feature selection. Pfeature also allows the generation of overlapping patterns and features from a protein. Pfeature is available as a user-friendly web server, a Python library, and a stand-alone package.
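As an illustration of the simplest kind of compositional feature mentioned above, the sketch below computes the fraction of each of the 20 standard amino acids in a sequence. It is a generic stand-alone example and does not use or reproduce the Pfeature API; the function name `aa_composition` is hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return {residue: fraction} over the 20 standard amino acids.

    Non-standard symbols (e.g. 'X') are ignored in both the counts
    and the denominator.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


composition = aa_composition("AACD")
```

The result is a fixed-length vector of 20 values regardless of sequence length, which is what makes composition features convenient as input to classification or regression models.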
Affiliation(s)
- Akshara Pande
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Anjali Lathwal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Chakit Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Dilraj Kaur
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Anjali Dhall
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gaurav Mishra
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Department of Electrical Engineering, Shiv Nadar University, Greater Noida, India
- Harpreet Kaur
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Shipra Jain
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Salman Sadullah Usmani
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Piyush Agrawal
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Rajesh Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Vinod Kumar
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
7
Mechanistic Insight into the Enzymatic Inhibition of β-Amyrin against Mycobacterial Rv1636: In Silico and In Vitro Approaches. BIOLOGY 2022; 11:biology11081214. [PMID: 36009841 PMCID: PMC9405466 DOI: 10.3390/biology11081214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2022] [Revised: 08/02/2022] [Accepted: 08/08/2022] [Indexed: 12/05/2022]
Abstract
Simple Summary: Rv1636 is a mycobacterial universal stress protein whose expression level increases under different types of stress conditions. This protein promotes the growth of Mycobacterium tuberculosis under the host-derived stress conditions generated during infection. Therefore, in this manuscript, we attempt to target Rv1636 using a natural inhibitor. Targeting an essential mycobacterial protein with a natural product was hypothesized to yield a molecule with low toxicity and high inhibitory activity. Rv1636 was found to have ATPase activity, which is disturbed by the addition of β-Amyrin to the reaction. β-Amyrin was found to interfere with the ATP binding site of Rv1636, as confirmed by molecular docking and dynamics studies. In addition to its ATPase activity, Rv1636 also has cAMP binding capacity and is involved in balancing cAMP levels inside cells. Targeting Rv1636 with β-Amyrin therefore disrupts its ATPase activity and cAMP regulatory activity, and these conditions might make Mycobacterium tuberculosis more susceptible to host-derived stress conditions.

Abstract: Mycobacterium tuberculosis has seen tremendous success as it has developed defenses to reside in host alveoli despite various host-related stress circumstances. Rv1636 is a universal stress protein contributing to mycobacterial survival in different host-derived stress conditions. Both ATP and cAMP can be bound by Rv1636, and their binding actions are independent of one another. β-Amyrin, a triterpenoid compound, is abundant in medicinal plants and has many pharmacological properties and broad therapeutic potential. The current study uses biochemical, biophysical, and computational methods to define the binding of Rv1636 with β-Amyrin. A substantial interaction between β-Amyrin and Rv1636 was discovered by molecular docking studies, which helped decipher the critical residues involved in the binding process.
VAL60 is a crucial residue found in the complexes of both Rv1636_β-Amyrin and Rv1636-ATP. Additionally, the Rv1636_β-Amyrin complex was shown to be stable by molecular dynamics simulation studies (MD), with minimal changes observed during the simulation. In silico observations were further complemented by in vitro assays. Successful cloning, expression, and purification of Rv1636 were accomplished using Ni-NTA affinity chromatography. The results of the ATPase activity assay indicated that Rv1636’s ATPase activity was inhibited in the presence of various β-Amyrin concentrations. Additionally, circular dichroism spectroscopy (CD) was used to examine modifications to Rv1636 secondary structure upon binding of β-Amyrin. Finally, isothermal titration calorimetry (ITC) advocated spontaneous binding of β-Amyrin with Rv1636 elucidating the thermodynamics of the Rv1636_β-Amyrin complex. Thus, the study establishes that β-Amyrin binds to Rv1636 with a significant affinity forming a stable complex and inhibiting its ATPase activity. The present study suggests that β-Amyrin might affect the functioning of Rv1636, which makes the bacterium vulnerable to different stress conditions.
8
Potential Efficacy of β-Amyrin Targeting Mycobacterial Universal Stress Protein by In Vitro and In Silico Approach. Molecules 2022; 27:molecules27144581. [PMID: 35889451 PMCID: PMC9320329 DOI: 10.3390/molecules27144581] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/29/2022] [Revised: 07/11/2022] [Accepted: 07/14/2022] [Indexed: 01/29/2023]
Abstract
The emergence of drug resistance and the limited number of approved antitubercular drugs prompted identification and development of new antitubercular compounds to cure Tuberculosis (TB). In this work, an attempt was made to identify potential natural compounds that target mycobacterial proteins. Three plant extracts (A. aspera, C. gigantea and C. procera) were investigated. The ethyl acetate fraction of the aerial part of A. aspera and the flower ash of C. gigantea were found to be effective against M. tuberculosis H37Rv. Furthermore, the GC-MS analysis of the plant fractions confirmed the presence of active compounds in the extracts. The Mycobacterium target proteins, i.e., available PDB dataset proteins and proteins classified in virulence, detoxification, and adaptation, were investigated. A total of ten target proteins were shortlisted for further study, identified as follows: BpoC, RipA, MazF4, RipD, TB15.3, VapC15, VapC20, VapC21, TB31.7, and MazF9. Molecular docking studies showed that β-amyrin interacted with most of these proteins and its highest binding affinity was observed with Mycobacterium Rv1636 (TB15.3) protein. The stability of the protein-ligand complex was assessed by molecular dynamic simulation, which confirmed that β-amyrin most firmly interacted with Rv1636 protein. Rv1636 is a universal stress protein, which regulates Mycobacterium growth in different stress conditions and, thus, targeting Rv1636 makes M. tuberculosis vulnerable to host-derived stress conditions.
9
Kootery KP, Sarojini S. Structural and functional characterization of a hypothetical protein in the RD7 region in clinical isolates of Mycobacterium tuberculosis - an in silico approach to candidate vaccines. J Genet Eng Biotechnol 2022; 20:55. [PMID: 35394551 PMCID: PMC8993957 DOI: 10.1186/s43141-022-00340-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/07/2021] [Accepted: 03/30/2022] [Indexed: 11/29/2022]
Abstract
Background: Mycobacterium tuberculosis has been ravaging humans by inflicting respiratory tuberculosis for centuries. Bacillus Calmette-Guérin (BCG) is the only vaccine available for tuberculosis, and it is known to be poorly effective against adult tuberculosis. Proteins belonging to the ESAT-6 family and PE/PPE family elicit immune responses and are included in different vaccine trials. Herein, we study the functional and structural characterization of a 248 amino acid long putative protein, novel hypothetical protein 1 (NHP1), present in the RD7 region of Mycobacterium tuberculosis (identified first by subtractive hybridization in the clinical isolate RGTB123) using bioinformatics tools.

Results: Physicochemical properties were studied using Expasy ProtParam and SMS software. We predicted different B-cell and T-cell epitopes using the immune epitope database (IEDB) and also tested antigenicity, immunogenicity, and allergenicity. Secondary structure prediction indicated 30% alpha helices, 20% beta strands, and 48% random coils. The tertiary structure of the protein was predicted with the Robetta server, using a Mycobacterium smegmatis protein as the putative homolog. Structural evaluations were done with Ramachandran plot analysis, ProSA-web, and VERIFY3D, and with the GalaxyWEB server a more stable structure with good stereochemical properties was validated.

Conclusion: The present study of a subtracted genomic locus using various bioinformatics tools indicated good immunological properties of the putative mycobacterial protein, NHP1. Evidence obtained from the analyses of NHP1 using structure prediction tools strongly points to NHP1 being an ancient protein with a flavodoxin fold and ATP binding sites. Positive scores were also obtained for antigenicity, immunogenicity, and virulence, implying that NHP1 could be a potential vaccine candidate. Such computational studies might give clues for developing newer vaccines for tuberculosis, which is the need of the hour.

Supplementary Information: The online version contains supplementary material available at 10.1186/s43141-022-00340-5.
Affiliation(s)
- Kaviya Parambath Kootery
  - Department of Lifesciences, CHRIST (Deemed to be University), Bengaluru, Karnataka, 560029, India
- Suma Sarojini
  - Department of Lifesciences, CHRIST (Deemed to be University), Bengaluru, Karnataka, 560029, India.
10
Khalili E, Ramazi S, Ghanati F, Kouchaki S. Predicting protein phosphorylation sites in soybean using interpretable deep tabular learning network. Brief Bioinform 2022; 23:bbac015. [PMID: 35152280 DOI: 10.1093/bib/bbac015] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 09/28/2021] [Revised: 12/17/2021] [Accepted: 01/12/2022] [Indexed: 12/17/2023]
Abstract
Phosphorylation of proteins is one of the most significant post-translational modifications (PTMs) and plays a crucial role in plant functionality due to its impact on signaling, gene expression, enzyme kinetics, protein stability and interactions. Accurate prediction of plant phosphorylation sites (p-sites) is vital, as abnormal regulation of phosphorylation usually leads to plant diseases. However, current experimental methods for PTM prediction suffer from high computational cost and are error-prone. The present study develops machine learning-based prediction techniques, including a high-performance interpretable deep tabular learning network (TabNet), to improve the prediction of protein p-sites in soybean. Moreover, we use a hybrid feature set of sequence-based features, physicochemical properties and position-specific scoring matrices to predict serine (Ser/S), threonine (Thr/T) and tyrosine (Tyr/Y) p-sites in soybean for the first time. The experimentally verified p-site data of soybean proteins are collected from the eukaryotic phosphorylation sites database and the database of post-translational modifications. We then remove redundant positive and negative samples by dropping protein sequences with >40% similarity. The developed techniques achieve >70% accuracy. The results demonstrate that the TabNet model is the best-performing classifier when using hybrid features and a window size of 13, yielding 78.96% sensitivity and 77.24% specificity. The results indicate that the TabNet method has advantages in terms of high performance and interpretability. The proposed technique can analyze the data automatically, without measurement errors or human intervention. Furthermore, it can be used to predict putative protein p-sites in plants effectively. The collected dataset and source code are publicly deposited at https://github.com/Elham-khalili/Soybean-P-sites-Prediction.
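The window-based framing of p-site prediction can be sketched as follows: each Ser, Thr, or Tyr residue becomes one candidate sample, represented by the fixed-length sequence window centered on it (13 residues, as in the abstract). This is an illustrative sketch only; the function name `psite_windows` and the use of `X` to pad windows that run past the sequence ends are assumptions, not details taken from the paper or its repository.

```python
def psite_windows(seq, window=13, targets="STY", pad="X"):
    """Return (1-based position, residue, window) for each candidate p-site.

    Each window has exactly `window` characters with the candidate residue
    in the middle; positions beyond the sequence ends are filled with `pad`
    (an assumed convention).
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd so the site is centered")
    half = window // 2
    padded = pad * half + seq + pad * half
    return [
        (i + 1, res, padded[i:i + window])
        for i, res in enumerate(seq)
        if res in targets
    ]


# Toy sequence with one Ser (position 3) and one Thr (position 5).
candidates = psite_windows("MKSAT", window=5)
```

Each window would then be encoded (for example with sequence-based, physicochemical, or PSSM features) before being passed to a classifier.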
Affiliation(s)
- Elham Khalili
  - Department of Plant Science, Faculty of Science, Tarbiat Modares University, Tehran, Iran
- Shahin Ramazi
  - Department of Biophysics, Faculty of Biological Science, Tarbiat Modares University, Tehran, Iran
- Faezeh Ghanati
  - Department of Plant Science, Faculty of Science, Tarbiat Modares University, Tehran, Iran
- Samaneh Kouchaki
  - Department of Electrical and Electronic Engineering, Faculty of Engineering and Physical Sciences, Centre for Vision, Speech, and Signal Processing, University of Surrey, Guildford, UK
11
Michalik M, Djahanschiri B, Leo JC, Linke D. An Update on "Reverse Vaccinology": The Pathway from Genomes and Epitope Predictions to Tailored, Recombinant Vaccines. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2412:45-71. [PMID: 34918241 DOI: 10.1007/978-1-0716-1892-9_4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Abstract
In this chapter, we review the computational approaches that have led to a new generation of vaccines in recent years. There are many alternative routes to develop vaccines based on the concept of reverse vaccinology. They all follow the same basic principles: mining available genome and proteome information for antigen candidates, and recombinantly expressing them for vaccine production. Some of the same principles have been used successfully for cancer therapy approaches. In this review, we focus on infectious diseases, describing the general workflow from bioinformatic predictions of antigens and epitopes down to examples where such predictions have been used successfully for vaccine development.
Affiliation(s)
- Bardya Djahanschiri
- Institute of Cell Biology and Neuroscience, Goethe University, Frankfurt, Germany
- Jack C Leo
- Department of Biosciences, Nottingham Trent University, Nottingham, UK
- Dirk Linke
- Department of Biosciences, University of Oslo, Oslo, Norway.
12
Prathiviraj R, Chellapandi P, Begum A, Kiran GS, Selvin J. Identification of genotypic variants and its proteomic mutations of Brazilian SARS-CoV-2 isolates. Virus Res 2022; 307:198618. [PMID: 34740719 PMCID: PMC8563081 DOI: 10.1016/j.virusres.2021.198618] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/06/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 01/01/2023]
Abstract
The second wave of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is rapidly spreading over the world. Mechanisms behind the escape from current antivirals are still unclear due to the continuous occurrence of SARS-CoV-2 genetic variants. Brazil is the world's second-most COVID-19 affected country. In the present study, we identified the genomic and proteomic variants of Brazilian SARS-CoV-2 isolates. We identified 16 different genotypic variants among the 27 isolates. The genotypes of three isolates, Bra/1236/2021 (G15), Bra/MASP2C844R2/2020 (G11), and Bra/RJ-DCVN5/2020 (G9), have unique mutations in NSP4 (S184N), 2'O-Mutase (R216N), membrane protein (A2V) and Envelope protein (V5A). A mutation in the RdRp of SARS-CoV-2, particularly the change of Pro to Leu at position 323, resulted in the stabilization of the structure in BRA/CD1739-P4/2020. NSP4 and NSP5 protein mutants are more virulent in genotypes 15 and 16. A fast protein folding rate changes the structural stability and leads to escape from current antivirals. Thus, our findings may help researchers to develop potent antivirals based on the new mutants of Brazilian isolates.
Affiliation(s)
- Paulchamy Chellapandi
- Department of Bioinformatics, Bharathidasan University, Tiruchirappalli 620024, India
- Ajima Begum
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- George Seghal Kiran
- Department of Food Science and Technology, Pondicherry University, Puducherry 605014, India
- Joseph Selvin
- Department of Microbiology, Pondicherry University, Puducherry 605014, India
13
Beg MA, Hejazi II, Thakur SC, Athar F. Domain-wise differentiation of Mycobacterium tuberculosis H37Rv hypothetical proteins: A roadmap to discover bacterial survival potentials. Biotechnol Appl Biochem 2021; 69:296-312. [PMID: 33469971 DOI: 10.1002/bab.2109] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/27/2020] [Accepted: 01/06/2021] [Indexed: 01/08/2023]
Abstract
Proteomic information revealed approximately 3,923 proteins in the Mycobacterium tuberculosis H37Rv genome, of which approximately 25% are hypothetical proteins (HPs). The present work comprises computational approaches to identify and characterize the HPs of M. tuberculosis that represent putative targets for the rational development of a drug or antituberculosis strategy. Proteins were primarily classified based on motif and domain information, and were further analyzed for the presence of virulence factors (VFs), determination of localization, and signal peptide/enzymatic cleavage sites. 863 HPs were found, and 599 HPs were finalized based on motifs, that is, GTP (525), Trx (47), SAM (14), PE-PGRS (5), and CBD (8). 80 HPs contain a virulence factor (VF), 24 HPs are localized in the membrane region, and 4 HPs contain signal peptide/enzymatic cleavage sites. The overall parametric study finalizes four HPs, Rv0679c, Rv0906, Rv3627c, and Rv3811, that also comprise a GTPase domain. Structure prediction, structure-based function prediction, molecular docking and mutation analysis of the selected proteins were performed. Docking studies revealed that GTP and a GTPase inhibitor (mac0182344) docked with all four proteins with high affinities. In silico point mutation studies showed that substitution of aspartate with glycine within a GTPase motif produced the largest decrease in stability, and that pH differentiation also affects the proteins' stability. This analysis thus lays out a roadmap toward finding potential drug targets in this bacterium and highlights the role of GTP as a major regulator of mycobacterial cellular pathways.
Affiliation(s)
- Md Amjad Beg
- Centre for Interdisciplinary Research in Basic Science, Jamia Millia Islamia, Jamia Nagar, New Delhi, India
- Iram Iqbal Hejazi
- Centre for Interdisciplinary Research in Basic Science, Jamia Millia Islamia, Jamia Nagar, New Delhi, India
- Sonu Chand Thakur
- Centre for Interdisciplinary Research in Basic Science, Jamia Millia Islamia, Jamia Nagar, New Delhi, India
- Fareeda Athar
- Centre for Interdisciplinary Research in Basic Science, Jamia Millia Islamia, Jamia Nagar, New Delhi, India
14
The evolutionary relationship of S15/NS1RNA binding domains with a similar protein domain pattern - A computational approach. INFORMATICS IN MEDICINE UNLOCKED 2021. [DOI: 10.1016/j.imu.2021.100611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
15
Bian H, Guo M, Wang J. Recognition of Mitochondrial Proteins in Plasmodium Based on the Tripeptide Composition. Front Cell Dev Biol 2020; 8:578901. [PMID: 33043014 PMCID: PMC7525148 DOI: 10.3389/fcell.2020.578901] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/01/2020] [Accepted: 08/13/2020] [Indexed: 01/31/2023] Open
Abstract
Mitochondria play essential roles in eukaryotic cells, especially in Plasmodium cells. They have several unusual evolutionary and functional features that are vital for disease diagnosis and drug design. Thus, predicting mitochondrial proteins of Plasmodium has become a worthwhile task. However, existing computational methods can only predict mitochondrial proteins of Plasmodium falciparum (P. falciparum for short), and these methods have low accuracy. It is highly desirable to design a classifier with high accuracy for predicting mitochondrial proteins for all Plasmodium species, not only P. falciparum. We propose a novel method, named PM-OTC, for predicting mitochondrial proteins in Plasmodium. PM-OTC uses the Support Vector Machine (SVM) as the classifier and the selected tripeptide composition as the features. We adopted the 5-fold cross-validation method to train and test PM-OTC. Results demonstrate that PM-OTC achieves an accuracy of 94.91%, and that the performance of PM-OTC is superior to that of other methods.
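The tripeptide composition named in the abstract above is, in its usual form, the normalized frequency of each of the 20^3 = 8000 possible overlapping tripeptides in a sequence. The sketch below illustrates that general definition only; it is not the PM-OTC implementation, and the feature-selection step ("selected" tripeptides) and the SVM are not reproduced. The function name is a hypothetical choice.

```python
from itertools import product
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def tripeptide_composition(seq: str) -> Dict[str, float]:
    """Return the frequency of each of the 8000 tripeptides in seq.

    Overlapping tripeptides are counted; tripeptides containing
    non-standard residues are skipped (an assumption of this sketch).
    """
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=3)}
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: v / total for k, v in counts.items()}


if __name__ == "__main__":
    features = tripeptide_composition("ACDAC")
    print(len(features), features["ACD"])
```

The resulting 8000-dimensional vector would typically be reduced by a feature-selection step before training a classifier.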
Affiliation(s)
- Haodong Bian
- School of Computer Science, Inner Mongolia University, Hohhot, China
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China; Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Juan Wang
- School of Computer Science, Inner Mongolia University, Hohhot, China; State Key Laboratories of Reproductive Regulation & Breeding of Grassland Livestock, Hohhot, China
16
Yuan F, Liu G, Yang X, Wang S, Wang X. Prediction of oxidoreductase subfamily classes based on RFE-SND-CC-PSSM and machine learning methods. J Bioinform Comput Biol 2020; 17:1950029. [PMID: 31617464 DOI: 10.1142/s021972001950029x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
Oxidoreductase is an enzyme that widely exists in organisms. It plays an important role in cellular energy metabolism and biotransformation processes. Oxidoreductases have many subclasses with different functions, creating an important classification task in bioinformatics. In this paper, a dataset of 2640 oxidoreductase sequences was used to perform an analysis and comparison. The idea of dipeptides was introduced to process the Position Specific Score Matrix (PSSM), since each dipeptide consists of two amino acids and each column of PSSM corresponds to the information of one amino acid. Two kinds of dipeptide scores were proposed, the Standardization Normal Distribution PSSM (SND-PSSM) and the Correlation Coefficient PSSM (CC-PSSM). Recursive Feature Elimination (RFE) is used to extract features from the SND-PSSM and CC-PSSM, and the two sets of extracted features are combined to form a new feature matrix, the RFE-SND-CC-PSSM. The results show that, with the proposed method and a kernel-based nonlinear SVM classifier, the accuracy can reach 95.56% by the Jackknife test. Our method greatly improves the accuracy of oxidoreductase subclass prediction. Using this method to predict the categories of the 6 major types of enzymes effectively improves its prediction accuracy to 94.54%, indicating that this method has general applicability to other protein problems. The results show that our method is effective and universally applicable, and might be complementary to the existing methods.
Affiliation(s)
- Fang Yuan
- Department of Biochemistry and Molecular Biology, School of Basic Medicine, Kunming Medical University, Kunming 650500, P. R. China
- Gan Liu
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Xiwen Yang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650504, P. R. China
- Xueren Wang
- School of Mathematics and Statistics, Yunnan University, Kunming 650504, P. R. China
17
Abdullah M, Suraiya S, Mohamad S, Harun A. Dataset of complete genome assembly and analysis of Mycobacterium tuberculosis strain SIT745/EAI1-MYS. Data Brief 2020; 31:105949. [PMID: 32671154 PMCID: PMC7339031 DOI: 10.1016/j.dib.2020.105949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2020] [Accepted: 06/26/2020] [Indexed: 11/18/2022] Open
Abstract
In this dataset, we report the genome assembly and data analysis of Mycobacterium tuberculosis strain SIT745/EAI1-MYS. Previously, this strain was isolated from a Malaysian patient with extra-pulmonary tuberculosis, and it was identified by spoligotype patterns with fifteen known Shared International Types (SITs). Further analysis showed that this strain has a remarkable phylogeographical specificity for Malaysia. Based on the National Center for Biotechnology Information (NCBI) nucleotide database information, the complete genome consisted of 150 contigs of various sequence lengths and was not assembled. In this assembly, the aforementioned contigs, together with reference sequences from Mycobacterium tuberculosis strain H37Rv and Mycobacterium bovis strain AF2122/97 used for gap closure, were assembled into a single circular chromosome of approximately 4.42 megabases (Mb) with an average GC content of 65.6%. The single circular chromosome was shown to contain 4,009 protein-coding sequences, 3 ribosomal RNAs, 45 transfer RNAs, and 12 superclasses distributed across 277 subsystems, which constitute nearly 1900 genes. The genome information will provide fundamental knowledge of this organism as well as insight for understanding genomic and proteomic profiling and phylogenetic relationships.
18
Shahbaaz M, Potemkin V, Bisetty K, Hassan MI, Hussien MA. Classification and functional analyses of putative virulence factors of Mycobacterium tuberculosis: A combined sequence and structure based study. Comput Biol Chem 2020; 87:107270. [PMID: 32438116 DOI: 10.1016/j.compbiolchem.2020.107270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2019] [Revised: 04/16/2020] [Accepted: 04/26/2020] [Indexed: 11/17/2022]
Abstract
The emergence of drug-resistance mechanisms in Mycobacterium tuberculosis poses one of the biggest challenges to current therapeutic measures, which necessitates the identification of new drug targets. The Hypothetical Proteins (HPs), a class of functionally uncharacterized proteins, may provide a new class of undiscovered therapeutic targets. The genome of M. tuberculosis contains 1000 HPs, whose sequences were analyzed using a variety of bioinformatics tools and functionally annotated. The functions of 662 HPs were successfully predicted and further classified: 483 HPs as enzymes, 141 HPs as involved in diverse cellular mechanisms, and 38 HPs as possible transporter and carrier proteins. Furthermore, 28 HPs were predicted to be virulent in nature. Amongst them, HP P95201, HP P9WM79, HP I6WZ30, HP I6X9T8, HP P9WKP3, and HP P9WK89 showed the highest virulence scores. Therefore, these proteins were subjected to extensive structure analyses, and the dynamics of their conformations were investigated using molecular dynamics simulations, each on a 150 ns time scale. This study provides a deeper understanding of the undiscovered drug targets, and the generated outputs will facilitate the process of drug design and discovery against M. tuberculosis infection.
Affiliation(s)
- Mohd Shahbaaz
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute (SANBI), University of the Western Cape, Private Bag X17, Bellville 7535, Cape Town, South Africa; Laboratory of Computational Modeling of Drugs, South Ural State University, 76 Lenin prospekt, 454080 Chelyabinsk, Russia
- Vladimir Potemkin
- Laboratory of Computational Modeling of Drugs, South Ural State University, 76 Lenin prospekt, 454080 Chelyabinsk, Russia
- Krishna Bisetty
- Department of Chemistry, Durban University of Technology, Durban, 4000, South Africa
- Md Imtaiyaz Hassan
- Center for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi, 110025, India
- Mostafa A Hussien
- Department of Chemistry, Faculty of Science, King Abdulaziz University, P.O. Box 80203 Jeddah 21589, Saudi Arabia; Department of Chemistry, Faculty of Science, Port Said University, Port Said, 42521, Egypt
19
Zarei M, Rahbar MR, Negahdaripour M, Morowvat MH, Nezafat N, Ghasemi Y. Cell Penetrating Peptide: Sequence-Based Computational Prediction for Intercellular Delivery of Arginine Deiminase. CURR PROTEOMICS 2020. [DOI: 10.2174/1570164616666190701120351] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022]
Abstract
Background: Cell-Penetrating Peptides (CPPs), a family of short peptides, are broadly used as the carrier in the delivery of drugs and different therapeutic agents. Thanks to the existence of valuable databases, computational screening of the experimentally validated CPPs can help the researchers to select more effective CPPs for the intercellular delivery of therapeutic proteins. Arginine deiminase of Mycoplasma hominis, an arginine-degrading enzyme, is currently in the clinical trial for treating several arginine auxotrophic cancers. However, some tumor cells have developed resistance to ADI treatment. The ADI resistance arises from the over-expression of argininosuccinate synthetase 1 enzyme, which is involved in arginine synthesis. Intracellular delivery of ADI into tumor cells is suggested as an efficient approach to overcome the aforesaid drawback. Objective: In this study, in-silico tools were used for evaluating the experimentally validated CPPs to select the best CPP candidates for the intracellular delivery of ADI. Results: In this regard, 150 CPPs of protein cargo available at CPPsite were retrieved and evaluated by the CellPPD server. The best CPP candidates for the intracellular delivery of ADI were selected based on stability and antigenicity of the ADI-CPP fusion form. The conjugated forms of ADI with each of the three CPPs including EGFP-hcT (9-32), EGFP-ppTG20, and F(SG)4TP10 were stable and nonantigenic; thus, these sequences were introduced as the best CPP candidates for the intracellular delivery of ADI. In addition, the proposed CPPs had appropriate positive charge and lengths for an efficient cellular uptake. Conclusion: These three introduced CPPs not only are appropriate for the intracellular delivery of ADI, but also can overcome the limitation of its therapeutic application, including short half-life and antigenicity.
Affiliation(s)
- Mahboubeh Zarei
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Mohammad Reza Rahbar
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Manica Negahdaripour
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Navid Nezafat
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Younes Ghasemi
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
20
Li SH, Guan ZX, Zhang D, Zhang ZM, Huang J, Yang W, Lin H. Recent Advancement in Predicting Subcellular Localization of Mycobacterial Protein with Machine Learning Methods. Med Chem 2019; 16:605-619. [PMID: 31584379 DOI: 10.2174/1573406415666191004101913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/01/2019] [Revised: 06/25/2019] [Accepted: 08/23/2019] [Indexed: 01/28/2023]
Abstract
Mycobacterium tuberculosis (MTB) causes tuberculosis (TB), which is reported as one of the most dreadful epidemics. Although many biochemical molecular drugs have been developed to cope with this disease, drug resistance, especially multidrug resistance (MDR) and extensive drug resistance (XDR), poses a huge threat to treatment. Moreover, traditional biochemical experimental methods for tackling TB are time-consuming and costly. Benefiting from the appearance of enormous genomic and proteomic sequence data, TB can be tackled via sequence-based computational approaches, that is, bioinformatics. Studies on predicting the subcellular localization of mycobacterial proteins (MBPs) with high precision and efficiency may help figure out the biological function of these proteins and provide useful insights for protein function annotation as well as drug design. In this review, we report the progress that has been made in computational prediction of the subcellular localization of MBPs, covering the following aspects: 1) Construction of benchmark datasets. 2) Methods of feature extraction. 3) Techniques of feature selection. 4) Application of several published prediction algorithms. 5) The published results. 6) Further study on prediction of the subcellular localization of MBPs.
Affiliation(s)
- Shi-Hao Li
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Dan Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zi-Mei Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Jian Huang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wuritu Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.,Development and Planning Department, Inner Mongolia University, Hohhot, P.R. China
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
21
Sánchez-Barinas CD, Ocampo M, Tabares L, Bermúdez M, Patarroyo MA, Patarroyo ME. Specific Binding Peptides from Rv3632: A Strategy for Blocking Mycobacterium tuberculosis Entry to Target Cells? BIOMED RESEARCH INTERNATIONAL 2019; 2019:8680935. [PMID: 31111070 PMCID: PMC6487176 DOI: 10.1155/2019/8680935] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/13/2018] [Revised: 02/13/2019] [Accepted: 03/03/2019] [Indexed: 11/17/2022]
Abstract
Tuberculosis is an infectious disease caused by Mycobacterium tuberculosis (Mtb, i.e., the aetiological agent); the WHO has established this disease as high priority due to its ensuing mortality. Mtb uses a range of mechanisms for preventing its elimination by an infected host; new, viable alternatives for blocking the host-pathogen interaction are thus sought constantly. This article updates our laboratory's systematic search for antigens using bioinformatics tools to clarify the Mtb H37Rv Rv3632 protein's topology and location. This article reports a C-terminal region consisting of peptides 39255 and 39256 (81Thr-Arg114) having high specific binding regarding two infection-related cell lines (A549 and U937); they inhibited mycobacterial entry to U937 cells in a concentration-dependent manner. Rv3632 forms part of the mycobacterial cell envelope, formed by six linear synthetic peptides. Circular dichroism enabled determining the protein's secondary structure. It was also found that peptide 39254 (61Gly-Thr83) was a HABP for alveolar epithelial cells and inhibited mycobacteria entry to these cells regardless of concentration. Sera from active or latent tuberculosis patients did not recognise HABPs 39254 and 39256. These sequences represent a promising approach aiming at their ongoing modification and for including them when designing a multi-epitope, anti-tuberculosis vaccine.
Affiliation(s)
- Christian David Sánchez-Barinas
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad del Rosario, Carrera 24 No. 63C-69, 111321 Bogotá, Colombia
- Marisol Ocampo
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad del Rosario, Carrera 24 No. 63C-69, 111321 Bogotá, Colombia
- Luisa Tabares
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad del Rosario, Carrera 24 No. 63C-69, 111321 Bogotá, Colombia
- Maritza Bermúdez
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad del Rosario, Carrera 24 No. 63C-69, 111321 Bogotá, Colombia
- Manuel Alfonso Patarroyo
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad del Rosario, Carrera 24 No. 63C-69, 111321 Bogotá, Colombia
- Manuel Elkin Patarroyo
- Fundación Instituto de Inmunología de Colombia (FIDIC), Carrera 50 No. 26–20, 111321 Bogotá, Colombia
- Universidad Nacional de Colombia, Carrera 45 No. 26-85, 11001 Bogotá, Colombia
22
Yonge F, Weixia X. Identification of Mitochondrial Proteins of Malaria Parasite Adding the New Parameter. LETT ORG CHEM 2019. [DOI: 10.2174/1570178615666180608100348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Malaria is a serious infectious disease caused by Plasmodium falciparum (P. falciparum). Mitochondrial proteins of P. falciparum are regarded as effective drug targets against malaria. Thus, it is necessary to accurately identify mitochondrial proteins of the malaria parasite. Many algorithms have been proposed for the prediction of mitochondrial proteins of the malaria parasite and have yielded good results. However, the parameters used by these methods were primarily based on amino acid sequences. In this study, we added a novel parameter for predicting mitochondrial proteins of the malaria parasite based on protein secondary structure. Firstly, we extracted three feature parameters, namely, three kinds of protein secondary structure compositions (3PSS), 20 amino acid compositions (20AAC) and 400 dipeptide compositions (400DC), and used the analysis of variance (ANOVA) to screen the 400 dipeptides. Secondly, we adopted these features to predict mitochondrial proteins of the malaria parasite by using a support vector machine (SVM). Finally, we found that 1) adding the feature of protein secondary structure (3PSS) can indeed improve the prediction accuracy, which demonstrates that protein secondary structure is a valid feature in the prediction of mitochondrial proteins of the malaria parasite; 2) feature combination can improve the prediction results, and feature selection can reduce the dimension and simplify the calculation. We achieved a sensitivity (Sn) of 98.16%, a specificity (Sp) of 97.64% and an overall accuracy (Acc) of 97.88%, with a Matthews correlation coefficient (MCC) of 0.957, by using 3PSS + 20AAC + 34DC as features in 15-fold cross-validation. This result is compared with that of similar work on the same dataset, showing the superiority of our work.
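The four measures reported in the abstract above (Sn, Sp, Acc, MCC) all derive from the entries of a binary confusion matrix. The following sketch shows the standard definitions. The function name and the example counts are hypothetical and are not the counts from the cited study.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy and MCC from counts.

    Assumes at least one actual positive and one actual negative.
    """
    sn = tp / (tp + fn)                      # sensitivity (recall)
    sp = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=90, fn=10, tn=80, fp=20))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside Sn and Sp for imbalanced datasets.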
Affiliation(s)
- Feng Yonge
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
- Xie Weixia
- College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China
23
Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codons can be misread by release factors and stop codons by tRNAs, which also contributes to codon usage in rare cases. This chapter outlines the conceptual framework of codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation) is presented and contrasted with the codon adaptation index (CAI), which does not consider background mutation bias. They are used to re-analyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
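As a rough illustration of the codon adaptation index (CAI) mentioned in the abstract above, the sketch below follows the commonly cited definition: each codon's relative adaptiveness w is its count in a reference set of highly expressed genes divided by the count of the most frequent synonymous codon, and the CAI of a gene is the geometric mean of w over its codons. The function names, the 0.5 pseudocount for unobserved codons, and the toy sequences are assumptions of this sketch. The Index of Translation Elongation described in the abstract is a different index and is not implemented here.

```python
import math
from collections import Counter
from itertools import product
from typing import Dict, List

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def codons(seq: str) -> List[str]:
    """Split a coding sequence into codons, ignoring a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes: List[str]) -> Dict[str, float]:
    """Compute w for each codon from a reference set of genes."""
    counts = Counter(c for g in reference_genes for c in codons(g)
                     if c in CODON_TABLE)
    w = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "*MW":  # stop codons and single-codon amino acids are excluded
            continue
        family_max = max(counts[c] for c, a in CODON_TABLE.items() if a == aa)
        if family_max > 0:
            # Unobserved codons get a 0.5 pseudocount to avoid log(0).
            w[codon] = max(counts[codon], 0.5) / family_max
    return w


def cai(gene: str, w: Dict[str, float]) -> float:
    """Geometric mean of w over the codons of gene that have a w value."""
    logs = [math.log(w[c]) for c in codons(gene) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


if __name__ == "__main__":
    w = relative_adaptiveness(["GCTGCTGCTGCC"])  # toy reference set
    print(cai("GCTGCT", w), cai("GCTGCC", w))
```

Because w is computed only from the reference set, this form of CAI does not correct for background mutation bias, which is the limitation the abstract highlights.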
24
Sarkar R, Mdladla C, Macingwana L, Pietersen RD, Ngwane A, Tabb D, van Helden P, Wiid I, Baker B. Proteomic analysis reveals that sulfamethoxazole induces oxidative stress in M. tuberculosis. Tuberculosis (Edinb) 2018; 111:78-85. [DOI: 10.1016/j.tube.2018.05.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 02/02/2018] [Revised: 03/27/2018] [Accepted: 05/15/2018] [Indexed: 02/04/2023]
25
Integrated proteomics, genomics, metabolomics approaches reveal oxalic acid as pathogenicity factor in Tilletia indica inciting Karnal bunt disease of wheat. Sci Rep 2018; 8:7826. [PMID: 29777151 PMCID: PMC5959904 DOI: 10.1038/s41598-018-26257-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/07/2017] [Accepted: 05/03/2018] [Indexed: 01/21/2023] Open
Abstract
Tilletia indica incites Karnal bunt (KB) disease in wheat. To date, no KB-resistant wheat cultivar has been developed, owing to the non-availability of potential biomarkers related to pathogenicity/virulence for screening resistant wheat genotypes. The present study was carried out to compare the proteomes of highly virulent (TiK) and low-virulence (TiP) T. indica isolates. Twenty-one protein spots consistently observed as up-regulated/differential in the TiK proteome were selected for identification by MALDI-TOF/TOF. The identified sequences showed homology with fungal proteins playing essential roles in plant infection and pathogen survival, including stress response, adhesion, fungal penetration, invasion, colonization, degradation of the host cell wall, and signal transduction pathways. These results were integrated with the T. indica genome sequence to identify homologs of candidate pathogenicity/virulence-related proteins. One protein identified in the TiK isolate was malate dehydrogenase, which converts malate to oxaloacetate, a precursor of oxalic acid. Oxalic acid is a key pathogenicity factor in phytopathogenic fungi. These results were validated by GC-MS-based metabolic profiling of T. indica isolates, which indicated that oxalic acid was exclusively identified in the TiK isolate. Thus, integrated omics approaches lead to the identification of pathogenicity/virulence factor(s) that would provide insights into the pathogenic mechanisms of fungi and aid in devising effective disease management strategies.
26
Carabali-Isajar ML, Ocampo M, Rodriguez DC, Vanegas M, Curtidor H, Patarroyo MA, Patarroyo ME. Towards designing a synthetic antituberculosis vaccine: The Rv3587c peptide inhibits mycobacterial entry to host cells. Bioorg Med Chem 2018; 26:2401-2409. [DOI: 10.1016/j.bmc.2018.03.044] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/23/2017] [Revised: 03/15/2018] [Accepted: 03/29/2018] [Indexed: 01/07/2023]
27
Muthu Krishnan S. Using Chou's general PseAAC to analyze the evolutionary relationship of receptor associated proteins (RAP) with various folding patterns of protein domains. J Theor Biol 2018; 445:62-74. [DOI: 10.1016/j.jtbi.2018.02.008] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 11/29/2017] [Revised: 01/24/2018] [Accepted: 02/12/2018] [Indexed: 01/31/2023]
|
28
|
Srivastava A, Kumar M. Prediction of zinc binding sites in proteins using sequence derived information. J Biomol Struct Dyn 2018; 36:4413-4423. [PMID: 29241411 DOI: 10.1080/07391102.2017.1417910] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/30/2023]
Abstract
Zinc is one of the most abundant catalytic cofactors and an important structural component of a large number of metalloproteins. Hence, prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a dataset of proteins that was compiled nearly a decade ago. Hence, there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for prediction of zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant dataset, which was not used during training, showed better performance of ZincBinder vis-à-vis existing methods. Executable versions, source code, sample datasets, and usage instructions are available at http://proteininformatics.org/mkumar/znbinder/.
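The sensitivity and specificity figures quoted in this abstract follow from the counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic only; the function name and the toy labels are illustrative assumptions and are not taken from ZincBinder or its dataset.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 (binding site) and 0 (non-binding site).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sn, sp = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sn:.3f} specificity={sp:.3f}")
```

In a fivefold cross-validation setting, the same two quantities would typically be computed on each held-out fold and then averaged.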
Affiliation(s)
- Abhishikha Srivastava
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
- Manish Kumar
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
|
29
|
Mycobacterium tuberculosis PPE44 (Rv2770c) is involved in response to multiple stresses and promotes the macrophage expression of IL-12 p40 and IL-6 via the p38, ERK, and NF-κB signaling axis. Int Immunopharmacol 2017; 50:319-329. [DOI: 10.1016/j.intimp.2017.06.028] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 04/14/2017] [Revised: 06/11/2017] [Accepted: 06/26/2017] [Indexed: 11/19/2022]
|
30
|
Nielsen H. Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms. Curr Top Microbiol Immunol 2017; 404:129-158. [PMID: 26728066 DOI: 10.1007/82_2015_5006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 02/08/2023]
Abstract
When predicting the subcellular localization of proteins from their amino acid sequences, there are basically three approaches: signal-based, global property-based, and homology-based. Each of these has its advantages and drawbacks, and it is important when comparing methods to know which approach was used. Various statistical and machine learning algorithms are used with all three approaches, and various measures and standards are employed when reporting the performances of the developed methods. This chapter presents a number of available methods for prediction of sorting signals and subcellular localization, but rather than providing a checklist of which predictors to use, it aims to function as a guide for critical assessment of prediction methods.
Affiliation(s)
- Henrik Nielsen
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, Kemitorvet building 208, 2800, Lyngby, Denmark
|
31
|
Díaz DP, Ocampo M, Varela Y, Curtidor H, Patarroyo MA, Patarroyo ME. Identifying and characterising PPE7 (Rv0354c) high activity binding peptides and their role in inhibiting cell invasion. Mol Cell Biochem 2017; 430:149-160. [PMID: 28205097 DOI: 10.1007/s11010-017-2962-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/27/2016] [Accepted: 01/28/2017] [Indexed: 10/20/2022]
Abstract
This study aimed to characterise the PPE7 protein from the PE/PPE protein family. The presence and transcription of the rv0354c gene in the Mycobacterium tuberculosis complex were determined, and the subcellular localisation of the PPE7 protein on the mycobacterial membrane was confirmed by immunoelectron microscopy. Two peptides were identified as high activity binding peptides (HABPs) and were tested in vitro for their effect on invasion by Mycobacterium tuberculosis H37Rv. HABP 39224 inhibited invasion of A549 epithelial cells and U937 macrophages by more than 50%, whilst HABP 39225 inhibited invasion of U937 cells by 40%. HABP 39224, located in the protein's C-terminal region, has a completely conserved amino acid sequence in M. tuberculosis complex species and could be selected as a base peptide when designing a subunit-based anti-tuberculosis vaccine.
Affiliation(s)
- Diana P Díaz
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad del Rosario, 111321, Bogotá, Colombia
- Marisol Ocampo
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad del Rosario, 111321, Bogotá, Colombia
- Yahson Varela
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad del Rosario, 111321, Bogotá, Colombia
- Hernando Curtidor
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad del Rosario, 111321, Bogotá, Colombia
- Manuel A Patarroyo
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad del Rosario, 111321, Bogotá, Colombia
- Manuel E Patarroyo
- Fundación Instituto de Inmunología de Colombia (FIDIC), 111321, Bogotá, Colombia; Universidad Nacional de Colombia, 11001, Bogotá, Colombia
|
32
|
Sharma D, Lata M, Singh R, Deo N, Venkatesan K, Bisht D. Cytosolic Proteome Profiling of Aminoglycosides Resistant Mycobacterium tuberculosis Clinical Isolates Using MALDI-TOF/MS. Front Microbiol 2016; 7:1816. [PMID: 27895634 PMCID: PMC5108770 DOI: 10.3389/fmicb.2016.01816] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/19/2016] [Accepted: 10/28/2016] [Indexed: 12/25/2022] Open
Abstract
The emergence of extensively drug-resistant tuberculosis (XDR-TB) is a consequence of the failure of second-line TB treatment. Aminoglycosides are important second-line anti-TB drugs used to treat multidrug-resistant tuberculosis (MDR-TB). The main known mechanism of action of aminoglycosides is inhibition of protein synthesis by interfering with the normal functioning of the ribosome. The primary targets of aminoglycosides are ribosomal RNA and its associated proteins. Various mechanisms have been proposed for aminoglycoside resistance, but some remain unresolved. As proteins are involved in most biological processes, they act as potential diagnostic markers and drug targets. In the present study, we analyzed the purely cytosolic proteome of amikacin (AK)- and kanamycin (KM)-resistant Mycobacterium tuberculosis isolates by proteomic and bioinformatic approaches. Twenty protein spots were found to be overexpressed in resistant isolates and were identified. Among these, Rv3208A, Rv2623, Rv1360, Rv2140c, Rv1636, and Rv2185c are six proteins with unknown functions or undefined roles. Docking results showed that AK and KM bind to the conserved domains (DUF, USP-A, luciferase, PEBP, and polyketide cyclase/dehydrase domains) of these hypothetical proteins, and overexpression of these proteins might neutralize or modulate the effect of the drug molecules. TBPred and GPS-PUP predicted the cytoplasmic nature of, and potential pupylation sites within, these identified proteins, respectively. STRING analysis also suggested that the overexpressed proteins, along with their interaction partners, might be involved in aminoglycoside resistance. The cumulative effect of these overexpressed proteins could contribute to AK and KM resistance by mitigating toxicity, repressing the drug target, and exerting a neutralizing effect. These findings need further exploration for the development of newer therapeutics or diagnostic markers against AK and KM resistance, so that an extreme condition like XDR-TB can be prevented.
Affiliation(s)
- Divakar Sharma
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Manju Lata
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Rananjay Singh
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Nirmala Deo
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Krishnamurthy Venkatesan
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Deepa Bisht
- Department of Biochemistry, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
|
33
|
Muthu Krishnan S. Classify vertebrate hemoglobin proteins by incorporating the evolutionary information into the general PseAAC with the hybrid approach. J Theor Biol 2016; 409:27-37. [PMID: 27575465 DOI: 10.1016/j.jtbi.2016.08.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/17/2016] [Revised: 08/11/2016] [Accepted: 08/16/2016] [Indexed: 01/26/2023]
Abstract
Hemoglobin is an oxygen-binding protein present in all kingdoms of life, from prokaryotes to eukaryotes, but best characterized in vertebrates. An attempt was made to classify vertebrate hemoglobin (VerHb) proteins according to their animal class, based on evolutionary profiles in the general pseudo amino acid composition (PseAAC) and a hybrid approach. A support vector machine (SVM) was applied to develop all models, and the prediction results were further compared according to animal classification. The performance of the approaches was estimated using five-fold cross-validation. The prediction performance was further investigated by receiver operating characteristic (ROC) and prediction score graphs. The prediction accuracy (ACC), sensitivity (SN) and specificity (SP) were examined to identify the most accurate predictions at the threshold level. Based on the approach, a web tool has been developed for identifying VerHb proteins.
Affiliation(s)
- S Muthu Krishnan
- CSIR - Institute of Microbial Technology (IMTECH), Sector-39A, Chandigarh, India
|
34
|
Gazi MA, Kibria MG, Mahfuz M, Islam MR, Ghosh P, Afsar MNA, Khan MA, Ahmed T. Functional, structural and epitopic prediction of hypothetical proteins of Mycobacterium tuberculosis H37Rv: An in silico approach for prioritizing the targets. Gene 2016; 591:442-55. [PMID: 27374154 DOI: 10.1016/j.gene.2016.06.057] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/20/2016] [Revised: 04/27/2016] [Accepted: 06/28/2016] [Indexed: 01/11/2023]
Abstract
The global control of tuberculosis (TB) remains a great challenge from the standpoint of diagnosis, detection of drug resistance, and treatment. Major serodiagnostic limitations include low sensitivity and high cost in detecting TB. Treatment, in turn, is often hindered by the low efficacy of commonly used drugs and by resistance developed by the bacteria. Hence, there is a need to look into newer diagnostic and therapeutic targets. The available proteome information suggests that, among the 3906 proteins in Mycobacterium tuberculosis H37Rv, about a quarter remain classified as hypothetical and uncharacterized. This study combines a number of bioinformatics tools to analyze those hypothetical proteins (HPs). An entire set of 999 proteins was first screened for protein sequences having conserved domains with high confidence, using a combination of the latest versions of protein family databases. Subsequently, 98 such potential target proteins were extensively analyzed in terms of physicochemical characteristics, protein-protein interactions, subcellular localization, structural similarity and functional classification. Next, we predicted antigenic proteins from the entire set and identified B and T cell epitopes of these proteins in M. tuberculosis H37Rv. We predicted that these HPs belong to various classes of proteins such as enzymes, transporters, receptors, structural proteins, transcription regulators and other proteins. The structural similarity prediction of the annotated proteins substantiated the functional classification of those proteins. Based on higher antigenicity scores and subcellular localization, we chose two of the antigenic proteins (NP_216420.1 and NP_216903.1) to exemplify the B and T cell epitope prediction approach. Finally, we found 15 epitopes located partially or fully in the linear epitope region, and 21 conformational epitopes using the ElliPro server. The in silico methodology used in this study, and the data thus generated for HPs of M. tuberculosis H37Rv, may facilitate swift experimental identification of potential serodiagnostic and therapeutic targets for treatment and control.
Affiliation(s)
- Md Amran Gazi
- Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
- Mohammad Golam Kibria
- Parasitology Laboratory, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
- Mustafa Mahfuz
- Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
- Md Rezaul Islam
- International Max Planck Research School, Grisebachstraße 5, 37077 Göttingen, Germany
- Prakash Ghosh
- Parasitology Laboratory, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
- Md Nure Alam Afsar
- Infectious Diseases Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
- Md Arif Khan
- Bio-Bio-1 Research Foundation, Sangskriti Bikash Kendra Bhaban, 1/E/1, Poribag, Dhaka 1000, Bangladesh
- Tahmeed Ahmed
- Nutrition and Clinical Services Division, International Centre for Diarrhoeal Disease Research, Bangladesh (icddr,b), Bangladesh
|
35
|
Díaz DP, Ocampo M, Pabón L, Herrera C, Patarroyo MA, Munoz M, Patarroyo ME. Mycobacterium tuberculosis PE9 protein has high activity binding peptides which inhibit target cell invasion. Int J Biol Macromol 2016; 86:646-55. [DOI: 10.1016/j.ijbiomac.2015.12.081] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/07/2015] [Revised: 12/03/2015] [Accepted: 12/26/2015] [Indexed: 10/22/2022]
|
36
|
Michalik M, Djahanshiri B, Leo JC, Linke D. Reverse Vaccinology: The Pathway from Genomes and Epitope Predictions to Tailored Recombinant Vaccines. Methods Mol Biol 2016; 1403:87-106. [PMID: 27076126 DOI: 10.1007/978-1-4939-3387-7_4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
In this chapter, we review the computational approaches that have led to a new generation of vaccines in recent years. There are many alternative routes to develop vaccines based on the technology of reverse vaccinology. We focus here on bacterial infectious diseases, describing the general workflow from bioinformatic predictions of antigens and epitopes down to examples where such predictions have been used successfully for vaccine development.
Affiliation(s)
- Marcin Michalik
- Department of Biosciences, University of Oslo, 0371, Oslo, Norway; Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
- Bardya Djahanshiri
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany; Department for Applied Bioinformatics, Goethe-University, 60438, Frankfurt, Germany
- Jack C Leo
- Department of Biosciences, University of Oslo, 0371, Oslo, Norway
- Dirk Linke
- Department of Biosciences, University of Oslo, 0371, Oslo, Norway; Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
|
37
|
Structure-based functional annotation of hypothetical proteins from Candida dubliniensis: a quest for potential drug targets. 3 Biotech 2015; 5:561-576. [PMID: 28324558 PMCID: PMC4522726 DOI: 10.1007/s13205-014-0256-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/31/2014] [Accepted: 09/30/2014] [Indexed: 12/24/2022] Open
Abstract
Candida dubliniensis is an emerging pathogenic yeast in humans, and infections are usually restricted to mucosal parts of the body. However, its presence in specimens from immunocompromised individuals, especially HIV-positive patients, is of major medical concern. A large fraction of the C. dubliniensis genome in the database encodes proteins that are uncharacterized with respect to their biochemical, biophysical, and/or cellular functions and are identified as hypothetical proteins (HPs). Functional annotation of the Candida genome is therefore essential to facilitate the understanding of mechanisms of pathogenesis and of biochemical pathways important for selecting novel therapeutic targets. Here, we carried out an extensive analysis to explain the functional properties of the genome, using available protein structure and function analysis tools. We successfully modeled the structures of eight HPs for which a template with moderate sequence similarity was available in the Protein Data Bank. All modeled structures were analyzed, and we found that these proteins may act as transporter, kinase, transferase, ketosteroid, isomerase, hydrolase, oxidoreductase, and binding targets for DNA and RNA. Since these unique HPs of Candida showed no homologs in humans, they are expected to be potential targets for future antifungal therapy.
|
38
|
Abstract
RNA interference (RNAi) is one of the most popular and effective molecular technologies for knocking down the expression of an individual gene of interest in living organisms. Yet the technology still faces the major issue of nonspecific gene silencing, which can compromise gene functional characterization and the interpretation of phenotypes associated with individual gene knockdown. Designing an effective and target-specific small interfering RNA (siRNA) for induction of RNAi is therefore the major challenge in RNAi-based gene silencing. A 'good' siRNA molecule must possess three key features: (a) the ability to specifically silence an individual gene of interest, (b) little or no effect on the expressions of unintended siRNA gene targets (off-target genes), and (c) no cell toxicity. Although several siRNA design and analysis algorithms have been developed, only a few of them are specifically focused on gene silencing in plants. Furthermore, current algorithms lack a comprehensive consideration of siRNA specificity, efficacy, and nontoxicity in siRNA design, mainly due to lack of integration of all known rules that govern different steps in the RNAi pathway. In this review, we first describe popular RNAi methods that have been used for gene silencing in plants and their serious limitations regarding gene-silencing potency and specificity. We then present novel, rationale-based strategies in combination with computational and experimental approaches to induce potent, specific, and nontoxic gene silencing in plants.
Affiliation(s)
- Firoz Ahmed
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
|
39
|
Zhu PP, Li WC, Zhong ZJ, Deng EZ, Ding H, Chen W, Lin H. Predicting the subcellular localization of mycobacterial proteins by incorporating the optimal tripeptides into the general form of pseudo amino acid composition. Mol Biosyst 2015; 11:558-63. [DOI: 10.1039/c4mb00645c] [Citation(s) in RCA: 97] [Impact Index Per Article: 10.8] [Indexed: 11/21/2022]
Abstract
Mycobacterium tuberculosis is a bacterium that causes tuberculosis, one of the most prevalent infectious diseases.
Affiliation(s)
- Pan-Pan Zhu
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
- Wen-Chao Li
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
- Zhe-Jin Zhong
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
- En-Ze Deng
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
- Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054
|
40
|
Tiwari AK, Srivastava R. A survey of computational intelligence techniques in protein function prediction. Int J Proteomics 2014; 2014:845479. [PMID: 25574395 PMCID: PMC4276698 DOI: 10.1155/2014/845479] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 09/10/2014] [Revised: 10/31/2014] [Accepted: 11/07/2014] [Indexed: 02/08/2023]
Abstract
In the past, knowledge of previously unknown proteins grew massively with the advancement of high-throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. Earlier, homology-based approaches were used to predict protein function, but they failed when a new protein differed from previously characterized ones. Therefore, to alleviate the problems associated with traditional homology-based approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a comprehensive, state-of-the-art review of computational intelligence techniques for protein function prediction using sequence, structure, protein-protein interaction network, and gene expression data. These techniques are used in a wide range of applications, such as prediction of DNA- and RNA-binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers who addressed these problems using computational intelligence techniques with appropriate datasets to improve prediction performance. The summary shows that ensemble classifiers and the integration of multiple heterogeneous data are useful for protein function prediction.
Affiliation(s)
- Arvind Kumar Tiwari
- Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi 221005, India
- Rajeev Srivastava
- Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi 221005, India
|
41
|
Song L, Li D, Zeng X, Wu Y, Guo L, Zou Q. nDNA-Prot: identification of DNA-binding proteins based on unbalanced classification. BMC Bioinformatics 2014; 15:298. [PMID: 25196432 PMCID: PMC4165999 DOI: 10.1186/1471-2105-15-298] [Citation(s) in RCA: 127] [Impact Index Per Article: 12.7] [Received: 06/01/2014] [Accepted: 09/03/2014] [Indexed: 11/23/2022] Open
Abstract
Background: DNA-binding proteins are vital for the study of cellular processes. In recent genome engineering studies, the identification of proteins with certain functions has become increasingly important and needs to be performed rapidly and efficiently. In previous years, several approaches have been developed to improve the identification of DNA-binding proteins. However, the currently available resources are insufficient to accurately identify these proteins. As a result, previous research has been limited by the relatively unbalanced accuracy rate and the low identification success of current methods. Results: In this paper, we explored the practicality of modelling DNA-binding identification with an ensemble classifier, and designed a new predictor (nDNA-Prot). The presented framework comprises two stages: a 188-dimension feature extraction method to obtain the protein structure, and an ensemble classifier designated imDC. Experiments using different datasets showed that our method is more successful than traditional methods in identifying DNA-binding proteins. Features were selected using minimum Redundancy Maximum Relevance (mRMR). An accuracy rate of 95.80% and an area under the curve (AUC) value of 0.986 were obtained in cross-validation. On a test dataset, our method achieved 86% accuracy, versus 76% for iDNA-Prot and 68% for DNA-Prot. Conclusions: Our method can help to accurately identify DNA-binding proteins, and the web server is accessible at http://datamining.xmu.edu.cn/~songli/nDNA. In addition, we predicted possible DNA-binding protein sequences among all of the sequences in the UniProtKB/Swiss-Prot database. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-298) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Li Guo
- School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
|
42
|
Liu WX, Deng EZ, Chen W, Lin H. Identifying the subfamilies of voltage-gated potassium channels using feature selection technique. Int J Mol Sci 2014; 15:12940-51. [PMID: 25054318 PMCID: PMC4139883 DOI: 10.3390/ijms150712940] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 06/10/2014] [Revised: 07/13/2014] [Accepted: 07/14/2014] [Indexed: 11/16/2022] Open
Abstract
Voltage-gated K+ channels (VKCs) play important roles in biological processes, especially in the nervous system. Different subfamilies of VKCs have different biological functions. Thus, identifying the subfamily of a VKC is a meaningful task because it can guide disease diagnosis and drug design. However, traditional wet-lab experimental methods are costly and time-consuming. It is highly desirable to develop an effective and powerful computational tool for identifying different subfamilies of VKCs. In this study, a predictor called iVKC-OTC has been developed by incorporating the optimized tripeptide composition (OTC), generated by a feature selection technique, into the general form of pseudo-amino acid composition to identify six subfamilies of VKCs. One of the remarkable advantages of introducing the optimized tripeptide composition is that it avoids the notorious curse of dimensionality and overfitting problems in statistical prediction. On a benchmark dataset evaluated with a jackknife test, the overall accuracy achieved by iVKC-OTC reached 96.77% in identifying the six subfamilies of VKCs, indicating that the new predictor is promising, or at least may become a complementary tool to existing methods in this area. It has not escaped our notice that the optimized tripeptide composition can also be used to investigate other protein classification problems.
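The tripeptide composition mentioned in this abstract is a frequency vector over all 20^3 = 8000 possible tripeptides. The following is a minimal sketch of the full, unoptimized composition only; the feature selection step that yields the optimized subset is not reproduced, and the function name and example sequence are illustrative assumptions rather than anything from iVKC-OTC.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]  # 8000 entries
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return the 8000-dimensional tripeptide frequency vector of a sequence.

    Overlapping windows of length 3 are counted; windows containing a
    non-standard residue are skipped. Frequencies sum to 1 when at least
    one valid window exists.
    """
    sequence = sequence.upper()
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for start in range(len(sequence) - 2):
        idx = INDEX.get(sequence[start:start + 3])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [c / total for c in counts]


# Hypothetical toy sequence, used only to exercise the function.
vec = tripeptide_composition("MKVLAAGMKV")
print(len(vec), vec[INDEX["MKV"]])
```

Because most of the 8000 entries are zero for any single protein, a feature selection step that keeps only informative tripeptides is what the abstract credits with limiting dimensionality and overfitting.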
Affiliation(s)
- Wei-Xin Liu
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- En-Ze Deng
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, Hebei United University, Tangshan 063000, China
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
43
|
acACS: improving the prediction accuracy of protein subcellular locations and protein classification by incorporating the average chemical shifts composition. ScientificWorldJournal 2014; 2014:864135. [PMID: 25110749 PMCID: PMC4106170 DOI: 10.1155/2014/864135] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 05/19/2014] [Revised: 06/15/2014] [Accepted: 06/16/2014] [Indexed: 11/17/2022] Open
Abstract
The chemical shift is sensitive to changes in the local environments and can report the structural changes. The structure information of a protein can be represented by the average chemical shifts (ACS) composition, which has been broadly applied for enhancing the prediction accuracy in protein subcellular locations and protein classification. However, different kinds of ACS composition can solve different problems. We established an online web server named acACS, which can convert secondary structure into average chemical shift and then compose the vector for representing a protein by using the algorithm of auto covariance. Our solution is easy to use and can meet the needs of users.
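Auto covariance, as named in this abstract, turns a per-residue numeric series of any length into a fixed-length vector by correlating the series with lagged copies of itself. The following is a minimal sketch of the standard auto covariance transform; the per-residue values are hypothetical, and the mapping from secondary structure to average chemical shift used by acACS is not reproduced here.

```python
def auto_covariance(values, max_lag):
    """Return auto covariance features for lags 1..max_lag.

    For lag g and a series x of length n with mean m:
        AC(g) = sum_{i=0}^{n-g-1} (x[i] - m) * (x[i+g] - m) / (n - g)
    Lags that are not shorter than the series yield 0.0.
    """
    n = len(values)
    mean = sum(values) / n
    features = []
    for lag in range(1, max_lag + 1):
        if lag >= n:
            features.append(0.0)
            continue
        total = sum(
            (values[i] - mean) * (values[i + lag] - mean)
            for i in range(n - lag)
        )
        features.append(total / (n - lag))
    return features


# Hypothetical per-residue values standing in for average chemical shifts.
shifts = [1.0, 2.0, 3.0, 4.0]
ac = auto_covariance(shifts, 2)
print(ac)
```

The output length depends only on `max_lag`, which is what lets proteins of different lengths be represented by vectors of the same dimension.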
|
44
|
Zhang L, Zhao X, Kong L. Predict protein structural class for low-similarity sequences by evolutionary difference information into the general form of Chou's pseudo amino acid composition. J Theor Biol 2014; 355:105-10. [PMID: 24735902 DOI: 10.1016/j.jtbi.2014.04.008] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 12/06/2013] [Revised: 02/26/2014] [Accepted: 04/04/2014] [Indexed: 10/25/2022]
Abstract
Knowledge of protein structural class plays an important role in characterizing the overall folding type of a given protein. At present, extracting information from the protein sequence alone for structural class prediction remains a challenge in computational biology, particularly for low-similarity sequences. In this study, a novel sequence representation method based on the position specific scoring matrix is proposed for protein structural class prediction. Using a defined evolutionary difference formula, proteins of varying length are expressed as vectors of uniform dimension that represent the evolutionary difference information between adjacent residues of a given protein. To implement and evaluate the proposed method, a support vector machine and jackknife tests are employed on three widely used datasets, 25PDB, 1189 and 640, with sequence similarity lower than 25%, 40% and 25%, respectively. Comparison of our results with previous methods shows that our method is a promising approach for predicting protein structural class, especially for low-similarity sequences.
Affiliation(s)
- Lichao Zhang
- College of Marine Life Science, Ocean University of China, Yushan Road, Qingdao 266003, PR China
| | - Xiqiang Zhao
- College of Mathematical Science, Ocean University of China, Songling Road, Qingdao 266100, PR China.
| | - Liang Kong
- College of Mathematics and Information Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, PR China
| |
|
45
|
Zakeri P, Jeuris B, Vandebril R, Moreau Y. Protein fold recognition using geometric kernel data fusion. Bioinformatics 2014; 30:1850-7. [PMID: 24590441 PMCID: PMC4071197 DOI: 10.1093/bioinformatics/btu118] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
Abstract
Motivation: Various approaches based on features extracted from protein sequences, often combined with machine learning methods, have been used in the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. In addition to the limitation of linear combinations, working with such approaches could cause a loss of potentially useful information. Results: We design several techniques to combine kernel matrices by taking more involved, geometry-inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features, including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy on the protein fold recognition test set obtained by our methods is ~86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model has been developed by incorporating the functional domain composition of proteins through a hybridization model. Using our proposed hybridization model, the protein fold recognition accuracy is further improved to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels.
Availability and implementation: The MATLAB code used for our proposed geometric kernel fusion frameworks is publicly available at http://people.cs.kuleuven.be/~raf.vandebril/homepage/software/geomean.php?menu=5/ Contact: pooyapaydar@gmail.com or yves.moreau@esat.kuleuven.be Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pooya Zakeri
- Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, iMinds Medical IT and Department of Computer Science, KU Leuven, 3001 Leuven, Belgium
| | - Ben Jeuris
- Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, iMinds Medical IT and Department of Computer Science, KU Leuven, 3001 Leuven, Belgium
| | - Raf Vandebril
- Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, iMinds Medical IT and Department of Computer Science, KU Leuven, 3001 Leuven, Belgium
| | - Yves Moreau
- Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, iMinds Medical IT and Department of Computer Science, KU Leuven, 3001 Leuven, Belgium
| |
|
46
|
Muthukrishnan S, Puri M, Lefevre C. Support vector machine (SVM) based multiclass prediction with basic statistical analysis of plasminogen activators. BMC Res Notes 2014; 7:63. [PMID: 24468032 PMCID: PMC3924408 DOI: 10.1186/1756-0500-7-63] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/23/2013] [Accepted: 01/16/2014] [Indexed: 12/05/2022] Open
Abstract
Background: Plasminogen (Pg), the precursor of the proteolytic and fibrinolytic enzyme of blood, is converted to the active enzyme plasmin (Pm) by different plasminogen activators (tissue plasminogen activators and urokinase), including the bacterial activators streptokinase and staphylokinase, which activate Pg to Pm and thus are used clinically for thrombolysis. The identification of Pg-activators is therefore an important step in understanding their functional mechanism and deriving new therapies. Methods: In this study, different computational methods for predicting plasminogen activator peptide sequences with high accuracy were investigated, including support vector machines (SVM) based on amino acid composition (AC), dipeptide composition (DC), PSSM profile and Hybrid methods, used to predict different Pg-activators of both prokaryotic and eukaryotic origin. Results: Overall maximum accuracy, evaluated using five-fold cross validation, was 88.37%, 84.32%, 87.61% and 85.63%, with MCC of 0.87, 0.83, 0.86 and 0.85, for the amino acid composition (AC), dipeptide composition (DC), PSSM profile and Hybrid methods, respectively. Through this study, we found that the different subfamilies of Pg-activators are quite closely correlated in terms of amino acid, dipeptide, PSSM and Hybrid compositions. Our prediction results therefore show that plasminogen activators are predictable with high accuracy from their primary sequence. Prediction performance was also cross-checked by confusion matrix and ROC (receiver operating characteristic) analysis. A web server to facilitate the prediction of Pg-activators from primary sequence data was implemented. Conclusion: The results show that dipeptide, PSSM profile, and Hybrid based methods perform better than single amino acid composition (AC). Furthermore, we have developed a web server that predicts Pg-activators and their classification (available online at http://mamsap.it.deakin.edu.au/plas_pred/home.html). Our experimental results show that our approaches are faster and generally achieve good prediction performance.
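The amino acid composition (AC) and dipeptide composition (DC) features named in this abstract can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual definitions (fraction of each of the 20 standard residues, and fraction of each of the 400 ordered residue pairs among adjacent positions); the function names and the toy sequence are hypothetical, and some implementations report percentages rather than fractions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return 20 fractions, one per standard amino acid."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    return [counts[aa] / len(sequence) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return 400 fractions, one per ordered pair of adjacent residues."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two residues")
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


example = "ACAC"  # hypothetical toy sequence, not from the paper's dataset
aac = amino_acid_composition(example)
dpc = dipeptide_composition(example)
print(len(aac), len(dpc))  # 20 400
```

Either vector has a fixed length regardless of sequence length, so it could be passed directly to an SVM or any other fixed-input classifier.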
Affiliation(s)
| | - Munish Puri
- Fermentation and Protein Biotechnology Laboratory, Department of Biotechnology, Punjabi University, Patiala, India; CSIR-IMTECH, Chandigarh, India.
| | | |
|
47
|
Comparative proteomic analysis of Mycobacterium tuberculosis strain H37Rv versus H37Ra. Int J Mycobacteriol 2013; 2:220-6. [PMID: 26786126 DOI: 10.1016/j.ijmyco.2013.10.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/07/2013] [Revised: 10/11/2013] [Accepted: 10/14/2013] [Indexed: 10/26/2022] Open
Abstract
BACKGROUND: Mycobacterium tuberculosis (MTB) H37Ra is an attenuated tubercle bacillus closely related to the virulent type strain MTB H37Rv. In spite of extensive study, the variation in virulence between the MTB H37Rv and MTB H37Ra strains is not yet understood. Differences in protein expression or structure due to mutation may be an important factor in the virulence of the MTB H37Rv strain. METHODS: In this study, a whole proteome comparison between these two strains was carried out using bioinformatics approaches to elucidate differences in their protein sequences. RESULTS: Comparison of the whole proteomes using the NCBI standalone BLAST program revealed 3759 proteins identical in both strains, out of 4003 proteins in MTB H37Rv and 4034 proteins in MTB H37Ra; 244 proteins of MTB H37Rv and 260 proteins of MTB H37Ra were found to be non-identical. A total of 172 proteins were identified with mutations (insertions/deletions/substitutions) in MTB H37Ra, while 53 proteins of MTB H37Rv and 85 proteins of MTB H37Ra were found to be distinct. Among the 244 non-identical proteins, 19 were reported to have an important biological function; this study shows mutations in these proteins in MTB H37Ra. CONCLUSION: This study reports the protein differences with mutations between MTB H37Rv and H37Ra, which may help in better understanding the pathogenesis and virulence properties of MTB H37Rv.
|
48
|
Gupta S, Ansari HR, Gautam A, Raghava GPS. Identification of B-cell epitopes in an antigen for inducing specific class of antibodies. Biol Direct 2013; 8:27. [PMID: 24168386 PMCID: PMC3831251 DOI: 10.1186/1745-6150-8-27] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.6] [Received: 06/04/2013] [Accepted: 10/25/2013] [Indexed: 01/11/2023] Open
Abstract
Background: In the past, numerous methods have been developed for predicting antigenic regions or B-cell epitopes that can induce a B-cell response. To the best of the authors' knowledge, no method has been developed for predicting B-cell epitopes that can induce a specific class of antibody (e.g., IgA, IgG) except allergenic epitopes (IgE). In this study, an attempt has been made to understand the relation between the primary sequence of epitopes and the class of antibodies generated. Results: The dataset used in this study has been derived from the Immune Epitope Database and consists of 14725 B-cell epitopes that include 11981 IgG, 2341 IgE, 403 IgA specific epitopes and 22835 non-B-cell epitopes. In order to understand the preference of residues or motifs in these epitopes, we computed and compared the amino acid and dipeptide composition of IgG, IgE, IgA inducing epitopes and non-B-cell epitopes. Differences in the composition profiles of different classes of epitopes were observed, and a few residues were found to be preferred. Based on these observations, we developed models for predicting antibody class-specific B-cell epitopes using various features such as amino acid composition, dipeptide composition, and binary profiles. Among these, the dipeptide composition-based support vector machine model achieved a maximum Matthews correlation coefficient of 0.44, 0.70 and 0.45 for IgG, IgE and IgA specific epitopes, respectively. All models were developed on an experimentally validated non-redundant dataset and evaluated using five-fold cross validation. In addition, the performance of the dipeptide-based model was also evaluated on an independent dataset. Conclusion: The present study utilizes amino acid sequence information for predicting the tendencies of antigens to induce different classes of antibodies. For the first time, in silico models have been developed for predicting B-cell epitopes that can induce a specific class of antibodies. A web service called IgPred has been developed to serve the scientific community. This server will be useful for researchers working in the field of subunit/epitope/peptide-based vaccines and immunotherapy (http://crdd.osdd.net/raghava/igpred/). Reviewers: This article was reviewed by Dr. M Michael Gromiha, Dr Christopher Langmead (nominated by Dr Robert Murphy) and Dr Lina Ma (nominated by Dr Zhang Zhang).
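The Matthews correlation coefficient (MCC) used as the performance measure here follows the standard formula MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The short Python sketch below computes it from confusion-matrix counts; the function name and the example counts are hypothetical and are not results from the paper, and returning 0.0 for a zero denominator is a common convention assumed here.

```python
import math


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only.
print(round(mcc_from_counts(tp=40, tn=45, fp=5, fn=10), 3))
```

Unlike plain accuracy, MCC stays informative when the classes are imbalanced, which is relevant for datasets such as the one described above where the epitope classes differ greatly in size.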
Affiliation(s)
| | | | | | | | - Gajendra P S Raghava
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India.
| |
|
49
|
Gautam A, Chaudhary K, Kumar R, Sharma A, Kapoor P, Tyagi A, Raghava GPS. In silico approaches for designing highly effective cell penetrating peptides. J Transl Med 2013; 11:74. [PMID: 23517638 PMCID: PMC3615965 DOI: 10.1186/1479-5876-11-74] [Citation(s) in RCA: 207] [Impact Index Per Article: 18.8] [Received: 10/09/2012] [Accepted: 03/11/2013] [Indexed: 11/23/2022] Open
Abstract
Background: Cell penetrating peptides have gained much recognition as a versatile transport vehicle for the intracellular delivery of a wide range of cargoes (e.g. oligonucleotides, small molecules, proteins) that otherwise lack bioavailability, thus offering great potential as future therapeutics. Keeping in mind the therapeutic importance of these peptides, we have developed in silico methods for the prediction of cell penetrating peptides, which can be used for rapid screening of such peptides prior to their synthesis. Methods: In the present study, support vector machine (SVM)-based models have been developed for predicting and designing highly effective cell penetrating peptides. Various features such as amino acid composition, dipeptide composition, binary profile of patterns, and physicochemical properties have been used as input features. The main dataset used in this study consists of 708 peptides. In addition, we have identified various motifs in cell penetrating peptides and used these motifs for developing a hybrid prediction model. The performance of our method was evaluated on an independent dataset and also compared with that of existing methods. Results: In cell penetrating peptides, certain residues (e.g. Arg, Lys, Pro, Trp, Leu, and Ala) are preferred at specific locations. Thus, it was possible to discriminate cell penetrating peptides from non-cell penetrating peptides based on amino acid composition. All models were evaluated using the five-fold cross-validation technique. We achieved a maximum accuracy of 97.40% using the hybrid model that combines motif information and the binary profile of the peptides. On an independent dataset, we achieved a maximum accuracy of 81.31% with an MCC of 0.63. Conclusion: The present study demonstrates that features such as amino acid composition, binary profile of patterns and motifs can be used to train an SVM classifier that predicts cell penetrating peptides with higher accuracy. The hybrid model described in this study achieved higher accuracy than previous methods and thus may complement the existing methods. Based on the above study, a user-friendly web server, CellPPD, has been developed to help biologists predict and design CPPs with ease. The CellPPD web server is freely accessible at http://crdd.osdd.net/raghava/cellppd/.
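The "binary profile of patterns" feature mentioned in this abstract is commonly built by encoding each residue of a fixed-length peptide segment as a 20-dimensional one-hot vector and concatenating the results. The Python sketch below illustrates that encoding under this assumption; the function name, the residue ordering, and the example pattern are hypothetical, and the exact segment lengths used by CellPPD are not stated in the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed residue ordering
RESIDUE_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def binary_profile(pattern):
    """Encode a peptide pattern as concatenated 20-dimensional one-hot vectors."""
    vector = []
    for residue in pattern.upper():
        if residue not in RESIDUE_INDEX:
            raise ValueError(f"non-standard residue: {residue!r}")
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[RESIDUE_INDEX[residue]] = 1  # mark the position of this residue
        vector.extend(one_hot)
    return vector


# Hypothetical 5-residue pattern, giving a 5 * 20 = 100 dimensional vector.
profile = binary_profile("RKKRR")
print(len(profile))  # 100
```

Because the encoding preserves residue order, it can capture position-specific preferences that a plain composition vector averages away.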
Affiliation(s)
- Ankur Gautam
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
| | | | | | | | | | | | | | | |
|
50
|
de la Caridad Addine Ramírez B, Marrón R, Calero R, Mirabal M, Ramírez JC, Sarmiento ME, Norazmi MN, Acosta A. In silico identification of common epitopes from pathogenic mycobacteria. BMC Immunol 2013; 14 Suppl 1:S6. [PMID: 23458668 PMCID: PMC3582421 DOI: 10.1186/1471-2172-14-s1-s6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022] Open
Abstract
An in silico study was carried out to identify antigens for their possible collective use as vaccine candidates against diseases caused by different classes of pathogenic mycobacteria with significant clinical relevance. The genome sequences of the relevant causative agents were used in order to search for orthologous genes among them. Bioinformatics tools permitted us to identify several conserved sequences with 100% identity with no possibility of cross-reactivity to the normal flora and human proteins. Nine different proteins were characterized using the strain H37Rv as reference and taking into account their functional category, their in vivo expression and subcellular location. T and B cell epitopes were identified in the selected sequences. Theoretical prediction of population coverage was calculated for individual epitopes as well as their combinations. Several identical sequences, belonging to six proteins containing T and B cell epitopes which are not present in selected microorganisms of the normal microbial flora or in human proteins were obtained.
Affiliation(s)
| | - Reynel Marrón
- Finlay Institute. Ave. 27, No. 19805, La Lisa. La Habana, Cuba
| | - Rommel Calero
- Finlay Institute. Ave. 27, No. 19805, La Lisa. La Habana, Cuba
| | - Mayelin Mirabal
- Finlay Institute. Ave. 27, No. 19805, La Lisa. La Habana, Cuba
| | | | | | - Mohd Nor Norazmi
- School of Health Sciences Universiti Sains Malaysia, Malaysia
- Institute for Research in Molecular Medicine, Universiti Sains Malaysia, Malaysia
| | - Armando Acosta
- Finlay Institute. Ave. 27, No. 19805, La Lisa. La Habana, Cuba
| |
|