101
|
Blicher T, Gupta R, Wesolowska A, Jensen LJ, Brunak S. Protein annotation in the era of personal genomics. Curr Opin Struct Biol 2010; 20:335-41. [DOI: 10.1016/j.sbi.2010.03.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2010] [Accepted: 03/24/2010] [Indexed: 10/19/2022]
|
102
|
Briesemeister S, Rahnenführer J, Kohlbacher O. YLoc--an interpretable web server for predicting subcellular localization. Nucleic Acids Res 2010; 38:W497-502. [PMID: 20507917 PMCID: PMC2896088 DOI: 10.1093/nar/gkq477] [Citation(s) in RCA: 214] [Impact Index Per Article: 15.3] [Indexed: 11/18/2022]
Abstract
Predicting subcellular localization has become a valuable alternative to time-consuming experimental methods. Major drawbacks of many of these predictors are their lack of interpretability and the fact that they do not provide an estimate of the confidence of an individual prediction. We present YLoc, an interpretable web server for predicting subcellular localization. YLoc uses natural language to explain why a prediction was made and which biological property of the protein was mainly responsible for it. In addition, YLoc estimates the reliability of its own predictions. YLoc can, thus, assist in understanding protein localization and in location engineering of proteins. The YLoc web server is available online at www.multiloc.org/YLoc.
|
103
|
Ban HJ, Heo JY, Oh KS, Park KJ. Identification of type 2 diabetes-associated combination of SNPs using support vector machine. BMC Genet 2010; 11:26. [PMID: 20416077 PMCID: PMC2875201 DOI: 10.1186/1471-2156-11-26] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 06/23/2009] [Accepted: 04/23/2010] [Indexed: 12/25/2022]
Abstract
BACKGROUND Type 2 diabetes mellitus (T2D), a metabolic disorder characterized by insulin resistance and relative insulin deficiency, is a complex disease of major public health importance. Its incidence is rapidly increasing in developed countries. Complex diseases are caused by interactions between multiple genes and environmental factors. Most association studies aim to identify individual susceptibility markers using a simple disease model. Recent studies are trying to estimate the effects of multiple genes and multiple loci in genome-wide association studies. However, estimating the effects of association is very difficult. We aim to assess the rules for classifying diseased and normal subjects by evaluating potential gene-gene interactions in the same or distinct biological pathways. RESULTS We analyzed the importance of gene-gene interactions in T2D susceptibility by investigating 408 single nucleotide polymorphisms (SNPs) in 87 genes involved in major T2D-related pathways in 462 T2D patients and 456 healthy controls from the Korean cohort studies. We evaluated the support vector machine (SVM) method to differentiate between cases and controls using SNP information in a 10-fold cross-validation test. We achieved a 65.3% prediction rate with a combination of 14 SNPs in 12 genes by using the radial basis function (RBF)-kernel SVM. Similarly, we investigated subpopulation data sets of men and women and identified different SNP combinations with prediction rates of 70.9% and 70.6%, respectively. As high-throughput technology for genome-wide SNPs improves, it is likely that a much higher prediction rate with biologically more interesting combinations of SNPs can be achieved by using this method. CONCLUSIONS The SVM-based feature selection method used in this study found novel associations between combinations of SNPs and T2D in a Korean population.
Affiliation(s)
- Hyo-Jeong Ban
- Division of Bio-Medical Informatics, Center for Genome Science, National Institute of Health, Korea Center for Disease Control and Prevention, 194, Tongil-Lo, Eunpyung-Gu, Seoul 122-701, Republic of Korea
|
104
|
Gill J, Kumar A, Yogavel M, Belrhali H, Jain SK, Rug M, Brown M, Maier AG, Sharma A. Structure, localization and histone binding properties of nuclear-associated nucleosome assembly protein from Plasmodium falciparum. Malar J 2010; 9:90. [PMID: 20377878 PMCID: PMC2873526 DOI: 10.1186/1475-2875-9-90] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 01/15/2010] [Accepted: 04/08/2010] [Indexed: 12/20/2022]
Abstract
BACKGROUND Nucleosome assembly proteins (NAPs) are histone chaperones that are crucial for the shuttling and incorporation of histones into nucleosomes. NAPs participate in the assembly and disassembly of nucleosomes, thus contributing to chromatin structure organization. The human malaria parasite Plasmodium falciparum contains two nucleosome assembly proteins termed PfNapL and PfNapS. METHODS The three-dimensional crystal structure of PfNapS was determined and analysed. Gene knockout and localization studies were also performed on PfNapS using transfection studies. Fluorescence spectroscopy was performed to identify histone-binding sites on PfNapS. Extensive sequence and structural comparisons were done with the crystal structures available for the NAP/SET family of proteins. RESULTS The crystal structure of PfNapS shares structural similarity with previous structures from the NAP/SET family. Failed attempts to knock out the gene for PfNapS in the malaria parasite suggest that it is essential to the parasite. Targeting of a GFP-PfNapS fusion protein indicates that PfNapS localizes to the parasite nucleus. Fluorescence spectroscopy data suggest that PfNapS interacts with core histones (tetramer, octamer, H3, H4, H2A and H2B) at a different site from its interaction with linker histone H1. This analysis illustrates two regions on the PfNapS dimer as the possible sites for histone recognition. CONCLUSIONS This work presents a thorough analysis of the structural, functional and regulatory attributes of PfNapS from P. falciparum with respect to previously studied histone chaperones.
Affiliation(s)
- Jasmita Gill
- Structural and Computational Biology Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), Aruna Asaf Ali Road, New Delhi, 110067, India
|
105
|
Jung J, Ryu T, Hwang Y, Lee E, Lee D. Prediction of extracellular matrix proteins based on distinctive sequence and domain characteristics. J Comput Biol 2010; 17:97-105. [PMID: 20078400 DOI: 10.1089/cmb.2008.0236] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
Extracellular matrix (ECM) proteins are secreted to the exterior of the cell, and function as mediators between resident cells and the external environment. These proteins not only support cellular structure but also participate in diverse processes, including growth, hormonal response, homeostasis, and disease progression. Despite their importance, current knowledge of the number and functions of ECM proteins is limited. Here, we propose a computational method to predict ECM proteins. Specific features, such as ECM domain score and repetitive residues, were utilized for prediction. Based on previously employed and newly generated features, discriminatory characteristics for ECM protein categorization were determined, which significantly improved the performance of Random Forest and support vector machine (SVM) classification. We additionally predicted novel ECM proteins from non-annotated human proteins, validated with gene ontology and earlier literature. Our novel prediction method is available at biosoft.kaist.ac.kr/ecm.
Affiliation(s)
- Juhyun Jung
- Department of Bio and Brain Engineering , KAIST, Daejeon, Korea
|
106
|
Goudenège D, Avner S, Lucchetti-Miganeh C, Barloy-Hubler F. CoBaltDB: Complete bacterial and archaeal orfeomes subcellular localization database and associated resources. BMC Microbiol 2010; 10:88. [PMID: 20331850 PMCID: PMC2850352 DOI: 10.1186/1471-2180-10-88] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 12/07/2009] [Accepted: 03/23/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND The functions of proteins are strongly related to their localization in cell compartments (for example the cytoplasm or membranes) but the experimental determination of the sub-cellular localization of proteomes is laborious and expensive. A fast and low-cost alternative approach is in silico prediction, based on features of the protein primary sequences. However, biologists are confronted with a very large number of computational tools that use different methods that address various localization features with diverse specificities and sensitivities. As a result, exploiting these computer resources to predict protein localization accurately involves querying all tools and comparing every prediction output; this is a painstaking task. Therefore, we developed a comprehensive database, called CoBaltDB, that gathers all prediction outputs concerning complete prokaryotic proteomes. DESCRIPTION The current version of CoBaltDB integrates the results of 43 localization predictors for 784 complete bacterial and archaeal proteomes (2,548,292 proteins in total). CoBaltDB supplies a simple user-friendly interface for retrieving and exploring relevant information about predicted features (such as signal peptide cleavage sites and transmembrane segments). Data are organized into three work-sets ("specialized tools", "meta-tools" and "additional tools"). The database can be queried using the organism name, a locus tag or a list of locus tags and may be browsed using numerous graphical and text displays. CONCLUSIONS With its new functionalities, CoBaltDB is a novel powerful platform that provides easy access to the results of multiple localization tools and support for predicting prokaryotic protein localizations with higher confidence than previously possible. CoBaltDB is available at http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software/cobalten.
Affiliation(s)
- David Goudenège
- CNRS UMR 6026, ICM, Equipe B@SIC, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes, France
|
107
|
Localization and characterization of ST7 in cancer. J Cancer Res Clin Oncol 2010; 137:89-97. [PMID: 20238225 DOI: 10.1007/s00432-010-0863-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/24/2009] [Accepted: 02/25/2010] [Indexed: 01/10/2023]
Abstract
PURPOSE ST7 has been proposed to be a tumor suppressor gene in the chromosome region 7q31.1-q31.2. In order to gain some insight into its role in cancer, the localization and verification of the ST7 expression levels were determined. METHODS Various types of ST7 expression vectors tagged with the sequences of GFP, YFP or V5 were created using a gateway cloning system and full-length ST7 cDNA isolated from a human adult brain cDNA library. Cell cycle synchronization was also performed to analyze the expression of endogenous ST7 and its potentially related genes at each stage of the cell cycle. RESULTS Cytosolic ST7 expression in HCT-116, MCF-7 and PC-3 cancer cell lines was detected via the fluorescence signal of the fusion proteins. ST7 translocation from the cytoplasm to the nucleus has not been observed in any of the conditions assayed. A cell cycle synchronization study demonstrated that both ST7 and SERPINE1 were overexpressed when cells were arrested. Expression of these genes was found to be diminished when the cells re-entered cell division status. In addition, we also found that Survivin, MMP-13 and Cyclin D1 were differentially expressed during the cell cycle. CONCLUSION Our findings suggest that ST7 mediates tumor suppression through the regulation of the genes involved in maintaining the cellular structure of the cell and involved in oncogenic pathways.
|
108
|
Briesemeister S, Rahnenführer J, Kohlbacher O. Going from where to why--interpretable prediction of protein subcellular localization. Bioinformatics 2010; 26:1232-8. [PMID: 20299325 PMCID: PMC2859129 DOI: 10.1093/bioinformatics/btq115] [Citation(s) in RCA: 118] [Impact Index Per Article: 8.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION Protein subcellular localization is pivotal in understanding a protein's function. Computational prediction of subcellular localization has become a viable alternative to experimental approaches. While current machine learning-based methods yield good prediction accuracy, most of them suffer from two key problems: lack of interpretability and difficulty in dealing with multiple locations. RESULTS We present YLoc, a novel method for predicting protein subcellular localization that addresses these issues. Due to its simple architecture, YLoc can identify the relevant features of a protein sequence contributing to its subcellular localization, e.g. localization signals or motifs relevant to protein sorting. We present several example applications where YLoc identifies the sequence features responsible for protein localization, and thus reveals not only to which location a protein is transported, but also why it is transported there. YLoc also provides a confidence estimate for the prediction. Thus, the user can decide what level of error is acceptable for a prediction. Due to a probabilistic approach and the use of several thousands of dual-targeted proteins, YLoc is able to predict multiple locations per protein. YLoc was benchmarked using several independent datasets for protein subcellular localization and performs on par with other state-of-the-art predictors. Disregarding low-confidence predictions, YLoc can achieve prediction accuracies of over 90%. Moreover, we show that YLoc is able to reliably predict multiple locations and outperforms the best predictors in this area. AVAILABILITY www.multiloc.org/YLoc.
|
109
|
De Bodt S, Carvajal D, Hollunder J, Van den Cruyce J, Movahedi S, Inzé D. CORNET: a user-friendly tool for data mining and integration. Plant Physiol 2010; 152:1167-79. [PMID: 20053712 PMCID: PMC2832254 DOI: 10.1104/pp.109.147215] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 09/08/2009] [Accepted: 01/04/2010] [Indexed: 05/17/2023]
Abstract
As an overwhelming amount of functional genomics data have been generated, the retrieval, integration, and interpretation of these data need to be facilitated to enable the advance of (systems) biological research. For example, gathering and processing microarray data that are related to a particular biological process is not straightforward, nor is the compilation of protein-protein interactions from numerous partially overlapping databases identified through diverse approaches. However, these tasks are inevitable to address the following questions. Does a group of differentially expressed genes show similar expression in diverse microarray experiments? Was an identified protein-protein interaction previously detected by other approaches? Are the interacting proteins encoded by genes with similar expression profiles and localization? We developed CORNET (for CORrelation NETworks) as an access point to transcriptome, protein interactome, and localization data and functional information on Arabidopsis (Arabidopsis thaliana). It consists of two flexible and versatile tools, namely the coexpression tool and the protein-protein interaction tool. The ability to browse and search microarray experiments using ontology terms and the incorporation of personal microarray data are distinctive features of the microarray repository. The coexpression tool enables either the alternate or simultaneous use of diverse expression compendia, whereas the protein-protein interaction tool searches experimentally and computationally identified protein-protein interactions. Different search options are implemented to enable the construction of coexpression and/or protein-protein interaction networks centered around multiple input genes or proteins. Moreover, networks and associated evidence are visualized in Cytoscape. Localization is visualized in pie charts, thereby allowing multiple localizations per protein. CORNET is available at http://bioinformatics.psb.ugent.be/cornet.
Affiliation(s)
- Stefanie De Bodt
- Department of Plant Systems Biology, Flanders Institute for Biotechnology, Ghent University, 9052 Ghent, Belgium.
|
110
|
Engineering Ca2+/calmodulin-mediated modulation of protein translocation by overlapping binding and signaling peptide sequences. Cell Calcium 2010; 47:369-77. [PMID: 20167369 DOI: 10.1016/j.ceca.2010.01.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/20/2009] [Revised: 01/09/2010] [Accepted: 01/22/2010] [Indexed: 11/23/2022]
Abstract
Protein translocation is used by cells to regulate protein activity in time and space. Synthetic systems have studied the effect of second messengers and exogenous chemicals on translocation, and have used translocation-based sensors to monitor unrelated pathways such as caspase activity. We have created a synthetic Ca2+-inducible protein using calmodulin binding peptides that selectively reveal nuclear localization and export signals in low Ca2+ (0 µM) and high Ca2+ (10 µM) environments, respectively. Experiments in live cells showed that our construct translocates between the nucleolus and plasma membrane with time constants of approximately 2 h. Further, a single amino acid mutation (Cys20Ala) in our construct prevented translocation to the plasma membrane and instead targeted it to the mitochondria, as predicted by bioinformatic analysis. Lastly, we studied the effect of cell line, Ca2+ concentration, chemical inhibitors, and cell morphology on translocation and found these conditions affected the rate, extent and direction of translocation. Our work demonstrates the feasibility of engineering Ca2+/calmodulin-mediated modulation of protein translocation and suggests that more natural analogs may exist.
|
111
|
Zhou J, Qiao X, Xiao L, Sun W, Wang L, Li H, Wu Y, Ding X, Hu X, Zhou C, Zhang J. Identification and characterization of the novel protein CCDC106 that interacts with p53 and promotes its degradation. FEBS Lett 2010; 584:1085-90. [DOI: 10.1016/j.febslet.2010.02.031] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 01/03/2010] [Revised: 02/08/2010] [Accepted: 02/10/2010] [Indexed: 10/19/2022]
|
112
|
Bailey LM, Wallace JC, Polyak SW. Holocarboxylase synthetase: correlation of protein localisation with biological function. Arch Biochem Biophys 2010; 496:45-52. [PMID: 20153287 DOI: 10.1016/j.abb.2010.01.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 12/08/2009] [Revised: 01/26/2010] [Accepted: 01/27/2010] [Indexed: 10/19/2022]
Abstract
Holocarboxylase synthetase (HCS) governs the cellular fate of the essential micronutrient biotin (Vitamin H or B7). HCS is responsible for attaching biotin onto the biotin-dependent enzymes that reside in the cytoplasm and mitochondria. Evidence for an alternative role, viz the regulation of gene expression, has also been reported. Recent immunohistochemical studies reported HCS is primarily nuclear, inconsistent with the location of HCS activity. Improved understanding of biotin biology demands greater knowledge about HCS. Here, we investigated the localisation of HCS and its isoforms. Three variants were observed that differ at the N-terminus. All HCS isoforms were predominantly non-nuclear, consistent with the distribution of biotin protein ligase activity. Unlike the longer constructs, the Met(58) isoform was also detected in the nucleus--a novel observation suggesting shuttling activity between nucleus and cytoplasm. We resolved that the previous controversies in the literature are due to specificity and detection limitations that arise when using partially purified antibodies.
Affiliation(s)
- L M Bailey
- School of Molecular and Biomedical Science, University of Adelaide, Adelaide, South Australia 5005, Australia
|
113
|
Mishra S. Function prediction of Rv0079, a hypothetical Mycobacterium tuberculosis DosR regulon protein. J Biomol Struct Dyn 2010; 27:283-92. [PMID: 19795912 DOI: 10.1080/07391102.2009.10507316] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 10/28/2022]
Abstract
Mycobacterium tuberculosis (Mtb), the pathogen causing tuberculosis, continues to elude a cure. Latent Mtb forms are present in the human population for extended periods and have the potential to be re-activated into an active form. The prophylactic vaccine, the live-attenuated Mycobacterium bovis Bacillus-Calmette-Guerin (BCG) vaccine, is not effective in preventing latent infection. The failure of BCG in prevention/protection against latent forms of Mtb calls for efforts to curb latent Mtb infection. The inclusion of latency/dormancy antigens in the classical antigen preparation has been suggested as a strategy. DosR (Dormancy Survival Regulator, Rv3133c) regulon genes are expressed under the conditions of latency/dormancy. Previous bioinformatics analyses have pointed towards their role as probable vaccine candidates. Since nearly 60% of DosR regulon genes are unannotated, efforts towards elucidating their functional role will prove valuable. The study presented here provides an in-depth in silico 3D-structure prediction and functional analysis of the first member of the DosR regulon group, the hypothetical protein Rv0079. A combination of approaches was implemented, including homology modeling and threading using SWISS-MODEL workspace, Phyre and BioInfo bank Metaserver; protein localization predictions using PSORTb, LOCtree, TMHMM and TMpred; function prediction using ProFunc; epitope prediction using NetCTL; and others. Evidence gathered from a combination of bioinformatics tools supports the hypothesis that Mtb Rv0079 protein is a likely cytoplasmic translation factor. Experimental validation will help provide more insight into its actual function.
Affiliation(s)
- Seema Mishra
- National Institute of Biologicals, A-32 Sector-62, Noida, U.P. India.
|
114
|
Protein location prediction using atomic composition and global features of the amino acid sequence. Biochem Biophys Res Commun 2010; 391:1670-4. [DOI: 10.1016/j.bbrc.2009.12.118] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 12/14/2009] [Accepted: 12/21/2009] [Indexed: 11/17/2022]
|
115
|
Abstract
One of the major challenges in the post-genomic era with hundreds of genomes sequenced is the annotation of protein structure and function. Computational predictions of subcellular localization are an important step toward this end. The development of computational tools that predict targeting and localization has, therefore, been a very active area of research, in particular since the first release of the groundbreaking program PSORT in 1991. The most reliable means of annotating protein structure and function remains homology-based inference, i.e. the transfer of experimental annotations from one protein to its homologs. However, annotations about localization demonstrate how much can be gained from advanced machine learning: more proteins can be annotated more reliably. Contemporary computational tools for the annotation of protein targeting include automatic methods that mine the textual information from the biological literature and molecular biology databases. Some machine learning-based methods that accurately predict features of sorting signals and that use sequence-derived features to predict localization have reached remarkable levels of performance. Sustained prediction accuracy has increased by more than 30 percentage points over the last decade. Here, we review some of the most recent methods for the prediction of subcellular localization and protein targeting that contributed toward this breakthrough.
Affiliation(s)
- Shruti Rastogi
- Department of Biochemistry and Molecular Biophysics, Columbia University and Columbia University Center for Computational Biology and Bioinformatics (C2B2), New York, NY, USA
|
116
|
Siddiki AMAMZ, Wastling JM. Charting the proteome of Cryptosporidium parvum sporozoites using sequence similarity-based BLAST searching. J Vet Sci 2009; 10:203-10. [PMID: 19687620 PMCID: PMC2801136 DOI: 10.4142/jvs.2009.10.3.203] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
Abstract
Cryptosporidium (C.) spp. are important zoonotic parasites causing widespread diarrhoeal disease in man and animals. The recent release of the complete genome sequences for C. parvum and C. hominis has facilitated the comprehensive global proteome analysis of these opportunistic pathogens. The well-known approach for mass spectrometry (MS) based data analysis using the BLAST tool (MS BLAST) is a database search protocol for identifying unknown proteins by sequence similarity to homologous proteins using peptide sequences produced by mass spectrometry. We have used several complementary approaches to explore the global sporozoite proteome of C. parvum with available proteomic tools. To optimize the output of the MS data, a sequence similarity-based MS BLAST strategy was employed for bioinformatic analysis. Most significantly, almost all the constituents of glycolysis and several mitochondrion-related proteins were identified. In addition, many hypothetical Cryptosporidium proteins were validated by the identification of their constituent peptides. The MS BLAST approach was found to be useful during the study and could provide valuable information towards a complete understanding of the unique biology of Cryptosporidium.
Affiliation(s)
- A M A M Z Siddiki
- Department of Preclinical Veterinary Sciences, Faculty of Veterinary Science, University of Liverpool, Crown Street, Liverpool, L69 7ZJ, UK.
|
117
|
Xiao K, Jehle F, Peters C, Reinheckel T, Schirmer RH, Dandekar T. CA/C1 peptidases of the malaria parasites Plasmodium falciparum and P. berghei and their mammalian hosts--a bioinformatical analysis. Biol Chem 2009; 390:1185-97. [PMID: 19663681 DOI: 10.1515/bc.2009.124] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
In genome-wide screens we studied CA/C1 peptidases of malaria-causing plasmodia and their hosts (man and mouse). For Plasmodium falciparum and P. berghei, several new CA/C1 peptidase genes encoding proteases of the L- and B-family with specific promoter modules were identified. In addition, two new human CA/C1 peptidase loci and one new mouse gene locus were found; otherwise, the sets of CA/C1 peptidase genes in man and mouse seem to be complete now. In each species studied there is a multitude of CA/C1 peptidases with lysosomal localization signals and partial functional overlap according to similar but subfamily-specific structures. Individual target structures in plasmodia include residues specifically different in CA/C1 peptidase subsite 2. This is of medical interest considering CA/C1 peptidase inhibition for chemotherapy in malaria, malignancies and other diseases. Promoter structures and mRNA regulation differ widely among CA/C1 peptidase subfamilies and between mammals and plasmodia. We characterized promoter modules conserved in mouse and man for the CA/C1 peptidase families B and L (with the L-like subfamily, F-like subfamily and mouse-specific J-like subfamily). RNA motif searches revealed conserved regulatory elements such as GAIT elements; plasmodial CA/C1 peptidase mRNA elements include ARE elements and mammalian mRNAs contain 15-lox DICE elements.
Affiliation(s)
- Ke Xiao
- Lehrstuhl für Bioinformatik, Universität Würzburg, Biozentrum, D-97074 Würzburg, Germany
|
118
|
Lin HN, Chen CT, Sung TY, Ho SY, Hsu WL. Protein subcellular localization prediction of eukaryotes using a knowledge-based approach. BMC Bioinformatics 2009; 10 Suppl 15:S8. [PMID: 19958518 PMCID: PMC2788359 DOI: 10.1186/1471-2105-10-s15-s8] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
BACKGROUND The study of protein subcellular localization (PSL) is important for elucidating protein functions involved in various cellular processes. However, determining the localization sites of a protein through wet-lab experiments can be time-consuming and labor-intensive. Thus, computational approaches become highly desirable. Most of the PSL prediction systems are established for single-localized proteins. However, a significant number of eukaryotic proteins are known to be localized into multiple subcellular organelles. Many studies have shown that proteins may simultaneously locate or move between different cellular compartments and be involved in different biological processes with different roles. RESULTS In this study, we propose a knowledge-based method, called KnowPredsite, to predict the localization site(s) of both single-localized and multi-localized proteins. Based on the local similarity, we can identify the "related sequences" for prediction. We construct a knowledge base to record the possible sequence variations for protein sequences. When predicting the localization annotation of a query protein, we search against the knowledge base and use a scoring mechanism to determine the predicted sites. We downloaded the dataset from ngLOC, which consisted of ten distinct subcellular organelles from 1923 species, and performed ten-fold cross-validation experiments to evaluate KnowPredsite's performance. The experiment results show that KnowPredsite achieves higher prediction accuracy than ngLOC and the Blast-hit method. For single-localized proteins, the overall accuracy of KnowPredsite is 91.7%. For multi-localized proteins, the overall accuracy of KnowPredsite is 72.1%, which is significantly higher than that of ngLOC by 12.4%. Notably, half of the proteins in the dataset that cannot find any Blast hit sequence above a specified threshold can still be correctly predicted by KnowPredsite. CONCLUSION KnowPredsite demonstrates the power of identifying related sequences in the knowledge base. The experiment results show that even though the sequence similarity is low, the local similarity is effective for prediction. Experiment results show that KnowPredsite is a highly accurate prediction method for both single- and multi-localized proteins. It is worth mentioning that the prediction process of KnowPredsite is transparent and biologically interpretable, and that it shows the set of template sequences used to generate the prediction result. The KnowPredsite prediction server is available at http://bio-cluster.iis.sinica.edu.tw/kbloc/.
Affiliation(s)
- Hsin-Nan Lin
- Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan, Republic of China.
|
119
|
Huang WL, Tung CW, Huang HL, Ho SY. Predicting protein subnuclear localization using GO-amino-acid composition features. Biosystems 2009; 98:73-9. [DOI: 10.1016/j.biosystems.2009.06.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 02/18/2009] [Revised: 06/10/2009] [Accepted: 06/26/2009] [Indexed: 10/20/2022]
|
120
|
Sakkhachornphop S, Jiranusornkul S, Kodchakorn K, Nangola S, Sirisanthana T, Tayapiwatana C. Designed zinc finger protein interacting with the HIV-1 integrase recognition sequence at 2-LTR-circle junctions. Protein Sci 2009; 18:2219-30. [PMID: 19701937 PMCID: PMC2788277 DOI: 10.1002/pro.233] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 06/27/2009] [Accepted: 08/13/2009] [Indexed: 12/16/2022]
Abstract
Integration of HIV-1 cDNA into the host genome is a crucial step for viral propagation. Two nucleotides, cytosine and adenine (CA), conserved at the 3' end of the viral cDNA genome, are cleaved by the viral integrase (IN) enzyme. As IN plays a crucial role in the early stages of the HIV-1 life cycle, substrate blockage of IN is an attractive strategy for therapeutic interference. In this study, we used the 2-LTR-circle junctions of HIV-1 DNA as a model to design a zinc finger protein (ZFP) targeting the terminal portion of the HIV-1 LTR. A six-contiguous ZFP, namely 2LTRZFP, was designed using zinc finger tools. The designed motif was expressed and purified from E. coli to determine its binding properties. Surface plasmon resonance (SPR) was used to determine the binding affinity of 2LTRZFP to its target DNA. The dissociation constant (Kd) was 12.0 nM. Competitive SPR confirmed that 2LTRZFP specifically interacted with its target DNA. The qualitative binding activity was subsequently determined by EMSA and demonstrated the aforementioned correlation. In addition, molecular modeling and binding energy analyses were carried out to provide structural insight into the binding of 2LTRZFP to the specific and nonspecific DNA targets. It is suggested that hydrogen-bonding interactions play a key role in the DNA recognition mechanisms of the designed ZFP. Our study suggests an alternative HIV therapeutic strategy using ZFP interference with the HIV integration process.
Affiliation(s)
- Supachai Sakkhachornphop
- Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
- Research Institute for Health Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
- Supat Jiranusornkul
- Department of Pharmaceutical Sciences, Faculty of Pharmacy, Chiang Mai University, Chiang Mai 50200, Thailand
- Kanchanok Kodchakorn
- Thailand Excellence Center for Tissue Engineering, Department of Biochemistry, Faculty of Medicine, Chiang Mai University, Chiang Mai 50200, Thailand
- Sawitree Nangola
- Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
- Thira Sirisanthana
- Research Institute for Health Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
- Chatchai Tayapiwatana
- Division of Clinical Immunology, Department of Medical Technology, Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
- Biomedical Technology Research Unit, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency at the Faculty of Associated Medical Sciences, Chiang Mai University, Chiang Mai 50200, Thailand
|
121
|
Briesemeister S, Blum T, Brady S, Lam Y, Kohlbacher O, Shatkay H. SherLoc2: A High-Accuracy Hybrid Method for Predicting Subcellular Localization of Proteins. J Proteome Res 2009; 8:5363-6. [PMID: 19764776 DOI: 10.1021/pr900665y] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.1] [Indexed: 11/30/2022]
Affiliation(s)
- Sebastian Briesemeister
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
- Torsten Blum
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
- Scott Brady
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
- Yin Lam
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
- Oliver Kohlbacher
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
- Hagit Shatkay
- Division for Simulation of Biological Systems, Center for Bioinformatics Tübingen, Eberhard-Karls-Universität Tübingen, Germany, and School of Computing, Queen’s University, Kingston, Ontario, Canada
|
122
|
Uropathogenic Escherichia coli suppresses the host inflammatory response via pathogenicity island genes sisA and sisB. Infect Immun 2009; 77:5322-33. [PMID: 19797063 DOI: 10.1128/iai.00779-09] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 01/21/2023]
Abstract
Extraintestinal pathogenic Escherichia coli can successfully colonize the urinary tract of the immunocompetent host. In part, this is accomplished by dampening the host immune response. Indeed, the sisA and sisB genes (shiA-like inflammation suppressor genes A and B) of uropathogenic E. coli strain CFT073, homologs of the Shigella flexneri SHI-2 pathogenicity island gene shiA, suppress the host inflammatory response. A double deletion mutant (ΔsisA ΔsisB) resulted in a hyperinflammatory phenotype in an experimental model of ascending urinary tract infection. The ΔsisA ΔsisB mutant not only caused significantly more inflammatory foci in the kidneys of CBA/J mice (P = 0.0399), but these lesions were also histologically more severe (P = 0.0477) than lesions observed in mice infected with wild-type CFT073. This hyperinflammatory phenotype could be suppressed to wild-type levels by in vivo complementation of the ΔsisA ΔsisB mutant with either the sisA or sisB gene in trans. The ΔsisA ΔsisB mutant was outcompeted by wild-type CFT073 during cochallenge infection in the bladder (P = 0.0295) at 48 h postinoculation (hpi). However, during cochallenge infections, we reasoned that wild-type CFT073 could partially complement the ΔsisA ΔsisB mutant. Consistent with this, the most significant colonization defect of the ΔsisA ΔsisB mutant in vivo was observed during independent challenge relative to wild-type CFT073, with attenuation of the mutant observed in the bladder (P < 0.0001) and kidneys (P = 0.0003) at 6 hpi. By 24 and 48 hpi, the ΔsisA ΔsisB mutant was no longer significantly attenuated in the bladder or kidneys, suggesting that the sisA and sisB genes may be important for suppressing the host immune response during the initial stages of infection.
|
123
|
Desler C, Suravajhala P, Sanderhoff M, Rasmussen M, Rasmussen LJ. In Silico screening for functional candidates amongst hypothetical proteins. BMC Bioinformatics 2009; 10:289. [PMID: 19754976 PMCID: PMC2758874 DOI: 10.1186/1471-2105-10-289] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 12/16/2008] [Accepted: 09/16/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND A hypothetical protein is a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of the proteomes of humans as well as of other eukaryotes. Given the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool capable of pinpointing the minority of hypothetical proteins with a high probability of being expressed. RESULTS Here, we present an in silico selection strategy in which eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and the presence of characterized protein domains. To validate the selection strategy, we applied it to a database of human hypothetical proteins dating to 2006 and compared the proteins predicted to be expressed by our selection strategy with their status in 2008. For the comparison we focused on mitochondrial proteins, since a considerable amount of research focused on this field between 2006 and 2008. Therefore, many proteins defined as hypothetical in 2006 have later been characterized as mitochondrial. CONCLUSION Among all human proteins that were hypothetical in 2006, 21% have later been experimentally characterized, and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the hypothetical proteins from the 2006 dataset that our strategy selected as having a predicted mitochondrial role, 53-62% have later been experimentally characterized, and 85% of these had been assigned a role in mitochondria by 2008. Therefore, our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.
Affiliation(s)
- Claus Desler
- Department of Science, Systems and Models, Roskilde University, DK-4000 Roskilde, Denmark.
|
124
|
Kint G, Sonck KA, Schoofs G, De Coster D, Vanderleyden J, De Keersmaecker SC. 2D proteome analysis initiates new insights on the Salmonella Typhimurium LuxS protein. BMC Microbiol 2009; 9:198. [PMID: 19754952 PMCID: PMC2761396 DOI: 10.1186/1471-2180-9-198] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 05/26/2009] [Accepted: 09/15/2009] [Indexed: 12/31/2022]
Abstract
Background Quorum sensing is a term describing a bacterial communication system mediated by the production and recognition of small signaling molecules. The LuxS enzyme, catalyzing the synthesis of AI-2, is conserved in a wide diversity of bacteria. AI-2 has therefore been suggested as an interspecies quorum sensing signal. To investigate the role of endogenous AI-2 in protein expression of the Gram-negative pathogen Salmonella enterica serovar Typhimurium (S. Typhimurium), we performed a 2D-DIGE proteomics experiment comparing total protein extract of wildtype S. Typhimurium with that of a luxS mutant, unable to produce AI-2. Results Differential proteome analysis of wildtype S. Typhimurium versus a luxS mutant revealed relatively few changes beyond the known effect on phase 2 flagellin. However, two highly differentially expressed protein spots with similar molecular weight but differing isoelectric point were identified as LuxS, whereas the S. Typhimurium genome contains only one luxS gene. This observation was further explored and we show that the S. Typhimurium LuxS protein can undergo posttranslational modification at a catalytic cysteine residue. Additionally, by constructing LuxS-βla and LuxS-PhoA fusion proteins, we demonstrate that S. Typhimurium LuxS can substitute the cognate signal peptide sequences of β-lactamase and alkaline phosphatase for translocation across the cytoplasmic membrane in S. Typhimurium. This was further confirmed by fractionation of S. Typhimurium protein extracts, followed by Western blot analysis. Conclusion 2D-DIGE analysis of a luxS mutant vs. wildtype Salmonella Typhimurium did not reveal new insights into the role of AI-2/LuxS in Salmonella, as only a small number of proteins were differentially expressed. However, subsequent in-depth analysis of the LuxS protein itself revealed two interesting features: posttranslational modification and potential translocation across the cytoplasmic membrane. As the S. Typhimurium LuxS protein does not contain obvious signal motifs, it is speculated that LuxS is a new member of the so-called moonlighting proteins. These observations might have consequences for future studies on AI-2 quorum signaling in S. Typhimurium.
Affiliation(s)
- Gwendoline Kint
- Centre of Microbial and Plant Genetics, K.U. Leuven, Kasteelpark Arenberg 20, B-3001 Leuven, Belgium.
|
125
|
Blum T, Briesemeister S, Kohlbacher O. MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction. BMC Bioinformatics 2009; 10:274. [PMID: 19723330 PMCID: PMC2745392 DOI: 10.1186/1471-2105-10-274] [Citation(s) in RCA: 206] [Impact Index Per Article: 13.7] [Received: 04/19/2009] [Accepted: 09/01/2009] [Indexed: 11/10/2022]
Abstract
Background Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. Results We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. Conclusion MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a user-friendly and free web service.
Affiliation(s)
- Torsten Blum
- Division for Simulation of Biological Systems, ZBIT/WSI, Eberhard-Karls-Universität Tübingen, Germany.
|
126
|
Qiu P, Cai XY, Ding W, Zhang Q, Norris ED, Greene JR. HCV genotyping using statistical classification approach. J Biomed Sci 2009; 16:62. [PMID: 19586537 PMCID: PMC2720937 DOI: 10.1186/1423-0127-16-62] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/01/2009] [Accepted: 07/08/2009] [Indexed: 01/24/2023]
Abstract
The genotype of Hepatitis C Virus (HCV) strains is an important determinant of the severity and aggressiveness of liver infection as well as patient response to antiviral therapy. Fast and accurate determination of viral genotype could provide direction in the clinical management of patients with chronic HCV infections. Using publicly available HCV nucleotide sequences, we built a global Position Weight Matrix (PWM) for the HCV genome. Based on the PWM, a set of genotype-specific nucleotide sequence "signatures" was selected from the 5' NCR, CORE, E1, and NS5B regions of the HCV genome. We evaluated the predictive power of these signatures for predicting the most common HCV genotypes and subtypes. We observed that nucleotide sequence signatures selected from the NS5B and E1 regions generally demonstrated stronger discriminant power in differentiating major HCV genotypes and subtypes than those from the 5' NCR and CORE regions. Two discriminant methods were used to build predictive models. Through 10-fold cross-validation, over 99% prediction accuracy was achieved using both support vector machine (SVM) and random forest based classification methods in a dataset of 1134 sequences for NS5B and 947 sequences for E1. Prediction accuracy for each genotype is also reported.
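To make the position weight matrix idea in this abstract concrete, here is a minimal Python sketch. It is not the authors' pipeline: the abstract describes one global PWM followed by signature selection and SVM or random forest classifiers, whereas this illustration simply builds one log-odds PWM per group from toy aligned sequences and assigns a query to the best-scoring group. The function names, the pseudocount, the uniform background, and the toy sequences are all assumptions made for the example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def build_pwm(aligned_seqs, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM: one {base: score} dict per alignment column."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    total = len(aligned_seqs) + pseudocount * len(ALPHABET)
    pwm = []
    for col in range(length):
        counts = Counter(s[col] for s in aligned_seqs)
        pwm.append({
            base: math.log2(((counts.get(base, 0) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm

def score(pwm, seq):
    """Sum the per-column log-odds scores of seq under the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))

def classify(pwms_by_group, seq):
    """Return the group whose PWM gives seq the highest score."""
    return max(pwms_by_group, key=lambda g: score(pwms_by_group[g], seq))

# Toy, made-up aligned fragments (hypothetical, not real HCV data).
toy = {
    "genotype_1": ["ACGT", "ACGA", "ACGT"],
    "genotype_2": ["TTGC", "TTGC", "TAGC"],
}
pwms = {group: build_pwm(seqs) for group, seqs in toy.items()}
print(classify(pwms, "ACGT"))
```

Columns where the groups differ most in their scores are natural candidates for the kind of genotype-specific "signatures" the abstract mentions, although the actual selection criterion is not given there.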
Affiliation(s)
- Ping Qiu
- Molecular Design and Informatics, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, NJ 07033, USA.
|
127
|
Abstract
This chapter outlines key considerations for constructing and implementing an EST database. Instead of showing the technological details step by step, emphasis is put on the design of an EST database suited to the specific needs of EST projects and how to choose the most suitable tools. Using TBestDB as an example, we illustrate the essential factors to be considered for database construction and the steps for data population and annotation. This process employs technologies such as PostgreSQL, Perl, and PHP to build the database and interface, and tools such as AutoFACT for data processing and annotation. We discuss these in comparison to other available technologies and tools, and explain the reasons for our choices.
|
128
|
Dumas E, Desvaux M, Chambon C, Hébraud M. Insight into the core and variant exoproteomes of Listeria monocytogenes species by comparative subproteomic analysis. Proteomics 2009; 9:3136-55. [DOI: 10.1002/pmic.200800765] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/07/2022]
|
129
|
Contributions to neutropenia from PFAAP5 (N4BP2L2), a novel protein mediating transcriptional repressor cooperation between Gfi1 and neutrophil elastase. Mol Cell Biol 2009; 29:4394-405. [PMID: 19506020 DOI: 10.1128/mcb.00596-09] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022]
Abstract
"Neutropenia" refers to deficient numbers of neutrophils, the most abundant type of white blood cell. Two main forms of inherited neutropenia are cyclic neutropenia, in which neutrophil counts oscillate with a 21-day frequency, and severe congenital neutropenia, in which static neutropenia may evolve at times into leukemia. Mutations of ELA2, encoding the protease neutrophil elastase, can cause both disorders. Among other genes, severe congenital neutropenia can also result from mutations affecting the transcriptional repressor Gfi1, one of whose genetic targets is ELA2, suggesting that the two act through similar mechanisms. In order to identify components of a common pathway regulating neutrophil production, we conducted yeast two-hybrid screens with Gfi1 and neutrophil elastase and detected a novel protein, PFAAP5 (also known as N4BP2L2), interacting with both. Expression of PFAAP5 allows neutrophil elastase to potentiate the repression of Gfi1 target genes, as determined by reporter assays, RNA interference, chromatin immunoprecipitation, and impairment of neutrophil differentiation in HSCs with PFAAP5 depletion, thus delineating a mechanism through which neutrophil elastase could regulate its own synthesis. Our findings are consistent with theoretical models of cyclic neutropenia proposing that its periodicity can be explained through disturbance of a feedback circuit in which mature neutrophils inhibit cell proliferation, thereby homeostatically regulating progenitor populations.
|
130
|
Su J, Yang C, Zhu Z, Wang Y, Jang S, Liao L. Enhanced grass carp reovirus resistance of Mx-transgenic rare minnow (Gobiocypris rarus). Fish & Shellfish Immunology 2009; 26:828-835. [PMID: 19138747 DOI: 10.1016/j.fsi.2008.12.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 08/18/2008] [Revised: 12/08/2008] [Accepted: 12/22/2008] [Indexed: 05/27/2023]
Abstract
Among the interferon-induced antiviral mechanisms, the Mx pathway is one of the most powerful. Mx proteins have direct antiviral activity and inhibit a wide range of viruses by blocking an early stage of the viral genome replication cycle. However, the in vivo antiviral activity of piscine Mx remains unclear. In the present study, an Mx-like gene was cloned, characterized and gene-transferred in rare minnow Gobiocypris rarus, and its antiviral activity was confirmed in vivo. The full-length rare minnow Mx-like cDNA is 2241 bp long and encodes a polypeptide of 625 amino acids with an estimated molecular mass of 70.928 kDa and a predicted isoelectric point of 7.33. Analysis of the deduced amino acid sequence indicated that the mature peptide contains an amino-terminal tripartite GTP-binding motif, a dynamin family signature sequence, a GTPase effector domain and two carboxy-terminal leucine zipper motifs, and is most similar to the crucian carp (Carassius auratus) Mx3 sequence, with an identity of 89%. Both P0 and F1 generations of Mx-transgenic rare minnow showed a significantly high survival rate following GCRV infection (P < 0.01). The mRNA expression of the Mx gene was consistent with the survival rate in the F1 generation. The virus yield, assessed by electron microscopy, was also consistent with survival time. Rare minnow has its own Mx gene(s), but introducing additional Mx genes improves its resistance to GCRV. Mx-transgenic rare minnow might contribute to controlling GCRV disease.
Affiliation(s)
- Jianguo Su
- Northwest A & F University, Shaanxi Key Laboratory of Molecular Biology for Agriculture, Yangling 712100, China
|
131
|
Linder P, Owttrim GW. Plant RNA helicases: linking aberrant and silencing RNA. Trends in Plant Science 2009; 14:344-52. [PMID: 19446493 DOI: 10.1016/j.tplants.2009.03.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 01/21/2009] [Revised: 03/11/2009] [Accepted: 03/17/2009] [Indexed: 05/06/2023]
Abstract
RNA helicases are ATPases that are capable of rearranging RNA and ribonucleoprotein (RNP) structure, and they can potentially function in any aspect of RNA metabolism. The RNA helicase gene family of plant genomes is larger and more diverse than genome families observed in other systems and provides an ideal model for investigation of the physiological importance of RNA secondary structure rearrangement in plant development. Numerous plant RNA helicases are associated with a variety of physiological functions, but this review will focus on the thirteen RNA helicases associated with the metabolism of aberrant and silencing RNAs. The results emphasize the crucial role RNA helicase activity has in the regulation of mRNA quality control and gene expression in plant development.
Affiliation(s)
- Patrick Linder
- Department of Microbiology and Molecular Medicine, CMU, 1 Rue Michel Servet, CH-1211 Geneve 4, Switzerland
|
132
|
Kernytsky A, Rost B. Using genetic algorithms to select most predictive protein features. Proteins 2009; 75:75-88. [PMID: 18798568 DOI: 10.1002/prot.22211] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
Abstract
Many important characteristics of proteins such as biochemical activity and subcellular localization present a challenge to machine-learning methods: it is often difficult to encode the appropriate input features at the residue level for the purpose of making a prediction for the entire protein. The problem is usually that the biophysics of the connection between a machine-learning method's input (sequence feature) and its output (observed phenomenon to be predicted) remains unknown; in other words, we may only know that a certain protein is an enzyme (output) without knowing which region may contain the active site residues (input). The goal then becomes to dissect a protein into a vast set of sequence-derived features and to correlate those features with the desired output. We introduce a framework that begins with a set of global sequence features and then vastly expands the feature space by generically encoding the coexistence of residue-based features. It is this combination of individual features, that is, the step from the fractions of serine and buried (input space 20 + 2) to the fraction of buried serine (input space 20 * 2), that implicitly shifts the search space from global feature inputs to features that can capture very local evidence such as the individual residues of a catalytic triad. The vast feature space created is explored by a genetic algorithm (GA) paired with neural networks and support vector machines. We find that the GA is critical for selecting combinations of features that are neither too general, resulting in poor performance, nor too specific, leading to overtraining. The final framework manages to effectively sample a feature space that is far too large for exhaustive enumeration. We demonstrate the power of the concept by applying it to prediction of protein enzymatic activity.
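The step this abstract describes, from separate fractions (20 amino acids + 2 burial states) to combined fractions (20 * 2), can be illustrated with a short Python sketch. Everything here is a hypothetical illustration rather than the authors' code: the two-letter burial alphabet ("b" for buried, "e" for exposed), the feature names, and the toy sequence are assumptions, and the genetic algorithm that would then search over such features is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "be"  # hypothetical per-residue labels: b = buried, e = exposed

def global_features(seq, states):
    """Independent fractions: 20 residue types + 2 burial states = 22 inputs."""
    n = len(seq)
    feats = {f"frac_{aa}": seq.count(aa) / n for aa in AMINO_ACIDS}
    feats.update({f"frac_{s}": states.count(s) / n for s in STATES})
    return feats

def combined_features(seq, states):
    """Co-occurrence fractions: 20 residue types * 2 burial states = 40 inputs."""
    n = len(seq)
    pairs = Counter(zip(seq, states))
    return {
        f"frac_{aa}_{s}": pairs.get((aa, s), 0) / n
        for aa in AMINO_ACIDS
        for s in STATES
    }

# Toy example: four residues with made-up burial labels.
seq, states = "SSAS", "bebe"
g = global_features(seq, states)
c = combined_features(seq, states)
print(len(g), len(c))        # sizes of the two input spaces
print(g["frac_S"], g["frac_b"], c["frac_S_b"])
```

In the toy example the global features say only that the protein is 75% serine and 50% buried, while the combined feature records that 25% of residues are buried serines, which is the more local kind of evidence the abstract refers to.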
Affiliation(s)
- Andrew Kernytsky
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York 10032, New York, USA.
|
133
|
Keene SD, Greco TM, Parastatidis I, Lee SH, Hughes EG, Balice-Gordon RJ, Speicher DW, Ischiropoulos H. Mass spectrometric and computational analysis of cytokine-induced alterations in the astrocyte secretome. Proteomics 2009; 9:768-82. [PMID: 19132682 DOI: 10.1002/pmic.200800385] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Abstract
The roles of astrocytes in the CNS have been expanding beyond the long held view of providing passive, supportive functions. Recent evidence has identified roles in neuronal development, extracellular matrix maintenance, and response to inflammatory challenges. Therefore, insights into astrocyte secretion are critically important for understanding physiological responses and pathological mechanisms in CNS diseases. Primary astrocyte cultures were treated with inflammatory cytokines for either a short (1 day) or sustained (7 days) exposure. Increased interleukin-6 secretion, nitric oxide production, cyclooxygenase-2 activation, and nerve growth factor (NGF) secretion confirmed the astrocytic response to cytokine treatment. MS/MS analysis, computational prediction algorithms, and functional classification were used to compare the astrocyte protein secretome from control and cytokine-exposed cultures. In total, 169 secreted proteins were identified, including both classically and nonconventionally secreted proteins that comprised components of the extracellular matrix and enzymes involved in processing of glycoproteins and glycosaminoglycans. Twelve proteins were detected exclusively in the secretome from cytokine-treated astrocytes, including matrix metalloproteinase-3 (MMP-3) and members of the chemokine ligand family. This compilation of secreted proteins provides a framework for identifying factors that influence the biochemical environment of the nervous system, regulate development, construct extracellular matrices, and coordinate the nervous system response to inflammation.
Affiliation(s)
- Sarah Dunn Keene
- Stokes Research Institute and Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA 19104-4318, USA
|
134
|
Kaundal R, Raghava GPS. RSLpred: an integrative system for predicting subcellular localization of rice proteins combining compositional and evolutionary information. Proteomics 2009; 9:2324-42. [DOI: 10.1002/pmic.200700597] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022]
|
135
|
Discrimination of disease-related non-synonymous single nucleotide polymorphisms using multi-scale RBF kernel fuzzy support vector machine. Pattern Recognit Lett 2009. [DOI: 10.1016/j.patrec.2008.11.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
|
136
|
Carrie C, Kühn K, Murcha MW, Duncan O, Small ID, O'Toole N, Whelan J. Approaches to defining dual-targeted proteins in Arabidopsis. The Plant Journal: for Cell and Molecular Biology 2009; 57:1128-39. [PMID: 19036033 DOI: 10.1111/j.1365-313x.2008.03745.x] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Indexed: 05/19/2023]
Abstract
A variety of approaches were used to predict dual-targeted proteins in Arabidopsis thaliana. These predictions were experimentally tested using GFP fusions. Twelve new dual-targeted proteins were identified: five that were dual-targeted to mitochondria and plastids, six that were dual-targeted to mitochondria and peroxisomes, and one that was dual-targeted to mitochondria and the nucleus. Two methods to predict dual-targeted proteins had a high success rate: (1) combining the AraPerox database with a variety of subcellular prediction programs to identify mitochondrial- and peroxisomal-targeted proteins, and (2) using a variety of prediction programs on a biochemical pathway or process known to contain at least one dual-targeted protein. Several technical parameters need to be taken into account before assigning subcellular localization using GFP fusion proteins. The position of GFP with respect to the tagged polypeptide, the tissue or cells used to detect subcellular localization, and the portion of a candidate protein fused to GFP are all relevant to the expression and targeting of a fusion protein. Testing all gene models for a chromosomal locus is required if more than one model exists.
Affiliation(s)
- Chris Carrie
- ARC Centre of Excellence in Plant Energy Biology, University of Western Australia, 35 Stirling Highway, Crawley 6009, WA, Australia
|
137
|
Ling XP, Zhu JY, Huang L, Huang HQ. Proteomic changes in response to acute cadmium toxicity in gill tissue of Paralichthys olivaceus. ENVIRONMENTAL TOXICOLOGY AND PHARMACOLOGY 2009; 27:212-218. [PMID: 21783942 DOI: 10.1016/j.etap.2008.10.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/02/2008] [Revised: 10/07/2008] [Accepted: 10/17/2008] [Indexed: 05/31/2023]
Abstract
In the present study, we developed a two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) technique for examining the response of the proteome from gill tissue of Paralichthys olivaceus (POGT) to acute cadmium (AC) toxicity. Approximately 700 protein spots were detected from the gill sample when applying a 600μg protein 2D-PAGE gel in the pH range 5.0-8.0, and approximately 400 of these were identified by peptide mass fingerprinting (PMF) and database search. Compared to a control sample, significant changes were visualized in 18 protein spots exposed to seawater cadmium acute toxicity at 10.0ppm for 24h. Among these spots, two were up-regulated, one was down-regulated, seven showed low expression, and eight showed high expression. The collected spots were further identified by PMF and database search. Ten of the 18 proteins identified on the 2D-PAGE gel, including heat shock protein 70 and calcium-binding protein, demonstrated a synchronous response to AC, and we suggest that the variable levels and trends of these spots on the gel might be utilized as biomarker profiles to investigate cadmium contamination levels in seawater and to evaluate the degree of risk of human fatalities. The experimental results emphasize that the application of multiple biomarkers has an advantage over single biomarkers for monitoring levels of heavy metal contamination in seawater.
Affiliation(s)
- Xue-Ping Ling
- Department of Biochemistry and Biotechnology, School of Life Sciences, Xiamen University, Xiamen 361005, China; Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
138
|
Wrzeszczynski KO, Rost B. Cell cycle kinases predicted from conserved biophysical properties. Proteins 2009; 74:655-68. [PMID: 18704950 DOI: 10.1002/prot.22181] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]
Abstract
Machine-learning techniques can classify functionally related proteins where homology-transfer as well as sequence and structure motifs fail. Here, we present a method that aimed at complementing homology-transfer in the identification of cell cycle control kinases from sequence alone. First, we identified functionally significant residues in cell cycle proteins through their high sequence conservation and biophysical properties. We then incorporated these residues and their features into support vector machines (SVM) to identify new kinases and more specifically to differentiate cell cycle kinases from other kinases and other proteins. As expected, the most informative residues tend to be highly conserved and tend to localize in the ATP binding regions of the kinases. Another observation confirmed that ATP binding regions are typically not found on the surface but in partially buried sites, and that this fact is correctly captured by accessibility predictions. Using these highly conserved, semi-buried residues and their biophysical properties, we could distinguish cell cycle S/T kinases from other kinase families at levels around 70-80% accuracy and 62-81% coverage. An application to the entire human proteome predicted at least 97 human proteins with limited previous annotations to be candidates for cell cycle kinases.
Affiliation(s)
- Kazimierz O Wrzeszczynski
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
139
|
Abstract
BACKGROUND Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Accurate prediction of the subcellular localization of proteins can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data must be labeled for the prediction system to learn a classifier with good generalization ability. In real-world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. RESULTS In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high-quality prediction model. We construct an initial classifier using a small set of labeled examples first, and then use unlabeled instances to refine the classifier for future predictions. CONCLUSION Experimental results show that our methods can effectively reduce the workload for labeling data using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.
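The two-stage idea in this abstract (fit an initial classifier on a few labeled examples, then refine it with unlabeled ones) can be illustrated with a generic self-training loop. The sketch below is not the authors' method: it swaps in a nearest-centroid classifier for the SVM, and the function names, the `min_margin` threshold, and the toy feature vectors are all hypothetical choices made for illustration.

```python
from math import dist


def centroids(points, labels):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(cents, p):
    """Return (label, margin): nearest centroid and its gap to the runner-up."""
    ranked = sorted((dist(p, c), y) for y, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(labeled, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Hypothetical self-training loop (assumption, not the paper's algorithm).

    Each round refits the centroids, pseudo-labels the unlabeled points whose
    margin is at least min_margin, and moves them into the training set.
    """
    points, ys, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(points, ys)
        scored = [(p, predict(cents, p)) for p in pool]
        accepted = [(p, y) for p, (y, m) in scored if m >= min_margin]
        if not accepted:
            break  # nothing confident left to add
        for p, y in accepted:
            points.append(p)
            ys.append(y)
        pool = [p for p, (_, m) in scored if m < min_margin]
    return centroids(points, ys)


# Toy data: two labeled proteins and three unlabeled ones (made-up features).
model = self_train(
    labeled=[(0.0, 0.0), (4.0, 4.0)],
    labels=["cytoplasm", "nucleus"],
    unlabeled=[(0.5, 0.5), (3.5, 3.5), (1.0, 0.0)],
)
print(predict(model, (1.0, 1.0))[0])
```

The margin threshold is the main design choice in such a loop: a higher value admits fewer pseudo-labels per round and so limits how far early mistakes can propagate.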
Affiliation(s)
- Qian Xu
- Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong.
|
140
|
Kumar M, Raghava GPS. Prediction of nuclear proteins using SVM and HMM models. BMC Bioinformatics 2009; 10:22. [PMID: 19152693 PMCID: PMC2632991 DOI: 10.1186/1471-2105-10-22] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 08/07/2008] [Accepted: 01/19/2009] [Indexed: 11/16/2022] Open
Abstract
Background The nucleus, a highly organized organelle, plays an important role in cellular homeostasis. The nuclear proteins are crucial for chromosomal maintenance/segregation, gene expression, RNA processing/export, and many other processes. Several methods have been developed for predicting the nuclear proteins in the past. The aim of the present study is to develop a new method for predicting nuclear proteins with higher accuracy. Results All modules were trained and tested on a non-redundant dataset and evaluated using a five-fold cross-validation technique. Firstly, Support Vector Machine (SVM) based modules have been developed using amino acid and dipeptide compositions and achieved a Matthews correlation coefficient (MCC) of 0.59 and 0.61 respectively. Secondly, we have developed SVM modules using split amino acid compositions (SAAC) and achieved the maximum MCC of 0.66. Thirdly, a hidden Markov model (HMM) based module/profile was developed for searching exclusively nuclear and non-nuclear domains in a protein. Finally, a hybrid module was developed by combining the SVM module and HMM profile and achieved an MCC of 0.87 with an accuracy of 94.61%. This method performs better than the existing methods when evaluated on blind/independent datasets. Our method estimated 31.51%, 21.89%, 26.31%, 25.72% and 24.95% of the proteins as nuclear proteins in Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, mouse and human proteomes respectively. Based on the above modules, we have developed a web server NpPred for predicting nuclear proteins. Conclusion This study describes a highly accurate method for predicting nuclear proteins. An SVM module has been developed for the first time using SAAC for predicting nuclear proteins, where the amino acid composition of the N-terminus and the remaining protein were computed separately.
In addition, our study is a first documentation where exclusively nuclear and non-nuclear domains have been identified and used for predicting nuclear proteins. The performance of the method improved further by combining both approaches together.
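Two quantities in this abstract lend themselves to a short illustration: the split amino acid composition, in which the N-terminus and the rest of the protein are summarized separately, and the Matthews correlation coefficient used to score the classifiers. The sketch below is a minimal reading of those descriptions, not the NpPred implementation. The 25-residue N-terminal window, the function names, and the toy sequence and confusion-matrix counts are assumptions made for illustration.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in seq (all zeros if seq is empty)."""
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def split_composition(seq, n_term=25):
    """Composition of the N-terminal window followed by that of the remainder.

    n_term=25 is an assumed window length, not a value taken from the abstract.
    """
    return composition(seq[:n_term]) + composition(seq[n_term:])


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy sequence with a 3-residue window so both halves are easy to check by eye.
features = split_composition("MKKAAAC", n_term=3)
print(len(features))                   # 40 values: 20 per region
print(round(mcc(45, 40, 10, 5), 3))    # made-up counts, for illustration only
```

A 40-dimensional vector of this kind would then be passed to a classifier; keeping the two regions separate preserves N-terminal signal that a whole-sequence composition would average away.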
Affiliation(s)
- Manish Kumar
- Bioinformatics Centre, Institute of Microbial Technology, Chandigarh, India
|
141
|
Improving Protein Localization Prediction Using Amino Acid Group Based Physichemical Encoding. ACTA ACUST UNITED AC 2009. [DOI: 10.1007/978-3-642-00727-9_24] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 02/25/2023]
|
142
|
Baginsky S. Plant proteomics: concepts, applications, and novel strategies for data interpretation. MASS SPECTROMETRY REVIEWS 2009; 28:93-120. [PMID: 18618656 DOI: 10.1002/mas.20183] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 05/08/2023]
Abstract
Proteomics is an essential source of information about biological systems because it generates knowledge about the concentrations, interactions, functions, and catalytic activities of proteins, which are the major structural and functional determinants of cells. In the last few years significant technology development has taken place both at the level of data analysis software and mass spectrometry hardware. Conceptual progress in proteomics has made possible the analysis of entire proteomes at previously unprecedented density and accuracy. New concepts have emerged that comprise quantitative analyses of full proteomes, database-independent protein identification strategies, targeted quantitative proteomics approaches with proteotypic peptides and the systematic analysis of an increasing number of posttranslational modifications at high temporal and spatial resolution. Although plant proteomics is making progress, there are still several analytical challenges that await experimental and conceptual solutions. With this review I will highlight the current status of plant proteomics and put it into the context of the aforementioned conceptual progress in the field, illustrate some of the plant-specific challenges and present my view on the great opportunities for plant systems biology offered by proteomics.
Affiliation(s)
- Sacha Baginsky
- Institute of Plant Sciences, Swiss Federal Institute of Technology, Universitätsstrasse 2, 8092 Zurich, Switzerland.
|
143
|
Distinct roles of GSK-3alpha and GSK-3beta phosphorylation in the heart under pressure overload. Proc Natl Acad Sci U S A 2008; 105:20900-5. [PMID: 19106302 DOI: 10.1073/pnas.0808315106] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Indexed: 11/18/2022] Open
Abstract
Glycogen synthase kinase-3 (GSK-3) is a master regulator of growth and death in cardiac myocytes. GSK-3 is inactivated by hypertrophic stimuli through phosphorylation-dependent and -independent mechanisms. Inactivation of GSK-3 removes the negative constraint of GSK-3 on hypertrophy, thereby stimulating cardiac hypertrophy. N-terminal phosphorylation of the GSK-3 isoforms GSK-3alpha and GSK-3beta by upstream kinases (e.g., Akt) is a major mechanism of GSK-3 inhibition. Nonetheless, its role in mediating cardiac hypertrophy and failure remains to be established. Here we evaluated the role of Serine(S)21 and S9 phosphorylation of GSK-3alpha and GSK-3beta in the regulation of cardiac hypertrophy and function during pressure overload (PO), using GSK-3alpha S21A knock-in (alphaKI) and GSK-3beta S9A knock-in (betaKI) mice. Although inhibition of S9 phosphorylation during PO in the betaKI mice attenuated hypertrophy and heart failure (HF), inhibition of S21 phosphorylation in the alphaKI mice unexpectedly promoted hypertrophy and HF. Inhibition of S21 phosphorylation in GSK-3alpha, but not of S9 phosphorylation in GSK-3beta, caused phosphorylation and down-regulation of G1-cyclins, due to preferential localization of GSK-3alpha in the nucleus, and suppressed E2F and markers of cell proliferation, including phosphorylated histone H3, under PO, thereby contributing to decreases in the total number of myocytes in the heart. Restoration of the E2F activity by injection of adenovirus harboring cyclin D1 with a nuclear localization signal attenuated HF under PO in the alphaKI mice. Collectively, our results reveal that whereas S9 phosphorylation of GSK-3beta mediates pathological hypertrophy, S21 phosphorylation of GSK-3alpha plays a compensatory role during PO, in part by alleviating the negative constraint on the cell cycle machinery in cardiac myocytes.
|
144
|
Katta SS, Sahasrabuddhe AA, Gupta CM. Flagellar localization of a novel isoform of myosin, myosin XXI, in Leishmania. Mol Biochem Parasitol 2008; 164:105-10. [PMID: 19121339 DOI: 10.1016/j.molbiopara.2008.12.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 10/23/2008] [Revised: 12/03/2008] [Accepted: 12/05/2008] [Indexed: 12/11/2022]
Abstract
Leishmania major genome analysis revealed the presence of putative genes corresponding to two myosins, which have been designated to class IB and a novel class, class XXI, specifically present in kinetoplastids. To characterize these myosin homologs in Leishmania, we have cloned and over-expressed the full-length myosin XXI gene and variable region of myosin IB gene in bacteria, purified the corresponding proteins, and then used the affinity purified anti-sera to analyze the expression and intracellular distribution of these proteins. Whereas myosin XXI was expressed in both the promastigote and amastigote stages, no expression of myosin IB could be detected in any of the two stages of these parasites. Further, myosin XXI expression was more predominant in the promastigote stage where it was preferentially localized in the proximal region of the flagellum. The observed flagellar localization was not dependent on the myosin head region or actin but was exclusively determined by the myosin tail region, as judged by over-expressing GFP conjugates of full-length myosin XXI, its head domain and its tail domain separately in Leishmania. Furthermore, immunofluorescence and immuno-gold electron microscopy analyses revealed that this protein was partly associated with paraflagellar rod proteins but not with tubulins in the flagellar axoneme. Our results, for the first time, report the expression and detailed analysis of cellular localization of a novel class of myosin, myosin XXI in trypanosomatids.
Affiliation(s)
- Santharam S Katta
- Division of Molecular and Structural Biology, Central Drug Research Institute, Lucknow, India
|
145
|
Garg A, Raghava GPS. ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins. BMC Bioinformatics 2008; 9:503. [PMID: 19038062 PMCID: PMC2612013 DOI: 10.1186/1471-2105-9-503] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 03/26/2008] [Accepted: 11/28/2008] [Indexed: 11/10/2022] Open
Abstract
Background The expansion of raw protein sequence databases in the post genomic era and availability of fresh annotated sequences for major localizations particularly motivated us to introduce a new improved version of our previously forged eukaryotic subcellular localizations prediction method namely "ESLpred". Since, subcellular localization of a protein offers essential clues about its functioning, hence, availability of localization predictor would definitely aid and expedite the protein deciphering studies. However, robustness of a predictor is highly dependent on the superiority of dataset and extracted protein attributes; hence, it becomes imperative to improve the performance of presently available method using latest dataset and crucial input features. Results Here, we describe augmentation in the prediction performance obtained for our most popular ESLpred method using new crucial features as an input to Support Vector Machine (SVM). In addition, recently available, highly non-redundant dataset encompassing three kingdoms specific protein sequence sets; 1198 fungi sequences, 2597 from animal and 491 plant sequences were also included in the present study. First, using the evolutionary information in the form of profile composition along with whole and N-terminal sequence composition as an input feature vector of 440 dimensions, overall accuracies of 72.7, 75.8 and 74.5% were achieved respectively after five-fold cross-validation. Further, enhancement in performance was observed when similarity search based results were coupled with whole and N-terminal sequence composition along with profile composition by yielding overall accuracies of 75.9, 80.8, 76.6% respectively; best accuracies reported till date on the same datasets. Conclusion These results provide confidence about the reliability and accurate prediction of SVM modules generated in the present study using sequence and profile compositions along with similarity search based results. 
The presently developed modules are implemented as web server "ESLpred2" available at .
Affiliation(s)
- Aarti Garg
- Department of Biotechnology, Panjab University, Chandigarh, India.
|
146
|
Zhu YZ, Li QT, Wang L, Zhong Y, Ding GH, Li G, Jia PL, Shi TL, Guo XK. Gene expression profiling-based in silico approach to identify potential vaccine candidates and drug targets against B. pertussis and B. parapertussis. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2008; 12:161-9. [PMID: 18717643 DOI: 10.1089/omi.2008.0029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]
Abstract
Whooping cough (pertussis) caused by B. pertussis (B.p) is still a serious public health threat. B. parapertussis (B.pp), closely related to B.p, also causes whooping cough. The incidence of B.pp infections has been increasing over the last decades, partly because pertussis vaccines have low efficiency against B.pp infections. Moreover, because the majority of pertussis patients are infants, common antimicrobial agents producing serious adverse reactions in infants are not fully satisfactory. Therefore, we try to identify potential vaccine candidates and alternative drug targets against both B.p and B.pp. This preliminary work integrates several different kinds of data from in silico analysis, comparative genomic hybridization, global transcriptional profiling, and protein-protein interaction (PPI) networks to screen potential vaccine candidates and drug targets against the two species. Finally, 191 potential cross-protective vaccine candidates are identified. They have high transcriptional levels in both species, or are associated with virulence and pathogenesis. Moreover, these proteins are not only potentially surface-exposed in the bacteria, but also well conserved among the 165 B.p and B.pp strains. Among them, 22 candidates with high essentiality in the two PPI networks of B.p and B.pp are regarded as suitable drug targets against the two species. We selected Bordetella as an example to develop a rapid and reliable approach for screening alternative drug targets that are associated with novel protein pathways, complexes, and cellular functions against these antibiotic-resistant pathogens. Further research focusing on the 191 vaccine candidates could accelerate the development of more effective vaccines and drug therapy against B.p and B.pp infection.
Affiliation(s)
- Yong-Zhang Zhu
- Department of Medical Microbiology and Parasitology, Institutes of Medical Sciences, Shanghai Jiao Tong University School of Medicine, 200025 Shanghai, People's Republic of China
|
147
|
Yoshihara C, Inoue K, Schichnes D, Ruzin S, Inwood W, Kustu S. An Rh1-GFP fusion protein is in the cytoplasmic membrane of a white mutant strain of Chlamydomonas reinhardtii. MOLECULAR PLANT 2008; 1:1007-20. [PMID: 19825599 PMCID: PMC2902906 DOI: 10.1093/mp/ssn074] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/18/2008] [Accepted: 10/14/2008] [Indexed: 05/21/2023]
Abstract
The major Rhesus (Rh) protein of the green alga Chlamydomonas reinhardtii, Rh1, is homologous to Rh proteins of humans. It is an integral membrane protein involved in transport of carbon dioxide. To localize a fusion of intact Rh1 to the green fluorescent protein (GFP), we used as host a white (lts1) mutant strain of C. reinhardtii, which is blocked at the first step of carotenoid biosynthesis. The lts1 mutant strain accumulated normal amounts of Rh1 heterotrophically in the dark and Rh1-GFP was at the periphery of the cell co-localized with the cytoplasmic membrane dye FM4-64. Although Rh1 carries a potential chloroplast targeting sequence at its N-terminus, Rh1-GFP was clearly not associated with the chloroplast envelope membrane. Moreover, the N-terminal half of the protein was not imported into chloroplasts in vitro and N-terminal regions of Rh1 did not direct import of the small subunit of ribulose bisphosphate carboxylase (SSU). Despite caveats to this interpretation, which we discuss, current evidence indicates that Rh1 is a cytoplasmic membrane protein and that Rh1-GFP is among the first cytoplasmic membrane protein fusions to be obtained in C. reinhardtii. Although lts1 (white) mutant strains cannot be used to localize proteins within sub-compartments of the chloroplast because they lack thylakoid membranes, they should nonetheless be valuable for localizing many GFP fusions in Chlamydomonas.
Affiliation(s)
- Corinne Yoshihara
- Department of Plant and Microbial Biology, 111 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
- Kentaro Inoue
- Department of Plant Sciences, 131 Asmundson Hall, University of California, One Shields Avenue, Davis, CA 95616, USA
- Denise Schichnes
- CNR Biological Imaging Facility, 381 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
- Steven Ruzin
- CNR Biological Imaging Facility, 381 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
- William Inwood
- Department of Plant and Microbial Biology, 111 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
- Sydney Kustu
- Department of Plant and Microbial Biology, 111 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA
- To whom correspondence should be addressed. E-mail , fax (510) 642-4995, tel. (510) 643-9308
|
148
|
Yu Y, Tang T, Qian Q, Wang Y, Yan M, Zeng D, Han B, Wu CI, Shi S, Li J. Independent losses of function in a polyphenol oxidase in rice: differentiation in grain discoloration between subspecies and the role of positive selection under domestication. THE PLANT CELL 2008; 20:2946-59. [PMID: 19033526 PMCID: PMC2613672 DOI: 10.1105/tpc.108.060426] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Indexed: 05/08/2023]
Abstract
Asian rice (Oryza sativa) cultivars originated from wild rice and can be divided into two subspecies by several criteria, one of which is the phenol reaction (PHR) phenotype. Grains of indica cultivars turn brown in a phenol solution that accelerates a similar process that occurs during prolonged storage. By contrast, the grains of japonica do not discolor. This distinction may reflect the divergent domestication of these two subspecies. The PHR is controlled by a single gene, Phr1; here, we report the cloning of Phr1, which encodes a polyphenol oxidase. The Phr1 gene is indeed responsible for the PHR phenotype, as transformation with a functional Phr1 can complement a PHR negative cultivar. Phr1 is defective in all japonica lines but functional in nearly all indica and wild strains. Phylogenetic analysis showed that the defects in Phr1 arose independently three times. The multiple recent origins and rapid spread of phr1 in japonica suggest the action of positive selection, which is further supported by several population genetic tests. This case may hence represent an example of artificial selection driving the differentiation among domesticated varieties.
MESH Headings
- Amino Acid Sequence
- Catechol Oxidase/genetics
- Cloning, Molecular
- Crops, Agricultural/genetics
- DNA, Plant/genetics
- Evolution, Molecular
- Genes, Plant
- Genetic Complementation Test
- Genetics, Population
- Molecular Sequence Data
- Mutation
- Oryza/genetics
- Phylogeny
- Plant Proteins/genetics
- Plant Structures/genetics
- Plants, Genetically Modified/genetics
- Polymorphism, Genetic
- Selection, Genetic
- Sequence Analysis, DNA
- Species Specificity
Affiliation(s)
- Yanchun Yu
- State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
|
149
|
Punta M, Ofran Y. The rough guide to in silico function prediction, or how to use sequence and structure information to predict protein function. PLoS Comput Biol 2008; 4:e1000160. [PMID: 18974821 PMCID: PMC2518264 DOI: 10.1371/journal.pcbi.1000160] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.1] [Indexed: 12/22/2022] Open
Affiliation(s)
- Marco Punta
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Columbia University Center for Computational Biology and Bioinformatics (C2B2), New York, New York, United States of America
- Northeast Structural Genomics Consortium (NESG), Columbia University, New York, New York, United States of America
- Yanay Ofran
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- * E-mail:
|
150
|
Soong TT, Wrzeszczynski KO, Rost B. Physical protein-protein interactions predicted from microarrays. Bioinformatics 2008; 24:2608-14. [PMID: 18829707 PMCID: PMC2579715 DOI: 10.1093/bioinformatics/btn498] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022]
Abstract
Motivation: Microarray expression data reveal functionally associated proteins. However, most proteins that are associated are not actually in direct physical contact. Predicting physical interactions directly from microarrays is both a challenging and important task that we addressed by developing a novel machine learning method optimized for this task. Results: We validated our support vector machine-based method on several independent datasets. At the same levels of accuracy, our method recovered more experimentally observed physical interactions than a conventional correlation-based approach. Pairs predicted by our method to very likely interact were close in the overall network of interaction, suggesting our method as an aid for functional annotation. We applied the method to predict interactions in yeast (Saccharomyces cerevisiae). A Gene Ontology function annotation analysis and literature search revealed several probable and novel predictions worthy of future experimental validation. We therefore hope our new method will improve the annotation of interactions as one component of multi-source integrated systems. Contact: ts2186@columbia.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ta-Tsen Soong
- Columbia University Center for Computational Biology and Bioinformatics, Columbia University, New York, NY, USA.
|