1
|
Zhang T, Zhang M, Xu Z, He Y, Zhao X, Cheng H, Chen X, Xu J, Ding Z. The Screening of the Protective Antigens of Aeromonas hydrophila Using the Reverse Vaccinology Approach: Potential Candidates for Subunit Vaccine Development. Vaccines (Basel) 2023; 11:1266. [PMID: 37515081 PMCID: PMC10383140 DOI: 10.3390/vaccines11071266] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/16/2023] [Revised: 07/14/2023] [Accepted: 07/19/2023] [Indexed: 07/30/2023] Open
Abstract
Vaccination can prevent the bacterial septicemia that Aeromonas hydrophila infection causes in aquaculture, but differences among A. hydrophila strains may affect the effectiveness of non-conserved subunit vaccines or non-inactivated A. hydrophila vaccines, which makes the identification and development of conserved antigens crucial. In this study, 4268 protein sequences encoded by the whole genome of the A. hydrophila J-1 strain were analyzed bioinformatically, following a reverse vaccinology approach. The analysis comprised signal peptide prediction, transmembrane helix prediction, subcellular localization prediction, antigenicity and adhesion evaluation, and interspecific and intraspecific homology comparison, and it yielded 39 conserved proteins as candidate antigens for an A. hydrophila vaccine. Nine A. hydrophila strains isolated from diseased fish were categorized into 6 molecular subtypes via enterobacterial repetitive intergenic consensus (ERIC)-PCR, and the coding regions of the 39 candidate proteins were amplified via PCR and sequenced to verify their conservation across A. hydrophila subtypes and other Aeromonas species. Sixteen proteins were highly conserved across the A. hydrophila subtypes; 2 of these were also highly conserved across Aeromonas species and could be selected as candidate antigens for vaccine development: type IV pilus secretin PilQ (AJE35401.1) and TolC family outer membrane protein (AJE35877.1). By screening conserved A. hydrophila antigens through reverse vaccinology, the study provides a foundation for developing broad-spectrum protective vaccines against A. hydrophila.
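The screening steps listed in this abstract amount to a chain of filters over per-protein annotations. A minimal Python sketch of that idea follows. The field names, the accessions, the direction of each filter, and every cutoff are illustrative assumptions; none of them is taken from the paper, which does not state its thresholds in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Localizations treated as surface-exposed in this sketch (an assumption).
SURFACE_LOCATIONS = {"outer_membrane", "extracellular"}


@dataclass(frozen=True)
class ProteinAnnotation:
    """Predicted features for one protein; all fields are hypothetical."""
    accession: str
    has_signal_peptide: bool
    tm_helix_count: int
    localization: str
    antigenicity: float               # predictor score, higher = more antigenic
    adhesin_probability: float        # predictor score in [0, 1]
    interspecies_identity_pct: float  # identity to homologs in other species
    intraspecies_identity_pct: float  # identity to homologs in other strains


def is_candidate(
    p: ProteinAnnotation,
    max_tm_helices: int = 1,                  # assumed cutoff
    min_antigenicity: float = 0.4,            # assumed cutoff
    min_adhesin: float = 0.5,                 # assumed cutoff
    min_interspecies_identity: float = 80.0,  # assumed cutoff
    min_intraspecies_identity: float = 90.0,  # assumed cutoff
) -> bool:
    """Return True only if the protein passes every filter."""
    return (
        p.has_signal_peptide
        and p.tm_helix_count <= max_tm_helices
        and p.localization in SURFACE_LOCATIONS
        and p.antigenicity >= min_antigenicity
        and p.adhesin_probability >= min_adhesin
        and p.interspecies_identity_pct >= min_interspecies_identity
        and p.intraspecies_identity_pct >= min_intraspecies_identity
    )


def screen(proteins: Iterable[ProteinAnnotation]) -> List[str]:
    """Return the accessions of proteins that pass all filters, in input order."""
    return [p.accession for p in proteins if is_candidate(p)]
```

Applying the filters as one conjunction keeps the order of the steps irrelevant to the result; a real pipeline would more likely run the cheap predictors first to shrink the set before the homology comparisons.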
Affiliation(s)
- Ting Zhang
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Minying Zhang
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Zehua Xu
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Yang He
  - Key Laboratory of Sichuan Province for Fishes Conservation and Utilization in the Upper Reaches of the Yangtze River, Neijiang Normal University, Neijiang 641000, China
- Xiaoheng Zhao
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Hanliang Cheng
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Xiangning Chen
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Jianhe Xu
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
- Zhujin Ding
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Key Laboratory of Marine Biotechnology, School of Marine Science and Fisheries, Jiangsu Ocean University, Lianyungang 222005, China
  - Jiangsu Institute of Marine Resources Development, Lianyungang 222005, China
|
2
|
Christian R, Labbancz J, Usadel B, Dhingra A. Understanding protein import in diverse non-green plastids. Front Genet 2023; 14:969931. [PMID: 37007964 PMCID: PMC10063809 DOI: 10.3389/fgene.2023.969931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 02/24/2023] [Indexed: 03/19/2023] Open
Abstract
The spectacular diversity of plastids in non-green organs such as flowers, fruits, roots, tubers, and senescing leaves represents a universe of metabolic processes in higher plants that remains to be completely characterized. The endosymbiosis of the plastid, the subsequent export of the ancestral cyanobacterial genome to the nuclear genome, and the adaptation of plants to all types of environments have resulted in the emergence of a diverse and highly orchestrated metabolism across the plant kingdom that is entirely reliant on a complex protein import and translocation system. The TOC and TIC translocons, critical for importing nuclear-encoded proteins into the plastid stroma, remain poorly resolved, especially in the case of TIC. From the stroma, three core pathways (cpTat, cpSec, and cpSRP) may localize imported proteins to the thylakoid. Non-canonical routes that utilize only TOC also exist for the insertion of many inner and outer membrane proteins, and some modified proteins follow a vesicular import route. Understanding this complex protein import system is further complicated by the highly heterogeneous nature of transit peptides and by the varying transit peptide specificity of plastids, which depends on the species and on the developmental and trophic stage of the plant organs. Computational tools provide an increasingly sophisticated means of predicting protein import into highly diverse non-green plastids across higher plants, but their predictions need to be validated using proteomics and metabolic approaches. The myriad plastid functions enable higher plants to interact with and respond to all kinds of environments. Unraveling the diversity of non-green plastid functions across the higher plants has the potential to provide knowledge that will help in developing climate-resilient crops.
Affiliation(s)
- Ryan Christian
  - Department of Horticulture, Washington State University, Pullman, WA, United States
- June Labbancz
  - Department of Horticulture, Washington State University, Pullman, WA, United States
  - Department of Horticultural Sciences, Texas A&M University, College Station, TX, United States
- Amit Dhingra
  - Department of Horticulture, Washington State University, Pullman, WA, United States
  - Department of Horticultural Sciences, Texas A&M University, College Station, TX, United States
  - *Correspondence: Amit Dhingra,
|
3
|
Vasu K, Khan D, Ramachandiran I, Blankenberg D, Fox P. Analysis of nested alternate open reading frames and their encoded proteins. NAR Genom Bioinform 2022; 4:lqac076. [PMID: 36267124 PMCID: PMC9580016 DOI: 10.1093/nargab/lqac076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 08/14/2022] [Accepted: 09/27/2022] [Indexed: 11/22/2022] Open
Abstract
Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number, while maintaining a sequence relationship between original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, ‘alt-proteins’ lack sequence homology with host ORF-derived proteins. We show global amino acid frequencies, and consequent biochemical characteristics of Alt-ORFs nested within host ORFs (nAlt-ORFs), are genetically-driven, and predicted by summation of frequencies of hundreds of encompassing host codon-pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous charged amino acid levels. Also, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus 3, predicted mitochondrial and nuclear localization, and elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploration of these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
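To make the notion of a nested Alt-ORF concrete, the following Python sketch scans the two shifted reading frames of a host ORF for ATG-to-stop stretches that lie entirely inside it. It is a simplified illustration, not the authors' pipeline: the function name is hypothetical, the frame numbering (host frame = 1, shifted by one nucleotide = 2, shifted by two = 3) is an assumption, and only the first ATG after each stop is considered. The 150-codon default mirrors the length cutoff quoted in the abstract.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def nested_alt_orfs(host_orf: str, min_codons: int = 150) -> List[Tuple[int, int, int]]:
    """Find ORFs in frames 2 and 3 that lie fully inside a frame-1 host ORF.

    Returns (frame, start, end) tuples with 0-based start and exclusive end;
    the end includes the stop codon. min_codons counts sense codons only.
    """
    seq = host_orf.upper()
    hits = []
    for offset in (1, 2):  # offset 1 -> frame 2, offset 2 -> frame 3
        start = None
        # Step through complete codons in this frame.
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    hits.append((offset + 1, start, i + 3))
                start = None
        # An ORF still open at the end has no in-host stop, so it is not nested.
    return hits
```

For example, `nested_alt_orfs("AATGGCTGCTGCTTAACC", min_codons=4)` reports one frame-2 ORF on a toy string; real input would be an annotated coding sequence.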
Affiliation(s)
- Kommireddy Vasu
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Debjit Khan
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Iyappan Ramachandiran
  - Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Daniel Blankenberg
  - Correspondence may also be addressed to Daniel Blankenberg. Tel: +1 216 444 4336;
- Paul L Fox
  - To whom correspondence should be addressed. Tel: +1 216 444 8053; Fax: +1 216 444 9404;
|
4
|
Ma SH, Kim HM, Park SH, Park SY, Mai TD, Do JH, Koo Y, Joung YH. The ten amino acids of the oxygen-evolving enhancer of tobacco is sufficient as the peptide residues for protein transport to the chloroplast thylakoid. PLANT MOLECULAR BIOLOGY 2021; 105:513-523. [PMID: 33393067 PMCID: PMC7892526 DOI: 10.1007/s11103-020-01106-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2020] [Accepted: 12/10/2020] [Indexed: 06/12/2023]
Abstract
KEY MESSAGE: The thylakoid transit peptide of the tobacco oxygen-evolving enhancer protein contains a minimal ten-amino-acid sequence for thylakoid lumen transport. These ten amino acids do not contain the twin-arginine motif, which is required for typical chloroplast lumen translocation. Chloroplasts are intracellular organelles responsible for photosynthesis, which produces organic carbon for all organisms. Numerous proteins must be transported from the cytosol to chloroplasts to support photosynthesis. This transport is facilitated by chloroplast transit peptides (TPs). Four chloroplast thylakoid lumen TPs were isolated from Nicotiana tabacum and functionally analyzed as thylakoid lumen TPs. Typical chloroplast stroma transit peptides and thylakoid lumen transit peptides (tTPs) were found among the N. tabacum transit peptides (NtTPs), and the functions of these peptides were confirmed with TP-GFP fusion proteins by fluorescence microscopy and by chloroplast fractionation followed by Western blot analysis. During the functional analysis of tTPs, we found that a minimal sequence of 10 amino acids is sufficient for thylakoid lumen transport. These ten amino acids can efficiently translocate GFP even though they do not contain the twin-arginine residues required for the twin-arginine translocation (Tat) pathway, which is a typical thylakoid lumen transport route. Further, thylakoid lumen transport through the Tat pathway was examined by analyzing tTP sequence functions, and we demonstrate the importance of the hydrophobic core for tTP cleavage and target protein translocation.
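The two sequence features this abstract turns on, a twin-arginine pair and a hydrophobic core, are easy to check on a peptide string. The Python sketch below does so under stated assumptions: the helper names are hypothetical, the residue set counted as hydrophobic is a simplification chosen for illustration, and the peptides in the usage note are made up rather than the ten-residue sequence from the paper.

```python
# Residues counted as hydrophobic in this sketch (a simplifying assumption).
HYDROPHOBIC = set("AVLIMFWC")


def has_twin_arginine(peptide: str) -> bool:
    """Return True if the peptide contains two consecutive arginines (RR)."""
    return "RR" in peptide.upper()


def longest_hydrophobic_run(peptide: str) -> int:
    """Return the length of the longest uninterrupted hydrophobic stretch."""
    best = 0
    run = 0
    for residue in peptide.upper():
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best
```

On the made-up peptide `"MAALLVAGSA"`, `has_twin_arginine` returns `False` while `longest_hydrophobic_run` returns 7, which is the kind of profile the abstract describes: no RR pair, but a pronounced hydrophobic core.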
Affiliation(s)
- Sang Hoon Ma
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Hyun Min Kim
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Se Hee Park
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Seo Young Park
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Thanh Dat Mai
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Ju Hui Do
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
- Yeonjong Koo
  - Department of Agricultural Chemistry, Chonnam National University, Gwangju, 61186, South Korea
- Young Hee Joung
  - School of Biological Science and Technology, Chonnam National University, Gwangju, 61186, South Korea
|
5
|
Zhang WX, Pan X, Shen HB. Signal-3L 3.0: Improving Signal Peptide Prediction through Combining Attention Deep Learning with Window-Based Scoring. J Chem Inf Model 2020; 60:3679-3686. [PMID: 32501689 DOI: 10.1021/acs.jcim.0c00401] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
Signal peptides play an important role in guiding and transferring transmembrane proteins and secreted proteins. In recent years, with the explosive growth of protein sequences, computationally predicting signal peptides and their cleavage sites from protein sequences is highly desired. In this work, we present an improved approach, Signal-3L 3.0, for signal peptide recognition and cleavage-site prediction using a 3-layer hybrid method of integrating deep learning algorithms and window-based scoring. There are three main components in the Signal-3L 3.0 prediction engine: (1) a deep bidirectional long short-term memory (Bi-LSTM) network with a soft self-attention learns abstract features from sequences to determine whether a query protein contains a signal peptide; (2) the statistics propensity window-based cleavage site screening method is applied to generate the set of candidate cleavage sites; (3) the prediction of a conditional random field with a hybrid convolutional neural network (CNN) and Bi-LSTM is fused with the window-based score for identifying the final unique cleavage site. Experimental results on the benchmark datasets show that the new deep learning-driven Signal-3L 3.0 yields promising performance. The online server of Signal-3L 3.0 is available at http://www.csbio.sjtu.edu.cn/bioinf/Signal-3L/.
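Component (2), window-based screening for candidate cleavage sites, can be illustrated with a much simpler stand-in. The Python sketch below does not reproduce the statistical propensity scores of Signal-3L 3.0, which the abstract does not specify. It swaps in a reduced form of the classic (-3, -1) rule, keeping a position as a candidate when the residues at -3 and -1 relative to the cut are small. The residue set, the search window of 15 to 35, and the function name are assumptions made for illustration.

```python
from typing import List

# Residues accepted at positions -3 and -1 in this sketch (a simplification
# of the (-3, -1) rule; the exact sets are an assumption).
SMALL_RESIDUES = set("AGSCT")


def candidate_cleavage_sites(seq: str, min_pos: int = 15, max_pos: int = 35) -> List[int]:
    """Return candidate cut positions p within [min_pos, max_pos].

    A cut at p separates seq[:p] (signal peptide) from seq[p:] (mature
    protein), so position -1 is seq[p - 1] and position -3 is seq[p - 3].
    """
    seq = seq.upper()
    sites = []
    # Keep at least one residue on the mature side of the cut.
    last = min(max_pos, len(seq) - 1)
    for p in range(max(min_pos, 3), last + 1):
        if seq[p - 1] in SMALL_RESIDUES and seq[p - 3] in SMALL_RESIDUES:
            sites.append(p)
    return sites
```

A screening step like this only narrows the search; in the approach the abstract describes, a learned model then picks one final site from the candidate set.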
Affiliation(s)
- Wei-Xun Zhang
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai Jiao Tong University, and Institute of Image Processing and Pattern Recognition, Shanghai 200240, China
- Xiaoyong Pan
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai Jiao Tong University, and Institute of Image Processing and Pattern Recognition, Shanghai 200240, China
- Hong-Bin Shen
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai Jiao Tong University, and Institute of Image Processing and Pattern Recognition, Shanghai 200240, China
|
6
|
Molecular Evolution and Functional Analysis of Rubredoxin-Like Proteins in Plants. BIOMED RESEARCH INTERNATIONAL 2019; 2019:2932585. [PMID: 31355252 PMCID: PMC6634066 DOI: 10.1155/2019/2932585] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/19/2019] [Revised: 05/14/2019] [Accepted: 06/09/2019] [Indexed: 11/17/2022]
Abstract
Rubredoxins are a class of iron-containing proteins that play an important role in the reduction of superoxide in some anaerobic bacteria and also act as electron carriers in many biochemical processes. Whereas rubredoxins have been widely studied in anaerobic bacteria, very little research has addressed their function in plants. Previous studies indicated that rubredoxins in A. thaliana may play a critical role in responding to oxidative stress. To identify more plant rubredoxins that may function similarly to the rubredoxin-like protein of A. thaliana, we identified and analyzed plant rubredoxin proteins using bioinformatics-based methods. In total, 66 candidate rubredoxin proteins were identified from public databases, with lengths of 187-360 amino acids and molecular weights of 19.856-37.117 kDa. Subcellular localization analysis showed that these candidate rubredoxins were localized to the chloroplast, which might be consistent with the fact that rubredoxins were predominantly expressed in leaves. Analyses of conserved motifs indicated that these candidate rubredoxins contained rubredoxin and PDZ domains. The expression patterns of rubredoxins in glycophytic and halophytic plants under salt/drought stress revealed that rubredoxin is one of the important stress response proteins. Finally, the coexpression network of rubredoxin in Arabidopsis thaliana under abiotic stress was extracted from ATTED-II to explore the function and regulatory relationships of rubredoxin in Arabidopsis thaliana. Our results showed that putative rubredoxin proteins containing PDZ and rubredoxin domains, localized to the chloroplast, may act with other chloroplast proteins to respond to abiotic stress in higher plants. These findings might provide valuable inferences for improving the tolerance of plants, including economically important crops, to some abiotic stresses.
|
7
|
Chen L, Wang X, Wang L, Fang Y, Pan X, Gao X, Zhang W. Functional characterization of chloroplast transit peptide in the small subunit of Rubisco in maize. JOURNAL OF PLANT PHYSIOLOGY 2019; 237:12-20. [PMID: 30999073 DOI: 10.1016/j.jplph.2019.04.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/18/2018] [Revised: 04/04/2019] [Accepted: 04/04/2019] [Indexed: 06/09/2023]
Abstract
The functions of domains and motifs in the transit peptide (TP) of the precursor of the small subunit of Rubisco (prSSU) have been investigated intensively in dicots. The prSSU TP, however, remains functionally understudied in maize. In this study, we found that the TP of maize prSSU1 did not function fully in chloroplast targeting in Arabidopsis, or vice versa, indicating that TP function in chloroplast targeting has diverged between maize and Arabidopsis. Through deletion or substitution assays, we found that the N-terminal region of maize or Arabidopsis prSSU1 was necessary and sufficient for specifically importing the fused green fluorescent protein (GFP) into the corresponding chloroplast. Finally, we found that the first five amino acids and the MM motif in the N-terminal domain of the maize TP played an essential role in maize chloroplast targeting. Thus, our analyses demonstrate that the N-terminal domain of the prSSU1 TP is the key determinant of chloroplast targeting in maize and Arabidopsis. Our study highlights the unique properties of the maize prSSU1 TP in chloroplast targeting, thus helping to clarify the role of the N-terminal domain in chloroplast targeting across species. It will also help in manipulating chloroplast transit peptides (cTPs) for crop bioengineering.
Affiliation(s)
- Lifen Chen
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Ximeng Wang
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Lei Wang
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Yuan Fang
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Xiucai Pan
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Xiquan Gao
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
- Wenli Zhang
  - State Key Laboratory for Crop Genetics and Germplasm Enhancement, JiangSu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No.1 Weigang, Nanjing, Jiangsu, 210095, China
|
8
|
Jung W, Kim EJ, Han SJ, Kang SH, Choi HG, Kim S. Enzymatic modification by point mutation and functional analysis of an omega-6 fatty acid desaturase from Arctic Chlamydomonas sp. Prep Biochem Biotechnol 2017; 47:143-150. [PMID: 27191514 DOI: 10.1080/10826068.2016.1188311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Arctic Chlamydomonas sp. is a dominant microalgal strain in cold or frozen freshwater in the Arctic region. The full-length open reading frame of the omega-6 fatty acid desaturase gene (AChFAD6) was obtained from the transcriptomic database of Arctic Chlamydomonas sp. from the KOPRI culture collection of polar micro-organisms. Amino acid sequence analysis indicated the presence of three conserved histidine-rich segments, which are characteristic of omega-6 fatty acid desaturases, and of three transmembrane regions that are transported to plastidic membranes by chloroplast transit peptides in the N-terminal region. The AChFAD6 desaturase activity was examined by expressing wild-type and V254A mutant (Mut-AChFAD6) heterologous recombinant proteins. Quantitative gas chromatography indicated that the concentration of linoleic acid in AChFAD6-transformed cells increased more than 3-fold [6.73 ± 0.13 mg g⁻¹ dry cell weight (DCW)] compared with cells transformed with vector alone. In contrast, transformation with Mut-AChFAD6 increased the concentration of oleic acid to 9.23 ± 0.18 mg g⁻¹ DCW, indicating a change in enzymatic activity to mimic that of stearoyl-CoA desaturase. These results demonstrate that AChFAD6 of Arctic Chlamydomonas sp. increases membrane fluidity by enhancing desaturation of C18 fatty acids and facilitates production of large quantities of linoleic acid in prokaryotic expression systems.
Affiliation(s)
- Woongsic Jung
  - Division of Polar Life Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
- Eun Jae Kim
  - Division of Polar Life Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
  - Department of Polar Life Sciences, University of Science and Technology, Incheon, Republic of Korea
- Se Jong Han
  - Division of Polar Life Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
  - Department of Polar Life Sciences, University of Science and Technology, Incheon, Republic of Korea
- Sung-Ho Kang
  - Division of Polar Ocean Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
- Han-Gu Choi
  - Division of Polar Life Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
- Sanghee Kim
  - Division of Polar Life Sciences, Korea Polar Research Institute, Korea Institute of Ocean Science and Technology, Incheon, Republic of Korea
|
9
|
Zhang SW, Zhang TH, Zhang JN, Huang Y. Prediction of Signal Peptide Cleavage Sites with Subsite-Coupled and Template Matching Fusion Algorithm. Mol Inform 2014; 33:230-9. [DOI: 10.1002/minf.201300077] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/28/2013] [Accepted: 01/13/2014] [Indexed: 12/22/2022]
|
10
|
A methodological review of data mining techniques in predictive medicine: An application in hemodynamic prediction for abdominal aortic aneurysm disease. Biocybern Biomed Eng 2014. [DOI: 10.1016/j.bbe.2014.03.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
|
11
|
Andreoni F, Boiani R, Serafini G, Amagliani G, Dominici S, Riccioni G, Zaccone R, Mancuso M, Scapigliati G, Magnani M. Isolation of a novel gene from Photobacterium damselae subsp. piscicida and analysis of the recombinant antigen as promising vaccine candidate. Vaccine 2013; 31:820-6. [DOI: 10.1016/j.vaccine.2012.11.064] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 07/05/2012] [Revised: 10/19/2012] [Accepted: 11/22/2012] [Indexed: 11/30/2022]
|
12
|
Kobuchi H, Moriya K, Ogino T, Fujita H, Inoue K, Shuin T, Yasuda T, Utsumi K, Utsumi T. Mitochondrial localization of ABC transporter ABCG2 and its function in 5-aminolevulinic acid-mediated protoporphyrin IX accumulation. PLoS One 2012. [PMID: 23189181 PMCID: PMC3506543 DOI: 10.1371/journal.pone.0050082] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.8] [Indexed: 01/09/2023] Open
Abstract
Accumulation of protoporphyrin IX (PpIX) in malignant cells is the basis of 5-aminolevulinic acid (ALA)-mediated photodynamic therapy. We studied the expression of proteins that possibly affect ALA-mediated PpIX accumulation, namely oligopeptide transporter-1 and -2, ferrochelatase and ATP-binding cassette transporter G2 (ABCG2), in several tumor cell lines. Among these proteins, only ABCG2 correlated negatively with ALA-mediated PpIX accumulation. Both a subcellular fractionation study and confocal laser microscopic analysis revealed that ABCG2 was distributed not only in the plasma membrane but also intracellular organelles, including mitochondria. In addition, mitochondrial ABCG2 regulated the content of ALA-mediated PpIX in mitochondria, and Ko143, a specific inhibitor of ABCG2, enhanced mitochondrial PpIX accumulation. To clarify the possible roles of mitochondrial ABCG2, we characterized stably transfected-HEK (ST-HEK) cells overexpressing ABCG2. In these ST-HEK cells, functionally active ABCG2 was detected in mitochondria, and treatment with Ko143 increased ALA-mediated mitochondrial PpIX accumulation. Moreover, the mitochondria isolated from ST-HEK cells exported doxorubicin probably through ABCG2, because the export of doxorubicin was inhibited by Ko143. The susceptibility of ABCG2 distributed in mitochondria to proteinase K, endoglycosidase H and peptide-N-glycosidase F suggested that ABCG2 in mitochondrial fraction is modified by N-glycans and trafficked through the endoplasmic reticulum and Golgi apparatus and finally localizes within the mitochondria. Thus, it was found that ABCG2 distributed in mitochondria is a functional transporter and that the mitochondrial ABCG2 regulates ALA-mediated PpIX level through PpIX export from mitochondria to the cytosol.
Affiliation(s)
- Hirotsugu Kobuchi
  - Department of Cell Chemistry, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama, Japan
|
13
|
Abstract
MOTIVATION: We previously reported the development of a highly accurate statistical algorithm for identifying β-barrel outer membrane proteins, or transmembrane β-barrels (TMBBs), from genomic sequence data of Gram-negative bacteria (Freeman, T.C. and Wimley, W.C. (2010) Bioinformatics, 26, 1965-1974). We have now applied this identification algorithm to all available Gram-negative bacterial genomes (over 600 chromosomes) and have constructed a publicly available, searchable, up-to-date database of all proteins in these genomes.
RESULTS: For each protein in the database, there is information on (i) β-barrel membrane protein probability, for identification of β-barrels; (ii) β-strand and β-hairpin propensity, for structure and topology prediction; (iii) signal sequence score, because most TMBBs are secreted through the inner membrane translocon and thus have a signal sequence; and (iv) transmembrane α-helix predictions, for reducing false positive predictions. This information is sufficient for the accurate identification of most β-barrel membrane proteins in these genomes. In the database there are nearly 50 000 predicted TMBBs (out of 1.9 million total putative proteins). Of those, more than 15 000 are 'hypothetical' or 'putative' proteins, not previously identified as TMBBs. This wealth of genomic information is not available anywhere else.
AVAILABILITY: The TMBB genomic database is available at http://beta-barrel.tulane.edu/.
CONTACT: wwimley@tulane.edu.
Affiliation(s)
- Thomas C Freeman
  - Department of Biochemistry, Tulane University, New Orleans, LA 70112, USA
|
14
|
Xu Q, Pan SJ, Xue HH, Yang Q. Multitask learning for protein subcellular location prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:748-759. [PMID: 20421687 DOI: 10.1109/tcbb.2010.22] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 05/29/2023]
Abstract
Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Thus, accurate prediction of the subcellular localization of proteins can help with the prediction of protein functions and genome annotations, as well as the identification of drug targets. Machine learning methods such as Support Vector Machines (SVMs) have been applied to protein subcellular localization in the past, but have been shown to suffer from a lack of annotated training data in each species under study. To overcome this data sparsity problem, we observe that because some of the organisms may be related to each other, there may be commonalities across organisms that can be discovered and used to boost the data in each localization task. In this paper, we formulate the protein subcellular localization problem as one of multitask learning across different organisms. We adapt and compare two specializations of the multitask learning algorithms on 20 different organisms. Our experimental results show that multitask learning performs much better than the traditional single-task methods. Among the different multitask learning methods, we found that the multitask kernels and supertype kernels, which share parameters, perform slightly better than multitask learning by sharing latent features. The largest improvement in localization accuracy is about 25 percent. We find that if the organisms are very different or are only remotely related from a biological point of view, then jointly training the multiple models does not lead to significant improvement. However, if they are closely related biologically, multitask learning can do much better than individual learning.
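One generic way to let related organisms share training signal in a kernel method is to multiply an ordinary kernel on the protein features by a similarity score between the two organisms (the tasks). The Python sketch below shows only that general product form. It is not the multitask or supertype kernel defined in the paper, and the function names, the RBF base kernel, and the task-similarity table are assumptions chosen for illustration.

```python
import math
from typing import Mapping, Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def multitask_kernel(
    x: Sequence[float],
    task_x: str,
    y: Sequence[float],
    task_y: str,
    task_similarity: Mapping[str, Mapping[str, float]],
    gamma: float = 1.0,
) -> float:
    """Product kernel: task similarity times feature similarity.

    task_similarity[s][t] is a hypothetical score in [0, 1]; a value of 0
    decouples two organisms, and a value of 1 pools their data completely.
    """
    return task_similarity[task_x][task_y] * rbf_kernel(x, y, gamma)
```

With this form, two proteins from distantly related organisms contribute little to each other's predictions, which matches the abstract's observation that joint training helps most when the organisms are closely related.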
Affiliation(s)
- Qian Xu
- Bioengineering Program, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong.
|
15
|
Abstract
One of the major challenges in the post-genomic era with hundreds of genomes sequenced is the annotation of protein structure and function. Computational predictions of subcellular localization are an important step toward this end. The development of computational tools that predict targeting and localization has, therefore, been a very active area of research, in particular since the first release of the groundbreaking program PSORT in 1991. The most reliable means of annotating protein structure and function remains homology-based inference, i.e. the transfer of experimental annotations from one protein to its homologs. However, annotations about localization demonstrate how much can be gained from advanced machine learning: more proteins can be annotated more reliably. Contemporary computational tools for the annotation of protein targeting include automatic methods that mine the textual information from the biological literature and molecular biology databases. Some machine learning-based methods that accurately predict features of sorting signals and that use sequence-derived features to predict localization have reached remarkable levels of performance. Sustained prediction accuracy has increased by more than 30 percentage points over the last decade. Here, we review some of the most recent methods for the prediction of subcellular localization and protein targeting that contributed toward this breakthrough.
Affiliation(s)
- Shruti Rastogi
- Department of Biochemistry and Molecular Biophysics, Columbia University and Columbia University Center for Computational Biology and Bioinformatics (C2B2), New York, NY, USA
|
16
|
Boiani R, Andreoni F, Serafini G, Bianconi I, Pierleoni R, Dominici S, Gorini F, Magnani M. Expression and characterization of the periplasmic cobalamin-binding protein of Photobacterium damselae subsp. piscicida. Journal of Fish Diseases 2009; 32:745-753. [PMID: 19490395 DOI: 10.1111/j.1365-2761.2009.01050.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/27/2023]
Abstract
Cobalamin (vitamin B12) is an essential cofactor in a variety of enzymatic reactions and most prokaryotes contain transport systems to import vitamin B12. A gene coding for a periplasmic cobalamin-binding protein of Photobacterium damselae subsp. piscicida was identified by in silico analysis of sequences from a genomic library. The open reading frame was composed of 834 bp encoding a protein of 277 amino acids. The protein showed 61% identity with the vitamin B12-binding protein precursor of P. profundum, 53% identity with the corresponding protein of Vibrio parahaemolyticus and 43% identity with the periplasmic binding protein BtuF of Escherichia coli. The expression of the native protein was investigated in P. damselae subsp. piscicida, but BtuF was weakly expressed under normal conditions. To characterize the BtuF of P. damselae subsp. piscicida, the recombinant protein was expressed with a C-terminal His6-tag and purified; the molecular weight was estimated to be approximately 30 kDa. The protein does not contain any free thiol group, consistent with the view that the two cysteine residues are involved in a disulphide bond. The purified BtuF binds cyanocobalamin with an affinity constant of 6 ± 2 μM.
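The two sequence lengths quoted above are consistent with each other if the 834 bp count includes the stop codon: 834 / 3 = 278 codons, one of which terminates translation. The abstract does not state how the ORF was counted, so that inclusion is an assumption, and the helper name below is hypothetical.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and contributes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 834 bp -> 278 codons -> 277 residues when the stop codon is counted in the ORF
print(orf_protein_length(834))
```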
Affiliation(s)
- R Boiani
- Department of Biomolecular Science, University of Urbino Carlo Bo, 61032 Fano, Italy.
|
17
|
Abstract
BACKGROUND Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Accurate predictions of subcellular localizations of proteins can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data must be labeled for the prediction system to learn a classifier with good generalization ability. In real-world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. RESULTS In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high-quality prediction model. We first construct an initial classifier using a small set of labeled examples, and then use unlabeled instances to refine the classifier for future predictions. CONCLUSION Experimental results show that our methods can effectively reduce the workload for labeling data by using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.
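The general idea of refining a classifier with unlabeled instances can be illustrated with a self-training loop. The sketch below is not the method of the paper: it swaps the SVM for a nearest-centroid classifier to stay dependency-free, and the function names, the `margin` threshold, and the toy two-dimensional features are all assumptions made for illustration.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(labeled):
    """Fit one centroid per class from (features, label) pairs."""
    by_class = {}
    for features, label in labeled:
        by_class.setdefault(label, []).append(features)
    return {label: centroid(xs) for label, xs in by_class.items()}


def predict(centroids, features):
    """Assign the label of the nearest class centroid."""
    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))


def self_train(labeled, unlabeled, rounds=5, margin=1.0):
    """Grow the labeled set with confidently classified unlabeled points."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            dists = sorted(sq_dist(x, c) for c in centroids.values())
            # Confidence = gap between the closest and second-closest class.
            gap = dists[1] - dists[0] if len(dists) > 1 else float("inf")
            if gap >= margin:
                confident.append((x, predict(centroids, x)))
            else:
                remaining.append(x)
        if not confident:
            break
        labeled.extend(confident)
        pool = remaining
    return fit_centroids(labeled)


# Toy data: two labeled seeds and four unlabeled points (hypothetical features).
seed = [([0.0, 0.0], "cytoplasm"), ([10.0, 10.0], "nucleus")]
pool = [[1.0, 1.0], [9.0, 9.0], [0.5, 0.0], [10.0, 9.0]]
model = self_train(seed, pool)
print(predict(model, [2.0, 2.0]))
```

Only points whose nearest class is clearly separated from the runner-up are pseudo-labeled, which limits the risk of reinforcing early mistakes.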
Affiliation(s)
- Qian Xu
- Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong.
|
18
|
Kozminsky-Atias A, Bar-Shalom A, Mishmar D, Zilberberg N. Assembling an arsenal, the scorpion way. BMC Evol Biol 2008; 8:333. [PMID: 19087317 PMCID: PMC2651877 DOI: 10.1186/1471-2148-8-333] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 08/28/2008] [Accepted: 12/16/2008] [Indexed: 11/28/2022]
Abstract
Background For survival, scorpions depend on a wide array of short neurotoxic polypeptides. The venoms of scorpions from the most studied group, the Buthida, are a rich source of small, 23–78 amino acid-long peptides, well packed by either three or four disulfide bridges that affect ion channel function in excitable and non-excitable cells. Results In this work, by constructing a toxin transcripts data set from the venom gland of the scorpion Buthus occitanus israelis, we were able to follow the evolutionary path leading to mature toxin diversification and suggest a mechanism for leader peptide hyper-conservation. Toxins from each family were more closely related to one another than to toxins from other species, implying that fixation of duplicated genes followed speciation, suggesting early gene conversion events. Upon fixation, the mature toxin-coding domain was subjected to diversifying selection resulting in a significantly higher substitution rate that can be explained solely by diversifying selection. In contrast to the mature peptide, the leader peptide sequence was hyper-conserved and characterized by an atypical sub-neutral synonymous substitution rate. We interpret this as resulting from purifying selection acting on both the peptide and, as reported here for the first time, the DNA sequence, to create a toxin family-specific codon bias. Conclusion We thus propose that scorpion toxin genes were shaped by selective forces acting at three levels, namely (1) diversifying the mature toxin, (2) conserving the leader peptide amino acid sequence and intriguingly, (3) conserving the leader DNA sequences.
Affiliation(s)
- Adi Kozminsky-Atias
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel.
|
19
|
Regev-Rudzki N, Yogev O, Pines O. The mitochondrial targeting sequence tilts the balance between mitochondrial and cytosolic dual localization. J Cell Sci 2008; 121:2423-31. [DOI: 10.1242/jcs.029207] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 11/20/2022]
Abstract
Dual localization of proteins in the cell has appeared in recent years to be a more abundant phenomenon than previously reported. One of the mechanisms by which a single translation product is distributed between two compartments, involves retrograde movement of a subset of processed molecules back through the organelle-membrane. Here, we investigated the specific contribution of the mitochondrial targeting sequence (MTS), as a cis element, in the distribution of two proteins, aconitase and fumarase. Whereas the cytosolic presence of fumarase is obvious, the cytosolic amount of aconitase is minute. Therefore, we created (1) MTS-exchange mutants, exchanging the MTS of aconitase and fumarase with each other as well as with those of other proteins and, (2) a set of single mutations, limited to the MTS of these proteins. Distribution of both proteins is affected by mutations, a fact particularly evident for aconitase, which displays extraordinary amounts of processed protein in the cytosol. Thus, we show for the first time, that the MTS has an additional role beyond targeting: it determines the level of retrograde movement of proteins back into the cytosol. Our results suggest that the translocation rate and folding of proteins during import into mitochondria determines the extent to which molecules are withdrawn back into the cytosol.
Affiliation(s)
- Neta Regev-Rudzki
- Department of Molecular Biology, Hebrew University Medical School, Jerusalem 91120, Israel
- Ohad Yogev
- Department of Molecular Biology, Hebrew University Medical School, Jerusalem 91120, Israel
- Ophry Pines
- Department of Molecular Biology, Hebrew University Medical School, Jerusalem 91120, Israel
|
20
|
Kaushik DK, Sehgal D. Developing Antibacterial Vaccines in Genomics and Proteomics Era. Scand J Immunol 2008; 67:544-52. [DOI: 10.1111/j.1365-3083.2008.02107.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 11/27/2022]
|
21
|
Sonnhammer EL, Wolfsberg TG. Identification of motifs in protein sequences. Current Protocols in Cell Biology 2008; Appendix 1:Appendix 1C. [PMID: 18228275 DOI: 10.1002/0471143030.cba01cs00] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
This brief appendix serves as a guide for the analysis of functional motifs in proteins. Several database search engines that can be accessed via the World Wide Web are described. Such computerized searches have become the preferred method to scan large sequence and motif databases, as the searches are efficient and the databases are updated frequently. A short list of sorting signals is also included, since these motifs often cannot be predicted reliably by a computer search.
Affiliation(s)
- E L Sonnhammer
- Center for Genomics Research, Karolinska Institutet, Stockholm, Sweden
|
22
|
Habib T, Zhang C, Yang JY, Yang MQ, Deng Y. Supervised learning method for the prediction of subcellular localization of proteins using amino acid and amino acid pair composition. BMC Genomics 2008; 9 Suppl 1:S16. [PMID: 18366605 PMCID: PMC2386058 DOI: 10.1186/1471-2164-9-s1-s16] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
BACKGROUND Determining where a protein occurs in the cell is an important step in understanding its function. It is highly desirable to predict a protein's subcellular locations automatically from its sequence. The most studied methods for predicting the subcellular localization of proteins rely on signal peptides, sequence homology, and correlations with the total amino acid composition of proteins. Taking amino acid composition and amino acid pair composition into consideration helps improve the prediction accuracy. RESULTS We constructed a dataset of protein sequences from the SWISS-PROT database and segmented them into 12 classes based on their subcellular locations. SVM modules were trained to predict the subcellular location based on amino acid composition and amino acid pair composition. Results were calculated after 10-fold cross validation. The Radial Basis Function (RBF) kernel outperformed polynomial and linear kernel functions. Total prediction accuracy reached 71.8% for amino acid composition and 77.0% for amino acid pair composition. In order to observe the impact of the number of subcellular locations, we constructed two more datasets of nine and five subcellular locations. Total accuracy was further improved to 79.9% and 85.66%. CONCLUSIONS A new SVM-based approach is presented based on amino acid and amino acid pair composition. The results show that data simulation and taking more protein features into consideration improve the accuracy to a great extent. It was also noticed that the data set needs to be crafted to take account of the distribution of data in all the classes.
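The two feature types named in this abstract can be sketched as follows: a 20-dimensional vector of amino acid frequencies and a 400-dimensional vector of adjacent-pair (dipeptide) frequencies. This is a minimal illustration, not the authors' pipeline; the function names, the handling of non-standard residues (skipped), and the example sequence are assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return 20 frequencies, one per standard amino acid, summing to 1."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def aa_pair_composition(seq):
    """Return 400 frequencies, one per ordered pair of adjacent residues."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: pairs containing a non-standard residue are skipped.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return [counts[a + b] / total if total else 0.0
            for a, b in product(AMINO_ACIDS, repeat=2)]


# Hypothetical example sequence; the resulting vectors could be fed to any classifier.
features = aa_composition("MKKLLA") + aa_pair_composition("MKKLLA")
print(len(features))
```

Concatenating both vectors gives a fixed-length representation regardless of sequence length, which is what makes these features convenient inputs for kernel classifiers such as an SVM.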
Affiliation(s)
- Tanwir Habib
- Department of Biological Sciences, University of Southern Mississippi, Hattiesburg, MS 39406, USA
- Chaoyang Zhang
- School of Computing, University of Southern Mississippi, Hattiesburg, MS 39406, USA
- Jack Y Yang
- Harvard Medical School, Harvard University, Cambridge, Massachusetts 02140, USA
- Mary Qu Yang
- National Human Genome Research Institute, National Institutes of Health (NIH), U.S. Department of Health and Human Services Bethesda, MD 20852, USA
- Youping Deng
- Department of Biological Sciences, University of Southern Mississippi, Hattiesburg, MS 39406, USA
|
23
|
Burri L, Williams BAP, Bursac D, Lithgow T, Keeling PJ. Microsporidian mitosomes retain elements of the general mitochondrial targeting system. Proc Natl Acad Sci U S A 2006; 103:15916-20. [PMID: 17043242 PMCID: PMC1635103 DOI: 10.1073/pnas.0604109103] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.1] [Received: 05/17/2006] [Indexed: 11/18/2022]
Abstract
Microsporidia are intracellular parasites that infect a variety of animals, including humans. As highly specialized parasites, they are characterized by a number of unusual adaptations, many of which are manifested as extreme reduction at the molecular, biochemical, and cellular levels. One interesting aspect of reduction is the mitochondrion. Microsporidia were long considered to be amitochondriate, but recently a tiny mitochondrion-derived organelle called the mitosome was detected. The molecular function of this organelle remains poorly understood. The mitosome has no genome, so it must import all its proteins from the cytosol. In other fungi, the mitochondrial protein import machinery consists of a network series of heterooligomeric translocases and peptidases, but in microsporidia, only a few subunits of some of these complexes have been identified to date. Here, we look at targeting sequences of the microsporidian mitosomal import system and show that mitosomes do in some cases still use N-terminal and internal targeting sequences that are recognizable by import systems of mitochondria in yeast. Furthermore, we have examined the function of the inner membrane peptidase processing enzyme and demonstrate that mitosomal substrates of this enzyme are processed to mature proteins in one species with a simplified processing complex, Antonospora locustae. However, in Encephalitozoon cuniculi, the processing complex is lost altogether, and the preprotein substrate functions with the targeting leader still attached. This report provides direct evidence for presequencing processing in mitosomes and also shows how a complex molecular system has continued to degenerate throughout the evolution of microsporidia.
Affiliation(s)
- Lena Burri
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC, Canada V6T 1Z4
- Bryony A. P. Williams
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC, Canada V6T 1Z4
- Dejan Bursac
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville 3010, Australia
- Trevor Lithgow
- Department of Biochemistry and Molecular Biology and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville 3010, Australia
- Patrick J. Keeling
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC, Canada V6T 1Z4
|
24
|
Abstract
The two ends of each protein are known as the amino (N-) and carboxyl (C-) termini. Short signatures in a protein's termini often carry vital cellular function. No systematic research has been conducted to address the importance of short signatures (3 to 10 amino acids) in protein termini at the proteomic level. Specifically, it is unknown whether such signatures are evolutionarily conserved, and if so, whether this conservation confers shared biological functions. Current signature detection methods fail to detect such short signatures due to inadequate statistical scores. The findings presented in this study strongly support the notion that functional significance of protein sets may be captured by short signatures at their termini. A positional search method was applied to over one million proteins from the UniProt database. The result is a collection of about a thousand significant signature groups (SIGs) that include previously identified as well as many novel signatures in protein termini. These SIGs represent protein sets with minimal or no overall sequence similarity excepting the similarity at their termini. The most significant SIGs are assigned by their strong correspondence to functional annotations derived from external databases such as Gene Ontology. Each of the SIGs is associated with the statistical significance of its functional association. These SIGs provide a valuable source for testing previously overlooked signatures in protein termini and allow for the investigation of the role played by such signatures throughout evolution. The SIGs archive and advanced search options are available at http://www.proteus.cs.huji.ac.il.
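A positional search over protein termini can be pictured, in its simplest form, as counting the fixed-length segment at one end of every sequence. The sketch below only tallies terminal k-mers; it does not reproduce the statistical scoring or the grouping into SIGs described in the abstract, and the function name and example sequences are hypothetical.

```python
from collections import Counter


def terminal_kmer_counts(proteins, k=4, terminus="C"):
    """Count the k-residue segment found at the chosen terminus of each protein."""
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    counts = Counter()
    for seq in proteins:
        if len(seq) < k:
            continue  # too short to carry a k-residue signature
        counts[seq[-k:] if terminus == "C" else seq[:k]] += 1
    return counts


# Hypothetical sequences: two end in KDEL, a well-known ER-retention signal.
example = ["MAAKDEL", "MTTKDEL", "MSSAKL"]
print(terminal_kmer_counts(example, k=4).most_common(1))
```

A real analysis would compare each count against a background expectation before calling a signature significant.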
Affiliation(s)
- Iris Bahir
- Department of Biological Chemistry, Institute of life Sciences, The Hebrew University of Jerusalem, Israel
|
25
|
Yagihara H, Terada Y, Sugimoto S, Hidaka F, Yamada O, Ono K, Washizu T, Ariizumi K, Bonkobara M. Identification and cornification-related gene expression of canine keratinocyte differentiation-associated protein, Kdap. Vet J 2006; 172:141-6. [PMID: 15927493 DOI: 10.1016/j.tvjl.2005.04.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
The outermost layer of skin, the epidermis, is cornified epithelial tissue composed of keratinocytes. To maintain the structure and function of the epidermis, the regulation of proliferation, differentiation, and cornification of keratinocytes is crucial, and various soluble factors secreted by keratinocytes are involved in these regulations. Previously, work has shown that keratinocytes secreted the protein Kdap (keratinocyte differentiation-associated protein) associated with the formation of cornified cell envelopes, a specialized protective barrier structure on the periphery of terminally differentiating keratinocytes. In the present report, the canine counterpart of human Kdap is identified and an attempt has been made to define its physiological role in canine keratinization. Canine Kdap (cKdap) showed structural features commonly observed in other counterparts and is secreted from transfected cells. The expression profile of cKdap mRNA, which was restrictively expressed in cornified epithelial tissues besides skin has also been determined. These findings indicate that there is a strong association between cKdap expression and cornification, which supports previous observations that Kdap is involved in the synthesis and/or degradation of cornified cell envelopes in humans and mice.
Affiliation(s)
- H Yagihara
- Department of Veterinary Clinical Pathology, Nippon Veterinary and Animal Science University, 1-7-1 Kyonan-cho, Musashino-shi, Tokyo 180-8602, Japan
|
26
|
Pengelley SC, Chapman DC, Mark Abbott W, Lin HH, Huang W, Dalton K, Jones IM. A suite of parallel vectors for baculovirus expression. Protein Expr Purif 2006; 48:173-81. [PMID: 16797185 DOI: 10.1016/j.pep.2006.04.016] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 08/22/2005] [Revised: 04/20/2006] [Accepted: 04/20/2006] [Indexed: 02/03/2023]
Abstract
The expression of proteins using recombinant baculoviruses is a mature and widely used technology. However, some aspects of the technology continue to detract from high throughput use and the basis of the final observed expression level is poorly understood. Here, we describe the design and use of a set of vectors developed around a unified cloning strategy that allow parallel expression of target proteins in the baculovirus system as N-terminal or C-terminal fusions. Using several protein kinases as tests we found that amino-terminal fusion to maltose binding protein rescued expression of the poorly expressed human kinase Cot but had only a marginal effect on expression of a well-expressed kinase IKK-2. In addition, MBP fusion proteins were found to be secreted from the expressing cell. Use of a carboxyl-terminal GFP tagging vector showed that fluorescence measurement paralleled expression level and was a convenient readout in the context of insect cell expression, an observation that was further supported with additional non-kinase targets. The expression of the target proteins using the same vectors in vitro showed that differences in expression level were wholly dependent on the environment of the expressing cell and an investigation of the time course of expression showed it could affect substantially the observed expression level for poorly but not well-expressed proteins. Our vector suite approach shows that rapid expression survey can be achieved within the baculovirus system and in addition, goes some way to identifying the underlying basis of the expression level obtained.
|
27
|
Zhao A, Zhao T, Sima Y, Zhang Y, Nakagaki K, Miao Y, Shiomi K, Kajiura Z, Nagata Y, Nakagaki M. Unique molecular architecture of egg case silk protein in a spider, Nephila clavata. J Biochem 2006; 138:593-604. [PMID: 16272571 DOI: 10.1093/jb/mvi155] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
We describe a unique silk protein secreted from the cylindrical silk glands of the spider Nephila clavata. This silk is primarily composed of three proteins, whose transcripts of approximately 16.0, 14.5 and 13.0 kb are homologous to one another in two termini and repetitive units, as determined on Northern blotting. Its overall organization shows that it is similar to other characterized silk proteins, including in the mainly central repetitive region as well as the non-repetitive N-terminal (166 residues) and C-terminal (176 residues) parts. However, up to 90% of the protein consists of highly ordered repetitive structures that are not found in other silks. The repetitive region mainly consists of several types of complexes and remarkably conserved polypeptide repeats. The assembled repeat units (A1B1) contain a high proportion of Ala (30.41%), Ser (25.15%), and residues with hydrophobic side chains (22.22% for Gly, Leu, Ile, Val and Phe combined). The presence of Ser-rich and GVGAGASA motifs suggests the formation of a beta-sheet. The repetitive region is characterized by alternating arrays of hydrophobic and hydrophilic blocks. The results suggested that this egg case silk is an exceptional protein when compared with previously investigated spider silks.
Affiliation(s)
- Aichun Zhao
- Department of Applied Biology, Faculty of Textile Science and Technology, Shinshu University, Ueda 386-8567
|
28
|
Lubec G, Afjehi-Sadat L, Yang JW, John JPP. Searching for hypothetical proteins: theory and practice based upon original data and literature. Prog Neurobiol 2005; 77:90-127. [PMID: 16271823 DOI: 10.1016/j.pneurobio.2005.10.001] [Citation(s) in RCA: 133] [Impact Index Per Article: 6.7] [Received: 05/23/2005] [Revised: 09/18/2005] [Accepted: 10/02/2005] [Indexed: 12/29/2022]
Abstract
A large part of mammalian proteomes is represented by hypothetical proteins (HP), i.e. proteins predicted from nucleic acid sequences only and protein sequences with unknown function. Databases are far from being complete and errors are expected. The legion of HP is awaiting experiments to show their existence at the protein level and subsequent bioinformatic handling in order to assign proteins a tentative function is mandatory. Two-dimensional gel-electrophoresis with subsequent mass spectrometrical identification of protein spots is an appropriate tool to search for HP in the high-throughput mode. Spots are identified by MS or by MS/MS measurements (MALDI-TOF, MALDI-TOF-TOF) and subsequent software as e.g. Mascot or ProFound. In many cases proteins can thus be unambiguously identified and characterised; if this is not the case, de novo sequencing or Q-TOF analysis is warranted. If the protein is not identified, the sequence is being sent to databases for BLAST searches to determine identities/similarities or homologies to known proteins. If no significant identity to known structures is observed, the protein sequence is examined for the presence of functional domains (databases PROSITE, PRINTS, InterPro, ProDom, Pfam and SMART), subjected to searches for motifs (ELM) and finally protein-protein interaction databases (InterWeaver, STRING) are consulted or predictions from conformations are performed. We here provide information about hypothetical proteins in terms of protein chemical analysis, independent of antibody availability and specificity and bioinformatic handling to contribute to the extension/completion of protein databases and include original work on HP in the brain to illustrate the processes of HP identification and functional assignment.
Affiliation(s)
- Gert Lubec
- Department of Pediatrics, Division of Basic Sciences, Medical University of Vienna, Waehringer Guertel 18-20, A-1090, Vienna, Austria.
|
29
|
Lister R, Hulett JM, Lithgow T, Whelan J. Protein import into mitochondria: origins and functions today (review). Mol Membr Biol 2005; 22:87-100. [PMID: 16092527 DOI: 10.1080/09687860500041247] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.1] [Indexed: 10/23/2022]
Abstract
Mitochondria are organelles derived from alpha-proteobacteria over the course of one to two billion years. Mitochondria from the major eukaryotic lineages display some variation in functions and coding capacity but sequence analysis demonstrates them to be derived from a single common ancestral endosymbiont. The loss of assorted functions, the transfer of genes to the nucleus, and the acquisition of various 'eukaryotic' proteins have resulted in an organelle that contains approximately 1000 different proteins, with most of these proteins imported into the organelle across one or two membranes. A single translocase in the outer membrane and two translocases in the inner membrane mediate protein import. Comparative sequence analysis and functional complementation experiments suggest some components of the import pathways to be directly derived from the eubacterial endosymbiont's own proteins, and some to have arisen 'de novo' at the earliest stages of 'mitochondrification' of the endosymbiont. A third class of components appears lineage-specific, suggesting they were incorporated into the process of protein import long after mitochondria was established as an organelle and after the divergence of the various eukaryotic lineages. Protein sorting pathways inherited from the endosymbiont have been co-opted and play roles in intraorganelle protein sorting after import. The import apparatus of animals and fungi show significant similarity to one another, but vary considerably to the plant apparatus. Increasing complexity in the eukaryotic lineage, i.e., from single celled to multi-cellular life forms, has been accompanied by an expansion in genes encoding each component, resulting in small gene families encoding many components. The functional differences in these gene families remain to be elucidated, but point to a mosaic import apparatus that can be regulated by a variety of signals.
Affiliation(s)
- Ryan Lister
- Plant Molecular Biology Group, School of Biomedical and Chemical Sciences, The University of Western Australia, Crawley, Western Australia, Australia
|
30
|
Bonkobara M, Sato T, Takahashi N, Kasahara Y, Yagihara H, Tamura K, Isotani M, Azakami D, Ono K, Washizu T. Characterization of cDNA and the genomic sequence encoding canine neural-cell adhesion molecule, CD56/N-CAM. Vet Immunol Immunopathol 2005; 107:171-6. [PMID: 15979156 DOI: 10.1016/j.vetimm.2005.04.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 12/16/2004] [Revised: 03/24/2005] [Accepted: 04/19/2005] [Indexed: 11/17/2022]
Abstract
The neural-cell adhesion molecule, CD56/N-CAM, is a member of the immunoglobulin superfamily expressed by various tissues and cells, including natural killer (NK) cells. Despite the importance of CD56 as a marker for identifying NK cells in circulating blood, canine CD56 has not been identified. In the present study, we identified the canine counterparts of the 140-kDa (CD56-140) and 120-kDa (CD56-120) isoforms of human CD56. Both amino acid sequences encoded by the canine CD56-140 and -120 cDNA showed high homology with those of human (both 96% homology), having well-conserved domains (five immunoglobulin, two fibronectin type III, and transmembrane and intracellular or glycosylphosphatidylinositol-linked domain) among various species (human, mouse, and feline). We revealed that the transcripts of canine CD56-140 and -120 arise from alternative mRNA splicing from a single gene located on canine chromosome 5. Moreover, the mRNA encoding canine CD56-140 was constitutively expressed at high levels by nervous system and endocrine tissues, as has been shown in other animals.
Affiliation(s)
- M Bonkobara
- Department of Veterinary Clinical Pathology, Nippon Veterinary and Animal Science University, 1-7-1 Kyonan-cho, Musashino-shi, Tokyo 180-8602, Japan.
|
31
|
Vrontou E, Economou A. Structure and function of SecA, the preprotein translocase nanomotor. Biochimica et Biophysica Acta - Molecular Cell Research 2005; 1694:67-80. [PMID: 15546658 DOI: 10.1016/j.bbamcr.2004.06.003] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.9] [Received: 12/23/2003] [Revised: 06/03/2004] [Accepted: 06/17/2004] [Indexed: 11/22/2022]
Abstract
Most secretory proteins that are destined for the periplasm or the outer membrane are exported through the bacterial plasma membrane by the Sec translocase. Translocase is a complex nanomachine that moves processively along its aminoacyl polymeric substrates effectively pumping them to the periplasmic space. The salient features of this process are: (a) a membrane-embedded "clamp" formed by the trimeric SecYEG protein, (b) a "motor" provided by the dimeric SecA ATPase, (c) regulatory subunits that optimize catalysis and (d) both chemical and electrochemical metabolic energy. Significant recent strides have allowed structural, biochemical and biophysical dissection of the export reaction. A model incorporating stepwise strokes of the translocase nanomachine at work is discussed.
Affiliation(s)
- Eleftheria Vrontou
- Laboratory Unicellular, Organisms Group, Institute of Molecular Biology and Biotechnology, FO.R.T.H. and Department of Biology, University of Crete, Vassilika Vouton, P.O. Box 1527, GR-711 10 Iraklio, Crete, Greece
|
32
|
Chou KC, Cai YD. Predicting subcellular localization of proteins by hybridizing functional domain composition and pseudo-amino acid composition. J Cell Biochem 2004; 91:1197-203. [PMID: 15048874 DOI: 10.1002/jcb.10790] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.5] [Indexed: 11/11/2022]
Abstract
Recent advances in large-scale genome sequencing have led to the rapid accumulation of amino acid sequences of proteins whose functions are unknown. Since the functions of these proteins are closely correlated with their subcellular localizations, many efforts have been made to develop a variety of methods for predicting protein subcellular location. In this study, based on the strategy of hybridizing the functional domain composition and the pseudo-amino acid composition (Cai and Chou [2003]: Biochem. Biophys. Res. Commun. 305:407-411), the Intimate Sorting Algorithm (ISort predictor) was developed for predicting protein subcellular location. As a showcase, the same plant and non-plant protein datasets investigated by previous investigators were used for demonstration. The overall success rate by the jackknife test for the plant protein dataset was 85.4%, and that for the non-plant protein dataset 91.9%. These are so far the highest success rates achieved for the two datasets by following a rigorous cross-validation test procedure, further confirming that such a hybrid approach may become a very useful high-throughput tool in bioinformatics, proteomics, and molecular cell biology.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Torrey Del Mar Drive, San Diego, California 92130, USA.
|
33
|
Liu H, Wong L. Data mining tools for biological sequences. J Bioinform Comput Biol 2004; 1:139-67. [PMID: 15290785 DOI: 10.1142/s0219720003000216] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.6] [Received: 04/04/2002] [Revised: 04/07/2003] [Accepted: 04/07/2003] [Indexed: 11/18/2022]
Abstract
We describe a methodology, as well as some related data mining tools, for analyzing sequence data. The methodology comprises three steps: (a) generating candidate features from the sequences, (b) selecting relevant features from the candidates, and (c) integrating the selected features to build a system to recognize specific properties in sequence data. We also give relevant techniques for each of these three steps. For generating candidate features, we present various types of features based on the idea of k-grams. For selecting relevant features, we discuss signal-to-noise, t-statistics, and entropy measures, as well as a correlation-based feature selection method. For integrating selected features, we use machine learning methods, including C4.5, SVM, and Naive Bayes. We illustrate this methodology on the problem of recognizing translation initiation sites. We discuss how to generate and select features that are useful for understanding the distinction between ATG sites that are translation initiation sites and those that are not. We also discuss how to use such features to build reliable systems for recognizing translation initiation sites in DNA sequences.
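The first step of the methodology above, generating k-gram candidate features around ATG sites, could be sketched roughly as follows. This is only an illustration: the function names, the window size of 12 nucleotides, and the choice of k = 3 are assumptions made here, not details taken from the paper.

```python
from collections import Counter


def kgram_counts(seq, k):
    """Count every overlapping k-gram (substring of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def atg_site_features(dna, k=3, window=12):
    """Collect k-gram counts on both sides of every ATG in dna.

    Returns a list of (position, upstream_counts, downstream_counts).
    The window size and k are illustrative choices, not values from the paper.
    """
    dna = dna.upper()
    features = []
    start = dna.find("ATG")
    while start != -1:
        upstream = dna[max(0, start - window):start]
        downstream = dna[start + 3:start + 3 + window]
        features.append(
            (start, kgram_counts(upstream, k), kgram_counts(downstream, k))
        )
        # Step one position forward so that overlapping ATGs are also found.
        start = dna.find("ATG", start + 1)
    return features
```

Each candidate ATG then carries two count vectors, which a feature-selection step and a classifier of the kind the abstract lists could consume.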
Affiliation(s)
- Huiqing Liu
- Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore.
|
34
|
Chou KC, Cai YD. Prediction of protein subcellular locations by GO-FunD-PseAA predictor. Biochem Biophys Res Commun 2004; 320:1236-9. [PMID: 15249222 DOI: 10.1016/j.bbrc.2004.06.073] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.5] [Received: 06/10/2004] [Indexed: 11/18/2022]
Abstract
The localization of a protein in a cell is closely correlated with its biological function. With the explosion of protein sequences entering databanks, it is highly desirable to develop an automated method that can quickly identify their subcellular location. This will expedite the annotation process, providing timely and useful information for both basic research and industrial application. In view of this, a powerful predictor has been developed by hybridizing the gene ontology approach [Nat. Genet. 25 (2000) 25], the functional domain composition approach [J. Biol. Chem. 277 (2002) 45765], and the pseudo-amino acid composition approach [Proteins Struct. Funct. Genet. 43 (2001) 246; Erratum: ibid. 44 (2001) 60]. As a showcase, the recently constructed dataset [Bioinformatics 19 (2003) 1656] was used for demonstration. The dataset contains 7589 proteins classified into 12 subcellular locations: chloroplast, cytoplasmic, cytoskeleton, endoplasmic reticulum, extracellular, Golgi apparatus, lysosomal, mitochondrial, nuclear, peroxisomal, plasma membrane, and vacuolar. The overall success rate of prediction obtained by the jackknife cross-validation was 92%. This is so far the highest success rate achieved on this dataset by following an objective and rigorous cross-validation procedure.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, San Diego, CA 92130, USA.
|
35
|
Rogers MB, Archibald JM, Field MA, Li C, Striepen B, Keeling PJ. Plastid-Targeting Peptides from the Chlorarachniophyte Bigelowiella natans. J Eukaryot Microbiol 2004; 51:529-35. [PMID: 15537087 DOI: 10.1111/j.1550-7408.2004.tb00288.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Abstract
Chlorarachniophytes are marine amoeboflagellate protists that have acquired their plastid (chloroplast) through secondary endosymbiosis with a green alga. As in other algae, most of the proteins necessary for plastid function are encoded in the nuclear genome of the secondary host. These proteins are targeted to the organelle using a bipartite leader sequence consisting of a signal peptide (allowing entry into the endomembrane system) and a chloroplast transit peptide (for transport across the chloroplast envelope membranes). We have examined the leader sequences from 45 full-length predicted plastid-targeted proteins from the chlorarachniophyte Bigelowiella natans with the goal of understanding important features of these sequences and possible conserved motifs. The chemical characteristics of these sequences were compared with a set of 10 B. natans endomembrane-targeted proteins and 38 cytosolic or nuclear proteins. The comparison shows that the signal peptides are similar to those of most other eukaryotes, while the transit peptides differ from those of other algae in some characteristics. Consistent with this, the leader sequence from one B. natans protein was tested for function in the apicomplexan parasite Toxoplasma gondii and shown to direct the secretion of the protein.
Affiliation(s)
- Matthew B Rogers
- Canadian Institute for Advanced Research, Department of Botany, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC, V6T 1Z4, Canada
|
36
|
Sun Q, Emanuelsson O, van Wijk KJ. Analysis of curated and predicted plastid subproteomes of Arabidopsis. Subcellular compartmentalization leads to distinctive proteome properties. Plant Physiology 2004; 135:723-34. [PMID: 15208420 PMCID: PMC514110 DOI: 10.1104/pp.104.040717] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.1] [Received: 02/13/2004] [Revised: 03/25/2004] [Accepted: 04/14/2004] [Indexed: 05/17/2023]
Abstract
Carefully curated proteomes of the inner envelope membrane, the thylakoid membrane, and the thylakoid lumen of chloroplasts from Arabidopsis were assembled based on published, well-documented localizations. These curated proteomes were evaluated for distribution of physical-chemical parameters, with the goal of extracting parameters for improved subcellular prediction and subsequent identification of additional (low abundant) components of each membrane system. The assembly of rigorously curated subcellular proteomes is in itself also important as a parts list for plant and systems biology. Transmembrane and subcellular prediction strategies were evaluated using the curated data sets. The three curated proteomes differ strongly in average isoelectric point and protein size, as well as transmembrane distribution. Removal of the cleavable, N-terminal transit peptide sequences greatly affected isoelectric point and size distribution. Unexpectedly, the Cys content was much lower for the thylakoid proteomes than for the inner envelope. This likely relates to the role of the thylakoid membrane in light-driven electron transport and helps to avoid unwanted oxidation-reduction reactions. A rule of thumb for discriminating between the predicted integral inner envelope membrane and integral thylakoid membrane proteins is suggested. Using a combination of predictors and experimentally derived parameters, four plastid subproteomes were predicted from the fully annotated Arabidopsis genome. These predicted subproteomes were analyzed for their properties and compared to the curated proteomes. The sensitivity and accuracy of the prediction strategies are discussed. Data can be extracted from the new plastid proteome database (http://ppdb.tc.cornell.edu).
Affiliation(s)
- Qi Sun
- Computational Biology Service Unit, Cornell Theory Center, Cornell University, Ithaca, New York, USA
|
37
|
Chou KC, Cai YD. Prediction and classification of protein subcellular location-sequence-order effect and pseudo amino acid composition. J Cell Biochem 2003; 90:1250-60. [PMID: 14635197 DOI: 10.1002/jcb.10719] [Citation(s) in RCA: 136] [Impact Index Per Article: 6.2] [Indexed: 11/08/2022]
Abstract
Given a protein sequence, how can one identify its subcellular location? With the rapid increase in newly found protein sequences entering databanks, the problem has become more and more important because the function of a protein is closely correlated with its localization. To deal with the challenge in practice, a dataset has been established that allows identification among the following 14 subcellular locations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cytoplasm, (5) cytoskeleton, (6) endoplasmic reticulum, (7) extracellular, (8) Golgi apparatus, (9) lysosome, (10) mitochondria, (11) nucleus, (12) peroxisome, (13) plasma membrane, and (14) vacuole. Compared with the datasets constructed by previous investigators, the current one is the largest in the scope of localizations covered, and hence many proteins that were entirely out of the picture in previous treatments can now be investigated. Meanwhile, to enhance the potential and flexibility in taking into account the sequence-order effect, the series-mode pseudo-amino-acid composition has been introduced as a representation for a protein. High success rates are obtained by the re-substitution test, the jackknife test, and the independent dataset test. It is anticipated that the current automated method can be developed into a high-throughput tool for practical use in both basic research and the pharmaceutical industry.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, San Diego, CA 92130, USA
|
38
|
Chou KC, Cai YD. A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology. Biochem Biophys Res Commun 2003; 311:743-7. [PMID: 14623335 DOI: 10.1016/j.bbrc.2003.10.062] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.5] [Indexed: 11/29/2022]
Abstract
Based on recent developments in the gene ontology and functional domain databases, a new hybridization approach is developed for predicting protein subcellular location by combining the gene product, functional domain, and quasi-sequence-order effects. As a showcase, the same prokaryotic and eukaryotic datasets, which were studied by many previous investigators, are used for demonstration. The overall success rate by the jackknife test for the prokaryotic set is 94.7% and that for the eukaryotic set 92.9%. These are so far the highest success rates achieved for the two datasets by following a rigorous cross-validation test procedure, suggesting that such a hybrid approach may become a very useful high-throughput tool in bioinformatics, proteomics, and molecular cell biology. The very high success rates also reflect the fact that the subcellular localization of a protein is closely correlated with (1) the biological objective to which the gene or gene product contributes, (2) the biochemical activity of a gene product, and (3) the place in the cell where a gene product is active.
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, San Diego, CA 92130, USA.
|
39
|
Cai YD, Chou KC. Nearest neighbour algorithm for predicting protein subcellular location by combining functional domain composition and pseudo-amino acid composition. Biochem Biophys Res Commun 2003; 305:407-11. [PMID: 12745090 DOI: 10.1016/s0006-291x(03)00775-7] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.0] [Indexed: 10/27/2022]
Abstract
In this paper, based on the approach of combining the "functional domain composition" [K.C. Chou, Y.D. Cai, J. Biol. Chem. 277 (2002) 45765] and the pseudo-amino acid composition [K.C. Chou, Proteins Struct. Funct. Genet. 43 (2001) 246; Correction Proteins Struct. Funct. Genet. 44 (2001) 60], the Nearest Neighbour Algorithm (NNA) was developed for predicting protein subcellular location. Very high success rates were observed, suggesting that such a hybrid approach may become a useful high-throughput tool in bioinformatics and proteomics.
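As a rough illustration of how a nearest-neighbour predictor and the jackknife (leave-one-out) test used throughout these abstracts fit together, consider the sketch below. It makes two simplifying assumptions that are not from the paper: proteins are represented by plain 20-component amino-acid composition rather than the functional domain and pseudo-amino acid composition, and closeness is measured by squared Euclidean distance. All function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Plain 20-dimensional amino-acid composition (a stand-in feature vector)."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS) or 1
    return [seq.count(a) / total for a in AMINO_ACIDS]


def squared_distance(u, v):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_neighbour(query, training):
    """Return the label of the training vector closest to query.

    training is a list of (vector, label) pairs.
    """
    return min(training, key=lambda item: squared_distance(query, item[0]))[1]


def jackknife_success_rate(dataset):
    """Leave-one-out test: predict each protein from all the others."""
    correct = 0
    for i, (vector, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if nearest_neighbour(vector, rest) == label:
            correct += 1
    return correct / len(dataset)
```

In the jackknife test every protein is held out once and predicted from the remainder, so no sequence is ever used to predict itself.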
Affiliation(s)
- Yu-Dong Cai
- Shanghai Research Centre of Biotechnology, Chinese Academy of Sciences, Shanghai 200233, China.
|
40
|
Ueda K, Lipkind GM, Zhou A, Zhu X, Kuznetsov A, Philipson L, Gardner P, Zhang C, Steiner DF. Mutational analysis of predicted interactions between the catalytic and P domains of prohormone convertase 3 (PC3/PC1). Proc Natl Acad Sci U S A 2003; 100:5622-7. [PMID: 12721373 PMCID: PMC156251 DOI: 10.1073/pnas.0631617100] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open
Abstract
The subtilisin-like prohormone convertases (PCs) contain an essential downstream domain (P domain), which has been predicted to have a beta-barrel structure that interacts with and stabilizes the catalytic domain (CAT). To assess possible sites of hydrophobic interaction, a series of mutant PC3-enhanced GFP constructs were prepared in which selected nonpolar residues on the surface of CAT were substituted by the corresponding polar residues in subtilisin Carlsberg. To investigate the folding potential of the isolated P domain, signal peptide-P domain-enhanced GFP constructs with mutated and/or truncated P domains were also made. All mutants were expressed in betaTC3 cells, and their subcellular localization and secretion were determined. The mutants fell into three main groups: (i) Golgi/secreted, (ii) ER/nonsecreted, and (iii) apoptosis inducing. The destabilizing CAT mutations indicate that the side chains of V292, T328, L351, Q408, H409, V412, and F441 and nonpolar fragments of the side chains of R405 and W413 form a hydrophobic patch on CAT that interacts with the P domain. We also have found that the P domain can fold independently, as indicated by its secretion. Interestingly, T594, which is near the P domain C terminus, was not essential for P domain secretion but is crucial for the stability of intact PC3. T594V produced a stable enzyme, but T594D did not, which suggests that T594 participates in important hydrophobic interactions within PC3. These findings support our conclusion that the catalytic and P domains contribute to the folding and thermodynamic stability of the convertases through reciprocal hydrophobic interactions.
Affiliation(s)
- Kazuya Ueda
- Howard Hughes Medical Institute, Department of Biochemistry, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
|
41
|
Kroth PG. Protein transport into secondary plastids and the evolution of primary and secondary plastids. International Review of Cytology 2003; 221:191-255. [PMID: 12455749 DOI: 10.1016/s0074-7696(02)21013-x] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]
Abstract
Chloroplasts are key organelles in algae and plants due to their photosynthetic abilities. They are thought to have evolved from prokaryotic cyanobacteria taken up by a eukaryotic host cell in a process termed primary endocytobiosis. In addition, a variety of organisms have evolved by subsequent secondary endocytobioses, in which a heterotrophic host cell engulfed a eukaryotic alga. Both processes dramatically enhanced the complexity of the resulting cells. Since the first version of the endosymbiotic theory was proposed more than 100 years ago, morphological, physiological, biochemical, and molecular data have been collected substantiating the emerging picture about the origin and the relationship of individual organisms with different primary or secondary chloroplast types. Depending on their origin, plastids in different lineages may have two, three, or four envelope membranes. The evolutionary success of endocytobioses depends, among other factors, on the specific exchange of molecules between the host and endosymbiont. This raises questions concerning how targeting of nucleus-encoded proteins into the different plastid types occurs and how these processes may have developed. Most studies of protein translocation into plastids have been performed on primary plastids, but in recent years more complex protein-translocation systems of secondary plastids have been investigated. Analyses of transport systems in different algal lineages with secondary plastids reveal that during evolution existing translocation machineries were recycled or recombined rather than being developed de novo. This review deals with current knowledge about the evolution and function of primary and secondary plastids and the respective protein-targeting systems.
Affiliation(s)
- Peter G Kroth
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
|
42
|
|
43
|
Abstract
Given a nascent protein sequence, how can one predict its signal peptide or "Zipcode" sequence? This is an important problem for scientists who wish to use signal peptides as a vehicle to find new drugs or to reprogram cells for gene therapy (see, e.g., K.C. Chou, Current Protein and Peptide Science 2002;3:615-22). In this paper, support vector machines (SVMs), a new machine learning method, are applied to this problem. The overall rate of correct prediction for 1939 secretory proteins and 1440 nonsecretory proteins was over 91%. It has not escaped our attention that the new method may also serve as a useful tool for further investigating many unclear details regarding the molecular mechanism of the ZIP code protein-sorting system in cells.
Affiliation(s)
- Yu-Dong Cai
- Shanghai Research Centre of Biotechnology, Chinese Academy of Sciences, Shanghai 200233, China.
|
44
|
Nair R, Rost B. Sequence conserved for subcellular localization. Protein Sci 2002; 11:2836-47. [PMID: 12441382 PMCID: PMC2373743 DOI: 10.1110/ps.0207402] [Citation(s) in RCA: 131] [Impact Index Per Article: 5.7] [Received: 05/11/2002] [Revised: 09/05/2002] [Accepted: 09/10/2002] [Indexed: 10/27/2022]
Abstract
The more proteins diverged in sequence, the more difficult it becomes for bioinformatics to infer similarities of protein function and structure from sequence. The precise thresholds used in automated genome annotations depend on the particular aspect of protein function transferred by homology. Here, we presented the first large-scale analysis of the relation between sequence similarity and identity in subcellular localization. Three results stood out: (1) The subcellular compartment is generally more conserved than what might have been expected given that short sequence motifs like nuclear localization signals can alter the native compartment; (2) the sequence conservation of localization is similar between different compartments; and (3) it is similar to the conservation of structure and enzymatic activity. In particular, we found the transition between the regions of conserved and nonconserved localization to be very sharp, although the thresholds for conservation were less well defined than for structure and enzymatic activity. We found that a simple measure for sequence similarity accounting for pairwise sequence identity and alignment length, the HSSP distance, distinguished accurately between protein pairs of identical and different localizations. In fact, BLAST expectation values outperformed the HSSP distance only for alignments in the subtwilight zone. We succeeded in slightly improving the accuracy of inferring localization through homology by fine tuning the thresholds. Finally, we applied our results to the entire SWISS-PROT database and five entirely sequenced eukaryotes.
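The abstract does not spell out the HSSP distance. The sketch below assumes the length-dependent HSSP curve as it is commonly quoted from Rost's earlier work on the twilight zone of sequence alignments; the constants (480, 0.32, 19.5, and the length limits 11 and 450) are recalled from that formulation and should be checked against the original before any real use. The function names are hypothetical.

```python
import math


def hssp_threshold(length):
    """Percent-identity value of the (assumed) HSSP curve for an alignment length."""
    if length <= 11:
        return 100.0
    if length <= 450:
        return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))
    return 19.5


def hssp_distance(percent_identity, length):
    """Distance of an alignment from the curve.

    Positive values lie above the curve, i.e. more identity than the
    length-dependent threshold requires.
    """
    return percent_identity - hssp_threshold(length)
```

Because the threshold falls as alignments get longer, the same percent identity counts for more over 300 residues than over 30, which is how a single number can combine identity and alignment length.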
Affiliation(s)
- Rajesh Nair
- Columbia University Bioinformatics Center (CUBIC), Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
45
|
Richter S, Lamppa GK. Determinants for removal and degradation of transit peptides of chloroplast precursor proteins. J Biol Chem 2002; 277:43888-94. [PMID: 12235143 DOI: 10.1074/jbc.m206020200] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.9] [Indexed: 11/06/2022] Open
Abstract
The stromal processing peptidase (SPP) cleaves a large diversity of chloroplast precursor proteins, removing an N-terminal transit peptide. We predicted previously that this key step of the import pathway is mediated by features of the transit peptide that determine precursor binding and cleavage followed by transit peptide conversion to a degradable substrate. Here we performed competition experiments using synthesized oligopeptides of the transit peptide of ferredoxin precursor to investigate the mechanism of these processes. We found that binding and processing of ferredoxin precursor depend on specific interactions of SPP with the region consisting of the C-terminal 12 residues of the transit peptide. Analysis of four other precursors suggests that processing depends on the same region, although their transit peptides are highly divergent in primary sequence and length. Upon processing, SPP terminates its interaction with the transit peptide by a second cleavage, converting it to a subfragment form. From the competition experiments we deduce that SPP releases a subfragment consisting of the transit peptide without its original C terminus. Interestingly, examination of the ATP-dependent metallopeptidase activity responsible for degradation of transit peptide subfragments suggests that it may recognize other unrelated peptides and, hence, act separately from SPP as a novel stromal oligopeptidase.
Affiliation(s)
- Stefan Richter
- Department of Molecular Genetics and Cell Biology, University of Chicago, Illinois 60637, USA
|
46
|
Langeveld SMJ, Vennik M, Kottenhagen M, Van Wijk R, Buijk A, Kijne JW, de Pater S. Glucosylation activity and complex formation of two classes of reversibly glycosylated polypeptides. Plant Physiology 2002; 129:278-89. [PMID: 12011358 PMCID: PMC155891 DOI: 10.1104/pp.010720] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.3] [Received: 08/10/2001] [Revised: 11/21/2001] [Accepted: 01/17/2002] [Indexed: 05/18/2023]
Abstract
Reversibly glycosylated polypeptides (RGPs) have been implicated in polysaccharide biosynthesis. In plants, these proteins may function, for example, in cell wall synthesis and/or in synthesis of starch. We have isolated wheat (Triticum aestivum) and rice (Oryza sativa) Rgp cDNA clones to study the function of RGPs. Sequence comparisons showed the existence of two classes of RGP proteins, designated RGP1 and RGP2. Glucosylation activity of RGP1 and RGP2 from wheat and rice was studied. After separate expression of Rgp1 and Rgp2 in Escherichia coli or yeast (Saccharomyces cerevisiae), only RGP1 showed self-glucosylation. In Superose 12 fractions from wheat endosperm extract, a polypeptide with a molecular mass of about 40 kD is glucosylated by UDP-glucose. Transgenic tobacco (Nicotiana tabacum) plants, overexpressing either wheat Rgp1 or Rgp2, were generated. Subsequent glucosylation assays revealed that in RGP1-containing tobacco extracts as well as in RGP2-containing tobacco extracts UDP-glucose is incorporated, indicating that an RGP2-containing complex is active. Gel filtration experiments with wheat endosperm extracts and extracts from transgenic tobacco plants, overexpressing either wheat Rgp1 or Rgp2, showed the presence of RGP1 and RGP2 in high-molecular mass complexes. Yeast two-hybrid studies indicated that RGP1 and RGP2 form homo- and heterodimers. Screening of a cDNA library using the yeast two-hybrid system and purification of the complex by an antibody affinity column did not reveal the presence of other proteins in the RGP complexes. Taken together, these results suggest the presence of active RGP1 and RGP2 homo- and heteromultimers in wheat endosperm.
Affiliation(s)
- Sandra M J Langeveld
- Department of Applied Plant Sciences of the Netherlands Organisation for Applied Scientific Research, Center for Phytotechnology, Leiden University, Wassenaarseweg 64, 2333 AL Leiden, The Netherlands.
|
47
|
Perozzo R, Kuo M, Sidhu ABS, Valiyaveettil JT, Bittman R, Jacobs WR, Fidock DA, Sacchettini JC. Structural elucidation of the specificity of the antibacterial agent triclosan for malarial enoyl acyl carrier protein reductase. J Biol Chem 2002; 277:13106-14. [PMID: 11792710 DOI: 10.1074/jbc.m112000200] [Citation(s) in RCA: 149] [Impact Index Per Article: 6.5] [Indexed: 11/06/2022] Open
Abstract
The human malaria parasite Plasmodium falciparum synthesizes fatty acids using a type II pathway that is absent in humans. The final step in fatty acid elongation is catalyzed by enoyl acyl carrier protein reductase, a validated antimicrobial drug target. Here, we report the cloning and expression of the P. falciparum enoyl acyl carrier protein reductase gene, which encodes a 50-kDa protein (PfENR) predicted to target to the unique parasite apicoplast. Purified PfENR was crystallized, and its structure resolved as a binary complex with NADH, a ternary complex with triclosan and NAD(+), and as ternary complexes bound to the triclosan analogs 1 and 2 with NADH. Novel structural features were identified in the PfENR binding loop region that most closely resembled bacterial homologs; elsewhere the protein was similar to ENR from the plant Brassica napus (root mean square for Cα atoms, 0.30 Å). Triclosan and its analogs 1 and 2 killed multidrug-resistant strains of intra-erythrocytic P. falciparum parasites at sub to low micromolar concentrations in vitro. These data define the structural basis of triclosan binding to PfENR and will facilitate structure-based optimization of PfENR inhibitors.
Affiliation(s)
- Remo Perozzo
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843-2128, USA
|
48
|
Muggleton SH, Bryant CH, Srinivasan A, Whittaker A, Topp S, Rawlings C. Are grammatical representations useful for learning from biological sequence data?--a case study. J Comput Biol 2002; 8:493-521. [PMID: 11694180 DOI: 10.1089/106652701753216512] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
This paper investigates whether Chomsky-like grammar representations are useful for learning cost-effective, comprehensible predictors of members of biological sequence families. The Inductive Logic Programming (ILP) Bayesian approach to learning from positive examples is used to generate a grammar for recognising a class of proteins known as human neuropeptide precursors (NPPs). Collectively, five of the co-authors of this paper have extensive expertise on NPPs and general bioinformatics methods. Their motivation for generating an NPP grammar was that none of the existing bioinformatics methods could provide sufficient cost-savings during the search for new NPPs. Prior to this project, experienced specialists at SmithKline Beecham had tried for many months to hand-code such a grammar but without success. Our best predictor makes the search for novel NPPs more than 100 times more efficient than randomly selecting proteins for synthesis and testing them for biological activity. As far as these authors are aware, this is both the first biological grammar learnt using ILP and the first real-world scientific application of the ILP Bayesian approach to learning from positive examples. A group of features is derived from this grammar. Other groups of features of NPPs are derived using other learning strategies. Amalgams of these groups are formed. A recognition model is generated for each amalgam using C4.5 and C4.5rules and its performance is measured using both predictive accuracy and a new cost function, Relative Advantage (RA). The highest RA was achieved by a model which includes grammar-derived features. This RA is significantly higher than the best RA achieved without the use of the grammar-derived features.
Predictive accuracy is not a good measure of performance for this domain because it does not discriminate well between NPP recognition models: despite covering varying numbers of (the rare) positives, all the models are awarded a similar (high) score by predictive accuracy because they all exclude most of the abundant negatives.
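The point about predictive accuracy can be illustrated with a small sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper, and the `enrichment` function (precision divided by the base rate of positives) is only a stand-in for "how much better than random selection"; it is not the paper's Relative Advantage definition, which the abstract does not give.

```python
# Minimal sketch: accuracy barely separates two models on rare positives.
# All counts are hypothetical; `enrichment` is an assumed illustrative
# measure, NOT the Relative Advantage (RA) function from the paper.

def accuracy(tp, fp, tn, fn):
    """Fraction of all sequences classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def recall(tp, fp, tn, fn):
    """Fraction of the true positives that the model covers."""
    return tp / (tp + fn)

def enrichment(tp, fp, tn, fn):
    """Precision divided by the base rate of positives.

    A value of k means a sequence picked by the model is k times more
    likely to be a positive than one picked at random.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    base_rate = (tp + fn) / total
    return precision / base_rate

# Hypothetical setting: 10 positives hidden among 10,000 sequences.
models = {
    "model_a": dict(tp=2, fp=5, tn=9985, fn=8),  # covers few positives
    "model_b": dict(tp=8, fp=5, tn=9985, fn=2),  # covers most positives
}

for name, counts in models.items():
    print(
        name,
        f"accuracy={accuracy(**counts):.4f}",
        f"recall={recall(**counts):.2f}",
        f"enrichment={enrichment(**counts):.1f}",
    )
```

Both hypothetical models exclude nearly all of the abundant negatives, so their accuracies differ by well under one percentage point, while their coverage of the rare positives differs fourfold. A measure tied to the cost of selecting candidates separates them far more clearly.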
Affiliation(s)
- S H Muggleton
- Department of Computer Science, University of York, York YO10 5DD, United Kingdom
|
49
|
Chung JJ, Shikano S, Hanyu Y, Li M. Functional diversity of protein C-termini: more than zipcoding? Trends Cell Biol 2002; 12:146-50. [PMID: 11859027 DOI: 10.1016/s0962-8924(01)02241-3] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.7] [Indexed: 11/22/2022]
Abstract
The carboxylated (C)-terminus of proteins, which includes the single terminal alpha-carboxyl group and preceding residues, is uniquely positioned to serve as a recognition signature for a variety of cell-biological processes, including protein targeting, subcellular anchoring and the static and dynamic formation of macromolecular complexes. The terminal sequence motifs can be processed by posttranslational modifications, thereby providing a means to increase sequence diversity and to regulate interactions. Several classes of protein domains have been identified that are either designed for or are capable of interacting with protein C-termini; these include PDZ and TPR domains. The interactions between these protein domains and various terminal epitopes play an important role in specifying cell-biological functions. The combination of diversity and plasticity in the chemistry of C-termini provides mechanisms for spatial and temporal specificity that are exploited by a variety of biological processes, ranging from specifying prokaryotic protein degradation to nucleating mammalian neuronal signaling complexes. Understanding the diverse functions of protein C-termini might also provide an important indexing criterion for functional proteomics.
Affiliation(s)
- Jean Ju Chung
- Dept of Physiology, Johns Hopkins University School of Medicine, 725 N Wolfe Street, Baltimore, MD 21205, USA
|
50
|
Emanuelsson O, von Heijne G, Schneider G. Analysis and prediction of mitochondrial targeting peptides. Methods Cell Biol 2002; 65:175-87. [PMID: 11381593 DOI: 10.1016/s0091-679x(01)65011-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.4] [Indexed: 01/30/2023]
Affiliation(s)
- O Emanuelsson
- Stockholm Bioinformatics Center, Stockholm University, S-10691 Stockholm, Sweden
|