Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Restrepo-Montoya D, Pino C, Nino LF, Patarroyo ME, Patarroyo MA. NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins. BMC Bioinformatics 2011;12:21. [PMID: 21235786 PMCID: PMC3025837 DOI: 10.1186/1471-2105-12-21] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 07/25/2010] [Accepted: 01/14/2011] [Indexed: 11/16/2022] Open

Cited by Other Article(s)

Liu T, Song C, Wang C. NCSP-PLM: An ensemble learning framework for predicting non-classical secreted proteins based on protein language models and deep learning. Mathematical Biosciences and Engineering : MBE 2024;21:1472-1488. [PMID: 38303473 DOI: 10.3934/mbe.2024063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]

Wang X, Li F, Xu J, Rong J, Webb GI, Ge Z, Li J, Song J. ASPIRER: a new computational approach for identifying non-classical secreted proteins based on deep learning. Brief Bioinform 2022;23:bbac031. [PMID: 35176756 PMCID: PMC8921646 DOI: 10.1093/bib/bbac031] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/20/2021] [Revised: 01/10/2022] [Accepted: 01/22/2022] [Indexed: 12/15/2022] Open

Marques da Silva W, Seyffert N, Silva A, Azevedo V. A journey through the Corynebacterium pseudotuberculosis proteome promotes insights into its functional genome. PeerJ 2022;9:e12456. [PMID: 35036114 PMCID: PMC8710256 DOI: 10.7717/peerj.12456] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/12/2021] [Accepted: 10/18/2021] [Indexed: 11/28/2022] Open

Abstract

Background

Corynebacterium pseudotuberculosis is a Gram-positive facultative intracellular pathogen and the etiologic agent of illnesses such as caseous lymphadenitis in small ruminants, mastitis in dairy cattle, ulcerative lymphangitis in equines, and oedematous skin disease in buffalos. With the advance of high-throughput technologies, genomic studies have been carried out to explore the molecular basis of its virulence and pathogenicity. However, large-scale functional genomics studies are necessary to complement genomics data and to better understand the molecular basis of a given organism. Here we summarize the MS-based proteomics techniques and bioinformatics tools incorporated in functional genomics studies of C. pseudotuberculosis to discover the different patterns of protein modulation under distinct environmental conditions, as well as antigenic and drug targets.

Methodology

In this study we performed an extensive search in Web of Science of original and relevant articles related to methods, strategy, technology, approaches, and bioinformatics tools focused on the functional study of the genome of C. pseudotuberculosis at the protein level.

Results

Here, we highlight the use of proteomics for understanding several aspects of the physiology and pathogenesis of C. pseudotuberculosis at the protein level. We describe the implementation and use of protocols, strategies, and proteomics approaches to characterize the different subcellular fractions of the proteome of this pathogen. In addition, we discuss the immunoproteomics, immunoinformatics and genetic tools employed to identify targets for immunoassays, drugs, and vaccines against C. pseudotuberculosis infection.

Conclusion

In this review, we showed that the combination of proteomics and bioinformatics studies is a suitable strategy to elucidate the functional aspects of the C. pseudotuberculosis genome. Together, the information generated from these proteomics studies has expanded our knowledge of factors related to the pathophysiology of this pathogen.

Dai W, Li J, Li Q, Cai J, Su J, Stubenrauch C, Wang J. PncsHub: a platform for annotating and analyzing non-classically secreted proteins in Gram-positive bacteria. Nucleic Acids Res 2022;50:D848-D857. [PMID: 34551435 PMCID: PMC8728121 DOI: 10.1093/nar/gkab814] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2021] [Revised: 08/30/2021] [Accepted: 09/07/2021] [Indexed: 12/28/2022] Open

Wang W, Ye LF, Bao H, Hu MT, Han M, Tang HM, Ren C, Wu X, Shao Y, Wang FH, Zhou ZW, Li YH, Xu RH, Wang DS. Heterogeneity and evolution of tumour immune microenvironment in metastatic gastroesophageal adenocarcinoma. Gastric Cancer 2022;25:1017-1030. [PMID: 35904677 PMCID: PMC9587966 DOI: 10.1007/s10120-022-01324-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Accepted: 07/16/2022] [Indexed: 02/07/2023]

Affiliation(s)

Wei Wang: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Department of Gastric Surgery, Sun Yat-Sen University Cancer Center, Guangzhou, 510060, People’s Republic of China
Liu-Fang Ye: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
Hua Bao: Geneseeq Research Institute, Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
Ming-Tao Hu: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
Ming Han: Geneseeq Research Institute, Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
Hai-Meng Tang: Geneseeq Research Institute, Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
Chao Ren: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
Xue Wu: Geneseeq Research Institute, Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China
Yang Shao: Geneseeq Research Institute, Nanjing Geneseeq Technology Inc., Nanjing, Jiangsu, China; School of Public Health, Nanjing Medical University, Nanjing, China
Feng-Hua Wang: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
Zhi-Wei Zhou: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Department of Gastric Surgery, Sun Yat-Sen University Cancer Center, Guangzhou, 510060, People’s Republic of China
Yu-Hong Li: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
Rui-Hua Xu: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China
De-Shen Wang: State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Sun Yat-Sen University, Guangzhou, 510060, People’s Republic of China; Research Unit of Precision Diagnosis and Treatment for Gastrointestinal Cancer, Chinese Academy of Medical Sciences, Guangzhou, 510060, People’s Republic of China; Department of Medical Oncology, Sun Yat-Sen University Cancer Center, 651 Dong feng, East Road, Guangzhou, 510060, People’s Republic of China

Protein Secretion Prediction Tools and Extracellular Vesicles Databases. Methods Mol Biol 2021;2361:213-227. [PMID: 34236664 DOI: 10.1007/978-1-0716-1641-3_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]

Wang C, Wu J, Xu L, Zou Q. NonClasGP-Pred: robust and efficient prediction of non-classically secreted proteins by integrating subset-specific optimal models of imbalanced data. Microb Genom 2020;6:mgen000483. [PMID: 33245691 PMCID: PMC8116686 DOI: 10.1099/mgen.0.000483] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/14/2020] [Accepted: 11/06/2020] [Indexed: 01/01/2023] Open

Abstract

Non-classically secreted proteins (NCSPs) are proteins that are located in the extracellular environment despite lacking known signal peptides or secretion motifs. They usually perform different biological functions in intracellular and extracellular environments, and several of their biological functions are linked to bacterial virulence and cell defence. Accurate protein localization is essential for all living organisms; however, the performance of existing methods developed for NCSP identification has been unsatisfactory, and in particular these methods suffer from data deficiency and possible overfitting problems. Further improvement is desirable, especially to address the lack of informative features and to mine subset-specific features in imbalanced datasets. In the present study, a new computational predictor was developed for NCSP prediction in Gram-positive bacteria. First, to address the possible prediction bias caused by the data imbalance problem, ten balanced subdatasets were generated for ensemble model construction. Second, the F-score algorithm combined with sequential forward search was used to strengthen the feature representation ability for each of the training subdatasets. Third, the subset-specific optimal feature combination process was adopted to characterize the original data from different aspects, and all subdataset-based models were integrated into a unified model, NonClasGP-Pred, which achieved an excellent performance with an accuracy of 93.23%, a sensitivity of 100%, a specificity of 89.01%, a Matthews correlation coefficient of 87.68% and an area under the curve value of 0.9975 for ten-fold cross-validation. Based on assessment on the independent test dataset, the proposed model outperformed state-of-the-art available toolkits. For availability and implementation, see: http://lab.malab.cn/~wangchao/softwares/NonClasGP/.
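
The first two steps of the abstract above (balanced subdatasets, then F-score feature ranking) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the subsets are built by random undersampling (the abstract does not say how they were drawn), and the F-score is assumed to be the common between-class over within-class variance ratio used for feature selection.

```python
import random
from statistics import mean, variance


def balanced_subsets(positives, negatives, n_subsets=10, seed=0):
    """Draw n_subsets class-balanced subsets by random undersampling (assumed scheme)."""
    rng = random.Random(seed)
    size = min(len(positives), len(negatives))
    subsets = []
    for _ in range(n_subsets):
        pos = rng.sample(positives, size)
        neg = rng.sample(negatives, size)
        subsets.append((pos, neg))
    return subsets


def f_score(pos_values, neg_values):
    """F-score of one feature: between-class spread divided by within-class spread."""
    overall = mean(pos_values + neg_values)
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    within = variance(pos_values) + variance(neg_values)  # sample variances (n - 1)
    return between / within if within > 0 else 0.0


def rank_features(pos, neg):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(pos[0])
    scores = [
        f_score([row[j] for row in pos], [row[j] for row in neg])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy feature vectors (hypothetical): feature 0 separates the classes, feature 1 does not.
toy_pos = [[1.0, 0.2], [1.2, 0.8], [0.9, 0.5]]
toy_neg = [[0.1, 0.3], [0.0, 0.7], [0.2, 0.5], [0.1, 0.4], [0.3, 0.6], [0.0, 0.5]]

for pos, neg in balanced_subsets(toy_pos, toy_neg):
    ranking = rank_features(pos, neg)  # one ranking per balanced subset
```

In a sequential forward search, features would then be added one at a time in this ranked order, keeping each addition only if it improves cross-validated performance on that subset.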

Zhang Y, Yu S, Xie R, Li J, Leier A, Marquez-Lago TT, Akutsu T, Smith AI, Ge Z, Wang J, Lithgow T, Song J. PeNGaRoo, a combined gradient boosting and ensemble learning framework for predicting non-classical secreted proteins. Bioinformatics 2020;36:704-712. [PMID: 31393553 DOI: 10.1093/bioinformatics/btz629] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/03/2019] [Revised: 07/17/2019] [Accepted: 08/07/2019] [Indexed: 12/17/2022] Open

Abstract

MOTIVATION

Gram-positive bacteria have developed secretion systems to transport proteins across their cell wall, a process that plays an important role during host infection. These secretion mechanisms have also been harnessed for therapeutic purposes in many biotechnology applications. Accordingly, the identification of features that select a protein for efficient secretion from these microorganisms has become an important task. Among all the secreted proteins, 'non-classical' secreted proteins are difficult to identify as they lack discernable signal peptide sequences and can make use of diverse secretion pathways. Currently, several computational methods have been developed to facilitate the discovery of such non-classical secreted proteins; however, the existing methods are based on either simulated or limited experimental datasets. In addition, they often employ basic features to train the models in a simple and coarse-grained manner. The availability of more experimentally validated datasets, advanced feature engineering techniques and novel machine learning approaches creates new opportunities for the development of improved predictors of 'non-classical' secreted proteins from sequence data.

RESULTS

In this work, we first constructed a high-quality dataset of experimentally verified 'non-classical' secreted proteins, which we then used to create benchmark datasets. Using these benchmark datasets, we comprehensively analyzed a wide range of features and assessed their individual performance. Subsequently, we developed a two-layer Light Gradient Boosting Machine (LightGBM) ensemble model that integrates several single feature-based models into an overall prediction framework. At this stage, LightGBM, a gradient boosting machine, was used as the machine learning approach and the necessary parameter optimization was performed by a particle swarm optimization strategy. All single feature-based LightGBM models were then integrated into a unified ensemble model to further improve the predictive performance. Consequently, the final ensemble model achieved a superior performance with an accuracy of 0.900, an F-value of 0.903, a Matthews correlation coefficient of 0.803 and an area under the curve value of 0.963, outperforming previous state-of-the-art predictors on the independent test. Based on our proposed optimal ensemble model, we further developed an accessible online predictor, PeNGaRoo, to serve users' demands. We believe this online web server, together with our proposed methodology, will expedite the discovery of non-classically secreted effector proteins in Gram-positive bacteria and further inspire the development of next-generation predictors.
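
For readers comparing the figures reported in these abstracts, the threshold-based metrics (accuracy, sensitivity, specificity, F-value and the Matthews correlation coefficient) all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from any of the cited studies.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, tn, fp, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)  # recall on the positive class
    specificity = _ratio(tn, tn + fp)  # recall on the negative class
    precision = _ratio(tp, tp + fp)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_value": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for a balanced test set of 100 proteins.
example = binary_metrics(tp=45, tn=45, fp=5, fn=5)
```

The area under the curve is not included because it needs the ranked prediction scores rather than a single confusion matrix.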

AVAILABILITY AND IMPLEMENTATION

http://pengaroo.erc.monash.edu/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Affiliation(s)

Yanju Zhang: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
Sha Yu: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China; Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, VIC 3800, Australia
Ruopeng Xie: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China; Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, VIC 3800, Australia
Jiahui Li: Bioinformatics Group, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China; Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
André Leier: Department of Genetics, AL, USA; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
Tatiana T Marquez-Lago: Department of Genetics, AL, USA; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
Tatsuya Akutsu: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
A Ian Smith: Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, VIC 3800, Australia; ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, VIC 3800, Australia
Zongyuan Ge: Monash e-Research Centre and Faculty of Engineering, Monash University, Melbourne, VIC 3800, Australia
Jiawei Wang: Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
Trevor Lithgow: Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia
Jiangning Song: Infection and Immunity Program, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, VIC 3800, Australia; ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, VIC 3800, Australia

Zhang J, Zhang Y, Ma Z. In silico Prediction of Human Secretory Proteins in Plasma Based on Discrete Firefly Optimization and Application to Cancer Biomarkers Identification. Front Genet 2019;10:542. [PMID: 31244885 PMCID: PMC6563772 DOI: 10.3389/fgene.2019.00542] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/04/2019] [Accepted: 05/21/2019] [Indexed: 12/20/2022] Open

Zhang J, Chai H, Guo S, Guo H, Li Y. High-Throughput Identification of Mammalian Secreted Proteins Using Species-Specific Scheme and Application to Human Proteome. Molecules 2018;23:1448. [PMID: 29903999 PMCID: PMC6099666 DOI: 10.3390/molecules23061448] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/13/2018] [Revised: 05/29/2018] [Accepted: 05/30/2018] [Indexed: 02/02/2023] Open

Lonsdale A, Davis MJ, Doblin MS, Bacic A. Better Than Nothing? Limitations of the Prediction Tool SecretomeP in the Search for Leaderless Secretory Proteins (LSPs) in Plants. Frontiers in Plant Science 2016;7:1451. [PMID: 27729919 PMCID: PMC5037178 DOI: 10.3389/fpls.2016.01451] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/26/2016] [Accepted: 09/12/2016] [Indexed: 05/14/2023]

Huang WL, Tung CW, Liaw C, Huang HL, Ho SY. Rule-based knowledge acquisition method for promoter prediction in human and Drosophila species. ScientificWorldJournal 2014;2014:327306. [PMID: 24955394 PMCID: PMC3927563 DOI: 10.1155/2014/327306] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/31/2013] [Accepted: 10/10/2013] [Indexed: 01/08/2023] Open

Caccia D, Dugo M, Callari M, Bongarzone I. Bioinformatics tools for secretome analysis. Biochimica et Biophysica Acta-Proteins and Proteomics 2013;1834:2442-53. [PMID: 23395702 DOI: 10.1016/j.bbapap.2013.01.039] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Received: 10/29/2012] [Revised: 01/23/2013] [Accepted: 01/29/2013] [Indexed: 12/29/2022]

Ye L, Zhang T, Wang T, Fang Z. Microbial structures, functions, and metabolic pathways in wastewater treatment bioreactors revealed using high-throughput sequencing. Environmental Science & Technology 2012;46:13244-52. [PMID: 23151157 DOI: 10.1021/es303454k] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 05/13/2023]

Santos AR, Carneiro A, Gala-García A, Pinto A, Barh D, Barbosa E, Aburjaile F, Dorella F, Rocha F, Guimarães L, Zurita-Turk M, Ramos R, Almeida S, Soares S, Pereira U, Abreu VC, Silva A, Miyoshi A, Azevedo V. The Corynebacterium pseudotuberculosis in silico predicted pan-exoproteome. BMC Genomics 2012;13 Suppl 5:S6. [PMID: 23095951 PMCID: PMC3476999 DOI: 10.1186/1471-2164-13-s5-s6] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022] Open

Abstract

Background

Pan-genomic studies aim, for instance, at defining the core, dispensable and unique genes within a species. A pan-genomics study for vaccine design tries to assess the best candidates for a vaccine against a specific pathogen. In this context, rather than studying genes predicted to be exported in a single genome, with pan-genomics it is possible to study genes present in different strains within the same species, such as virulence factors. The target organism of the pan-genomic work presented here is Corynebacterium pseudotuberculosis, the etiologic agent of caseous lymphadenitis (CLA) in goats and sheep, which causes significant economic losses in those herds around the world. Currently, only a few antigens against CLA are known, and these form the basis of commercial but still ineffective vaccines. In this regard, the work presented here analyses, in silico, five C. pseudotuberculosis genomes and gathers data to predict exported proteins common to all five genomes. These candidates were also compared to two recent C. pseudotuberculosis in vitro exoproteome results.

Results

The complete genomes of five C. pseudotuberculosis strains (1002, C231, I19, FRC41 and PAT10) were submitted to pan-genomics analysis, yielding 306, 59 and 12 gene sets, respectively, representing the core, dispensable and unique in silico predicted exported pan-genomes. These sets contain 150 genes classified as secreted (SEC) and 227 as potentially surface exposed (PSE). Our findings suggest that the main C. pseudotuberculosis in vitro exoproteome could be larger, extended by a fraction of the 35 proteins formerly predicted as being part of the variant in vitro exoproteome. These genomes were manually curated for correct methionine initiation and redeposited with a total of 1885 homogenized genes.
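
The core/dispensable/unique split described above can be illustrated with plain set operations. This is a sketch under the usual pan-genome definitions (core: present in every strain; unique: present in exactly one; dispensable: everything in between) and assumes at least two strains; the function name and gene identifiers are hypothetical, and only the strain names come from the abstract. As a consistency check on the reported figures, 306 + 59 + 12 = 377 = 150 + 227.

```python
from collections import Counter


def partition_pan_genome(strain_genes):
    """Split gene identifiers into core, dispensable and unique sets.

    strain_genes maps a strain name to the set of gene (family) identifiers
    predicted for that strain.
    """
    n_strains = len(strain_genes)
    presence = Counter(gene for genes in strain_genes.values() for gene in set(genes))
    core = {g for g, count in presence.items() if count == n_strains}
    unique = {g for g, count in presence.items() if count == 1}
    dispensable = {g for g, count in presence.items() if 1 < count < n_strains}
    return core, dispensable, unique


# Hypothetical gene identifiers for the five strains named in the abstract.
strains = {
    "1002": {"geneA", "geneB", "geneC"},
    "C231": {"geneA", "geneB"},
    "I19": {"geneA", "geneD"},
    "FRC41": {"geneA", "geneB"},
    "PAT10": {"geneA"},
}
core, dispensable, unique = partition_pan_genome(strains)
```

In practice the identifiers would be orthologous gene families rather than raw locus tags, so that the same gene is recognised across strains.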

Conclusions

The in silico prediction of exported proteins has made it possible to define a list of putative vaccine candidate genes present in all five complete C. pseudotuberculosis genomes. Moreover, it has also been possible to define the in silico predicted dispensable and unique C. pseudotuberculosis exported proteins. These results provide in silico evidence to further guide experiments in the areas of vaccines, diagnosis and drugs. The work presented here is the first whole C. pseudotuberculosis in silico predicted pan-exoproteome completed to date.

Hu Y, Li T, Sun J, Tang S, Xiong W, Li D, Chen G, Cong P. Predicting Gram-positive bacterial protein subcellular localization based on localization motifs. J Theor Biol 2012;308:135-40. [DOI: 10.1016/j.jtbi.2012.05.031] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 01/04/2012] [Revised: 03/30/2012] [Accepted: 05/29/2012] [Indexed: 10/28/2022]

Renier S, Micheau P, Talon R, Hébraud M, Desvaux M. Subcellular localization of extracytoplasmic proteins in monoderm bacteria: rational secretomics-based strategy for genomic and proteomic analyses. PLoS One 2012;7:e42982. [PMID: 22912771 PMCID: PMC3415414 DOI: 10.1371/journal.pone.0042982] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/10/2012] [Accepted: 07/13/2012] [Indexed: 11/20/2022] Open

Huang WL. Ranking Gene Ontology terms for predicting non-classical secretory proteins in eukaryotes and prokaryotes. J Theor Biol 2012;312:105-13. [PMID: 22967952 DOI: 10.1016/j.jtbi.2012.07.027] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 01/08/2012] [Revised: 05/30/2012] [Accepted: 07/28/2012] [Indexed: 11/24/2022]

Abstract

Protein secretion is an important biological process for both eukaryotes and prokaryotes. Several sequence-based methods mainly rely on utilizing various types of complementary features to design accurate classifiers for predicting non-classical secretory proteins. Gene Ontology (GO) terms are increasingly informative in predicting protein functions. However, the number of GO terms used is often very large. For example, there are 60,020 GO terms used in the prediction method Euk-mPLoc 2.0 for subcellular localization. This study proposes a novel approach to identify a small set of m top-ranked GO terms serving as the only type of input feature to design a support vector machine (SVM) based method, Sec-GO, to predict non-classical secretory proteins in both eukaryotes and prokaryotes. To evaluate the Sec-GO method, two existing methods and the datasets they used are adopted for performance comparisons. The Sec-GO method using m=436 GO terms yields an independent test accuracy of 96.7% on mammalian proteins, much better than the existing method SPRED (82.2%), which uses frequencies of tri-peptides and short peptides, secondary structure, and physicochemical properties as input features of a random forest classifier. Furthermore, when applied to Gram-positive bacterial proteins, Sec-GO with m=158 GO terms has a test accuracy of 94.5%, superior to NClassG+ (90.0%), which uses an SVM with several feature types, comprising amino acid composition, di-peptides, physicochemical properties and the position specific weighting matrix. Analysis of the distribution of secretory proteins in a GO database indicates that the percentage of non-classical secretory proteins annotated by GO is larger than that of classical secretory proteins in both eukaryotes and prokaryotes. Of the m top-ranked GO features, the top four GO terms are all annotated by such subcellular locations as GO:0005576 (Extracellular region). Additionally, the method Sec-GO is easily implemented and its web prediction tool is available at iclab.life.nctu.edu.tw/secgo.
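
The idea of using a small set of m top-ranked GO terms as the only input features can be sketched as below. The abstract does not state the ranking criterion, so this example assumes a simple one (absolute difference in annotation frequency between secreted and non-secreted proteins); the function names and toy annotations are hypothetical and this is not the Sec-GO implementation.

```python
def rank_go_terms(pos_annotations, neg_annotations):
    """Rank GO terms by how differently they annotate the two classes.

    Each argument is a list of sets of GO identifiers, one set per protein.
    The ranking criterion (frequency difference) is an assumption for illustration.
    """
    terms = set().union(*pos_annotations, *neg_annotations)

    def frequency(term, group):
        return sum(term in annotation for annotation in group) / len(group)

    def score(term):
        return abs(frequency(term, pos_annotations) - frequency(term, neg_annotations))

    # Sort by descending score; break ties alphabetically for reproducibility.
    return sorted(terms, key=lambda term: (-score(term), term))


def encode(annotation, top_terms):
    """Binary feature vector: 1 if the protein carries the GO term, else 0."""
    return [1 if term in annotation else 0 for term in top_terms]


# Toy annotations (hypothetical proteins).
secreted = [{"GO:0005576", "GO:0003824"}, {"GO:0005576"}]
non_secreted = [{"GO:0005737", "GO:0003824"}, {"GO:0005737"}]

m = 2
top_terms = rank_go_terms(secreted, non_secreted)[:m]
features = [encode(annotation, top_terms) for annotation in secreted + non_secreted]
```

The resulting binary vectors would then be passed to a classifier such as an SVM, with m chosen by validation performance.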
