Reference Citation Analysis

For: Krug K, Nahnsen S, Macek B. Mass spectrometry at the interface of proteomics and genomics. Mol Biosyst 2010;7:284-91. [PMID: 20967315 DOI: 10.1039/c0mb00168f] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 12/25/2022]

Cited by Other Article(s)

Mani DR, Krug K, Zhang B, Satpathy S, Clauser KR, Ding L, Ellis M, Gillette MA, Carr SA. Cancer proteogenomics: current impact and future prospects. Nat Rev Cancer 2022;22:298-313. [PMID: 35236940 DOI: 10.1038/s41568-022-00446-5] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Accepted: 01/21/2022] [Indexed: 02/07/2023]

Rex DB, Patil AH, Modi PK, Kandiyil MK, Kasaragod S, Pinto SM, Tanneru N, Sijwali PS, Prasad TSK. Dissecting Plasmodium yoelii Pathobiology: Proteomic Approaches for Decoding Novel Translational and Post-Translational Modifications. ACS Omega 2022;7:8246-8257. [PMID: 35309442 PMCID: PMC8928344 DOI: 10.1021/acsomega.1c03892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2021] [Accepted: 02/21/2022] [Indexed: 06/14/2023]

Guillot L, Delage L, Viari A, Vandenbrouck Y, Com E, Ritter A, Lavigne R, Marie D, Peterlongo P, Potin P, Pineau C. Peptimapper: proteogenomics workflow for the expert annotation of eukaryotic genomes. BMC Genomics 2019;20:56. [PMID: 30654742 PMCID: PMC6337836 DOI: 10.1186/s12864-019-5431-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/26/2018] [Accepted: 01/03/2019] [Indexed: 01/02/2023]

Abstract

Background

Accurate structural annotation of genomes is still a challenge, despite the progress made over the past decade. The prediction of gene structure remains difficult, especially for eukaryotic species, and is often erroneous and incomplete. We used a proteogenomics strategy, taking advantage of the combination of proteomics datasets and bioinformatics tools, to identify novel protein-coding genes and splice isoforms, assign correct start sites, and validate predicted exons and genes.

Results

Our proteogenomics workflow, Peptimapper, was applied to the genome annotation of Ectocarpus sp., a key reference genome for both the brown algal lineage and stramenopiles. We generated proteomics data from various life cycle stages of Ectocarpus sp. strains and sub-cellular fractions using a shotgun approach. First, we directly generated peptide sequence tags (PSTs) from the proteomics data. Second, we mapped PSTs onto the translated genomic sequence. Closely located hits (i.e., PST locations on the genome) were then clustered to detect potential coding regions based on parameters optimized for the organism. Third, we evaluated each cluster and compared it to gene predictions from existing conventional genome annotation approaches. Finally, we integrated cluster locations into GFF files for use in a genome viewer. We identified two potential novel genes, a ribosomal protein L22 and an aryl sulfotransferase, and corrected the gene structure of a dihydrolipoamide acetyltransferase. We experimentally validated the results by RT-PCR and using transcriptomics data.
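
The clustering and GFF export steps described above can be illustrated with a minimal sketch. This is not Peptimapper's actual implementation: the hit layout (contig, start, end), the function names, the single `max_gap` parameter, and the choice to ignore strand are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical hit layout: (contig id, start, end), 1-based inclusive coordinates.
Hit = Tuple[str, int, int]
Cluster = Tuple[str, int, int, int]  # (contig, start, end, number of hits)


def cluster_hits(hits: List[Hit], max_gap: int = 1000) -> List[Cluster]:
    """Merge hits on the same contig when the gap to the current cluster is <= max_gap."""
    clusters: List[Cluster] = []
    for contig, start, end in sorted(hits):
        if clusters and clusters[-1][0] == contig and start - clusters[-1][2] <= max_gap:
            c, s, e, n = clusters[-1]
            # Extend the open cluster and count the additional hit.
            clusters[-1] = (c, s, max(e, end), n + 1)
        else:
            clusters.append((contig, start, end, 1))
    return clusters


def to_gff(clusters: List[Cluster], source: str = "pst_clusters") -> List[str]:
    """Format clusters as 9-column GFF lines; the hit count is stored in the score column."""
    lines = []
    for i, (contig, start, end, n_hits) in enumerate(clusters, 1):
        lines.append(
            f"{contig}\t{source}\tregion\t{start}\t{end}\t{n_hits}\t.\t.\tID=cluster{i}"
        )
    return lines
```

In practice the gap threshold would be tuned per organism, as the abstract notes, since intron lengths differ widely between genomes.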

Conclusions

Peptimapper is a complementary tool for the expert annotation of genomes. It is suitable for any organism and is distributed through a Docker image available on two public bioinformatics docker repositories: Docker Hub and BioShaDock. This workflow is also accessible through the Galaxy framework and for use by non-computer scientists at https://galaxy.protim.eu.

Data are available via ProteomeXchange under identifier PXD010618.

Electronic supplementary material

The online version of this article (10.1186/s12864-019-5431-9) contains supplementary material, which is available to authorized users.

Affiliation(s)

Laetitia Guillot Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35042, Rennes cedex, France; Protim, Univ Rennes, F-35042, Rennes cedex, France
Ludovic Delage Sorbonne Université, UPMC, CNRS, UMR 8227, Integrative Biology of Marine Models, Biological Station, CS 90074, F-29688, Roscoff, France
Alain Viari INRIA Grenoble-Rhône-Alpes, F-38330, Montbonnot-Saint-Martin, France
Yves Vandenbrouck University Grenoble Alpes, CEA, Inserm, BIG-BGE, 38000, Grenoble, France
Emmanuelle Com Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35042, Rennes cedex, France; Protim, Univ Rennes, F-35042, Rennes cedex, France
Andrés Ritter Sorbonne Université, UPMC, CNRS, UMR 8227, Integrative Biology of Marine Models, Biological Station, CS 90074, F-29688, Roscoff, France; Present address: Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005, Paris, France
Régis Lavigne Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35042, Rennes cedex, France; Protim, Univ Rennes, F-35042, Rennes cedex, France
Dominique Marie Sorbonne Université, UPMC, CNRS, UMR 8227, Integrative Biology of Marine Models, Biological Station, CS 90074, F-29688, Roscoff, France
Pierre Peterlongo University Rennes, Inria, CNRS, IRISA, F-35042, Rennes, France
Philippe Potin Sorbonne Université, UPMC, CNRS, UMR 8227, Integrative Biology of Marine Models, Biological Station, CS 90074, F-29688, Roscoff, France
Charles Pineau Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, F-35042, Rennes cedex, France; Protim, Univ Rennes, F-35042, Rennes cedex, France

Finkel Y, Stern-Ginossar N, Schwartz M. Viral Short ORFs and Their Possible Functions. Proteomics 2018;18:e1700255. [PMID: 29150926 PMCID: PMC7167739 DOI: 10.1002/pmic.201700255] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/26/2017] [Revised: 11/06/2017] [Indexed: 12/30/2022]

Ruggles KV, Krug K, Wang X, Clauser KR, Wang J, Payne SH, Fenyö D, Zhang B, Mani DR. Methods, Tools and Current Perspectives in Proteogenomics. Mol Cell Proteomics 2017;16:959-981. [PMID: 28456751 DOI: 10.1074/mcp.mr117.000024] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Received: 04/13/2017] [Indexed: 12/20/2022]

Westermann B, Jacome ASV, Rompais M, Carapito C, Schaeffer-Reiss C. Doublet N-Terminal Oriented Proteomics for N-Terminomics and Proteolytic Processing Identification. Methods Mol Biol 2017;1574:77-90. [PMID: 28315244 DOI: 10.1007/978-1-4939-6850-3_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]

Soares NC, Bou G, Blackburn JM. Editorial: Proteomics of Microbial Human Pathogens. Front Microbiol 2016;7:1742. [PMID: 27867374 PMCID: PMC5095502 DOI: 10.3389/fmicb.2016.01742] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/26/2016] [Accepted: 10/18/2016] [Indexed: 11/20/2022]

Klasberg S, Bitard-Feildel T, Mallet L. Computational Identification of Novel Genes: Current and Future Perspectives. Bioinform Biol Insights 2016;10:121-31. [PMID: 27493475 PMCID: PMC4970615 DOI: 10.4137/bbi.s39950] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/15/2016] [Revised: 05/31/2016] [Accepted: 06/05/2016] [Indexed: 12/31/2022]

Potgieter MG, Nakedi KC, Ambler JM, Nel AJM, Garnett S, Soares NC, Mulder N, Blackburn JM. Proteogenomic Analysis of Mycobacterium smegmatis Using High Resolution Mass Spectrometry. Front Microbiol 2016;7:427. [PMID: 27092112 PMCID: PMC4821088 DOI: 10.3389/fmicb.2016.00427] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 10/25/2015] [Accepted: 03/16/2016] [Indexed: 11/30/2022]

Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]

Díez P, Droste C, Dégano RM, González-Muñoz M, Ibarrola N, Pérez-Andrés M, Garin-Muga A, Segura V, Marko-Varga G, LaBaer J, Orfao A, Corrales FJ, De Las Rivas J, Fuentes M. Integration of Proteomics and Transcriptomics Data Sets for the Analysis of a Lymphoma B-Cell Line in the Context of the Chromosome-Centric Human Proteome Project. J Proteome Res 2015. [PMID: 26216070 DOI: 10.1021/acs.jproteome.5b00474] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]

Abstract

A comprehensive study of the molecular active landscape of human cells can be undertaken to integrate two different but complementary perspectives: transcriptomics, and proteomics. After the genome era, proteomics has emerged as a powerful tool to simultaneously identify and characterize the compendium of thousands of different proteins active in a cell. Thus, the Chromosome-centric Human Proteome Project (C-HPP) is promoting a full characterization of the human proteome combining high-throughput proteomics with the data derived from genome-wide expression profiling of protein-coding genes. Here we present a full proteomic profiling of a human lymphoma B-cell line (Ramos) performed using a nanoUPLC-LTQ-Orbitrap Velos proteomic platform, combined to an in-depth transcriptomic profiling of the same cell type. Data are available via ProteomeXchange with identifier PXD001933. Integration of the proteomic and transcriptomic data sets revealed a 94% overlap in the proteins identified by both -omics approaches. Moreover, functional enrichment analysis of the proteomic profiles showed an enrichment of several functions directly related to the biological and morphological characteristics of B-cells. In turn, about 30% of all protein-coding genes present in the whole human genome were identified as being expressed by the Ramos cells (stable average of 30% genes along all the chromosomes), revealing the size of the protein expression-set present in one specific human cell type. Additionally, the identification of missing proteins in our data sets has been reported, highlighting the power of the approach. Also, a comparison between neXtProt and UniProt database searches has been performed. In summary, our transcriptomic and proteomic experimental profiling provided a high coverage report of the expressed proteome from a human lymphoma B-cell type with a clear insight into the biological processes that characterized these cells. 
In this way, we demonstrated the usefulness of combining -omics for a comprehensive characterization of specific biological systems.

Affiliation(s)

Paula Díez Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain; Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Conrad Droste Bioinformatics and Functional Genomics Research Group, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Rosa M Dégano Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
María González-Muñoz Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Nieves Ibarrola Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Martín Pérez-Andrés Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Alba Garin-Muga Division of Hepatology and Gene Therapy, Proteomics and Bioinformatics Unit, Centre for Applied Medical Research (CIMA), University of Navarra, 31008 Pamplona, Spain
Víctor Segura Division of Hepatology and Gene Therapy, Proteomics and Bioinformatics Unit, Centre for Applied Medical Research (CIMA), University of Navarra, 31008 Pamplona, Spain
Gyorgy Marko-Varga Clinical Protein Science and Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84 Lund, Sweden
Joshua LaBaer Biodesign Institute, Arizona State University, 1001 South McAllister Avenue, Tempe, Arizona 85287, United States
Alberto Orfao Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Fernando J Corrales Division of Hepatology and Gene Therapy, Proteomics and Bioinformatics Unit, Centre for Applied Medical Research (CIMA), University of Navarra, 31008 Pamplona, Spain
Javier De Las Rivas Bioinformatics and Functional Genomics Research Group, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
Manuel Fuentes Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain; Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain

Walley JW, Briggs SP. Dual use of peptide mass spectra: Protein atlas and genome annotation. Curr Plant Biol 2015;2:21-24. [PMID: 26811807 PMCID: PMC4723421 DOI: 10.1016/j.cpb.2015.02.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 05/18/2023]

Rabus R. Fifteen years of physiological proteo(geno)mics with (marine) environmental bacteria. Arch Physiol Biochem 2014;120:173-87. [PMID: 25233489 DOI: 10.3109/13813455.2014.951658] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]

Krug K, Popic S, Carpy A, Taumer C, Macek B. Construction and assessment of individualized proteogenomic databases for large-scale analysis of nonsynonymous single nucleotide variants. Proteomics 2014;14:2699-708. [DOI: 10.1002/pmic.201400219] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/16/2014] [Revised: 08/02/2014] [Accepted: 09/19/2014] [Indexed: 01/08/2023]

Dwivedi SB, Muthusamy B, Kumar P, Kim MS, Nirujogi RS, Getnet D, Ahiakonu P, De G, Nair B, Gowda H, Prasad TSK, Kumar N, Pandey A, Okulate M. Brain proteomics of Anopheles gambiae. OMICS 2014;18:421-37. [PMID: 24937107 DOI: 10.1089/omi.2014.0007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]

Bland C, Hartmann EM, Christie-Oleza JA, Fernandez B, Armengaud J. N-Terminal-oriented proteogenomics of the marine bacterium Roseobacter denitrificans Och114 using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP) labeling and diagonal chromatography. Mol Cell Proteomics 2014;13:1369-81. [PMID: 24536027 DOI: 10.1074/mcp.o113.032854] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022]

Emerging evidence for functional peptides encoded by short open reading frames. Nat Rev Genet 2014;15:193-204. [PMID: 24514441 DOI: 10.1038/nrg3520] [Citation(s) in RCA: 381] [Impact Index Per Article: 38.1] [Indexed: 12/19/2022]

Mehta A, Sonam S, Gouri I, Loharch S, Sharma DK, Parkesh R. SMMRNA: a database of small molecule modulators of RNA. Nucleic Acids Res 2014;42:D132-41. [PMID: 24163098 PMCID: PMC3965028 DOI: 10.1093/nar/gkt976] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 08/15/2013] [Revised: 09/13/2013] [Accepted: 10/01/2013] [Indexed: 02/05/2023]

Branca RMM, Orre LM, Johansson HJ, Granholm V, Huss M, Pérez-Bercoff Å, Forshed J, Käll L, Lehtiö J. HiRIEF LC-MS enables deep proteome coverage and unbiased proteogenomics. Nat Methods 2013;11:59-62. [PMID: 24240322 DOI: 10.1038/nmeth.2732] [Citation(s) in RCA: 182] [Impact Index Per Article: 16.5] [Received: 03/22/2013] [Accepted: 10/08/2013] [Indexed: 11/09/2022]

Ruiz-Mirazo K, Briones C, de la Escosura A. Prebiotic Systems Chemistry: New Perspectives for the Origins of Life. Chem Rev 2013;114:285-366. [DOI: 10.1021/cr2004844] [Citation(s) in RCA: 563] [Impact Index Per Article: 51.2] [Indexed: 02/06/2023]

Muth T, Benndorf D, Reichl U, Rapp E, Martens L. Searching for a needle in a stack of needles: challenges in metaproteomics data analysis. Mol Biosyst 2013;9:578-85. [PMID: 23238088 DOI: 10.1039/c2mb25415h] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 12/18/2022]

Krug K, Carpy A, Behrends G, Matic K, Soares NC, Macek B. Deep coverage of the Escherichia coli proteome enables the assessment of false discovery rates in simple proteogenomic experiments. Mol Cell Proteomics 2013;12:3420-30. [PMID: 23908556 DOI: 10.1074/mcp.m113.029165] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 11/06/2022]

Abstract

Recent advances in mass spectrometry (MS) have led to increased applications of shotgun proteomics to the refinement of genome annotation. The typical "proteo-genomic" workflows rely on the mapping of peptide MS/MS spectra onto databases derived via six-frame translation of the genome sequence. These databases contain a large proportion of spurious protein sequences which make the statistical confidence of the resulting peptide spectrum matches difficult to assess. Here we performed a comprehensive analysis of the Escherichia coli proteome using LTQ-Orbitrap MS and mapped the corresponding MS/MS spectra onto a six-frame translation of the E. coli genome. We hypothesized that the protein-coding part of the E. coli genome approaches complete annotation and that the majority of six frame-specific (novel) peptide spectrum matches can be considered as false positive identifications. We confirm our hypothesis by showing that the posterior error probability distribution of novel hits is almost identical to that of reversed (decoy) hits; this enables us to estimate the sensitivity, specificity, accuracy, and false discovery rate in a typical bacterial proteo-genomic dataset. We use two complementary computational frameworks for processing and statistical assessment of MS/MS data: MaxQuant and Trans-Proteomic Pipeline. We show that MaxQuant achieves a more sensitive six-frame database search with an acceptable false discovery rate and is therefore well suited for global genome reannotation applications, whereas the Trans-Proteomic Pipeline achieves higher specificity and is well suited for high-confidence validation. The use of a small and well-annotated bacterial genome enables us to address genome coverage achieved in state-of-the-art bacterial proteomics: identified peptide sequences mapped to all expressed E. coli proteins but covered 31.7% of the protein-coding genome sequence. 
Our results show that false discovery rates can be substantially underestimated even in "simple" proteo-genomic experiments obtained by means of high-accuracy MS and point to the necessity of further improvements concerning the coverage of peptide sequences by MS-based methods.
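
As a rough illustration of the decoy-based error estimation this abstract refers to, the sketch below uses the common convention of estimating the false discovery rate as the ratio of decoy to target matches above a score threshold. It is not the procedure implemented in MaxQuant or the Trans-Proteomic Pipeline; the input layout (score, is_decoy) and the function names are assumptions, and higher scores are assumed to be better.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: (score, is_decoy) per peptide spectrum match.
PSM = Tuple[float, bool]


def decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR at a threshold as (# decoy hits) / (# target hits) with score >= threshold."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def score_cutoff_for_fdr(psms: List[PSM], max_fdr: float = 0.01) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr, or None."""
    best = None
    # Walk thresholds from strict to permissive, remembering the most permissive one that passes.
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if decoy_fdr(psms, threshold) <= max_fdr:
            best = threshold
    return best
```

The abstract's central point is that six-frame-specific hits in a well-annotated genome behave like decoys, so an estimate of this kind can be checked against them; a large six-frame database inflates the search space and can make the naive estimate too optimistic.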

Müller SA, Findeiß S, Pernitzsch SR, Wissenbach DK, Stadler PF, Hofacker IL, von Bergen M, Kalkhof S. Identification of new protein coding sequences and signal peptidase cleavage sites of Helicobacter pylori strain 26695 by proteogenomics. J Proteomics 2013;86:27-42. [PMID: 23665149 DOI: 10.1016/j.jprot.2013.04.036] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 10/30/2012] [Revised: 03/29/2013] [Accepted: 04/26/2013] [Indexed: 12/16/2022]

Abstract

Correct annotation of protein coding genes is the basis of conventional data analysis in proteomic studies. Nevertheless, most protein sequence databases almost exclusively rely on gene finding software and inevitably also miss protein annotations or possess errors. Proteogenomics tries to overcome these issues by matching MS data directly against a genome sequence database. Here we report an in-depth proteogenomics study of Helicobacter pylori strain 26695. MS data was searched against a combined database of the NCBI annotations and a six-frame translation of the genome. Database searches with Mascot and X! Tandem revealed 1115 proteins identified by at least two peptides with a peptide false discovery rate below 1%. This represents 71% of the predicted proteome. So far this is the most extensive proteome study of Helicobacter pylori. Our proteogenomic approach unambiguously identified four previously missed annotations and furthermore allowed us to correct sequences of six annotated proteins. Since secreted proteins are often involved in pathogenic processes we further investigated signal peptidase cleavage sites. By applying a database search that accommodates the identification of semi-specific cleaved peptides, 63 previously unknown signal peptides were detected. The motif LXA showed to be the predominant recognition sequence for signal peptidases.

BIOLOGICAL SIGNIFICANCE

The results of MS-based proteomic studies highly rely on correct annotation of protein coding genes which is the basis of conventional data analysis. However, the annotation of protein coding sequences in genomic data is usually based on gene finding software. These tools are limited in their prediction accuracy such as the problematic determination of exact gene boundaries. Thus, protein databases own partly erroneous or incomplete sequences. Additionally, some protein sequences might also be missing in the databases. Proteogenomics, a combination of proteomic and genomic data analyses, is well suited to detect previously not annotated proteins and to correct erroneous sequences. For this purpose, the existing database of the investigated species is typically supplemented with a six-frame translation of the genome. Here, we studied the proteome of the major human pathogen Helicobacter pylori that is responsible for many gastric diseases such as duodenal ulcers and gastric cancer. Our in-depth proteomic study highly reliably identified 1115 proteins (FDR<0.01%) by at least two peptides (FDR<1%) which represent 71% of the predicted proteome deposited at NCBI. The proteogenomic data analysis of our data set resulted in the unambiguous identification of four previously missed annotations, the correction of six annotated proteins as well as the detection of 63 previously unknown signal peptides. We have annotated proteins of particular biological interest like the ferrous iron transport protein A, the coiled-coil-rich protein HP0058 and the lipopolysaccharide biosynthesis protein HP0619. For instance, the protein HP0619 could be a drug target for the inhibition of the LPS synthesis pathway. Furthermore it has been proven that the motif "LXA" is the predominant recognition sequence for the signal peptidase I of H. pylori. Signal peptidases are essential enzymes for the viability of bacterial cells and are involved in pathogenesis. 
Therefore signal peptidases could be novel targets for antibiotics. The inclusion of the corrected and new annotated proteins as well as the information of signal peptide cleavage sites will help in the study of biological pathways involved in pathogenesis or drug response of H. pylori.
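
The six-frame translation used to supplement the annotated protein database can be sketched as follows. This is an illustrative example only, not the pipeline used in the study: it assumes the standard genetic code, an input restricted to A, C, G and T, and hypothetical function names. Codons containing other characters are rendered as X, and stop codons as *.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; trailing 1-2 bases are ignored."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

A search database would typically be built by splitting each frame at stop codons and keeping segments above a minimum length; that filtering step is omitted here.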

Verbeke TJ, Zhang X, Henrissat B, Spicer V, Rydzak T, Krokhin OV, Fristensky B, Levin DB, Sparling R. Genomic evaluation of Thermoanaerobacter spp. for the construction of designer co-cultures to improve lignocellulosic biofuel production. PLoS One 2013;8:e59362. [PMID: 23555660 PMCID: PMC3608648 DOI: 10.1371/journal.pone.0059362] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 10/25/2012] [Accepted: 02/13/2013] [Indexed: 02/07/2023]

Abstract

The microbial production of ethanol from lignocellulosic biomass is a multi-component process that involves biomass hydrolysis, carbohydrate transport and utilization, and finally, the production of ethanol. Strains of the genus Thermoanaerobacter have been studied for decades due to their innate abilities to produce comparatively high ethanol yields from hemicellulose constituent sugars. However, their inability to hydrolyze cellulose, limits their usefulness in lignocellulosic biofuel production. As such, co-culturing Thermoanaerobacter spp. with cellulolytic organisms is a plausible approach to improving lignocellulose conversion efficiencies and yields of biofuels. To evaluate native lignocellulosic ethanol production capacities relative to competing fermentative end-products, comparative genomic analysis of 11 sequenced Thermoanaerobacter strains, including a de novo genome, Thermoanaerobacter thermohydrosulfuricus WC1, was conducted. Analysis was specifically focused on the genomic potential for each strain to address all aspects of ethanol production mentioned through a consolidated bioprocessing approach. Whole genome functional annotation analysis identified three distinct clades within the genus. The genomes of Clade 1 strains encode the fewest extracellular carbohydrate active enzymes and also show the least diversity in terms of lignocellulose relevant carbohydrate utilization pathways. However, these same strains reportedly are capable of directing a higher proportion of their total carbon flux towards ethanol, rather than non-biofuel end-products, than other Thermoanaerobacter strains. Strains in Clade 2 show the greatest diversity in terms of lignocellulose hydrolysis and utilization, but proportionately produce more non-ethanol end-products than Clade 1 strains. Strains in Clade 3, in which T. thermohydrosulfuricus WC1 is included, show mid-range potential for lignocellulose hydrolysis and utilization, but also exhibit extensive divergence from both Clade 1 and Clade 2 strains in terms of cellular energetics. The potential implications regarding strain selection and suitability for industrial ethanol production through a consolidated bioprocessing co-culturing approach are examined throughout the manuscript.

Sigdel TK, Gao X, Sarwal MM. Protein and peptide biomarkers in organ transplantation. Biomark Med 2012;6:259-71. [PMID: 22731899 DOI: 10.2217/bmm.12.29] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022]

Zhang YE, Landback P, Vibranovski M, Long M. New genes expressed in human brains: implications for annotating evolving genomes. Bioessays 2012;34:982-91. [PMID: 23001763 DOI: 10.1002/bies.201200008] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 02/06/2023]

Wang M, Weiss M, Simonovic M, Haertinger G, Schrimpf SP, Hengartner MO, von Mering C. PaxDb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics 2012;11:492-500. [PMID: 22535208 PMCID: PMC3412977 DOI: 10.1074/mcp.o111.014704] [Citation(s) in RCA: 354] [Impact Index Per Article: 29.5] [Received: 10/07/2011] [Revised: 03/26/2012] [Indexed: 02/04/2023]

The proteomic future: where mass spectrometry should be taking us. Biochem J 2012;444:169-81. [PMID: 22574775 DOI: 10.1042/bj20110363] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 01/08/2023]

Pawar H, Sahasrabuddhe NA, Renuse S, Keerthikumar S, Sharma J, Kumar GSS, Venugopal A, Sekhar NR, Kelkar DS, Nemade H, Khobragade SN, Muthusamy B, Kandasamy K, Harsha HC, Chaerkady R, Patole MS, Pandey A. A proteogenomic approach to map the proteome of an unsequenced pathogen - Leishmania donovani. Proteomics 2012;12:832-44. [DOI: 10.1002/pmic.201100505] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]

Affiliation(s)

Harsh Pawar: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Rajiv Gandhi University of Health Sciences, Bangalore, Karnataka, India
Nandini A. Sahasrabuddhe: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Manipal University, Madhav Nagar, Manipal, Karnataka, India; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Santosh Renuse: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam, Kerala, India
Shivakumar Keerthikumar: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India
Jyoti Sharma: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Manipal University, Madhav Nagar, Manipal, Karnataka, India
Ghantasala. S. Sameer Kumar: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Department of Biotechnology, Kuvempu University, Shimoga, Karnataka, India
Abhilash Venugopal: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Department of Biotechnology, Kuvempu University, Shimoga, Karnataka, India
Nirujogi Raja Sekhar: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry, India
Dhanashree S. Kelkar: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Department of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam, Kerala, India
Harshal Nemade: National Centre for Cell Sciences, Pune, Maharashtra, India
Sweta N. Khobragade: National Centre for Cell Sciences, Pune, Maharashtra, India
Babylakshmi Muthusamy: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry, India
Kumaran Kandasamy: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India
H. C. Harsha: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India
Raghothama Chaerkady: Institute of Bioinformatics, International Technology Park, Bangalore, Karnataka, India; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Milind S. Patole: National Centre for Cell Sciences, Pune, Maharashtra, India
Akhilesh Pandey: McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA

Translational plant proteomics: a perspective. J Proteomics 2012;75:4588-601. [PMID: 22516432 DOI: 10.1016/j.jprot.2012.03.055] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 11/13/2011] [Revised: 02/25/2012] [Accepted: 03/25/2012] [Indexed: 11/21/2022]

Kim MS, Pandey A. Electron transfer dissociation mass spectrometry in proteomics. Proteomics 2012;12:530-42. [PMID: 22246976 DOI: 10.1002/pmic.201100517] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 09/27/2011] [Revised: 10/25/2011] [Accepted: 11/02/2011] [Indexed: 01/30/2023]

Rosenbloom KR, Dreszer TR, Long JC, Malladi VS, Sloan CA, Raney BJ, Cline MS, Karolchik D, Barber GP, Clawson H, Diekhans M, Fujita PA, Goldman M, Gravell RC, Harte RA, Hinrichs AS, Kirkup VM, Kuhn RM, Learned K, Maddren M, Meyer LR, Pohl A, Rhead B, Wong MC, Zweig AS, Haussler D, Kent WJ. ENCODE whole-genome data in the UCSC Genome Browser: update 2012. Nucleic Acids Res 2012;40:D912-7. [PMID: 22075998 PMCID: PMC3245183 DOI: 10.1093/nar/gkr1012] [Citation(s) in RCA: 209] [Impact Index Per Article: 17.4] [Received: 09/15/2011] [Revised: 10/18/2011] [Accepted: 10/20/2011] [Indexed: 11/23/2022] Open

Prasad TSK, Harsha HC, Keerthikumar S, Sekhar NR, Selvan LDN, Kumar P, Pinto SM, Muthusamy B, Subbannayya Y, Renuse S, Chaerkady R, Mathur PP, Ravikumar R, Pandey A. Proteogenomic Analysis of Candida glabrata using High Resolution Mass Spectrometry. J Proteome Res 2011;11:247-60. [DOI: 10.1021/pr200827k] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 01/23/2023]

Affiliation(s)

T. S. Keshava Prasad: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India; Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India; Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
H. C. Harsha: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
Shivakumar Keerthikumar: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
Nirujogi Raja Sekhar: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
Lakshmi Dhevi N. Selvan: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
Praveen Kumar: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
Sneha M. Pinto: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Manipal University, Madhav Nagar, Manipal, Karnataka 576104, India
Babylakshmi Muthusamy: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
Yashwanth Subbannayya: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Rajiv Gandhi University of Health Sciences, Jayanagar, Bangalore -560 041, India
Santosh Renuse: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India; Amrita School of Biotechnology, Amrita University, Kollam -690 525, India
Raghothama Chaerkady: Institute of Bioinformatics, International Technology Park, Bangalore -560 066, India
Premendu P. Mathur: Centre of Excellence in Bioinformatics, Bioinformatics Centre, School of Life Sciences, Pondicherry University, Puducherry -605 014, India
Raju Ravikumar: Department of Neuromicrobiology, National Institute of Mental Health and Neuro Sciences, Bangalore -560029, India
Akhilesh Pandey
