51
|
Durighello E, Christie-Oleza JA, Armengaud J. Assessing the exoproteome of marine bacteria, lesson from a RTX-toxin abundantly secreted by Phaeobacter strain DSM 17395. PLoS One 2014; 9:e89691. [PMID: 24586966 PMCID: PMC3933643 DOI: 10.1371/journal.pone.0089691] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 12/04/2013] [Accepted: 01/21/2014] [Indexed: 11/24/2022] Open
Abstract
Bacteria from the Roseobacter clade are abundant in surface marine ecosystems as over 10% of bacterial cells in the open ocean and 20% in coastal waters belong to this group. In order to document how these marine bacteria interact with their environment, we analyzed the exoproteome of Phaeobacter strain DSM 17395. We grew the strain in marine medium, collected the exoproteome and catalogued its content with high-throughput nanoLC-MS/MS shotgun proteomics. The major component represented 60% of the total protein content but was refractory to either classical proteomic identification or proteogenomics. We de novo sequenced this abundant protein with high-resolution tandem mass spectra which turned out being the 53 kDa RTX-toxin ZP_02147451. It comprised a peptidase M10 serralysin domain. We explained its recalcitrance to trypsin proteolysis and proteomic identification by its unusual low number of basic residues. We found this is a conserved trait in RTX-toxins from Roseobacter strains which probably explains their persistence in the harsh conditions around bacteria. Comprehensive analysis of exoproteomes from environmental bacteria should take into account this proteolytic recalcitrance.
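The link made in this abstract between a low number of basic residues and recalcitrance to trypsin can be illustrated with an in silico digest. The following is a minimal sketch, assuming the commonly used rule that trypsin cleaves after K or R except before P; the two sequences are hypothetical toy examples, not the RTX-toxin from the paper, and the function names are illustrative only.

```python
import re

def tryptic_peptides(sequence: str) -> list[str]:
    """In silico trypsin digest: cleave C-terminal to K or R, but not before P."""
    # Zero-width split: the position must follow K/R and must not precede P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def basic_fraction(sequence: str) -> float:
    """Fraction of residues that are lysine (K) or arginine (R)."""
    return (sequence.count("K") + sequence.count("R")) / len(sequence)

# Hypothetical toy sequences (assumptions for illustration only).
typical = "MKTAYIAKQRQISFVK"
low_basic = "MSDGTAVLNDGSAEITGDKADGSLT"

for name, seq in (("typical", typical), ("low_basic", low_basic)):
    peptides = tryptic_peptides(seq)
    print(name, f"{basic_fraction(seq):.2f}", peptides)
```

With few K/R sites, the digest yields fewer and longer peptides, which is one plausible reason such a protein could escape a standard trypsin-based shotgun workflow.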
Affiliation(s)
- Emie Durighello
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, France
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, France
|
52
|
Bland C, Hartmann EM, Christie-Oleza JA, Fernandez B, Armengaud J. N-Terminal-oriented proteogenomics of the marine bacterium Roseobacter denitrificans OCh114 using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP) labeling and diagonal chromatography. Mol Cell Proteomics 2014; 13:1369-81. [PMID: 24536027 DOI: 10.1074/mcp.o113.032854] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022] Open
Abstract
Given the ease of whole genome sequencing with next-generation sequencers, structural and functional gene annotation is now purely based on automated prediction. However, errors in gene structure are frequent, the correct determination of start codons being one of the main concerns. Here, we combine protein N termini derivatization using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP Ac-OSu) as a labeling reagent with the COmbined FRActional DIagonal Chromatography (COFRADIC) sorting method to enrich labeled N-terminal peptides for mass spectrometry detection. Protein digestion was performed in parallel with three proteases to obtain a reliable automatic validation of protein N termini. The analysis of these N-terminal enriched fractions by high-resolution tandem mass spectrometry allowed the annotation refinement of 534 proteins of the model marine bacterium Roseobacter denitrificans OCh114. This study is especially efficient regarding mass spectrometry analytical time. From the 534 validated N termini, 480 confirmed existing gene annotations, 41 highlighted erroneous start codon annotations, five revealed totally new mis-annotated genes; the mass spectrometry data also suggested the existence of multiple start sites for eight different genes, a result that challenges the current view of protein translation initiation. Finally, we identified several proteins for which classical genome homology-driven annotation was inconsistent, questioning the validity of automatic annotation pipelines and emphasizing the need for complementary proteomic data. All data have been deposited to the ProteomeXchange with identifier PXD000337.
Affiliation(s)
- Céline Bland
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, F-30207, France
|
53
|
Armengaud J, Trapp J, Pible O, Geffard O, Chaumot A, Hartmann EM. Non-model organisms, a species endangered by proteogenomics. J Proteomics 2014; 105:5-18. [PMID: 24440519 DOI: 10.1016/j.jprot.2014.01.007] [Citation(s) in RCA: 100] [Impact Index Per Article: 10.0] [Received: 11/25/2013] [Revised: 12/24/2013] [Accepted: 01/07/2014] [Indexed: 10/25/2022]
Abstract
Previously, large-scale proteomics was possible only for organisms whose genomes were sequenced, meaning the most common model organisms. The use of next-generation sequencers is now changing the deal. With "proteogenomics", the use of experimental proteomics data to refine genome annotations, a higher integration of omics data is gaining ground. By extension, combining genomic and proteomic data is becoming routine in many research projects. "Proteogenomic"-flavored approaches are currently expanding, enabling the molecular studies of non-model organisms at an unprecedented depth. Today draft genomes can be obtained using next-generation sequencers in a rather straightforward way and at a reasonable cost for any organism. Unfinished genome sequences can be used to interpret tandem mass spectrometry proteomics data without the need for time-consuming genome annotation, and the use of RNA-seq to establish nucleotide sequences that are directly translated into protein sequences appears promising. There are, however, certain drawbacks that deserve further attention for RNA-seq to become more efficient. Here, we discuss the opportunities of working with non-model organisms, the proteomic methods that have been used until now, and the dramatic improvements proffered by proteogenomics. These put the distinction between model and non-model organisms in great danger, at least in terms of proteomics! BIOLOGICAL SIGNIFICANCE: Model organisms have been crucial for in-depth analysis of cellular and molecular processes of life. Focusing the efforts of thousands of researchers on the Escherichia coli bacterium, Saccharomyces cerevisiae yeast, Arabidopsis thaliana plant, Danio rerio fish and other models for which genetic manipulation was possible was certainly worthwhile in terms of fundamental and invaluable biological insights.
Until recently, proteomics of non-model organisms was limited to tedious, homology-based techniques, but today draft genomes or RNA-seq data can be straightforwardly obtained using next-generation sequencers, allowing the establishment of a draft protein database for any organism. Thus, proteogenomics opens new perspectives for molecular studies of non-model organisms, although they are still difficult experimental organisms. This article is part of a Special Issue entitled: Proteomics of non-model organisms.
Affiliation(s)
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze F-30207, France.
- Judith Trapp
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze F-30207, France; Irstea, UR MALY, F-69626 Villeurbanne, France
- Olivier Pible
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze F-30207, France
- Erica M Hartmann
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze F-30207, France
|
54
|
Hartmann EM, Allain F, Gaillard JC, Pible O, Armengaud J. Taking the shortcut for high-throughput shotgun proteomic analysis of bacteria. Methods Mol Biol 2014; 1197:275-85. [PMID: 25172287 DOI: 10.1007/978-1-4939-1261-2_16] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Indexed: 12/12/2022]
Abstract
Currently, proteomic tools are able to establish a complete list of the most abundant proteins present in a sample, providing the opportunity to study at high resolution the physiology of any bacteria for which the genome sequence is available. For a comprehensive list, proteins should be first resolved into fractions that are then proteolyzed by trypsin. The resulting peptide mixtures are analyzed by a high-throughput tandem mass spectrometer that records thousands of MS/MS spectra for each fraction. These spectra are then assigned to peptides, which are used as evidence of the existence of proteins. In addition to generating a list of protein identifications, this shortcut to proteomics uses the number of spectra recorded for each protein to quantify the observations. Here, we describe one of the most simple sample preparation methods for high-throughput proteomics of bacteria, as well as the subsequent data processing to extract quantitative information based on the spectral count approach.
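The spectral count idea mentioned in this abstract (using the number of MS/MS spectra assigned to each protein as a quantitative readout) can be sketched as follows. The normalization shown here is the normalized spectral abundance factor (NSAF), a widely used length correction; whether the chapter uses this exact formula is an assumption, and the protein names and numbers are hypothetical.

```python
def nsaf(spectral_counts: dict[str, int], lengths: dict[str, int]) -> dict[str, float]:
    """Normalized spectral abundance factor: (SpC / L) divided by the sum of SpC / L."""
    # Divide each count by protein length so long proteins are not over-weighted.
    saf = {prot: spectral_counts[prot] / lengths[prot] for prot in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were assigned to any protein")
    return {prot: value / total for prot, value in saf.items()}

# Hypothetical example: equal spectral counts, different protein lengths.
counts = {"protA": 10, "protB": 10}
lengths = {"protA": 100, "protB": 300}
for prot, value in nsaf(counts, lengths).items():
    print(prot, round(value, 3))
```

Raw spectral counts can also be compared directly between conditions for the same protein; the length correction mainly matters when ranking different proteins within one sample.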
|
55
|
Bland C, Bellanger L, Armengaud J. Magnetic Immunoaffinity Enrichment for Selective Capture and MS/MS Analysis of N-Terminal-TMPP-Labeled Peptides. J Proteome Res 2013; 13:668-80. [DOI: 10.1021/pr400774z] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 02/04/2023]
Affiliation(s)
- Céline Bland
- DSV, IBEB, Lab Biochim System Perturb, CEA, Parc Technologique Marcel Boiteux, Bagnols-sur-Cèze F-30207, France
- Laurent Bellanger
- DSV, IBEB, Lab Ing Cellul Biotechnol, CEA, Parc Technologique Marcel Boiteux, Bagnols-sur-Cèze F-30207, France
- Jean Armengaud
- DSV, IBEB, Lab Biochim System Perturb, CEA, Parc Technologique Marcel Boiteux, Bagnols-sur-Cèze F-30207, France
|
56
|
de la Tour CB, Passot FM, Toueille M, Mirabella B, Guérin P, Blanchard L, Servant P, de Groot A, Sommer S, Armengaud J. Comparative proteomics reveals key proteins recruited at the nucleoid of Deinococcus after irradiation-induced DNA damage. Proteomics 2013; 13:3457-69. [PMID: 24307635 DOI: 10.1002/pmic.201300249] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 06/26/2013] [Revised: 10/19/2013] [Accepted: 10/23/2013] [Indexed: 11/09/2022]
Abstract
The nucleoids of radiation-resistant Deinococcus species show a high degree of compaction maintained after ionizing irradiation. We identified proteins recruited after irradiation in nucleoids of Deinococcus radiodurans and Deinococcus deserti by means of comparative proteomics. Proteins in nucleoid-enriched fractions from unirradiated and irradiated Deinococcus were identified and semiquantified by shotgun proteomics. The ssDNA-binding protein SSB, DNA gyrase subunits GyrA and GyrB, DNA topoisomerase I, RecA recombinase, UvrA excinuclease, RecQ helicase, DdrA, DdrB, and DdrD proteins were found in significantly higher amounts in irradiated nucleoids of both Deinococcus species. We observed, by immunofluorescence microscopy, the subcellular localization of these proteins in D. radiodurans, showing for the first time the recruitment of the DdrD protein into the D. radiodurans nucleoid. We specifically followed the kinetics of recruitment of RecA, DdrA, and DdrD to the nucleoid after irradiation. Remarkably, RecA proteins formed irregular filament-like structures 1 h after irradiation, before being redistributed throughout the cells by 3 h post-irradiation. Comparable dynamics of DdrD localization were observed, suggesting a possible functional interaction between RecA and DdrD. Several proteins involved in nucleotide synthesis were also seen in higher quantities in the nucleoids of irradiated cells, indicative of the existence of a mechanism for orchestrating the presence of proteins involved in DNA metabolism in nucleoids in response to massive DNA damage. All MS data have been deposited in the ProteomeXchange with identifier PXD000196 (http://proteomecentral.proteomexchange.org/dataset/PXD000196).
|
57
|
Hartmann EM, Armengaud J. Shotgun proteomics suggests involvement of additional enzymes in dioxin degradation by Sphingomonas wittichii RW1. Environ Microbiol 2013; 16:162-76. [PMID: 24118890 DOI: 10.1111/1462-2920.12264] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 06/10/2013] [Revised: 08/06/2013] [Accepted: 08/24/2013] [Indexed: 12/01/2022]
Abstract
Chlorinated congeners of dibenzo-p-dioxin and dibenzofuran are widely dispersed pollutants that can be treated using microorganisms, such as the Sphingomonas wittichii RW1 bacterium, able to transform some of them into non-toxic substances. The enzymes of the upper pathway for dibenzo-p-dioxin degradation in S. wittichii RW1 have been biochemically and genetically characterized, but its genome sequence indicated the existence of a tremendous potential for aromatic compound transformation, with 56 ring-hydroxylating dioxygenase subunits, 34 extradiol dioxygenases and 40 hydrolases. To further characterize this enzymatic arsenal, new methodological approaches should be employed. Here, a large shotgun proteomic survey was performed on cells grown on dibenzofuran, dibenzo-p-dioxin and 2-chlorodibenzo-p-dioxin, and compared with growth on acetate. Changes in the proteome were monitored over time. In total, 502 proteins were observed and quantified using a label-free mass spectrometry-based approach; all data were deposited to the ProteomeXchange (PXD000403). Our results confirmed the roles of the dioxin dioxygenase DxnA1A2, trihydroxybiphenyl dioxygenase DbfB, meta-cleavage product hydrolase DxnB and reductase RedA2, and corroborated the proposed involvement of the Swit_3046 dioxygenase and DxnB2 hydrolase. Trends across substrates and over the course of growth do not support concerted pathway regulation and suggest the involvement of an additional hydrolase and several TonB-dependent receptors.
Affiliation(s)
- Erica M Hartmann
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, F-30207, France
|
58
|
Christie-Oleza JA, Miotello G, Armengaud J. Proteogenomic definition of biomarkers for the large Roseobacter clade and application for a quick screening of new environmental isolates. J Proteome Res 2013; 12:5331-9. [PMID: 24044462 DOI: 10.1021/pr400554e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/17/2023]
Abstract
Whole-cell, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry has become a routine and reliable method for microbial characterization due to its simplicity, low cost, and high reproducibility. The identification of microbial isolates relies on the spectral resemblance of low-molecular-weight proteins to already-existing isolates within the databases. This is a gold standard for clinicians who have a finite number of well-defined pathogenic strains but represents a problem for environmental microbiologists with an overwhelming number of organisms to be defined. Here we set a milestone for implementing whole-cell MALDI-TOF mass spectrometry to identify isolates from the biosphere. To make this technique accessible for environmental studies, we propose to (i) define biomarkers that will always show up with an intense m/z signal in the MALDI-TOF spectra and (ii) create a database with all the possible m/z values that these biomarkers can generate to screen new isolates. We tested our method with the relevant marine Roseobacter lineage. The use of shotgun nanoLC-MS/MS proteomics on the small proteome fraction of nine Roseobacter strains and the proteogenomic toolbox helped us to identify potential biomarkers in terms of protein abundance and low variability among strains. We show that the DNA binding protein, HU, and the ribosomal proteins, L29 and L30, are the most robust biomarkers within the Roseobacter clade. The molecular weights of these three biomarkers, as for other conserved homologous proteins, vary due to sequence variation above the genus level. Therefore, we calculated the m/z values expected for each one of the known Roseobacter genera and tested our strategy during an extensive screening of natural marine isolates obtained from coastal waters of the Western Mediterranean Sea. The use of this technique versus standard sequencing methods is discussed.
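The step described in this abstract, calculating the m/z values expected for a biomarker protein from its sequence, can be sketched as below. This is a minimal illustration, assuming standard average residue masses and a protonated ion [M+zH]z+; the function names are hypothetical, and the sketch ignores processing events such as N-terminal methionine removal, which would shift the observed mass.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0513, "A": 71.0779, "S": 87.0773, "P": 97.1152, "V": 99.1311,
    "T": 101.1039, "C": 103.1429, "L": 113.1576, "I": 113.1576, "N": 114.1026,
    "D": 115.0874, "Q": 128.1292, "K": 128.1723, "E": 129.1140, "M": 131.1961,
    "H": 137.1393, "F": 147.1739, "R": 156.1857, "Y": 163.1733, "W": 186.2099,
}
WATER = 18.0153   # average mass of H2O added for the free termini
PROTON = 1.00728  # mass of a proton

def average_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER

def expected_mz(sequence: str, charge: int = 1) -> float:
    """Expected m/z of the protonated ion [M + zH]z+."""
    return (average_mass(sequence) + charge * PROTON) / charge

# Hypothetical short sequence, used only to show the call.
print(round(expected_mz("MKTAYIAK"), 2))
```

Running such a calculation over the homologous sequences of each genus would give the kind of m/z lookup table the abstract describes, under the stated assumptions.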
|
59
|
Zech H, Hensler M, Koßmehl S, Drüppel K, Wöhlbrand L, Trautwein K, Hulsch R, Maschmann U, Colby T, Schmidt J, Reinhardt R, Schmidt-Hohagen K, Schomburg D, Rabus R. Adaptation of Phaeobacter inhibens DSM 17395 to growth with complex nutrients. Proteomics 2013; 13:2851-68. [PMID: 23613352 DOI: 10.1002/pmic.201200513] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 11/09/2012] [Revised: 02/06/2013] [Accepted: 02/23/2013] [Indexed: 12/19/2022]
Abstract
Phaeobacter inhibens DSM 17395, a member of the Roseobacter clade, was studied for its adaptive strategies to complex and excess nutrient supply, here mimicked by cultivation with Marine Broth (MB). During growth in process-controlled fermenters, P. inhibens DSM 17395 grew faster (3.6-fold higher μmax ) and reached higher optical densities (2.2-fold) with MB medium, as compared to the reference condition of glucose-containing mineral medium. Apparently, in the presence of MB medium, metabolism was tuned to maximize growth rate at the expense of efficiency. Comprehensive proteomic analysis of cells harvested at ½ ODmax identified 1783 (2D DIGE, membrane and extracellular protein-enriched fractions, shotgun) different proteins (50.5% coverage), 315 (based on 2D DIGE) of which displayed differential abundance profiles. Moreover, 145 different metabolites (intra- and extracellular combined) were identified, almost all of which (140) showed abundance changes. During growth with MB medium, P. inhibens DSM 17395 specifically formed the various proteins required for utilization of phospholipids and several amino acids, as well as for gluconeogenesis. Metabolic tuning on amino acid utilization is also reflected by massive discharge of urea to dispose the cell of excess ammonia. Apparently, P. inhibens DSM 17395 modulated its metabolism to simultaneously utilize diverse substrates from the complex nutrient supply.
Affiliation(s)
- Hajo Zech
- Institute for Chemistry and Biology of the Marine Environment (ICBM), Carl von Ossietzky University Oldenburg, Oldenburg, Germany
|
60
|
Armengaud J, Christie-Oleza JA, Clair G, Malard V, Duport C. Exoproteomics: exploring the world around biological systems. Expert Rev Proteomics 2013. [PMID: 23194272 DOI: 10.1586/epr.12.52] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 12/11/2022]
Abstract
The term 'exoproteome' describes the protein content that can be found in the extracellular proximity of a given biological system. These proteins arise from cellular secretion, other protein export mechanisms or cell lysis, but only the most stable proteins in this environment will remain in abundance. It has been shown that these proteins reflect the physiological state of the cells in a given condition and are indicators of how living systems interact with their environments. High-throughput proteomic approaches based on a shotgun strategy, and high-resolution mass spectrometers, have modified the authors' view of exoproteomes. In the present review, the authors describe how these new approaches should be exploited to obtain the maximum useful information from a sample, whatever its origin. The methodologies used for studying secretion from model cell lines derived from eukaryotic, multicellular organisms, virulence determinants of pathogens and environmental bacteria and their relationships with their habitats are illustrated with several examples. The implication of such data, in terms of proteogenomics and the discovery of novel protein functions, is discussed.
Affiliation(s)
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, Bagnols-sur-Cèze, F-30207, France.
|
61
|
Müller SA, Findeiß S, Pernitzsch SR, Wissenbach DK, Stadler PF, Hofacker IL, von Bergen M, Kalkhof S. Identification of new protein coding sequences and signal peptidase cleavage sites of Helicobacter pylori strain 26695 by proteogenomics. J Proteomics 2013; 86:27-42. [PMID: 23665149 DOI: 10.1016/j.jprot.2013.04.036] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 10/30/2012] [Revised: 03/29/2013] [Accepted: 04/26/2013] [Indexed: 12/16/2022]
Abstract
Correct annotation of protein coding genes is the basis of conventional data analysis in proteomic studies. Nevertheless, most protein sequence databases almost exclusively rely on gene finding software and inevitably also miss protein annotations or possess errors. Proteogenomics tries to overcome these issues by matching MS data directly against a genome sequence database. Here we report an in-depth proteogenomics study of Helicobacter pylori strain 26695. MS data was searched against a combined database of the NCBI annotations and a six-frame translation of the genome. Database searches with Mascot and X! Tandem revealed 1115 proteins identified by at least two peptides with a peptide false discovery rate below 1%. This represents 71% of the predicted proteome. So far this is the most extensive proteome study of Helicobacter pylori. Our proteogenomic approach unambiguously identified four previously missed annotations and furthermore allowed us to correct sequences of six annotated proteins. Since secreted proteins are often involved in pathogenic processes we further investigated signal peptidase cleavage sites. By applying a database search that accommodates the identification of semi-specific cleaved peptides, 63 previously unknown signal peptides were detected. The motif LXA showed to be the predominant recognition sequence for signal peptidases. BIOLOGICAL SIGNIFICANCE: The results of MS-based proteomic studies highly rely on correct annotation of protein coding genes which is the basis of conventional data analysis. However, the annotation of protein coding sequences in genomic data is usually based on gene finding software. These tools are limited in their prediction accuracy such as the problematic determination of exact gene boundaries. Thus, protein databases own partly erroneous or incomplete sequences. Additionally, some protein sequences might also be missing in the databases.
Proteogenomics, a combination of proteomic and genomic data analyses, is well suited to detect previously not annotated proteins and to correct erroneous sequences. For this purpose, the existing database of the investigated species is typically supplemented with a six-frame translation of the genome. Here, we studied the proteome of the major human pathogen Helicobacter pylori that is responsible for many gastric diseases such as duodenal ulcers and gastric cancer. Our in-depth proteomic study highly reliably identified 1115 proteins (FDR<0.01%) by at least two peptides (FDR<1%) which represent 71% of the predicted proteome deposited at NCBI. The proteogenomic data analysis of our data set resulted in the unambiguous identification of four previously missed annotations, the correction of six annotated proteins as well as the detection of 63 previously unknown signal peptides. We have annotated proteins of particular biological interest like the ferrous iron transport protein A, the coiled-coil-rich protein HP0058 and the lipopolysaccharide biosynthesis protein HP0619. For instance, the protein HP0619 could be a drug target for the inhibition of the LPS synthesis pathway. Furthermore it has been proven that the motif "LXA" is the predominant recognition sequence for the signal peptidase I of H. pylori. Signal peptidases are essential enzymes for the viability of bacterial cells and are involved in pathogenesis. Therefore signal peptidases could be novel targets for antibiotics. The inclusion of the corrected and new annotated proteins as well as the information of signal peptide cleavage sites will help in the study of biological pathways involved in pathogenesis or drug response of H. pylori.
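The six-frame translation that this abstract describes as a supplement to the annotated protein database can be sketched as follows. This is a minimal illustration, assuming the standard codon assignments (which bacterial translation table 11 shares for amino acids and stop codons); the function names and the short DNA string are hypothetical, and a real pipeline would additionally split each frame at stop codons into candidate ORFs.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))

def six_frames(dna: str) -> dict[str, str]:
    """Translations of the three forward (+1..+3) and three reverse (-1..-3) frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

# Hypothetical toy input, not a real H. pylori sequence.
for frame, protein in six_frames("ATGGCCTAA").items():
    print(frame, protein)
```

Peptides that match only a six-frame entry, and no annotated protein, are the candidates for missed or mis-delimited genes.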
Affiliation(s)
- Stephan A Müller
- Department of Proteomics, UFZ, Helmholtz-Centre for Environmental Research Leipzig, 04318 Leipzig, Germany
|
62
|
Christie-Oleza JA, Piña-Villalonga JM, Guerin P, Miotello G, Bosch R, Nogales B, Armengaud J. Shotgun nanoLC-MS/MS proteogenomics to document MALDI-TOF biomarkers for screening new members of the Ruegeria genus. Environ Microbiol 2012; 15:133-47. [DOI: 10.1111/j.1462-2920.2012.02812.x] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/26/2022]
|
63
|
Abstract
High-throughput identification of proteins with the latest generation of hybrid high-resolution mass spectrometers is opening new perspectives in microbiology. I present, here, an overview of tandem mass spectrometry technology and bioinformatics for shotgun proteomics that make 2D-PAGE approaches obsolete. Non-labelling quantitative approaches have become more popular than labelling techniques on most proteomic platforms because they are easier to carry out while their quantitative outcome is rather robust. Parameters for recording mass spectrometry data, however, need to be chosen carefully and statistics to assess the confidence of the results should not be neglected. Interestingly, next-generation sequencing methodologies make any microbial model quickly amenable to proteomics, leading to the documentation of a wide range of organisms from diverse environments. Some recent discoveries made using microbial proteomics have challenged some biological dogma, such as: (i) initiation of the translation does not occur predominantly from ATG codons in some microorganisms, (ii) non-canonical initiation codons are used to regulate the production of specific but important proteins and (iii) a gene may code for multiple polypeptide species, heterogeneous in terms of sequences. Microbial diversity and microbial physiology can now be revisited by means of exhaustive comparative proteomic surveys where thousands of proteins are detected and quantified. Proteogenomics, consisting of better annotating of genomes with the help of proteomic evidence, is paving the way for integrated multi-omic approaches in microbiology. Finally, meta-proteomic tools and approaches are emerging for tackling the high complexity of the microbial world as a whole, opening new perspectives for assessing how microbial communities function.
Affiliation(s)
- Jean Armengaud
- CEA, DSV, IBEB, Lab Biochim System Perturb, F-30207 Bagnols-sur-Cèze, France.
|
64
|
Christie-Oleza JA, Miotello G, Armengaud J. High-throughput proteogenomics of Ruegeria pomeroyi: seeding a better genomic annotation for the whole marine Roseobacter clade. BMC Genomics 2012; 13:73. [PMID: 22336032 PMCID: PMC3305630 DOI: 10.1186/1471-2164-13-73] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 07/25/2011] [Accepted: 02/15/2012] [Indexed: 11/10/2022] Open
Abstract
Background: The structural and functional annotation of genomes is now heavily based on data obtained using automated pipeline systems. The key for an accurate structural annotation consists of blending similarities between closely related genomes with biochemical evidence of the genome interpretation. In this work we applied high-throughput proteogenomics to Ruegeria pomeroyi, a member of the Roseobacter clade, an abundant group of marine bacteria, as a seed for the annotation of the whole clade. Results: A large dataset of peptides from R. pomeroyi was obtained after searching over 1.1 million MS/MS spectra against a six-frame translated genome database. We identified 2006 polypeptides, of which thirty-four were encoded by open reading frames (ORFs) that had not previously been annotated. From the pool of 'one-hit-wonders', i.e. those ORFs specified by only one peptide detected by tandem mass spectrometry, we could confirm the probable existence of five additional new genes after proving that the corresponding RNAs were transcribed. We also identified the most-N-terminal peptide of 486 polypeptides, of which sixty-four had originally been wrongly annotated. Conclusions: By extending these re-annotations to the other thirty-six Roseobacter isolates sequenced to date (twenty different genera), we propose the correction of the assigned start codons of 1082 homologous genes in the clade. In addition, we also report the presence of novel genes within operons encoding determinants of the important tricarboxylic acid cycle, a feature that seems to be characteristic of some Roseobacter genomes. The detection of their corresponding products in large amounts raises the question of their function. Their discoveries point to a possible theory for protein evolution that will rely on high expression of orphans in bacteria: their putative poor efficiency could be counterbalanced by a higher level of expression.
Our proteogenomic analysis will increase the reliability of the future annotation of marine bacterial genomes.
|