51
|
Lacerda CMR, Reardon KF. Environmental proteomics: applications of proteome profiling in environmental microbiology and biotechnology. Briefings in Functional Genomics and Proteomics 2009; 8:75-87. [PMID: 19279070 DOI: 10.1093/bfgp/elp005] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Indexed: 01/05/2023]
Abstract
In this review, we present the use of proteomics to advance knowledge in the field of environmental biotechnology, including studies of bacterial physiology, metabolism and ecology. Bacteria are widely applied in environmental biotechnology for their ability to catalyze dehalogenation, methanogenesis, denitrification and sulfate reduction, among others. Their tolerance to radiation and toxic compounds is also of importance. Proteomics has an important role in helping uncover the pathways behind these cellular processes. Environmental samples are often highly complex, which makes proteome studies in this field especially challenging. Some of these challenges are the lack of genome sequences for the vast majority of environmental bacteria, difficulties in isolating bacteria and proteins from certain environments, and the presence of complex microbial communities. Despite these challenges, proteomics offers a unique dynamic view into cellular function. We present examples of environmental proteomics of model organisms, and then discuss metaproteomics (microbial community proteomics), which has the potential to provide insights into the function of a community without isolating organisms. Finally, the environmental proteomics literature is summarized as it pertains to the specific application areas of wastewater treatment, metabolic engineering, microbial ecology and environmental stress responses.
Affiliation(s)
- Carla M R Lacerda
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523-1370, USA
|
52
|
Louie B, Tarczy-Hornoch P, Higdon R, Kolker E. Validating annotations for uncharacterized proteins in Shewanella oneidensis. OMICS: A Journal of Integrative Biology 2008; 12:211-5. [PMID: 18687039 DOI: 10.1089/omi.2008.0051] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Proteins of unknown function are a barrier to our understanding of molecular biology. Assigning function to these "uncharacterized" proteins is imperative, but challenging. The usual approach is similarity searches using annotation databases, which are useful for predicting function. However, since the performance of these databases on uncharacterized proteins is basically unknown, the accuracy of their predictions is suspect, making annotation difficult. To address this challenge, we developed a benchmark annotation dataset of 30 proteins in Shewanella oneidensis. The proteins in the dataset were originally uncharacterized after the initial annotation of the S. oneidensis proteome in 2002. In the intervening 5 years, the accumulation of new experimental evidence has enabled specific functions to be predicted. We utilized this benchmark dataset to evaluate several commonly utilized annotation databases. According to our criteria, six annotation databases accurately predicted functions for at least 60% of proteins in our dataset. Two of these six even had a "conditional accuracy" of 90%. Conditional accuracy is another evaluation metric we developed which excludes results from databases where no function was predicted. Also, 27 of the 30 proteins' functions were correctly predicted by at least one database. These represent one of the first performance evaluations of annotation databases on uncharacterized proteins. Our evaluation indicates that these databases readily incorporate new information and are accurate in predicting functions for uncharacterized proteins, provided that experimental function evidence exists.
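The two metrics named in this abstract can be sketched as follows. This is a minimal illustration that assumes each benchmark protein receives one of three outcomes per database (correct, incorrect, or no prediction); the function names and the example counts are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence

# One outcome per benchmark protein for a given annotation database:
# True = correct prediction, False = incorrect prediction, None = no function predicted.
Outcome = Optional[bool]


def accuracy(outcomes: Sequence[Outcome]) -> float:
    """Correct predictions divided by all benchmark proteins."""
    if not outcomes:
        raise ValueError("no outcomes supplied")
    return sum(o is True for o in outcomes) / len(outcomes)


def conditional_accuracy(outcomes: Sequence[Outcome]) -> float:
    """Correct predictions divided by proteins for which a function was predicted."""
    predicted = [o for o in outcomes if o is not None]
    if not predicted:
        raise ValueError("database made no predictions")
    return sum(o is True for o in predicted) / len(predicted)


# Hypothetical result for one database on a 30-protein benchmark
# (illustrative counts only, not data from the study).
example = [True] * 18 + [False] * 2 + [None] * 10
print(accuracy(example))              # 18 correct out of 30 proteins
print(conditional_accuracy(example))  # 18 correct out of 20 predictions made
```

Excluding the "no prediction" cases is what lets a database that answers rarely but reliably score higher on conditional accuracy than on plain accuracy.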
Affiliation(s)
- Brenton Louie
- Department of Medical Education and Biomedical Informatics, Division of Biomedical and Health Informatics, University of Washington, Seattle, Washington 98101-1304, USA
|
53
|
Toes ACM, Daleke MH, Kuenen JG, Muyzer G. Expression of copA and cusA in Shewanella during copper stress. Microbiology-SGM 2008; 154:2709-2718. [PMID: 18757804 DOI: 10.1099/mic.0.2008/016857-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Copper homeostasis is tightly regulated in all living cells as a result of the necessity and toxicity of this metal in free cationic form. In Gram-negative bacteria CPx-type ATPases (e.g. CopA in Escherichia coli) and heavy-metal efflux RND proteins (e.g. CusA in E. coli) play an important role in transport of copper across the cytoplasmic and outer membrane. We investigated the expression of CusA- and CopA-like proteins in Shewanella oneidensis MR1 and Shewanella strain MB4, a Mn(IV)-reducing isolate from a metal-polluted harbour sediment. Q-PCR analysis of total mRNA extracted from cultures grown under aerobic conditions with 25 µM copper showed significantly increased expression of cusA (Student's t-test: MR1, P<0.0001; MB4, P=0.0006). This gene was also induced in the presence of 100 µM copper and 10 or 25 µM cadmium in both tested strains. In the absence of oxygen, with fumarate as final electron acceptor and 100 µM copper, a prolonged lag phase (5 h) was observed and general fitness decreased as evidenced by twofold lower copy numbers of 16S rRNA compared to aerobic conditions. cusA expression in cells grown under these conditions remained at comparable levels (MR1) or was slightly decreased (MB4), compared to aerobic copper challenges. A gene homologous to the copA gene of S. oneidensis was not detected in strain MB4. Although low copA copy numbers were observed in strain MR1 under conditions with 25 and 100 µM copper, copA was not detected in mRNA from cultures grown in the presence of 10 µM cadmium, or in the absence of added heavy metals. However, copA was highly induced under anaerobic conditions with 100 µM copper (P=0.0011). These results suggest essentially different roles for the two proteins CopA and CusA in the copper response in S. oneidensis MR1, similar to findings in more metal-resistant bacteria such as Escherichia coli and Cupriavidus metallidurans.
Affiliation(s)
- Ann-Charlotte M Toes
- Department of Biotechnology, Delft University of Technology, Julianalaan 67, NL-2628 BC Delft, The Netherlands
- Maria H Daleke
- Department of Biotechnology, Delft University of Technology, Julianalaan 67, NL-2628 BC Delft, The Netherlands
- J Gijs Kuenen
- Department of Biotechnology, Delft University of Technology, Julianalaan 67, NL-2628 BC Delft, The Netherlands
- Gerard Muyzer
- Department of Biotechnology, Delft University of Technology, Julianalaan 67, NL-2628 BC Delft, The Netherlands
|
54
|
Gao H, Pattison D, Yan T, Klingeman DM, Wang X, Petrosino J, Hemphill L, Wan X, Leaphart AB, Weinstock GM, Palzkill T, Zhou J. Generation and validation of a Shewanella oneidensis MR-1 clone set for protein expression and phage display. PLoS One 2008; 3:e2983. [PMID: 18714347 PMCID: PMC2500165 DOI: 10.1371/journal.pone.0002983] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 06/20/2008] [Accepted: 07/28/2008] [Indexed: 12/02/2022] Open
Abstract
A comprehensive gene collection for S. oneidensis was constructed using the lambda recombinase (Gateway) cloning system. A total of 3584 individual ORFs (85%) have been successfully cloned into the entry plasmids. To validate the use of the clone set, three sets of ORFs were examined within three different destination vectors constructed in this study. Success rates for heterologous protein expression of S. oneidensis His- or His/GST- tagged proteins in E. coli were approximately 70%. The ArcA and NarP transcription factor proteins were tested in an in vitro binding assay to demonstrate that functional proteins can be successfully produced using the clone set. Further functional validation of the clone set was obtained from phage display experiments in which a phage encoding thioredoxin was successfully isolated from a pool of 80 different clones after three rounds of biopanning using immobilized anti-thioredoxin antibody as a target. This clone set complements existing genomic (e.g., whole-genome microarray) and other proteomic tools (e.g., mass spectrometry-based proteomic analysis), and facilitates a wide variety of integrated studies, including protein expression, purification, and functional analyses of proteins both in vivo and in vitro.
Affiliation(s)
- Haichun Gao
- Institute for Environmental Genomics, University of Oklahoma, Norman, Oklahoma, United States of America
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Donna Pattison
- Baylor College of Medicine, Houston, Texas, United States of America
- Tingfen Yan
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Dawn M. Klingeman
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Xiaohu Wang
- Baylor College of Medicine, Houston, Texas, United States of America
- Joseph Petrosino
- Baylor College of Medicine, Houston, Texas, United States of America
- Lisa Hemphill
- Baylor College of Medicine, Houston, Texas, United States of America
- Xiufeng Wan
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Adam B. Leaphart
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- Timothy Palzkill
- Baylor College of Medicine, Houston, Texas, United States of America
- * E-mail: (TP); (JZ)
- Jizhong Zhou
- Institute for Environmental Genomics, University of Oklahoma, Norman, Oklahoma, United States of America
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States of America
- * E-mail: (TP); (JZ)
|
55
|
Abstract
While hundreds of microbial genomes are sequenced, the challenge remains to define their cis-regulatory maps. Here, we present a comparative genomic analysis of the cis-regulatory map of Shewanella oneidensis, an important model organism for bioremediation because of its extraordinary abilities to use a wide variety of metals and organic molecules as electron acceptors in respiration. First, from the experimentally verified transcriptional regulatory networks of Escherichia coli, we inferred 24 DNA motifs that are conserved in S. oneidensis. We then applied a new comparative approach on five Shewanella genomes that allowed us to systematically identify 194 nonredundant palindromic DNA motifs and corresponding regulons in S. oneidensis. Sixty-four percent of the predicted motifs are conserved in at least three of the seven newly sequenced and distantly related Shewanella genomes. In total, we obtained 209 unique DNA motifs in S. oneidensis that cover 849 unique transcription units. Besides conservation in other genomes, 77 of these motifs are supported by at least one additional type of evidence, including matching to known transcription factor binding motifs and significant functional enrichment or expression coherence of the corresponding target genes. Using the same approach on a more focused gene set, 990 differentially expressed genes derived from published microarray data of S. oneidensis during exposure to metal ions, we identified 31 putative cis-regulatory motifs (16 with at least one type of additional supporting evidence) that are potentially involved in the process of metal reduction. The majority (18/31) of those motifs had been found in our whole-genome comparative approach, further demonstrating that such an approach is capable of uncovering a large fraction of the regulatory map of a genome even in the absence of experimental data. 
The integrated computational approach developed in this study provides a useful strategy to identify genome-wide cis-regulatory maps and a novel avenue to explore the regulatory pathways for particular biological processes in bacterial systems.
Affiliation(s)
- Jiajian Liu
- Department of Genetics, Washington University School of Medicine, 660 S Euclid, Box 8232, St Louis, MO 63110, USA
|
56
|
Garfin DE. 24th Annual Meeting of the American Electrophoresis Society. Expert Rev Proteomics 2008; 5:385-7. [PMID: 18532905 DOI: 10.1586/14789450.5.3.385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Presentations at the 2007 meeting of the American Electrophoresis Society covered many aspects of this key separation technology. In total there were three plenary speakers, two invited talks, 85 technical talks and 14 posters in a 5-day meeting. The three plenary speakers presented their work with each of them discussing somewhat different multiplexed proteomics approaches. The invited speakers discussed ways to improve resolution and shorten running times in proteomic and genomic separations. The proteomics technical talks described applications of 1D and 2D gel electrophoresis, capillary electrophoresis and micro-scale platforms. This report is limited to a small number of those presentations that discussed proteomics directly.
Affiliation(s)
- David E Garfin
- American Electrophoresis Society, 1563 Solano Avenue, #341, Berkeley, CA 94707, USA.
|
57
|
Paša-Tolić L, Jacobs JM, Qian WJ, Smith RD. Quantitative Proteomics Using Nano-LC with High Accuracy Mass Spectrometry. Clin Proteomics 2008. [DOI: 10.1002/9783527622153.ch7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
|
58
|
Rohmer L, Guina T, Chen J, Gallis B, Taylor GK, Shaffer SA, Miller SI, Brittnacher MJ, Goodlett DR. Determination and Comparison of the Francisella tularensis subsp. novicida U112 Proteome to Other Bacterial Proteomes. J Proteome Res 2008; 7:2016-24. [DOI: 10.1021/pr700760z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Affiliation(s)
- Laurence Rohmer
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Tina Guina
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Jinzhi Chen
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Byron Gallis
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Greg K. Taylor
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Scott A. Shaffer
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Samuel I. Miller
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- Mitchell J. Brittnacher
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
- David R. Goodlett
- Department of Genome Sciences, Microbiology, Medicine, Medicinal Chemistry, and Department of Pediatrics, Division of Infectious Diseases, University of Washington, Seattle, Washington 98195
|
59
|
Higdon R, Hogan JM, Kolker N, van Belle G, Kolker E. Experiment-specific estimation of peptide identification probabilities using a randomized database. OMICS: A Journal of Integrative Biology 2007; 11:351-65. [PMID: 18092908 DOI: 10.1089/omi.2007.0040] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
Determining the error rate for peptide and protein identification accurately and reliably is necessary to enable evaluation and crosscomparisons of high throughput proteomics experiments. Currently, peptide identification is based either on preset scoring thresholds or on probabilistic models trained on datasets that are often dissimilar to experimental results. The false discovery rates (FDR) and peptide identification probabilities for these preset thresholds or models often vary greatly across different experimental treatments, organisms, or instruments used in specific experiments. To overcome these difficulties, randomized databases have been used to estimate the FDR. However, the cumulative FDR may include low probability identifications when there are a large number of peptide identifications and exclude high probability identifications when there are few. To overcome this logical inconsistency, this study expands the use of randomized databases to generate experiment-specific estimates of peptide identification probabilities. These experiment-specific probabilities are generated by logistic and Loess regression models of the peptide scores obtained from original and reshuffled database matches. These experiment-specific probabilities are shown to very well approximate "true" probabilities based on known standard protein mixtures across different experiments. Probabilities generated by the earlier Peptide_Prophet and more recent LIPS models are shown to differ significantly from this study's experiment-specific probabilities, especially for unknown samples. The experiment-specific probabilities reliably estimate the accuracy of peptide identifications and overcome potential logical inconsistencies of the cumulative FDR. This estimation method is demonstrated using a Sequest database search, LIPS model, and a reshuffled database. 
However, this approach is generally applicable to any search algorithm, peptide scoring, and statistical model when using a randomized database.
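The cumulative FDR that this abstract takes as its starting point can be sketched as below. This is a simplified illustration, not the paper's method: the study fits logistic and Loess regression models to the scores, whereas this sketch only counts matches above a threshold. It assumes the randomized database is the same size as the real one, so decoy hits estimate false target hits one for one; the function name and the scores are hypothetical.

```python
from typing import Sequence


def cumulative_fdr(target_scores: Sequence[float],
                   decoy_scores: Sequence[float],
                   threshold: float) -> float:
    """Estimate the FDR among target matches scoring at or above the threshold.

    Assumes an equal-sized randomized (decoy) database, so the number of decoy
    matches above the threshold estimates the number of false target matches.
    """
    targets = sum(s >= threshold for s in target_scores)
    decoys = sum(s >= threshold for s in decoy_scores)
    if targets == 0:
        return 0.0  # nothing accepted, so nothing can be a false discovery
    return min(1.0, decoys / targets)


# Hypothetical peptide scores (illustrative only).
targets = [5.0, 4.0, 3.0, 2.0, 1.0]
decoys = [2.5, 1.5, 0.5]
print(cumulative_fdr(targets, decoys, 2.0))  # 1 decoy vs 4 targets
print(cumulative_fdr(targets, decoys, 3.0))  # 0 decoys vs 3 targets
```

Because this estimate is a single rate for everything above the cutoff, it says nothing about how likely any individual match is to be correct, which is the gap the experiment-specific probabilities are meant to close.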
Affiliation(s)
- Roger Higdon
- Seattle Children's Hospital and Regional Medical Center, Seattle, WA 98101, USA
|
60
|
Abstract
The shewanellae are aquatic microorganisms with worldwide distribution. Their hallmark features include unparalleled respiratory diversity and the capacity to thrive at low temperatures. As a genus the shewanellae are physiologically diverse, and this review provides an overview of the varied roles they serve in the environment and describes what is known about how they might survive in such extreme and harsh environments. In light of their fascinating physiology, these organisms have several biotechnological uses, from bioremediation of chlorinated compounds, radionuclides, and other environmental pollutants to energy-generating biocatalysis. The ecology and biotechnology of these organisms are intertwined, with genomics playing a key role in our understanding of their physiology.
Affiliation(s)
- Heidi H Hau
- Department of Microbiology and The BioTechnology Institute, University of Minnesota, St. Paul, Minnesota 55108, USA
|
61
|
Kolker E, Hogan JM, Higdon R, Kolker N, Landorf E, Yakunin AF, Collart FR, van Belle G. Development of BIATECH-54 standard mixtures for assessment of protein identification and relative expression. Proteomics 2007; 7:3693-8. [PMID: 17890649 DOI: 10.1002/pmic.200700088] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
Mixtures of known proteins have been very useful in the assessment and validation of methods for high-throughput (HTP) MS (MS/MS) proteomics experiments. However, these test mixtures have generally consisted of few proteins at near equal concentration or of a single protein at varied concentrations. Such mixtures are too simple to effectively assess the validity of error rates for protein identification and differential expression in HTP MS/MS studies. This work aimed at overcoming these limitations and simulating studies of complex biological samples. We introduced a pair of 54-protein standard mixtures of variable concentrations with up to a 1000-fold dynamic range in concentration and up to ten-fold expression ratios with additional negative controls (infinite expression ratios). These test mixtures comprised 16 off-the-shelf Sigma-Aldrich proteins and 38 Shewanella oneidensis proteins produced in-house. The standard proteins were systematically distributed into three main concentration groups (high, medium, and low) and then the concentrations were varied differently for each mixture within the groups to generate different expression ratios. The mixtures were analyzed with both low mass accuracy LCQ and high mass accuracy FT-LTQ instruments. In addition, these 54 standard proteins closely follow the molecular weight distributions of both bacterial and human proteomes. As a result, these new standard mixtures allow for a much more realistic assessment of approaches for protein identification and label-free differential expression than previous mixtures. Finally, methodology and experimental design developed in this work can be readily applied in future to development of more complex standard mixtures for HTP proteomics studies.
|
62
|
Gupta N, Tanner S, Jaitly N, Adkins JN, Lipton M, Edwards R, Romine M, Osterman A, Bafna V, Smith RD, Pevzner PA. Whole proteome analysis of post-translational modifications: applications of mass-spectrometry for proteogenomic annotation. Genome Res 2007; 17:1362-77. [PMID: 17690205 PMCID: PMC1950905 DOI: 10.1101/gr.6427907] [Citation(s) in RCA: 159] [Impact Index Per Article: 9.4] [Received: 02/22/2007] [Accepted: 06/12/2007] [Indexed: 11/24/2022]
Abstract
While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages, and cleavages of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs, and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.
Affiliation(s)
- Nitin Gupta
- Bioinformatics Program, University of California San Diego, La Jolla, California 92093, USA.
|
63
|
Affiliation(s)
- Dmitrij Frishman
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany
|
64
|
Bretschger O, Obraztsova A, Sturm CA, Chang IS, Gorby YA, Reed SB, Culley DE, Reardon CL, Barua S, Romine MF, Zhou J, Beliaev AS, Bouhenni R, Saffarini D, Mansfeld F, Kim BH, Fredrickson JK, Nealson KH. Current production and metal oxide reduction by Shewanella oneidensis MR-1 wild type and mutants. Appl Environ Microbiol 2007; 73:7003-12. [PMID: 17644630 PMCID: PMC2074945 DOI: 10.1128/aem.01087-07] [Citation(s) in RCA: 361] [Impact Index Per Article: 21.2] [Indexed: 11/20/2022] Open
Abstract
Shewanella oneidensis MR-1 is a gram-negative facultative anaerobe capable of utilizing a broad range of electron acceptors, including several solid substrates. S. oneidensis MR-1 can reduce Mn(IV) and Fe(III) oxides and can produce current in microbial fuel cells. The mechanisms that are employed by S. oneidensis MR-1 to execute these processes have not yet been fully elucidated. Several different S. oneidensis MR-1 deletion mutants were generated and tested for current production and metal oxide reduction. The results showed that a few key cytochromes play a role in all of the processes but that their degrees of participation in each process are very different. Overall, these data suggest a very complex picture of electron transfer to solid and soluble substrates by S. oneidensis MR-1.
Affiliation(s)
- Orianna Bretschger
- Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, California 90089, USA
|
65
|
Kang H, Pasa-Tolić L, Smith RD. Targeted tandem mass spectrometry for high-throughput comparative proteomics employing NanoLC-FTICR MS with external ion dissociation. Journal of the American Society for Mass Spectrometry 2007; 18:1332-43. [PMID: 17531500 DOI: 10.1016/j.jasms.2007.04.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 02/27/2007] [Revised: 04/06/2007] [Accepted: 04/11/2007] [Indexed: 05/15/2023]
Abstract
Targeted tandem mass spectrometry (MS/MS) is an attractive proteomic approach that allows selective identification of peptides exhibiting abundance differences, e.g., between culture conditions and/or diseased states. Herein, we report on a targeted LC-MS/MS capability realized with a hybrid quadrupole-7 tesla Fourier transform ion cyclotron resonance (FTICR) mass spectrometer that provides data-dependent ion selection, accumulation, and dissociation external to the ICR trap, and a control software that directs intelligent MS/MS target selection based on LC elution time and m/z ratio. We show that the continuous on-the-fly alignment of the LC elution time during the targeted LC-MS/MS experiment, combined with the high mass resolution of FTICR MS, is crucial for accurate selection of targets, whereas high mass measurement accuracy MS/MS data facilitate unambiguous peptide identifications. Identification of a subset of differentially abundant proteins from Shewanella oneidensis grown under suboxic versus aerobic conditions demonstrates the feasibility of such approach.
Affiliation(s)
- Hyuk Kang
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, USA
|
66
|
Denef VJ, Shah MB, Verberkmoes NC, Hettich RL, Banfield JF. Implications of Strain- and Species-Level Sequence Divergence for Community and Isolate Shotgun Proteomic Analysis. J Proteome Res 2007; 6:3152-61. [PMID: 17602579 DOI: 10.1021/pr0701005] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
The recent surge in microbial genomic sequencing, combined with the development of high-throughput liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question of the extent to which genomic information of one strain or environmental sample can be used to profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how shotgun proteomics is affected by amino acid divergence between the sample and the genomic database using a probability-based model and a random mutation simulation model constrained by experimental data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI) varying between 76 and 98%. We compared the predictions to experimental protein identification levels for a sample that was evaluated using a database that included genomic information for the dominant organism and for a closely related variant (95% AAI). The range of models set the boundaries at which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between orthologs in the sample and database. Consistent with this prediction, experimental data indicated loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI. Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species false positives.
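The reported 6.4% loss of initial protein coverage per 1% amino acid divergence can be turned into a quick back-of-the-envelope calculation. The sketch below assumes a straight-line relationship between divergence and coverage, which is a simplification chosen here for illustration; the function name is hypothetical. A strictly linear reading leaves roughly 10% coverage at 86% AAI rather than none, so the abstract's figures presumably come from a fit that is not described in this text.

```python
# Fractional loss of initial protein coverage per 1% amino acid divergence,
# as quoted in the abstract. Treating it as a constant slope is an assumption.
LOSS_PER_PERCENT_DIVERGENCE = 0.064


def remaining_coverage(aai_percent: float) -> float:
    """Fraction of initial protein coverage expected at a given AAI (0-100)."""
    if not 0.0 <= aai_percent <= 100.0:
        raise ValueError("AAI must be between 0 and 100 percent")
    divergence = 100.0 - aai_percent
    # Coverage cannot drop below zero, so clamp the linear estimate.
    return max(0.0, 1.0 - LOSS_PER_PERCENT_DIVERGENCE * divergence)


for aai in (100.0, 95.0, 90.0, 86.0):
    print(aai, round(remaining_coverage(aai), 3))
```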
Affiliation(s)
- Vincent J Denef
- Department of Earth and Planetary Science, University of California, Berkeley, California 94720, USA.
|
67
|
Wilson GA, Feil EJ, Lilley AK, Field D. Large-scale comparative genomic ranking of taxonomically restricted genes (TRGs) in bacterial and archaeal genomes. PLoS One 2007; 2:e324. [PMID: 17389915 PMCID: PMC1824705 DOI: 10.1371/journal.pone.0000324] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 01/26/2007] [Accepted: 02/18/2007] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Lineage-specific, or taxonomically restricted genes (TRGs), especially those that are species and strain-specific, are of special interest because they are expected to play a role in defining exclusive ecological adaptations to particular niches. Despite this, they are relatively poorly studied and little understood, in large part because many are still orphans or only have homologues in very closely related isolates. This lack of homology confounds attempts to establish the likelihood that a hypothetical gene is expressed and, if so, to determine the putative function of the protein. METHODOLOGY/PRINCIPAL FINDINGS We have developed "QIPP" ("Quality Index for Predicted Proteins"), an index that scores the "quality" of a protein based on non-homology-based criteria. QIPP can be used to assign a value between zero and one to any protein based on comparing its features to other proteins in a given genome. We have used QIPP to rank the predicted proteins in the proteomes of Bacteria and Archaea. This ranking reveals that there is a large amount of variation in QIPP scores, and identifies many high-scoring orphans as potentially "authentic" (expressed) orphans. There are significant differences in the distributions of QIPP scores between orphan and non-orphan genes for many genomes and a trend for less well-conserved genes to have lower QIPP scores. CONCLUSIONS The implication of this work is that QIPP scores can be used to further annotate predicted proteins with information that is independent of homology. Such information can be used to prioritize candidates for further analysis. Data generated for this study can be found in the OrphanMine at http://www.genomics.ceh.ac.uk/orphan_mine.
Affiliation(s)
- Gareth A Wilson
- Centre for Ecology and Hydrology (CEH) Oxford, Oxford, United Kingdom.
|
68
|
Luo Q, Page JS, Tang K, Smith RD. MicroSPE-nanoLC-ESI-MS/MS using 10-microm-i.d. silica-based monolithic columns for proteomics. Anal Chem 2007; 79:540-5. [PMID: 17222018 DOI: 10.1021/ac061603h] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Silica-based monolithic capillary columns (25 cm × 10 µm i.d.) with integrated nanoESI emitters have been developed to provide high-quality and robust microSPE-nanoLC-ESI-MS analyses. The integrated nanoESI emitter adds no dead volume to the LC separation, allowing stable electrospray operation at flow rates of approximately 10 nL/min. In an initial application with a linear ion trap MS, we identified 5510 unique peptides that covered 1443 distinct Shewanella oneidensis proteins from a 300-ng tryptic digest sample in a single 4-h LC-MS/MS analysis. The use of an integrated monolithic ESI emitter provided enhanced resistance to clogging and good run-to-run reproducibility.
Affiliation(s)
- Quanzhou Luo
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
|
69
|
Dworzanski JP, Snyder AP. Classification and identification of bacteria using mass spectrometry-based proteomics. Expert Rev Proteomics 2007; 2:863-78. [PMID: 16307516 DOI: 10.1586/14789450.2.6.863] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Timely classification and identification of bacteria is of vital importance in many areas of public health. Mass spectrometry-based methods provide an attractive alternative to well-established microbiologic procedures. Mass spectrometry methods can be characterized by the relatively high speed of acquiring taxonomically relevant information. Gel-free mass spectrometry proteomics techniques allow for rapid fingerprinting of bacterial proteins using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry or, for high-throughput sequencing of peptides from protease-digested cellular proteins, using mass analysis of fragments from collision-induced dissociation of peptide ions. The latter technique uses database searching of product ion mass spectra. A database contains a comprehensive list of protein sequences translated from protein-encoding open reading frames found in bacterial genomes. The results of such searches allow the assignment of experimental peptide sequences to matching theoretical bacterial proteomes. Phylogenetic profiles of sequenced peptides are then used to create a matrix of sequence-to-bacterium assignments, which are analyzed using numerical taxonomy tools. The results thereof reveal the relatedness between bacteria, and allow the taxonomic position of an investigated strain to be inferred.
Affiliation(s)
- Jacek P Dworzanski
- Science Applications International Corporation (SAIC), PO Box 68, Aberdeen Proving Ground, MD 21010-0068, USA.
|
70
|
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
71
|
Suen G, Arshinoff BI, Taylor RG, Welch RD. Practical Applications of Bacterial Functional Genomics. Biotechnol Genet Eng Rev 2007; 24:213-42. [DOI: 10.1080/02648725.2007.10648101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
72
|
Meshulam-Simon G, Behrens S, Choo AD, Spormann AM. Hydrogen metabolism in Shewanella oneidensis MR-1. Appl Environ Microbiol 2006; 73:1153-65. [PMID: 17189435 PMCID: PMC1828657 DOI: 10.1128/aem.01588-06] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Shewanella oneidensis MR-1 is a facultative sediment microorganism which uses diverse compounds, such as oxygen and fumarate, as well as insoluble Fe(III) and Mn(IV) as electron acceptors. The electron donor spectrum is more limited and includes metabolic end products of primary fermenting bacteria, such as lactate, formate, and hydrogen. While the utilization of hydrogen as an electron donor has been described previously, we report here the formation of hydrogen from pyruvate under anaerobic, stationary-phase conditions in the absence of an external electron acceptor. Genes for the two S. oneidensis MR-1 hydrogenases, hydA, encoding a periplasmic [Fe-Fe] hydrogenase, and hyaB, encoding a periplasmic [Ni-Fe] hydrogenase, were found to be expressed only under anaerobic conditions during early exponential growth and into stationary-phase growth. Analyses of ΔhydA, ΔhyaB, and ΔhydA ΔhyaB in-frame-deletion mutants indicated that HydA functions primarily as a hydrogen-forming hydrogenase while HyaB has a bifunctional role and represents the dominant hydrogenase activity under the experimental conditions tested. Based on results from physiological and genetic experiments, we propose that hydrogen is formed from pyruvate by multiple parallel pathways, one pathway involving formate as an intermediate, pyruvate-formate lyase, and formate-hydrogen lyase, comprised of HydA hydrogenase and formate dehydrogenase, and a formate-independent pathway involving pyruvate dehydrogenase. A reverse electron transport chain is potentially involved in a formate-hydrogen lyase-independent pathway. While pyruvate does not support a fermentative mode of growth in this microorganism, pyruvate, in the absence of an electron acceptor, increased cell viability in anaerobic, stationary-phase cultures, suggesting a role in the survival of S. oneidensis MR-1 under stationary-phase conditions.
Affiliation(s)
- Galit Meshulam-Simon
- Department of Civil and Environmental Engineering, Stanford University, Stanford, CA 94305-5429, USA
|
73
|
Norbeck AD, Callister SJ, Monroe ME, Jaitly N, Elias DA, Lipton MS, Smith RD. Proteomic approaches to bacterial differentiation. J Microbiol Methods 2006; 67:473-86. [PMID: 16919344 DOI: 10.1016/j.mimet.2006.04.024] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2005] [Revised: 04/19/2006] [Accepted: 04/28/2006] [Indexed: 01/30/2023]
Abstract
Mass spectrometry-based proteomics has been used extensively to explore the proteomes of various organisms, and this technology is now being applied to the characterization of bacterial species. Predominantly, two methods emerge as leaders in this application. Intact protein profiling creates fingerprints of bacterial species which can be used for differentiation and tracking over time. Peptide-centric approaches, analyzed after enzymatic digestion, enable high-throughput proteome characterization in addition to species determination from the identification of peptides distinctive to a species. Highlighted herein is an application of a peptide-centric approach to the identification and quantitation of species-specific peptide identifiers using an in silico exploration and an experimental mass spectrometry-based method. The application to microbial communities is addressed with an in silico analysis of an artificial complex community of 25 microorganisms.
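The in silico search for species-specific peptide identifiers described in this abstract can be illustrated with a minimal sketch. It assumes a simple trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and a hypothetical minimum peptide length of 6; the function names and the toy sequences are illustrative and are not taken from the paper.

```python
import re
from collections import defaultdict


def tryptic_digest(sequence, min_length=6):
    """Return the set of fully tryptic peptides of a protein sequence.

    Assumed rule: cleave after K or R, except when the next residue is P.
    Missed cleavages are not modeled.
    """
    # Zero-width split points: preceded by K/R and not followed by P.
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in peptides if len(p) >= min_length}


def species_specific_peptides(proteomes, min_length=6):
    """Map each species to the peptides that occur in no other species.

    proteomes: dict of species name -> list of protein sequences.
    """
    owners = defaultdict(set)  # peptide -> species that contain it
    for species, proteins in proteomes.items():
        for protein in proteins:
            for peptide in tryptic_digest(protein, min_length):
                owners[peptide].add(species)

    unique = {species: set() for species in proteomes}
    for peptide, species_set in owners.items():
        if len(species_set) == 1:
            unique[next(iter(species_set))].add(peptide)
    return unique


# Toy proteomes (hypothetical sequences, for illustration only).
toy = {
    "species_A": ["MKLLLLLLKSGAEDLK"],
    "species_B": ["MKSGAEDLKVVVVVVR"],
}
result = species_specific_peptides(toy)
```

Here the peptide shared by both toy species is discarded, and each species keeps only the peptide found in its own sequence.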
Affiliation(s)
- Angela D Norbeck
- Biological Sciences Division, Pacific Northwest National Laboratory, P.O. Box 999, MSIN, K8-98, Richland, WA 99352, USA
|
74
|
Abstract
MOTIVATION Tandem mass-spectrometry of trypsin digests, followed by database searching, is one of the most popular approaches in high-throughput proteomics studies. Peptides are considered identified if they pass certain scoring thresholds. To avoid false positive protein identification, ≥ 2 unique peptides identified within a single protein are generally recommended. Still, in a typical high-throughput experiment, hundreds of proteins are identified only by a single peptide. We introduce here a method for distinguishing between true and false identifications among single-hit proteins. The approach is based on randomized database searching and usage of logistic regression models with cross-validation. This approach is implemented to analyze three bacterial samples, enabling recovery of 68-98% of the correct single-hit proteins with an error rate of < 2%. This results in a 22-65% increase in the number of identified proteins. Identifying true single-hit proteins will lead to discovering many crucial regulators, biomarkers and other low abundance proteins. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
75
|
Cruz-García C, Murray AE, Klappenbach JA, Stewart V, Tiedje JM. Respiratory nitrate ammonification by Shewanella oneidensis MR-1. J Bacteriol 2006; 189:656-62. [PMID: 17098906 PMCID: PMC1797406 DOI: 10.1128/jb.01194-06] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Anaerobic cultures of Shewanella oneidensis MR-1 grown with nitrate as the sole electron acceptor exhibited sequential reduction of nitrate to nitrite and then to ammonium. Little dinitrogen and nitrous oxide were detected, and no growth occurred on nitrous oxide. A mutant with the napA gene encoding periplasmic nitrate reductase deleted could not respire or assimilate nitrate and did not express nitrate reductase activity, confirming that the NapA enzyme is the sole nitrate reductase. Hence, S. oneidensis MR-1 conducts respiratory nitrate ammonification, also termed dissimilatory nitrate reduction to ammonium, but not respiratory denitrification.
Affiliation(s)
- Claribel Cruz-García
- Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824-1325, USA
|
76
|
Powers R, Copeland JC, Germer K, Mercier KA, Ramanathan V, Revesz P. Comparison of protein active site structures for functional annotation of proteins and drug design. Proteins 2006; 65:124-35. [PMID: 16862592 DOI: 10.1002/prot.21092] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Rapid and accurate functional assignment of novel proteins is increasing in importance, given the completion of numerous genome sequencing projects and the vastly expanding list of unannotated proteins. Traditionally, global primary-sequence and structure comparisons have been used to determine putative function. These approaches, however, do not emphasize similarities in active site configurations that are fundamental to a protein's activity and highly conserved relative to the global and more variable structural features. The Comparison of Protein Active Site Structures (CPASS) database and software enable the comparison of experimentally identified ligand-binding sites to infer biological function and aid in drug discovery. The CPASS database comprises the ligand-defined active sites identified in the protein data bank, where the CPASS program compares these ligand-defined active sites to determine sequence and structural similarity without maintaining sequence connectivity. CPASS will compare any set of ligand-defined protein active sites, irrespective of the identity of the bound ligand.
Affiliation(s)
- Robert Powers
- Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska 68588, USA.
|
77
|
Zhang W, Culley DE, Gritsenko MA, Moore RJ, Nie L, Scholten JCM, Petritis K, Strittmatter EF, Camp DG, Smith RD, Brockman FJ. LC-MS/MS based proteomic analysis and functional inference of hypothetical proteins in Desulfovibrio vulgaris. Biochem Biophys Res Commun 2006; 349:1412-9. [PMID: 16982031 DOI: 10.1016/j.bbrc.2006.09.019] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2006] [Accepted: 09/07/2006] [Indexed: 11/26/2022]
Abstract
High efficiency capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to examine the proteins extracted from Desulfovibrio vulgaris cells across six treatment conditions. While our previous study provided a proteomic overview of the cellular metabolism based on proteins with known functions [W. Zhang, M.A. Gritsenko, R.J. Moore, D.E. Culley, L. Nie, K. Petritis, E.F. Strittmatter, D.G. Camp II, R.D. Smith, F.J. Brockman, A proteomic view of the metabolism in Desulfovibrio vulgaris determined by liquid chromatography coupled with tandem mass spectrometry, Proteomics 6 (2006) 4286-4299], this study describes the global detection and functional inference for hypothetical D. vulgaris proteins. Using criteria that a given peptide of a protein is identified from at least two out of three independent LC-MS/MS measurements and that for any protein at least two different peptides are identified among the three measurements, 129 open reading frames (ORFs) originally annotated as hypothetical proteins were found to encode expressed proteins. Functional inference for the conserved hypothetical proteins was performed by a combination of several non-homology based methods: genomic context analysis, phylogenomic profiling, and analysis of a combination of experimental information, including peptide detection in cells grown under specific culture conditions and cellular location of the proteins. Using this approach we were able to assign possible functions to 20 conserved hypothetical proteins. This study demonstrated that a combination of proteomics and bioinformatics methodologies can provide verification of the expression of hypothetical proteins and improve genome annotation.
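The two acceptance criteria stated in this abstract can be written as a short filter. The sketch below follows one reading of the criteria: a peptide counts only if it is seen in at least two of the three measurements, and a protein is accepted only if at least two different peptides pass that check. The function name, the data layout, and the ORF and peptide labels are hypothetical.

```python
from collections import defaultdict


def confident_proteins(runs, min_runs=2, min_peptides=2):
    """Return proteins supported by enough reproducibly detected peptides.

    runs: list with one entry per LC-MS/MS measurement; each entry is a set
          of (protein_id, peptide) pairs identified in that measurement.
    """
    # Count in how many measurements each (protein, peptide) pair was seen.
    run_counts = defaultdict(int)
    for run in runs:
        for pair in set(run):
            run_counts[pair] += 1

    # Keep peptides seen in at least `min_runs` measurements.
    peptides_by_protein = defaultdict(set)
    for (protein, peptide), count in run_counts.items():
        if count >= min_runs:
            peptides_by_protein[protein].add(peptide)

    # Accept proteins with at least `min_peptides` such peptides.
    return {
        protein
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= min_peptides
    }


# Three toy measurements (hypothetical identifiers and peptides).
toy_runs = [
    {("ORF_1", "PEPA"), ("ORF_1", "PEPB"), ("ORF_2", "PEPC")},
    {("ORF_1", "PEPA"), ("ORF_1", "PEPB"), ("ORF_2", "PEPC"), ("ORF_2", "PEPD")},
    {("ORF_1", "PEPA")},
]
accepted = confident_proteins(toy_runs)
```

In this toy input only ORF_1 is accepted, because ORF_2 has a single peptide that recurs across measurements.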
Affiliation(s)
- Weiwen Zhang
- Microbiology Group, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, Richland, WA 99352, USA.
|
78
|
Galperin MY, Kolker E. New metrics for comparative genomics. Curr Opin Biotechnol 2006; 17:440-7. [PMID: 16978854 PMCID: PMC1764326 DOI: 10.1016/j.copbio.2006.08.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2006] [Revised: 08/10/2006] [Accepted: 08/25/2006] [Indexed: 10/24/2022]
Abstract
The availability of genome sequences from a variety of organisms presents an opportunity to apply this sequence information to solving the key problems of molecular biology. One of the principal roadblocks on this path is the lack of appropriate descriptors and metrics that could succinctly represent the new knowledge stemming from the genomic data. Several new metrics have recently been used in comparative genome analysis, yet challenges remain in finding an appropriate language for the emerging discipline of systems biology.
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Corresponding authors: Galperin, Michael Y; Kolker, Eugene
- Eugene Kolker
- The BIATECH Institute, 19310 North Creek Pkwy, Suite 115, Bothell, WA 98011, USA
|
79
|
Field D, Wilson G, van der Gast C. How do we compare hundreds of bacterial genomes? Curr Opin Microbiol 2006; 9:499-504. [PMID: 16942900 DOI: 10.1016/j.mib.2006.08.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2006] [Accepted: 08/16/2006] [Indexed: 11/26/2022]
Abstract
The genomic revolution is fully upon us in 2006 and the pace of discovery is set to accelerate with the emergence of ultra-high-throughput sequencing technologies. Our complete genome collection of bacteria and archaea continues to grow in number and diversity, as genome sequencing is applied to an array of new problems, from the characterization of the pan-genome to the detection of mutation after experimentation and the exploration of microbial communities in unprecedented detail. The benefits of large-scale comparative genomic analyses are driving the community to think about how to manage our public collections of genomes in novel ways.
Affiliation(s)
- Dawn Field
- Oxford Centre for Ecology and Hydrology, Oxford OX1 3SR, UK.
|
80
|
Serres MH, Riley M. Genomic analysis of carbon source metabolism of Shewanella oneidensis MR-1: Predictions versus experiments. J Bacteriol 2006; 188:4601-9. [PMID: 16788168 PMCID: PMC1482980 DOI: 10.1128/jb.01787-05] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Genomic sequences have been used to find the genetic foundation for carbon source metabolism in Shewanella oneidensis MR-1. Annotated S. oneidensis MR-1 gene products were examined for their sequence similarity to enzymes participating in pathways for utilization of carbon and energy as described in the BioCyc database (http://www.biocyc.org/) or in the primary literature. A picture emerges that relegates five- and six-carbon sugars to minor roles as carbon sources, whereas multiple pathways for utilization of up to three-carbon carbohydrates seem to be present. Capacity to utilize amino acids for carbon and energy is also present. A few contradictions emerged in which enzymes appear to be present by annotations but are not active in the cell according to physiological experiments. Annotations are based on close sequence similarity and will not reveal inactivity due to deleterious mutations or due to lack of coordination of regulation and transport. Genes for a few enzymes known by experiment to be active are not found in the genome. This may be due to extensive divergence after duplication or convergence of function in separate lines in evolution rendering activities undetectable by sequence similarity. To minimize false predictions from protein sequences, we have been conservative in predicting pathways. We did not predict any pathway when, although a partial pathway was seen it was composed largely of enzymes already accounted for in any other complete pathway. This is an example of how a biochemically oriented sequence analysis can generate questions and direct further experimental investigation.
Affiliation(s)
- Margrethe H Serres
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA
|
81
|
Petritis K, Kangas LJ, Yan B, Monroe ME, Strittmatter EF, Qian WJ, Adkins JN, Moore RJ, Xu Y, Lipton MS, Camp DG, Smith RD. Improved peptide elution time prediction for reversed-phase liquid chromatography-MS by incorporating peptide sequence information. Anal Chem 2006; 78:5026-39. [PMID: 16841926 PMCID: PMC1924966 DOI: 10.1021/ac060143p] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
We describe an improved artificial neural network (ANN)-based method for predicting peptide retention times in reversed-phase liquid chromatography. In addition to the peptide amino acid composition, this study investigated several other peptide descriptors to improve the predictive capability, such as peptide length, sequence, hydrophobicity and hydrophobic moment, and nearest-neighbor amino acid, as well as peptide predicted structural configurations (i.e., helix, sheet, coil). An ANN architecture that consisted of 1052 input nodes, 24 hidden nodes, and 1 output node was used to fully consider the amino acid residue sequence in each peptide. The network was trained using approximately 345,000 nonredundant peptides identified from a total of 12,059 LC-MS/MS analyses of more than 20 different organisms, and the predictive capability of the model was tested using 1303 confidently identified peptides that were not included in the training set. The model demonstrated an average elution time precision of approximately 1.5% and was able to distinguish among isomeric peptides based upon the inclusion of peptide sequence information. The prediction power represents a significant improvement over our earlier report (Petritis, K.; Kangas, L. J.; Ferguson, P. L.; Anderson, G. A.; Pasa-Tolic, L.; Lipton, M. S.; Auberry, K. J.; Strittmatter, E. F.; Shen, Y.; Zhao, R.; Smith, R. D. Anal. Chem. 2003, 75, 1039-1048) and other previously reported models.
Affiliation(s)
- Konstantinos Petritis
- Biological Sciences Division, Environmental and Molecular Sciences Laboratory, P. O. Box 999, Richland, Washington 99352, USA
|
82
|
Galperin MY. Genomes to aid in bioremediation of dry cleaning solvents, mothballs and more. Environ Microbiol 2006; 8:949-55. [PMID: 16689716 DOI: 10.1111/j.1462-2920.2006.01059.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
83
|
Hogan JM, Higdon R, Kolker E. Experimental Standards for High-Throughput Proteomics. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2006; 10:152-7. [PMID: 16901220 DOI: 10.1089/omi.2006.10.152] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Proteome analysis, utilizing high-throughput proteomics approaches, involves studying proteins that a whole organism (or specific tissue or cellular compartment) expresses under certain conditions. Intrinsic difficulties of these studies, as well as the enormous volumes of data they typically produce, make the proteome analysis and interpretation very difficult. As with any high-throughput approach, proteomics experiments should be carefully designed, analyzed, and verified. In addition to computational standards, experimental standards (simple and complex mixtures of known proteins) for high-throughput proteomics have to be developed and utilized. This article discusses such experimental standards and their implementations.
Affiliation(s)
- Jason M Hogan
- The BIATECH Institute, Bothell, Washington 98011, USA
|
84
|
Brown SD, Thompson MR, Verberkmoes NC, Chourey K, Shah M, Zhou J, Hettich RL, Thompson DK. Molecular Dynamics of the Shewanella oneidensis Response to Chromate Stress. Mol Cell Proteomics 2006; 5:1054-71. [PMID: 16524964 DOI: 10.1074/mcp.m500394-mcp200] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Temporal genomic profiling and whole-cell proteomic analyses were performed to characterize the dynamic molecular response of the metal-reducing bacterium Shewanella oneidensis MR-1 to an acute chromate shock. The complex dynamics of cellular processes demand the integration of methodologies that describe biological systems at the levels of regulation, gene and protein expression, and metabolite production. Genomic microarray analysis of the transcriptome dynamics of midexponential phase cells subjected to 1 mM potassium chromate (K₂CrO₄) at exposure time intervals of 5, 30, 60, and 90 min revealed 910 genes that were differentially expressed at one or more time points. Strongly induced genes included those encoding components of a TonB1 iron transport system (tonB1-exbB1-exbD1), hemin ATP-binding cassette transporters (hmuTUV), TonB-dependent receptors as well as sulfate transporters (cysP, cysW-2, and cysA-2), and enzymes involved in assimilative sulfur metabolism (cysC, cysN, cysD, cysH, cysI, and cysJ). Transcript levels for genes with annotated functions in DNA repair (lexA, recX, recA, recN, dinP, and umuD), cellular detoxification (so1756, so3585, and so3586), and two-component signal transduction systems (so2426) were also significantly up-regulated (p < 0.05) in Cr(VI)-exposed cells relative to untreated cells. By contrast, genes with functions linked to energy metabolism, particularly electron transport (e.g. so0902-03-04, mtrA, omcA, and omcB), showed dramatic temporal alterations in expression with the majority exhibiting repression. Differential proteomics based on multidimensional HPLC-MS/MS was used to complement the transcriptome data, resulting in comparable induction and repression patterns for a subset of corresponding proteins. In total, expression of 2,370 proteins was confidently verified with 624 (26%) of these annotated as hypothetical or conserved hypothetical proteins. The initial response of S. oneidensis to chromate shock appears to require a combination of different regulatory networks that involve genes with annotated functions in oxidative stress protection, detoxification, protein stress protection, iron and sulfur acquisition, and SOS-controlled DNA repair mechanisms.
Affiliation(s)
- Steven D Brown
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
|
85
|
Zimmer JSD, Monroe ME, Qian WJ, Smith RD. Advances in proteomics data analysis and display using an accurate mass and time tag approach. MASS SPECTROMETRY REVIEWS 2006; 25:450-82. [PMID: 16429408 PMCID: PMC1829209 DOI: 10.1002/mas.20071] [Citation(s) in RCA: 254] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Proteomics has recently demonstrated utility for increasing the understanding of cellular processes on the molecular level as a component of systems biology approaches and for identifying potential biomarkers of various disease states. The large amount of data generated by utilizing high efficiency (e.g., chromatographic) separations coupled with high mass accuracy mass spectrometry for high-throughput proteomics analyses presents challenges related to data processing, analysis, and display. This review focuses on recent advances in nanoLC-FTICR-MS-based proteomics approaches and the accompanying data processing tools that have been developed to display and interpret the large volumes of data being produced.
Affiliation(s)
- Jennifer S D Zimmer
- Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
|
86
|
Kolker E, Higdon R, Hogan JM. Protein identification and expression analysis using mass spectrometry. Trends Microbiol 2006; 14:229-35. [PMID: 16603360 DOI: 10.1016/j.tim.2006.03.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2005] [Revised: 03/02/2006] [Accepted: 03/22/2006] [Indexed: 11/28/2022]
Abstract
The identification and quantification of the proteins that a whole organism expresses under certain conditions is a main focus of high-throughput proteomics. Advanced proteomics approaches generate new biologically relevant data and potent hypotheses. A practical report of what proteome studies can and cannot accomplish in common laboratory settings is presented here. The review discusses the most popular tandem mass-spectrometry-based methods and focuses on how to produce reliable results. A step-by-step description of proteome experiments is given, including sample preparation, digestion, labeling, liquid chromatography, data processing, database searching and statistical analysis. The difficulties and bottlenecks of proteome analysis are addressed and the requirements for further improvements are discussed. Several diverse high-throughput proteomics-based studies of microorganisms are described.
Affiliation(s)
- Eugene Kolker
- The BIATECH Institute, 19310 North Creek Parkway, Suite 115, Bothell, WA 98011, USA.
|
87
|
Gao W, Liu Y, Giometti CS, Tollaksen SL, Khare T, Wu L, Klingeman DM, Fields MW, Zhou J. Knock-out of SO1377 gene, which encodes the member of a conserved hypothetical bacterial protein family COG2268, results in alteration of iron metabolism, increased spontaneous mutation and hydrogen peroxide sensitivity in Shewanella oneidensis MR-1. BMC Genomics 2006; 7:76. [PMID: 16600046 PMCID: PMC1468410 DOI: 10.1186/1471-2164-7-76] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2006] [Accepted: 04/06/2006] [Indexed: 01/28/2023] Open
Abstract
Background Shewanella oneidensis MR-1 is a facultative, gram-negative bacterium capable of coupling the oxidation of organic carbon to a wide range of electron acceptors such as oxygen, nitrate and metals, and has potential for bioremediation of heavy metal contaminated sites. The complete 5-Mb genome of S. oneidensis MR-1 was sequenced and standard sequence-comparison methods revealed approximately 42% of the MR-1 genome encodes proteins of unknown function. Defining the functions of hypothetical proteins is a great challenge and may need a systems approach. In this study, by using integrated approaches including whole genomic microarray and proteomics, we examined knockout effects of the gene encoding SO1377 (gi24372955), a member of the conserved, hypothetical, bacterial protein family COG2268 (Clusters of Orthologous Group) in bacterium Shewanella oneidensis MR-1, under various physiological conditions. Results Compared with the wild-type strain, growth assays showed that the deletion mutant had a decreased growth rate when cultured aerobically, but was not affected under anaerobic conditions. Whole-genome expression (RNA and protein) profiles revealed numerous gene and protein expression changes relative to the wild-type control, including some involved in iron metabolism, oxidative damage protection and respiratory electron transfer, e.g. complex IV of the respiration chain. Although total intracellular iron levels remained unchanged, whole-cell electron paramagnetic resonance (EPR) demonstrated that the level of free iron in mutant cells was 3 times less than that of the wild-type strain. Siderophore excretion in the mutant also decreased in iron-depleted medium. The mutant was more sensitive to hydrogen peroxide and gave rise to 100 times more colonies resistant to gentamicin or kanamycin. Conclusion Our results showed that the knock-out of the SO1377 gene had pleiotropic effects and suggested that SO1377 may play a role in iron homeostasis and oxidative damage protection in S. oneidensis MR-1.
Affiliation(s)
- Weimin Gao
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Yongqing Liu
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Carol S Giometti
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Sandra L Tollaksen
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Tripti Khare
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Liyou Wu
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Dawn M Klingeman
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Matthew W Fields
- Department of Microbiology, Miami University, Oxford, OH 45056, USA
- Jizhong Zhou
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Institute for Environmental Genomics and Department of Botany and Microbiology, University of Oklahoma, Norman, OK 73019, USA
|
88
|
Higdon R, Hogan JM, Van Belle G, Kolker E. Randomized sequence databases for tandem mass spectrometry peptide and protein identification. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2006; 9:364-79. [PMID: 16402894 DOI: 10.1089/omi.2005.9.364] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022]
Abstract
Tandem mass spectrometry (MS/MS) combined with database searching is currently the most widely used method for high-throughput peptide and protein identification. Many different algorithms, scoring criteria, and statistical models have been used to identify peptides and proteins in complex biological samples, and many studies, including our own, describe the accuracy of these identifications, using at best generic terms such as "high confidence." False positive identification rates for these criteria can vary substantially with changing organisms under study, growth conditions, sequence databases, experimental protocols, and instrumentation; therefore, study-specific methods are needed to estimate the accuracy (false positive rates) of these peptide and protein identifications. We present and evaluate methods for estimating false positive identification rates based on searches of randomized databases (reversed and reshuffled). We examine the use of separate searches of a forward then a randomized database and combined searches of a randomized database appended to a forward sequence database. Estimated error rates from randomized database searches are first compared against actual error rates from MS/MS runs of known protein standards. These methods are then applied to biological samples of the model microorganism Shewanella oneidensis strain MR-1. Based on the results obtained in this study, we recommend the use of combined searches of a reshuffled database appended to a forward sequence database as a means of providing quantitative estimates of false positive identification rates of peptides and proteins. This will allow researchers to set criteria and thresholds to achieve a desired error rate and provide the scientific community with direct and quantifiable measures of peptide and protein identification accuracy as opposed to vague assessments such as "high confidence."
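The combined-search idea in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the `DECOY_` prefix, the function names, the `(score, protein_id)` match format, and in particular the decoys-per-target estimator, which is one common convention and is not stated in the abstract as the authors' formula.

```python
import random

DECOY_PREFIX = "DECOY_"  # hypothetical naming convention for randomized entries


def reshuffle(sequence, rng):
    """Return a random permutation of the residues in a protein sequence."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def build_combined_database(proteins, seed=0):
    """Append one reshuffled (decoy) entry per forward protein."""
    rng = random.Random(seed)
    combined = dict(proteins)
    for name, sequence in proteins.items():
        combined[DECOY_PREFIX + name] = reshuffle(sequence, rng)
    return combined


def estimate_false_positive_rate(matches, threshold):
    """Estimate the error rate among forward hits scoring >= threshold.

    matches: iterable of (score, protein_id) pairs from a search of the
    combined database. Assumed estimator: decoy hits / forward hits,
    capped at 1.0.
    """
    accepted = [pid for score, pid in matches if score >= threshold]
    decoys = sum(1 for pid in accepted if pid.startswith(DECOY_PREFIX))
    targets = len(accepted) - decoys
    if targets == 0:
        return 1.0 if decoys else 0.0
    return min(1.0, decoys / targets)
```

Raising `threshold` until the estimate falls below a chosen level is one way to "set criteria and thresholds to achieve a desired error rate", as the abstract puts it.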
Affiliation(s)
- Roger Higdon
- The BIATECH Institute, 19310 N. Creek Parkway South, Suite 115, Bothell, WA 98011, USA
|
89
|
Nanduri B, Lawrence ML, Vanguri S, Pechan T, Burgess SC. Proteomic analysis using an unfinished bacterial genome: the effects of subminimum inhibitory concentrations of antibiotics on Mannheimia haemolytica virulence factor expression. Proteomics 2006; 5:4852-63. [PMID: 16247735 DOI: 10.1002/pmic.200500112] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 11/07/2022]
Abstract
Here we identify, using nonelectrophoretic proteomics, effects of subminimum inhibitory concentrations (subMIC) of two antibiotic preparations, chlortetracycline (CTC), and chlortetracycline-sulfamethazine (CTC + SMZ), on protein expression in the bovine respiratory pathogen Mannheimia haemolytica. The M. haemolytica genome is currently in draft form, and annotation is incomplete. Relying on the principle of gene sequence conservation across species, we used annotated genomes from closely related species to identify, confirm, and functionally annotate 495 M. haemolytica proteins. To conduct quantitative comparative proteomics, we developed a protein quantitation method based on the cross correlation function of the SEQUEST algorithm. When M. haemolytica was cultivated in the presence of 1/4 MIC of CTC and CTC + SMZ, expression of proteins involved in energy production, nucleotide metabolism, translation, and the bacterial stress response (chaperones) was affected. The most notable subMIC effect was a significant decrease in the expression of leukotoxin A, which is an important M. haemolytica virulence factor. Reduction in leukotoxin expression could be one of the molecular mechanisms responsible for the efficacy of these antibiotics against bovine respiratory disease.
Affiliation(s)
- Bindu Nanduri
- College of Veterinary Medicine, Mississippi State University, Mississippi, MS 39762-6100, USA
|
90
|
Elias DA, Monroe ME, Smith RD, Fredrickson JK, Lipton MS. Confirmation of the expression of a large set of conserved hypothetical proteins in Shewanella oneidensis MR-1. J Microbiol Methods 2006; 66:223-33. [PMID: 16417935 DOI: 10.1016/j.mimet.2005.11.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 06/01/2005] [Revised: 11/15/2005] [Accepted: 11/15/2005] [Indexed: 10/25/2022]
Abstract
High-throughput "omic" technologies have allowed for a relatively rapid, yet comprehensive analysis of the global expression patterns within an organism in response to perturbations. In the current study, 9503 different tryptic peptides were identified with high confidence from capillary liquid chromatography-mass spectrometry analysis of 26 chemostat cultures of Shewanella oneidensis MR-1 under various conditions. Using at least one distinctive and a total of two peptide identifications per protein, we detected the expression of 758 conserved hypothetical proteins. This included 359 such proteins previously described [Kolker, E., Picone, A.F., Galperin, M.Y., Romine, M.F., Higdon, R., Makarova, K.S., Kolker, N., Anderson, G.A., Qiu, X., Auberry, K.J., Babnigg, G., Beliaev, A.S., Edlefsen, P., Elias, D.A., Gorby, Y.A., Holzman, T., Klappenbach, J.A., Konstantinidis, K.T., Land, M.L., Lipton, M.S., McCue, L.A., Monroe, M., Pasa-Tolic, L., Pinchuk, G., Purvine, S., Serres, M.H., Tsapin, S., Zakrajsek, B.A., Zhu, W., Zhou, J., Larimer, F.W., Lawrence, C.E., Riley, M., Collart, F.R., Yates, J.R., III, Smith, R.D., Giometti, C.S., Nealson, K.H., Fredrickson, J.K., Tiedje, J.M., 2005. Global profiling of Shewanella oneidensis MR-1: expression of hypothetical genes and improved functional annotations. Proc Natl Acad Sci U S A 102, 2099-2104] with an additional 399 reported herein for the first time. The latter 399 proteins ranged from 5.3 to 208.3 kDa, with 44 being of 100 amino acid residues or less. Using a combination of information including peptide detection in cells grown under specific culture conditions and predictive algorithms such as PSORT and PSORT-B, possible/plausible functions are proposed for some conserved hypothetical proteins. Such proteins were found not only to be expressed, but 19 were only expressed under certain culturing conditions, thereby providing insight into potential functions. These findings also impact the genomic annotation for S. oneidensis MR-1 by confirming that these genes code for expressed proteins. Our results indicate that 399 proteins can now be upgraded from "conserved hypothetical protein" to "expressed protein in Shewanella," 19 of which appeared to be expressed under specific culture conditions.
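The acceptance rule quoted in this abstract (at least one distinctive peptide and at least two peptides in total per protein) can be sketched as a simple filter. This is an illustrative reading, not the authors' pipeline: the function name, the input format, and the interpretation of "distinctive" as "maps to exactly one protein" are all assumptions.

```python
from collections import defaultdict


def confirmed_proteins(peptide_to_proteins, min_total=2, min_distinctive=1):
    """Return proteins supported by enough identified peptides.

    peptide_to_proteins: dict mapping each identified peptide sequence to
    the set of protein IDs that contain it. A peptide is treated as
    distinctive (assumption) when it maps to exactly one protein.
    """
    total = defaultdict(int)
    distinctive = defaultdict(int)
    for proteins in peptide_to_proteins.values():
        for protein in proteins:
            total[protein] += 1
        if len(proteins) == 1:
            distinctive[next(iter(proteins))] += 1
    return {
        protein
        for protein in total
        if total[protein] >= min_total and distinctive[protein] >= min_distinctive
    }
```

For example, with hypothetical identifiers, a protein seen through one unique and one shared peptide passes, while a protein seen only through shared peptides, or through a single peptide, does not.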
Affiliation(s)
- Dwayne A Elias
- Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
91
|
Fredrickson JK, Romine MF. Genome-assisted analysis of dissimilatory metal-reducing bacteria. Curr Opin Biotechnol 2005; 16:269-74. [PMID: 15961027 DOI: 10.1016/j.copbio.2005.04.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 12/21/2004] [Revised: 03/20/2005] [Accepted: 04/04/2005] [Indexed: 10/25/2022]
Abstract
The availability of whole genome sequences for Shewanella oneidensis and Geobacter sulfurreducens has provided numerous new biological insights into the function of these model dissimilatory metal-reducing bacteria. Many of these findings, including the identification of a high number of c-type cytochromes in both organisms, have resulted from comparative genomic analyses, and several have been experimentally confirmed. These genome sequences have also aided the identification of genes important for the reduction of metal ions and other electron acceptors utilized during anaerobic growth, by facilitating the identification of genes disrupted by random insertions. Technologies for assaying global expression patterns for genes and proteins have also been employed, but their application has been limited mainly to the analysis of the role of global regulatory genes and to identifying genes expressed or repressed in response to specific electron acceptors. It is anticipated that details of the mechanisms of metal ion respiration, and metabolism in general, will eventually be revealed by comprehensive, systems-level analyses enabled by functional genomics data.
Affiliation(s)
- James K Fredrickson
- Pacific Northwest National Laboratory, PO Box 999, Richland, Washington 99352, USA.
|
92
|
Hogan JM, Higdon R, Kolker N, Kolker E. Charge State Estimation for Tandem Mass Spectrometry Proteomics. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2005; 9:233-50. [PMID: 16209638 DOI: 10.1089/omi.2005.9.233] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
High-throughput protein analysis by tandem mass spectrometry produces anywhere from thousands to millions of spectra that are being used for peptide and protein identifications. Though each spectrum corresponds only to one charged peptide (ion) state, repetitive database searches of multiple charge states are typically conducted since the resolution of many common mass spectrometers is not sufficient to determine the charge state. The resulting database searches are both error-prone and time-consuming. We describe a straightforward, accurate approach to charge state estimation (CHASTE). CHASTE relies on fragment ion peak distributions, and by using reliable logistic regression models, combines different measurements to improve its accuracy. CHASTE's performance has been validated on data sets, comprised of known peptide dissociation spectra, obtained by replicate analyses of our earlier developed protein standard mixture using ion trap mass spectrometers at different laboratories. CHASTE was able to reduce the number of needed database searches by at least 60% and the number of redundant searches by at least 90% virtually without any informational loss. This greatly alleviates one of the major bottlenecks in high throughput peptide and protein identifications. Thresholds and parameter estimates can be tailored to specific analysis situations, pipelines, and instrumentation. CHASTE was implemented in Java with GUI-based and command-line interfaces.
|
93
|
Galperin MY. A census of membrane-bound and intracellular signal transduction proteins in bacteria: bacterial IQ, extroverts and introverts. BMC Microbiol 2005; 5:35. [PMID: 15955239 PMCID: PMC1183210 DOI: 10.1186/1471-2180-5-35] [Citation(s) in RCA: 320] [Impact Index Per Article: 16.8] [Received: 04/18/2005] [Accepted: 06/14/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Analysis of complete microbial genomes showed that intracellular parasites and other microorganisms that inhabit stable ecological niches encode relatively primitive signaling systems, whereas environmental microorganisms typically have sophisticated systems of environmental sensing and signal transduction. RESULTS This paper presents results of a comprehensive census of signal transduction proteins--histidine kinases, methyl-accepting chemotaxis receptors, Ser/Thr/Tyr protein kinases, adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases--encoded in 167 bacterial and archaeal genomes, sequenced by the end of 2004. The data have been manually checked to avoid false-negative and false-positive hits that commonly arise during large-scale automated analyses and compared against other available resources. The census data show uneven distribution of most signaling proteins among bacterial and archaeal phyla. The total number of signal transduction proteins grows approximately as a square of genome size. While histidine kinases are found in representatives of all phyla and are distributed according to the power law, other signal transducers are abundant in certain phylogenetic groups but virtually absent in others. CONCLUSION The complexity of signaling systems differs even among closely related organisms. Still, it usually can be correlated with the phylogenetic position of the organism, its lifestyle, and typical environmental challenges it encounters. The number of encoded signal transducers (or their fraction in the total protein set) can be used as a measure of the organism's ability to adapt to diverse conditions, the 'bacterial IQ', while the ratio of transmembrane receptors to intracellular sensors can be used to define whether the organism is an 'extrovert', actively sensing the environmental parameters, or an 'introvert', more concerned about its internal homeostasis. 
Some of the microorganisms with the highest IQ, including the current leader Wolinella succinogenes, are found among the poorly studied beta-, delta- and epsilon-proteobacteria. Among all bacterial phyla, only cyanobacteria appear to be true introverts, probably due to their capacity to conduct oxygenic photosynthesis, using a complex system of intracellular membranes. The census data, available at http://www.ncbi.nlm.nih.gov/Complete_Genomes/SignalCensus.html, can be used to get an insight into metabolic and behavioral propensities of each given organism and improve prediction of the organism's properties based solely on its genome sequence.
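The scaling relation and the receptor ratio described in this abstract can be written compactly. This is a notational sketch only: the symbols N_st, G, c, R_tm and R_in are introduced here for illustration, and the abstract gives neither the constant c nor a cutoff separating 'extroverts' from 'introverts'.

```latex
% N_st : number of signal transduction proteins encoded in a genome
% G    : genome size;  c : an unspecified proportionality constant
\[
  N_{\mathrm{st}} \approx c\,G^{2}
  \quad\Longrightarrow\quad
  \frac{N_{\mathrm{st}}(2G)}{N_{\mathrm{st}}(G)} \approx 4
\]
% R_tm : number of transmembrane receptors;  R_in : number of intracellular sensors
\[
  \text{extrovert/introvert ratio} = \frac{R_{\mathrm{tm}}}{R_{\mathrm{in}}}
\]
```

Under this reading, doubling genome size roughly quadruples the number of signal transduction proteins, and a larger ratio indicates an organism that senses its environment more than its internal state.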
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
94
|
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
95
|
Craig R, Cortens JP, Beavis RC. The use of proteotypic peptide libraries for protein identification. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2005; 19:1844-50. [PMID: 15945033 DOI: 10.1002/rcm.1992] [Citation(s) in RCA: 126] [Impact Index Per Article: 6.6] [Indexed: 05/02/2023]
Abstract
This paper describes an algorithm to apply proteotypic peptide sequence libraries to protein identifications performed using tandem mass spectrometry (MS/MS). Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for Homo sapiens and Saccharomyces cerevisiae model species proteomes. These libraries were used to scan through collections of tandem mass spectra to discover which proteins were represented by the data sets, followed by detailed analysis of the spectra with the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) resulted in sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms using only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins.
|
96
|
Louie B, Mork P, Shaker R, Kolker N, Kolker E, Tarczy-Hornoch P. Integration of data for gene annotation using the BioMediator system. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2005:1036. [PMID: 16779323 PMCID: PMC1560885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
Gene annotation requires integration of data from multiple sources in order to functionally classify genes. We are using BioMediator, a general purpose data-integration solution, to develop a gene annotation system to automate the process of collecting data from disparate genomic databases. Integration of annotation data from multiple sources into a single format will facilitate use of analytic tools for the proper functional classification of genes.
Affiliation(s)
- B Louie
- Department of Medical Education and Biomedical Informatics, University of Washington, Seattle, USA
|