51
|
Fox JM, Erill I. Relative codon adaptation: a generic codon bias index for prediction of gene expression. DNA Res 2010; 17:185-96. [PMID: 20453079 PMCID: PMC2885275 DOI: 10.1093/dnares/dsq012] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 11/13/2022] Open
Abstract
The development of codon bias indices (CBIs) remains an active field of research due to their myriad applications in computational biology. Recently, the relative codon usage bias (RCBS) was introduced as a novel CBI able to estimate codon bias without using a reference set. The results of this new index when applied to Escherichia coli and Saccharomyces cerevisiae led the authors of the original publications to conclude that natural selection favours higher expression and enhanced codon usage optimization in short genes. Here, we show that this conclusion was flawed and based on the systematic oversight of an intrinsic bias for short sequences in the RCBS index and of biases in the small data sets used for validation in E. coli. Furthermore, we reveal how the RCBS can be corrected to produce useful results and how its underlying principle, which we here term relative codon adaptation (RCA), can be made into a powerful reference-set-based index that directly takes into account the genomic base composition. Finally, we show that RCA outperforms the codon adaptation index (CAI) as a predictor of gene expression when operating on the CAI reference set and that this improvement is significantly larger when analysing genomes with high mutational bias.
Affiliation(s)
- Jesse M Fox
- Department of Biological Sciences, University of Maryland Baltimore County (UMBC), 1000 Hilltop Road, Baltimore, MD 21228, USA
|
52
|
Metabolic flux distributions: genetic information, computational predictions, and experimental validation. Appl Microbiol Biotechnol 2010; 86:1243-55. [DOI: 10.1007/s00253-010-2506-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 12/22/2009] [Revised: 02/10/2010] [Accepted: 02/11/2010] [Indexed: 01/15/2023]
|
53
|
Chemical and biological single cell analysis. Curr Opin Biotechnol 2010; 21:12-20. [DOI: 10.1016/j.copbio.2010.01.007] [Citation(s) in RCA: 147] [Impact Index Per Article: 10.5] [Received: 11/27/2009] [Accepted: 01/09/2010] [Indexed: 11/20/2022]
|
54
|
Gao J, Chen LL. Theoretical methods for identifying important functional genes in bacterial genomes. Res Microbiol 2009; 161:1-8. [PMID: 19900539 DOI: 10.1016/j.resmic.2009.10.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/22/2009] [Revised: 10/05/2009] [Accepted: 10/21/2009] [Indexed: 12/30/2022]
Abstract
Some functional genes, such as essential genes, highly expressed genes and horizontally transferred genes, play important roles in the survival and pathogenicity of bacteria. This review summarizes current computational methods for identifying these functional genes in bacterial genomes, which is of significant importance in exploring bacterial genomes.
Affiliation(s)
- Junxiang Gao
- School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, PR China
|
55
|
Wexler M, Richardson DJ, Bond PL. Radiolabelled proteomics to determine differential functioning of Accumulibacter during the anaerobic and aerobic phases of a bioreactor operating for enhanced biological phosphorus removal. Environ Microbiol 2009; 11:3029-44. [PMID: 19650829 DOI: 10.1111/j.1462-2920.2009.02007.x] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Indexed: 11/30/2022]
Abstract
Proteins synthesized by the mixed microbial community of two sequencing batch reactors run for enhanced biological phosphorus removal (EBPR) during aerobic and anaerobic reactor phases were compared, using mass spectrometry-based proteomics and radiolabelling. Both sludges were dominated by polyphosphate-accumulating organisms belonging to Candidatus Accumulibacter and the majority of proteins identified matched closest to these bacteria. Enzymes from the Embden-Meyerhof-Parnas pathway were identified, suggesting this is the major glycolytic pathway for these Accumulibacter populations. Enhanced aerobic synthesis of glyoxylate cycle enzymes suggests this cycle is important during the aerobic phase of EBPR. In one sludge, several TCA cycle enzymes showed enhanced aerobic synthesis, suggesting this cycle is unimportant anaerobically. The second sludge showed enhanced synthesis of TCA cycle enzymes under anaerobic conditions, suggesting full or partial TCA cycle operation anaerobically. A phylogenetic analysis of Accumulibacter polyphosphate kinase genes from each sludge demonstrated different Accumulibacter populations dominated the two sludges. Thus, TCA cycle activity differences may be due to Accumulibacter strain differences. The major fatty acids present in Accumulibacter-dominated sludge include palmitic, hexadecenoic and cis-vaccenic acid and fatty acid content increased by approximately 20% during the anaerobic phase. We hypothesize that this is associated with increased anaerobic phospholipid membrane biosynthesis, to accommodate intracellular polyhydroxyalkanoate granules.
Affiliation(s)
- Margaret Wexler
- School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK.
|
56
|
Atkins JF, Gesteland RF. Sequences Promoting Recoding Are Singular Genomic Elements. RECODING: EXPANSION OF DECODING RULES ENRICHES GENE EXPRESSION 2009; 24. [PMCID: PMC7122551 DOI: 10.1007/978-0-387-89382-2_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/03/2022]
Abstract
The distribution of sequences which induce non-standard decoding, especially of shift-prone sequences, is very unusual. On one hand, since they can disrupt standard genetic readout, they are avoided within the coding regions of most genes. On the other hand, they play important regulatory roles for the expression of those genes where they do occur. As a result, they are preserved among homologs and exhibit deep phylogenetic conservation. The combination of these two constraints results in a characteristic distribution of recoding sequences across genomes: they are highly conserved at specific locations while they are very rare in other locations. We term such sequences singular genomic elements to signify their rare occurrence and biological importance.
Affiliation(s)
- John F. Atkins
- Molecular Biology Program, University of Utah, N. 2030 E. 15, Salt Lake City, 84112-5330 U.S.A
|
57
|
Gao N, Ma BG, Zhang YS, Song Q, Chen LL, Zhang HY. Gene Expression Analysis of Four Radiation-resistant Bacteria. GENOMICS INSIGHTS 2009; 2:11-22. [PMID: 26244019 PMCID: PMC4510606 DOI: 10.4137/gei.s2380] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/18/2023]
Abstract
To investigate the general radiation-resistant mechanisms of bacteria, a bioinformatic method was employed to predict highly expressed genes for four radiation-resistant bacteria, i.e. Deinococcus geothermalis (D. geo), Deinococcus radiodurans (D. rad), Kineococcus radiotolerans (K. rad) and Rubrobacter xylanophilus (R. xyl). It is revealed that most of the three reference gene sets, i.e. ribosomal proteins, transcription factors and major chaperones, are generally highly expressed in the four bacteria. Recombinase A (recA), a key enzyme in recombinational repair, is predicted to be highly or marginally highly expressed in the four bacteria. However, most proteins associated with other repair systems show low expression levels. Some genes participating in ‘information storage and processing,’ ‘cellular processes and signaling’ and ‘metabolism’ are among the top twenty predicted highly expressed (PHX) genes in the four genomes. Many antioxidant enzymes and proteases are commonly highly expressed in the four bacteria, indicating that these enzymes play important roles in resisting irradiation. Finally, a number of ‘hypothetical genes’ are among the top twenty PHX genes in each genome, some of which might contribute vitally to resisting irradiation. Some of the prediction results are supported by experimental evidence. All the above information not only helps to understand the radiation-resistant mechanisms but also provides clues for identifying new radiation-resistant genes from these bacteria.
Affiliation(s)
- Na Gao
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China
- Bin-Guang Ma
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China; Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Bergen 5008, Norway
- Yu-Sheng Zhang
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China
- Qin Song
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China
- Ling-Ling Chen
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China
- Hong-Yu Zhang
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, School of Life Sciences, Shandong University of Technology, Zibo 255049, P.R. China
|
58
|
Golovina AY, Sergiev PV, Golovin AV, Serebryakova MV, Demina I, Govorun VM, Dontsova OA. The yfiC gene of E. coli encodes an adenine-N6 methyltransferase that specifically modifies A37 of tRNA1Val(cmo5UAC). RNA (NEW YORK, N.Y.) 2009; 15:1134-41. [PMID: 19383770 PMCID: PMC2685529 DOI: 10.1261/rna.1494409] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 12/02/2008] [Accepted: 02/27/2009] [Indexed: 05/24/2023]
Abstract
Transfer RNA is highly modified. Nucleotide 37 of the anticodon loop is represented by various modified nucleotides. In Escherichia coli, the valine-specific tRNA (cmo(5)UAC) contains a unique modification, N(6)-methyladenosine, at position 37; however, the enzyme responsible for this modification is unknown. Here we demonstrate that the yfiC gene of E. coli encodes an enzyme responsible for the methylation of A37 in tRNA(1)(Val). Inactivation of the yfiC gene abolishes m(6)A formation in tRNA(1)(Val), while expression of the yfiC gene from a plasmid restores the modification. Additionally, unmodified tRNA(1)(Val) can be methylated by recombinant YfiC protein in vitro. Although the methylation of m(6)A in tRNA(1)(Val) by YfiC has little influence on cell growth under standard conditions, the yfiC gene confers a growth advantage under conditions of osmotic and oxidative stress.
Affiliation(s)
- Anna Y Golovina
- Department of Chemistry, Moscow State University, Moscow 119992, Russia
|
59
|
Das S, Roymondal U, Sahoo S. Analyzing gene expression from relative codon usage bias in Yeast genome: a statistical significance and biological relevance. Gene 2009; 443:121-31. [PMID: 19410638 DOI: 10.1016/j.gene.2009.04.022] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 11/22/2008] [Revised: 03/08/2009] [Accepted: 04/20/2009] [Indexed: 11/17/2022]
Abstract
Based on the hypothesis that highly expressed genes are often characterized by strong compositional bias in terms of codon usage, there are a number of measures currently in use that quantify codon usage bias in genes, and hence provide numerical indices to predict the expression levels of genes. With the recent advent of an expression measure from the score of the relative codon usage bias (RCBS), we have explicitly tested the performance of this numerical measure to predict the gene expression level and illustrate this with an analysis of Yeast genomes. In contrast to previous studies, we observe a weak correlation between GC content and RCBS, but a selective pressure on the codon preferences in highly expressed genes. The assertion that the expression of a given gene depends on the score of relative codon usage bias (RCBS) is supported by the data. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at a higher level. We also attempt a statistical analysis to assess the strength of relative codon bias in genes as a guide to their likely expression level, suggesting a decrease of the informational entropy in the highly expressed genes.
Affiliation(s)
- Shibsankar Das
- Department of Mathematics, Uluberia College, Uluberia, Howrah, W.B., India
|
60
|
Raymond A, Lovell S, Lorimer D, Walchli J, Mixon M, Wallace E, Thompkins K, Archer K, Burgin A, Stewart L. Combined protein construct and synthetic gene engineering for heterologous protein expression and crystallization using Gene Composer. BMC Biotechnol 2009; 9:37. [PMID: 19383143 PMCID: PMC2680836 DOI: 10.1186/1472-6750-9-37] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 10/15/2008] [Accepted: 04/21/2009] [Indexed: 01/29/2023] Open
Abstract
Background: With the goal of improving yield and success rates of heterologous protein production for structural studies we have developed the database and algorithm software package Gene Composer. This freely available electronic tool facilitates the information-rich design of protein constructs and their engineered synthetic gene sequences, as detailed in the accompanying manuscript. Results: In this report, we compare heterologous protein expression levels from native sequences to those of codon engineered synthetic gene constructs designed by Gene Composer. A test set of proteins including a human kinase (P38α), viral polymerase (HCV NS5B), and bacterial structural protein (FtsZ) were expressed in both E. coli and a cell-free wheat germ translation system. We also compare the protein expression levels in E. coli for a set of 11 different proteins with greatly varied G:C content and codon bias. Conclusion: The results consistently demonstrate that protein yields from codon engineered Gene Composer designs are as good as or better than those achieved from the synonymous native genes. Moreover, structure guided N- and C-terminal deletion constructs designed with the aid of Gene Composer can lead to greater success in gene to structure work as exemplified by the X-ray crystallographic structure determination of FtsZ from Bacillus subtilis. These results validate the Gene Composer algorithms, and suggest that using a combination of synthetic gene and protein construct engineering tools can improve the economics of gene to structure research.
Affiliation(s)
- Amy Raymond
- deCODE biostructures Inc, 7869 NE Day Road West, Bainbridge Island, WA 98110, USA.
|
61
|
Roymondal U, Das S, Sahoo S. Predicting gene expression level from relative codon usage bias: an application to Escherichia coli genome. DNA Res 2009; 16:13-30. [PMID: 19131380 PMCID: PMC2646356 DOI: 10.1093/dnares/dsn029] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 11/13/2022] Open
Abstract
We present an expression measure of a gene, devised to predict the level of gene expression from relative codon bias (RCB). There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition are strongly correlated, RCB has been defined to provide an intuitively meaningful measure of the extent of codon preference in a gene. We outline a simple approach to assess the strength of RCB (RCBS) in genes as a guide to their likely expression levels and illustrate this with an analysis of the Escherichia coli (E. coli) genome. Our efforts to quantitatively predict gene expression levels in E. coli met with a high level of success. Surprisingly, we observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at a higher level. The agreement of our result with high protein abundances, microarray data and radioactive data demonstrates that the genomic expression profile available in our method can be applied in a meaningful way to the study of cell physiology and also for more detailed studies of particular genes of interest.
Affiliation(s)
- Uttam Roymondal
- Department of Mathematics, Raidighi College, South 24 Parganas, Raidighi, West Bengal, India
|
62
|
Basak S, Mukherjee I, Choudhury M, Das S. Unusual codon usage bias in low expression genes of Vibrio cholerae. Bioinformation 2008; 3:213-7. [PMID: 19255636 PMCID: PMC2646191 DOI: 10.6026/97320630003213] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/01/2008] [Accepted: 12/02/2008] [Indexed: 11/23/2022] Open
Abstract
A positive correlation between gene expression and synonymous codon usage bias is well documented in the literature. However, in the present study of the Vibrio cholerae genome, we have identified a group of genes having unusually high codon usage bias despite having low potential expressivity. Our results suggest that codon usage in lowly expressed genes might also be selected to preferentially use non-optimal codons to maintain a low cellular concentration of the proteins that they encode. This would predict that lowly expressed genes are also biased in codon usage, but in a way that is opposite to the bias of highly expressed genes.
Affiliation(s)
- Surajit Basak
- Biomedical Informatics Center, National Institute of Cholera and Enteric Diseases, P-33, C.I.T Road, Scheme-XM, Beliaghata, Kolkata 700010, India.
|
63
|
Suzuki H, Brown CJ, Forney LJ, Top EM. Comparison of correspondence analysis methods for synonymous codon usage in bacteria. DNA Res 2008; 15:357-65. [PMID: 18940873 PMCID: PMC2608848 DOI: 10.1093/dnares/dsn028] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 12/02/2022] Open
Abstract
Synonymous codon usage varies both between organisms and among genes within a genome, and arises due to differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed based on three kinds of input data: absolute codon frequency, relative codon frequency, and relative synonymous codon usage (RSCU) as well as within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources that were previously unnoticed in some genomes; e.g. synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
Affiliation(s)
- Haruo Suzuki
- Department of Biological Sciences and Initiative for Bioinformatics and Evolutionary Studies, University of Idaho, PO Box 443051, Moscow, Idaho 83844-3051, USA.
|
64
|
Abstract
An effective vaccine for Vibrio cholerae is not yet available for use in the developing world, where the burden of cholera disease is highest. Characterizing the proteins that are expressed by V. cholerae in the human host environment may provide insight into the pathogenesis of cholera and assist with the development of an improved vaccine. We analyzed the V. cholerae proteins present in the stools of 32 patients with clinical cholera. The V. cholerae outer membrane porin, OmpU, was identified in all of the human stool samples, and many V. cholerae proteins were repeatedly identified in separate patient samples. The majority of V. cholerae proteins identified in human stool are involved in protein synthesis and energy metabolism. A number of proteins involved in the pathogenesis of cholera, including the A and B subunits of cholera toxin and the toxin-coregulated pilus, were identified in human stool. In a subset of stool specimens, we also assessed which in vivo expressed V. cholerae proteins were recognized uniquely by convalescent-phase as opposed to acute-phase serum from cholera patients. We identified a number of these in vivo expressed proteins as immunogenic during human infection. To our knowledge, this is the first characterization of the proteome of a pathogenic bacterium recovered from a natural host.
|
65
|
Hieu CX, Voigt B, Albrecht D, Becher D, Lombardot T, Glöckner FO, Amann R, Hecker M, Schweder T. Detailed proteome analysis of growing cells of the planctomycete Rhodopirellula baltica SH1T. Proteomics 2008; 8:1608-23. [DOI: 10.1002/pmic.200701017] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 11/05/2022]
|
66
|
Quorum sensing influences Vibrio harveyi growth rates in a manner not fully accounted for by the marker effect of bioluminescence. PLoS One 2008; 3:e1671. [PMID: 18301749 PMCID: PMC2249925 DOI: 10.1371/journal.pone.0001671] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 05/23/2007] [Accepted: 01/13/2008] [Indexed: 11/30/2022] Open
Abstract
Background: The light-emitting Vibrios provide excellent material for studying the interaction of cellular communication with growth rate because bioluminescence is a convenient marker for quorum sensing. However, the use of bioluminescence as a marker is complicated because bioluminescence itself may affect growth rate, e.g. by diverting energy. Methodology/Principal Findings: The marker effect was explored via growth rate studies in isogenic Vibrio harveyi (Vh) strains altered in quorum sensing on the one hand, and bioluminescence on the other. By hypothesis, growth rate is energy limited: mutants deficient in quorum sensing grow faster because wild type quorum sensing unleashes bioluminescence and bioluminescence diverts energy. Findings reported here confirm a role for bioluminescence in limiting Vh growth rate, at least under the conditions tested. However, the results argue that the bioluminescence is insufficient to explain the relationship of growth rate and quorum sensing in Vh. A Vh mutant null for all genes encoding the bioluminescence pathway grew faster than wild type but not as fast as null mutants in quorum sensing. Vh quorum sensing mutants showed altered growth rates that do not always rank with their relative increase or decrease in bioluminescence. In addition, the cell-free culture fluids of a rapidly growing Vibrio parahaemolyticus (Vp) strain increased the growth rate of wild type Vh without significantly altering Vh's bioluminescence. The same cell-free culture fluid increased the bioluminescence of Vh quorum mutants. Conclusions/Significance: The effect of quorum sensing on Vh growth rate can be either positive or negative and includes both bioluminescence-dependent and independent components. Bioluminescence tends to slow growth rate but not enough to account for the effects of quorum sensing on growth rate.
|
67
|
Ishihama Y, Schmidt T, Rappsilber J, Mann M, Hartl FU, Kerner MJ, Frishman D. Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics 2008; 9:102. [PMID: 18304323 PMCID: PMC2292177 DOI: 10.1186/1471-2164-9-102] [Citation(s) in RCA: 353] [Impact Index Per Article: 22.1] [Received: 01/24/2008] [Accepted: 02/27/2008] [Indexed: 11/10/2022] Open
Abstract
Background: Knowledge about the abundance of molecular components is an important prerequisite for building quantitative predictive models of cellular behavior. Proteins are central components of these models, since they carry out most of the fundamental processes in the cell. Thus far, protein concentrations have been difficult to measure on a large scale, but proteomic technologies have now advanced to a stage where this information becomes readily accessible. Results: Here, we describe an experimental scheme to maximize the coverage of proteins identified by mass spectrometry of a complex biological sample. Using a combination of LC-MS/MS approaches with protein and peptide fractionation steps we identified 1103 proteins from the cytosolic fraction of the Escherichia coli strain MC4100. A measure of abundance is presented for each of the identified proteins, based on the recently developed emPAI approach which takes into account the number of sequenced peptides per protein. The values of abundance are within a broad range and accurately reflect independently measured copy numbers per cell. As expected, the most abundant proteins were those involved in protein synthesis, most notably ribosomal proteins. Proteins involved in energy metabolism as well as those with binding function were also found in high copy number while proteins annotated with the terms metabolism, transcription, transport, and cellular organization were rare. The barrel-sandwich fold was found to be the structural fold with the highest abundance. Highly abundant proteins are predicted to be less prone to aggregation based on their length, pI values, and occurrence patterns of hydrophobic stretches. We also find that abundant proteins tend to be predominantly essential. Additionally we observe a significant correlation between protein and mRNA abundance in E. coli cells. Conclusion: Abundance measurements for more than 1000 E. coli proteins presented in this work represent the most complete study of protein abundance in a bacterial cell so far. We show significant associations between the abundance of a protein and its properties and functions in the cell. In this way, we provide both data and novel insights into the role of protein concentration in this model organism.
Affiliation(s)
- Yasushi Ishihama
- Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0017, Japan.
|
68
|
The implication of life style on codon usage patterns and predicted highly expressed genes for three Frankia genomes. Antonie van Leeuwenhoek 2008; 93:335-46. [DOI: 10.1007/s10482-007-9211-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 08/14/2007] [Accepted: 11/12/2007] [Indexed: 11/27/2022]
|
69
|
|
70
|
Puigbò P, Romeu A, Garcia-Vallvé S. HEG-DB: a database of predicted highly expressed genes in prokaryotic complete genomes under translational selection. Nucleic Acids Res 2007; 36:D524-7. [PMID: 17933767 PMCID: PMC2238906 DOI: 10.1093/nar/gkm831] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
The highly expressed genes database (HEG-DB) is a genomic database that includes the prediction of which genes are highly expressed in prokaryotic complete genomes under strong translational selection. The current version of the database contains general features for almost 200 genomes under translational selection, including the correspondence analysis of the relative synonymous codon usage for all genes, and the analysis of their highly expressed genes. For each genome, the database contains functional and positional information about the predicted group of highly expressed genes. This information can also be accessed using a search engine. Among other statistical parameters, the database also provides the Codon Adaptation Index (CAI) for all of the genes using the codon usage of the highly expressed genes as a reference set. The 'Pathway Tools Omics Viewer' from the BioCyc database enables the metabolic capabilities of each genome to be explored, particularly those related to the group of highly expressed genes. The HEG-DB is freely available at http://genomes.urv.cat/HEG-DB.
Affiliation(s)
- Pere Puigbò
- Evolutionary Genomics Group, Biochemistry and Biotechnology Department, Faculty of Chemistry, Rovira i Virgili University (URV), c/Marcel-li Domingo, s/n. Campus Sescelades, 43007 Tarragona, Spain.
|
71
|
Chaves DFS, Ferrer PP, de Souza EM, Gruz LM, Monteiro RA, de Oliveira Pedrosa F. A two-dimensional proteome reference map of Herbaspirillum seropedicae proteins. Proteomics 2007; 7:3759-63. [PMID: 17853511 DOI: 10.1002/pmic.200600859] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
Abstract
Herbaspirillum seropedicae is an endophytic diazotroph associated with economically important crops such as rice, sugarcane, and wheat. Here, we present a 2-D reference map for H. seropedicae. Using MALDI-TOF-MS we identified 205 spots representing 173 different proteins with a calculated average of 1.18 proteins/gene. Seventeen hypothetical or conserved hypothetical ORFs were shown to code for true gene products. These data will support the genome annotation process and provide a basis on which to undertake comparative proteomic studies.
|
72
|
Xu K, Ma BG. Comparative analysis of predicted gene expression among deep-sea genomes. Gene 2007; 397:136-42. [PMID: 17544603 DOI: 10.1016/j.gene.2007.04.023] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 10/31/2006] [Revised: 04/08/2007] [Accepted: 04/20/2007] [Indexed: 11/18/2022]
Abstract
Deep-sea species live in an environment that is specifically characterized by extreme temperature and hydrostatic pressure. In this work, predicted highly expressed (PHX) genes are comparatively analyzed for six deep-sea microbes, which allows us to pinpoint the common highly expressed genes shared by them. The relationships between gene expression level and some basic properties such as genomic G + C content, optimal growth temperature (OGT), and environmental hydrostatic pressure of the six deep-sea species are also investigated. We find that the percentage of PHX genes out of a whole genome positively correlates to OGT for the deep-sea genomes, whereas such positive correlation seems not to exist between environmental hydrostatic pressure and percentage of PHX genes. Moreover, there exists a negative correlation between genomic G + C content and diversity of gene expression level for the deep-sea genomes, which is in sharp contrast to land-living microbes. We report the top 20 PHX genes for the six deep-sea genomes and find no common highly expressed genes shared by them except for ribosomal proteins, transcription factors, and translation factors. Our present work proffers a paradigm for studying the relationship between environmental factors and microbes' predicted gene expression level.
Affiliation(s)
- Ke Xu
- College of Mathematics and Information Science, Shandong University of Technology, Zibo, PR China
|
73
|
van Scherpenzeel M, van der Pot M, Arnusch CJ, Liskamp RMJ, Pieters RJ. Detection of galectin-3 by novel peptidic photoprobes. Bioorg Med Chem Lett 2007; 17:376-8. [PMID: 17095228 DOI: 10.1016/j.bmcl.2006.10.043] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 09/05/2006] [Revised: 10/18/2006] [Accepted: 10/18/2006] [Indexed: 11/26/2022]
Abstract
Photoprobes were prepared with specificity for binding, labeling, and visualizing galectin-3 in a mixture of proteins. The probes were derived from a galectin-3 binding 15-mer peptide sequence in which a benzophenone photolabel was incorporated either at the N-terminus or as a phenylalanine replacement in the middle of the sequence. Detection of galectin-3 was possible in Escherichia coli lysates that were spiked with various amounts of galectin-3.
Affiliation(s)
- Monique van Scherpenzeel
- Department of Medicinal Chemistry and Chemical Biology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, PO Box 80082, 3508 TB Utrecht, The Netherlands
|
74
|
Abstract
The "expression measure" of a gene, E(g), is a statistic devised to predict the level of gene expression from codon usage bias. E(g) has been used extensively to analyze prokaryotic genome sequences. We discuss 2 problems with this approach. First, the formulation of E(g) is such that genes with the strongest selected codon usage bias are not likely to have the highest predicted expression levels; indeed the correlation between E(g) and expression level is weak among moderate to highly expressed genes. Second, in some species, highly expressed genes do not have unusual codon usage, and so codon usage cannot be used to predict expression levels. We outline a simple approach, first to check whether a genome shows evidence of selected codon usage bias and then to assess the strength of bias in genes as a guide to their likely expression level; we illustrate this with an analysis of Shewanella oneidensis.
|
75
|
Zhang W, Gritsenko MA, Moore RJ, Culley DE, Nie L, Petritis K, Strittmatter EF, Camp DG, Smith RD, Brockman FJ. A proteomic view of Desulfovibrio vulgaris metabolism as determined by liquid chromatography coupled with tandem mass spectrometry. Proteomics 2006; 6:4286-99. [PMID: 16819729 DOI: 10.1002/pmic.200500930] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 11/06/2022]
Abstract
Direct LC-MS/MS was used to examine the proteins extracted from exponential or stationary phase Desulfovibrio vulgaris cells that had been grown on a minimal medium containing either lactate or formate as the primary carbon source. Across all four growth conditions, 976 gene products were identified with high confidence, which is equal to approximately 28% of all predicted proteins in the D. vulgaris genome. Bioinformatic analysis showed that the proteins identified were distributed among almost all functional classes, with the energy metabolism category containing the greatest number of identified proteins. At least 154 ORFs originally annotated as hypothetical proteins were found to encode the expressed proteins, which provided verification for the authenticity of these hypothetical proteins. Proteomic analysis showed that proteins potentially involved in ATP biosynthesis using the proton gradient across membrane, such as ATPase, alcohol dehydrogenases, heterodisulfide reductases, and [NiFe] hydrogenase (HynAB-1) of the hydrogen cycling were highly expressed in all four growth conditions, suggesting they may be the primary pathways for ATP synthesis in D. vulgaris. Most of the enzymes involved in substrate-level phosphorylation were also detected in all tested conditions. However, no enzyme involved in CO cycling or formate cycling was detected, suggesting that they are not the primary ATP-biosynthesis pathways under the tested conditions. This study provides the first proteomic overview of the cellular metabolism of D. vulgaris. The complete list of proteins identified in this study and their abundances (peptide hits) is provided in Supplementary Table 1.
Affiliation(s)
- Weiwen Zhang
- Microbiology Group, Pacific Northwest National Laboratory, Richland, WA, USA.
|
76
|
Karlin S, Brocchieri L, Mrázek J, Kaiser D. Distinguishing features of delta-proteobacterial genomes. Proc Natl Acad Sci U S A 2006; 103:11352-7. [PMID: 16844781 PMCID: PMC1544090 DOI: 10.1073/pnas.0604311103] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022] Open
Abstract
We analyzed several features of five currently available delta-proteobacterial genomes, including two aerobic bacteria exhibiting predatory behavior and three anaerobic sulfate-reducing bacteria. The delta genomes are distinguished from other bacteria by several properties: (i) The delta genomes contain two "giant" S1 ribosomal protein genes in contrast to all other bacterial types, which encode a single or no S1; (ii) in most delta-proteobacterial genomes the major ribosomal protein (RP) gene cluster is near the replication terminus whereas most bacterial genomes place the major RP cluster near the origin of replication; (iii) the delta genomes possess the rare combination of discriminating asparaginyl and glutaminyl tRNA synthetase (AARS) together with the amido-transferase complex (Gat CAB) genes that modify Asp-tRNA(Asn) into Asn-tRNA(Asn) and Glu-tRNA(Gln) into Gln-tRNA(Gln); (iv) the TonB receptors and ferric siderophore receptors that facilitate uptake and removal of complex metals are common among delta genomes; (v) the anaerobic delta genomes encode multiple copies of the anaerobic detoxification protein rubrerythrin that can neutralize hydrogen peroxide; and (vi) sigma(54) activators play a more important role in the delta genomes than in other bacteria. delta genomes have a plethora of enhancer binding proteins that respond to environmental and intracellular cues, often as part of two-component systems; (vii) delta genomes encode multiple copies of metallo-beta-lactamase enzymes; (viii) a host of secretion proteins emphasizing SecA, SecB, and SecY may be especially useful in the predatory activities of Myxococcus xanthus; (ix) delta proteobacteria drive many multiprotein machines in their periplasms and outer membrane, including chaperone-feeding machines, jets for slime secretion, and type IV pili. Bdellovibrio replicates in the periplasm of prey cells. 
The sulfate-reducing delta proteobacteria metabolize hydrogen and generate a proton gradient by electron transport. The predicted highly expressed genes from delta genomes reflect their different ecologies, metabolic strategies, and adaptations.
Affiliation(s)
- Samuel Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305, USA.
|
77
|
Abstract
During in vitro broth culture, bacterial gene expression is typically dominated by highly expressed factors involved in protein biosynthesis, maturation, and folding, but it is unclear if this also applies to conditions in natural environments. Here, we used a promoter trap strategy with an unstable green fluorescent protein reporter that can be detected in infected mouse tissues to identify 21 Salmonella enterica promoters with high levels of activity in a mouse enteritis model. We then measured the activities of these and 31 previously identified Salmonella promoters in both the enteritis and a murine typhoid fever model. Surprisingly, the data reveal that instead of protein biosynthesis genes, disease-specific genes such as Salmonella pathogenicity island 1 (SPI-1)-associated genes and genes involved in anaerobic respiration (enteritis) or SPI-2-associated genes and genes of the PhoP regulon (typhoid fever), respectively, dominate Salmonella in vivo gene expression. The overall functional profile of highly expressed genes suggests a marked shift in major transcriptional activities to nutrient utilization during enteritis or to fighting against the host during typhoid fever. The large proportion of known and novel essential virulence factors among the identified genes suggests that high expression levels during infection may correlate with functional relevance.
Affiliation(s)
- Claudia Rollenhagen
- Max Planck Institute for Infection Biology, Department of Molecular Biology, Berlin, Germany
|
78
|
Wu G, Nie L, Zhang W. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance. Biochem Biophys Res Commun 2006; 344:114-21. [PMID: 16603130 DOI: 10.1016/j.bbrc.2006.03.124] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/14/2006] [Accepted: 03/21/2006] [Indexed: 11/29/2022]
Abstract
The context-dependent expression of genes is at the core of biological activity, and significant attention has been given to identifying the various factors that contribute to gene expression at the genomic scale. So far, however, this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs, or on the correlation between mRNA abundance and non-random features in coding sequences (e.g., codon usage and amino acid usage). In this study, multiple regression analyses of mRNA abundance against all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variation in mRNA expression, and in what manner they act together. Using the AlignACE program, 442 over-represented motifs were identified in the upstream 100 bp regions of 293 genes located in known regulons. Regression of mRNA expression data against the measures of coding and non-coding sequence features indicated that 54.1% of the variation in mRNA abundance can be explained by the presence of upstream motifs, while coding sequences alone account for 29.7% of the variation. Interestingly, most of the contribution from coding sequences overlaps with that from upstream motifs, so that a total of 60.3% of the variation in mRNA abundance can be explained when both coding and non-coding information are included. This result demonstrates that upstream regulatory motifs and coding sequence information contribute to overall mRNA expression in a combinatorial rather than an additive manner.
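The overlap described in this abstract can be made explicit with simple arithmetic on the three reported percentages. The sketch below assumes a standard two-predictor-set commonality decomposition, which the abstract does not name; the variable names are illustrative.

```python
# Explained-variance fractions (R^2) reported in the abstract.
r2_upstream = 0.541   # upstream motifs alone
r2_coding = 0.297     # coding-sequence features alone
r2_joint = 0.603      # both feature sets together

# Assumption: two-set commonality decomposition of the joint R^2.
unique_coding = r2_joint - r2_upstream        # gained by adding coding features
unique_upstream = r2_joint - r2_coding        # gained by adding upstream motifs
shared = r2_upstream + r2_coding - r2_joint   # explained by either set

print(f"unique to coding features: {unique_coding:.3f}")    # 0.062
print(f"unique to upstream motifs: {unique_upstream:.3f}")  # 0.306
print(f"shared by both:            {shared:.3f}")           # 0.235
```

Under this decomposition, roughly 0.235 of the 0.297 explained by coding features (about four fifths) is shared with upstream motifs, which is consistent with the abstract's statement that the contributions are not simply additive.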
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
|
79
|
González-Escalona N, Fey A, Höfle MG, Espejo RT, Guzmán CA. Quantitative reverse transcription polymerase chain reaction analysis of Vibrio cholerae cells entering the viable but non-culturable state and starvation in response to cold shock. Environ Microbiol 2006; 8:658-66. [PMID: 16584477 DOI: 10.1111/j.1462-2920.2005.00943.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Indexed: 11/30/2022]
Abstract
We performed a comparative analysis of the Vibrio cholerae strain El Tor 3083 entering the viable but non-culturable (VBNC) state and starvation after incubation in artificial seawater (ASW) at 4 and 15 °C, respectively. To this end, we determined bacterial culturability and membrane integrity, as well as the cellular levels of 16S rRNA and mRNA for the tuf, rpoS and relA genes, which were assessed by real-time quantitative reverse transcription polymerase chain reaction (Q-RT-PCR). Bacterial cells entering the VBNC state showed 154-, 5.1 × 10^3-, 24- and 23-fold reductions in the number of copies of 16S rRNA and of mRNA for tuf, rpoS and relA, respectively, in comparison to exponentially growing cells. The differences were less striking between cells in the VBNC and starvation states. The mRNA for relA was selectively increased in VBNC cells (3.2-fold), whereas a 3.9-fold reduction was observed for 16S rRNA. The results confirmed that key activities of cellular metabolism (i.e. tuf representing protein synthesis, and relA or rpoS representing the stress response) were still detected in bacteria entering the VBNC state and starvation. These data suggest that the new Q-RT-PCR methodology, based on the selected RNA targets, could be successfully exploited for the identification (rRNA) of V. cholerae and the assessment of its metabolic activity (tuf, rpoS, relA mRNA) in environmental samples.
Affiliation(s)
- Narjol González-Escalona
- Vaccine Research Group, Division of Microbiology, GBF-German Research Centre for Biotechnology, Braunschweig, Germany
|
80
|
Abstract
Predicted highly expressed (PHX) genes are compared for 16 gamma-proteobacteria, and their similarities and differences are interpreted with respect to known or predicted physiological characteristics of the organisms. Predicted highly expressed genes often reflect the organism's predominant lifestyle, habitat, nutrition sources and metabolic propensities. This technique allows prediction of the principal metabolic activities of microorganisms operating in their natural habitats. Among our findings is an unusually high number of PHX enzymes acting in cell wall biosynthesis, amino acid biosynthesis and replication in the ant endosymbiont Blochmannia floridanus. We ascribe the abundance of these PHX genes to specific aspects of the relationship between the bacterium and its host. Xanthomonas campestris is unique in having a very high number of PHX genes acting in flagellum biosynthesis, which may play a special role in its pathogenicity. Shewanella oneidensis possesses three protein complexes that can all function as complex I in the respiratory chain, but only the Na(+)-transporting NADH:ubiquinone oxidoreductase nqr-2 operon is PHX. The PHX genes of Vibrio parahaemolyticus are consistent with the microorganism's adaptation to extremely fast growth rates. Comparative analysis of PHX genes from complex environmental genomic sequences as well as from uncultured pathogenic microbes can provide a novel, useful tool to predict global flux of matter and key intermediates.
Affiliation(s)
- Jan Mrázek
- Department of Mathematics, Stanford University, CA 94305, USA
|
81
|
Wu G, Nie L, Zhang W. Predicted highly expressed genes in Nocardia farcinica and the implication for its primary metabolism and nocardial virulence. Antonie van Leeuwenhoek 2006; 89:135-46. [PMID: 16496092 DOI: 10.1007/s10482-005-9016-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 08/22/2005] [Accepted: 09/26/2005] [Indexed: 01/30/2023]
Abstract
Nocardia farcinica is a Gram-positive, filamentous bacterium, and is considered an opportunistic pathogen. In this study, the highly expressed genes in N. farcinica were predicted using the codon adaptation index (CAI) as a numerical estimator of gene expressivity. Using ribosomal protein (RP) genes as references, approximately the top 10% of the genes were classified as predicted highly expressed (PHX) genes in N. farcinica using a CAI cutoff of greater than 0.73. Consistent with earlier analysis of Streptomyces genomes, most of the PHX genes in N. farcinica were involved in various 'house-keeping' functions important for cell growth. However, 15 genes putatively involved in nocardial virulence were predicted as PHX genes in N. farcinica, including genes encoding four Mce proteins; cyclopropane fatty acid synthase, which is involved in cell wall modification that may be important for nocardial virulence; polyketide synthase PKS13 for mycolic acid synthesis; and a non-ribosomal peptide synthetase involved in biosynthesis of a mycobactin-related siderophore. In addition, multiple genes involved in defense against reactive oxygen species (ROS) produced by the phagocyte were predicted to have high expressivity, including alkylhydroperoxide reductase (ahpC), catalase (katG), superoxide dismutase (sodF), thioredoxin, thioredoxin reductase, glutathione peroxidase, and peptide methionine sulfoxide reductase, suggesting that combating ROS is essential for survival of N. farcinica in host cells. The study also showed that the distribution of PHX genes in the N. farcinica circular chromosome was uneven, with more PHX genes located in the regions close to the replication initiation site. The results provide the first estimates of global gene expression patterns in N. farcinica, which will be useful in guiding experimental design for further investigations.
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
|
82
|
Bardey V, Vallet C, Robas N, Charpentier B, Thouvenot B, Mougin A, Hajnsdorf E, Régnier P, Springer M, Branlant C. Characterization of the molecular mechanisms involved in the differential production of erythrose-4-phosphate dehydrogenase, 3-phosphoglycerate kinase and class II fructose-1,6-bisphosphate aldolase in Escherichia coli. Mol Microbiol 2005; 57:1265-87. [PMID: 16102000 DOI: 10.1111/j.1365-2958.2005.04762.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/27/2022]
Abstract
A gapA-pgk gene tandem, coding for glyceraldehyde 3-phosphate dehydrogenase and 3-phosphoglycerate kinase, is most frequently found in bacteria. However, in Enterobacteriaceae, gapA is replaced by an epd open reading frame (ORF) coding for an erythrose-4-phosphate dehydrogenase, and an fbaA ORF coding for the class II fructose-1,6-bisphosphate aldolase follows pgk. Although epd expression is very low in Escherichia coli, we show that, in the presence of glucose, the three ORFs epd, pgk and fbaA are efficiently cotranscribed from promoter epd P0. Conservation of promoter epd P0 is likely due to its important role in modulation of the metabolic flux during glycolysis and gluconeogenesis. As a consequence, we found that the epd translation initiation region and ORF have been adapted in order to limit epd translation and to create an efficient RNase E entry site. We also show that fbaA is cotranscribed with pgk, from promoter epd P0 or an internal pgk P1 promoter of the extended -10 class. The differential expression of pgk and fbaA also depends upon an RNase E segmentation process, leading to individual mRNAs with different stabilities. The secondary structures of the RNA regions containing the RNase E sites were experimentally determined, which provides important information on the structural features of RNase E ectopic sites.
Affiliation(s)
- Vincent Bardey
- Laboratoire de Maturation des ARN et Enzymologie Moléculaire, UMR 7567 CNRS-UHP Nancy I, Faculté des Sciences et Techniques, BP 239, 54506 Vandoeuvre-lès-Nancy, Cedex, France
|
83
|
Wu G, Culley DE, Zhang W. Predicted highly expressed genes in the genomes of Streptomyces coelicolor and Streptomyces avermitilis and the implications for their metabolism. MICROBIOLOGY-SGM 2005; 151:2175-2187. [PMID: 16000708 DOI: 10.1099/mic.0.27833-0] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Indexed: 11/18/2022]
Abstract
Highly expressed genes in bacteria often have a stronger codon bias than genes expressed at lower levels, due to translational selection. In this study, a comparative analysis of predicted highly expressed (PHX) genes in the Streptomyces coelicolor and Streptomyces avermitilis genomes was performed using the codon adaptation index (CAI) as a numerical estimator of gene expression level. Although it has been suggested that there is little heterogeneity in codon usage in G+C-rich bacteria, considerable heterogeneity was found among genes in these two G+C-rich Streptomyces genomes. Using ribosomal protein genes as references, approximately 10% of the genes were predicted to be PHX genes using a CAI cutoff value of greater than 0.78 and 0.75 in S. coelicolor and S. avermitilis, respectively. The PHX genes showed good agreement with the experimental data on expression levels obtained from proteomic analysis by previous workers. Among 724 and 730 PHX genes identified from S. coelicolor and S. avermitilis, 368 are orthologue genes present in both genomes, which were mostly 'housekeeping' genes involved in cell growth. In addition, 61 orthologous gene pairs with unknown functions were identified as PHX. Only one polyketide synthase gene from each Streptomyces genome was predicted as PHX. Nevertheless, several key genes responsible for producing precursors for secondary metabolites, such as crotonyl-CoA reductase and propionyl-CoA carboxylase, and genes necessary for initiation of secondary metabolism, such as adenosylmethionine synthetase, were among the PHX genes in the two Streptomyces species. The PHX genes exclusive to each genome, and what they imply regarding cellular metabolism, are also discussed.
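The procedure summarized in this abstract (score every gene with the CAI against a ribosomal protein reference set, then take roughly the top 10% as PHX) can be sketched as follows. This is a minimal illustration of the standard CAI definition (geometric mean of relative adaptiveness values), not the authors' pipeline. The three-family codon table, the 0.5 pseudo-count for unseen codons, and all sequences and gene names are hypothetical assumptions.

```python
import math
from collections import Counter

# Illustrative subset of synonymous families (assumption: a real analysis
# would use the full genetic code, excluding Met, Trp and stop codons).
FAMILIES = [["AAA", "AAG"], ["TTT", "TTC"], ["GAA", "GAG"]]

def split_codons(cds):
    """Split a coding sequence into complete codons."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]

def relative_adaptiveness(reference_genes):
    """w(codon) = reference count / count of the most used synonym."""
    counts = Counter(c for g in reference_genes for c in split_codons(g))
    weights = {}
    for family in FAMILIES:
        # Assumption: zero counts are replaced by 0.5 to avoid log(0).
        fam_counts = {c: counts[c] or 0.5 for c in family}
        top = max(fam_counts.values())
        for c, n in fam_counts.items():
            weights[c] = n / top
    return weights

def cai(cds, weights):
    """Geometric mean of w over the codons of one gene that have a weight."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

def top_fraction_cutoff(scores, fraction=0.10):
    """Smallest CAI among the top `fraction` of genes (at least one gene)."""
    k = max(1, round(len(scores) * fraction))
    return sorted(scores, reverse=True)[k - 1]

# Hypothetical reference set standing in for ribosomal protein genes.
reference = ["AAGAAGTTCGAA", "AAGTTCTTCGAA"]
weights = relative_adaptiveness(reference)

# Hypothetical genes to score.
genes = {"g1": "AAGTTCGAA", "g2": "AAATTTGAG", "g3": "AAGTTTGAA"}
scores = {name: cai(seq, weights) for name, seq in genes.items()}
cutoff = top_fraction_cutoff(list(scores.values()), fraction=0.10)
phx = [name for name, s in scores.items() if s >= cutoff]
```

In this toy case `g1` uses only the codons preferred by the reference set and therefore scores highest. With a real genome the cutoff returned by `top_fraction_cutoff` plays the role of the genome-specific thresholds (0.78 and 0.75) quoted in the abstract.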
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
- David E Culley
- Microbiology Department, Pacific Northwest National Laboratory, 902 Battelle Boulevard, PO Box 999, Mail Stop P7-50, Richland, WA 99352, USA
- Weiwen Zhang
- Microbiology Department, Pacific Northwest National Laboratory, 902 Battelle Boulevard, PO Box 999, Mail Stop P7-50, Richland, WA 99352, USA
|
84
|
Meinersmann RJ, Phillips RW, Hiett KL, Fedorka-Cray P. Differentiation of campylobacter populations as demonstrated by flagellin short variable region sequences. Appl Environ Microbiol 2005; 71:6368-74. [PMID: 16204559 PMCID: PMC1265918 DOI: 10.1128/aem.71.10.6368-6374.2005] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 01/11/2005] [Accepted: 05/10/2005] [Indexed: 11/20/2022] Open
Abstract
The DNA sequence of the flaA short variable region (SVR) was used to analyze a random population of Campylobacter isolates to investigate the weakly clonal population structure of members of the genus. The SVR sequence from 197 strains of C. jejuni and C. coli isolated from humans, bovine, swine, and chickens identified a group of 43 strains containing disparate short variable region sequences compared to the rest of the population. This group contains both C. jejuni and C. coli strains but disproportionately consisted of bovine isolates. Relative synonymous codon usage analysis of the sequences identified two groups: one group typified C. jejuni, and the second group was characteristic for C. coli and the disparate alleles were not clustered. The data show that there is significant differentiation of Campylobacter populations according to the source of the isolate even without considering the disparate isolates. Even though there is significant differentiation of chicken and bovine isolates, the bovine isolates did not show any difference in ability to colonize chickens. It is possible that disparate sequences were obtained through the lateral transfer of DNA from Campylobacter species other than C. jejuni and C. coli. It is evident that recombination within the flaA SVR occurs rapidly. However, the rate of migration between populations appears to limit the distribution of sequences and results in a weakly clonal population structure.
Affiliation(s)
- Richard J Meinersmann
- USDA-ARS, Richard J. Russell Research Center, P.O. Box 5677, Athens, GA 30604-5677, USA.
|
85
|
Niemitalo O, Neubauer A, Liebal U, Myllyharju J, Juffer AH, Neubauer P. Modelling of translation of human protein disulfide isomerase in Escherichia coli: a case study of gene optimisation. J Biotechnol 2005; 120:11-24. [PMID: 16111781 DOI: 10.1016/j.jbiotec.2005.05.028] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 12/30/2004] [Revised: 04/21/2005] [Accepted: 05/04/2005] [Indexed: 11/30/2022]
Abstract
Recombinant human protein disulfide isomerase (PDI) was expressed in vivo in Escherichia coli using a non-optimised gene sequence and an optimised sequence with four 5' codons substituted by synonymous codons that take less time to translate. The optimisation resulted in a 2-fold increase of total PDI concentration and by successive optimisation with expression at low temperature in a 10-fold increase of the amount of soluble PDI in comparison with the original wild-type construct. The improvement can be due to a faster clearing of the ribosome binding site on the mRNA, elevating the translation initiation rate and resulting in higher ribosome loading and better ribosome protection of the PDI mRNA against endonucleolytic cleavage by RNase. This hypothesis was supported by a novel computer simulation model of E. coli translational ribosome traffic based upon the stochastic Gillespie algorithm. The study indicates the applicability of such models in optimisation of recombinant protein sequences.
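To illustrate the kind of model this abstract refers to, here is a minimal Gillespie-style simulation of ribosomes moving along a single mRNA. It is a sketch under strong simplifying assumptions and is not the authors' model: each ribosome occupies one codon, initiation is blocked only while the first codon is occupied, mRNA degradation is ignored, and every rate is a hypothetical value in arbitrary units.

```python
import random

def simulate_translation(codon_rates, init_rate, t_end, seed=0):
    """Gillespie simulation of ribosome traffic on one mRNA.

    codon_rates[i] is the rate of leaving codon i; leaving the last codon
    releases a finished protein. Returns the number of proteins completed
    before t_end.
    """
    rng = random.Random(seed)
    n = len(codon_rates)
    occupied = [False] * n
    t, proteins = 0.0, 0
    while True:
        # Enumerate every event that is currently possible, with its rate.
        events = []
        if not occupied[0]:
            events.append((init_rate, "init", 0))
        for i in range(n):
            if occupied[i] and (i == n - 1 or not occupied[i + 1]):
                events.append((codon_rates[i], "move", i))
        total = sum(rate for rate, _, _ in events)
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return proteins
        # Choose one event with probability proportional to its rate.
        pick = rng.uniform(0.0, total)
        for rate, kind, i in events:
            pick -= rate
            if pick <= 0.0:
                break
        if kind == "init":
            occupied[0] = True
        elif i == n - 1:
            occupied[i] = False          # termination releases a protein
            proteins += 1
        else:
            occupied[i], occupied[i + 1] = False, True

# Hypothetical 30-codon messages: four slow 5' codons versus all fast codons.
slow_start = [2.0] * 4 + [10.0] * 26
fast_start = [10.0] * 30
slow_count = simulate_translation(slow_start, init_rate=5.0, t_end=200.0)
fast_count = simulate_translation(fast_start, init_rate=5.0, t_end=200.0)
```

Because a ribosome lingering on slow 5' codons keeps the start site occupied, the slow-start message admits fewer new ribosomes. In this toy setting that reproduces the qualitative effect described above, in which faster 5' codons raise the effective initiation rate.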
Affiliation(s)
- Olli Niemitalo
- Bioprocess Engineering Laboratory, Department of Process and Environmental Engineering, University of Oulu, Oulu, Finland
|
86
|
Abstract
The Arthur M. Sackler Colloquium of the National Academy of Sciences, "Frontiers in Bioinformatics: Unsolved Problems and Challenges," organized by David Eisenberg, Russ Altman, and myself, was held October 15-17, 2004, to provide a forum for discussing concepts and methods in bioinformatics serving the biological and medical sciences. The deluge of genomic and proteomic data in the last two decades has driven the creation of tools that search and analyze biomolecular sequences and structures. Bioinformatics is highly interdisciplinary, using knowledge from mathematics, statistics, computer science, biology, medicine, physics, chemistry, and engineering.
Affiliation(s)
- Samuel Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
|
87
|
Sun J, Chen M, Xu J, Luo J. Relationships among stop codon usage bias, its context, isochores, and gene expression level in various eukaryotes. J Mol Evol 2005; 61:437-44. [PMID: 16170455 DOI: 10.1007/s00239-004-0277-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 09/09/2004] [Accepted: 01/25/2005] [Indexed: 11/25/2022]
Abstract
It is well known that stop codons play a critical role in the process of protein synthesis. However, little effort has been made to investigate whether stop codon usage exhibits biases, such as widely seen for synonymous codon usage. Here we systematically investigate stop codon usage bias in various eukaryotes as well as its relationships with its context, GC3 content, gene expression level, and secondary structure. The results show that there is a strong bias for stop codon usage in different eukaryotes, i.e., UAA is overrepresented in the lower eukaryotes, UGA is overrepresented in the higher eukaryotes, and UAG is least used in all eukaryotes. Different conserved patterns for each stop codon in different eukaryotic classes are found based on information content and logo analysis. GC3 contents increase with increasing complexity of organisms. Secondary structure prediction revealed that UAA is generally associated with loop structures, whereas UGA is more uniformly present in loop and stem structures, i.e., UGA is less biased toward having a particular structure. The stop codon usage bias, however, shows no significant relationship with GC3 content and gene expression level in individual eukaryotes. The results indicate that genomic complexity and GC3 content might contribute to stop codon usage bias in different eukaryotes. Our results indicate that stop codons, like synonymous codons, exhibit biases in usage. Additional work will be needed to understand the causes of these biases and their relationship to the mechanism of protein termination.
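The basic measurement behind this abstract, the share of genes terminated by each of the three stop codons, can be sketched in a few lines. The function name and the four example sequences are hypothetical, and sequences are assumed to be DNA coding sequences that include their stop codon.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TGA", "TAG")  # DNA spelling of UAA, UGA, UAG

def stop_codon_usage(coding_sequences):
    """Fraction of genes ending in each stop codon; other endings are ignored."""
    counts = Counter(
        seq[-3:].upper() for seq in coding_sequences
        if seq[-3:].upper() in STOP_CODONS
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {stop: counts[stop] / total for stop in STOP_CODONS}

# Hypothetical coding sequences, each ending in a stop codon.
usage = stop_codon_usage(["ATGAAATAA", "ATGCCCTGA", "ATGGGGTAA", "ATGTTTTAG"])
```

Comparing such fractions across genomes, or between expression classes within one genome, is the kind of tabulation on which a bias claim like "UAG is least used" would rest.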
Affiliation(s)
- Jingchun Sun
- School of Life Sciences & Technology, Shanghai Jiaotong University, Shanghai 200240, China
|
88
|
Das S, Ghosh S, Pan A, Dutta C. Compositional variation in bacterial genes and proteins with potential expression level. FEBS Lett 2005; 579:5205-10. [PMID: 16165133 DOI: 10.1016/j.febslet.2005.08.042] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 07/11/2005] [Accepted: 08/22/2005] [Indexed: 11/22/2022]
Abstract
Usage of guanine and cytosine at the three codon sites in eubacterial genes varies distinctly with potential expressivity, as predicted by the Codon Adaptation Index (CAI). In bacteria with moderate/high GC-content, G(3) follows a biphasic relationship, while C(3) increases with CAI. In AT-rich bacteria, the correlation of CAI is negative with G(3), but non-specific with C(3). Correlations of CAI with residues encoded by G-starting codons are positive, while those with residues encoded by C-starting codons are usually negative or random. Average Size/Complexity Score and aromaticity of gene-products decrease with CAI, confirming the general validity of the cost-minimization principle in free-living eubacteria. Alcoholicity of bacterial gene-products usually decreases with expressivity.
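The quantities G(3) and C(3) in this abstract are the frequencies of guanine and cytosine at the third codon position. A minimal sketch of how they might be computed for one gene follows; the function name and the example sequence are hypothetical.

```python
def third_position_composition(cds):
    """Fractions of G and C at the third position of each complete codon."""
    usable = len(cds) - len(cds) % 3   # ignore a partial trailing codon
    thirds = [cds[i + 2].upper() for i in range(0, usable, 3)]
    if not thirds:
        return {"G3": 0.0, "C3": 0.0, "GC3": 0.0}
    g3 = thirds.count("G") / len(thirds)
    c3 = thirds.count("C") / len(thirds)
    return {"G3": g3, "C3": c3, "GC3": g3 + c3}

# Hypothetical sequence: AAG TTC GAA GCG -> third positions G, C, A, G
comp = third_position_composition("AAGTTCGAAGCG")
```

Computing these values for every gene and relating them to each gene's CAI would reproduce the style of analysis described above, for instance checking whether C(3) rises with CAI.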
Affiliation(s)
- Sabyasachi Das
- Bioinformatics Center, Indian Institute of Chemical Biology, 4, Raja S.C. Mullick Road, Kolkata 700 032, India
|
89
|
Gade D, Theiss D, Lange D, Mirgorodskaya E, Lombardot T, Glöckner FO, Kube M, Reinhardt R, Amann R, Lehrach H, Rabus R, Gobom J. Towards the proteome of the marine bacterium Rhodopirellula baltica: Mapping the soluble proteins. Proteomics 2005; 5:3654-71. [PMID: 16127728 DOI: 10.1002/pmic.200401201] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/09/2022]
Abstract
The marine bacterium Rhodopirellula baltica, a member of the phylum Planctomycetes, has distinct morphological properties and contributes to remineralization of biomass in the natural environment. On the basis of its recently determined complete genome we investigated its proteome by 2-DE and established a reference 2-DE gel for the soluble protein fraction. Approximately 1000 protein spots were excised from a colloidal Coomassie-stained gel (pH 4-7), analyzed by MALDI-MS and identified by PMF. The non-redundant data set contained 626 distinct protein spots, corresponding to 558 different genes. The identified proteins were classified into role categories according to their predicted functions. The experimentally determined and the theoretically predicted proteomes were compared. Proteins that were most abundant in 2-DE gels, and whose coding genes were also predicted to be highly expressed, could be linked mainly to housekeeping functions in glycolysis, the tricarboxylic acid cycle, amino acid biosynthesis, protein quality control and translation. Absence of predictable signal peptides indicated a localization of these proteins in the intracellular compartment, the pirellulosome. Among the identified proteins, 146 contained a predicted signal peptide suggesting their translocation. Some proteins were detected in more than one spot on the gel, indicating post-translational modification. In addition to identifying proteins present in the published sequence database for R. baltica, an alternative approach was used, in which the mass spectrometric data were searched against a maximal ORF set, allowing the identification of four previously unpredicted ORFs. The 2-DE reference map presented here will serve as a framework for further experiments to study differential gene expression of R. baltica in response to external stimuli or cellular development and compartmentalization.
Affiliation(s)
- Dörte Gade
- Max Planck Institute for Marine Microbiology, Bremen, Germany
|
90
|
Fu QS, Li F, Chen LL. Gene expression analysis of six GC-rich Gram-negative phytopathogens. Biochem Biophys Res Commun 2005; 332:380-7. [PMID: 15910748 DOI: 10.1016/j.bbrc.2005.04.128] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 04/21/2005] [Accepted: 04/26/2005] [Indexed: 11/20/2022]
Abstract
Predicted highly expressed (PHX) genes are comparatively analyzed for six GC-rich Gram-negative phytopathogens, i.e., Ralstonia solanacearum, Agrobacterium tumefaciens, Xanthomonas campestris pv. campestris (Xcc), Xanthomonas axonopodis pv. citri (Xac), Pseudomonas syringae pv. tomato, and Xylella fastidiosa. Enzymes involved in energy metabolism, such as ATP synthase, and genes involved in the TCA cycle are PHX in most of these bacteria except X. fastidiosa, which prefers an anaerobic environment. Most pathogenicity-related factors, including flagellar proteins and some outer membrane proteins, are PHX, except that flagellar proteins are missing in X. fastidiosa, which is spread by insects and does not need to move during invasion. Although components of the type III secretion system apparatus are homologous to flagellar proteins, none of them is PHX, which supports the viewpoint that the two types of genes have evolved independently. Furthermore, it is revealed that some biosynthesis-related enzymes are highly expressed in certain bacteria. The PHX genes may provide potential drug targets for the design of new bactericides.
Affiliation(s)
- Qing-Shan Fu
- Shandong Provincial Research Center for Bioinformatic Engineering and Technique, Center for Advanced Study, Shandong University of Technology, Zibo 255049, China
|
91
|
Supek F, Vlahoviček K. Comparison of codon usage measures and their applicability in prediction of microbial gene expressivity. BMC Bioinformatics 2005; 6:182. [PMID: 16029499 PMCID: PMC1199580 DOI: 10.1186/1471-2105-6-182] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.9] [Received: 02/11/2005] [Accepted: 07/19/2005] [Indexed: 12/18/2022] Open
Abstract
Background: There are a number of methods (also called: measures) currently in use that quantify codon usage in genes. These measures are often influenced by other sequence properties, such as length. This can introduce strong methodological bias into measurements; therefore we attempted to develop a method free from such dependencies. One of the common applications of codon usage analyses is to quantitatively predict gene expressivity.
Results: We compared the performance of several commonly used measures and a novel method we introduce in this paper, Measure Independent of Length and Composition (MILC). Large, randomly generated sequence sets were used to test for dependence on (i) sequence length, (ii) overall amount of codon bias and (iii) codon bias discrepancy in the sequences. A derivative of the method, named MELP (MILC-based Expression Level Predictor), can be used to quantitatively predict gene expression levels from genomic data. It was compared to other similar predictors by examining their correlation with actual, experimentally obtained mRNA or protein abundances.
Conclusion: We have established that MILC is a generally applicable measure, being resistant to changes in gene length and overall nucleotide composition, and introducing little noise into measurements. Other methods, however, may also be appropriate in certain applications. Our efforts to quantitatively predict gene expression levels in several prokaryotes and unicellular eukaryotes met with varying levels of success, depending on the experimental dataset and predictor used. Out of all methods, MELP and Rainer Merkl's GCB method had the most consistent behaviour. A 'reference set' containing known ribosomal protein genes appears to be a valid starting point for a codon usage-based expressivity prediction.
Affiliation(s)
- Fran Supek
- Department of Molecular Biology, Division of Biology, Faculty of Science, Zagreb University, Rooseveltov trg 6, 10000 Zagreb, Croatia
- Kristian Vlahoviček
- Department of Molecular Biology, Division of Biology, Faculty of Science, Zagreb University, Rooseveltov trg 6, 10000 Zagreb, Croatia
- Protein Structure and Bioinformatics, International Centre for Genetic Engineering and Biotechnology, Padriciano 99, 34012 Trieste, Italy
|
92
|
Gurvich OL, Baranov PV, Gesteland RF, Atkins JF. Expression levels influence ribosomal frameshifting at the tandem rare arginine codons AGG_AGG and AGA_AGA in Escherichia coli. J Bacteriol 2005; 187:4023-32. [PMID: 15937165 PMCID: PMC1151738 DOI: 10.1128/jb.187.12.4023-4032.2005] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Indexed: 11/20/2022] Open
Abstract
The rare codons AGG and AGA comprise 2% and 4%, respectively, of the arginine codons of Escherichia coli K-12, and their cognate tRNAs are sparse. At tandem occurrences of either rare codon, the paucity of cognate aminoacyl tRNAs for the second codon of the pair facilitates peptidyl-tRNA shifting to the +1 frame. However, AGG_AGG and AGA_AGA are not underrepresented and occur 4 and 42 times, respectively, in E. coli genes. Searches for corresponding occurrences in other bacteria provide no strong support for the functional utilization of frameshifting at these sequences. All sequences tested in their native context showed 1.5 to 11% frameshifting when expressed from multicopy plasmids. A cassette with one of these sequences singly integrated into the chromosome in stringent cells gave 0.9% frameshifting in contrast to two- to four-times-higher values obtained from multicopy plasmids in stringent cells and eight-times-higher values in relaxed cells. Thus, +1 frameshifting efficiency at AGG_AGG and AGA_AGA is influenced by the mRNA expression level. These tandem rare codons do not occur in highly expressed mRNAs.
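Counting in-frame tandem occurrences such as AGG_AGG can be sketched as follows. This is an illustration only: `count_tandem_codons` is a hypothetical helper, and the convention that three identical codons in a row count as two overlapping pairs is an assumption, not something stated in the abstract.

```python
def count_tandem_codons(cds, codon):
    """Count in-frame positions where `codon` is immediately followed by itself.

    Assumption: `cds` starts in frame. Overlapping pairs are counted
    separately, so three identical codons in a row give a count of 2.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return sum(1 for first, second in zip(codons, codons[1:])
               if first == codon and second == codon)
```

Splitting into codons before comparing is what restricts the search to the reading frame; a plain substring search for `AGGAGG` would also match out-of-frame occurrences.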
Affiliation(s)
- Olga L Gurvich
- Department of Human Genetics, University of Utah, 15N 2030E, Rm. 7410, Salt Lake City, Utah 84112-5330, USA
|
93
|
Karlin S, Mrázek J, Ma J, Brocchieri L. Predicted highly expressed genes in archaeal genomes. Proc Natl Acad Sci U S A 2005; 102:7303-8. [PMID: 15883368 PMCID: PMC1129124 DOI: 10.1073/pnas.0502313102] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] Open
Abstract
Based primarily on 16S rRNA sequence comparisons, life has been broadly divided into the three domains of Bacteria, Archaea, and Eukarya. Archaea is further classified into Crenarchaea and Euryarchaea. Archaea generally thrive in extreme environments as assessed by temperature, pH, and salinity. For many prokaryotic organisms, ribosomal proteins (RP), transcription/translation factors, and chaperone genes tend to be highly expressed. A gene is predicted highly expressed (PHX) if its codon usage is rather similar to the average codon usage of at least one of the RP, transcription/translation factors, and chaperone gene classes and deviates strongly from the average gene of the genome. The thermosome (Ths) chaperonin family represents the most salient PHX genes among Archaea. The chaperones Trigger factor and HSP70 have overlapping functions in the folding process, but both of these proteins are lacking in most archaea where they may be substituted by the chaperone prefoldin. Other distinctive PHX proteins of Archaea, absent from Bacteria, include the proliferating cell nuclear antigen PCNA, a replication auxiliary factor responsible for tethering the catalytic unit of DNA polymerase to DNA during high-speed replication, and the acidic RP P0, which helps to initiate mRNA translation at the ribosome. Other PHX genes feature Cell division control protein 48 (Cdc48), whereas the bacterial septation proteins FtsZ and minD are lacking in Crenarchaea. RadA is a major DNA repair and recombination protein of Archaea. Archaeal genomes feature a strong Shine-Dalgarno ribosome-binding motif more pronounced in Euryarchaea compared with Crenarchaea.
Affiliation(s)
- Samuel Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
|
94
|
Lithwick G, Margalit H. Relative predicted protein levels of functionally associated proteins are conserved across organisms. Nucleic Acids Res 2005; 33:1051-7. [PMID: 15718304 PMCID: PMC549420 DOI: 10.1093/nar/gki261] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022] Open
Abstract
We show that the predicted protein levels of functionally related proteins change in a coordinated fashion over many unicellular organisms. For each protein, we created a profile containing a protein abundance measure in each of a set of organisms. We show that for functionally related proteins these profiles tend to be correlated. Using the Codon Adaptation Index as a predictor of protein abundance in 48 unicellular organisms, we demonstrated this phenomenon for two types of functional relations: for proteins that physically interact and for proteins involved in consecutive steps within a metabolic pathway. Our results suggest that the protein abundance levels of functionally related proteins co-evolve.
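The idea of correlated abundance profiles can be made concrete with a small sketch. It assumes each protein is represented by a list of predicted abundance values (for example CAI), one per organism and in the same organism order, and it uses Pearson correlation as the similarity measure. The abstract does not say which correlation statistic was used, so that choice and the name `profile_correlation` are assumptions.

```python
import math


def profile_correlation(profile_a, profile_b):
    """Pearson correlation between two per-organism abundance profiles.

    Assumption: both profiles list the same organisms in the same order.
    Returns NaN when either profile has no variance.
    """
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_a = sum(profile_a) / len(profile_a)
    mean_b = sum(profile_b) / len(profile_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)
```

Under this reading, a pair of interacting proteins whose profiles rise and fall together across organisms would score close to 1, while unrelated pairs would scatter around 0.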
Affiliation(s)
- Hanah Margalit
|
95
|
Raghunathan A, Price ND, Galperin MY, Makarova KS, Purvine S, Picone AF, Cherny T, Xie T, Reilly TJ, Munson R, Tyler RE, Akerley BJ, Smith AL, Palsson BO, Kolker E. In Silico Metabolic Model and Protein Expression of Haemophilus influenzae Strain Rd KW20 in Rich Medium. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2004; 8:25-41. [PMID: 15107235 DOI: 10.1089/153623104773547471] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
The intermediary metabolism of Haemophilus influenzae strain Rd KW20 was studied by a combination of protein expression analysis using a recently developed direct proteomics approach, mutational analysis, and mathematical modeling. Special emphasis was placed on carbon utilization, sugar fermentation, TCA cycle, and electron transport of H. influenzae cells grown microaerobically and anaerobically in a rich medium. The data indicate that several H. influenzae metabolic proteins similar to Escherichia coli proteins, known to be regulated by low concentrations of oxygen, were well expressed in both growth conditions in H. influenzae. An in silico model of the H. influenzae metabolic network was used to study the effects of selective deletion of certain enzymatic steps. This allowed us to define proteins predicted to be essential or non-essential for cell growth and to address numerous unresolved questions about intermediary metabolism of H. influenzae. Comparison of data from in vivo protein expression with the protein list associated with a genome-scale metabolic model showed significant coverage of the known metabolic proteome. This study demonstrates the significance of an integrated approach to the characterization of H. influenzae metabolism.
Affiliation(s)
- Anu Raghunathan
- Department of Bioengineering, University of California at San Diego, La Jolla, California, USA
|
96
|
Westers L, Westers H, Quax WJ. Bacillus subtilis as cell factory for pharmaceutical proteins: a biotechnological approach to optimize the host organism. BIOCHIMICA ET BIOPHYSICA ACTA-MOLECULAR CELL RESEARCH 2004; 1694:299-310. [PMID: 15546673 DOI: 10.1016/j.bbamcr.2004.02.011] [Citation(s) in RCA: 307] [Impact Index Per Article: 15.4] [Received: 12/10/2003] [Revised: 02/13/2004] [Accepted: 02/16/2004] [Indexed: 11/17/2022]
Abstract
Bacillus subtilis is a rod-shaped, Gram-positive soil bacterium that secretes numerous enzymes to degrade a variety of substrates, enabling the bacterium to survive in a continuously changing environment. These enzymes are produced commercially and this production represents about 60% of the industrial-enzyme market. Unfortunately, the secretion of heterologous proteins, originating from Gram-negative bacteria or from eukaryotes, is often severely hampered. Several bottlenecks in the B. subtilis secretion pathway, such as poor targeting to the translocase, degradation of the secretory protein, and incorrect folding, have been revealed. Nevertheless, research into the mechanisms and control of the secretion pathways will lead to improved Bacillus protein secretion systems and broaden the applications as industrial production host. This review focuses on studies that aimed at optimizing B. subtilis as cell factory for commercially interesting heterologous proteins.
Affiliation(s)
- Lidia Westers
- Department of Pharmaceutical Biology, University of Groningen, Antonius Deusinglaan 1, 9713 AV Groningen, The Netherlands
|
97
|
Martín-Galiano AJ, Wells JM, de la Campa AG. Relationship between codon biased genes, microarray expression values and physiological characteristics of Streptococcus pneumoniae. Microbiology (Reading) 2004; 150:2313-2325. [PMID: 15256573 DOI: 10.1099/mic.0.27097-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Abstract
A codon-profile strategy was used to predict gene expression levels in Streptococcus pneumoniae. Predicted highly expressed (PHE) genes included those encoding glycolytic and fermentative enzymes, sugar-conversion systems and carbohydrate-transporters. Additionally, some genes required for infection that are involved in oxidative metabolism and hydrogen peroxide production were PHE. Low expression values were predicted for genes encoding specific regulatory proteins like two-component systems and competence genes. Correspondence analysis localized 484 ORFs which shared a distinctive codon profile in the right horn. These genes had a mean G+C content (33·4 %) that was lower than the bulk of the genome coding sequences (39·7 %), suggesting that many of them were acquired by horizontal transfer. Half of these genes (242) were pseudogenes, ORFs shorter than 80 codons or without assigned function. The remaining genes included several virulence factors, such as capsular genes, iga, lytB, nanB, pspA, choline-binding proteins, and functions related to DNA acquisition, such as restriction-modification systems and comDE. In order to compare predicted translation rate with the relative amounts of mRNA for each gene, the codon adaptation index (CAI) values were compared with microarray fluorescence intensity values following hybridization of labelled RNA from laboratory-grown cultures. High mRNA amounts were observed in 32·5 % of PHE genes and in 64 % of the 25 genes with the highest CAI values. However, high relative amounts of RNA were also detected in 10·4 % of non-PHE genes, such as those encoding fatty acid metabolism enzymes and proteases, suggesting that their expression might also be regulated at the level of transcription or mRNA stability under the conditions tested. The effects of codon bias and mRNA amount on different gene groups in S. pneumoniae are discussed.
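Since this abstract compares CAI values with measured mRNA levels, a compact sketch of how a CAI score is commonly computed may help. It follows the usual definition (geometric mean of each codon's relative adaptiveness within its synonymous family, estimated from a reference set of highly expressed genes). The names `relative_adaptiveness` and `cai`, the use of the standard genetic code, and the `pseudocount` of 0.5 for codons absent from the reference set are assumptions of this sketch, not details taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence into codons (DNA alphabet)."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds, pseudocount=0.5):
    """Weight of each codon relative to the most used synonym in the reference set.

    Stop codons and single-codon amino acids (Met, Trp) get no weight.
    Assumption: codons never seen in the reference set count as `pseudocount`.
    """
    counts = Counter()
    for cds in reference_cds:
        counts.update(split_codons(cds))
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        families[amino_acid].append(codon)
    weights = {}
    for amino_acid, family in families.items():
        if amino_acid == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of codon weights over the codons of `cds` that have one."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    if not logs:
        return float("nan")
    return math.exp(sum(logs) / len(logs))
```

Scores computed this way range from just above 0 to 1, and the per-gene values could then be ranked against microarray intensities in the manner the abstract describes.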
Affiliation(s)
- Antonio J Martín-Galiano
- Unidad de Genética Bacteriana (CSIC), Centro Nacional de Microbiología, Instituto de Salud Carlos III, 28220, Majadahonda, Madrid, Spain
- Jerry M Wells
- Bacterial Infection and Immunity Group, Institute of Food Research, Norwich Research Park, Norwich NR4 7UA, UK
- Adela G de la Campa
- Unidad de Genética Bacteriana (CSIC), Centro Nacional de Microbiología, Instituto de Salud Carlos III, 28220, Majadahonda, Madrid, Spain
|
98
|
dos Reis M, Wernisch L, Savva R. Unexpected correlations between gene expression and codon usage bias from microarray data for the whole Escherichia coli K-12 genome. Nucleic Acids Res 2004; 31:6976-85. [PMID: 14627830 PMCID: PMC290265 DOI: 10.1093/nar/gkg897] [Citation(s) in RCA: 169] [Impact Index Per Article: 8.5] [Indexed: 11/13/2022] Open
Abstract
Escherichia coli has long been regarded as a model organism in the study of codon usage bias (CUB). However, most studies in this organism regarding this topic have been computational or, when experimental, restricted to small datasets; particularly poor attention has been given to genes with low CUB. In this work, correspondence analysis on codon usage is used to classify E. coli genes into three groups, and the relationship between them and expression levels from microarray experiments is studied. These groups are: group 1, highly biased genes; group 2, moderately biased genes; and group 3, AT-rich genes with low CUB. It is shown that, surprisingly, there is a negative correlation between codon bias and expression levels for group 3 genes, i.e. genes with extremely low codon adaptation index (CAI) values are highly expressed, while group 2 show the lowest average expression levels and group 1 show the usual expected positive correlation between CAI and expression. This trend is maintained over all functional gene groups, seeming to contradict the E. coli-yeast paradigm on CUB. It is argued that these findings are still compatible with the mutation-selection balance hypothesis of codon usage and that E. coli genes form a dynamic system shaped by these factors.
Affiliation(s)
- Mario dos Reis
- School of Crystallography, Birkbeck College, Malet Street, London WC1E 7HX, UK
|
99
|
Kolker E, Makarova KS, Shabalina S, Picone AF, Purvine S, Holzman T, Cherny T, Armbruster D, Munson RS, Kolesov G, Frishman D, Galperin MY. Identification and functional analysis of 'hypothetical' genes expressed in Haemophilus influenzae. Nucleic Acids Res 2004; 32:2353-61. [PMID: 15121896 PMCID: PMC419445 DOI: 10.1093/nar/gkh555] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Abstract
The progress in genome sequencing has led to a rapid accumulation in GenBank submissions of uncharacterized 'hypothetical' genes. These genes, which have not been experimentally characterized and whose functions cannot be deduced from simple sequence comparisons alone, now comprise a significant fraction of the public databases. Expression analyses of Haemophilus influenzae cells using a combination of transcriptomic and proteomic approaches resulted in confident identification of 54 'hypothetical' genes that were expressed in cells under normal growth conditions. In an attempt to understand the functions of these proteins, we used a variety of publicly available analysis tools. Close homologs in other species were detected for each of the 54 'hypothetical' genes. For 16 of them, exact functional assignments could be found in one or more public databases. Additionally, we were able to suggest general functional characterization for 27 more genes (comprising approximately 80% total). Findings from this analysis include the identification of a pyruvate-formate lyase-like operon, likely to be expressed not only in H. influenzae but also in several other bacteria. Further, we also observed three genes that are likely to participate in the transport and/or metabolism of sialic acid, an important component of the H. influenzae lipo-oligosaccharide. Accurate functional annotation of uncharacterized genes calls for an integrative approach, combining expression studies with extensive computational analysis and curation, followed by eventual experimental verification of the computational predictions.
Affiliation(s)
- Eugene Kolker
- BIATECH, 19310 North Creek Parkway, Suite 115, Bothell, WA 98011, USA.
|
100
|
Campo N, Dias MJ, Daveran-Mingot ML, Ritzenthaler P, Le Bourgeois P. Chromosomal constraints in Gram-positive bacteria revealed by artificial inversions. Mol Microbiol 2004; 51:511-22. [PMID: 14756790 DOI: 10.1046/j.1365-2958.2003.03847.x] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Indexed: 11/20/2022]
Abstract
We used artificial chromosome inversions to investigate the chromosomal constraints that preserve genome organization in the Gram-positive bacterium Lactococcus lactis. Large inversions, 80-1260 kb in length, disturbing the symmetry of the origin and terminus of the replication axis to various extents, were constructed using the site-specific Cre-loxP recombination system. These inversions were all mechanistically feasible and fell into various classes according to stability and effect on cell fitness. The L. lactis chromosome tolerates imbalance in the length of its replication arms only to some extent. The location of detrimental inversions allowed identification of two constrained chromosomal regions: a large domain covering one fifth of the genome that encompasses the origin of replication (Ori domain), and a smaller domain located on the opposite side of the chromosome (Ter domain).
Affiliation(s)
- N Campo
- Laboratoire de Microbiologie et Génétique Moléculaire du CNRS (UMR5100), Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse, France
|