51
Kaderbhai NN, Broadhurst DI, Ellis DI, Goodacre R, Kell DB. Functional genomics via metabolic footprinting: monitoring metabolite secretion by Escherichia coli tryptophan metabolism mutants using FT-IR and direct injection electrospray mass spectrometry. Comp Funct Genomics 2010; 4:376-91. [PMID: 18629082 PMCID: PMC2447367 DOI: 10.1002/cfg.302] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.2] [Received: 01/15/2003] [Revised: 04/23/2003] [Accepted: 05/22/2003] [Indexed: 12/14/2022]
Abstract
We sought to test the hypothesis that mutant bacterial strains could be discriminated from each other on the basis of the metabolites they secrete into the medium (their ‘metabolic footprint’), using two methods of ‘global’ metabolite analysis (FT–IR and direct injection electrospray mass spectrometry). The biological system used was based on a published study of Escherichia coli tryptophan mutants that had been analysed and discriminated by Yanofsky and colleagues using transcriptome analysis. Wild-type strains supplemented with tryptophan or analogues could be discriminated from controls using FT–IR of 24 h broths, as could each of the mutant strains in both minimal and supplemented media. Direct injection electrospray mass spectrometry with unit mass resolution could also be used to discriminate the strains from each other, and had the advantage that the discrimination required the use of just two or three masses in each case. These were determined via a genetic algorithm. Both methods are rapid, reagentless, reproducible and cheap, and might beneficially be extended to the analysis of gene knockout libraries.
Affiliation(s)
- Naheed N Kaderbhai
- Institute of Biological Sciences, University of Wales, Aberystwyth, Wales Ceredigion SY23 3DD, UK
52
Kumar R, Shah P, Swiatlo E, Burgess SC, Lawrence ML, Nanduri B. Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays. BMC Genomics 2010; 11:350. [PMID: 20525227 PMCID: PMC2887815 DOI: 10.1186/1471-2164-11-350] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 11/10/2009] [Accepted: 06/03/2010] [Indexed: 11/10/2022]
Abstract
Background: The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome wide mRNA profiling as well as identification of novel transcripts at a high-resolution. Results: Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions: In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence.
Affiliation(s)
- Ranjit Kumar
- Department of Basic sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS 39762, USA
53
Yu WH, Høvik H, Chen T. A hidden Markov support vector machine framework incorporating profile geometry learning for identifying microbial RNA in tiling array data. Bioinformatics 2010; 26:1423-30. [PMID: 20395286 DOI: 10.1093/bioinformatics/btq162] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
MOTIVATION RNA expression signals detected by high-density genomic tiling microarrays contain comprehensive transcriptomic information of the target organism. Current methods for determining the RNA transcription units are still computation intense and lack the discriminative power. This article describes an efficient and accurate methodology to reveal complicated transcriptional architecture, including small regulatory RNAs, in microbial transcriptome profiles. RESULTS Normalized microarray data were first subject to support vector regression to estimate the profile tendency by reducing noise interruption. A hybrid supervised machine learning algorithm, hidden Markov support vector machines, was then used to classify the underlying state of each probe to 'expression' or 'silence' with the assumption that the consecutive state sequence was a heterogeneous Markov chain. For model construction, we introduced a profile geometry learning method to construct the feature vectors, which considered both intensity profiles and changes of intensities over the probe spacing. Also, a robust strategy was used to dynamically evaluate and select the training set based only on prior computer gene annotation. The algorithm performed better than other methods in accuracy on simulated data, especially for small expressed regions with lower (<1) SNR (signal-to-noise ratio), hence more sensitive for detecting small RNAs. AVAILABILITY AND IMPLEMENTATION Detail implementation steps of the algorithm and the complete result of the transcriptome analysis for a microbial genome Porphyromonas gingivalis W83 can be viewed at http://bioinformatics.forsyth.org/mtd.
Affiliation(s)
- Wen-Han Yu
- Department of Molecular Genetics, The Forsyth Institute, Boston, MA 02115, USA
54
Lorenz C, Gesell T, Zimmermann B, Schoeberl U, Bilusic I, Rajkowitsch L, Waldsich C, von Haeseler A, Schroeder R. Genomic SELEX for Hfq-binding RNAs identifies genomic aptamers predominantly in antisense transcripts. Nucleic Acids Res 2010; 38:3794-808. [PMID: 20348540 PMCID: PMC2887942 DOI: 10.1093/nar/gkq032] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Indexed: 11/14/2022]
Abstract
An unexpectedly high number of regulatory RNAs have been recently discovered that fine-tune the function of genes at all levels of expression. We employed Genomic SELEX, a method to identify protein-binding RNAs encoded in the genome, to search for further regulatory RNAs in Escherichia coli. We used the global regulator protein Hfq as bait, because it can interact with a large number of RNAs, promoting their interaction. The enriched SELEX pool was subjected to deep sequencing, and 8865 sequences were mapped to the E. coli genome. These short sequences represent genomic Hfq-aptamers and are part of potential regulatory elements within RNA molecules. The motif 5′-AAYAAYAA-3′ was enriched in the selected RNAs and confers low-nanomolar affinity to Hfq. The motif was confirmed to bind Hfq by DMS footprinting. The Hfq aptamers are 4-fold more frequent on the antisense strand of protein coding genes than on the sense strand. They were enriched opposite to translation start sites or opposite to intervening sequences between ORFs in operons. These results expand the repertoire of Hfq targets and also suggest that Hfq might regulate the expression of a large number of genes via interaction with cis-antisense RNAs.
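The motif search mentioned in this abstract can be illustrated with a short Python sketch. It assumes the standard IUPAC reading of Y as a pyrimidine (C or T, with U accepted for RNA input). The names `motif_to_regex` and `find_motif` are hypothetical and are not taken from the paper, and the sketch says nothing about how the authors detected enrichment.

```python
import re

# Subset of IUPAC nucleotide codes needed for this example.
# Y (pyrimidine) = C or T; U is accepted so RNA sequences also match.
IUPAC = {
    "A": "A",
    "C": "C",
    "G": "G",
    "T": "[TU]",
    "U": "[TU]",
    "Y": "[CTU]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif="AAYAAYAA"):
    """Return 0-based start positions of every (possibly overlapping) match."""
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("GGAACAATAAGG"))
```

A lookahead is used so that overlapping occurrences, which are common for repetitive A-rich motifs, are all reported rather than only the first of each overlapping run.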
Affiliation(s)
- C Lorenz
- Department of Biochemistry, Medical University of Vienna and University of Veterinary Medicine, Vienna, Austria
55
ten Broeke-Smits NJP, Pronk TE, Jongerius I, Bruning O, Wittink FR, Breit TM, van Strijp JAG, Fluit AC, Boel CHE. Operon structure of Staphylococcus aureus. Nucleic Acids Res 2010; 38:3263-74. [PMID: 20150412 PMCID: PMC2879529 DOI: 10.1093/nar/gkq058] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022]
Abstract
In bacteria, gene regulation is one of the fundamental characteristics of survival, colonization and pathogenesis. Operons play a key role in regulating expression of diverse genes involved in metabolism and virulence. However, operon structures in pathogenic bacteria have been determined only by in silico approaches that are dependent on factors such as intergenic distances and terminator/promoter sequences. Knowledge of operon structures is crucial to fully understand the pathophysiology of infections. Presently, transcriptome data obtained from growth curves in a defined medium were used to predict operons in Staphylococcus aureus. This unbiased approach and the use of five highly reproducible biological replicates resulted in 93.5% significantly regulated genes. These data, combined with Pearson's correlation coefficients of the transcriptional profiles, enabled us to accurately compile 93% of the genome in operon structures. A total of 1640 genes of different functional classes were identified in operons. Interestingly, we found several operons containing virulence genes and showed synergistic effects for two complement convertase inhibitors transcribed in one operon. This is the first experimental approach to fully identify operon structures in S. aureus. It forms the basis for further in vitro regulation studies that will profoundly advance the understanding of bacterial pathophysiology in vivo.
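The idea of using Pearson's correlation between transcriptional profiles to group genes into operons can be sketched as follows. This is only an illustration under stated assumptions: neighbouring genes are merged when they lie on the same strand and their profiles correlate above a cutoff. The same-strand rule, the 0.9 threshold, and the function names are assumptions made for the example and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def group_operons(genes, threshold=0.9):
    """Group genes, given in genome order, into candidate operons.

    genes: list of (name, strand, expression_profile) tuples.
    threshold: hypothetical correlation cutoff chosen for this sketch.
    """
    operons = []
    for gene in genes:
        if operons:
            previous = operons[-1][-1]
            same_strand = previous[1] == gene[1]
            if same_strand and pearson(previous[2], gene[2]) >= threshold:
                operons[-1].append(gene)  # extend the current operon
                continue
        operons.append([gene])  # start a new operon
    return [[g[0] for g in operon] for operon in operons]


if __name__ == "__main__":
    demo = [
        ("geneA", "+", [1.0, 2.0, 3.0, 4.0]),
        ("geneB", "+", [2.0, 4.0, 6.0, 8.5]),
        ("geneC", "+", [4.0, 3.0, 2.0, 1.0]),
        ("geneD", "-", [4.0, 3.0, 2.0, 1.0]),
    ]
    print(group_operons(demo))
```

In the demo data, geneC and geneD have identical profiles but are kept apart because they lie on opposite strands, which shows why correlation alone is not treated as sufficient in this sketch.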
56
Høvik H, Chen T. Dynamic probe selection for studying microbial transcriptome with high-density genomic tiling microarrays. BMC Bioinformatics 2010; 11:82. [PMID: 20144223 PMCID: PMC2836303 DOI: 10.1186/1471-2105-11-82] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/15/2009] [Accepted: 02/09/2010] [Indexed: 12/27/2022]
Abstract
Background: Current commercial high-density oligonucleotide microarrays can hold millions of probe spots on a single microscopic glass slide and are ideal for studying the transcriptome of microbial genomes using a tiling probe design. This paper describes a comprehensive computational pipeline implemented specifically for designing tiling probe sets to study microbial transcriptome profiles. Results: The pipeline identifies every possible probe sequence from both forward and reverse-complement strands of all DNA sequences in the target genome including circular or linear chromosomes and plasmids. Final probe sequence lengths are adjusted based on the maximal oligonucleotide synthesis cycles and best isothermality allowed. Optimal probes are then selected in two stages - sequential and gap-filling. In the sequential stage, probes are selected from sequence windows tiled alongside the genome. In the gap-filling stage, additional probes are selected from the largest gaps between adjacent probes that have already been selected, until a predefined number of probes is reached. Selection of the highest quality probe within each window and gap is based on five criteria: sequence uniqueness, probe self-annealing, melting temperature, oligonucleotide length, and probe position. Conclusions: The probe selection pipeline evaluates global and local probe sequence properties and selects a set of probes dynamically and evenly distributed along the target genome. Unique to other similar methods, an exact number of non-redundant probes can be designed to utilize all the available probe spots on any chosen microarray platform. The pipeline can be applied to microbial genomes when designing high-density tiling arrays for comparative genomics, ChIP chip, gene expression and comprehensive transcriptome studies.
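The two-stage selection described in this abstract (sequential, then gap-filling) can be sketched in Python. This is a minimal illustration, not the authors' pipeline: it assumes a linear genome, unique candidate positions, and a single precomputed quality score per candidate standing in for the five criteria listed in the abstract. The function name `select_probes` and its parameters are hypothetical.

```python
def select_probes(candidates, genome_length, window, target):
    """Pick probes in a sequential stage followed by a gap-filling stage.

    candidates: list of (position, score) tuples; a higher score is better.
    window: width of the tiling windows used in the sequential stage.
    target: number of probes wanted after gap filling.
    """
    chosen = {}

    # Sequential stage: keep the best-scoring candidate in each window.
    for start in range(0, genome_length, window):
        in_window = [c for c in candidates if start <= c[0] < start + window]
        if in_window:
            best = max(in_window, key=lambda c: c[1])
            chosen[best[0]] = best

    # Gap-filling stage: repeatedly add the best candidate from the
    # largest gap that still contains an unused candidate.
    while len(chosen) < target:
        positions = sorted(chosen)
        gaps = sorted(
            zip(positions, positions[1:]),
            key=lambda gap: gap[1] - gap[0],
            reverse=True,
        )
        for left, right in gaps:
            inside = [c for c in candidates if left < c[0] < right]
            if inside:
                best = max(inside, key=lambda c: c[1])
                chosen[best[0]] = best
                break
        else:
            break  # no gap has a candidate left, so stop early

    return sorted(chosen.values())


if __name__ == "__main__":
    demo = [(5, 0.9), (15, 0.5), (40, 0.8), (60, 0.7), (95, 0.6)]
    print(select_probes(demo, genome_length=100, window=50, target=4))
```

The sketch only fills gaps between already selected probes, so candidates beyond the first or last chosen position are never added, and it does not trim the set if the sequential stage alone exceeds the target. A fuller version would need to handle circular chromosomes and both strands, which the abstract says the pipeline covers.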
Affiliation(s)
- Hedda Høvik
- Department of Oral Biology, Faculty of Dentistry, University of Oslo, Oslo, Norway
57
Abstract
Small RNAs (sRNAs) that act by base pairing with trans-encoded mRNAs modulate metabolism in response to a variety of environmental stimuli. Here, we describe an Hfq-binding sRNA (FnrS) whose expression is induced upon a shift from aerobic to anaerobic conditions and which acts to downregulate the levels of a variety of mRNAs encoding metabolic enzymes. Anaerobic induction in minimal medium depends strongly on FNR but is also affected by the ArcA and CRP transcription regulators. Whole genome expression analysis showed that the levels of at least 32 mRNAs are downregulated upon FnrS overexpression, 15 of which are predicted to base pair with FnrS by TargetRNA. The sRNA is highly conserved across its entire length in numerous Enterobacteria, and mutational analysis revealed that two separate regions of FnrS base pair with different sets of target mRNAs. The majority of the target genes were previously reported to be downregulated in an FNR-dependent manner but lack recognizable FNR binding sites. We thus suggest that FnrS extends the FNR regulon and increases the efficiency of anaerobic metabolism by repressing the synthesis of enzymes that are not needed under these conditions.
Affiliation(s)
- Sylvain Durand
- Cell Biology and Metabolism Program, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD, USA
58
Güell M, van Noort V, Yus E, Chen WH, Leigh-Bell J, Michalodimitrakis K, Yamada T, Arumugam M, Doerks T, Kühner S, Rode M, Suyama M, Schmidt S, Gavin AC, Bork P, Serrano L. Transcriptome complexity in a genome-reduced bacterium. Science 2009; 326:1268-71. [PMID: 19965477 DOI: 10.1126/science.1176951] [Citation(s) in RCA: 349] [Impact Index Per Article: 23.3] [Indexed: 12/30/2022]
Abstract
To study basic principles of transcriptome organization in bacteria, we analyzed one of the smallest self-replicating organisms, Mycoplasma pneumoniae. We combined strand-specific tiling arrays, complemented by transcriptome sequencing, with more than 252 spotted arrays. We detected 117 previously undescribed, mostly noncoding transcripts, 89 of them in antisense configuration to known genes. We identified 341 operons, of which 139 are polycistronic; almost half of the latter show decaying expression in a staircase-like manner. Under various conditions, operons could be divided into 447 smaller transcriptional units, resulting in many alternative transcripts. Frequent antisense transcripts, alternative transcripts, and multiple regulators per gene imply a highly dynamic transcriptome, more similar to that of eukaryotes than previously thought.
Affiliation(s)
- Marc Güell
- Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
59
Oliver HF, Orsi RH, Ponnala L, Keich U, Wang W, Sun Q, Cartinhour SW, Filiatrault MJ, Wiedmann M, Boor KJ. Deep RNA sequencing of L. monocytogenes reveals overlapping and extensive stationary phase and sigma B-dependent transcriptomes, including multiple highly transcribed noncoding RNAs. BMC Genomics 2009; 10:641. [PMID: 20042087 PMCID: PMC2813243 DOI: 10.1186/1471-2164-10-641] [Citation(s) in RCA: 145] [Impact Index Per Article: 9.7] [Received: 06/01/2009] [Accepted: 12/30/2009] [Indexed: 11/30/2022]
Abstract
Background: Identification of specific genes and gene expression patterns important for bacterial survival, transmission and pathogenesis is critically needed to enable development of more effective pathogen control strategies. The stationary phase stress response transcriptome, including many σB-dependent genes, was defined for the human bacterial pathogen Listeria monocytogenes using RNA sequencing (RNA-Seq) with the Illumina Genome Analyzer. Specifically, bacterial transcriptomes were compared between stationary phase cells of L. monocytogenes 10403S and an otherwise isogenic ΔsigB mutant, which does not express the alternative σ factor σB, a major regulator of genes contributing to stress response, including stresses encountered upon entry into stationary phase. Results: Overall, 83% of all L. monocytogenes genes were transcribed in stationary phase cells; 42% of currently annotated L. monocytogenes genes showed medium to high transcript levels under these conditions. A total of 96 genes had significantly higher transcript levels in 10403S than in ΔsigB, indicating σB-dependent transcription of these genes. RNA-Seq analyses indicate that a total of 67 noncoding RNA molecules (ncRNAs) are transcribed in stationary phase L. monocytogenes, including 7 previously unrecognized putative ncRNAs. Application of a dynamically trained Hidden Markov Model, in combination with RNA-Seq data, identified 65 putative σB promoters upstream of 82 of the 96 σB-dependent genes and upstream of the one σB-dependent ncRNA. The RNA-Seq data also enabled annotation of putative operons as well as visualization of 5'- and 3'-UTR regions. Conclusions: The results from these studies provide powerful evidence that RNA-Seq data combined with appropriate bioinformatics tools allow quantitative characterization of prokaryotic transcriptomes, thus providing exciting new strategies for exploring transcriptional regulatory networks in bacteria. See minireview http://jbiol.com/content/8/12/107.
Affiliation(s)
- Haley F Oliver
- Department of Food Science, Cornell University, Ithaca, NY, USA.
60
Genome-wide identification of transcription start sites, promoters and transcription factor binding sites in E. coli. PLoS One 2009; 4:e7526. [PMID: 19838305 PMCID: PMC2760140 DOI: 10.1371/journal.pone.0007526] [Citation(s) in RCA: 210] [Impact Index Per Article: 14.0] [Received: 06/29/2009] [Accepted: 09/28/2009] [Indexed: 11/19/2022]
Abstract
Despite almost 40 years of molecular genetics research in Escherichia coli a major fraction of its Transcription Start Sites (TSSs) are still unknown, limiting therefore our understanding of the regulatory circuits that control gene expression in this model organism. RegulonDB (http://regulondb.ccg.unam.mx/) is aimed at integrating the genetic regulatory network of E. coli K12 as an entirely bioinformatic project up till now. In this work, we extended its aims by generating experimental data at a genome scale on TSSs, promoters and regulatory regions. We implemented a modified 5' RACE protocol and an unbiased High Throughput Pyrosequencing Strategy (HTPS) that allowed us to map more than 1700 TSSs with high precision. From this collection, about 230 corresponded to previously reported TSSs, which helped us to benchmark both our methodologies and the accuracy of the previous mapping experiments. The other ca 1500 TSSs mapped belong to about 1000 different genes, many of them with no assigned function. We identified promoter sequences and type of sigma factors that control the expression of about 80% of these genes. As expected, the housekeeping sigma(70) was the most common type of promoter, followed by sigma(38). The majority of the putative TSSs were located between 20 to 40 nucleotides from the translational start site. Putative regulatory binding sites for transcription factors were detected upstream of many TSSs. For a few transcripts, riboswitches and small RNAs were found. Several genes also had additional TSSs within the coding region. Unexpectedly, the HTPS experiments revealed extensive antisense transcription, probably for regulatory functions. 
The new information in RegulonDB, now with more than 2400 experimentally determined TSSs, strengthens the accuracy of promoter prediction, operon structure, and regulatory networks and provides valuable new information that will facilitate the understanding from a global perspective the complex and intricate regulatory network that operates in E. coli.
61
Sharma CM, Vogel J. Experimental approaches for the discovery and characterization of regulatory small RNA. Curr Opin Microbiol 2009; 12:536-46. [PMID: 19758836 DOI: 10.1016/j.mib.2009.07.006] [Citation(s) in RCA: 162] [Impact Index Per Article: 10.8] [Received: 06/29/2009] [Revised: 07/23/2009] [Accepted: 07/28/2009] [Indexed: 01/27/2023]
Abstract
Following the pioneering screens for small regulatory RNAs (sRNAs) in Escherichia coli in 2001, sRNAs are now being identified in almost every branch of the eubacterial kingdom. Experimental strategies have become increasingly important for sRNA discovery, thanks to increased availability of tiling arrays and fast progress in the development of high-throughput cDNA sequencing (RNA-Seq). The new technologies also facilitate genome-wide discovery of potential target mRNAs by sRNA pulse-expression coupled to transcriptomics, and immunoprecipitation with RNA-binding proteins such as Hfq. Moreover, the staggering rate of new sRNAs demands mechanistic analysis of target regulation. We will also review the available toolbox for wet lab-based research, including in vivo and in vitro reporter systems, genetic methods and biochemical co-purification of sRNA interaction partners.
Affiliation(s)
- Cynthia Mira Sharma
- RNA Biology Group, Max Planck Institute for Infection Biology, Charitéplatz 1, D-10117 Berlin, Germany
62
Rasmussen S, Nielsen HB, Jarmer H. The transcriptionally active regions in the genome of Bacillus subtilis. Mol Microbiol 2009; 73:1043-57. [PMID: 19682248 PMCID: PMC2784878 DOI: 10.1111/j.1365-2958.2009.06830.x] [Citation(s) in RCA: 140] [Impact Index Per Article: 9.3] [Accepted: 07/23/2009] [Indexed: 12/29/2022]
Abstract
The majority of all genes have so far been identified and annotated systematically through in silico gene finding. Here we report the finding of 3662 strand-specific transcriptionally active regions (TARs) in the genome of Bacillus subtilis by the use of tiling arrays. We have measured the genome-wide expression during mid-exponential growth on rich (LB) and minimal (M9) medium. The identified TARs account for 77.3% of the genes as they are currently annotated and additionally we find 84 putative non-coding RNAs (ncRNAs) and 127 antisense transcripts. One ncRNA, ncr22, is predicted to act as a translational control on cstA and an antisense transcript was observed opposite the housekeeping sigma factor sigA. Through this work we have discovered a long conserved 3' untranslated region (UTR) in a group of membrane-associated genes that is predicted to fold into a large and highly stable secondary structure. One of the genes having this tail is efeN, which encodes a target of the twin-arginine translocase (Tat) protein translocation system.
Affiliation(s)
- Simon Rasmussen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, 2800 Lyngby, Denmark
- Henrik Bjørn Nielsen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, 2800 Lyngby, Denmark
- Hanne Jarmer
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, 2800 Lyngby, Denmark
63
Thomassen GOS, Rowe AD, Lagesen K, Lindvall JM, Rognes T. Custom design and analysis of high-density oligonucleotide bacterial tiling microarrays. PLoS One 2009; 4:e5943. [PMID: 19536279 PMCID: PMC2691959 DOI: 10.1371/journal.pone.0005943] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 02/12/2009] [Accepted: 05/18/2009] [Indexed: 11/21/2022]
Abstract
Background: High-density tiling microarrays are a powerful tool for the characterization of complete genomes. The two major computational challenges associated with custom-made arrays are design and analysis. Firstly, several genome dependent variables, such as the genome's complexity and sequence composition, need to be considered in the design to ensure a high quality microarray. Secondly, since tiling projects today very often exceed the limits of conventional array-experiments, researchers cannot use established computer tools designed for commercial arrays, and instead have to redesign previous methods or create novel tools. Principal Findings: Here we describe the multiple aspects involved in the design of tiling arrays for transcriptome analysis and detail the normalisation and analysis procedures for such microarrays. We introduce a novel design method to make two 280,000 feature microarrays covering the entire genome of the bacterial species Escherichia coli and Neisseria meningitidis, respectively, as well as the use of multiple copies of control probe-sets on tiling microarrays. Furthermore, a novel normalisation and background estimation procedure for tiling arrays is presented along with a method for array analysis focused on detection of short transcripts. The design, normalisation and analysis methods have been applied in various experiments and several of the detected novel short transcripts have been biologically confirmed by Northern blot tests. Conclusions: Tiling-arrays are becoming increasingly applicable in genomic research, but researchers still lack both the tools for custom design of arrays, as well as the systems and procedures for analysis of the vast amount of data resulting from such experiments. We believe that the methods described herein will be a useful contribution and resource for researchers designing and analysing custom tiling arrays for both bacteria and higher organisms.
Affiliation(s)
- Gard O. S. Thomassen
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, University of Oslo, Oslo, Norway
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Alexander D. Rowe
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Karin Lagesen
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Torbjørn Rognes
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Department of Informatics, University of Oslo, Oslo, Norway
64
Meyer MM, Ames TD, Smith DP, Weinberg Z, Schwalbach MS, Giovannoni SJ, Breaker RR. Identification of candidate structured RNAs in the marine organism 'Candidatus Pelagibacter ubique'. BMC Genomics 2009; 10:268. [PMID: 19531245 PMCID: PMC2704228 DOI: 10.1186/1471-2164-10-268] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Received: 01/06/2009] [Accepted: 06/16/2009] [Indexed: 02/04/2023]
Abstract
Background: Metagenomic sequence data are proving to be a vast resource for the discovery of biological components. Yet analysis of this data to identify functional RNAs lags behind efforts to characterize protein diversity. The genome of 'Candidatus Pelagibacter ubique' HTCC 1062 is the closest match for approximately 20% of marine metagenomic sequence reads. It is also small, contains little non-coding DNA, and has strikingly low GC content. Results: To aid the discovery of RNA motifs within the marine metagenome we exploited the genomic properties of 'Cand. P. ubique' by targeting our search to long intergenic regions (IGRs) with relatively high GC content. Analysis of known RNAs (rRNA, tRNA, riboswitches etc.) shows that structured RNAs are significantly enriched in such IGRs. To identify additional candidate structured RNAs, we examined other IGRs with similar characteristics from 'Cand. P. ubique' using comparative genomics approaches in conjunction with marine metagenomic data. Employing this strategy, we discovered four candidate structured RNAs including a new riboswitch class as well as three additional likely cis-regulatory elements that precede genes encoding ribosomal proteins S2 and S12, and the cytoplasmic protein component of the signal recognition particle. We also describe four additional potential RNA motifs with few or no examples occurring outside the metagenomic data. Conclusion: This work begins the process of identifying functional RNA motifs present in the metagenomic data and illustrates how existing completed genomes may be used to aid in this task.
Affiliation(s)
- Michelle M Meyer
- Department of Molecular Cellular and Developmental Biology, Yale University, New Haven, CT 06520, USA.
65
Mok WWK, Navani NK, Barker C, Sawchyn BL, Gu J, Pathania R, Zhu RD, Brown ED, Li Y. Identification of a toxic peptide through bidirectional expression of small RNAs. Chembiochem 2009; 10:238-41. [PMID: 19090519 DOI: 10.1002/cbic.200800591] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
Affiliation(s)
- Wendy W K Mok
- Department of Biochemistry and Biomedical Sciences, McMaster University, 1200 Main Street, W. Hamilton, ON L8N 3Z5, Canada
66
Marchais A, Naville M, Bohn C, Bouloc P, Gautheret D. Single-pass classification of all noncoding sequences in a bacterial genome using phylogenetic profiles. Genome Res 2009; 19:1084-92. [PMID: 19237465 DOI: 10.1101/gr.089714.108] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Indexed: 01/03/2023]
Abstract
Identification and characterization of functional elements in the noncoding regions of genomes is an elusive and time-consuming activity whose output does not keep up with the pace of genome sequencing. Hundreds of bacterial genomes lay unexploited in terms of noncoding sequence analysis, although they may conceal a wide diversity of novel RNA genes, riboswitches, or other regulatory elements. We describe a strategy that exploits the entirety of available bacterial genomes to classify all noncoding elements of a selected reference species in a single pass. This method clusters noncoding elements based on their profile of presence among species. Most noncoding RNAs (ncRNAs) display specific signatures that enable their grouping in distinct clusters, away from sequence conservation noise and other elements such as promoters. We submitted 24 ncRNA candidates from Staphylococcus aureus to experimental validation and confirmed the presence of seven novel small RNAs or riboswitches. Besides offering a powerful method for de novo ncRNA identification, the analysis of phylogenetic profiles opens a new path toward the identification of functional relationships between co-evolving coding and noncoding elements.
Affiliation(s)
- Antonin Marchais
- Université Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, F-91405 Orsay Cedex, France
|
67
|
Hazen SP, Naef F, Quisel T, Gendron JM, Chen H, Ecker JR, Borevitz JO, Kay SA. Exploring the transcriptional landscape of plant circadian rhythms using genome tiling arrays. Genome Biol 2009; 10:R17. [PMID: 19210792 PMCID: PMC2688271 DOI: 10.1186/gb-2009-10-2-r17] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.1] [Received: 08/05/2008] [Revised: 12/09/2008] [Accepted: 02/11/2009] [Indexed: 11/20/2022] Open
Abstract
Whole genome tiling array analysis reveals the extent of transcriptional oscillation for both coding and non-coding genes in regulating Arabidopsis thaliana circadian rhythms. Background Organisms are able to anticipate changes in the daily environment with an internal oscillator known as the circadian clock. Transcription is an important mechanism in maintaining these oscillations. Here we explore, using whole genome tiling arrays, the extent of rhythmic expression patterns genome-wide, with an unbiased analysis of coding and noncoding regions of the Arabidopsis genome. Results As in previous studies, we detected a circadian rhythm for approximately 25% of the protein coding genes in the genome. With an unbiased interrogation of the genome, extensive rhythmic introns were detected predominantly in phase with adjacent rhythmic exons, creating a transcript that, if translated, would be expected to produce a truncated protein. In some cases, such as the MYB transcription factor AT2G20400, an intron was found to exhibit a circadian rhythm while the remainder of the transcript was otherwise arrhythmic. In addition to several known noncoding transcripts, including microRNA, trans-acting short interfering RNA, and small nucleolar RNA, greater than one thousand intergenic regions were detected as circadian clock regulated, many of which have no predicted function, either coding or noncoding. Nearly 7% of the protein coding genes produced rhythmic antisense transcripts, often for genes whose sense strand was not similarly rhythmic. Conclusions This study revealed widespread circadian clock regulation of the Arabidopsis genome extending well beyond the protein coding transcripts measured to date. This suggests a greater level of structural and temporal dynamics than previously known.
Affiliation(s)
- Samuel P Hazen
- Section of Cell and Developmental Biology, University of California San Diego, Gilman Drive, La Jolla, CA 92093-0130, USA
|
68
|
Sayed AK, Foster JW. A 750 bp sensory integration region directs global control of the Escherichia coli GadE acid resistance regulator. Mol Microbiol 2009; 71:1435-50. [PMID: 19220752 DOI: 10.1111/j.1365-2958.2009.06614.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Escherichia coli survives pH 2 environments through an acid resistance (AR) system regulated by the transcriptional activator GadE. Numerous proteins control gadE at an upstream, conserved, 798 bp intergenic region. We show this region produces three transcripts starting at -124 (T1), -324/-317 (T2) and -566 (T3) bp from the gadE start codon. Transcriptional lacZ fusions to gadE promoter regions revealed P1 and P3 were active while P2 alone was not. However, pairing P3 with P2 activated P2 and increased expression 20-fold above P3 alone. The fusions were transferred to Salmonella, which lacks this AR system, and plasmid-borne E. coli-specific regulators EvgA, YdeO, GadE and GadX were introduced. Data revealed that YdeO and GadX activate P3, P2 and P3P2, while GadE autoactivates P1 and represses P3 and P3P2. The developing model indicates that different signals activate YdeO, GadX, or an MnmE-dependent regulator, which stimulate gadE transcription from the P3 and P2 promoters. Once made, GadE activates P1 and represses P3 and P2. The P1 region also enables efficient downstream transcription and translation of the P3 or P2 transcripts. Evidence indicates the entire 750 bp sensory integration locus is necessary for a versatile response.
Affiliation(s)
- Atef K Sayed
- Department of Microbiology and Immunology, University of South Alabama College of Medicine, Mobile, AL 36688, USA
|
69
|
Dhar PK, Thwin CS, Tun K, Tsumoto Y, Maurer-Stroh S, Eisenhaber F, Surana U. Synthesizing non-natural parts from natural genomic template. J Biol Eng 2009; 3:2. [PMID: 19187561 PMCID: PMC2642765 DOI: 10.1186/1754-1611-3-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 11/10/2008] [Accepted: 02/03/2009] [Indexed: 11/19/2022] Open
Abstract
Background The current knowledge of genes and proteins comes from 'naturally designed' coding and non-coding regions. It would be interesting to move beyond natural boundaries and make user-defined parts. To explore this possibility we made six non-natural proteins in E. coli. We also studied their potential tertiary structure and phenotypic outcomes. Results The chosen intergenic sequences were amplified and expressed using the pBAD 202/D-TOPO vector. All six proteins showed significantly low similarity to the known proteins in the NCBI protein database. Protein expression was confirmed by Western blot. The endogenous expression of one of the proteins resulted in cell growth inhibition. The growth inhibition was completely rescued by culturing cells in inducer-free medium. Computational structure prediction suggests a globular tertiary structure for two of the six non-natural proteins synthesized. Conclusion To the best of our knowledge, this is the first study that demonstrates artificial synthesis of non-natural proteins from an existing genomic template, their potential tertiary structure and phenotypic outcome. The work presented in this paper opens up a new avenue of investigating fundamental biology. Our approach can also be used to synthesize large numbers of non-natural RNA and protein parts for useful applications.
Affiliation(s)
- Pawan K Dhar
- Synthetic Biology Lab, RIKEN Advanced Sciences Institute, Yokohama, 230-0045, Japan.
|
70
|
Levine E, Hwa T. Small RNAs establish gene expression thresholds. Curr Opin Microbiol 2008; 11:574-9. [PMID: 18935980 DOI: 10.1016/j.mib.2008.09.016] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Received: 08/28/2008] [Revised: 09/24/2008] [Accepted: 09/24/2008] [Indexed: 02/01/2023]
Abstract
The central role of small RNAs in regulating bacterial gene expression has been elucidated in recent years. Typically, small RNAs act via specific basepairing with target mRNAs, leading to modulation of translation initiation and mRNA stability. Quantitative studies suggest that small RNA regulation is characterized by unique features, which allow it to complement regulation at the transcriptional level. In particular, small RNAs are shown to establish a threshold for the expression of their target, providing a safety mechanism against random fluctuations and transient signals. The threshold level is set by the transcription rate of the small RNA and can thus be modulated dynamically to reflect changing environmental conditions.
Affiliation(s)
- Erel Levine
- Center for Theoretical Biological Physics and Department of Physics, University of California at San Diego, La Jolla, CA 92093, United States.
|
71
|
|
72
|
Pichon C, Felden B. Small RNA gene identification and mRNA target predictions in bacteria. Bioinformatics 2008; 24:2807-13. [PMID: 18974076 DOI: 10.1093/bioinformatics/btn560] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Bacterial small ribonucleic acids (sRNAs) that are not ribosomal, transfer or messenger RNAs were initially identified in the sixties, whereas their molecular functions are still under active investigation today. It is now widely accepted that most play central roles in gene expression regulation in response to environmental changes. Interestingly, some are also implicated in bacterial virulence. Functional studies revealed that a large subset of these sRNAs act by an antisense mechanism thanks to pairing interactions with dedicated mRNA targets, usually around their translation start sites, to modulate gene expression at the posttranscriptional level. Some sRNAs modulate protein activity or mimic the structure of other macromolecules. In the last few years, in silico methods have been developed to detect more bacterial sRNAs. Among these, computational analyses of the bacterial genomes by comparative genomics have predicted the existence of a plethora of sRNAs, some of which were confirmed to be expressed in vivo. The prediction accuracy of these computational tools is highly variable and can be improved. Here we review the computational studies that have contributed to detecting sRNA genes and mRNA targets in bacteria and the methods for their experimental testing. In addition, the remaining challenges are discussed.
Affiliation(s)
- Christophe Pichon
- Unité Pathogénie Bactérienne des Muqueuses, Institut Pasteur, 25-28 Rue du Docteur Roux, 75724 Paris, France
|
73
|
Rhodius VA, Wade JT. Technical considerations in using DNA microarrays to define regulons. Methods 2008; 47:63-72. [PMID: 18955146 DOI: 10.1016/j.ymeth.2008.10.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/09/2008] [Revised: 10/15/2008] [Accepted: 10/17/2008] [Indexed: 11/20/2022] Open
Abstract
Transcription is the major regulatory target of gene expression in bacteria, and is controlled by many regulatory proteins and RNAs. Microarrays are a powerful tool to study the regulation of transcription on a genomic scale. Here we describe the use of transcription profiling and ChIP-chip to study transcriptional regulation in bacteria. Transcription profiling determines the outcome of regulatory events whereas ChIP-chip identifies the protein-DNA interactions that determine these events. Together they can provide detailed information on transcriptional regulatory systems.
Affiliation(s)
- Virgil A Rhodius
- Department of Microbiology and Immunology, University of California at San Francisco, San Francisco, CA 94143, USA.
|
74
|
Livny J, Teonadi H, Livny M, Waldor MK. High-throughput, kingdom-wide prediction and annotation of bacterial non-coding RNAs. PLoS One 2008; 3:e3197. [PMID: 18787707 PMCID: PMC2527527 DOI: 10.1371/journal.pone.0003197] [Citation(s) in RCA: 160] [Impact Index Per Article: 10.0] [Received: 06/27/2008] [Accepted: 08/25/2008] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Diverse bacterial genomes encode numerous small non-coding RNAs (sRNAs) that regulate myriad biological processes. While bioinformatic algorithms have proven effective in identifying sRNA-encoding loci, the lack of tools and infrastructure with which to execute these computationally demanding algorithms has limited their utilization. Genome-wide predictions of sRNA-encoding genes have been conducted in less than 3% of all sequenced bacterial strains, leading to critical gaps in current annotations. The relative paucity of genome-wide sRNA prediction represents a critical gap in current annotations of bacterial genomes and has limited examination of larger issues in sRNA biology, such as sRNA evolution. METHODOLOGY/PRINCIPAL FINDINGS We have developed and deployed SIPHT, a high throughput computational tool that utilizes workflow management and distributed computing to effectively conduct kingdom-wide predictions and annotations of intergenic sRNA-encoding genes. Candidate sRNA-encoding loci are identified based on the presence of putative Rho-independent terminators downstream of conserved intergenic sequences, and each locus is annotated for several features, including conservation in other species, association with one of several transcription factor binding sites and homology to any of over 300 previously identified sRNAs and cis-regulatory RNA elements. Using SIPHT, we conducted searches for putative sRNA-encoding genes in all 932 bacterial replicons in the NCBI database. These searches yielded nearly 60% of previously confirmed sRNAs, hundreds of previously annotated cis-encoded regulatory RNA elements such as riboswitches, and over 45,000 novel candidate intergenic loci. CONCLUSIONS/SIGNIFICANCE Candidate loci were identified across all branches of the bacterial evolutionary tree, suggesting a central and ubiquitous role for RNA-mediated regulation among bacterial species. Annotation of candidate loci by SIPHT provides clues into the potential biological function of thousands of previously confirmed and candidate regulatory RNAs and affords new insights into the evolution of bacterial riboregulation.
Affiliation(s)
- Jonathan Livny
- Channing Laboratories, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America.
|
75
|
Brouwer RWW, Kuipers OP, van Hijum SAFT. The relative value of operon predictions. Brief Bioinform 2008; 9:367-75. [PMID: 18420711 DOI: 10.1093/bib/bbn019] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.9] [Indexed: 11/14/2022] Open
Abstract
For most organisms, computational operon predictions are the only source of genome-wide operon information. Operon prediction methods described in literature are based on (a combination of) the following five criteria: (i) intergenic distance, (ii) conserved gene clusters, (iii) functional relation, (iv) sequence elements and (v) experimental evidence. The performance estimates of operon predictions reported in literature cannot directly be compared due to differences in methods and data used in these studies. Here, we survey the current status of operon prediction methods. Based on a comparison of the performance of operon predictions on Escherichia coli and Bacillus subtilis we conclude that there is still room for improvement. We expect that existing and newly generated genomics and transcriptomics data will further improve accuracy of operon prediction methods.
|
76
|
Laing E, Sidhu K, Hubbard SJ. Predicted transcription factor binding sites as predictors of operons in Escherichia coli and Streptomyces coelicolor. BMC Genomics 2008; 9:79. [PMID: 18269733 PMCID: PMC2276206 DOI: 10.1186/1471-2164-9-79] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/09/2007] [Accepted: 02/12/2008] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND As a polycistronic transcriptional unit of one or more adjacent genes, operons play a key role in regulation and function in prokaryotic biology, and a better understanding of how they are constituted and controlled is needed. Recent efforts have attempted to predict operonic status in sequenced genomes using a variety of techniques and data sources. To date, non-homology based operon prediction strategies have mainly used predicted promoters and terminators present at the extremities of transcriptional unit as predictors, with reasonable success. However, transcription factor binding sites (TFBSs), typically found upstream of the first gene in an operon, have not yet been evaluated. RESULTS Here we apply a method originally developed for the prediction of TFBSs in Escherichia coli that minimises the need for prior knowledge and tests its ability to predict operons in E. coli and the 'more complex', pharmaceutically important, Streptomyces coelicolor. We demonstrate that through building genome specific TFBS position-specific-weight-matrices (PSWMs) it is possible to predict operons in E. coli and S. coelicolor with 83% and 93% accuracy respectively, using only TFBS as delimiters of operons. Additionally, the 'palindromicity' of TFBS footprint data of E. coli is characterised. CONCLUSION TFBS are proposed as novel independent features for use in prokaryotic operon prediction (whether alone or as part of a set of features) given their efficacy as operon predictors in E. coli and S. coelicolor. We also show that TFBS footprint data in E. coli generally contains inverted repeats with significantly (p < 0.05) greater palindromicity than random sequences. Consequently, the palindromicity of putative TFBSs predicted can also enhance operon predictions.
Affiliation(s)
- Emma Laing
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK
- School of Biomedical and Molecular Sciences, University of Surrey, Guildford, GU2 7XH, UK
- Khushwant Sidhu
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK
- Simon J Hubbard
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK
|
77
|
Levine E, Zhang Z, Kuhlman T, Hwa T. Quantitative characteristics of gene regulation by small RNA. PLoS Biol 2007; 5:e229. [PMID: 17713988 PMCID: PMC1994261 DOI: 10.1371/journal.pbio.0050229] [Citation(s) in RCA: 283] [Impact Index Per Article: 16.6] [Received: 08/18/2006] [Accepted: 06/26/2007] [Indexed: 11/18/2022] Open
Abstract
An increasing number of small RNAs (sRNAs) have been shown to regulate critical pathways in prokaryotes and eukaryotes. In bacteria, regulation by trans-encoded sRNAs is predominantly found in the coordination of intricate stress responses. The mechanisms by which sRNAs modulate expression of their targets are diverse. Common to most is the possibility that interference with the translation of mRNA targets may also alter the abundance of functional sRNAs. Aiming to understand the unique role played by sRNAs in gene regulation, we studied examples from two distinct classes of bacterial sRNAs in Escherichia coli using a quantitative approach combining experiment and theory. Our results demonstrate that sRNA provides a novel mode of gene regulation, with characteristics distinct from those of protein-mediated gene regulation. These include a threshold-linear response with a tunable threshold, a robust noise resistance characteristic, and a built-in capability for hierarchical cross-talk. Knowledge of these special features of sRNA-mediated regulation may be crucial toward understanding the subtle functions that sRNAs can play in coordinating various stress-relief pathways. Our results may also help guide the design of synthetic genetic circuits that have properties difficult to attain with protein regulators alone. The activation of stress response programs, while crucial for the survival of a bacterial cell under stressful conditions, is costly in terms of energy and substrates and risky to the normal functions of the cell. Stress response is therefore tightly regulated. A recently discovered layer of regulation involves small RNA molecules, which bind the mRNA transcripts of their targets, inhibit their translation, and promote their cleavage. To understand the role that small RNA plays in regulation, we have studied the quantitative aspects of small RNA regulation by integrating mathematical modeling and quantitative experiments in Escherichia coli. We have demonstrated that small RNAs can tightly repress their target genes when their synthesis rate is smaller than some threshold, but have little or no effect when the synthesis rate is much larger than that threshold. Importantly, the threshold level is set by the synthesis rate of the small RNA itself and can be dynamically tuned. The effect of biochemical properties, such as the binding affinity of the two RNA molecules, which can only be altered on evolutionary time scales, is limited to setting a hierarchical order among different targets of a small RNA, facilitating in principle a global coordination of stress response. In bacteria, small RNAs can regulate the expression of genes at the translational level. The many advantages of this type of control include a tuneable threshold response and resistance to biochemical noise.
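The threshold-linear response mentioned in this abstract can be illustrated with a minimal mass-action sketch. The model below is an assumption made for illustration, not taken from the abstract: target mRNA `m` and sRNA `s` are synthesized at rates `alpha_m` and `alpha_s`, degrade at rates `beta_m` and `beta_s`, and are co-degraded at rate `k * m * s`. All parameter names and values are hypothetical.

```python
import math


def steady_state_target(alpha_m, alpha_s, beta_m=1.0, beta_s=1.0, k=1000.0):
    """Steady-state target mRNA level for the assumed model
        dm/dt = alpha_m - beta_m * m - k * m * s
        ds/dt = alpha_s - beta_s * s - k * m * s
    Eliminating s gives a quadratic in m; the positive root is returned."""
    delta = alpha_m - alpha_s
    b = beta_m * beta_s - k * delta
    disc = b * b + 4.0 * k * beta_m * alpha_m * beta_s
    return (-b + math.sqrt(disc)) / (2.0 * k * beta_m)


# With strong pairing (large k), the target is nearly silent while its
# synthesis rate is below that of the sRNA, and rises roughly linearly
# above it, approaching (alpha_m - alpha_s) / beta_m.
below_threshold = steady_state_target(alpha_m=0.5, alpha_s=1.0)
above_threshold = steady_state_target(alpha_m=2.0, alpha_s=1.0)
```

In this sketch the threshold sits at `alpha_m = alpha_s`, so changing the sRNA synthesis rate moves the threshold, which matches the abstract's statement that the threshold is set by the synthesis rate of the small RNA.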
Affiliation(s)
- Erel Levine
- Center for Theoretical Biological Physics, University of California San Diego, La Jolla, California, United States of America
- Zhongge Zhang
- Division of Biological Sciences, University of California San Diego, La Jolla, California, United States of America
- Thomas Kuhlman
- Center for Theoretical Biological Physics, University of California San Diego, La Jolla, California, United States of America
- Terence Hwa
- Center for Theoretical Biological Physics, University of California San Diego, La Jolla, California, United States of America
- * To whom correspondence should be addressed. E-mail:
|
78
|
Coenye T, Drevinek P, Mahenthiralingam E, Shah SA, Gill RT, Vandamme P, Ussery DW. Identification of putative noncoding RNA genes in the Burkholderia cenocepacia J2315 genome. FEMS Microbiol Lett 2007; 276:83-92. [PMID: 17937666 DOI: 10.1111/j.1574-6968.2007.00916.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 01/13/2023] Open
Abstract
Noncoding RNA (ncRNA) genes are not involved in the production of mRNA and proteins, but produce transcripts that function directly as structural or regulatory RNAs. In the present study, the presence of ncRNA genes in the genome of Burkholderia cenocepacia J2315 was evaluated by combining comparative genomics (alignment-based) and predicted secondary structure approaches. Two hundred and thirteen putative ncRNA genes were identified in the B. cenocepacia J2315 genome and upregulated expression of four of these could be confirmed by microarray analysis. Most of the ncRNA gene transcripts have a marked predicted secondary structure that may facilitate interaction with other molecules. Several B. cenocepacia J2315 ncRNAs seem to be related to previously characterized ncRNAs involved in regulation of various cellular processes, while the function of many others remains unknown. The presence of a large number of ncRNA genes in this organism may help to explain its complexity, phenotypic variability and ability to survive in a remarkably wide range of environments.
Affiliation(s)
- Tom Coenye
- Laboratorium voor Microbiologie, Universiteit Gent, Gent, Belgium.
|
79
|
Horesh Y, Doniger T, Michaeli S, Unger R. RNAspa: a shortest path approach for comparative prediction of the secondary structure of ncRNA molecules. BMC Bioinformatics 2007; 8:366. [PMID: 17908318 PMCID: PMC2147038 DOI: 10.1186/1471-2105-8-366] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 06/07/2007] [Accepted: 10/01/2007] [Indexed: 12/27/2022] Open
Abstract
Background In recent years, RNA molecules that are not translated into proteins (ncRNAs) have drawn a great deal of attention, as they were shown to be involved in many cellular functions. One of the most important computational problems regarding ncRNA is to predict the secondary structure of a molecule from its sequence. In particular, we attempted to predict the secondary structure for a set of unaligned ncRNA molecules that are taken from the same family, and thus presumably have a similar structure. Results We developed the RNAspa program, which comparatively predicts the secondary structure for a set of ncRNA molecules in linear time in the number of molecules. We observed that in a list of several hundred suboptimal minimal free energy (MFE) predictions, as provided by the RNAsubopt program of the Vienna package, it is likely that at least one suggested structure would be similar to the true, correct one. The suboptimal solutions of each molecule are represented as a layer of vertices in a graph. The shortest path in this graph is the basis for structural predictions for the molecule. We also show that RNA secondary structures can be compared very rapidly by a simple string Edit-Distance algorithm with a minimal loss of accuracy. We show that this approach allows us to more deeply explore the suboptimal structure space. Conclusion The algorithm was tested on three datasets which include several ncRNA families taken from the Rfam database. These datasets allowed for comparison of the algorithm with other methods. In these tests, RNAspa performed better than four other programs.
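The layered shortest-path idea in this abstract can be sketched as follows. This is a simplified illustration rather than the RNAspa program itself: it assumes each molecule contributes one layer of candidate structures in dot-bracket notation, that layers are visited in a fixed order, and that edge weights between consecutive layers are plain string edit distances. The function names and example structures are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two dot-bracket strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def shortest_layered_path(layers):
    """Pick one candidate per layer so that the summed edit distance
    between consecutive picks is minimal (dynamic programming)."""
    cost = [0] * len(layers[0])
    back = []
    for prev_layer, layer in zip(layers, layers[1:]):
        new_cost, pointers = [], []
        for structure in layer:
            options = [cost[i] + edit_distance(p, structure)
                       for i, p in enumerate(prev_layer)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    end = min(range(len(cost)), key=cost.__getitem__)
    total = cost[end]
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return total, [layers[k][i] for k, i in enumerate(path)]


# Hypothetical suboptimal structures for three related molecules.
layers = [
    ["((..))", "(....)"],
    ["((..))", "......"],
    ["(....)", "((..))"],
]
total, chosen = shortest_layered_path(layers)
```

For a fixed number of candidates per layer, the work grows linearly with the number of layers, in line with the abstract's statement that the method is linear in the number of molecules.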
Affiliation(s)
- Yair Horesh
- Department of Computer Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Tirza Doniger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Shulamit Michaeli
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
- Ron Unger
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
|
80
|
Nie L, Wu G, Culley DE, Scholten JCM, Zhang W. Integrative analysis of transcriptomic and proteomic data: challenges, solutions and applications. Crit Rev Biotechnol 2007; 27:63-75. [PMID: 17578703 DOI: 10.1080/07388550701334212] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.0] [Indexed: 10/23/2022]
Abstract
Recent advances in high-throughput technologies enable quantitative monitoring of the abundance of various biological molecules and allow determination of their variation between biological states on a genomic scale. Two popular platforms are DNA microarrays that measure messenger RNA transcript levels, and gel-free proteomic analyses that quantify protein abundance. Obviously, no single approach can fully unravel the complexities of fundamental biology and it is equally clear that integrative analysis of multiple levels of gene expression would be valuable in this endeavor. However, most integrative transcriptomic and proteomic studies have thus far either failed to find a correlation or only observed a weak correlation. In addition to various biological factors, it is suggested that the poor correlation could be quite possibly due to the inadequacy of available statistical tools to compensate for biases in the data collection methodologies. To address this issue, attempts have recently been made to systematically investigate the correlation patterns between transcriptomic and proteomic datasets, and to develop sophisticated statistical tools to improve the chances of capturing a relationship. The goal of these efforts is to enhance understanding of the relationship between transcriptomes and proteomes so that integrative analyses may be utilized to reveal new biological insights that are not accessible through one-dimensional datasets. In this review, we outline some of the challenges associated with integrative analyses and present some preliminary statistical solutions. In addition, some new applications of integrated transcriptomic and proteomic analysis to the investigation of post-transcriptional regulation are also discussed.
Affiliation(s)
- Lei Nie
- Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University. Washington, DC, USA
|
81
|
Nakamura T, Naito K, Yokota N, Sugita C, Sugita M. A cyanobacterial non-coding RNA, Yfr1, is required for growth under multiple stress conditions. Plant Cell Physiol 2007; 48:1309-18. [PMID: 17664182 DOI: 10.1093/pcp/pcm098] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Indexed: 05/11/2023]
Abstract
Small, regulatory, non-coding RNA (ncRNA) is involved in various cell functions in both prokaryotes and eukaryotes. However, information on ncRNA in cyanobacteria is still scarce. We studied ncRNA genes by computational screening to compare the intergenic regions of the Synechococcus elongatus PCC 6301 genome with the genomes of three freshwater cyanobacteria. We identified an ncRNA gene in S. elongatus, which has been previously described as yfr1 in marine cyanobacteria. The S. elongatus yfr1 gene is 65 nucleotides long and is positioned between guaB and trxA. We found a high conservation of the yfr1 gene in most cyanobacterial lineages. A yfr1-deficient mutant showed reduced growth under various stress conditions, e.g. oxidative stress and high salt stress conditions, and showed unusual accumulation of sbtA mRNA. A gel shift assay demonstrated interaction of the Yfr1 RNA with sbtA mRNA in vitro. This suggests that the sbtA transcript is a target RNA for the Yfr1 RNA.
|
82
|
Gottesman S, McCullen C, Guillier M, Vanderpool C, Majdalani N, Benhammou J, Thompson K, FitzGerald P, Sowa N, FitzGerald D. Small RNA regulators and the bacterial response to stress. Cold Spring Harb Symp Quant Biol 2007; 71:1-11. [PMID: 17381274 PMCID: PMC3592358 DOI: 10.1101/sqb.2006.71.016] [Citation(s) in RCA: 178] [Impact Index Per Article: 10.5] [Indexed: 11/24/2022]
Abstract
Recent studies have uncovered dozens of regulatory small RNAs in bacteria. A large number of these small RNAs act by pairing to their target mRNAs. The outcome of pairing can be either stimulation or inhibition of translation. Pairing in vivo frequently depends on the RNA-binding protein Hfq. Synthesis of these small RNAs is tightly regulated at the level of transcription; many of the well-studied stress response regulons have now been found to include a regulatory RNA. Expression of the small RNA can help the cell cope with environmental stress by redirecting cellular metabolism, exemplified by RyhB, a small RNA expressed upon iron starvation. Although small RNAs found in Escherichia coli can usually be identified by sequence comparison to closely related enterobacteria, other approaches are necessary to find the equivalent RNAs in other bacterial species. Nonetheless, it is becoming increasingly clear that many if not all bacteria encode significant numbers of these important regulators. Tracing their evolution through bacterial genomes remains a challenge.
Affiliation(s)
- Susan Gottesman
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Corresponding author: Bldg. 37, Rm. 5132, National Cancer Institute, Bethesda, MD. 20892; phone: 301-496-3524; fax: 301-496-3875;
- Colleen McCullen
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Maude Guillier
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Carin Vanderpool
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Nadim Majdalani
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Jihane Benhammou
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Karl Thompson
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Peter FitzGerald
- Genome Analysis Unit, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- Nathaniel Sowa
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
- David FitzGerald
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, Bethesda, MD. 20892
|
83
|
Altuvia S. Identification of bacterial small non-coding RNAs: experimental approaches. Curr Opin Microbiol 2007; 10:257-61. [PMID: 17553733 DOI: 10.1016/j.mib.2007.05.003] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Received: 01/08/2007] [Revised: 04/09/2007] [Accepted: 05/15/2007] [Indexed: 10/23/2022]
Abstract
Almost 140 bacterial small RNAs (sRNAs; sometimes referred to as non-coding RNAs) have been discovered in the past six years. The majority of these sRNAs were discovered in Escherichia coli, and a smaller subset was characterized in other bacteria, many of which were pathogenic. Many of these genes were identified as a result of systematic screens using computational prediction of sRNAs and experimental-based approaches, including microarray and shotgun cloning. A smaller number of sRNAs were discovered by direct labeling or by functional genetic screens. Many of the discovered genes, ranging in size from 50 to 500 nucleotides, are conserved and located in intergenic regions, in-between open reading frames. The expression of many of these genes is growth phase dependent or stress related. As each search employed specific parameters, this led to the identification of genes with distinct characteristics. Consequently, unique sRNAs such as those that are species-specific, sRNA genes that are transcribed under unique conditions or genes located on the antisense strand of protein-encoding genes, were probably missed.
Affiliation(s)
- Shoshy Altuvia
- Department of Molecular Genetics and Biotechnology, The Hebrew University-Hadassah Medical School, Jerusalem 91120, Israel.
|
84
|
Sridhar J, Rafi ZA. Small RNA identification in Enterobacteriaceae using synteny and genomic backbone retention. OMICS 2007; 11:74-99. [PMID: 17411397 DOI: 10.1089/omi.2006.0006] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Genomic screens for small RNA candidates in Enterobacteriaceae genomes were carried out with existing small RNA sequences, conserved flanking genes, and genomic backbone information. The small RNA sequences and contexts from E. coli K12 formed the basis of the search. Sequence identity identified 117 additional small RNA homologs in related genomes. Motifs of continuous sequence stretches added another 48 sRNA regions, termed partial homologs. However, this study is unique in identifying 160 nonhomologous sRNA loci in related genomes based on the conserved flanking gene synteny and the backbone retention information obtained from KEGG-SSDB. Gene synteny and genomic backbone continuity were observed to be correlated with all of the sRNAs in related genomes. This search is the first of its kind toward identification of functionally important regions using gene order and backbone information. A disruption in flanking gene order or genomic backbone indicates a possible hotspot for alien gene pool integration. This study reports both occurrence of multiple copies of an sRNA and co-occurrence of different sRNAs between a pair of conserved flanking genes. In general, synteny and genomic backbone retention information can be added as additional search criteria toward the design of precise bioinformatics tools for sRNA, gene identification, and gene functional annotations in related genomes.
Affiliation(s)
- Jayavel Sridhar
- Centre of Excellence in Bioinformatics, School of Biotechnology, Madurai Kamaraj University, Madurai, Tamilnadu, India
|
85
|
Luban S, Kihara D. Comparative genomics of small RNAs in bacterial genomes. OMICS 2007; 11:58-73. [PMID: 17411396 DOI: 10.1089/omi.2006.0005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/09/2023]
Abstract
In recent years, various families of small non-coding RNAs (sRNAs) have been discovered by experimental and computational approaches, both in bacterial and eukaryotic genomes. Although most of them await elucidation of their function, it has been reported that some play important roles in gene regulation. Here we carried out comparative genomics analysis of possible sRNAs that are computationally identified in 30 bacterial genomes from gamma- and alpha-proteobacteria and Deinococcus radiodurans. Identified sRNAs are clustered by a complete-linkage clustering method to see conservation among the organisms. On average, sRNAs are found in approximately 30% of intergenic regions of each genome sequence. Of these, 25.7% are conserved among three or more organisms. Approximately 60% of the conserved sRNAs are not located in orthologous intergenic regions, implying that sRNAs may have shuffled their positions in genomes. The current study implies that sRNAs may be involved in a more extensive range of functions in bacteria.
Affiliation(s)
- Stan Luban
- Department of Computer Science, Markey Center for Structural Biology, West Lafayette, Indiana 47907, USA
|
86
|
Livny J, Waldor MK. Identification of small RNAs in diverse bacterial species. Curr Opin Microbiol 2007; 10:96-101. [PMID: 17383222 DOI: 10.1016/j.mib.2007.03.005] [Citation(s) in RCA: 138] [Impact Index Per Article: 8.1] [Received: 11/03/2006] [Accepted: 03/09/2007] [Indexed: 11/27/2022]
Abstract
Small, non-coding bacterial RNAs (sRNAs) have been shown to regulate a plethora of biological processes. Up until recently, most sRNAs had been identified and characterized in E. coli. However, in the past few years, dozens of sRNAs have been discovered in a wide variety of bacterial species. Whereas numerous sRNAs have been isolated or detected through experimental approaches, most have been identified in predictive bioinformatic searches. Recently developed computational tools have greatly facilitated the efficient prediction of sRNAs in diverse species. Although the number of known sRNAs has dramatically increased in recent years, many challenges in the identification and characterization of sRNAs lie ahead.
Affiliation(s)
- Jonathan Livny
- Department of Molecular Biology and Microbiology, Tufts University School of Medicine and Howard Hughes Medical Institute, 136 Harrison Avenue, Boston, MA 02111, USA
|
87
|
Tjaden B. Prediction of small, noncoding RNAs in bacteria using heterogeneous data. J Math Biol 2007; 56:183-200. [PMID: 17354017 DOI: 10.1007/s00285-007-0079-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 07/12/2006] [Indexed: 10/23/2022]
Abstract
sRNAFinder is a new gene prediction system for systematic identification of noncoding genes in bacteria. Most noncoding RNAs in prokaryotes belong to a class of genes denoted as small RNAs (sRNAs). In the model organism Escherichia coli, over 70 sRNA genes have been identified, and the existence of many more has been hypothesized. While various sources of information have proven useful for prediction of novel sRNA genes, most computational approaches do not take advantage of the disparate sources of data available for identifying these noncoding RNA genes. We present a general probabilistic method for predicting sRNA genes in bacteria. The method, based on a general Markov model, is implemented in the computational tool sRNAFinder. sRNAFinder incorporates heterogeneous data sources for gene prediction, including primary sequence data, transcript expression data from microarray experiments, and conserved RNA structure information as determined from comparative genomics analysis. We demonstrate that sRNAFinder improves upon current tools for identifying small, noncoding genes in bacteria.
Affiliation(s)
- Brian Tjaden
- Computer Science Department, Wellesley College, Wellesley, MA 02481, USA.
|
88
|
Seshasayee ASN. An assessment of the role of DNA adenine methyltransferase on gene expression regulation in E coli. PLoS One 2007; 2:e273. [PMID: 17342207 PMCID: PMC1804101 DOI: 10.1371/journal.pone.0000273] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 11/21/2006] [Accepted: 02/14/2007] [Indexed: 11/19/2022] Open
Abstract
N6-Adenine methylation is an important epigenetic signal, which regulates various processes, such as DNA replication and repair and transcription. In γ-proteobacteria, Dam is a stand-alone enzyme that methylates GATC sites, which are non-randomly distributed in the genome. Some of these overlap with transcription factor binding sites. This work describes a global computational analysis of a published Dam knockout microarray alongside other publicly available data to gain insight into the extent to which Dam regulates transcription by interfering with protein binding. The results indicate that DNA methylation by Dam may not globally affect gene transcription by physically blocking access of transcription factors to binding sites. Down-regulation of Dam during stationary phase correlates with the activity of TFs whose binding sites are enriched for GATC sites.
Affiliation(s)
- Aswin Sai Narain Seshasayee
- Genomics and Regulatory Systems Group, EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.
|
89
|
Reppas NB, Wade JT, Church GM, Struhl K. The transition between transcriptional initiation and elongation in E. coli is highly variable and often rate limiting. Mol Cell 2007; 24:747-757. [PMID: 17157257 DOI: 10.1016/j.molcel.2006.10.030] [Citation(s) in RCA: 169] [Impact Index Per Article: 9.9] [Received: 06/14/2006] [Revised: 10/10/2006] [Accepted: 10/24/2006] [Indexed: 10/23/2022]
Abstract
We perform a genome-wide analysis of the transition between transcriptional initiation and elongation in Escherichia coli by determining the association of core RNA polymerase (RNAP) and the promoter-recognition factor sigma70 with respect to RNA transcripts. We identify 1286 sigma70-associated promoters, including many internal to known operons, and demonstrate that sigma70 is usually released very rapidly from elongating RNAP complexes. On average, RNAP density is higher at the promoter than in the coding sequence, although the ratio is highly variable among different transcribed regions. Strikingly, a significant fraction of RNAP-bound promoters is not associated with transcriptional activity, perhaps due to an intrinsic energetic barrier to promoter escape. Thus, the transition from transcriptional initiation to elongation is highly variable, often rate limiting, and in some cases is essentially blocked such that RNAP is effectively "poised" to transcribe only under the appropriate environmental conditions. The genomic pattern of RNAP density in E. coli differs from that in yeast and mammalian cells.
Affiliation(s)
- Nikos B Reppas
- Harvard University Biophysics Program, Harvard Medical School, Boston, Massachusetts 02115; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
- Joseph T Wade
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115
- George M Church
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115
- Kevin Struhl
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115.
|
90
|
Rodrigues F, Sarkar-Tyson M, Harding SV, Sim SH, Chua HH, Lin CH, Han X, Karuturi RKM, Sung K, Yu K, Chen W, Atkins TP, Titball RW, Tan P. Global map of growth-regulated gene expression in Burkholderia pseudomallei, the causative agent of melioidosis. J Bacteriol 2006; 188:8178-88. [PMID: 16997946 PMCID: PMC1698202 DOI: 10.1128/jb.01006-06] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022] Open
Abstract
Many microbial pathogens express specific virulence traits at distinct growth phases. To understand the molecular pathways linking bacterial growth to pathogenicity, we have characterized the growth transcriptome of Burkholderia pseudomallei, the causative agent of melioidosis. Using a fine-scale sampling approach, we found approximately 17% of all B. pseudomallei genes displaying regulated expression during growth in rich medium, occurring as broad waves of functionally coherent gene expression tightly associated with distinct growth phases and transition points. We observed regulation of virulence genes across all growth phases and identified serC as a potentially new virulence factor by virtue of its coexpression with other early-phase virulence genes. serC-disrupted B. pseudomallei strains were serine auxotrophs and in mouse infection assays exhibited a dramatic attenuation of virulence compared to wild-type B. pseudomallei. Immunization of mice with serC-disrupted B. pseudomallei also conferred protection against subsequent challenges with different wild-type B. pseudomallei strains. At a genomic level, early-phase genes were preferentially localized on chromosome 1, while stationary-phase genes were significantly biased towards chromosome 2. We detected a significant level of chromosomally clustered gene expression, allowing us to predict approximately 100 potential operons in the B. pseudomallei genome. We computationally and experimentally validated these operons by showing that genes in these regions are preferentially transcribed in the same 5'-->3' direction, possess significantly shorter intergenic lengths than the overall genome, and are expressed as a common mRNA transcript. The availability of this transcriptome map provides an important resource for understanding the transcriptional architecture of B. pseudomallei.
Affiliation(s)
- Fiona Rodrigues
- Genome Institute of Singapore, 60 Biopolis Street, no. 02-01, Genome, Singapore 138672, Republic of Singapore
|
91
|
Ozoline ON, Deev AA. Predicting antisense RNAs in the genomes of Escherichia coli and Salmonella typhimurium using promoter-search algorithm PlatProm. J Bioinform Comput Biol 2006; 4:443-54. [PMID: 16819794 DOI: 10.1142/s0219720006001916] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 09/29/2005] [Revised: 12/29/2005] [Accepted: 01/13/2006] [Indexed: 11/18/2022]
Abstract
The pattern recognition software PlatProm, which takes into consideration both sequence-specific and structure-specific features in the genetic environment of promoter sites and identifies transcription start points with very high accuracy, was used to reveal potentially transcribed regions in the genomes of two bacterial species. Along with the expected promoters located upstream of coding sequences, PlatProm identified several hundred very similar signals in other intergenic regions and within coding sequences. Homologous genes of Escherichia coli and Salmonella typhimurium containing potential promoters on the template strand are suggested as putative targets for regulation by antisense RNA products (aRNAs).
Affiliation(s)
- Olga N Ozoline
- Institute of Cell Biophysics, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia.
|
92
|
Wang C, Ding C, Meraz RF, Holbrook SR. PSoL: a positive sample only learning algorithm for finding non-coding RNA genes. Bioinformatics 2006; 22:2590-6. [PMID: 16945945 DOI: 10.1093/bioinformatics/btl441] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Small non-coding RNA (ncRNA) genes play important regulatory roles in a variety of cellular processes. However, detection of ncRNA genes is a great challenge to both experimental and computational approaches. In this study, we describe a new approach called positive sample only learning (PSoL) to predict ncRNA genes in the Escherichia coli genome. Although PSoL is a machine learning method for classification, it requires no negative training data, which, in general, is hard to define properly and affects the performance of machine learning dramatically. In addition, using the support vector machine (SVM) as the core learning algorithm, PSoL can integrate many different kinds of information to improve the accuracy of prediction. Besides the application of PSoL for predicting ncRNAs, PSoL is applicable to many other bioinformatics problems as well.
RESULTS: The PSoL method is assessed by 5-fold cross-validation experiments which show that PSoL can achieve about 80% accuracy in recovery of known ncRNAs. We compared PSoL predictions with five previously published results. The PSoL method has the highest percentage of predictions overlapping with those from other methods.
Affiliation(s)
- Chunlin Wang
- Physical Biosciences Division, Lawrence Berkeley National Laboratory Berkeley, CA 94720, USA
|
93
|
Christiansen JK, Nielsen JS, Ebersbach T, Valentin-Hansen P, Søgaard-Andersen L, Kallipolitis BH. Identification of small Hfq-binding RNAs in Listeria monocytogenes. RNA 2006; 12:1383-96. [PMID: 16682563 PMCID: PMC1484441 DOI: 10.1261/rna.49706] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.1] [Indexed: 05/09/2023]
Abstract
The RNA-binding protein Hfq plays important roles in bacterial physiology and is required for the activity of many small regulatory RNAs in prokaryotes. We have previously shown that Hfq contributes to stress tolerance and virulence in the Gram-positive human pathogen Listeria monocytogenes. In the present study, we performed coimmunoprecipitations followed by enzymatic RNA sequencing to identify Hfq-binding RNA molecules in L. monocytogenes. The approach resulted in the discovery of three small RNAs (sRNAs). The sRNAs are conserved between Listeria species, but were not identified in other bacterial species. The initial characterization revealed a number of unique features displayed by each individual sRNA. The first sRNA is encoded from within an annotated gene in the L. monocytogenes EGD-e genome. Analogous to most regulatory sRNAs in Escherichia coli, the stability of this sRNA is highly dependent on the presence of Hfq. The second sRNA appears to be produced by a transcription attenuation mechanism, and the third sRNA is present in five copies at two different locations within the L. monocytogenes EGD-e genome. The cellular levels of the sRNAs are growth phase dependent and vary in response to growth medium. All three sRNAs are expressed when L. monocytogenes multiplies within mammalian cells. This study represents the first attempt to identify sRNAs in L. monocytogenes.
Affiliation(s)
- Janne K Christiansen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
|
94
|
Abstract
During in vitro broth culture, bacterial gene expression is typically dominated by highly expressed factors involved in protein biosynthesis, maturation, and folding, but it is unclear if this also applies to conditions in natural environments. Here, we used a promoter trap strategy with an unstable green fluorescent protein reporter that can be detected in infected mouse tissues to identify 21 Salmonella enterica promoters with high levels of activity in a mouse enteritis model. We then measured the activities of these and 31 previously identified Salmonella promoters in both the enteritis and a murine typhoid fever model. Surprisingly, the data reveal that instead of protein biosynthesis genes, disease-specific genes such as Salmonella pathogenicity island 1 (SPI-1)-associated genes and genes involved in anaerobic respiration (enteritis) or SPI-2-associated genes and genes of the PhoP regulon (typhoid fever), respectively, dominate Salmonella in vivo gene expression. The overall functional profile of highly expressed genes suggests a marked shift in major transcriptional activities to nutrient utilization during enteritis or to fighting against the host during typhoid fever. The large proportion of known and novel essential virulence factors among the identified genes suggests that high expression levels during infection may correlate with functional relevance.
Affiliation(s)
- Claudia Rollenhagen
- Max Planck Institute for Infection Biology, Department of Molecular Biology, Berlin, Germany
|
95
|
Yachie N, Numata K, Saito R, Kanai A, Tomita M. Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model. Gene 2006; 372:171-81. [PMID: 16564143 DOI: 10.1016/j.gene.2005.12.034] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 10/19/2005] [Revised: 12/02/2005] [Accepted: 12/28/2005] [Indexed: 11/27/2022]
Abstract
A new mathematical index was developed to identify and characterize non-coding RNA (ncRNA) genes encoded within the Escherichia coli (E. coli) genome. It was designated the GMMI (Gapped Markov Model Index) and used to evaluate sequence patterns located at the separate positions of consensus sequences, codon biases and/or possible RNA structures on the basis of the Markov model. The GMMI was able to separate a set of known mRNA sequences from a mixture of ncRNAs including tRNAs and rRNAs. Consequently, the GMMI was employed to predict novel ncRNA candidates. First, possible transcription units were extracted from the E. coli genome using consensus sequences for the sigma70 promoter and the rho-independent terminator. Then, these units were evaluated using the GMMI. This identified 133 candidate ncRNAs, which contain 29 previously annotated small RNA genes and 46 possible antisense ncRNAs. Furthermore, 12 transcripts (including five antisense RNAs) were confirmed by expression analysis. These data suggest that the expression of small antisense RNAs might be more common than previously thought in the E. coli genome.
Affiliation(s)
- Nozomu Yachie
- Institute for Advanced Biosciences, Keio University, Tsuruoka, 997-0035, Japan
|
96
|
Hüttenhofer A, Vogel J. Experimental approaches to identify non-coding RNAs. Nucleic Acids Res 2006; 34:635-46. [PMID: 16436800 PMCID: PMC1351373 DOI: 10.1093/nar/gkj469] [Citation(s) in RCA: 141] [Impact Index Per Article: 7.8] [Received: 12/21/2005] [Revised: 01/10/2006] [Accepted: 01/10/2006] [Indexed: 12/12/2022] Open
Abstract
Cellular RNAs that do not function as messenger RNAs (mRNAs), transfer RNAs (tRNAs) or ribosomal RNAs (rRNAs) comprise a diverse class of molecules that are commonly referred to as non-protein-coding RNAs (ncRNAs). These molecules have been known for quite a while, but their importance was not fully appreciated until recent genome-wide searches discovered thousands of these molecules and their genes in a variety of model organisms. Some of these screens were based on biocomputational prediction of ncRNA candidates within entire genomes of model organisms. Alternatively, direct biochemical isolation of expressed ncRNAs from cells, tissues or entire organisms has been shown to be a powerful approach to identify ncRNAs both at the level of individual molecules and at a global scale. In this review, we will survey several such wet-lab strategies, i.e. direct sequencing of ncRNAs, shotgun cloning of small-sized ncRNAs (cDNA libraries), microarray analysis and genomic SELEX to identify novel ncRNAs, and discuss the advantages and limits of these approaches.
Affiliation(s)
- Alexander Hüttenhofer
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Str. 3, 6020 Innsbruck, Austria.
|
97
|
Abstract
Small non-coding RNAs (sRNAs) have attracted considerable attention as an emerging class of gene expression regulators. In bacteria, a few regulatory RNA molecules have long been known, but the extent of their role in the cell was not fully appreciated until the recent discovery of hundreds of potential sRNA genes in the bacterium Escherichia coli. Orthologs of these E. coli sRNA genes, as well as unrelated sRNAs, were also found in other bacteria. Here we review the disparate experimental approaches used over the years to identify sRNA molecules and their genes in prokaryotes. These include genome-wide searches based on the biocomputational prediction of non-coding RNA genes, global detection of non-coding transcripts using microarrays, and shotgun cloning of small RNAs (RNomics). Other sRNAs were found by either co-purification with RNA-binding proteins, such as Hfq or CsrA/RsmA, or classical cloning of abundant small RNAs after size fractionation in polyacrylamide gels. In addition, bacterial genetics offers powerful tools that aid in the search for sRNAs that may play a critical role in the regulatory circuit of interest, for example, the response to stress or the adaptation to a change in nutrient availability. Many of the techniques discussed here have also been successfully applied to the discovery of eukaryotic and archaeal sRNAs.
MESH Headings
- Cloning, Molecular
- Escherichia coli/genetics
- Escherichia coli/metabolism
- Escherichia coli Proteins/chemistry
- Escherichia coli Proteins/genetics
- Escherichia coli Proteins/metabolism
- Eukaryotic Cells/metabolism
- Gene Expression Regulation, Bacterial
- Genome, Bacterial
- Host Factor 1 Protein/chemistry
- Host Factor 1 Protein/genetics
- Host Factor 1 Protein/metabolism
- Oligonucleotide Array Sequence Analysis
- RNA Processing, Post-Transcriptional
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- RNA, Untranslated/chemistry
- RNA, Untranslated/genetics
- RNA, Untranslated/metabolism
- RNA-Binding Proteins/chemistry
- RNA-Binding Proteins/genetics
- RNA-Binding Proteins/metabolism
Affiliation(s)
- Jörg Vogel
- Max Planck Institute for Infection Biology, RNA Biology, Schumannstr. 21/22, D-10117 Berlin, Germany.
|
98
|
Park SJ, Lee SY, Cho J, Kim TY, Lee JW, Park JH, Han MJ. Global physiological understanding and metabolic engineering of microorganisms based on omics studies. Appl Microbiol Biotechnol 2005; 68:567-79. [PMID: 16041571 DOI: 10.1007/s00253-005-0081-z] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 05/29/2005] [Revised: 06/23/2005] [Accepted: 06/24/2005] [Indexed: 10/25/2022]
Abstract
Through metabolic engineering, scientists seek to modify the metabolic pathways of living organisms to facilitate optimized, efficient production of target biomolecules. During the past decade, we have seen notable improvements in biotechnology, many of which have been based on metabolically engineered microorganisms. Recent developments in the fields of functional genomics, transcriptomics, proteomics, and metabolomics have changed metabolic engineering strategies from the local pathway level to the whole system level. This article focuses on recent advances in the field of metabolic engineering, which have been powered by the combined approaches of the various "omics" that allow us to understand the microbial metabolism at a global scale and to develop more effectively redesigned metabolic pathways for the enhanced production of target bioproducts.
Affiliation(s)
- S J Park
- Corporate R&D, LG Chem, Ltd./Research Park, Yuseong-gu, Daejeon, Republic of Korea.
|
99
|
Royce TE, Rozowsky JS, Bertone P, Samanta M, Stolc V, Weissman S, Snyder M, Gerstein M. Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping. Trends Genet 2005; 21:466-75. [PMID: 15979196 PMCID: PMC1855044 DOI: 10.1016/j.tig.2005.06.007] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Received: 04/01/2005] [Revised: 05/17/2005] [Accepted: 06/08/2005] [Indexed: 10/25/2022]
Abstract
Traditional microarrays use probes complementary to known genes to quantitate the differential gene expression between two or more conditions. Genomic tiling microarray experiments differ in that probes that span a genomic region at regular intervals are used to detect the presence or absence of transcription. This difference means the same sets of biases and the methods for addressing them are unlikely to be relevant to both types of experiment. We introduce the informatics challenges arising in the analysis of tiling microarray experiments as open problems to the scientific community and present initial approaches for the analysis of this nascent technology.
Affiliation(s)
- Thomas E Royce
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
|
100
|
Shearwin KE, Callen BP, Egan JB. Transcriptional interference--a crash course. Trends Genet 2005; 21:339-45. [PMID: 15922833 PMCID: PMC2941638 DOI: 10.1016/j.tig.2005.04.009] [Citation(s) in RCA: 416] [Impact Index Per Article: 21.9] [Received: 11/11/2004] [Revised: 03/09/2005] [Accepted: 04/12/2005] [Indexed: 12/13/2022]
Abstract
The term "transcriptional interference" (TI) is widely used but poorly defined in the literature. There are a variety of methods by which one can interfere with the process or the product of transcription but the term TI usually refers to the direct negative impact of one transcriptional activity on a second transcriptional activity in cis. Two recent studies, one examining Saccharomyces cerevisiae and the other Escherichia coli, clearly show TI at one promoter caused by the arrival of a transcribing complex initiating at a distant promoter. TI is potentially widespread throughout biology; therefore, it is timely to assess exactly its nature, significance and operative mechanisms. In this article, we will address the following questions: what is TI, how important and widespread is it, how does it work and where should we focus our future research efforts?
Affiliation(s)
- Keith E Shearwin
- School of Molecular and Biomedical Science, University of Adelaide, Adelaide, Australia 5005.
|