1
Anderssen S, Naômé A, Jadot C, Brans A, Tocquin P, Rigali S. AURTHO: Autoregulation of transcription factors as facilitator of cis-acting element discovery. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms 2022; 1865:194847. [PMID: 35901946 DOI: 10.1016/j.bbagrm.2022.194847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 07/04/2022] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Abstract
Transcriptional regulation is key in bacteria for providing an adequate response in time and space to changing environmental conditions. However, despite decades of research, the binding sites and therefore the target genes and the function of most transcription factors (TFs) remain unknown. Filling this gap in knowledge through conventional methods represents a colossal task which we demonstrate here can be significantly facilitated by a widespread feature in transcriptional control: the autoregulation of TFs, implying that the yet unknown transcription factor binding site (TFBS) is neighboring the TF itself. In this work, we describe the "AURTHO" methodology (AUtoregulation of oRTHOlogous transcription factors), consisting of analyzing upstream regions of orthologous TFs in order to uncover their associated TFBSs. AURTHO enabled the de novo identification of novel TFBSs with an unprecedented improvement in terms of quantity and reliability. DNA-protein interaction studies on a selection of candidate cis-acting elements yielded a >90% success rate, demonstrating the efficacy of AURTHO at highlighting true TF-TFBS couples and confirming the identification in the near future of a plethora of TFBSs across all bacterial species.
Affiliation(s)
- Sinaeda Anderssen
- InBioS - Center for Protein Engineering, University of Liège, B-4000 Liège, Belgium
- Aymeric Naômé
- InBioS - Center for Protein Engineering, University of Liège, B-4000 Liège, Belgium; HEDERA 22, Boulevard du Rectorat 27b, B-4000 Liège, Belgium
- Cédric Jadot
- InBioS - Center for Protein Engineering, University of Liège, B-4000 Liège, Belgium
- Alain Brans
- InBioS - Center for Protein Engineering, University of Liège, B-4000 Liège, Belgium
- Pierre Tocquin
- HEDERA 22, Boulevard du Rectorat 27b, B-4000 Liège, Belgium; InBioS - PhytoSystems, University of Liège, B-4000 Liège, Belgium
- Sébastien Rigali
- InBioS - Center for Protein Engineering, University of Liège, B-4000 Liège, Belgium; HEDERA 22, Boulevard du Rectorat 27b, B-4000 Liège, Belgium.
2
Dai Z, Iqbal M, Lawrence ND, Rattray M. Efficient inference for sparse latent variable models of transcriptional regulation. Bioinformatics 2018; 33:3776-3783. [PMID: 28961802 PMCID: PMC5860323 DOI: 10.1093/bioinformatics/btx508] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/27/2017] [Accepted: 08/25/2017] [Indexed: 12/23/2022]
Abstract
Motivation Regulation of gene expression in prokaryotes involves complex co-regulatory mechanisms involving large numbers of transcriptional regulatory proteins and their target genes. Uncovering these genome-scale interactions constitutes a major bottleneck in systems biology. Sparse latent factor models, assuming activity of transcription factors (TFs) as unobserved, provide a biologically interpretable modelling framework, integrating gene expression and genome-wide binding data, but at the same time pose a hard computational inference problem. Existing probabilistic inference methods for such models rely on subjective filtering and suffer from scalability issues, thus are not well-suited for realistic genome-scale applications. Results We present a fast Bayesian sparse factor model, which takes input gene expression and binding sites data, either from ChIP-seq experiments or motif predictions, and outputs active TF-gene links as well as latent TF activities. Our method employs an efficient variational Bayes scheme for model inference enabling its application to large datasets which was not feasible with existing MCMC-based inference methods for such models. We validate our method on synthetic data against a similar model in the literature, employing MCMC for inference, and obtain comparable results with a small fraction of the computational time. We also apply our method to large-scale data from Mycobacterium tuberculosis involving ChIP-seq data on 113 TFs and matched gene expression data for 3863 putative target genes. We evaluate our predictions using an independent transcriptomics experiment involving over-expression of TFs. Availability and implementation An easy-to-use Jupyter notebook demo of our method with data is available at https://github.com/zhenwendai/SITAR. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhenwen Dai
- Department of Computer Science, University of Sheffield, Sheffield, UK; Amazon Research, Cambridge, UK
- Mudassar Iqbal
- Division of Informatics, Imaging & Data Sciences, Faculty of Biology, Medicine, and Health Sciences, University of Manchester, Manchester, UK
- Neil D Lawrence
- Department of Computer Science, University of Sheffield, Sheffield, UK; Amazon Research, Cambridge, UK
- Magnus Rattray
- Division of Informatics, Imaging & Data Sciences, Faculty of Biology, Medicine, and Health Sciences, University of Manchester, Manchester, UK
3
Romero-Rodríguez A, Robledo-Casados I, Sánchez S. An overview on transcriptional regulators in Streptomyces. Biochimica et Biophysica Acta-Gene Regulatory Mechanisms 2015; 1849:1017-39. [PMID: 26093238 DOI: 10.1016/j.bbagrm.2015.06.007] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Received: 02/19/2015] [Revised: 06/09/2015] [Accepted: 06/12/2015] [Indexed: 12/19/2022]
Abstract
Streptomyces are Gram-positive microorganisms able to adapt and respond to different environmental conditions. It is the largest genus of Actinobacteria comprising over 900 species. During their lifetime, these microorganisms are able to differentiate, produce aerial mycelia and secondary metabolites. All of these processes are controlled by subtle and precise regulatory systems. Regulation at the transcriptional initiation level is probably the most common for metabolic adaptation in bacteria. In this mechanism, the major players are proteins named transcription factors (TFs), capable of binding DNA in order to repress or activate the transcription of specific genes. Some of the TFs exert their action just like activators or repressors, whereas others can function in both manners, depending on the target promoter. Generally, TFs achieve their effects by using one- or two-component systems, linking a specific type of environmental stimulus to a transcriptional response. After DNA sequencing, many streptomycetes have been found to have chromosomes ranging between 6 and 12Mb in size, with high GC content (around 70%). They encode for approximately 7000 to 10,000 genes, 50 to 100 pseudogenes and a large set (around 12% of the total chromosome) of regulatory genes, organized in networks, controlling gene expression in these bacteria. Among the sequenced streptomycetes reported up to now, the number of transcription factors ranges from 471 to 1101. Among these, 315 to 691 correspond to transcriptional regulators and 31 to 76 are sigma factors. The aim of this work is to give a state of the art overview on transcription factors in the genus Streptomyces.
Affiliation(s)
- Alba Romero-Rodríguez
- Departamento de Biología Molecular y Biotecnología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, México, D.F. 04510, Mexico
- Ivonne Robledo-Casados
- Departamento de Biología Molecular y Biotecnología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, México, D.F. 04510, Mexico
- Sergio Sánchez
- Departamento de Biología Molecular y Biotecnología, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, México, D.F. 04510, Mexico.
4
Iqbal M, Mast Y, Amin R, Hodgson DA, Wohlleben W, Burroughs NJ. Extracting regulator activity profiles by integration of de novo motifs and expression data: characterizing key regulators of nutrient depletion responses in Streptomyces coelicolor. Nucleic Acids Res 2012; 40:5227-39. [PMID: 22406834 PMCID: PMC3384326 DOI: 10.1093/nar/gks205] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 12/02/2022]
Abstract
Determining transcriptional regulator activities is a major focus of systems biology, providing key insight into regulatory mechanisms and co-regulators. For organisms such as Escherichia coli, transcriptional regulator binding site data can be integrated with expression data to infer transcriptional regulator activities. However, for most organisms there is only sparse data on their transcriptional regulators, while their associated binding motifs are largely unknown. Here, we address the challenge of inferring activities of unknown regulators by generating de novo (binding) motifs and integrating with expression data. We identify a number of key regulators active in the metabolic switch, including PhoP with its associated directed repeat PHO box, candidate motifs for two SARPs, a CRP family regulator, an iron response regulator and that for LexA. Experimental validation for some of our predictions was obtained using gel-shift assays. Our analysis is applicable to any organism for which there is a reasonable amount of complementary expression data and for which motifs (either over represented or evolutionary conserved) can be identified in the genome.
Affiliation(s)
- Mudassar Iqbal
- Multidisciplinary Centre for Integrative Biology (MyCIB), School of Biosciences, University of Nottingham, Nottingham, UK.
5
Eng C, Asthana C, Aigle B, Hergalant S, Mari JF, Leblond P. A New Data Mining Approach for the Detection of Bacterial Promoters Combining Stochastic and Combinatorial Methods. J Comput Biol 2009; 16:1211-25. [DOI: 10.1089/cmb.2008.0122] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
Affiliation(s)
- Catherine Eng
- LORIA, UMR CNRS 7503 et INRIA Grand Est, Campus Scientifique, Vandœuvre-lès-Nancy, France
- Laboratoire de Génétique et Microbiologie, UMR UHP-INRA 1128, IFR 110, Nancy Université, Faculté des Sciences et Techniques, Vandœuvre-lès-Nancy, France
- Charu Asthana
- LORIA, UMR CNRS 7503 et INRIA Grand Est, Campus Scientifique, Vandœuvre-lès-Nancy, France
- Bertrand Aigle
- Laboratoire de Génétique et Microbiologie, UMR UHP-INRA 1128, IFR 110, Nancy Université, Faculté des Sciences et Techniques, Vandœuvre-lès-Nancy, France
- Sébastien Hergalant
- LORIA, UMR CNRS 7503 et INRIA Grand Est, Campus Scientifique, Vandœuvre-lès-Nancy, France
- Jean-François Mari
- LORIA, UMR CNRS 7503 et INRIA Grand Est, Campus Scientifique, Vandœuvre-lès-Nancy, France
- Pierre Leblond
- Laboratoire de Génétique et Microbiologie, UMR UHP-INRA 1128, IFR 110, Nancy Université, Faculté des Sciences et Techniques, Vandœuvre-lès-Nancy, France
6
Abstract
While hundreds of microbial genomes are sequenced, the challenge remains to define their cis-regulatory maps. Here, we present a comparative genomic analysis of the cis-regulatory map of Shewanella oneidensis, an important model organism for bioremediation because of its extraordinary abilities to use a wide variety of metals and organic molecules as electron acceptors in respiration. First, from the experimentally verified transcriptional regulatory networks of Escherichia coli, we inferred 24 DNA motifs that are conserved in S. oneidensis. We then applied a new comparative approach on five Shewanella genomes that allowed us to systematically identify 194 nonredundant palindromic DNA motifs and corresponding regulons in S. oneidensis. Sixty-four percent of the predicted motifs are conserved in at least three of the seven newly sequenced and distantly related Shewanella genomes. In total, we obtained 209 unique DNA motifs in S. oneidensis that cover 849 unique transcription units. Besides conservation in other genomes, 77 of these motifs are supported by at least one additional type of evidence, including matching to known transcription factor binding motifs and significant functional enrichment or expression coherence of the corresponding target genes. Using the same approach on a more focused gene set, 990 differentially expressed genes derived from published microarray data of S. oneidensis during exposure to metal ions, we identified 31 putative cis-regulatory motifs (16 with at least one type of additional supporting evidence) that are potentially involved in the process of metal reduction. The majority (18/31) of those motifs had been found in our whole-genome comparative approach, further demonstrating that such an approach is capable of uncovering a large fraction of the regulatory map of a genome even in the absence of experimental data. 
The integrated computational approach developed in this study provides a useful strategy to identify genome-wide cis-regulatory maps and a novel avenue to explore the regulatory pathways for particular biological processes in bacterial systems.
Affiliation(s)
- Jiajian Liu
- Department of Genetics, Washington University School of Medicine, 660 S Euclid, Box 8232, St Louis, MO 63110, USA
7
Laing E, Sidhu K, Hubbard SJ. Predicted transcription factor binding sites as predictors of operons in Escherichia coli and Streptomyces coelicolor. BMC Genomics 2008; 9:79. [PMID: 18269733 PMCID: PMC2276206 DOI: 10.1186/1471-2164-9-79] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/09/2007] [Accepted: 02/12/2008] [Indexed: 11/18/2022]
Abstract
Background As a polycistronic transcriptional unit of one or more adjacent genes, operons play a key role in regulation and function in prokaryotic biology, and a better understanding of how they are constituted and controlled is needed. Recent efforts have attempted to predict operonic status in sequenced genomes using a variety of techniques and data sources. To date, non-homology based operon prediction strategies have mainly used predicted promoters and terminators present at the extremities of transcriptional unit as predictors, with reasonable success. However, transcription factor binding sites (TFBSs), typically found upstream of the first gene in an operon, have not yet been evaluated. Results Here we apply a method originally developed for the prediction of TFBSs in Escherichia coli that minimises the need for prior knowledge and tests its ability to predict operons in E. coli and the 'more complex', pharmaceutically important, Streptomyces coelicolor. We demonstrate that through building genome specific TFBS position-specific-weight-matrices (PSWMs) it is possible to predict operons in E. coli and S. coelicolor with 83% and 93% accuracy respectively, using only TFBS as delimiters of operons. Additionally, the 'palindromicity' of TFBS footprint data of E. coli is characterised. Conclusion TFBS are proposed as novel independent features for use in prokaryotic operon prediction (whether alone or as part of a set of features) given their efficacy as operon predictors in E. coli and S. coelicolor. We also show that TFBS footprint data in E. coli generally contains inverted repeats with significantly (p < 0.05) greater palindromicity than random sequences. Consequently, the palindromicity of putative TFBSs predicted can also enhance operon predictions.
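The palindromicity comparison mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' scoring, so the definition below is an assumption: the fraction of positions at which a site matches its own reverse complement, compared against shuffled copies of the same sequence. The function names `palindromicity` and `shuffle_p_value` are hypothetical.

```python
import random

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindromicity(seq: str) -> float:
    """Fraction of positions where seq equals its reverse complement.

    This scoring is an illustrative assumption, not the published metric.
    A perfect inverted repeat of even length scores 1.0.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    rc = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq, rc)) / len(seq)


def shuffle_p_value(seq: str, n_shuffles: int = 1000, seed: int = 0) -> float:
    """Empirical p-value: how often a shuffled copy is at least as palindromic."""
    rng = random.Random(seed)
    observed = palindromicity(seq)
    letters = list(seq.upper())
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if palindromicity("".join(letters)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)
```

Shuffling preserves base composition, which matters for a high-GC genome such as S. coelicolor, where a uniform random background would likely misestimate how often inverted repeats arise by chance.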
Affiliation(s)
- Emma Laing
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK.
8
Hesketh A, Bucca G, Laing E, Flett F, Hotchkiss G, Smith CP, Chater KF. New pleiotropic effects of eliminating a rare tRNA from Streptomyces coelicolor, revealed by combined proteomic and transcriptomic analysis of liquid cultures. BMC Genomics 2007; 8:261. [PMID: 17678549 PMCID: PMC2000904 DOI: 10.1186/1471-2164-8-261] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 05/16/2007] [Accepted: 08/02/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND In Streptomyces coelicolor, bldA encodes the only tRNA for a rare leucine codon, UUA. This tRNA is unnecessary for growth, but is required for some aspects of secondary metabolism and morphological development. We describe a transcriptomic and proteomic analysis of the effects of deleting bldA on cellular processes during submerged culture: conditions relevant to the industrial production of antibiotics. RESULTS At the end of rapid growth, a co-ordinated transient up-regulation of about 100 genes, including many for ribosomal proteins, was seen in the parent strain but not the ΔbldA mutant. Increased basal levels of the signal molecule ppGpp in the mutant strain may be responsible for this difference. Transcripts or proteins from a further 147 genes classified as bldA-influenced were mostly expressed late in culture in the wild-type, though others were significantly transcribed during exponential growth. Some were involved in the biosynthesis of seven secondary metabolites; and some have probable roles in reorganising metabolism after rapid growth. Many of the 147 genes were "function unknown", and may represent unknown aspects of Streptomyces biology. Only two of the 147 genes contain a TTA codon, but some effects of bldA could be traced to TTA codons in regulatory genes or polycistronic operons. Several proteins were affected post-translationally by the bldA deletion. There was a statistically significant but weak positive global correlation between transcript and corresponding protein levels. Different technical limitations of the two approaches were a major cause of discrepancies in the results obtained with them.
CONCLUSION Although deletion of bldA has very conspicuous effects on the gross phenotype, the bldA molecular phenotype revealed by the "dualomic" approach has shown that only about 2% of the genome is affected; but this includes many previously unknown effects at a variety of different levels, including post-translational changes in proteins and global cellular physiology.
Affiliation(s)
- Andy Hesketh
- Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Colney, Norwich, NR4 7UH, UK
- Giselda Bucca
- School of Biomedical and Molecular Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK
- Emma Laing
- School of Biomedical and Molecular Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK
- Fiona Flett
- Manchester Interdisciplinary Biocentre, The University of Manchester, 131 Princess Street, Manchester, M1 7ND, UK
- Graham Hotchkiss
- School of Biomedical and Molecular Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK
- Colin P Smith
- School of Biomedical and Molecular Sciences, University of Surrey, Guildford, Surrey, GU2 7XH, UK
- Keith F Chater
- Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Colney, Norwich, NR4 7UH, UK
9
Affiliation(s)
- Dmitry A Rodionov
- Burnham Institute for Medical Research, La Jolla, California 92037, USA.
10
Rokem JS, Lantz AE, Nielsen J. Systems biology of antibiotic production by microorganisms. Nat Prod Rep 2007; 24:1262-87. [DOI: 10.1039/b617765b] [Citation(s) in RCA: 123] [Impact Index Per Article: 7.2] [Indexed: 12/14/2022]
11
Colson S, Stephan J, Hertrich T, Saito A, van Wezel GP, Titgemeyer F, Rigali S. Conserved cis-Acting Elements Upstream of Genes Composing the Chitinolytic System of Streptomycetes Are DasR-Responsive Elements. J Mol Microbiol Biotechnol 2006; 12:60-6. [PMID: 17183212 DOI: 10.1159/000096460] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/19/2022]
Abstract
For soil-dwelling bacteria that usually live in a carbon-rich and nitrogen-poor environment, the ability to utilize chitin - the second most abundant polysaccharide on earth - is a decisive evolving advantage as it is a source for both elements. Streptomycetes are high-GC Gram-positive soil bacteria that are equipped with a broad arsenal of chitinase-degrading genes. These genes are induced when the streptomycetes sense the presence of chitooligosaccharides. Their expression is repressed as soon as more readily assimilated carbon sources become available. This includes for example glucose or N-acetylglucosamine, the monomer subunit of chitin. Historically, the first cis-acting elements involved in carbon regulation in streptomycetes were found more than a decade ago upstream of chitinase genes, but the transcriptional regulator had so far remained undiscovered. In this work, we show that these cis-acting elements consist of inverted repeats with multiple occurrences and are bound by the HutC/GntR type regulator DasR. We have therefore designated these sites as DasR-responsive elements (dre). DasR, which is also the repressor of the genes for the N-acetylglucosamine-specific phosphotransferase transport system, should therefore play a critical role in sensing the balance between the monomeric and polymeric forms of N-acetylglucosamine.
Affiliation(s)
- Séverine Colson
- Centre d'Ingénierie des Protéines, Université de Liège, Institut de Chimie B6a, Liège, Belgium
12
Laing E, Mersinias V, Smith CP, Hubbard SJ. Analysis of gene expression in operons of Streptomyces coelicolor. Genome Biol 2006; 7:R46. [PMID: 16749941 PMCID: PMC1779546 DOI: 10.1186/gb-2006-7-6-r46] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Received: 12/22/2005] [Revised: 03/03/2006] [Accepted: 05/09/2006] [Indexed: 11/12/2022]
Abstract
Analysis of the relative transcript levels of intra-operonic genes in Streptomyces coelicolor suggests significant levels of internal regulation. Background Recent studies have shown that microarray-derived gene-expression data are useful for operon prediction. However, it is apparent that genes within an operon do not conform to the simple notion that they have equal levels of expression. Results To investigate the relative transcript levels of intra-operonic genes, we have used a Z-score approach to normalize the expression levels of all genes within an operon to expression of the first gene of that operon. Here we demonstrate that there is a general downward trend in expression from the first to the last gene in Streptomyces coelicolor operons, in contrast to what we observe in Escherichia coli. Combining transcription-factor binding-site prediction with the identification of operonic genes that exhibited higher transcript levels than the first gene of the same operon enabled the discovery of putative internal promoters. The presence of transcription terminators and abundance of putative transcriptional control sequences in S. coelicolor operons are also described. Conclusion Here we have demonstrated a polarity of expression in operons of S. coelicolor not seen in E. coli, bringing caution to those that apply operon prediction strategies based on E. coli 'equal-expression' to divergent species. We speculate that this general difference in transcription behavior could reflect the contrasting lifestyles of the two organisms and, in the case of Streptomyces, might also be influenced by its high G+C content genome. Identification of putative internal promoters, previously thought to cause problems in operon prediction strategies, has also been enabled.
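The idea of expressing each operon gene relative to the first gene of its operon can be sketched as follows. The abstract does not spell out the exact Z-score formula, so this version is an assumption: expression levels are standardized genome-wide within one condition, and each operon gene is then reported as its Z-score minus that of the first gene. The function names and the `margin` parameter are hypothetical.

```python
from statistics import mean, pstdev


def zscore_by_gene(levels):
    """Standardize genome-wide expression levels (gene -> value) for one condition."""
    mu = mean(levels.values())
    sigma = pstdev(levels.values())
    if sigma == 0:
        raise ValueError("expression levels have zero variance")
    return {gene: (value - mu) / sigma for gene, value in levels.items()}


def relative_to_first_gene(operon, z):
    """Return (gene, z - z_first) pairs for an operon listed in transcription order."""
    first = z[operon[0]]
    return [(gene, z[gene] - first) for gene in operon]


def higher_than_first(operon, z, margin=0.0):
    """Downstream genes expressed above the first gene by more than `margin`.

    Such genes are only candidates for an internal promoter; the margin is an
    illustrative threshold, not a value taken from the paper.
    """
    relative = relative_to_first_gene(operon, z)
    return [gene for gene, delta in relative[1:] if delta > margin]
```

Under this sketch, a general downward trend along an operon would show up as increasingly negative offsets, while a positive offset at a downstream gene flags a position worth checking for a predicted binding site.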
Affiliation(s)
- Emma Laing
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
- Current Address: School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Vassilis Mersinias
- Functional Genomics Laboratory, School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Colin P Smith
- Functional Genomics Laboratory, School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Simon J Hubbard
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
13
Jacques PÉ, Rodrigue S, Gaudreau L, Goulet J, Brzezinski R. Detection of prokaryotic promoters from the genomic distribution of hexanucleotide pairs. BMC Bioinformatics 2006; 7:423. [PMID: 17014715 PMCID: PMC1615881 DOI: 10.1186/1471-2105-7-423] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 05/05/2006] [Accepted: 10/02/2006] [Indexed: 12/03/2022]
Abstract
Background In bacteria, sigma factors and other transcriptional regulatory proteins recognize DNA patterns upstream of their target genes and interact with RNA polymerase to control transcription. As a consequence of evolution, DNA sequences recognized by transcription factors are thought to be enriched in intergenic regions (IRs) and depleted from coding regions of prokaryotic genomes. Results In this work, we report that the genomic distribution of transcription factor binding sites is biased towards IRs, and that this bias is conserved amongst bacterial species. We further take advantage of this observation to develop an algorithm that can efficiently identify promoter boxes by a distribution-dependent approach rather than a direct sequence comparison approach. This strategy, which can easily be combined with other methodologies, allowed the identification of promoter sequences in ten species and can be used with any annotated bacterial genome, with results that rival current methodologies. Experimental validations of predicted promoters also support our approach. Conclusion Considering that complete genomic sequences of over 1000 bacteria will soon be available and that little transcriptional information is available for most of them, our algorithm constitutes a promising tool for the prediction of promoter sequences. Importantly, our methodology could also be adapted to identify DNA sequences recognized by other regulatory proteins.
Affiliation(s)
- Pierre-Étienne Jacques
- Département de biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Département d'informatique, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Sébastien Rodrigue
- Département de biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Luc Gaudreau
- Département de biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Jean Goulet
- Département d'informatique, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Ryszard Brzezinski
- Département de biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Centre d'étude et de valorisation de la diversité microbienne, Université de Sherbrooke, Sherbrooke, Québec, Canada
14
Bose M, Slick D, Sarto MJ, Roberts D, Roberts J, Barber RD. Identification of SmtB/ArsR cis elements and proteins in archaea using the Prokaryotic InterGenic Exploration Database (PIGED). Archaea-An International Microbiological Journal 2006; 2:39-49. [PMID: 16877320 PMCID: PMC2685587 DOI: 10.1155/2006/837139] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Microbial genome sequencing projects have revealed an apparently wide distribution of SmtB/ArsR metal-responsive transcriptional regulators among prokaryotes. Using a position-dependent weight matrix approach, prokaryotic genome sequences were screened for SmtB/ArsR DNA binding sites using data derived from intergenic sequences upstream of orthologous genes encoding these regulators. Sixty SmtB/ArsR operators linked to metal detoxification genes, including nine among various archaeal species, are predicted among 230 annotated and draft prokaryotic genome sequences. Independent multiple sequence alignments of putative operator sites and corresponding winged helix-turn-helix motifs define sequence signatures for the DNA binding activity of this SmtB/ArsR subfamily. Prediction of an archaeal SmtB/ArsR based upon these signature sequences is confirmed using purified Methanosarcina acetivorans C2A protein and electrophoretic mobility shift assays. Tools used in this study have been incorporated into a web application, the Prokaryotic InterGenic Exploration Database (PIGED; http://bioinformatics.uwp.edu/~PIGED/home.htm), facilitating comparable studies. Use of this tool and establishment of orthology based on DNA binding signatures holds promise for deciphering potential cellular roles of various archaeal winged helix-turn-helix transcriptional regulators.
Affiliation(s)
- Michael Bose
- Biological Sciences Department, University of Wisconsin-Parkside, Kenosha, WI 53141, USA
- David Slick
- Biological Sciences Department, University of Wisconsin-Parkside, Kenosha, WI 53141, USA
- Mickey J. Sarto
- Biological Sciences Department, University of Wisconsin-Parkside, Kenosha, WI 53141, USA
- David Roberts
- Department of Chemistry, DePauw University, Greencastle, IN 46135, USA
- Robert D. Barber
- Biological Sciences Department, University of Wisconsin-Parkside, Kenosha, WI 53141, USA
- Corresponding author
15
Rigali S, Schlicht M, Hoskisson P, Nothaft H, Merzbacher M, Joris B, Titgemeyer F. Extending the classification of bacterial transcription factors beyond the helix-turn-helix motif as an alternative approach to discover new cis/trans relationships. Nucleic Acids Res 2004; 32:3418-26. [PMID: 15247334 PMCID: PMC443547 DOI: 10.1093/nar/gkh673] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022]
Abstract
Transcription factors (TFs) of bacterial helix-turn-helix superfamilies exhibit different effector-binding domains (EBDs) fused to a DNA-binding domain with a common feature. In a previous study of the GntR superfamily, we demonstrated that classifying members into subfamilies according to the EBD heterogeneity highlighted unsuspected and accurate TF-binding site signatures. In this work, we present how such in silico analysis can provide prediction tools to discover new cis/trans relationships. The TF-binding site consensus of the HutC/GntR subfamily was used to (i) predict target sites within the Streptomyces coelicolor genome, (ii) discover a new HutC/GntR regulon and (iii) discover its specific TF. By scanning the S. coelicolor genome we identified a presumed new HutC regulon that comprises genes of the phosphotransferase system (PTS) specific for the uptake of N-acetylglucosamine (PTS(Nag)). A weight matrix was derived from the compilation of the predicted cis-acting elements upstream of each gene of the presumed regulon. Under the assumption that TFs are often subject to autoregulation, we used this matrix to scan the upstream region of the 24 HutC-like members of S. coelicolor. orf SCO5231 (dasR) was selected as the best candidate according to the high score of a 16 bp sequence identified in its upstream region. Our prediction that DasR regulates the PTS(Nag) regulon was confirmed by in vivo and in vitro experiments. In conclusion, our in silico approach made it possible to identify the specific TF of a regulon out of the 673 orfs annotated as 'regulatory proteins' within the genome of S. coelicolor.
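The matrix-scanning step described in this abstract can be sketched in a few lines. This is a generic log-odds weight matrix, not the authors' implementation: the pseudocount, the uniform default background, and the function names `build_pwm`, `best_hit` and `rank_candidates` are assumptions, and only the forward strand is scanned to keep the example short.

```python
import math

BASES = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds weight matrix from equal-length aligned binding sites."""
    background = background or {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [site[i].upper() for site in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pwm


def best_hit(pwm, region):
    """Return (score, start, window) of the best-scoring window, or None."""
    region = region.upper()
    width = len(pwm)
    best = None
    for start in range(len(region) - width + 1):
        window = region[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(col[ch] for col, ch in zip(pwm, window))
        if best is None or score > best[0]:
            best = (score, start, window)
    return best


def rank_candidates(pwm, upstream_regions):
    """Rank candidate TF genes by the best matrix hit in their upstream region."""
    hits = {name: best_hit(pwm, seq) for name, seq in upstream_regions.items()}
    scored = [(name, hit) for name, hit in hits.items() if hit is not None]
    return sorted(scored, key=lambda item: item[1][0], reverse=True)
```

For a high-GC genome, passing a `background` that reflects the actual base composition would probably give more meaningful scores than the uniform default; scanning the reverse complement as well would be a natural extension for non-palindromic sites.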
Affiliation(s)
- Sébastien Rigali
- Centre d'Ingénierie des Protéines, Université de Liège, Institut de Chimie B6a, B-4000, Liège, Belgium.