1
|
Characterization and evolutionary analysis of Brassica species-diverged sequences containing simple repeat units. Genes Genomics 2013. [DOI: 10.1007/s13258-013-0076-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]
|
2
|
Claeys M, Storms V, Sun H, Michoel T, Marchal K. MotifSuite: workflow for probabilistic motif detection and assessment. Bioinformatics 2012; 28:1931-2. [DOI: 10.1093/bioinformatics/bts293] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/15/2022] Open
|
3
|
Zambelli F, Pesole G, Pavesi G. Motif discovery and transcription factor binding sites before and after the next-generation sequencing era. Brief Bioinform 2012; 14:225-37. [PMID: 22517426 PMCID: PMC3603212 DOI: 10.1093/bib/bbs016] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Indexed: 02/07/2023] Open
Abstract
Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic and protein sequences have been available. In particular, its application to the de novo prediction of putative over-represented transcription factor binding sites in nucleotide sequences has been, and still is, one of the most challenging flavors of the problem. Recently, novel experimental techniques like chromatin immunoprecipitation (ChIP) have been introduced, permitting the genome-wide identification of protein-DNA interactions. ChIP, applied to transcription factors and coupled with genome tiling arrays (ChIP on Chip) or next-generation sequencing technologies (ChIP-Seq) has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.
|
4
|
Aerts S. Computational strategies for the genome-wide identification of cis-regulatory elements and transcriptional targets. Curr Top Dev Biol 2012; 98:121-45. [PMID: 22305161 DOI: 10.1016/b978-0-12-386499-4.00005-7] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Abstract
Transcription factors (TFs) are key proteins that decode the information in our genome to express a precise and unique set of proteins and RNA molecules in each cell type in our body. These factors play a pivotal role in all biological processes, including the determination of a cell's fate during development and the maintenance of a cell's physiological function. To achieve this, a TF binds to specific DNA sequences in the noncoding part of the genome, recruits chromatin modifiers and cofactors, and directs the transcription initiation rate of its "target genes." Therefore, a key challenge in deciphering a transcriptional switch is to identify the direct target genes of the master regulators that control the switch, the cis-regulatory elements implementing (auto-)regulatory loops, and the target genes of all the TFs in the downstream regulatory network. A better knowledge of a TF's targetome during specification and differentiation of a particular cell type will generate mechanistic insight into its developmental program. Here, I review computational strategies and methods to predict transcriptional targets by genome-wide searches for TF binding sites using position weight matrices, motif clusters, phylogenetic footprinting, chromatin binding and accessibility data, enhancer classification, motif enrichment, and gene expression signatures.
Affiliation(s)
- Stein Aerts
- Laboratory of Computational Biology, Center for Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
|
5
|
Sleumer MC, Bilenky M, He A, Robertson G, Thiessen N, Jones SJM. Caenorhabditis elegans cisRED: a catalogue of conserved genomic elements. Nucleic Acids Res 2009; 37:1323-34. [PMID: 19151087 PMCID: PMC2651782 DOI: 10.1093/nar/gkn1041] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/26/2023] Open
Abstract
The availability of completely sequenced genomes from eight species of nematodes has provided an opportunity to identify novel cis-regulatory elements in the promoter regions of Caenorhabditis elegans transcripts using comparative genomics. We determined orthologues for C. elegans transcripts in C. briggsae, C. remanei, C. brenneri, C. japonica, Pristionchus pacificus, Brugia malayi and Trichinella spiralis using the WABA alignment algorithm. We pooled the upstream region of each transcript in C. elegans with the upstream regions of its orthologues and identified conserved DNA sequence elements by de novo motif discovery. In total, we discovered 158 017 novel conserved motifs upstream of 3847 C. elegans transcripts for which three or more orthologues were available, and identified 82% of 44 experimentally proven regulatory elements from ORegAnno. We annotated 26% of the motifs as similar to known binding sequences of transcription factors from ORegAnno, TRANSFAC and JASPAR. This is the first catalogue of annotated conserved upstream elements for nematodes and can be used to find putative regulatory elements, improve gene models, discover novel RNA genes, and understand the evolution of transcription factors and their binding sites in phylum Nematoda. The annotated motifs provide novel binding site candidates for both characterized transcription factors and orthologues of characterized mammalian transcription factors.
Affiliation(s)
- Monica C Sleumer
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, Canada
|
6
|
Thompson W, Conlan S, McCue LA, Lawrence CE. Using the Gibbs Motif Sampler for phylogenetic footprinting. Methods Mol Biol 2008; 395:403-24. [PMID: 17993688 DOI: 10.1007/978-1-59745-514-5_25] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 03/15/2023]
Abstract
The Gibbs Motif Sampler (Gibbs) is a software package used to predict conserved elements in biopolymer sequences. Although the software can be used to locate conserved motifs in protein sequences, its most common use is the prediction of transcription factor binding sites (TFBSs) in promoters upstream of gene sequences. We will describe approaches that use Gibbs to locate TFBSs in a collection of orthologous nucleotide sequences, i.e., phylogenetic footprinting. To illustrate this technique, we present examples that use Gibbs to detect binding sites for the transcription factor LexA in orthologous sequence data from representative species belonging to two different proteobacterial divisions.
|
7
|
De Keersmaecker SCJ, Thijs IMV, Vanderleyden J, Marchal K. Integration of omics data: how well does it work for bacteria? Mol Microbiol 2006; 62:1239-50. [PMID: 17040488 DOI: 10.1111/j.1365-2958.2006.05453.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 12/21/2022]
Abstract
In the current omics era, innovative high-throughput technologies allow measuring temporal and conditional changes at various cellular levels. Although individual analysis of each of these omics data undoubtedly results in interesting findings, only by integrating them can a global insight into cellular behaviour be gained. A systems approach thus is predicated on data integration. However, because of the complexity of biological systems and the specificities of the data-generating technologies (noisiness, heterogeneity, etc.), integrating omics data in an attempt to reconstruct signalling networks is not trivial. Developing methodologies for this integration constitutes a major research challenge. Besides their intrinsic value for health care, environment and industry, prokaryotes are ideal model systems to further develop these methods because of their lower regulatory complexity compared with eukaryotes, and the ease with which they can be manipulated. Several successful examples outlined in this review already show the potential of the systems approach for both fundamental and industrial applications, which would be time-consuming or impossible to develop solely through traditional reductionist approaches.
Affiliation(s)
- Sigrid C J De Keersmaecker
- Centre of Microbial and Plant Genetics (CMPG) Katholieke Universiteit Leuven, Kasteelpark Arenberg 20, Belgium
|
8
|
Uddin RK, Singh SM. cis-Regulatory sequences of the genes involved in apoptosis, cell growth, and proliferation may provide a target for some of the effects of acute ethanol exposure. Brain Res 2006; 1088:31-44. [PMID: 16631145 DOI: 10.1016/j.brainres.2006.02.125] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 04/07/2005] [Revised: 01/31/2006] [Accepted: 02/26/2006] [Indexed: 01/22/2023]
Abstract
The physiological effects of alcohol are known to include drunkenness, toxicity, and addiction leading to alcohol-related health and societal problems. Some of these effects are mediated by regulation of expression of many genes involved in alcohol response pathways. Analysis of the regulatory elements and biological interaction of the genes that show coexpression in response to alcohol may give an insight into how they are regulated. Fifty-two ethanol-responsive (ER) genes displaying differential expression in mouse brain in response to acute ethanol exposure were subjected to bioinformatics analysis to identify known or putative transcription factor binding sites and cis-regulatory modules in the promoter regions that may be involved in their responsiveness to alcohol. Functional interactions of these genes were also examined to assess their cumulative contribution to metabolomic pathways. Clustering and promoter sequence analysis of the ER genes revealed the DNA binding site for nuclear transcription factor Y (NFY) as the most significant. NFY also takes part in the proposed biological association network of a number of ER genes, where these genes interact with themselves and other cellular components, and may generate a major cumulative effect on apoptosis, cell survival, and proliferation in response to alcohol. NFY has the potential to play a critical role in mediating the expression of a set of ER genes whose interactions contribute to apoptosis, cell survival, and proliferation, which in turn may affect alcohol-related behaviors.
Affiliation(s)
- Raihan K Uddin
- Department of Biology and Division of Medical Genetics, The University of Western Ontario, London, Ontario, Canada N6A 5B7.
|
9
|
Abstract
Bioinformatics plays an essential role in today's plant science. As the amount of data grows exponentially, there is a parallel growth in the demand for tools and methods in data management, visualization, integration, analysis, modeling, and prediction. At the same time, many researchers in biology are unfamiliar with available bioinformatics methods, tools, and databases, which could lead to missed opportunities or misinterpretation of the information. In this review, we describe some of the key concepts, methods, software packages, and databases used in bioinformatics, with an emphasis on those relevant to plant science. We also cover some fundamental issues related to biological sequence analyses, transcriptome analyses, computational proteomics, computational metabolomics, bio-ontologies, and biological databases. Finally, we explore a few emerging research topics in bioinformatics.
Affiliation(s)
- Seung Yon Rhee
- Department of Plant Biology, Carnegie Institution, Stanford, California 94305, USA.
|
10
|
Van Hellemont R, Monsieurs P, Thijs G, De Moor B, Van de Peer Y, Marchal K. A novel approach to identifying regulatory motifs in distantly related genomes. Genome Biol 2005; 6:R113. [PMID: 16420672 PMCID: PMC1414112 DOI: 10.1186/gb-2005-6-13-r113] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 05/31/2005] [Revised: 08/22/2005] [Accepted: 12/01/2005] [Indexed: 11/25/2022] Open
Abstract
A two-step procedure for identifying regulatory motifs in distantly related organisms is described that combines the advantages of sequence alignment and motif detection approaches. Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size.
Affiliation(s)
- Ruth Van Hellemont
- ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Pieter Monsieurs
- ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Gert Thijs
- ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Bart De Moor
- ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Yves Van de Peer
- Plant Systems Biology, Bioinformatics and Evolutionary Genomics, VIB/Ghent University, Technologiepark 927, 9052 Gent, Belgium
- Kathleen Marchal
- ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Department of Microbial and Molecular Systems, KU Leuven, Kasteelpark Arenberg 20, 3001 Leuven-Heverlee, Belgium
|
11
|
Monsieurs P, De Keersmaecker S, Navarre WW, Bader MW, De Smet F, McClelland M, Fang FC, De Moor B, Vanderleyden J, Marchal K. Comparison of the PhoPQ regulon in Escherichia coli and Salmonella typhimurium. J Mol Evol 2005; 60:462-74. [PMID: 15883881 DOI: 10.1007/s00239-004-0212-7] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.7] [Received: 07/09/2004] [Accepted: 10/20/2004] [Indexed: 01/04/2023]
Abstract
The PhoPQ two-component system acts as a transcriptional regulator that responds to Mg(2+) starvation both in Escherichia coli and Salmonella typhimurium (Garcia et al. 1996; Kato et al. 1999). By monitoring the availability of extracellular Mg(2+), this two-component system allows S. typhimurium to sense the transition from an extracellular environment to a subcellular location. Concomitantly with this transition, a set of virulence factors essential for survival in the intracellular environment is activated by the PhoPQ system (Groisman et al. 1989; Miller et al. 1989). Compared to nonpathogenic strains, such as E. coli K12, the PhoPQ regulon in pathogens must contain target genes specifically contributing to the virulence phenotype. To verify this hypothesis, we compared the composition of the PhoPQ regulon between E. coli and S. typhimurium using a combination of expression experiments and motif data. PhoPQ-dependent genes in both organisms were identified from PhoPQ-related microarray experiments. To distinguish between direct and indirect targets, we searched for the presence of the regulatory motif in the promoter region of the identified PhoPQ-dependent genes. This allowed us to reconstruct the direct PhoPQ-dependent regulons in E. coli K12 and S. typhimurium LT2. Comparison of both regulons revealed a very limited overlap of PhoPQ-dependent genes between both organisms. These results suggest that the PhoPQ system has acquired a specialized function during evolution in each of these closely related species that allows adaptation to the specificities of their lifestyles (e.g., pathogenesis in S. typhimurium).
Affiliation(s)
- Pieter Monsieurs
- ESAT-SCD, K.U. Leuven, Kasteelpark Arenberg 10, 3001, Leuven-Heverlee, Belgium
|
12
|
De Keersmaecker SCJ, Marchal K, Verhoeven TLA, Engelen K, Vanderleyden J, Detweiler CS. Microarray analysis and motif detection reveal new targets of the Salmonella enterica serovar Typhimurium HilA regulatory protein, including hilA itself. J Bacteriol 2005; 187:4381-91. [PMID: 15968047 PMCID: PMC1151768 DOI: 10.1128/jb.187.13.4381-4391.2005] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022] Open
Abstract
DNA regulatory motifs reflect the direct transcriptional interactions between regulators and their target genes and contain important information regarding transcriptional networks. In silico motif detection strategies search for DNA patterns that are present more frequently in a set of related sequences than in a set of unrelated sequences. Related sequences could be genes that are coexpressed and are therefore expected to share similar conserved regulatory motifs. We identified coexpressed genes by carrying out microarray-based transcript profiling of Salmonella enterica serovar Typhimurium in response to the spent culture supernatant of the probiotic strain Lactobacillus rhamnosus GG. Probiotics are live microorganisms which, when administered in adequate amounts, confer a health benefit on the host. They are known to antagonize intestinal pathogens in vivo, including salmonellae. S. enterica serovar Typhimurium causes human gastroenteritis. Infection is initiated by entry of salmonellae into intestinal epithelial cells. The expression of invasion genes is tightly regulated by environmental conditions, as well as by many bacterial factors including the key regulator HilA. One mechanism by which probiotics may antagonize intestinal pathogens is by influencing invasion gene expression. Our microarray experiment yielded a cluster of coexpressed Salmonella genes that are predicted to be down-regulated by spent culture supernatant. This cluster was enriched for genes known to be HilA dependent. In silico motif detection revealed a motif that overlaps the previously described HilA box in the promoter region of three of these genes, spi4_H, sicA, and hilA. Site-directed mutagenesis, beta-galactosidase reporter assays, and gel mobility shift experiments indicated that sicA expression requires HilA and that hilA is negatively autoregulated.
|
13
|
Nemhauser JL, Mockler TC, Chory J. Interdependency of brassinosteroid and auxin signaling in Arabidopsis. PLoS Biol 2004; 2:E258. [PMID: 15328536 PMCID: PMC509407 DOI: 10.1371/journal.pbio.0020258] [Citation(s) in RCA: 383] [Impact Index Per Article: 19.2] [Received: 12/05/2003] [Accepted: 06/09/2004] [Indexed: 11/18/2022] Open
Abstract
How growth regulators provoke context-specific signals is a fundamental question in developmental biology. In plants, both auxin and brassinosteroids (BRs) promote cell expansion, and it was thought that they activated this process through independent mechanisms. In this work, we describe a shared auxin:BR pathway required for seedling growth. Genetic, physiological, and genomic analyses demonstrate that response from one pathway requires the function of the other, and that this interdependence does not act at the level of hormone biosynthetic control. Increased auxin levels saturate the BR-stimulated growth response and greatly reduce BR effects on gene expression. Integration of these two pathways is downstream from BES1 and Aux/IAA proteins, the last known regulatory factors acting downstream of each hormone, and is likely to occur directly on the promoters of auxin:BR target genes. We have developed a new approach to identify potential regulatory elements acting in each hormone pathway, as well as in the shared auxin:BR pathway. We show that one element highly overrepresented in the promoters of auxin- and BR-induced genes is responsive to both hormones and requires BR biosynthesis for normal expression. This work fundamentally alters our view of BR and auxin signaling and describes a powerful new approach to identify regulatory elements required for response to specific stimuli. Although two distinct growth regulators, auxin and brassinosteroids, are required for cell expansion, they are not independent signals: the response from each pathway requires the other.
Affiliation(s)
- Jennifer L Nemhauser
- Plant Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, United States of America
- Todd C Mockler
- Plant Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, United States of America
- Joanne Chory
- Plant Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, United States of America
- Howard Hughes Medical Institute, La Jolla, California, United States of America
|
14
|
Hu Z, Fu Y, Halees AS, Kielbasa SM, Weng Z. SeqVISTA: a new module of integrated computational tools for studying transcriptional regulation. Nucleic Acids Res 2004; 32:W235-41. [PMID: 15215387 PMCID: PMC441621 DOI: 10.1093/nar/gkh483] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Transcriptional regulation is one of the most basic regulatory mechanisms in the cell. The accumulation of multiple metazoan genome sequences and the advent of high-throughput experimental techniques have motivated the development of a large number of bioinformatics methods for the detection of regulatory motifs. The regulatory process is extremely complex and individual computational algorithms typically have very limited success in genome-scale studies. Here, we argue the importance of integrating multiple computational algorithms and present an infrastructure that integrates eight web services covering key areas of transcriptional regulation. We have adopted the client-side integration technology and built a consistent input and output environment with a versatile visualization tool named SeqVISTA. The infrastructure will allow for easy integration of gene regulation analysis software that is scattered over the Internet. It will also enable bench biologists to perform an arsenal of analysis using cutting-edge methods in a familiar environment and bioinformatics researchers to focus on developing new algorithms without the need to invest substantial effort on complex pre- or post-processors. SeqVISTA is freely available to academic users and can be launched online at http://zlab.bu.edu/SeqVISTA/web.jnlp, provided that Java Web Start has been installed. In addition, a stand-alone version of the program can be downloaded and run locally. It can be obtained at http://zlab.bu.edu/SeqVISTA.
Affiliation(s)
- Zhenjun Hu
- Bioinformatics Program, Boston University, 44 Cummington Street, Boston, MA 02215, USA
|
15
|
Studholme DJ, Dixon R. In silico analysis of the sigma54-dependent enhancer-binding proteins in Pirellula species strain 1. FEMS Microbiol Lett 2004; 230:215-25. [PMID: 14757243 DOI: 10.1016/s0378-1097(03)00897-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 12/30/2022] Open
Abstract
The planctomycetes are a phylogenetically distinct group of bacteria, widespread in aquatic and terrestrial environments. Their cell walls lack peptidoglycan and their compartmentalised cells undergo a yeast-like budding cell division process. Many bacteria regulate a subset of their genes by an enhancer-dependent mechanism involving the alternative sigma factor sigma54 (RpoN, sigmaN) in association with sigma54-dependent transcriptional activators known as enhancer-binding proteins (EBPs). The sigma54-dependent regulon has previously been studied in several groups of bacteria, but not in the planctomycetes. We wished to exploit the recently published complete genome sequence of Pirellula species strain 1 to predict and analyse the sigma54-dependent regulon in this interesting group of bacteria. The genome of Pirellula species strain 1 encodes one homologue of sigma54, and 16 sigma54-dependent EBPs, including 10 two-component response regulators and a homologue of Escherichia coli RtcR. Two EBPs contain forkhead-associated domains, representing a novel protein domain combination not previously observed in bacterial EBPs and suggesting a novel link between the enhancer-dependent regulon and 'eukaryotic-like' protein phosphorylation in bacterial signal transduction. We identified several potential sigma54-dependent promoters upstream of genes and operons including two homologues of csrA, which encodes the global regulator CsrA, and rtcBA, encoding an RNA 3'-terminal phosphate cyclase. Phylogenetic analysis of EBP sequences from a wide range of bacterial taxa suggested that planctomycete EBPs fall into several distinct clades. Also the phylogeny of the sigma54 factors is broadly consistent with that of the host organisms. These results are consistent with a very ancient origin of sigma54 within the bacterial lineage. The repertoire of functions predicted to be under the control of the sigma54-dependent regulon in Pirellula shares some similarities (e.g. rtcBA) as well as exhibiting differences with that in other taxonomic groups of bacteria, reinforcing the evolutionarily dynamic nature of this regulon.
|
16
|
Marchal K, De Keersmaecker S, Monsieurs P, van Boxel N, Lemmens K, Thijs G, Vanderleyden J, De Moor B. In silico identification and experimental validation of PmrAB targets in Salmonella typhimurium by regulatory motif detection. Genome Biol 2004; 5:R9. [PMID: 14759259 PMCID: PMC395753 DOI: 10.1186/gb-2004-5-2-r9] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.5] [Received: 06/13/2003] [Revised: 08/27/2003] [Accepted: 12/17/2003] [Indexed: 01/17/2023] Open
Abstract
A genome-wide computational screen for targets of the PmrA transcription factor in Salmonella typhimurium has identified novel target genes. Background: The PmrAB (BasSR) two-component regulatory system is required for Salmonella typhimurium virulence. PmrAB-controlled modifications of the lipopolysaccharide (LPS) layer confer resistance to cationic antibiotic polypeptides, which may allow bacteria to survive within macrophages. The PmrAB system also confers resistance to Fe3+-mediated killing. New targets of the system have recently been discovered that seem not to have a role in the well-described functions of PmrAB, suggesting that the PmrAB-dependent regulon might contain additional, unidentified targets. Results: We performed an in silico analysis of possible targets of the PmrAB system. Using a motif model of the PmrA binding site in DNA, genome-wide screening was carried out to detect PmrAB target genes. To increase confidence in the predictions, all putative targets were subjected to a cross-species comparison (phylogenetic footprinting) using a Gibbs sampling-based motif-detection procedure. As well as the known targets, we detected additional targets with unknown functions. Four of these were experimentally validated (yibD, aroQ, mig-13 and sseJ). Site-directed mutagenesis of the PmrA-binding site (PmrA box) in yibD revealed specific sequence requirements. Conclusions: We demonstrated the efficiency of our procedure by recovering most of the known PmrAB-dependent targets and by identifying unknown targets that we were able to validate experimentally. We also pinpointed directions for further research that could help elucidate the S. typhimurium virulence pathway.
Affiliation(s)
- Kathleen Marchal
- ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium.
|
17
|
Coessens B, Thijs G, Aerts S, Marchal K, De Smet F, Engelen K, Glenisson P, Moreau Y, Mathys J, De Moor B. INCLUSive: A web portal and service registry for microarray and regulatory sequence analysis. Nucleic Acids Res 2003; 31:3468-70. [PMID: 12824346 PMCID: PMC169021 DOI: 10.1093/nar/gkg615] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022] Open
Abstract
INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence models and Gibbs sampling. All tools are available via different web pages and as web services. The web pages are connected and integrated to reflect a methodology and facilitate complex analysis using different tools. The web services can be invoked using standard SOAP messaging. Example clients are available for download to invoke the services from a remote computer or to be integrated with other applications. All services are catalogued and described in a web service registry. The INCLUSive web portal is available for academic purposes at http://www.esat.kuleuven.ac.be/inclusive.
Affiliation(s)
- Bert Coessens
- ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium.
|
18
|
Thompson W, Rouchka EC, Lawrence CE. Gibbs Recursive Sampler: finding transcription factor binding sites. Nucleic Acids Res 2003; 31:3580-5. [PMID: 12824370 PMCID: PMC169014 DOI: 10.1093/nar/gkg608] [Citation(s) in RCA: 231] [Impact Index Per Article: 11.0] [Received: 02/14/2003] [Revised: 04/09/2003] [Accepted: 04/09/2003] [Indexed: 11/14/2022] Open
Abstract
The Gibbs Motif Sampler is a software package for locating common elements in collections of biopolymer sequences. In this paper we describe a new variation of the Gibbs Motif Sampler, the Gibbs Recursive Sampler, which has been developed specifically for locating multiple transcription factor binding sites for multiple transcription factors simultaneously in unaligned DNA sequences that may be heterogeneous in DNA composition. Here we describe the basic operation of the web-based version of this sampler. The sampler may be accessed at http://bayesweb.wadsworth.org/gibbs/gibbs.html and at http://www.bioinfo.rpi.edu/applications/bayesian/gibbs/gibbs.html. An online user guide is available at http://bayesweb.wadsworth.org/gibbs/bernoulli.html and at http://www.bioinfo.rpi.edu/applications/bayesian/gibbs/manual/bernoulli.html. Solaris, Solaris.x86 and Linux versions of the sampler are available as stand-alone programs for academic and not-for-profit users. Commercial licenses are also available. The Gibbs Recursive Sampler is distributed in accordance with the ISCB level 0 guidelines and a requirement for citation of use in scientific publications.
Affiliation(s)
- William Thompson
- The Wadsworth Center, New York State Department of Health, Albany, NY 12201-0509, USA.
|
19
|
Aerts S, Thijs G, Coessens B, Staes M, Moreau Y, De Moor B. Toucan: deciphering the cis-regulatory logic of coregulated genes. Nucleic Acids Res 2003; 31:1753-64. [PMID: 12626717 PMCID: PMC152870 DOI: 10.1093/nar/gkg268] [Citation(s) in RCA: 147] [Impact Index Per Article: 7.0] [Indexed: 11/14/2022] Open
Abstract
TOUCAN is a Java application for the rapid discovery of significant cis-regulatory elements from sets of coexpressed or coregulated genes. Biologists can automatically (i) retrieve genes and intergenic regions, (ii) identify putative regulatory regions, (iii) score sequences for known transcription factor binding sites, (iv) identify candidate motifs for unknown binding sites, and (v) detect those statistically over-represented sites that are characteristic for a gene set. Genes or intergenic regions are retrieved from Ensembl or EMBL, together with orthologs and supporting information. Orthologs are aligned and syntenic regions are selected as candidate regulatory regions. Putative sites for known transcription factors are detected using our MotifScanner, which scores position weight matrices using a probabilistic model. New motifs are detected using our MotifSampler based on Gibbs sampling. Binding sites characteristic for a gene set, and thus statistically over-represented with respect to a reference sequence set, are found using a binomial test. We have validated Toucan by analyzing muscle-specific genes, liver-specific genes and E2F target genes; we have easily detected many known binding sites within intergenic DNA and identified new biologically plausible sites for known and unknown transcription factors. Software available at http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html.
Affiliation(s)
- Stein Aerts
- Department of Electrical Engineering (ESAT-SCD), Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001 Heverlee, Leuven, Belgium.
|