1
|
Solovyev VV, Shahmuradov IA, Salamov AA. Identification of promoter regions and regulatory sites. Methods Mol Biol 2010; 674:57-83. [PMID: 20827586 DOI: 10.1007/978-1-60761-854-6_5] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Indexed: 05/27/2023]
Abstract
Promoter sequences are the main regulatory elements of gene expression. Their recognition by computer algorithms is fundamental for understanding gene expression patterns, cell specificity and development. This chapter describes advanced approaches for identifying promoters in animal, plant and bacterial sequences. We also discuss an approach for identifying statistically significant regulatory motifs in genomic sequences.
|
2
|
Woo DK, Phang TL, Trawick JD, Poyton RO. Multiple pathways of mitochondrial-nuclear communication in yeast: Intergenomic signaling involves ABF1 and affects a different set of genes than retrograde regulation. Biochimica et Biophysica Acta-Gene Regulatory Mechanisms 2009; 1789:135-45. [DOI: 10.1016/j.bbagrm.2008.09.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 07/17/2008] [Revised: 09/14/2008] [Accepted: 09/23/2008] [Indexed: 10/21/2022]
|
3
|
Lenka SK, Lohia B, Kumar A, Chinnusamy V, Bansal KC. Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes. Plant Molecular Biology 2009; 69:261-271. [PMID: 18998058 DOI: 10.1007/s11103-008-9423-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 05/19/2008] [Accepted: 10/16/2008] [Indexed: 05/27/2023]
Abstract
Abscisic acid (ABA), the well-known plant stress hormone, plays a key role in the regulation of a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors that bind to cis-regulatory elements present in their promoters. We found that the CGMCACGTGB motif, which contains the ABA Responsive Element (ABRE) core (ACGT), is over-represented among the promoters of ABA-responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein-coding genes potentially regulated by the ABA-dependent molecular genetic network. RT-PCR analysis of 45 arbitrarily chosen genes from the predicted 402 confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA-responsive genes showed enrichment of signal transduction and stress-related genes among diverse functional categories.
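As an illustration of how a degenerate motif such as CGMCACGTGB can be located in a promoter, the following is a minimal Python sketch. It assumes the standard IUPAC nucleotide codes (M = A or C, B = C, G or T) and scans a single strand only; the function names and the promoter fragment are hypothetical, and this is not the authors' prediction pipeline.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif):
    """Return 0-based start positions of all matches on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    lookahead = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Hypothetical promoter fragment, invented for illustration only.
promoter = "TTGACGCCACGTGTAAACGT"
hits = find_motif(promoter, "CGMCACGTGB")
print(hits)
```

A genome-wide search along these lines would also scan the reverse complement of each promoter, which this sketch leaves out for brevity.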
Affiliation(s)
- Sangram K Lenka
- National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi, 110012, India
|
4
|
Paietta JV. DNA-binding specificity of the CYS3 transcription factor of Neurospora crassa defined by binding-site selection. Fungal Genet Biol 2008; 45:1166-71. [PMID: 18565773 DOI: 10.1016/j.fgb.2008.05.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/30/2007] [Revised: 04/28/2008] [Accepted: 05/06/2008] [Indexed: 11/18/2022]
Abstract
The CYS3 transcription factor is a basic region-leucine zipper (bZIP) DNA-binding protein that is essential for the expression of a coordinately regulated group of genes involved in the acquisition and utilization of sulfur in Neurospora crassa. Binding-site selection from random-sequence oligonucleotides was used to define CYS3-binding specificity. The derived consensus binding site, ATGGCGCCAT, defines a symmetrical sequence (half-site A T G/t G/a C/t) that resembles that of other bZIP proteins such as CREB and C/EBP. By comparison, gel shift assays show that CYS3 binds a greater range of central-core Pur-Pyr-Pur-Pyr sequences than CREB does. The derived CYS3 consensus binding sequence was further validated by demonstrating in vivo sulfur regulation using a heterologous promoter construct. The CYS3-binding site data will be useful for the genome-wide study of sulfur-regulated genes in N. crassa, which has served as a model fungal sulfur control system.
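The symmetry of the consensus site can be checked directly: a symmetrical (palindromic) DNA site reads the same as its own reverse complement. The short Python sketch below illustrates that check; the helper names are hypothetical and are not taken from the paper.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


consensus = "ATGGCGCCAT"  # consensus site reported in the abstract
print(reverse_complement(consensus), is_palindromic(consensus))
```

Each half-site (ATGGC) is the reverse complement of the other half (GCCAT), which is what makes the full 10-bp site symmetrical.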
Affiliation(s)
- John V Paietta
- Department of Biochemistry and Molecular Biology, Wright State University, Dayton, OH 45435, USA.
|
5
|
Ambesi-Impiombato A, Bansal M, Liò P, di Bernardo D. Computational framework for the prediction of transcription factor binding sites by multiple data integration. BMC Neurosci 2006; 7 Suppl 1:S8. [PMID: 17118162 PMCID: PMC1775048 DOI: 10.1186/1471-2202-7-s1-s8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022] Open
Abstract
Control of gene expression is essential to the establishment and maintenance of all cell types, and its dysregulation is involved in pathogenesis of several diseases. Accurate computational predictions of transcription factor regulation may thus help in understanding complex diseases, including mental disorders in which dysregulation of neural gene expression is thought to play a key role. However, biological mechanisms underlying the regulation of gene expression are not completely understood, and predictions via bioinformatics tools are typically poorly specific. We developed a bioinformatics workflow for the prediction of transcription factor binding sites from several independent datasets. We show the advantages of integrating information based on evolutionary conservation and gene expression, when tackling the problem of binding site prediction. Consistent results were obtained on a large simulated dataset consisting of 13050 in silico promoter sequences, on a set of 161 human gene promoters for which binding sites are known, and on a smaller set of promoters of Myc target genes. Our computational framework for binding site prediction can integrate multiple sources of data, and its performance was tested on different datasets. Our results show that integrating information from multiple data sources, such as genomic sequence of genes' promoters, conservation over multiple species, and gene expression data, indeed improves the accuracy of computational predictions.
Affiliation(s)
- Alberto Ambesi-Impiombato
- TIGEM, Telethon Institute of Genetics and Medicine, Naples, Italy
- Department of Neuroscience, University of Medicine "Federico II", Naples, Italy
- Mukesh Bansal
- TIGEM, Telethon Institute of Genetics and Medicine, Naples, Italy
- SEMM, European School of Molecular Medicine, Naples, Italy
- Pietro Liò
- Computer Laboratory, Cambridge University, Cambridge, UK
- Diego di Bernardo
- TIGEM, Telethon Institute of Genetics and Medicine, Naples, Italy
- SEMM, European School of Molecular Medicine, Naples, Italy
|
6
|
Abstract
MOTIVATION: Assigning functions for unknown genes based on diverse large-scale data is a key task in functional genomics. Previous work on gene function prediction has addressed this problem using independent classifiers for each function. However, such an approach ignores the structure of functional class taxonomies, such as the Gene Ontology (GO). Over a hierarchy of functional classes, a group of independent classifiers where each one predicts gene membership to a particular class can produce a hierarchically inconsistent set of predictions, where for a given gene a specific class may be predicted positive while its inclusive parent class is predicted negative. Taking the hierarchical structure into account resolves such inconsistencies and provides an opportunity for leveraging all classifiers in the hierarchy to achieve higher specificity of predictions.
RESULTS: We developed a Bayesian framework for combining multiple classifiers based on the functional taxonomy constraints. Using a hierarchy of support vector machine (SVM) classifiers trained on multiple data types, we combined predictions in our Bayesian framework to obtain the most probable consistent set of predictions. Experiments show that over a 105-node subhierarchy of the GO, our Bayesian framework improves predictions for 93 nodes. As an additional benefit, our method also provides implicit calibration of SVM margin outputs to probabilities. Using this method, we make function predictions for multiple proteins, and experimentally confirm predictions for proteins involved in mitosis.
SUPPLEMENTARY INFORMATION: Results for the 105 selected GO classes and predictions for 1059 unknown genes are available at: http://function.princeton.edu/genesite/
CONTACT: ogt@cs.princeton.edu
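The hierarchical inconsistency described in the abstract (a class predicted positive while its parent is predicted negative) can be detected with a few lines of Python. The sketch below covers only that detection step, not the Bayesian correction the authors developed; the function name, the toy hierarchy, and the choice to treat a parent with no prediction as negative are all assumptions made for illustration.

```python
def find_inconsistencies(parents, predictions):
    """List (child, parent) pairs where the child is positive but the parent is not.

    parents: dict mapping a class name to a list of its parent classes.
    predictions: dict mapping a class name to a boolean prediction.
    A parent absent from `predictions` is treated as negative (an assumption).
    """
    violations = []
    for child, is_positive in predictions.items():
        if not is_positive:
            continue
        for parent in parents.get(child, []):
            if not predictions.get(parent, False):
                violations.append((child, parent))
    return violations


# Toy two-level hierarchy and per-class predictions, invented for illustration.
parents = {"mitotic cell cycle": ["cell cycle"]}
predictions = {"cell cycle": False, "mitotic cell cycle": True, "metabolism": True}
print(find_inconsistencies(parents, predictions))
```

Here the gene is predicted to take part in the mitotic cell cycle but not in the cell cycle, which is exactly the kind of contradictory output that independent per-class classifiers can produce.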
Affiliation(s)
- Zafer Barutcuoglu
- Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08544, USA
|
7
|
Gomes DS, Riger CJ, Pinto MLC, Panek AD, Eleutherio ECA. Evaluation of the role of Ace1 and Yap1 in cadmium absorption using the eukaryotic cell model Saccharomyces cerevisiae. Environmental Toxicology and Pharmacology 2005; 20:383-389. [PMID: 21783616 DOI: 10.1016/j.etap.2005.02.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 08/31/2004] [Accepted: 02/22/2005] [Indexed: 05/31/2023]
Abstract
In a previous paper, we demonstrated that the cytoplasmic level of glutathione-cadmium complex affects cadmium absorption by Saccharomyces cerevisiae, a common eukaryotic cell model for studies of stress response. It was also observed that the absorption of this non-essential metal seems to be mediated by Zrt1, a high-affinity zinc transporter. Looking further into the control mechanism, we found that Ace1 deficiency significantly impaired cadmium transport. Ace1 is a transcription factor that activates the expression of CUP1, which encodes the S. cerevisiae metallothionein. In contrast, deficiency in the transcription factor Yap1 produced a two-fold increase in cadmium uptake. Cells lacking Yap1 showed low levels of glutathione, which could explain their higher capacity for absorbing cadmium. However, the Ace1-deficient mutant strain exhibited considerable amounts of glutathione. Using RT-PCR analysis, we observed that the lack of Yap1 activates the expression of both CUP1 and ZRT1, while the lack of Ace1 significantly inhibited the expression of these genes. Thus, metallothionein also seems to participate in the regulation of cadmium transport by controlling the expression of ZRT1. We propose that, at low levels of Cup1, the cytoplasmic concentration of essential metals such as zinc in free (uncomplexed) form increases, inhibiting ZRT1 expression. In contrast, at high levels of Cup1, the concentration of these metals falls, inducing ZRT1 expression and favoring cadmium absorption. These results confirm the involvement of the zinc transport system in cadmium transport.
Affiliation(s)
- D S Gomes
- Departamento de Bioquímica, Instituto de Química, UFRJ, 21949-900 Rio de Janeiro, RJ, Brazil
|
8
|
Suzuki M, Ketterling MG, McCarty DR. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis. Plant Physiology 2005; 139:437-47. [PMID: 16113229 PMCID: PMC1203392 DOI: 10.1104/pp.104.058412] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 05/04/2023]
Abstract
We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
Affiliation(s)
- Masaharu Suzuki
- Plant Molecular and Cellular Biology Program, Horticultural Sciences Department, University of Florida, Gainesville, 32611, USA.
|
9
|
Harkness TAA, Shea KA, Legrand C, Brahmania M, Davies GF. A functional analysis reveals dependence on the anaphase-promoting complex for prolonged life span in yeast. Genetics 2004; 168:759-74. [PMID: 15514051 PMCID: PMC1448841 DOI: 10.1534/genetics.104.027771] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Received: 02/17/2004] [Accepted: 06/21/2004] [Indexed: 11/18/2022] Open
Abstract
Defects in anaphase-promoting complex (APC) activity, which regulates mitotic progression and chromatin assembly, result in genomic instability, a hallmark of premature aging and cancer. We investigated whether APC-dependent genomic stability affects aging and life span in yeast. Utilizing replicative and chronological aging assays, the APC was shown to promote longevity. Multicopy expression of genes encoding Snf1p (MIG1) and PKA (PDE2) aging-pathway components suppressed apc5CA phenotypes, suggesting their involvement in APC-dependent longevity. While it is known that PKA inhibits APC activity and reduces life span, a link between the Snf1p-inhibited Mig1p transcriptional modulator and the APC is novel. Our mutant analysis supports a model in which Snf1p promotes extended life span by inhibiting the negative influence of Mig1p on the APC. Consistent with this, we found that increased MIG1 expression reduced replicative life span, whereas mig1Δ mutations suppressed the apc5CA chronological aging defect. Furthermore, Mig1p and Mig2p activate APC gene transcription, particularly on glycerol, and mig2Δ, but not mig1Δ, confers a prolonged replicative life span in both APC5 and apc5CA cells. However, glucose repression of APC genes was Mig1p and Mig2p independent, indicating the presence of an uncharacterized factor. Therefore, we propose that APC-dependent genomic stability is linked to prolonged longevity by the antagonistic regulation of the PKA and Snf1p pathways.
Affiliation(s)
- Troy A A Harkness
- Department of Anatomy and Cell Biology, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5E5, Canada
|
10
|
Kankainen M, Holm L. POBO, transcription factor binding site verification with bootstrapping. Nucleic Acids Res 2004; 32:W222-9. [PMID: 15215385 PMCID: PMC441601 DOI: 10.1093/nar/gkh463] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022] Open
Abstract
Transcription factors can either activate or repress target genes by binding to short nucleotide sequence motifs in the promoter regions of these genes. Here, we present POBO, a promoter bootstrapping program for gene expression data. POBO can be used to detect, compare and verify predetermined transcription factor binding site motifs in the promoters of one or two clusters of co-regulated genes. The program calculates the frequencies of the motif in the input promoter sets. A bootstrap analysis detects significantly over- or underrepresented motifs. The output of the program presents bootstrapped results in picture and text formats. The program was tested with published data from transgenic WRKY70 microarray experiments. Intriguingly, motifs recognized by the WRKY transcription factors of plant defense pathways are similarly enriched in both up- and downregulated clusters. POBO analysis suggests slightly modified hypothetical motifs that discriminate between up- and downregulated clusters. In conclusion, POBO allows easy, fast and accurate verification of putative regulatory motifs. The statistical tests implemented in POBO can be useful in eliminating false positives from the results of pattern discovery programs and increasing the reliability of true positives. POBO is freely available from http://ekhidna.biocenter.helsinki.fi:9801/pobo.
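The general idea of testing motif over-representation by bootstrapping can be sketched in Python as follows. This is a simplified stand-in and not POBO's actual procedure or statistics: it resamples promoters with replacement from a background set and estimates how often a random sample contains at least as many motif occurrences as the cluster of interest. The function names, the exact-match counting, the example motif and the toy sequences are all assumptions made for illustration.

```python
import random


def count_motif(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def bootstrap_p_value(cluster, background, motif, n_boot=1000, seed=0):
    """Empirical one-sided p-value for over-representation of motif in cluster.

    Each bootstrap round draws len(cluster) promoters from the background
    with replacement and compares its total motif count with the observed one.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = sum(count_motif(s, motif) for s in cluster)
    at_least = 0
    for _ in range(n_boot):
        sample = rng.choices(background, k=len(cluster))
        if sum(count_motif(s, motif) for s in sample) >= observed:
            at_least += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_boot + 1)


# Toy sequences and motif, invented for illustration only.
cluster = ["AATTGACAA", "TTGACTTGAC"]
background = ["AAAAAAAAA", "CCCCCCCCC", "GGGGGGGGG"]
print(bootstrap_p_value(cluster, background, "TTGAC"))
```

A small p-value suggests the motif is more frequent in the cluster than expected from the background; testing for under-representation would use the opposite tail.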
Affiliation(s)
- Matti Kankainen
- Structural Genomics Group, Institute of Biotechnology, University of Helsinki, PO Box 56 (Viikinkaari 5), Fin-00014, Helsinki, Finland
|
11
|
Kreps J, Budworth P, Goff S, Wang R. Identification of putative plant cold responsive regulatory elements by gene expression profiling and a pattern enumeration algorithm. Plant Biotechnology Journal 2003; 1:345-52. [PMID: 17166133 DOI: 10.1046/j.1467-7652.2003.00032.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 05/13/2023]
Abstract
A pattern enumeration algorithm named GBSSR has been developed to analyse co-expressed gene groups identified through gene chip expression profiling and to search for putative cis-regulatory elements, an important step toward understanding transcriptional factors, quantitative trait loci and gene regulatory networks. Without making any statistical assumptions, this algorithm establishes the frequency distribution of all eligible 6-15 bp strings by extensive bootstrap sampling from an entire genome's worth of promoters, enabling those over-represented in a co-expressed gene group to be identified. Using a well-studied plant cold responsive gene system as a positive control, several known cold responsive elements were identified as top ranking candidates, along with some potentially novel ones. A typical analysis of 40 co-expressed genes takes about 7 days on a relatively inexpensive Linux cluster with 32 × 1.4 GHz Intel CPUs.
Affiliation(s)
- Joel Kreps
- Torrey Mesa Research Institute, 3115 Merryfield Row, San Diego, CA 92121, USA
|
12
|
Boardman PE, Oliver SG, Hubbard SJ. SiteSeer: Visualisation and analysis of transcription factor binding sites in nucleotide sequences. Nucleic Acids Res 2003; 31:3572-5. [PMID: 12824368 PMCID: PMC168918 DOI: 10.1093/nar/gkg511] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 02/13/2003] [Revised: 03/11/2003] [Accepted: 03/11/2003] [Indexed: 11/13/2022] Open
Abstract
The regulation of gene expression is a fundamental process within every living cell, which allows organisms to manage the precise levels of functional gene products with high sensitivity. It is well established that specific DNA sequences located upstream of the transcriptional start site are important in facilitating the binding of regulatory proteins that control the transcription of the gene. Indeed, microarray-based studies have successfully mined the upstream regions of co-expressed genes and discovered over-represented sequences corresponding to known promoter sites. Here we describe a tool for the visualisation of mapped transcription factor binding sites in the upstream regions of either single or grouped eukaryotic genes, which allows users to examine the positions of known and user-defined sites (http://rocky.bms.umist.ac.uk/SiteSeer/). SiteSeer allows the user to map different sections of the TRANSFAC and SCPD databases (or a set of user-defined sites) onto nucleotide sequences. Additionally, users may restrict the analysis by expectation values for certain DNA words as well as by known binding sites specific to a given organism. We believe this tool will prove particularly valuable for biologists who wish to examine sets of co-expressed or functionally-related genes and those who wish to visualise the positions of promoter sequences and generate displays for publications.
Affiliation(s)
- Paul E Boardman
- Department of Biomolecular Sciences, University of Manchester Institute of Science and Technology, PO Box 88, Manchester M60 1QD, UK
|
13
|
Wong CM, Ching YP, Zhou Y, Kung HF, Jin DY. Transcriptional regulation of yeast peroxiredoxin gene TSA2 through Hap1p, Rox1p, and Hap2/3/5p. Free Radic Biol Med 2003; 34:585-97. [PMID: 12614847 DOI: 10.1016/s0891-5849(02)01354-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
In Saccharomyces cerevisiae, the transcription of peroxiredoxin gene TSA2 is responsive to various reactive oxygen and nitrogen species. Redox-regulated transcriptional activators Yap1p, Skn7p, Msn2p/Msn4p have been shown to play a role in regulating TSA2 expression. In this study we show that the transcription of TSA2 is under complex control involving additional transcription factors Hap1p, Rox1p, and Hap2/3/5p. Deletion of HAP1 led to a 50% reduction of TSA2 transcriptional activity. As an intracellular oxygen sensor, heme stimulated TSA2 transcription by activating Hap1p. The induction of TSA2 by H₂O₂ is also mediated in part through Hap1p. Countering the effects of Hap1p was a transcriptional repressor Rox1p. Deletion of ROX1 or mutation of Rox1p-binding site significantly activated TSA2 transcription. In addition, TSA2 activity was diminished in hap2Δ, hap3Δ, hap4Δ, and hap5Δ strains, but was stimulated upon overexpression of Hap4p. Hap2/3/5p may cooperate with Msn2/4p to activate TSA2 after diauxic shift. Finally, we demonstrated a role for kinases Ras1/2p and Hog1p in Msn2/4p-dependent activation of TSA2. In particular, Hog1p mediated the response of TSA2 to osmotic and oxidative stress. Taken together, our findings suggest that the expression of TSA2 is regulated by a group of transcription factors responsive differentially to stress conditions.
Affiliation(s)
- Chi-Ming Wong
- Institute of Molecular Biology, The University of Hong Kong, Hong Kong, China
|
14
|
Zhang H, Ramanathan Y, Soteropoulos P, Recce ML, Tolias PP. EZ-Retrieve: a web-server for batch retrieval of coordinate-specified human DNA sequences and underscoring putative transcription factor-binding sites. Nucleic Acids Res 2002; 30:e121. [PMID: 12409480 PMCID: PMC135846 DOI: 10.1093/nar/gnf120] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023] Open
Abstract
The availability of a draft human genome sequence and the ability to monitor the transcription of thousands of genes with DNA microarrays have created a need for new computational tools that can analyze cis-regulatory elements controlling genes that display similar expression patterns. We have developed a tool designated EZ-Retrieve that can: (i) retrieve any particular region of human genome sequence from the NCBI database and (ii) analyze retrieved sequences for putative transcription factor-binding sites (TFBSs) as listed in the TRANSFAC database. The tool is web-based, user-friendly and offers both batch sequence retrieval and batch TFBS prediction. A major application of EZ-Retrieve is the analysis of co-expressed genes that are highlighted as expression clusters in DNA microarray experiments.
Affiliation(s)
- Haibo Zhang
- Center for Applied Genomics, Public Health Research Institute, 225 Warren Street, ICPH W420M, Newark, NJ 07103, USA
|