51
|
Sinnett D, Beaulieu P, Bélanger H, Lefebvre JF, Langlois S, Théberge MC, Drouin S, Zotti C, Hudson TJ, Labuda D. Detection and characterization of DNA variants in the promoter regions of hundreds of human disease candidate genes. Genomics 2006; 87:704-10. [PMID: 16500075 DOI: 10.1016/j.ygeno.2006.01.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 09/30/2005] [Revised: 12/21/2005] [Accepted: 01/02/2006] [Indexed: 11/20/2022]
Abstract
Understanding genetic variation might reveal the cause of individual susceptibility to a variety of complex diseases such as asthma, diabetes, and cancer. Current efforts to identify functional DNA variants have essentially been oriented toward single nucleotide polymorphisms (SNPs) found in coding regions of candidate genes since they have direct impact on the structure and function of the affected proteins. Abnormal expression of finely regulated genes could also lead to disequilibria in different metabolic pathways and/or biological processes. Thus investigation of SNPs in the promoter regions (pSNPs) of genes should improve our knowledge of the etiology of complex diseases. Unfortunately, little is known about the nature and the prevalence of pSNPs. We have analyzed 197 genes targeting the promoter region, arbitrarily defined as a 2-kb genomic segment upstream of the transcription initiation site, by screening by dHPLC for the presence of SNPs in a worldwide panel of 40 individuals. As a result 1838 pSNPs were detected, 75% of which modify (by either gain or loss) putative binding sites of known transcription factors. We also examined the distribution of these pSNPs among features such as conserved regions, repeats, and dinucleotides as well as Gene Ontology terms. This report supports the functional relevance of several of the pSNPs investigated and suggests a putative impact on disease susceptibility.
Affiliation(s)
- Daniel Sinnett
- Division of Hematology-Oncology, Research Center, Sainte-Justine Hospital, 3175 Chemin de la Côte-Sainte-Catherine, Montreal, Canada QC H3T 1C5.
|
52
|
Kel A, Konovalova T, Waleev T, Cheremushkin E, Kel-Margoulis O, Wingender E. Composite Module Analyst: a fitness-based tool for identification of transcription factor binding site combinations. Bioinformatics 2006; 22:1190-7. [PMID: 16473870 DOI: 10.1093/bioinformatics/btl041] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Indexed: 01/19/2023]
Abstract
MOTIVATION: Functionally related genes involved in the same molecular-genetic, biochemical or physiological process are often regulated coordinately. Such regulation is provided by precisely organized binding of a multiplicity of special proteins [transcription factors (TFs)] to their target sites (cis-elements) in regulatory regions of genes. Cis-element combinations provide a structural basis for the generation of unique patterns of gene expression. RESULTS: Here we present a new approach for defining promoter models based on the composition of TF binding sites and their pairs. We utilize a multicomponent fitness function for selection of the promoter model that fits best to the observed gene expression profile. We demonstrate examples of successful application of the fitness function with the help of a genetic algorithm for the analysis of functionally related or co-expressed genes as well as testing on simulated and permutated data. AVAILABILITY: The CMA program is freely available for non-commercial users. URL: http://www.gene-regulation.com/pub/programs.html#CMAnalyst. It is also a part of the commercial system ExPlain (www.biobase.de) designed for causal analysis of gene expression data.
Affiliation(s)
- A Kel
- BIOBASE GmbH Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany.
|
53
|
Abstract
Among more than 120 genes that are now known to regulate mammalian pigmentation, one of the key genes is MC1R, which encodes the melanocortin 1 receptor, a seven transmembrane G protein-coupled receptor expressed on the surface of melanocytes. Since the monoexonic sequence of the gene was cloned and characterized more than a decade ago, tremendous efforts have been dedicated to the extensive genotyping of mostly red-haired populations all around the world, thus providing allelic variants that may or may not account for melanoma susceptibility in the presence or absence of ultraviolet (UV) exposure. Soluble factors, such as proopiomelanocortin (POMC) derivatives, agouti signal protein (ASP) and others, regulate MC1R expression, leading to improved photoprotection via increased eumelanin synthesis or in contrast, inducing the switch to pheomelanin. However, there is an obvious lack of knowledge regarding the numerous and complex regulatory mechanisms that govern the expression of MC1R at the intra-cellular level, from gene transcription in response to an external stimulus to the expression of the mature receptor on the melanocyte surface.
Affiliation(s)
- Francois Rouzaud
- Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, Building 37, Room 2132, Bethesda, MD 20892, USA
|
54
|
Vavouri T, Elgar G. Prediction of cis-regulatory elements using binding site matrices--the successes, the failures and the reasons for both. Curr Opin Genet Dev 2005; 15:395-402. [PMID: 15950456 DOI: 10.1016/j.gde.2005.05.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Received: 03/07/2005] [Accepted: 05/23/2005] [Indexed: 01/02/2023]
Abstract
Protein-DNA interactions control many aspects of animal development and cellular responses to the environment. Although profiling of individual transcription factor binding sites is not a reliable guide for predicting the position of cis-regulatory elements in large genomes, modelling the evolution and the organization of regulatory elements has provided enough information to make some successful predictions. For vertebrate genomes, the field is limited by the lack of sufficient experimental data upon which to build reliable models. Nonetheless, a combination of experimental, computational and comparative data is likely to reveal aspects of complex regulatory networks in vertebrates, just as it has already done for simple eukaryotic genomes.
Affiliation(s)
- Tanya Vavouri
- Comparative Genomics Group, MRC Rosalind Franklin Centre for Genomics Research, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
|
55
|
Zhu Z, Shendure J, Church GM. Discovering functional transcription-factor combinations in the human cell cycle. Genome Res 2005; 15:848-55. [PMID: 15930495 PMCID: PMC1142475 DOI: 10.1101/gr.3394405] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 12/13/2022]
Abstract
With the completion of full genome sequences and advancement in high-throughput technologies, in silico methods have been successfully used to integrate diverse data sources toward unraveling the combinatorial nature of transcriptional regulation. So far, almost all of these studies are restricted to lower eukaryotes such as budding yeast. We describe here a computational search for functional transcription-factor (TF) combinations using phylogenetically conserved sequences and microarray-based expression data. Taking into account both orientational and positional constraints, we investigated the overrepresentation of binding sites in the vicinity of one another and whether these combinations result in more coherent expression profiles. Without any prior biological knowledge, the search led to the discovery of several experimentally established TF associations, as well as some novel ones. In particular, we identified a regulatory module controlling cell cycle-dependent transcription of G2-M genes and expanded its functional generality. We also detected many homotypic combinations, supporting the importance of binding-site density in transcriptional regulation of higher eukaryotes.
Affiliation(s)
- Zhou Zhu
- Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA.
|
56
|
Di Cara A, Schmidt K, Hemmings BA, Oakeley EJ. PromoterPlot: a graphical display of promoter similarities by pattern recognition. Nucleic Acids Res 2005; 33:W423-6. [PMID: 15980503 PMCID: PMC1160174 DOI: 10.1093/nar/gki413] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
PromoterPlot is a web-based tool for simplifying the display and processing of transcription factor searches using either the commercial or free TransFac distributions. The input sequence is a TransFac search (public version) or FASTA/Affymetrix IDs (local install). It uses an intuitive pattern recognition algorithm for finding similarities between groups of promoters by dividing transcription factor predictions into conserved triplet models. To minimize the number of false-positive models, it can optionally exclude factors that are known to be unexpressed or inactive in the cells being studied based on microarray or proteomic expression data. The program will also estimate the likelihood of finding a pattern by chance based on the frequency observed in a control set of mammalian promoters we obtained from Genomatix. The results are stored as an interactive SVG web page on our server.
Affiliation(s)
- Edward J. Oakeley
- To whom correspondence should be addressed. Tel: +41 61 697 6986; Fax: +41 61 697 3976;
|
57
|
Chekmenev DS, Haid C, Kel AE. P-Match: transcription factor binding site search by combining patterns and weight matrices. Nucleic Acids Res 2005; 33:W432-7. [PMID: 15980505 PMCID: PMC1160202 DOI: 10.1093/nar/gki441] [Citation(s) in RCA: 153] [Impact Index Per Article: 7.7] [Indexed: 11/27/2022]
Abstract
P-Match is a new tool for identifying transcription factor (TF) binding sites in DNA sequences. It combines pattern matching and weight matrix approaches, thus providing higher accuracy of recognition than either method alone. P-Match is closely interconnected with the TRANSFAC® database. In particular, P-Match uses the matrix library as well as sets of aligned known TF-binding sites collected in TRANSFAC® and therefore provides the possibility to search for a large variety of different TF binding sites. Using results of extensive tests of recognition accuracy, we selected three sets of optimized cut-off values that minimize either false negatives or false positives, or the sum of both errors. Comparison with weight matrix approaches such as the Match™ tool shows that P-Match generally provides superior recognition accuracy in the area of low false negative errors (high sensitivity). As users of Match™ will recognize, P-Match also allows users to save user-specific profiles that include selected subsets of matrices with corresponding TF-binding sites or user-defined cut-off values. Furthermore, a number of tissue-specific profiles are provided that were compiled by the TRANSFAC® team. A public version of the P-Match tool is available at .
Affiliation(s)
- A. E. Kel
- To whom correspondence should be addressed. Tel: +49 5331 8584 41; Fax: +49 5331 8584 70;
|
58
|
Döhr S, Klingenhoff A, Maier H, de Angelis MH, Werner T, Schneider R. Linking disease-associated genes to regulatory networks via promoter organization. Nucleic Acids Res 2005; 33:864-72. [PMID: 15701758 PMCID: PMC549397 DOI: 10.1093/nar/gki230] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
Pathway- or disease-associated genes may participate in more than one transcriptional co-regulation network. Such gene groups can be readily obtained by literature analysis or by high-throughput techniques such as microarrays or protein-interaction mapping. We developed a strategy that defines regulatory networks by in silico promoter analysis, finding potentially co-regulated subgroups without a priori knowledge. Pairs of transcription factor binding sites conserved in orthologous genes (vertically) as well as in promoter sequences of co-regulated genes (horizontally) were used as seeds for the development of promoter models representing potential co-regulation. This approach was applied to a Maturity Onset Diabetes of the Young (MODY)-associated gene list, which yielded two models connecting functionally interacting genes within MODY-related insulin/glucose signaling pathways. Additional genes functionally connected to our initial gene list were identified by database searches with these promoter models. Thus, data-driven in silico promoter analysis allowed integrating molecular mechanisms with biological functions of the cell.
Affiliation(s)
- A. Klingenhoff
- Genomatix Software GmbH, Landsberger Str. 6, D-80339 München, Germany
- T. Werner
- Genomatix Software GmbH, Landsberger Str. 6, D-80339 München, Germany
- R. Schneider
- To whom correspondence should be addressed. Tel: +49 89 3187 4060; Fax: +49 89 3187 4400;
|
59
|
Bennett CS, Khorram Khorshid HR, Kitchen JA, Arteta D, Dalgleish R. Characterization of the human secreted phosphoprotein 24 gene (SPP2) and comparison of the protein sequence in nine species. Matrix Biol 2005; 22:641-51. [PMID: 15062857 DOI: 10.1016/j.matbio.2003.12.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 07/01/2003] [Revised: 11/11/2003] [Accepted: 12/03/2003] [Indexed: 10/26/2022]
Abstract
Secreted phosphoprotein 24 (spp24) is a member of the cystatin superfamily, which was first identified in cattle as a minor component of cortical bone and subsequently has been identified as a component of the fetuin-mineral complex. We have localized the human SPP2 gene, which encodes spp24 to chromosome 2q37.1, determined its structure and mapped the start of transcription in liver. There is no CAAT or TATA box in the promoter region but potential transcription factor (TF)-binding sites have been identified. The gene comprises eight exons spread over a region of approximately 27 kb with the cystatin-like region of spp24 encoded by four exons, rather than the three-exon structure typical of the genes encoding the archetypal cystatins. A rare single amino acid polymorphism (p.S38F) has been identified within the mature protein and its significance has been assessed by comparing the sequence of human spp24 with that of eight other species.
Affiliation(s)
- Clare S Bennett
- Department of Genetics, University of Leicester, University Road, Leicester LE1 7RH, UK
|
60
|
Shelest E, Wingender E. Construction of predictive promoter models on the example of antibacterial response of human epithelial cells. Theor Biol Med Model 2005; 2:2. [PMID: 15647113 PMCID: PMC546226 DOI: 10.1186/1742-4682-2-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 09/16/2004] [Accepted: 01/12/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Binding of a bacterium to a eukaryotic cell triggers a complex network of interactions in and between both cells. P. aeruginosa is a pathogen that causes acute and chronic lung infections by interacting with the pulmonary epithelial cells. We use this example for examining the ways of triggering the response of the eukaryotic cell(s), leading us to a better understanding of the details of the inflammatory process in general. RESULTS: Considering a set of genes co-expressed during the antibacterial response of human lung epithelial cells, we constructed a promoter model for the search of additional target genes potentially involved in the same cell response. The model construction is based on the consideration of pair-wise combinations of transcription factor binding sites (TFBS). It has been shown that the antibacterial response of human epithelial cells is triggered by at least two distinct pathways. We therefore supposed that there are two subsets of promoters activated by each of them. Optimally, they should be "complementary" in the sense of appearing in complementary subsets of the (+)-training set. We developed the concept of complementary pairs, i.e., two mutually exclusive pairs of TFBS, each of which should be found in one of the two complementary subsets. CONCLUSIONS: We suggest a simple but exhaustive method for searching for TFBS pairs which characterize the whole (+)-training set, as well as for complementary pairs. Applying this method, we came up with a promoter model of antibacterial response genes that consists of one TFBS pair which should be found in the whole training set and four complementary pairs. We applied this model to screening of 13,000 upstream regions of human genes and identified 430 new target genes which are potentially involved in antibacterial defense mechanisms.
Affiliation(s)
- Ekaterina Shelest
- Dept. of Bioinformatics, UKG, University of Göttingen, Goldschmidtstr. 1, D-37077 Göttingen, Germany
- Edgar Wingender
- Dept. of Bioinformatics, UKG, University of Göttingen, Goldschmidtstr. 1, D-37077 Göttingen, Germany
- BIOBASE GmbH, Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany
|
61
|
Kel A, Reymann S, Matys V, Nettesheim P, Wingender E, Borlak J. A novel computational approach for the prediction of networked transcription factors of aryl hydrocarbon-receptor-regulated genes. Mol Pharmacol 2004; 66:1557-72. [PMID: 15342792 DOI: 10.1124/mol.104.001677] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]
Abstract
A novel computational method based on a genetic algorithm was developed to study composite structure of promoters of coexpressed genes. Our method enabled an identification of combinations of multiple transcription factor binding sites regulating the concerted expression of genes. In this article, we study genes whose expression is regulated by a ligand-activated transcription factor, aryl hydrocarbon receptor (AhR), that mediates responses to a variety of toxins. AhR-mediated change in expression of AhR target genes was measured by oligonucleotide microarrays and by reverse transcription-polymerase chain reaction in human and rat hepatocytes. Promoters and long-distance regulatory regions (>10 kb) of AhR-responsive genes were analyzed by the genetic algorithm and a variety of other computational methods. Rules were established on the local oligonucleotide context in the flanks of the AhR binding sites, on the occurrence of clusters of AhR recognition elements, and on the presence in the promoters of specific combinations of multiple binding sites for the transcription factors cooperating in the AhR regulatory network. Our rules were applied to search for yet unknown Ah-receptor target genes. Experimental evidence is presented to demonstrate high fidelity of this novel in silico approach.
|
62
|
Werner T. Proteomics and regulomics: the yin and yang of functional genomics. MASS SPECTROMETRY REVIEWS 2004; 23:25-33. [PMID: 14625890 DOI: 10.1002/mas.10067] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 05/24/2023]
Abstract
Protein analysis is a field of research with a long history. Recently, the development of a series of proteomics approaches, i.e., simultaneous analyses on all or a majority of proteins in a cell at a given state, has reinvigorated protein analyses. Mass Spectrometry also developed into one of the most versatile technical tools supporting or even enabling many proteomics-oriented approaches, providing a convenient link between experimental protein analysis and the corresponding amino acid sequences. Thus direct links to the genomic sequence can be established, which opens the door for a synergistic combination with genomic sequence analysis. This review focuses especially on aspects of genome-wide transcription control, regulomics in analogy to all the other -omics, and how a combination of MS-based proteomics with in silico regulomics analyses can produce synergistic effects in the quest to understand how cells function. This is illustrated on a real life example showing how the MS-analysis and in silico promoter analysis can extend the list of candidates for signaling pathways, here the MAP kinase pathway.
Affiliation(s)
- Thomas Werner
- Genomatix Software GmbH, Landsbergerstr 6, D-80339 München, Germany.
|
63
|
Hogan PG, Chen L, Nardone J, Rao A. Transcriptional regulation by calcium, calcineurin, and NFAT. Genes Dev 2003; 17:2205-32. [PMID: 12975316 DOI: 10.1101/gad.1102703] [Citation(s) in RCA: 1533] [Impact Index Per Article: 69.7] [Indexed: 12/17/2022]
Affiliation(s)
- Patrick G Hogan
- The Center for Blood Research, Harvard Medical School, Boston, Massachusetts 02115, USA
|
64
|
Kel AE, Gössling E, Reuter I, Cheremushkin E, Kel-Margoulis OV, Wingender E. MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res 2003; 31:3576-9. [PMID: 12824369 PMCID: PMC169193 DOI: 10.1093/nar/gkg585] [Citation(s) in RCA: 807] [Impact Index Per Article: 36.7] [Indexed: 12/11/2022]
Abstract
Match is a weight matrix-based tool for searching putative transcription factor binding sites in DNA sequences. Match is closely interconnected and distributed together with the TRANSFAC database. In particular, Match uses the matrix library collected in TRANSFAC and therefore provides the possibility to search for a great variety of different transcription factor binding sites. Several sets of optimised matrix cut-off values are built in the system to provide a variety of search modes of different stringency. The user may construct and save his/her specific user profiles which are selected subsets of matrices including default or user-defined cut-off values. Furthermore a number of tissue-specific profiles are provided that were compiled by the TRANSFAC team. A public version of the Match tool is available at: http://www.gene-regulation.com/pub/programs.html#match. The same program with a different web interface can be found at http://compel.bionet.nsc.ru/Match/Match.html. An advanced version of the tool called Match Professional is available at http://www.biobase.de.
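The idea of a weight matrix search with a cut-off can be illustrated with a minimal sketch. This is not the Match scoring scheme itself: it assumes a simplified score, normalized as (current - min) / (max - min), and the matrix `EXAMPLE_MATRIX`, the function names and the cut-off value are all hypothetical and do not come from TRANSFAC.

```python
from typing import Dict, List, Tuple

# One dict of per-base weights for each position of the binding site.
Matrix = List[Dict[str, float]]


def normalized_score(matrix: Matrix, window: str) -> float:
    """Score a window as (current - min) / (max - min), giving a value in [0, 1].

    Assumes the matrix is not flat (max > min); a flat matrix would divide by zero.
    """
    current = sum(column[base] for column, base in zip(matrix, window))
    lowest = sum(min(column.values()) for column in matrix)
    highest = sum(max(column.values()) for column in matrix)
    return (current - lowest) / (highest - lowest)


def scan(matrix: Matrix, sequence: str, cutoff: float) -> List[Tuple[int, str, float]]:
    """Slide the matrix along the sequence and keep windows scoring >= cutoff."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = normalized_score(matrix, window)
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


# Hypothetical 4-position count matrix, for illustration only.
EXAMPLE_MATRIX: Matrix = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 1, "C": 1, "G": 7, "T": 1},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]

if __name__ == "__main__":
    for start, window, score in scan(EXAMPLE_MATRIX, "AAGCGTAA", cutoff=0.9):
        print(start, window, round(score, 3))
```

Raising the cut-off trades sensitivity for fewer false positives, which is the role the abstract assigns to its sets of cut-off values of different stringency.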
Affiliation(s)
- A E Kel
- BIOBASE GmbH, Halchtersche Str. 33, D-38304 Wolfenbüttel, Germany.
|
65
|
Eicher DM. IL-2 and IL-15 manifest opposing effects on activation of nuclear factor of activated T cells. Cell Immunol 2003; 223:133-42. [PMID: 14527511 DOI: 10.1016/s0008-8749(03)00168-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
Abstract
IL-2 and IL-15 are cytokines involved in T cell activation and death. Their non-shared receptors, IL-2Ralpha and IL-15Ralpha, are important in the homeostasis of lymphocytes as evidenced by gene deletion studies. How these cytokine/receptor systems affect T cell antigen receptor signaling pathways is poorly understood. Here, we show that the IL-2 and IL-15 cytokine/receptor alpha systems regulate activation of nuclear factor of activated T cells (NF-AT) in opposing ways. IL-15Ralpha increased while IL-2Ralpha decreased basal NF-AT activation status in a Jurkat transient transfection model. The effect of each of the alpha chain receptors on NF-AT activation was further opposed by addition of the respective cytokine. These effects were inhibited by anti-cytokine and anti-cytokine receptor reagents as well as by inhibitors of TCR signaling. These results suggest a novel pathway of cytokine action to regulate T cell signaling, activation, death, and homeostasis.
Affiliation(s)
- Donald M Eicher
- Division of Hematology-Oncology, University Hospitals of Cleveland, Case Western Reserve University School of Medicine, Wearn Building, Room 448, Mailstop: WRN5061, 10900 Euclid Avenue, Cleveland, OH 44106-4937, USA.
|
66
|
Elkon R, Linhart C, Sharan R, Shamir R, Shiloh Y. Genome-wide in silico identification of transcriptional regulators controlling the cell cycle in human cells. Genome Res 2003; 13:773-80. [PMID: 12727897 PMCID: PMC430898 DOI: 10.1101/gr.947203] [Citation(s) in RCA: 233] [Impact Index Per Article: 10.6] [Received: 10/31/2002] [Accepted: 02/25/2003] [Indexed: 11/24/2022]
Abstract
Dissection of regulatory networks that control gene transcription is one of the greatest challenges of functional genomics. Using human genomic sequences, models for binding sites of known transcription factors, and gene expression data, we demonstrate that the reverse engineering approach, which infers regulatory mechanisms from gene expression patterns, can reveal transcriptional networks in human cells. To date, such methodologies were successfully demonstrated only in prokaryotes and low eukaryotes. We developed computational methods for identifying putative binding sites of transcription factors and for evaluating the statistical significance of their prevalence in a given set of promoters. Focusing on transcriptional mechanisms that control cell cycle progression, our computational analyses revealed eight transcription factors whose binding sites are significantly overrepresented in promoters of genes whose expression is cell-cycle-dependent. The enrichment of some of these factors is specific to certain phases of the cell cycle. In addition, several pairs of these transcription factors show a significant co-occurrence rate in cell-cycle-regulated promoters. Each such pair indicates functional cooperation between its members in regulating the transcriptional program associated with cell cycle progression. The methods presented here are general and can be applied to the analysis of transcriptional networks controlling any biological process.
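One common way to test whether binding sites are overrepresented in a promoter set is a hypergeometric tail probability. The sketch below assumes that test for illustration; the abstract does not state which statistic the authors used, and the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_tail(population: int, successes: int, draws: int, observed: int) -> float:
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: all promoters in the background set
    successes:  background promoters that carry the binding site
    draws:      promoters in the set of interest (e.g. cell-cycle-regulated genes)
    observed:   promoters in that set that carry the binding site (must be >= 0)
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favorable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favorable / total


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 background promoters, 500 with the site,
    # 200 promoters in the target set, 30 of which carry the site.
    print(hypergeom_tail(10_000, 500, 200, 30))
```

A small tail probability suggests the site occurs in the target set more often than chance would predict; when many factors are tested, a multiple-testing correction would also be needed.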
Affiliation(s)
- Ran Elkon
- The David and Inez Myers Laboratory for Genetic Research, Department of Human Genetics, Sackler School of Medicine, and School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
|
67
|
Ovcharenko I, Loots GG. Comparative genomic tools for exploring the human genome. COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY 2003; 68:283-91. [PMID: 15338628 DOI: 10.1101/sqb.2003.68.283] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 04/30/2023]
Affiliation(s)
- I Ovcharenko
- EEBI Computing Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
|
68
|
Werner T. Promoter analysis. ERNST SCHERING RESEARCH FOUNDATION WORKSHOP 2002:65-82. [PMID: 12061007 DOI: 10.1007/978-3-662-04747-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- T Werner
- Institute of Biomathematics and Biometry, GSF-Forschungszentrum für Umwelt und Gesundheit, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany.
|
69
|
Sudarsanam P, Pilpel Y, Church GM. Genome-wide co-occurrence of promoter elements reveals a cis-regulatory cassette of rRNA transcription motifs in Saccharomyces cerevisiae. Genome Res 2002; 12:1723-31. [PMID: 12421759 PMCID: PMC187556 DOI: 10.1101/gr.301202] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.0] [Received: 04/17/2002] [Accepted: 09/10/2002] [Indexed: 11/25/2022]
Abstract
Combinatorial regulation is an important feature of eukaryotic transcription. However, only a limited number of studies have characterized this aspect on a whole-genome level. We have conducted a genome-wide computational survey to identify cis-regulatory motif pairs that co-occur in a significantly high number of promoters in the S. cerevisiae genome. A pair of novel motifs, mRRPE and PAC, co-occur most highly in the genome, primarily in the promoters of genes involved in rRNA transcription and processing. The two motifs show significant positional and orientational bias with mRRPE being closer to the ATG than PAC in most promoters. Two additional rRNA-related motifs, mRRSE3 and mRRSE10, also co-occur with mRRPE and PAC. mRRPE and PAC are the primary determinants of expression profiles while mRRSE3 and mRRSE10 modulate these patterns. We describe a new computational approach for studying the functional significance of the physical locations of promoter elements that combine analyses of genome sequence and microarray data. Applying this methodology to the regulatory cassette containing the four rRNA motifs demonstrates that the relative promoter locations of these elements have a profound effect on the expression patterns of the downstream genes. These findings provide a function for these novel motifs and insight into the mechanism by which they regulate gene expression. The methodology introduced here should prove particularly useful for analyzing transcriptional regulation in more complex genomes.
Affiliation(s)
- Priya Sudarsanam
- Department of Genetics and Lipper Center for Computational Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
|
70
|
Baksh S, Widlund HR, Frazer-Abel AA, Du J, Fosmire S, Fisher DE, DeCaprio JA, Modiano JF, Burakoff SJ. NFATc2-mediated repression of cyclin-dependent kinase 4 expression. Mol Cell 2002; 10:1071-81. [PMID: 12453415 DOI: 10.1016/s1097-2765(02)00701-3] [Citation(s) in RCA: 142] [Impact Index Per Article: 6.2] [Indexed: 10/26/2022]
Abstract
The calcineurin-regulated transcription factor, nuclear factor of activated T cells (NFAT), controls many aspects of T cell function. Here, we demonstrate that the calcineurin/NFAT pathway negatively regulates the expression of cyclin-dependent kinase 4 (CDK4). A canonical NFAT binding site was identified and found to be sensitive to calcium signals, FK506/CsA, and histone deacetylase activity and to not require AP-1. Ectopic expression of NFATc2 inhibited the basal activity of the human CDK4 promoter. Additionally, both calcineurin Aalpha(-/-) and NFATc2(-/-) mice had elevated protein levels of CDK4, confirming a negative regulatory role for the calcineurin/NFAT pathway. This pathway may thus regulate the expression of CDK4 at the transcriptional level and control how cells re-enter a resting, nonproliferative state.
Affiliation(s)
- Shairaz Baksh
- Department of Pediatric Oncology, Harvard Medical School, Boston, MA 02115, USA
|
71
|
Kneitz C, Goller M, Tony H, Simon A, Stibbe C, König T, Serfling E, Avots A. The CD23b promoter is a target for NF-AT transcription factors in B-CLL cells. Biochim Biophys Acta 2002; 1588:41-7. [PMID: 12379312 DOI: 10.1016/s0925-4439(02)00114-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 10/27/2022]
Abstract
CD23 is atypically highly expressed in various chronic diseases, including B-CLL, lupus erythematosus and rheumatoid arthritis. Its expression can be further enhanced by interleukin 4 (IL-4). We have shown previously that in B-CLL cells nuclear factor(s) of activated T cells (NF-ATs) show permanent nuclear localization and therefore constitutive transcriptional activity. Here we identify the CD23b promoter as a novel target for NF-AT factors in B-CLL cells. The CD23b promoter contains two NF-AT binding sites to which NF-ATp and NF-ATc factors bind with high affinity. Mutations introduced into these sites abolished NF-AT binding and impaired the promoter activity, as did cyclosporin A (CsA), an inhibitor of nuclear transport of NF-ATs. Furthermore, we show that IL-4-induced transcription factor STAT6 cooperates with NF-ATs in the induction of the CD23b promoter activity. These results show that the CD23b promoter is a target for NF-AT factors and suggest that the cooperation between NF-AT and STAT factors might be one of the molecular mechanisms responsible for high-level expression of CD23 on the surface of B-CLL cells.
Affiliation(s)
- Christian Kneitz
- Medizinische Poliklinik, University of Wuerzburg, Klinikstrasse 6-8, 97070 Wuerzburg, Germany.
|
72
|
Abstract
Transcriptional regulation is mediated by a battery of transcription factor (TF) proteins, that form complexes involving protein-protein and protein-DNA interactions. Individual TFs bind to their cognate cis-elements or transcription factor-binding sites (TFBS). TFBS are organized on the DNA proximal to the gene in groups confined to a few hundred base pair regions. These groups are referred to as modules. Various modules work together to provide the combinatorial regulation of gene transcription in response to various developmental and environmental conditions. The sets of modules constitute a promoter model. Determining the TFs that preferentially work in concert as part of a module is an essential component of understanding transcriptional regulation. The TFs that act synergistically in such a fashion are likely to have their cis-elements co-localized on the genome at specific distances apart. We exploit this notion to predict TF pairs that are likely to be part of a transcriptional module on the human genome sequence. The computational method is validated statistically, using known interacting pairs extracted from the literature. There are 251 TFBS pairs up to 50 bp apart and 70 TFBS pairs up to 200 bp apart that score higher than any of the known synergistic pairs. Further investigation of 50 pairs randomly selected from each of these two sets using PubMed queries provided additional supporting evidence from the existing biological literature suggesting TF synergism for these novel pairs.
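The idea of looking for binding sites that sit within a fixed distance of each other can be sketched as follows. This is an illustrative simplification, not the published scoring scheme: the abstract gives only the distance cut-offs (50 bp and 200 bp), so the sketch merely counts, for each unordered TF pair, the number of promoters in which the two sites lie within `max_distance`. The input layout and the site coordinates are hypothetical.

```python
from collections import Counter
from itertools import combinations


def colocalized_pair_counts(sites_by_promoter, max_distance=50):
    """Count promoters in which two different TFs have sites within max_distance bp.

    sites_by_promoter maps a promoter name to a list of (tf_name, position) tuples.
    Each TF pair is counted at most once per promoter.
    """
    counts = Counter()
    for sites in sites_by_promoter.values():
        pairs_here = set()
        for (tf1, pos1), (tf2, pos2) in combinations(sites, 2):
            if tf1 != tf2 and abs(pos1 - pos2) <= max_distance:
                pairs_here.add(tuple(sorted((tf1, tf2))))
        counts.update(pairs_here)
    return counts


# Toy input: hypothetical predicted site positions in two promoters.
toy_sites = {
    "p1": [("NFAT", 100), ("AP1", 120), ("GATA3", 400)],
    "p2": [("AP1", 10), ("NFAT", 55), ("GATA3", 70)],
}
pair_counts = colocalized_pair_counts(toy_sites, max_distance=50)
```

Turning such counts into a score would require a background model for how often each pair is expected to co-localize by chance, which the abstract does not specify.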
Affiliation(s)
- Sridhar Hannenhalli
- Informatics Research, Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA.
|
73
|
Jegga AG, Sherwood SP, Carman JW, Pinski AT, Phillips JL, Pestian JP, Aronow BJ. Detection and visualization of compositionally similar cis-regulatory element clusters in orthologous and coordinately controlled genes. Genome Res 2002; 12:1408-17. [PMID: 12213778 PMCID: PMC186658 DOI: 10.1101/gr.255002] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.4] [Received: 03/07/2002] [Accepted: 07/18/2002] [Indexed: 02/02/2023]
Abstract
Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions.
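The 200-bp moving window used to define hits can be sketched in a few lines. This is a simplified reading of the abstract, not the TraFaC implementation: it counts shared site positions inside each window rather than computing a moving average, and the function name, the `step` parameter and the positions are assumptions made for illustration.

```python
from bisect import bisect_left


def window_hit_counts(hit_positions, region_length, window=200, step=1):
    """Return (window_start, hit_count) for each window [start, start + window)."""
    positions = sorted(hit_positions)
    last_start = max(region_length - window, 0)
    counts = []
    for start in range(0, last_start + 1, step):
        lo = bisect_left(positions, start)           # first hit at or after start
        hi = bisect_left(positions, start + window)  # first hit at or after window end
        counts.append((start, hi - lo))
    return counts


# Toy input: hypothetical positions of shared cis-element hits in a 600-bp region.
toy_hits = [10, 50, 120, 400, 410]
profile = window_hit_counts(toy_hits, region_length=600, window=200, step=100)
```

Plotting the counts against window start for each of the two aligned regions gives a profile in the spirit of the Regulogram described above.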
Affiliation(s)
- Anil G Jegga
- Divisions of Pediatric Informatics, University of Cincinnati, Cincinnati, Ohio, 45229 USA
|
74
|
Abstract
The human genome sequence is the book of our life. Buried in this large volume are our genes, which are scattered as small DNA fragments throughout the genome and comprise a small percentage of the total text. Finding these indistinct 'needles' in a vast genomic 'haystack' can be extremely challenging. In response to this challenge, computational prediction approaches have proliferated in recent years that predict the location and structure of genes. Here, I discuss these approaches and explain why they have become essential for the analyses of newly sequenced genomes.
Affiliation(s)
- Michael Q Zhang
- Watson School of Biological Sciences, Cold Spring Harbor Laboratory, 1 Bungtown Road, PO Box 100, Cold Spring Harbor, New York 11724, USA.
|
75
|
Walden M, Kreutzmann P, Drögemüller K, John H, Forssmann WG, Mägert HJ. Biochemical features, molecular biology and clinical relevance of the human 15-domain serine proteinase inhibitor LEKTI. Biol Chem 2002; 383:1139-41. [PMID: 12437098 DOI: 10.1515/bc.2002.124] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Abstract
Based on the isolation of a 55 amino acid peptide from human hemofiltrate, we cloned the cDNA for a novel human 15-domain serine proteinase inhibitor termed LEKTI. A trypsin-inhibiting activity was demonstrated for three different domains. High levels of expression of the corresponding gene were detected in oral mucosa, followed by the tonsils, parathyroid glands, thymus, and trachea. Hovnanian and coworkers recently found that certain mutations within the LEKTI gene are linked to the severe congenital disease Netherton syndrome and atopic manifestations (including asthma). Thus, a future therapeutic use of LEKTI is conceivable.
|
76
|
Chuvpilo S, Jankevics E, Tyrsin D, Akimzhanov A, Moroz D, Jha MK, Schulze-Luehrmann J, Santner-Nanan B, Feoktistova E, König T, Avots A, Schmitt E, Berberich-Siebelt F, Schimpl A, Serfling E. Autoregulation of NFATc1/A expression facilitates effector T cells to escape from rapid apoptosis. Immunity 2002; 16:881-95. [PMID: 12121669 DOI: 10.1016/s1074-7613(02)00329-1] [Citation(s) in RCA: 162] [Impact Index Per Article: 7.0] [Indexed: 11/26/2022]
Abstract
Threshold levels of individual NFAT factors appear to be critical for apoptosis induction in effector T cells. In these cells, the short isoform A of NFATc1 is induced to high levels due to the autoregulation of the NFATc1 promoter P1 by NFATs. P1 is located within a CpG island in front of exon 1, represents a DNase I hypersensitive chromatin site, and harbors several sites for binding of inducible transcription factors, including a tandemly arranged NFAT site. A second promoter, P2, before exon 2, is not controlled by NFATs and directs synthesis of the longer NFATc1/B+C isoforms. Contrary to other NFATs, NFATc1/A is unable to promote apoptosis, suggesting that NFATc1/A enhances effector functions without promoting apoptosis of effector T cells.
Affiliation(s)
- Sergei Chuvpilo
- Department of Molecular Pathology, Institute of Pathology, University of Wuerzburg, D97080 Wuerzburg, Germany
|
77
|
Banerjee N, Zhang MQ. Functional genomics as applied to mapping transcription regulatory networks. Curr Opin Microbiol 2002; 5:313-7. [PMID: 12057687 DOI: 10.1016/s1369-5274(02)00322-3] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.7] [Indexed: 12/18/2022]
Abstract
The sequencing of the human genome and the entire genomes of many model organisms has resulted in the identification of many genes. Many large-scale experiments for generating gene disruptions and analyzing the phenotypes are underway to ascertain gene function. A future challenge will be to determine interaction and regulation of all the genes of an organism. Recent advances in functional genomic technology have begun to shine light on such gene network problems at both transcriptomic and proteomic levels. Functional genomics will not only elucidate what the genes do, but will also help determine when, where and how they are expressed as an orchestrated system. In this review, we discuss the functional genomics approaches to extract knowledge about transcription regulatory mechanisms from combinations of sequence data, microarray data and ChIP data. We focus in particular on the budding yeast Saccharomyces cerevisiae.
Affiliation(s)
- Nila Banerjee
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA
|
78
|
Loots GG, Ovcharenko I, Pachter L, Dubchak I, Rubin EM. rVista for comparative sequence-based discovery of functional transcription factor binding sites. Genome Res 2002; 12:832-9. [PMID: 11997350 PMCID: PMC186580 DOI: 10.1101/gr.225502] [Citation(s) in RCA: 273] [Impact Index Per Article: 11.9] [Indexed: 11/24/2022]
Abstract
Identifying transcriptional regulatory elements represents a significant challenge in annotating the genomes of higher vertebrates. We have developed a computational tool, rVista, for high-throughput discovery of cis-regulatory elements that combines clustering of predicted transcription factor binding sites (TFBSs) and the analysis of interspecies sequence conservation to maximize the identification of functional sites. To assess the ability of rVista to discover true positive TFBSs while minimizing the prediction of false positives, we analyzed the distribution of several TFBSs across 1 Mb of the well-annotated cytokine gene cluster (Hs5q31; Mm11). Because a large number of AP-1, NFAT, and GATA-3 sites have been experimentally identified in this interval, we focused our analysis on the distribution of all binding sites specific for these transcription factors. The exploitation of the orthologous human-mouse dataset resulted in the elimination of > 95% of the approximately 58,000 binding sites predicted on analysis of the human sequence alone, whereas it identified 88% of the experimentally verified binding sites in this region.
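The filtering step reported here (eliminating more than 95% of roughly 58,000 predicted sites, i.e. leaving fewer than about 2,900) can be pictured with a minimal sketch. It is not rVista's algorithm: the abstract does not give the conservation criteria, so the sketch simply keeps predicted sites that fall entirely inside conserved intervals supplied by the caller. All names and coordinates are hypothetical.

```python
def filter_conserved(sites, conserved_regions):
    """Keep predicted sites lying entirely within at least one conserved interval.

    sites: list of (tf_name, start, end) tuples with inclusive coordinates.
    conserved_regions: list of (start, end) tuples with inclusive coordinates.
    """
    kept = []
    for name, start, end in sites:
        if any(c_start <= start and end <= c_end for c_start, c_end in conserved_regions):
            kept.append((name, start, end))
    return kept


# Toy input: hypothetical predicted sites and hypothetical conserved blocks.
toy_predictions = [("NFAT", 120, 130), ("AP-1", 300, 307), ("GATA-3", 950, 956)]
toy_conserved = [(100, 200), (900, 1000)]
retained = filter_conserved(toy_predictions, toy_conserved)
fraction_removed = 1 - len(retained) / len(toy_predictions)
```

In practice the conserved intervals would come from a pairwise human-mouse alignment, and the site itself may also be required to align between the two species; those details are outside what the abstract states.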
Affiliation(s)
- Gabriela G Loots
- Genome Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
|
79
|
Abstract
The TGF-beta superfamily is an important class of intercellular signalling molecule, including TGF-beta and bone morphogenetic proteins. Intracellular signalling cascades triggered by these molecules eventually activate transcription factors of the Smad family, which then regulate expression of their respective target genes. This article will discuss the TGF-beta--Smad signalling networks and how these processes are represented in databases of signal transduction and transcription control mechanisms. These databases can provide a well-structured overview of the subject and a basis for advanced bioinformatics analyses to interpret the function of genomic sequences or to analyse signalling networks.
Affiliation(s)
- Dorothee U Kloos
- BIOBASE GmbH, Biological Databases, Halchtersche Strasse 33, D-38304 Wolfenbüttel, Germany.
|
80
|
Kel-Margoulis OV, Kel AE, Reuter I, Deineko IV, Wingender E. TRANSCompel: a database on composite regulatory elements in eukaryotic genes. Nucleic Acids Res 2002; 30:332-4. [PMID: 11752329 PMCID: PMC99108 DOI: 10.1093/nar/30.1.332] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022]
Abstract
Originating from COMPEL, the TRANSCompel database emphasizes the key role of specific interactions between transcription factors binding to their target sites, which provide specific features of gene regulation in a particular cellular context. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor-DNA and factor-factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and the specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. Each database entry corresponds to an individual CE within a particular gene and contains information about two binding sites, two corresponding transcription factors and experiments confirming cooperative action between transcription factors. The COMPEL database, equipped with the search and browse tools, is available at http://www.gene-regulation.com/pub/databases.html#transcompel. Moreover, we have developed the program CATCH for searching potential CEs in DNA sequences. It is freely available as CompelPatternSearch at http://compel.bionet.nsc.ru/FunSite/CompelPatternSearch.html.
|
81
|
Abstract
The controlled expression of cytokine genes is an essential component of an immune response. The specific types of cytokines as well as the time and place of their production is important in generating an appropriate immune response to an infectious agent. Aberrant expression is associated with pathological conditions of the immune system such as autoimmunity, atopy and chronic inflammation. Cytokine gene transcription is generally induced in a cell-specific manner. Over the last 15 years, a large amount of information has been generated describing the transcriptional controls that are exerted on cytokine genes. Recently, efforts have been directed at understanding how these genes are transcribed in a chromatin context. This review will discuss the mechanisms by which cytokine genes become available for transcription in a cell-restricted manner as well as the mechanisms by which these genes sense their environment and activate high level transcription in a transient manner. Particular attention will be paid to the role of chromatin in allowing transcription factor access to appropriate genes.
Affiliation(s)
- A F Holloway
- Division of Biochemistry and Molecular Biology, John Curtin School of Medical Research, Australian National University, Canberra ACT 2601, Australia
|
82
|
Scott ES, Malcomber S, O'Hare P. Nuclear translocation and activation of the transcription factor NFAT is blocked by herpes simplex virus infection. J Virol 2001; 75:9955-65. [PMID: 11559828 PMCID: PMC114567 DOI: 10.1128/jvi.75.20.9955-9965.2001] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]
Abstract
Transcription factors of the NFAT (nuclear factor of activated T cells) family are expressed in most immune system cells and in a range of other cell types. Signaling through NFAT is implicated in the regulation of transcription for the immune response and other processes, including differentiation and apoptosis. NFAT normally resides in the cytoplasm, and a key aspect of the NFAT activation pathway is the regulation of its nuclear import by the Ca(2+)/calmodulin-dependent phosphatase calcineurin. In a cell line stably expressing green fluorescent protein (GFP)-NFAT, this import can be triggered by elevation of intracellular calcium and visualized in live cells. Here we show that the inducible nuclear import of GFP-NFAT is efficiently blocked at early stages of herpes simplex virus (HSV) infection. This is a specific effect, since we observed abundant nuclear accumulation of a test viral protein and no impediment to general nuclear localization signal-dependent nuclear import and retention in infected cells. We show that virus binding at the cell surface is not itself sufficient to inhibit the signaling that induces NFAT nuclear translocation. Since the block occurs following infection in the presence of phosphonoacetic acid but not cycloheximide, we infer that the entry of the virion and early gene transcription are required but the effect is independent of DNA replication or late virus gene expression. A consequence of the block to GFP-NFAT import is a reduction in NFAT-dependent transcriptional activation from the interleukin-2 promoter in infected cells. This HSV-mediated repression of the NFAT pathway may constitute an immune evasion strategy or subversion of other NFAT-dependent cellular processes to promote viral replication.
Affiliation(s)
- E S Scott
- Marie Curie Research Institute, Oxted, Surrey RH8 0TL, United Kingdom
|
83
|
Anusaksathien O, Laplace C, Li X, Ren Y, Peng L, Goldring SR, Galson DL. Tissue-specific and ubiquitous promoters direct the expression of alternatively spliced transcripts from the calcitonin receptor gene. J Biol Chem 2001; 276:22663-74. [PMID: 11309373 DOI: 10.1074/jbc.m007104200] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Abstract
The gene encoding the murine calcitonin receptor (mCTR) was isolated, and the exon/intron structure was determined. Analysis of transcripts revealed novel cDNA sequences, new alternative exon splicing in the 5'-untranslated region, and three putative promoters (P1, P2, and P3). The longest transcription unit is greater than 67 kilobase pairs, and the location of introns within the coding region of the mCTR gene (exons E3-E14) are identical to those of the porcine and human CTR genes. We have identified novel cDNA sequences that form three new exons as well as others that add 512 base pairs to the 5' side of the previously published cDNA, thereby extending exon E1 to 682 base pairs. Two of these novel exons are upstream of exon E2 and form a tripartite exon E2 (E2a, E2b, and E2c) in which E2a is utilized by promoter P2 with variable splicing of E2b. The third new exon (E3b') lies between E3a and E3b and is utilized by promoter P3. Analysis of mCTR mRNAs has revealed that the three alternative promoters give rise to at least seven mCTR isoforms in the 5' region of the gene and generate 5'-untranslated regions of very different lengths. Analysis by reverse transcription-polymerase chain reaction shows that promoters P1 and P2 are utilized in osteoclasts, brain, and kidney, whereas promoter P3 appears to be osteoclast-specific. Using transiently transfected reporter constructs, promoter P2 has activity in both a murine kidney cell line (MDCT209) and a chicken osteoclast-like cell line (HD-11EM), whereas promoter P3 is active only in the osteoclast-like cell line. These transfection data confirm the osteoclast specificity of promoter P3 and provide the first evidence that the CTR gene is regulated in a tissue-specific manner by alternative promoter utilization.
Affiliation(s)
- O Anusaksathien
- New England Baptist Bone and Joint Institute, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02115, USA
|
84
|
Abstract
The analysis of regulatory sequences is greatly facilitated by database-assisted bioinformatic approaches. The TRANSFAC database contains information on transcription factors and their origins, functional properties and sequence-specific binding activities. Software tools enable us to screen the database with a given DNA sequence for interacting transcription factors. If a regulatory function is already attributed to this sequence then the database-assisted identification of binding sites for proteins or protein classes and subsequent experimental verification might establish functionally relevant sites within this sequence. The binding transcription factors and interacting factors might already be present in the database.
Affiliation(s)
- R Hehl
- Institut für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, D-38106 Braunschweig, Germany.
|
85
|
Kel AE, Kel-Margoulis OV, Farnham PJ, Bartley SM, Wingender E, Zhang MQ. Computer-assisted identification of cell cycle-related genes: new targets for E2F transcription factors. J Mol Biol 2001; 309:99-120. [PMID: 11491305 DOI: 10.1006/jmbi.2001.4650] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.5] [Indexed: 11/22/2022]
Abstract
The processes that take place during development and differentiation are directed through coordinated regulation of expression of a large number of genes. One such gene regulatory network provides cell cycle control in eukaryotic organisms. In this work, we have studied the structural features of the 5' regulatory regions of cell cycle-related genes. We developed a new method for identifying composite substructures (modules) in regulatory regions of genes consisting of a binding site for a key transcription factor and additional contextual motifs: potential targets for other transcription factors that may synergistically regulate gene transcription. Applying this method to cell cycle-related promoters, we created a program for context-specific identification of binding sites for transcription factors of the E2F family which are key regulators of the cell cycle. We found that E2F composite modules are found at a high frequency and in close proximity to the start of transcription in cell cycle-related promoters in comparison with other promoters. Using this information, we then searched for E2F sites in genomic sequences with the goal of identifying new genes which play important roles in controlling cell proliferation, differentiation and apoptosis. Using a chromatin immunoprecipitation assay, we then experimentally verified the binding of E2F in vivo to the promoters predicted by the computer-assisted methods. Our identification of new E2F target genes provides new insight into gene regulatory networks and provides a framework for continued analysis of the role of contextual promoter features in transcriptional regulation. The tools described are available at http://compel.bionet.nsc.ru/FunSite/SiteScan.html.
Affiliation(s)
- A E Kel
- Institute of Cytology and Genetics, Novosibirsk, Russia.
|
86
|
Werner T. Target gene identification from expression array data by promoter analysis. Biomol Eng 2001; 17:87-94. [PMID: 11222983 DOI: 10.1016/s1389-0344(00)00071-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
DNA microchips and expression arrays yield enormous amounts of data linking cDNA sequences to gene expression patterns. This now allows the characterization of gene expression in normal and diseased tissues as well as the response of tissues to the application of therapeutic reagents. Software currently exists to analyze DNA array/chip data with respect to corresponding mRNA sequences, which facilitates the precise determination of when and where certain groups of genes are expressed. The information concerning transcriptional regulatory networks responsible for the observed expression patterns is not contained within the cDNA sequences used to generate the arrays, but resides often within the promoter sequences of the individual genes (and/or enhancers). The complete sequence of the human genome will provide the molecular basis for the identification of such regulatory regions. Promoter sequences for specific cDNAs can be obtained reliably from genomic sequences simply by exon mapping. Promoter prediction tools can also be used to locate promoters directly in the genomic sequence in many cases in which cDNAs are 5'-incomplete. Once sufficient numbers of promoter sequences have been obtained, the comparative promoter analysis of the co-regulated genes and groups of genes can be applied in order to generate models describing the higher order levels of the transcription factor binding site organization within these promoter regions. As evident from several examples, this approach can identify promoter modules responsible for the common regulation of promoters solely by the application of bioinformatics methods. Such modules represent the molecular mechanisms through which regulatory networks influence gene expression. Another advantage of this approach is that it also provides a powerful alternative for elucidating functional features of genes with no detectable sequence similarity, by linking them to other genes on the basis of their common promoter structures.
Affiliation(s)
- T Werner
- Genomatix Software GmbH, Karlstrasse 55, 80333, Munich, Germany.
|
87
|
Scherf M, Klingenhoff A, Frech K, Quandt K, Schneider R, Grote K, Frisch M, Gailus-Durner V, Seidel A, Brack-Werner R, Werner T. First pass annotation of promoters on human chromosome 22. Genome Res 2001; 11:333-40. [PMID: 11230158 PMCID: PMC311038 DOI: 10.1101/gr.154601] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 11/24/2022]
Abstract
The publication of the first almost complete sequence of a human chromosome (chromosome 22) is a major milestone in human genomics. Together with the sequence, an excellent annotation of genes was published which certainly will serve as an information resource for numerous future projects. We noted that the annotation did not cover regulatory regions; in particular, no promoter annotation has been provided. Here we present an analysis of the complete published chromosome 22 sequence for promoters. A recent breakthrough in specific in silico prediction of promoter regions enabled us to attempt large-scale prediction of promoter regions on chromosome 22. Scanning of sequence databases revealed only 20 experimentally verified promoters, of which 10 were correctly predicted by our approach. Nearly 40% of our 465 predicted promoter regions are supported by the currently available gene annotation. Promoter finding also provides a biologically meaningful method for "chromosomal scaffolding", by which long genomic sequences can be divided into segments starting with a gene. As one example, the combination of promoter region prediction with exon/intron structure predictions greatly enhances the specificity of de novo gene finding. The present study demonstrates that it is possible to identify promoters in silico on the chromosomal level with sufficient reliability for experimental planning and indicates that a wealth of information about regulatory regions can be extracted from current large-scale (megabase) sequencing projects. Results are available on-line at http://genomatix.gsf.de/chr22/.
Affiliation(s)
- M Scherf
- Institute of Mammalian Genetics, GSF-National Research Center for Environment and Health, Neuherberg, Germany.
|
88
|
Werner T. Cluster analysis and promoter modelling as bioinformatics tools for the identification of target genes from expression array data. Pharmacogenomics 2001; 2:25-36. [PMID: 11258194 DOI: 10.1517/14622416.2.1.25] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
Abstract
Expression arrays yield enormous amounts of data linking genes, via their cDNA sequences, to gene expression patterns. This now allows the characterisation of gene expression in normal and diseased tissues, as well as the response of tissues to the application of therapeutic reagents. Expression array data can be analysed with respect to the underlying protein sequences, which facilitates the precise determination of when and where certain groups of genes are expressed. More recent developments of clustering algorithms take additional parameters of the experimental set-up into account, focusing more directly on co-regulated sets of genes. However, the information concerning transcriptional regulatory networks responsible for the observed expression patterns is not contained within the cDNA sequences used to generate the arrays. Regulation of expression is determined to a large extent by the promoter sequences of the individual genes (and/or enhancers). The complete sequence of the human genome now provides the molecular basis for the identification of many regulatory regions. Promoter sequences for specific cDNAs can be obtained reliably from genomic sequences by exon mapping. In the many cases in which cDNAs are 5'-incomplete, high quality promoter prediction tools can be used to locate promoters directly in the genomic sequence. Once sufficient numbers of promoter sequences have been obtained, a comparative promoter analysis of the co-regulated genes and groups of genes can be applied in order to generate models describing the higher order levels of transcription factor binding site organisation within these promoter regions. Such modules represent the molecular mechanisms through which regulatory networks influence gene expression, and candidates can be determined solely by bioinformatics. This approach also provides a powerful alternative for elucidating the functional features of genes with no detectable sequence similarity, by linking them to other genes on the basis of their common promoter structures.
Affiliation(s)
- T Werner
- Genomatix Software GmbH, Karlstrasse 55, D-80333 Munich, Germany.
|
89
|
Serfling E, Berberich-Siebelt F, Chuvpilo S, Jankevics E, Klein-Hessling S, Twardzik T, Avots A. The role of NF-AT transcription factors in T cell activation and differentiation. Biochimica et Biophysica Acta 2000; 1498:1-18. [PMID: 11042346 DOI: 10.1016/s0167-4889(00)00082-3] [Citation(s) in RCA: 158] [Impact Index Per Article: 6.3] [Indexed: 10/17/2022]
Abstract
The family of genuine NF-AT transcription factors consists of four members (NF-AT1 [or NF-ATp], NF-AT2 [or NF-ATc], NF-AT3 and NF-AT4 [or NF-ATx]), which are characterized by a highly conserved DNA binding domain (designated the Rel similarity domain) and a calcineurin binding domain. The binding of the Ca²⁺-dependent phosphatase calcineurin to this region controls the nuclear import and exit of NF-ATs. This review deals with (1) the structure of NF-AT proteins, (2) the DNA binding of NF-AT factors and their interaction with AP-1, (3) NF-AT target genes, (4) signalling pathways leading to NF-AT activation: the role of protein kinases and calcineurin, (5) the nuclear entry and exit of NF-AT factors, (6) transcriptional transactivation by NF-AT factors, (7) the structure and expression of the chromosomal NF-AT2 gene, and (8) NF-AT factors in Th cell differentiation. The experimental data presented and discussed in the review show that NF-AT factors are major players in the control of T cell activation and differentiation and, in all likelihood, also of the cell cycle and apoptosis of T lymphocytes.
Affiliation(s)
- E Serfling
- Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Josef-Schneider-Str. 2, D-97080 Würzburg, Germany.
|
90
|
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V, Meinhardt T, Prüss M, Reuter I, Schacherer F. TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 2000; 28:316-9. [PMID: 10592259 PMCID: PMC102445 DOI: 10.1093/nar/28.1.316] [Citation(s) in RCA: 887] [Impact Index Per Article: 35.5] [Received: 09/30/1999] [Accepted: 10/07/1999] [Indexed: 11/13/2022]
Abstract
TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles (http://transfac.gbf.de/TRANSFAC/). Its content has been enhanced, in particular by information about training sequences used for the construction of nucleotide matrices as well as by data on plant sites and factors. Moreover, TRANSFAC has been extended by two new modules: PathoDB provides data on pathologically relevant mutations in regulatory regions and transcription factor genes, whereas S/MARt DB compiles features of scaffold/matrix attached regions (S/MARs) and the proteins binding to them. Additionally, the databases TRANSPATH, about signal transduction, and CYTOMER, about organs and cell types, have been extended and are increasingly integrated with the TRANSFAC data sources.
Affiliation(s)
- E Wingender
- Gesellschaft für Biotechnologische Forschung mbH, Mascheroder Weg 1, D-38124 Braunschweig, Germany.
|
91
|
Kel-Margoulis OV, Romashchenko AG, Kolchanov NA, Wingender E, Kel AE. COMPEL: a database on composite regulatory elements providing combinatorial transcriptional regulation. Nucleic Acids Res 2000; 28:311-5. [PMID: 10592258 PMCID: PMC102399 DOI: 10.1093/nar/28.1.311] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.4] [Received: 09/08/1999] [Accepted: 09/17/1999] [Indexed: 11/14/2022]
Abstract
COMPEL is a database on composite regulatory elements, the basic structures of combinatorial regulation. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor-DNA and factor-factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. The structure of the relational model of COMPEL is determined by the concept of molecular structure and regulatory role of CEs. Based on the set of a particular CE, a program has been developed for searching potential CEs in gene regulatory regions. WWW search and browse routines were developed for COMPEL release 3.0. The COMPEL database equipped with the search and browse tools is available at http://compel.bionet.nsc.ru/. The program for prediction of potential CEs of NFAT type is available at http://compel.bionet.nsc.ru/FunSite.html and http://transfac.gbf.de/dbsearch/funsitep/s_comp.html
Affiliation(s)
- O V Kel-Margoulis
- Institute of Cytology, SB RAN, 10 Lavrentyev pr., 630090, Novosibirsk, Russia.
|
92
|
Chuvpilo S, Avots A, Berberich-Siebelt F, Glöckner J, Fischer C, Kerstan A, Escher C, Inashkina I, Hlubek F, Jankevics E, Brabletz T, Serfling E. Multiple NF-ATc Isoforms with Individual Transcriptional Properties Are Synthesized in T Lymphocytes. The Journal of Immunology 1999. [DOI: 10.4049/jimmunol.162.12.7294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/31/2022]
Abstract
The transcription factor NF-ATc that controls gene expression in T lymphocytes and embryonic cardiac cells is expressed in three prominent isoforms. This is due to alternative splice/polyadenylation events that lead to the predominant synthesis of two long isoforms in naive T cells and a shorter NF-ATc isoform in effector T cells. Whereas the previously described isoform NF-ATc/A contains a relatively short C terminus, the longer isoforms, B and C, span extra C-terminal peptides of 128 and 246 aa, respectively. We show here that in addition to the strong N-terminal trans-activation domain, TAD-A, which is common to all three NF-ATc isoforms, NF-ATc/C contains a second trans-activation domain, TAD-B, in its C-terminal peptide. Various stimuli of T cells that induce the activity of TAD-A also enhance the activity of TAD-B, but, unlike TAD-A, TAD-B remains unphosphorylated by protein from 12-O-tetradecanoyl 12-phorbol 13-acetate-stimulated T cells. The shorter C-terminal peptide of isoform NF-ATc/B exerts a suppressive transcriptional effect. These properties of NF-ATc/B and -C might be of importance for gene regulation in naive T lymphocytes in which NF-ATc/B and -C are predominantly synthesized.
Affiliation(s)
- Sergei Chuvpilo
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Andris Avots
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Judith Glöckner
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Christian Fischer
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Andreas Kerstan
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Cornelia Escher
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- Inna Inashkina
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
- ‡Biomedical Research and Study Center, University of Latvia, Riga, Latvia
- Falk Hlubek
- †Institute of Pathology, University of Erlangen-Nürnberg, Erlangen, Germany
- Eriks Jankevics
- ‡Biomedical Research and Study Center, University of Latvia, Riga, Latvia
- Thomas Brabletz
- †Institute of Pathology, University of Erlangen-Nürnberg, Erlangen, Germany
- Edgar Serfling
- *Department of Molecular Pathology, Institute of Pathology, University of Würzburg, Würzburg, Germany
|