1
Harel S, Sanchez V, Moamer A, Sanchez-Galan JE, Abid Hussein MN, Mayaki D, Blanchette M, Hussain SNA. ETS1, ELK1, and ETV4 Transcription Factors Regulate Angiopoietin-1 Signaling and the Angiogenic Response in Endothelial Cells. Front Physiol 2021; 12:683651. [PMID: 34381375 PMCID: PMC8350579 DOI: 10.3389/fphys.2021.683651] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/21/2021] [Accepted: 07/05/2021] [Indexed: 12/03/2022]
Abstract
Background: Angiopoietin-1 (Ang-1) is the main ligand of Tie-2 receptors. It promotes endothelial cell (EC) survival, migration, and differentiation. Little is known about the transcription factors (TFs) in ECs that are downstream from Tie-2 receptors.
Objective: The main objective of this study is to identify the roles of the ETS family of TFs in Ang-1 signaling and the angiogenic response.
Methods: In silico enrichment analyses designed to predict TF binding sites in the promoters of eighty-six Ang-1-upregulated genes showed significant enrichment of ETS1, ELK1, and ETV4 binding sites in ECs. Human umbilical vein endothelial cells (HUVECs) were exposed to recombinant Ang-1 protein for different time periods; mRNA levels of ETS1, ELK1, and ETV4 were measured with qPCR, and intracellular localization of these TFs was assessed with immunofluorescence. Electrophoretic mobility shift assays and reporter assays were used to assess activation of ETS1, ELK1, and ETV4 in response to Ang-1 exposure. The functional roles of these TFs in Ang-1-induced EC survival, migration, differentiation, and gene regulation were evaluated with a loss-of-function approach (transfection with siRNA oligos).
Results: Ang-1 exposure increased ETS1 mRNA levels but had no effect on ELK1 or ETV4 levels. Immunostaining revealed that in control ECs, ETS1 has nuclear localization whereas ELK1 and ETV4 are localized to the nucleus and the cytosol. Ang-1 exposure increased nuclear intensity of ETS1 protein and enhanced nuclear mobilization of ELK1 and ETV4. Selective siRNA knockdown of ETS1, ELK1, and ETV4 showed that these TFs are required for Ang-1-induced EC survival and differentiation, while ETS1 and ETV4 are required for Ang-1-induced EC migration. Moreover, ETS1, ELK1, and ETV4 knockdown inhibited Ang-1-induced upregulation of thirteen, eight, and nine pro-angiogenesis genes, respectively.
Conclusion: We conclude that ETS1, ELK1, and ETV4 transcription factors play significant angiogenic roles in Ang-1 signaling in ECs.
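The abstract above does not say which statistic the in silico enrichment analysis used. As a minimal sketch of one common choice, a one-sided hypergeometric test for over-representation of a TF binding site in a gene set could look like the following. Only the gene-set size of 86 comes from the abstract; the function name `hypergeom_sf` and all other numbers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable.

    population: number of promoters in the background set
    successes:  background promoters that contain the binding site
    draws:      size of the gene set being tested
    k:          gene-set promoters that contain the binding site
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background promoters, 2,000 of them with a
# predicted site, and 25 of the 86 upregulated genes carrying the site.
p_value = hypergeom_sf(25, 20000, 2000, 86)
print(f"enrichment p-value: {p_value:.3g}")
```

With these assumed counts, about 8.6 hits would be expected by chance, so 25 observed hits gives a very small p-value. A real analysis would also correct for testing many TFs at once.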
Affiliation(s)
- Sharon Harel
- Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Critical Care, McGill University Health Centre, Montreal, QC, Canada; Meakins-Christie Laboratories, Department of Medicine, McGill University, Montreal, QC, Canada
- Veronica Sanchez
- Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Critical Care, McGill University Health Centre, Montreal, QC, Canada; Meakins-Christie Laboratories, Department of Medicine, McGill University, Montreal, QC, Canada
- Alaa Moamer
- Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Critical Care, McGill University Health Centre, Montreal, QC, Canada; Meakins-Christie Laboratories, Department of Medicine, McGill University, Montreal, QC, Canada
- Javier E Sanchez-Galan
- School of Computer Science, McGill Centre for Bioinformatics, McGill University, Montreal, QC, Canada
- Mohammad N Abid Hussein
- School of Engineering and Technology (SET), Aldar University College, Dubai, United Arab Emirates
- Dominique Mayaki
- Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Critical Care, McGill University Health Centre, Montreal, QC, Canada; Meakins-Christie Laboratories, Department of Medicine, McGill University, Montreal, QC, Canada
- Mathieu Blanchette
- School of Computer Science, McGill Centre for Bioinformatics, McGill University, Montreal, QC, Canada
- Sabah N A Hussain
- Translational Research in Respiratory Diseases Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Critical Care, McGill University Health Centre, Montreal, QC, Canada; Meakins-Christie Laboratories, Department of Medicine, McGill University, Montreal, QC, Canada
2
Stella R, Bonadio RS, Cagnin S, Massimino ML, Bertoli A, Peggion C. Perturbations of the Proteome and of Secreted Metabolites in Primary Astrocytes from the hSOD1(G93A) ALS Mouse Model. Int J Mol Sci 2021; 22:ijms22137028. [PMID: 34209958 PMCID: PMC8268687 DOI: 10.3390/ijms22137028] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/16/2021] [Revised: 06/18/2021] [Accepted: 06/21/2021] [Indexed: 01/16/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease whose pathophysiology is largely unknown. Although motor neuron (MN) death is recognized as the key event in ALS, astrocyte dysfunction and neuroinflammation have been shown to accompany and probably even drive MN loss. Nevertheless, the mechanisms priming astrocyte failure and hyperactivation are still obscure. In this work, altered pathways and molecules in ALS astrocytes were identified by investigating the proteomic profile and the secreted metabolome of primary spinal cord astrocytes derived from a transgenic ALS mouse model overexpressing the human (h)SOD1(G93A) protein, in comparison with the transgenic counterpart expressing hSOD1(WT) protein. Here we show that ALS primary astrocytes are depleted of proteins, and of secreted metabolites, involved in glutathione metabolism and signaling. The observed increased activation of the NF-κB, Ebf1, and Plag1 transcription factors may account for the augmented expression of proteins involved in the proteolytic routes mediated by the proteasome or endosome-lysosome systems. Moreover, hSOD1(G93A) primary astrocytes also display altered lipid metabolism. Our results provide novel insights into the altered molecular pathways that may underlie astrocyte dysfunction and altered astrocyte-MN crosstalk in ALS, and that represent potential therapeutic targets to abrogate or slow down MN demise in disease pathogenesis.
Affiliation(s)
- Roberto Stella
- Department of Chemistry, Istituto Zooprofilattico Sperimentale delle Venezie, 35020 Legnaro, Italy
- Raphael Severino Bonadio
- Department of Biology and CRIBI Biotechnology Center, University of Padova, 35131 Padova, Italy
- Stefano Cagnin
- Department of Biology and CRIBI Biotechnology Center, University of Padova, 35131 Padova, Italy
- CIR-Myo Myology Center, University of Padova, 35131 Padova, Italy
- Alessandro Bertoli
- CNR-Neuroscience Institute, 35131 Padova, Italy
- Padova Neuroscience Center, University of Padova, 35131 Padova, Italy
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Correspondence: A.B.; C.P.
- Caterina Peggion
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy
- Correspondence: A.B.; C.P.
3
Verardo LL, Lopes MS, Wijga S, Madsen O, Silva FF, Groenen MAM, Knol EF, Lopes PS, Guimarães SEF. After genome-wide association studies: Gene networks elucidating candidate genes divergences for number of teats across two pig populations. J Anim Sci 2017; 94:1446-58. [PMID: 27136004 DOI: 10.2527/jas.2015-9917] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
Number of teats (NT) is an important trait affecting both piglet welfare and the production level of pig farms. Biologically, embryonic mammary gland development requires the coordination of many signaling pathways necessary for the proper development of teats. Several QTL for NT have been identified; however, further analysis is still lacking. Therefore, gene networks derived from genome-wide association study (GWAS) results can be used to examine shared pathways and functions of putative candidate genes. Such analyses may also help to understand the genetic diversity between populations for the same trait or traits. In this study, we identified significant SNP for Landrace-based (line C) and Large White-based (line D) dam lines. In addition, gene-transcription factor (TF) networks were constructed to obtain the most likely candidate genes for NT in each line, followed by a comparative analysis between the two lines to assess similarities or dissimilarities at the marker and gene level. We identified 24 and 19 significant SNP (Bayes factor ≥ 100) for lines C and D, respectively. Only 1 significant SNP overlapped both lines. Network analysis illustrated gene interactions consistent with known mammalian breast biology and captured known TF. We observed different sets of putative candidate genes for NT in each line evaluated that may have common effects on the phenotype. Based on these results, we demonstrated the importance of post-GWAS analyses in increasing the biological understanding of relevant genes for a complex trait. Moreover, we believe that this genomic diversity across lines should be taken into account by considering breed-specific reference populations for genomic selection.
4
Verardo LL, Sevón-Aimonen ML, Serenius T, Hietakangas V, Uimari P. Whole-genome association analysis of pork meat pH revealed three significant regions and several potential genes in Finnish Yorkshire pigs. BMC Genet 2017; 18:13. [PMID: 28193157 PMCID: PMC5307873 DOI: 10.1186/s12863-017-0482-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 11/23/2016] [Accepted: 02/07/2017] [Indexed: 12/11/2022]
Abstract
Background: One of the most commonly used quality measurements of pork is pH measured 24 h after slaughter. The most probable mode of inheritance for this trait is oligogenic with several known major genes, such as PRKAG3. In this study, we used whole-genome SNP genotypes of over 700 AI boars; after a quality check, 42,385 SNPs remained for association analysis. All the boars were purebred Finnish Yorkshire. To account for relatedness of the animals, a pedigree-based relationship matrix was used in a mixed linear model to test the effect of SNPs on pH measured from loin. A bioinformatics analysis was performed to identify the most promising genes in the significant regions related to meat quality.
Results: Genome-wide association study (GWAS) revealed three significant chromosomal regions: one on chromosome 3 (39.9 Mb–40.1 Mb) and two on chromosome 15 (58.5 Mb–60.5 Mb and 132 Mb–135 Mb including PRKAG3). A conditional analysis with a significant SNP in the PRKAG3 region, MARC0083357, as a covariate in the model retained the significant SNPs on chromosome 3. Even though linkage disequilibrium was relatively high over a long distance between MARC0083357 and other significant SNPs on chromosome 15, some SNPs retained their significance in the conditional analysis, even in the vicinity of PRKAG3. The significant regions harbored several genes, including two genes involved in cyclic AMP (cAMP) signaling: ADCY9 and CREBBP. Based on functional and transcription factor-gene networks, the most promising candidate genes for meat pH are ADCY9, CREBBP, TRAP1, NRG1, PRKAG3, VIL1, TNS1, and IGFBP5, and the key transcription factors related to these genes are HNF4A, PPARG, and Nkx2-5.
Conclusions: Based on SNP association, pathway, and transcription factor analysis, we were able to identify several genes with potential to control muscle cell homeostasis and meat quality. The associated SNPs can be used in selection for better pork. We also showed that post-GWAS analysis reveals important information about the genes’ potential role in meat quality. The gained information can be used in later functional studies.
Affiliation(s)
- Lucas L Verardo
- Department of Animal Science/Animal Breeding, Federal University of Viçosa, Viçosa, Brazil
- Ville Hietakangas
- Department of Biosciences, University of Helsinki, Helsinki, Finland; Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Pekka Uimari
- Department of Agricultural Sciences, University of Helsinki, Helsinki, Finland
5
Bartel J, Krumsiek J, Schramm K, Adamski J, Gieger C, Herder C, Carstensen M, Peters A, Rathmann W, Roden M, Strauch K, Suhre K, Kastenmüller G, Prokisch H, Theis FJ. The Human Blood Metabolome-Transcriptome Interface. PLoS Genet 2015; 11:e1005274. [PMID: 26086077 PMCID: PMC4473262 DOI: 10.1371/journal.pgen.1005274] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 11/18/2014] [Accepted: 05/12/2015] [Indexed: 12/21/2022]
Abstract
Biological systems consist of multiple organizational levels, all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the 'human blood metabolome-transcriptome interface' (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e., metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated with intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease.
Affiliation(s)
- Jörg Bartel
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Jan Krumsiek
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Katharina Schramm
- Institute of Human Genetics, Helmholtz Zentrum München, Neuherberg, Germany
- Institute of Human Genetics, Technische Universität München, Neuherberg, Germany
- Jerzy Adamski
- Institute of Experimental Genetics, Genome Analysis Center Helmholtz Zentrum München, Neuherberg, Germany
- Faculty of Experimental Genetics, Technische Universität München, Freising-Weihenstephan, Germany
- German Center for Cardiovascular Disease Research (DZHK e.V.), partner-site Munich, Munich, Germany
- Christian Gieger
- Institute of Genetic Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
- Christian Herder
- Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- German Center for Diabetes Research (DZD e.V.), partner-site Düsseldorf, Düsseldorf, Germany
- Maren Carstensen
- Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- German Center for Diabetes Research (DZD e.V.), partner-site Düsseldorf, Düsseldorf, Germany
- Annette Peters
- German Center for Cardiovascular Disease Research (DZHK e.V.), partner-site Munich, Munich, Germany
- Institute of Epidemiology II, Helmholtz Zentrum München, Neuherberg, Germany
- Wolfgang Rathmann
- Institute of Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Michael Roden
- Institute of Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- German Center for Diabetes Research (DZD e.V.), partner-site Düsseldorf, Düsseldorf, Germany
- Department of Endocrinology and Diabetology, University Hospital Düsseldorf, Heinrich Heine University, Düsseldorf, Germany
- Konstantin Strauch
- Institute of Genetic Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
- Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany
- Karsten Suhre
- Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Qatar Foundation, Doha, Qatar
- Gabi Kastenmüller
- Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Holger Prokisch
- Institute of Human Genetics, Helmholtz Zentrum München, Neuherberg, Germany
- Institute of Human Genetics, Technische Universität München, Neuherberg, Germany
- Fabian J. Theis
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Department of Mathematics, Technische Universität München, Garching, Germany
6
Jablonska A, Polouliakh N. In silico discovery of novel transcription factors regulated by mTOR-pathway activities. Front Cell Dev Biol 2014; 2:23. [PMID: 25364730 PMCID: PMC4206986 DOI: 10.3389/fcell.2014.00023] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/01/2014] [Accepted: 05/09/2014] [Indexed: 12/21/2022]
Abstract
The mammalian target of rapamycin (mTOR) pathway is a key regulator of cellular growth, development, and ageing, and unraveling its control is essential for understanding the life and death of biological organisms. A motif-discovery workbench comprising nine tools was used to identify transcription factors involved in five basic activities of the mTOR pathway (Insulin, MAPK, VEGF, Hypoxia, and mTOR core). Discovered transcription factors are classified as “process-specific” or “pathway-ubiquitous”, with emphasis on their regulating or regulated activities within the mTOR pathway. Our transcription regulation results will facilitate further research into the control mechanisms of the mTOR pathway.
Affiliation(s)
- Agnieszka Jablonska
- Faculty of Biotechnology and Food Sciences, Lodz University of Technology, Lodz, Poland
- Natalia Polouliakh
- Fundamental Research Laboratories, Sony Computer Science Laboratories Inc., Tokyo, Japan; Systems Biology Institute, Tokyo, Japan; Graduate School of Medicine, Yokohama City University, Yokohama, Japan
7
Selection of higher order regression models in the analysis of multi-factorial transcription data. PLoS One 2014; 9:e91840. [PMID: 24658540 PMCID: PMC3962375 DOI: 10.1371/journal.pone.0091840] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/02/2013] [Accepted: 02/16/2014] [Indexed: 11/19/2022]
Abstract
Introduction: Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ.
Results: We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes.
Conclusions: We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.
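The neutral, alleviating, and aggravating labels described above can be illustrated with a small worked example for two factors. This is only a sketch under stated assumptions, not the paper's procedure. It compares the combined effect with the additive expectation from the two single effects, using hypothetical mean log-expression values. The tolerance `tol` stands in for a proper significance test, and the function name `classify_interaction` is made up for illustration.

```python
def classify_interaction(m00, m10, m01, m11, tol=0.1):
    """Classify a two-factor interaction from four group means.

    m00: neither factor, m10: factor A only,
    m01: factor B only,  m11: both factors.
    Returns (label, interaction), where interaction is the deviation of
    the combined effect from the sum of the two single effects.
    """
    effect_a = m10 - m00
    effect_b = m01 - m00
    expected = effect_a + effect_b      # additive (no-interaction) expectation
    observed = m11 - m00                # combined effect actually seen
    interaction = observed - expected   # interaction term of the 2x2 model
    if abs(interaction) <= tol:
        return "neutral", interaction
    if abs(observed) < abs(expected):
        return "alleviating", interaction
    return "aggravating", interaction


# Hypothetical mean log2 expression values for three genes.
examples = {
    "gene_1": (5.0, 6.0, 6.5, 7.5),
    "gene_2": (5.0, 6.0, 6.5, 6.8),
    "gene_3": (5.0, 6.0, 6.5, 9.0),
}
labels = {gene: classify_interaction(*means)[0] for gene, means in examples.items()}
for gene, label in labels.items():
    print(gene, label)
```

In a fitted linear model, the `interaction` value corresponds to the coefficient of the A×B term in a saturated two-factor design; comparing nested models with and without that term is what decides whether it is kept.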
8
Charos AE, Reed BD, Raha D, Szekely AM, Weissman SM, Snyder M. A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells. Genome Res 2013; 22:1668-79. [PMID: 22955979 PMCID: PMC3431484 DOI: 10.1101/gr.127761.111] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.7] [Indexed: 12/31/2022]
Abstract
PPARGC1A is a transcriptional coactivator that binds to and coactivates a variety of transcription factors (TFs) to regulate the expression of target genes. PPARGC1A plays a pivotal role in regulating energy metabolism and has been implicated in several human diseases, most notably type II diabetes. Previous studies have focused on the interplay between PPARGC1A and individual TFs, but little is known about how PPARGC1A combines with all of its partners across the genome to regulate transcriptional dynamics. In this study, we describe a core PPARGC1A transcriptional regulatory network operating in HepG2 cells treated with forskolin. We first mapped the genome-wide binding sites of PPARGC1A using chromatin-IP followed by high-throughput sequencing (ChIP-seq) and uncovered overrepresented DNA sequence motifs corresponding to known and novel PPARGC1A network partners. We then profiled six of these site-specific TF partners using ChIP-seq and examined their network connectivity and combinatorial binding patterns with PPARGC1A. Our analysis revealed extensive overlap of targets including a novel link between PPARGC1A and HSF1, a TF regulating the conserved heat shock response pathway that is misregulated in diabetes. Importantly, we found that different combinations of TFs bound to distinct functional sets of genes, thereby helping to reveal the combinatorial regulatory code for metabolic and other cellular processes. In addition, the different TFs often bound near the promoters and coding regions of each other's genes suggesting an intricate network of interdependent regulation. Overall, our study provides an important framework for understanding the systems-level control of metabolic gene expression in humans.
Affiliation(s)
- Alexandra E Charos
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut 06520, USA
9
Vandenbon A, Kumagai Y, Teraguchi S, Amada KM, Akira S, Standley DM. A Parzen window-based approach for the detection of locally enriched transcription factor binding sites. BMC Bioinformatics 2013; 14:26. [PMID: 23331723 PMCID: PMC3602658 DOI: 10.1186/1471-2105-14-26] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/06/2012] [Accepted: 01/14/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Identification of cis- and trans-acting factors regulating gene expression remains an important problem in biology. Bioinformatics analyses of regulatory regions are hampered by several difficulties. One is that binding sites for regulatory proteins are often not significantly over-represented in the set of DNA sequences of interest, because of high levels of false positive predictions, and because of positional restrictions on functional binding sites with regard to the transcription start site.
RESULTS: We have developed a novel method for the detection of regulatory motifs based on their local over-representation in sets of regulatory regions. The method makes use of a Parzen window-based approach for scoring local enrichment, and during evaluation of significance it takes into account GC content of sequences. We show that the accuracy of our method compares favourably to that of other methods, and that our method is capable of detecting not only generally over-represented regulatory motifs, but also locally over-represented motifs that are often missed by standard motif detection approaches. Using a number of examples we illustrate the validity of our approach and suggest applications, such as the analysis of weaker binding sites.
CONCLUSIONS: Our approach can be used to suggest testable hypotheses for wet-lab experiments. It has potential for future analyses, such as the prediction of weaker binding sites. An online application of our approach, called LocaMo Finder (Local Motif Finder), is available at http://sysimm.ifrec.osaka-u.ac.jp/tfbs/locamo/.
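To make the idea of local enrichment concrete, the sketch below applies a generic Gaussian Parzen-window (kernel density) estimate to motif-hit positions relative to the transcription start site, then compares foreground and background densities along a grid. It is not the LocaMo Finder scoring scheme: the bandwidth, the pseudocount, the log-ratio score, and all positions are assumptions for illustration, and the GC-content correction mentioned in the abstract is omitted.

```python
import math


def parzen_density(positions, x, bandwidth):
    """Gaussian Parzen-window density estimate at x from observed positions."""
    if not positions:
        return 0.0
    norm = 1.0 / (len(positions) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions
    )


def local_enrichment_profile(fg_hits, bg_hits, grid, bandwidth=20.0, pseudo=1e-6):
    """Log2 ratio of foreground to background hit density at each grid point."""
    return [
        math.log2(
            (parzen_density(fg_hits, x, bandwidth) + pseudo)
            / (parzen_density(bg_hits, x, bandwidth) + pseudo)
        )
        for x in grid
    ]


# Hypothetical hit positions (bp relative to the TSS), not data from the paper.
fg_hits = [-62, -58, -55, -51, -49, -300, 120]   # clustered near -50
bg_hits = [-480, -390, -300, -210, -120, -55, 40, 130, 220, 310]  # spread out
grid = list(range(-500, 501, 50))

profile = local_enrichment_profile(fg_hits, bg_hits, grid)
peak_position = grid[profile.index(max(profile))]
print("strongest local enrichment near", peak_position, "bp")
```

In this toy data the motif is not strongly over-represented overall (7 foreground hits versus 10 background hits), yet the density ratio peaks sharply around -50 bp, which is the kind of positional signal a global count would miss.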
Affiliation(s)
- Alexis Vandenbon
- Laboratory of Systems Immunology, Immunology Frontier Research Center, Osaka University, Osaka, Japan.
10
Wang D, Tapan S. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences. BMC Syst Biol 2012; 6 Suppl 2:S4. [PMID: 23282090 PMCID: PMC3521183 DOI: 10.1186/1752-0509-6-s2-s4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/31/2022]
Abstract
Background: Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists because they reduce experimental costs and speed up the discovery of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to evaluate the similarity between a k-mer (a k-length subsequence) and a motif model effectively and efficiently, without assuming the independence of nucleotides in motif models and without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. It is also beneficial to use a priori knowledge in developing advanced searching tools.
Results: This paper presents a new scoring function, termed MISCORE, for functional motif characterization and evaluation. MISCORE is free from: (i) any assumption on model dependency; and (ii) the use of a Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and their instances from non-functional ones.
Conclusions: MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It allows a priori known motif models to be embedded for computing motif-to-motif similarity, an advantage over the IC and MAP scores. In addition, MISCORE can automatically filter out some repetitive k-mers from a motif model because compositional complexity is built into the function. Consequently, its motif signal modeling power and computational efficiency make MISCORE well suited to the development of computational motif discovery tools.
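The abstract does not give the MISCORE formula, so it is not reproduced here. For orientation, the sketch below computes the Information Content (IC) baseline that MISCORE is compared against: the relative entropy, in bits, of a motif's per-position base frequencies against a background distribution. The uniform background, the helper names, and the example instances are assumptions for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_frequencies(instances, pseudocount=0.0):
    """Per-position base frequencies from equal-length motif instances."""
    length = len(instances[0])
    if any(len(seq) != length for seq in instances):
        raise ValueError("all motif instances must have the same length")
    total = len(instances) + pseudocount * len(ALPHABET)
    freqs = []
    for i in range(length):
        counts = Counter(seq[i] for seq in instances)
        freqs.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return freqs


def information_content(freqs, background=None):
    """Sum over positions of the relative entropy (bits) versus background."""
    background = background or {b: 0.25 for b in ALPHABET}
    ic = 0.0
    for column in freqs:
        for base in ALPHABET:
            if column[base] > 0:
                ic += column[base] * math.log2(column[base] / background[base])
    return ic


# Hypothetical motif instances: three conserved positions, one variable.
instances = ["ACGT", "ACGT", "ACGA", "ACGC"]
ic_bits = information_content(position_frequencies(instances))
print(f"information content: {ic_bits:.2f} bits")
```

Each fully conserved position contributes 2 bits against a uniform background, and the variable last position contributes 0.5 bits, so the total is 6.5 bits. Note that IC treats positions as independent, which is one of the assumptions the abstract says MISCORE avoids.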
Affiliation(s)
- Dianhui Wang
- Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria 3086, Australia.
11
Marcinowski L, Lidschreiber M, Windhager L, Rieder M, Bosse JB, Rädle B, Bonfert T, Györy I, de Graaf M, da Costa OP, Rosenstiel P, Friedel CC, Zimmer R, Ruzsics Z, Dölken L. Real-time transcriptional profiling of cellular and viral gene expression during lytic cytomegalovirus infection. PLoS Pathog 2012; 8:e1002908. [PMID: 22969428 PMCID: PMC3435240 DOI: 10.1371/journal.ppat.1002908] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 02/22/2012] [Accepted: 08/01/2012] [Indexed: 01/08/2023]
Abstract
During viral infections cellular gene expression is subject to rapid alterations induced by both viral and antiviral mechanisms. In this study, we applied metabolic labeling of newly transcribed RNA with 4-thiouridine (4sU-tagging) to dissect the real-time kinetics of cellular and viral transcriptional activity during lytic murine cytomegalovirus (MCMV) infection. Microarray profiling on newly transcribed RNA obtained at different times during the first six hours of MCMV infection revealed discrete functional clusters of cellular genes regulated with distinct kinetics at surprising temporal resolution. Immediately upon virus entry, a cluster of NF-κB- and interferon-regulated genes was induced. Rapid viral counter-regulation of this coincided with a very transient DNA-damage response, followed by a delayed ER-stress response. Rapid counter-regulation of all three clusters indicated the involvement of novel viral regulators targeting these pathways. In addition, down-regulation of two clusters involved in cell-differentiation (rapid repression) and cell-cycle (delayed repression) was observed. Promoter analysis revealed all five clusters to be associated with distinct transcription factors, of which NF-κB and c-Myc were validated to precisely match the respective transcriptional changes observed in newly transcribed RNA. 4sU-tagging also allowed us to study the real-time kinetics of viral gene expression in the absence of any interfering virion-associated-RNA. Both qRT-PCR and next-generation sequencing demonstrated a sharp peak of viral gene expression during the first two hours of infection including transcription of immediate-early, early and even well characterized late genes. Interestingly, this was subject to rapid gene silencing by 5–6 hours post infection. 
Despite the rapid increase in viral DNA load during viral DNA replication, transcriptional activity of some viral genes remained remarkably constant until late-stage infection, or was subject to further continuous decline. In summary, this study pioneers real-time transcriptional analysis during a lytic herpesvirus infection and highlights numerous novel regulatory aspects of virus-host-cell interaction.
Cytomegaloviruses are large DNA viruses, which establish life-long latent infections, leaving the infected individual at risk of reactivation and disease. Here, we applied 4-thiouridine-(4sU)-tagging of newly transcribed RNA to monitor the real-time kinetics of transcriptional activity of both cellular and viral genes during lytic murine CMV (MCMV) infection. We observed a cascade of MCMV-induced signaling events including a rapid inflammatory/interferon-response, a transient DNA-damage-response and a delayed ER-stress-response. All of these were heavily counter-regulated by viral gene expression. Besides dramatically increasing temporal resolution, our approach provides the unique opportunity to study viral transcriptional activity in absence of any interfering virion-associated-RNA. Virion-associated-RNA consists of transcripts that are unspecifically incorporated into the virus particles thereby resembling the cellular RNA profile of late stage infection. A clear picture of which viral genes are expressed, particularly at very early times of infection, could thus not be obtained. By overcoming this problem, we provide intriguing insights into the regulation of viral gene expression, namely 1) a peak of viral gene expression during the first two hours of infection including the expression of well-characterized late genes and 2) remarkably constant or even continuously declining expression of some viral genes despite the onset of rapid viral DNA replication.
Affiliation(s)
- Lisa Marcinowski
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University, Munich, Germany
- Michael Lidschreiber
- Gene Center and Department of Biochemistry, Ludwig-Maximilians-University, Munich, Germany
- Lukas Windhager
- Institute for Informatics, Ludwig-Maximilians-University, Munich, Germany
- Martina Rieder
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University, Munich, Germany
- Jens B. Bosse
- Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
- Bernd Rädle
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University, Munich, Germany
- Thomas Bonfert
- Institute for Informatics, Ludwig-Maximilians-University, Munich, Germany
- Ildiko Györy
- School of Biomedical and Biological Sciences, Centre for Research in Translational Biomedicine, Plymouth University, Plymouth, United Kingdom
- Miranda de Graaf
- Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge, United Kingdom
- Philip Rosenstiel
- Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
- Ralf Zimmer
- Institute for Informatics, Ludwig-Maximilians-University, Munich, Germany
- Zsolt Ruzsics
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University, Munich, Germany
- Lars Dölken
- Max von Pettenkofer-Institute, Ludwig-Maximilians-University, Munich, Germany
- Department of Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge, United Kingdom
|
12
|
oPOSSUM-3: advanced analysis of regulatory motif over-representation across genes or ChIP-Seq datasets. G3-GENES GENOMES GENETICS 2012; 2:987-1002. [PMID: 22973536 PMCID: PMC3429929 DOI: 10.1534/g3.112.003202] [Citation(s) in RCA: 230] [Impact Index Per Article: 19.2] [Received: 05/27/2012] [Accepted: 06/11/2012] [Indexed: 01/12/2023]
Abstract
oPOSSUM-3 is a web-accessible software system for identification of over-represented transcription factor binding sites (TFBS) and TFBS families in either DNA sequences of co-expressed genes or sequences generated from high-throughput methods, such as ChIP-Seq. Validation of the system with known sets of co-regulated genes and published ChIP-Seq data demonstrates the capacity for oPOSSUM-3 to identify mediating transcription factors (TF) for co-regulated genes or co-recovered sequences. oPOSSUM-3 is available at http://opossum.cisreg.ca.
|
13
|
Tan M, Yu D, Jin Y, Dou L, Li B, Wang Y, Yue J, Liang L. An information transmission model for transcription factor binding at regulatory DNA sites. Theor Biol Med Model 2012; 9:19. [PMID: 22672438 PMCID: PMC3442977 DOI: 10.1186/1742-4682-9-19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/08/2012] [Accepted: 05/17/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Computational identification of transcription factor binding sites (TFBSs) is a rapid, cost-efficient way to locate unknown regulatory elements. With increased potential for high-throughput genome sequencing, the availability of accurate computational methods for TFBS prediction has never been as important as it currently is. To date, identifying TFBSs with high sensitivity and specificity is still an open challenge, necessitating the development of novel models for predicting transcription factor-binding regulatory DNA elements. RESULTS Based on the information theory, we propose a model for transcription factor binding of regulatory DNA sites. Our model incorporates position interdependencies in effective ways. The model computes the information transferred (TI) between the transcription factor and the TFBS during the binding process and uses TI as the criterion to determine whether the sequence motif is a possible TFBS. Based on this model, we developed a computational method to identify TFBSs. By theoretically proving and testing our model using both real and artificial data, we found that our model provides highly accurate predictive results. CONCLUSIONS In this study, we present a novel model for transcription factor binding regulatory DNA sites. The model can provide an increased ability to detect TFBSs.
Affiliation(s)
- Mingfeng Tan
- Beijing Institute of Biotechnology, Beijing 100071, China
|
14
|
Iwashita Y, Fukuchi N, Waki M, Hayashi K, Tahira T. Genome-wide repression of NF-κB target genes by transcription factor MIBP1 and its modulation by O-linked β-N-acetylglucosamine (O-GlcNAc) transferase. J Biol Chem 2012; 287:9887-9900. [PMID: 22294689 DOI: 10.1074/jbc.m111.298521] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022] Open
Abstract
The transcription factor c-MYC intron binding protein 1 (MIBP1) binds to various genomic regulatory regions, including intron 1 of c-MYC. This factor is highly expressed in postmitotic neurons in the fetal brain and may be involved in various biological steps, such as neurological and immunological processes. In this study, we globally characterized the transcriptional targets of MIBP1 and proteins that interact with MIBP1. Microarray hybridization followed by gene set enrichment analysis revealed that genes involved in the pathways downstream of MYC, NF-κB, and TGF-β were down-regulated when HEK293 cells stably overexpressed MIBP1. In silico transcription factor binding site analysis of the promoter regions of these down-regulated genes showed that the NF-κB binding site was the most overrepresented. The up-regulation of genes known to be in the NF-κB pathway after the knockdown of endogenous MIBP1 in HT1080 cells supports the view that MIBP1 is a down-regulator of the NF-κB pathway. We also confirmed the binding of the MIBP1 to the NF-κB site. By immunoprecipitation and mass spectrometry, we detected O-linked β-N-acetylglucosamine (O-GlcNAc) transferase as a prominent binding partner of MIBP1. Analyses using deletion mutants revealed that a 154-amino acid region of MIBP1 was necessary for its O-GlcNAc transferase binding and O-GlcNAcylation. A luciferase reporter assay showed that NF-κB-responsive expression was repressed by MIBP1, and stronger repression by MIBP1 lacking the 154-amino acid region was observed. Our results indicate that the primary effect of MIBP1 expression is the down-regulation of the NF-κB pathway and that this effect is attenuated by O-GlcNAc signaling.
Affiliation(s)
- Yuji Iwashita
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan; Graduate School of Systems Life Sciences, Kyushu University, Fukuoka 812-8582, Japan
- Naruhiko Fukuchi
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
- Mariko Waki
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
- Kenshi Hayashi
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
- Tomoko Tahira
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan; Graduate School of Systems Life Sciences, Kyushu University, Fukuoka 812-8582, Japan
|
15
|
Transcriptional networks in epithelial-mesenchymal transition. PLoS One 2011; 6:e25354. [PMID: 21980432 PMCID: PMC3184133 DOI: 10.1371/journal.pone.0025354] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 04/26/2011] [Accepted: 09/01/2011] [Indexed: 12/22/2022] Open
Abstract
Background Epithelial-mesenchymal transition (EMT) changes polarized epithelial cells into migratory phenotypes associated with loss of cell-cell adhesion molecules and cytoskeletal rearrangements. This form of plasticity is seen in mesodermal development, fibroblast formation, and cancer metastasis. Methods and Findings Here we identify prominent transcriptional networks active during three time points of this transitional process, as epithelial cells become fibroblasts. DNA microarray in cultured epithelia undergoing EMT, validated in vivo, were used to detect various patterns of gene expression. In particular, the promoter sequences of differentially expressed genes and their transcription factors were analyzed to identify potential binding sites and partners. The four most frequent cis-regulatory elements (CREs) in up-regulated genes were SRY, FTS-1, Evi-1, and GC-Box, and RNA inhibition of the four transcription factors, Atf2, Klf10, Sox11, and SP1, most frequently binding these CREs, establish their importance in the initiation and propagation of EMT. Oligonucleotides that block the most frequent CREs restrain EMT at early and intermediate stages through apoptosis of the cells. Conclusions Our results identify new transcriptional interactions with high frequency CREs that modulate the stability of cellular plasticity, and may serve as targets for modulating these transitional states in fibroblasts.
|
16
|
p38MAPK is a novel DNA damage response-independent regulator of the senescence-associated secretory phenotype. EMBO J 2011; 30:1536-48. [PMID: 21399611 PMCID: PMC3102277 DOI: 10.1038/emboj.2011.69] [Citation(s) in RCA: 681] [Impact Index Per Article: 52.4] [Received: 10/21/2010] [Accepted: 02/18/2011] [Indexed: 12/11/2022] Open
Abstract
Cellular senescence suppresses cancer by forcing potentially oncogenic cells into a permanent cell cycle arrest. Senescent cells also secrete growth factors, proteases, and inflammatory cytokines, termed the senescence-associated secretory phenotype (SASP). Much is known about pathways that regulate the senescence growth arrest, but far less is known about pathways that regulate the SASP. We previously showed that DNA damage response (DDR) signalling is essential, but not sufficient, for the SASP, which is restrained by p53. Here, we delineate another crucial SASP regulatory pathway and its relationship to the DDR and p53. We show that diverse senescence-inducing stimuli activate the stress-inducible kinase p38MAPK in normal human fibroblasts. p38MAPK inhibition markedly reduced the secretion of most SASP factors, constitutive p38MAPK activation was sufficient to induce an SASP, and p53 restrained p38MAPK activation. Further, p38MAPK regulated the SASP independently of the canonical DDR. Mechanistically, p38MAPK induced the SASP largely by increasing NF-κB transcriptional activity. These findings assign p38MAPK a novel role in SASP regulation--one that is necessary, sufficient, and independent of previously described pathways.
|
17
|
Tuggle CK, Bearson SMD, Uthe JJ, Huang TH, Couture OP, Wang YF, Kuhar D, Lunney JK, Honavar V. Methods for transcriptomic analyses of the porcine host immune response: application to Salmonella infection using microarrays. Vet Immunol Immunopathol 2010; 138:280-91. [PMID: 21036404 DOI: 10.1016/j.vetimm.2010.10.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
Technological developments in both the collection and analysis of molecular genetic data over the past few years have provided new opportunities for an improved understanding of the global response to pathogen exposure. Such developments are particularly dramatic for scientists studying the pig, where tools to measure the expression of tens of thousands of transcripts, as well as unprecedented data on the porcine genome sequence, have combined to expand our abilities to elucidate the porcine immune system. In this review, we describe these recent developments in the context of our work using primarily microarrays to explore gene expression changes during infection of pigs by Salmonella. Thus while the focus is not a comprehensive review of all possible approaches, we provide links and information on both the tools we use as well as alternatives commonly available for transcriptomic data collection and analysis of porcine immune responses. Through this review, we expect readers will gain an appreciation for the necessary steps to plan, conduct, analyze and interpret the data from transcriptomic analyses directly applicable to their research interests.
Affiliation(s)
- C K Tuggle
- Department of Animal Science, and Center for Integrated Animal Genomics, 2255 Kildee Hall, Iowa State University, Ames, IA 50010, United States.
|
18
|
Mansour AA, Nissim-Eliraz E, Zisman S, Golan-Lev T, Schatz O, Klar A, Ben-Arie N. Foxa2 regulates the expression of Nato3 in the floor plate by a novel evolutionarily conserved promoter. Mol Cell Neurosci 2010; 46:187-99. [PMID: 20849957 DOI: 10.1016/j.mcn.2010.09.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 07/28/2010] [Revised: 08/29/2010] [Accepted: 09/01/2010] [Indexed: 11/24/2022] Open
Abstract
The development of the neural tube into a complex central nervous system involves morphological, cellular and molecular changes, all of which are tightly regulated. The floor plate (FP) is a critical organizing center located at the ventral-most midline of the neural tube. FP cells regulate dorsoventral patterning, differentiation and axon guidance by secreting morphogens. Here we show that the bHLH transcription factor Nato3 (Ferd3l) is specifically expressed in the spinal FP of chick and mouse embryos. Using in ovo electroporation to understand the regulation of the FP-specific expression of Nato3, we have identified an evolutionarily conserved 204 bp genomic region, which is necessary and sufficient to drive expression to the chick FP. This promoter contains two Foxa2-binding sites, which are highly conserved among distant phyla. The two sites can bind Foxa2 in vitro, and are necessary for the expression in the FP in vivo. Gain and loss of Foxa2 function in vivo further emphasize its role in Nato3 promoter activity. Thus, our data suggest that Nato3 is a direct target of Foxa2, a transcription activator and effector of Sonic hedgehog, the hallmark regulator of FP induction and spinal cord development. The identification of the FP-specific promoter is an important step towards a better understanding of the molecular mechanisms through which Nato3 transcription is regulated and for uncovering its function during nervous system development. Moreover, the promoter provides us with a powerful tool for conditional genetic manipulations in the FP.
Affiliation(s)
- Abed AlFatah Mansour
- Department of Cell and Developmental Biology, Institute of Life Sciences, Hebrew University of Jerusalem, Jerusalem 91904, Israel
|
19
|
Piechota M, Korostynski M, Przewlocki R. Identification of cis-regulatory elements in the mammalian genome: the cREMaG database. PLoS One 2010; 5:e12465. [PMID: 20824209 PMCID: PMC2930848 DOI: 10.1371/journal.pone.0012465] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 02/10/2010] [Accepted: 08/02/2010] [Indexed: 12/20/2022] Open
Abstract
Background A growing number of gene expression-profiling datasets provides a reliable source of information about gene co-expression. In silico analyses of the properties shared among the promoters of co-expressed genes facilitates the identification of transcription factors (TFs) involved in the co-regulation of those genes. Our previous experience with microarray data led to the development of a database suitable for the examination of regulatory motifs in the promoters of co-expressed genes. Methodology We introduce the cREMaG (cis-Regulatory Elements in the Mammalian Genome) system designed for in silico studies of the promoter properties of co-regulated mammalian genes. The cREMaG system offers an analysis of data obtained from human, mouse, rat, bovine and canine gene expression-profiling studies. More than eight analysis parameters can be utilized in user-defined combinations. The selection of alternative transcription start sites and information about CpG islands are also available. Conclusions Using the cREMaG system, we successfully identified TFs mediating transcriptional responses in reference gene sets. The cREMaG system facilitates in silico studies of mammalian transcriptional gene regulation. The resource is freely available at http://www.cremag.org.
Affiliation(s)
- Marcin Piechota
- Department of Molecular Neuropharmacology, Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland.
|
20
|
Abstract
DNA-binding transcription factors (TFs) play a central role in transcription regulation, and computational approaches that help in elucidating complex mechanisms governing this basic biological process are of great use. In this perspective, we present the TFM-Explorer web server that is a toolbox to identify putative TF binding sites within a set of upstream regulatory sequences of genes sharing some regulatory mechanisms. TFM-Explorer finds local regions showing overrepresentation of binding sites. Accepted organisms are human, mouse, rat, chicken and drosophila. The server employs a number of features to help users to analyze their data: visualization of selected binding sites on genomic sequences, and selection of cis-regulatory modules. TFM-Explorer is available at http://bioinfo.lifl.fr/TFM.
Affiliation(s)
- Laurie Tonon
- INRIA Lille-Nord Europe, 40 av Halley, 59650 Villeneuve d'Ascq Cedex, France
|
21
|
Abstract
Recognition of promoter elements by the transcription factors is one of the early initial and crucial steps in gene expression and regulation. In prokaryotes, there are clear signals to identify the promoter regions like TATAAT at around -10 and TTGACA at -35 positions from transcription start site (TSS). In eukaryotes the promoter regions are structurally more complex and there are no conserved or consensus sequences similar to the ones found in prokaryotic promoters. We have located a set of GC rich short sequences (< 8 nt) that are relatively common in human promoter sequences around the TSS (+/- 100 relative to TSS). These sequences were sorted based on their frequency of occurrence in the database and the most common 50 sequences were used for further studies. Sigmoidal behavior of the high end of the frequency distribution of these sequences suggests presence of some internal co-operativity. These short sequences are distributed on both sides of TSS, suggesting that probably the transcription factors recognize these sequences on both upstream and downstream of TSS. As eukaryotic promoters lack any conserved sequences, we expect that these short sequences may help in recognition of promoter regions by relevant transcription factors prior to the initiation of transcription process. We postulate that a cluster of genes with common short sequences in the promoter region can be recognized by a particular transcription factor. We also found that most of these short sequences are fairly common within miRNA (both mature and stem-loop sequences). Our studies indicate that eukaryotic transcription is more complex than currently believed.
Affiliation(s)
- Padmavathi Putta
- Department of Biochemistry, University of Hyderabad, Hyderabad - 500 046, India.
|
22
|
Bernard V, Brunaud V, Lecharny A. TC-motifs at the TATA-box expected position in plant genes: a novel class of motifs involved in the transcription regulation. BMC Genomics 2010; 11:166. [PMID: 20222994 PMCID: PMC2842252 DOI: 10.1186/1471-2164-11-166] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 07/27/2009] [Accepted: 03/12/2010] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND The TATA-box and TATA-variants are regulatory elements involved in the formation of a transcription initiation complex. Both have been conserved throughout evolution in a restricted region close to the Transcription Start Site (TSS). However, less than half of the genes in model organisms studied so far have been found to contain either one of these elements. Indeed different core-promoter elements are involved in the recruitment of the TATA-box-binding protein. Here we assessed the possibility of identifying novel functional motifs in plant genes, sharing the TATA-box topological constraints. RESULTS We developed an ab-initio approach considering the preferential location of motifs relative to the TSS. We identified motifs observed at the TATA-box expected location and conserved in both Arabidopsis thaliana and Oryza sativa promoters. We identified TC-elements within non-TA-rich promoters 30 bases upstream of the TSS. As with the TATA-box and TATA-variant sequences, it was possible to construct a unique distance graph with the TC-element sequences. The structural and functional features of TC-element-containing genes were distinct from those of TATA-box- or TATA-variant-containing genes. Arabidopsis thaliana transcriptome analysis revealed that TATA-box-containing genes were generally those showing relatively high levels of expression and that TC-element-containing genes were generally those expressed in specific conditions. CONCLUSIONS Our observations suggest that the TC-elements might constitute a class of novel regulatory elements participating towards the complex modulation of gene expression in plants.
Affiliation(s)
- Virginie Bernard
- Unité de Recherche en Génomique Végétale (URGV), UMR INRA 1165-CNRS 8114-UEVE, 2 Rue Gaston Crémieux, 91057 Evry Cedex, France
|
23
|
Essaghir A, Toffalini F, Knoops L, Kallin A, van Helden J, Demoulin JB. Transcription factor regulation can be accurately predicted from the presence of target gene signatures in microarray gene expression data. Nucleic Acids Res 2010; 38:e120. [PMID: 20215436 PMCID: PMC2887972 DOI: 10.1093/nar/gkq149] [Citation(s) in RCA: 159] [Impact Index Per Article: 11.4] [Indexed: 01/17/2023] Open
Abstract
Deciphering transcription factor networks from microarray data remains difficult. This study presents a simple method to infer the regulation of transcription factors from microarray data based on well-characterized target genes. We generated a catalog containing transcription factors associated with 2720 target genes and 6401 experimentally validated regulations. When it was available, a distinction between transcriptional activation and inhibition was included for each regulation. Next, we built a tool (www.tfacts.org) that compares submitted gene lists with target genes in the catalog to detect regulated transcription factors. TFactS was validated with published lists of regulated genes in various models and compared to tools based on in silico promoter analysis. We next analyzed the NCI60 cancer microarray data set and showed the regulation of SOX10, MITF and JUN in melanomas. We then performed microarray experiments comparing gene expression response of human fibroblasts stimulated by different growth factors. TFactS predicted the specific activation of Signal transducer and activator of transcription factors by PDGF-BB, which was confirmed experimentally. Our results show that the expression levels of transcription factor target genes constitute a robust signature for transcription factor regulation, and can be efficiently used for microarray data mining.
Affiliation(s)
- Ahmed Essaghir
- de Duve Institute, Université Catholique de Louvain, MEXP74.30, avenue Hippocrates 74-75, B-1200 Brussels, Belgium
|
24
|
Kadupitige SR, Leung KC, Sellmeier J, Sivieng J, Catchpoole DR, Bain ME, Gaëta BA. MINER: exploratory analysis of gene interaction networks by machine learning from expression data. BMC Genomics 2009; 10 Suppl 3:S17. [PMID: 19958480 PMCID: PMC2788369 DOI: 10.1186/1471-2164-10-s3-s17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] Open
Abstract
Background The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. Results We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Conclusion Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Affiliation(s)
- Sidath Randeni Kadupitige
- School of Computer Science and Engineering, The University of New South Wales, Sydney, NSW, 2052, Australia.
|
25
|
Tomovic A, Stadler M, Oakeley EJ. Transcription factor site dependencies in human, mouse and rat genomes. BMC Bioinformatics 2009; 10:339. [PMID: 19835596 PMCID: PMC2770556 DOI: 10.1186/1471-2105-10-339] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/10/2009] [Accepted: 10/16/2009] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND It is known that transcription factors frequently act together to regulate gene expression in eukaryotes. In this paper we describe a computational analysis of transcription factor site dependencies in human, mouse and rat genomes. RESULTS Our approach for quantifying tendencies of transcription factor binding sites to co-occur is based on a binding site scoring function which incorporates dependencies between positions, the use of information about the structural class of each transcription factor (major/minor groove binder), and also considered the possible implications of varying GC content of the sequences. Significant tendencies (dependencies) have been detected by non-parametric statistical methodology (permutation tests). Evaluation of obtained results has been performed in several ways: reports from literature (many of the significant dependencies between transcription factors have previously been confirmed experimentally); dependencies between transcription factors are not biased due to similarities in their DNA-binding sites; the number of dependent transcription factors that belong to the same functional and structural class is significantly higher than would be expected by chance; supporting evidence from GO clustering of targeting genes. Based on dependencies between two transcription factor binding sites (second-order dependencies), it is possible to construct higher-order dependencies (networks). Moreover results about transcription factor binding sites dependencies can be used for prediction of groups of dependent transcription factors on a given promoter sequence. Our results, as well as a scanning tool for predicting groups of dependent transcription factors binding sites are available on the Internet. CONCLUSION We show that the computational analysis of transcription factor site dependencies is a valuable complement to experimental approaches for discovering transcription regulatory interactions and networks. 
Scanning promoter sequences with dependent groups of transcription factor binding sites improves the quality of transcription factor predictions.
Affiliation(s)
- Andrija Tomovic
- Friedrich Miescher Institute for Biomedical Research, Novartis Research Foundation, Basel, Switzerland.
|
26
|
Roider HG, Lenhard B, Kanhere A, Haas SA, Vingron M. CpG-depleted promoters harbor tissue-specific transcription factor binding signals--implications for motif overrepresentation analyses. Nucleic Acids Res 2009; 37:6305-15. [PMID: 19736212 PMCID: PMC2770660 DOI: 10.1093/nar/gkp682] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022] Open
Abstract
Motif overrepresentation analysis of proximal promoters is a common approach to characterize the regulatory properties of co-expressed sets of genes. Here we show that these approaches perform well on mammalian CpG-depleted promoter sets that regulate expression in terminally differentiated tissues such as liver and heart. In contrast, CpG-rich promoters show very little overrepresentation signal, even when associated with genes that display highly constrained spatiotemporal expression. For instance, while ∼50% of heart specific genes possess CpG-rich promoters we find that the frequently observed enrichment of MEF2-binding sites upstream of heart-specific genes is solely due to contributions from CpG-depleted promoters. Similar results are obtained for all sets of tissue-specific genes indicating that CpG-rich and CpG-depleted promoters differ fundamentally in their distribution of regulatory inputs around the transcription start site. In order not to dilute the respective transcription factor binding signals, the two promoter types should thus be treated as separate sets in any motif overrepresentation analysis.
Affiliation(s)
- Helge G Roider
- Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin.
|
27
|
van Hijum SAFT, Medema MH, Kuipers OP. Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiol Mol Biol Rev 2009; 73:481-509, Table of Contents. [PMID: 19721087 PMCID: PMC2738135 DOI: 10.1128/mmbr.00037-08] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.8] [Indexed: 11/20/2022] Open
Abstract
A major part of organismal complexity and versatility of prokaryotes resides in their ability to fine-tune gene expression to adequately respond to internal and external stimuli. Evolution has been very innovative in creating intricate mechanisms by which different regulatory signals operate and interact at promoters to drive gene expression. The regulation of target gene expression by transcription factors (TFs) is governed by control logic brought about by the interaction of regulators with TF binding sites (TFBSs) in cis-regulatory regions. A factor that in large part determines the strength of the response of a target to a given TF is motif stringency, the extent to which the TFBS fits the optimal TFBS sequence for a given TF. Advances in high-throughput technologies and computational genomics allow reconstruction of transcriptional regulatory networks in silico. To optimize the prediction of transcriptional regulatory networks, i.e., to separate direct regulation from indirect regulation, a thorough understanding of the control logic underlying the regulation of gene expression is required. This review summarizes the state of the art of the elements that determine the functionality of TFBSs by focusing on the molecular biological mechanisms and evolutionary origins of cis-regulatory regions.
Affiliation(s)
- Sacha A F T van Hijum
- Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.
|
28
|
Geurts J, Joosten LAB, Takahashi N, Arntz OJ, Glück A, Bennink MB, van den Berg WB, van de Loo FAJ. Computational design and application of endogenous promoters for transcriptionally targeted gene therapy for rheumatoid arthritis. Mol Ther 2009; 17:1877-87. [PMID: 19690516 DOI: 10.1038/mt.2009.182] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022] Open
Abstract
The promoter regions of genes that are differentially regulated in the synovial membrane during the course of rheumatoid arthritis (RA) represent attractive candidates for application in transcriptionally targeted gene therapy. In this study, we applied an unbiased computational approach to define proximal-promoters from a gene expression profiling study of murine experimental arthritis. Synovium expression profiles from progressing stages of collagen-induced arthritis (CIA) were classified into six distinct groups using k-means clustering. Using an algorithm based on local over-representation and comparative genomics, we identified putatively functional transcription factor-binding sites (TFBS) in TATA-dependent proximal-promoters. Applying a filter based on spacing between TATA box and transcription start site (TSS) combined with the presence of over-represented nuclear factor kappaB (NFkappaB), AP-1, or CCAAT/enhancer-binding protein beta (C/EBPbeta) sites, 382 candidate murine and human promoters were reduced to 66, corresponding to 45 genes. In vitro, 9 out of 10 computationally defined promoter regions conferred cytokine-inducible expression in murine cells and human synovial fibroblasts. Under these conditions, the serum amyloid A3 (Saa3) promoter showed the strongest transcriptional induction and strength. We applied this promoter for driving therapeutically efficacious levels of the interleukin-1 receptor antagonist (Il1rn) in a disease-regulated fashion. These results demonstrate the value of bioinformatics for guiding the selection of endogenous promoters for transcriptionally targeted gene therapy.
Affiliation(s)
- Jeroen Geurts
- Rheumatology Research and Advanced Therapeutics, Department of Rheumatology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
|
29
|
Abstract
MOTIVATION Identifying transcription factor binding sites (TFBSs) encoding complex regulatory signals in metazoan genomes remains a challenging problem in computational genomics. Due to degeneracy of nucleotide content among binding site instances or motifs, and intricate 'grammatical organization' of motifs within cis-regulatory modules (CRMs), extant pattern matching-based in silico motif search methods often suffer from impractically high false positive rates, especially in the context of analyzing large genomic datasets, and noisy position weight matrices which characterize binding sites. Here, we try to address this problem by using a framework to maximally utilize the information content of the genomic DNA in the region of query, taking cues from values of various biologically meaningful genetic and epigenetic factors in the query region such as clade-specific evolutionary parameters, presence/absence of nearby coding regions, etc. We present a new method for TFBS prediction in metazoan genomes that utilizes both the CRM architecture of sequences and a variety of features of individual motifs. Our proposed approach is based on a discriminative probabilistic model known as conditional random fields that explicitly optimizes the predictive probability of motif presence in large sequences, based on the joint effect of all such features. RESULTS This model overcomes weaknesses in earlier methods based on less effective statistical formalisms that are sensitive to spurious signals in the data. We evaluate our method on both simulated CRMs and real Drosophila sequences in comparison with a wide spectrum of existing models, and outperform the state of the art by 22% in F1 score. AVAILABILITY AND IMPLEMENTATION The code is publicly available at http://www.sailing.cs.cmu.edu/discover.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenjie Fu
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
30
|
Zambelli F, Pesole G, Pavesi G. Pscan: finding over-represented transcription factor binding site motifs in sequences from co-regulated or co-expressed genes. Nucleic Acids Res 2009; 37:W247-52. [PMID: 19487240 PMCID: PMC2703934 DOI: 10.1093/nar/gkp464] [Citation(s) in RCA: 315] [Impact Index Per Article: 21.0] [Indexed: 01/25/2023] Open
Abstract
The first step in gene expression, transcription, is modulated by the interaction of transcription factors with their corresponding binding sites on the DNA sequence. Pscan is a software tool that scans a set of sequences (e.g. promoters) from co-regulated or co-expressed genes with motifs describing the binding specificity of known transcription factors and assesses which motifs are significantly over- or under-represented, providing thus hints on which transcription factors could be common regulators of the genes studied, together with the location of their candidate binding sites in the sequences. Pscan does not resort to comparisons with orthologous sequences and experimental results show that it compares favorably to other tools for the same task in terms of false positive predictions and computation time. The website is free and open to all users and there is no login requirement. Address: http://www.beaconlab.it/pscan.
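The kind of over-representation test described in this abstract can be illustrated with a minimal sketch. It is not Pscan's implementation: the scoring matrix, the function names, and the choice of a z-score on per-sequence best match scores are all assumptions made here for illustration.

```python
import math

def best_pwm_score(seq, pwm):
    """Return the highest matrix score over all windows of seq.

    pwm is a list of dicts, one per motif position, mapping base -> score.
    Assumes len(seq) >= len(pwm).
    """
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def overrepresentation_z(sample_scores, background_scores):
    """z-score of the sample mean against the background score distribution.

    Assumes the background scores are not all identical (non-zero variance).
    """
    n = len(sample_scores)
    bg_mean = sum(background_scores) / len(background_scores)
    bg_var = sum((s - bg_mean) ** 2 for s in background_scores) / len(background_scores)
    sample_mean = sum(sample_scores) / n
    return (sample_mean - bg_mean) / math.sqrt(bg_var / n)

def toy_column(preferred):
    """Hypothetical matrix column: +2 for the preferred base, -1 otherwise."""
    return {base: (2 if base == preferred else -1) for base in "ACGT"}

# Toy 4-column matrix preferring the word GGAA (illustrative values only).
TOY_PWM = [toy_column(base) for base in "GGAA"]
```

Each promoter is reduced to its single best match score, so a large positive z-score suggests that the co-regulated set carries better matches to the motif than the background promoters do on average.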
Affiliation(s)
- Federico Zambelli
- Dipartimento di Scienze Biomolecolari e Biotecnologie, University of Milan, Milan, Italy
|
31
|
Non-coding RNAs revealed during identification of genes involved in chicken immune responses. Immunogenetics 2008; 61:55-70. [PMID: 19009289 DOI: 10.1007/s00251-008-0337-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 06/05/2008] [Accepted: 10/13/2008] [Indexed: 12/12/2022]
Abstract
Recent large-scale cDNA cloning studies have shown that a significant proportion of the transcripts expressed from vertebrate genomes do not appear to encode protein. Moreover, it was reported in mammals (human and mice) that these non-coding transcripts are expressed and regulated by mechanisms similar to those involved in the control of protein-coding genes. We have produced a collection of cDNA sequences from immunologically active tissues with the aim of discovering chicken genes involved in immune mechanisms, and we decided to explore the non-coding component of these immune-related libraries. After finding known non-coding RNAs (miRNA, snRNA, snoRNA), we identified new putative mRNA-like non-coding RNAs. We characterised their expression profiles in immune-related samples. Some of them showed changes in expression following viral infections. As they exhibit patterns of expression that parallel the behaviour of protein-coding RNAs in immune tissues, our study suggests that they could play an active role in the immune response.
|
32
|
Yoon K, Ko D, Doderer M, Livi CB, Penalva LOF. Over-represented sequences located on 3' UTRs are potentially involved in regulatory functions. RNA Biol 2008; 5:255-62. [PMID: 18971640 DOI: 10.4161/rna.7116] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 01/03/2023] Open
Abstract
Eukaryotic gene expression must be coordinated for the proper functioning of biological processes. This coordination can be achieved at both the transcriptional and post-transcriptional levels. In both cases, regulatory sequences located in promoter regions or UTRs function as markers recognized by regulators that can then activate or repress different groups of genes as needed. While regulatory sequences involved in transcription are quite well documented, there is a lack of information on sequence elements involved in post-transcriptional regulation. We used a statistical over-representation method to identify novel regulatory elements located on UTRs. An exhaustive search approach was used to calculate the frequency of all possible n-mers (short nucleotide sequences) in 16,160 human genes of NCBI RefSeq sequences and to identify any peculiar usage of n-mers on UTRs. After a stringent filtering process, we identified 2,772 highly over-represented n-mers on 3' UTRs. We provide evidence that these n-mers are potentially involved in regulatory functions. Identified n-mers overlap with previously identified binding sites for HuR and TIA-1, and with ARE and GRE sequences. We also determine that n-mers overlap with predicted miRNA target sites. Finally, a method to cluster n-mer groups allowed the identification of putative gene networks.
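The exhaustive n-mer counting step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the input sequences are toy data, and a plain frequency ratio stands in for the statistical filtering that the abstract mentions but does not specify.

```python
from collections import Counter

def count_nmers(sequences, n):
    """Count every overlapping n-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts

def frequency_ratio(target_counts, background_counts):
    """Ratio of relative n-mer frequencies, target over background.

    Only n-mers seen in both sets are reported; n-mers absent from the
    background are skipped to avoid dividing by zero (an assumption of
    this sketch, not a rule taken from the abstract).
    """
    target_total = sum(target_counts.values())
    background_total = sum(background_counts.values())
    ratios = {}
    for nmer, count in target_counts.items():
        if nmer in background_counts:
            target_freq = count / target_total
            background_freq = background_counts[nmer] / background_total
            ratios[nmer] = target_freq / background_freq
    return ratios
```

For example, `count_nmers(["ATTTA", "TTTA"], 3)` counts `TTT` and `TTA` twice each and `ATT` once. A ratio well above 1 marks an n-mer as a candidate for over-representation in the target set, which a real analysis would then test for significance.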
Affiliation(s)
- Kihoon Yoon
- Department of Epidemiology and Biostatistics, The University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229-3900, USA
|
33
|
Mihara M, Itoh T, Izawa T. In silico identification of short nucleotide sequences associated with gene expression of pollen development in rice. Plant Cell Physiol 2008; 49:1451-64. [PMID: 18835840 PMCID: PMC2566928 DOI: 10.1093/pcp/pcn129] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/05/2023]
Abstract
Microarray analysis of tiny amounts of RNA extracted from plant section samples prepared by laser microdissection (LM) can provide high-quality information on gene expression in specified plant cells at various stages of development. Having joined the LM-microarray analysis project, we utilized such genome-wide gene expression data from developing rice pollen cells to identify candidates for cis-regulatory elements for specific gene expression in these cells. We first found a few clusters of gene expression patterns based on the data from LM-microarrays. For one gene cluster in which the members were specifically expressed at the bicellular and mature pollen mitotic stages, we identified gene cluster fingerprints (GCFs), each of which consists of a short nucleotide sequence representing the gene cluster. We expected that these GCFs would contain cis-regulatory elements for stage- and tissue-specific gene expression, and we further identified groups of GCFs with common core sequences. Some criteria, such as frequency of occurrence in the gene cluster in contrast to the total tested gene set, flanking sequence preference and distribution of combined GCF sets in the gene regions, allowed us to limit candidates for cis-regulatory sequences for specific gene expression in rice pollen cells to at least 20 sets of combined GCFs. This approach should provide a general purpose algorithm for identifying short nucleotide sequences associated with specific gene expression.
Affiliation(s)
- Motohiro Mihara
- Plant Genomics Research Unit, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, 305-8602 Japan
- Takeshi Itoh
- Bioinformatics Research Unit, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, 305-8602 Japan
- Takeshi Izawa
- Plant Genomics Research Unit, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, 305-8602 Japan
- *Corresponding author: E-mail,
|
34
|
Wang Y, Couture OP, Qu L, Uthe JJ, Bearson SMD, Kuhar D, Lunney JK, Nettleton D, Dekkers JCM, Tuggle CK. Analysis of porcine transcriptional response to Salmonella enterica serovar Choleraesuis suggests novel targets of NFkappaB are activated in the mesenteric lymph node. BMC Genomics 2008; 9:437. [PMID: 18811943 PMCID: PMC2570369 DOI: 10.1186/1471-2164-9-437] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 03/19/2008] [Accepted: 09/23/2008] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Specific knowledge of the molecular pathways controlling host-pathogen interactions can increase our understanding of immune response biology as well as provide targets for drug development and genetic improvement of disease resistance. Toward this end, we have characterized the porcine transcriptional response to Salmonella enterica serovar Choleraesuis (S. Choleraesuis), a Salmonella serovar that predominately colonizes swine, yet can cause serious infections in human patients. Affymetrix technology was used to screen for differentially expressed genes in pig mesenteric lymph nodes (MLN) responding to infection with S. Choleraesuis at acute (8 hours (h), 24 h and 48 h post-inoculation (pi)) and chronic stages (21 days (d) pi). RESULTS Analysis of variance with false discovery rate control identified 1,853 genes with significant changes in expression level (p-value < 0.01, q-value < 0.26, and fold change (FC) > 2) during infection as compared to un-inoculated control pigs. Down-regulation of translation-related genes at 8 hpi and 24 hpi implied that S. Choleraesuis repressed host protein translation. Genes involved in the Th1, innate immune/inflammation response and apoptosis pathways were induced significantly. However, antigen presentation/dendritic cell (DC) function pathways were not affected significantly during infection. A strong NFkappaB-dependent response was observed, as 58 known NFkappaB target genes were induced at 8, 24 and/or 48 hpi. Quantitative-PCR analyses confirmed the microarray data for 21 of 22 genes tested. Based on expression patterns, these target genes can be classified as an "Early" group (induced at either 8 or 24 hpi) and a "Late" group (induced only at 48 hpi). Cytokine activity or chemokine activity were enriched within the Early group genes GO annotations, while the Late group was predominantly composed of signal transduction and cell metabolism annotated genes. 
Regulatory motif analysis of the human orthologous promoters for both Early and Late genes revealed that 241 gene promoters were predicted to contain NFkappaB binding sites, and that of these, 51 Early and 145 Late genes were previously not known to be NFkappaB targets. CONCLUSION Our study provides novel genome-wide transcriptional profiling data on the porcine response to S. Choleraesuis and expands the understanding of NFkappaB signaling in response to Salmonella infection. Comparison of the magnitude and timing of porcine MLN transcriptional response to different Salmonella serovars, S. Choleraesuis and S. Typhimurium, clearly showed a larger but later transcriptional response to S. Choleraesuis. Both microarray and QPCR data provided evidence of a strong NFkappaB-dependent host transcriptional response during S. Choleraesuis infection. Our data indicate that a lack of strong DC-mediated antigen presentation in the MLN may cause S. Choleraesuis infected pigs to develop a systemic infection, and our analysis predicts nearly 200 novel NFkappaB target genes which may be applicable across mammalian species.
Affiliation(s)
- Yanfang Wang
- Department of Animal Science, and Center for Integrated Animal Genomics, Iowa State University, 2255 Kildee Hall, Ames, IA 50010, USA.
|
35
|
Gotea V, Ovcharenko I. DiRE: identifying distant regulatory elements of co-expressed genes. Nucleic Acids Res 2008; 36:W133-9. [PMID: 18487623 PMCID: PMC2447744 DOI: 10.1093/nar/gkn300] [Citation(s) in RCA: 105] [Impact Index Per Article: 6.6] [Received: 12/17/2007] [Revised: 04/23/2008] [Accepted: 04/29/2008] [Indexed: 11/13/2022] Open
Abstract
Regulation of gene expression in eukaryotic genomes is established through a complex cooperative activity of proximal promoters and distant regulatory elements (REs) such as enhancers, repressors and silencers. We have developed a web server named DiRE, based on the Enhancer Identification (EI) method, for predicting distant regulatory elements in higher eukaryotic genomes, namely for determining their chromosomal location and functional characteristics. The server uses gene co-expression data, comparative genomics and profiles of transcription factor binding sites (TFBSs) to determine TFBS-association signatures that can be used for discriminating specific regulatory functions. DiRE's unique feature is its ability to detect REs outside of proximal promoter regions, as it takes advantage of the full gene locus to conduct the search. DiRE can predict common REs for any set of input genes for which the user has prior knowledge of co-expression, co-function or other biologically meaningful grouping. The server predicts function-specific REs consisting of clusters of specifically-associated TFBSs and it also scores the association of individual transcription factors (TFs) with the biological function shared by the group of input genes. Its integration with the Array2BIO server allows users to start their analysis with raw microarray expression data. The DiRE web server is freely available at http://dire.dcode.org.
Affiliation(s)
- Ivan Ovcharenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894
|
36
|
Bodén M, Bailey TL. Associating transcription factor-binding site motifs with target GO terms and target genes. Nucleic Acids Res 2008; 36:4108-17. [PMID: 18544606 PMCID: PMC2475605 DOI: 10.1093/nar/gkn374] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 01/19/2023] Open
Abstract
The roles and target genes of many transcription factors (TFs) are still unknown. To predict the roles of TFs, we present a computational method for associating Gene Ontology (GO) terms with TF-binding motifs. The method works by ranking all genes as potential targets of the TF, and reporting GO terms that are significantly associated with highly ranked genes. We also present an approach, whereby these predicted GO terms can be used to improve predictions of TF target genes. This uses a novel gene-scoring function that reflects the insight that genes annotated with GO terms predicted to be associated with the TF are more likely to be its targets. We construct validation sets of GO terms highly associated with known targets of various yeast and human TF. On the yeast reference sets, our prediction method identifies at least one correct GO term for 73% of the TF, 49% of the correct GO terms are predicted and almost one-third of the predicted GO terms are correct. Results on human reference sets are similarly encouraging. Validation of our target gene prediction method shows that its accuracy exceeds that of simple motif scanning.
Affiliation(s)
- Mikael Bodén
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia
|
37
|
Kim NK, Tharakaraman K, Mariño-Ramírez L, Spouge JL. Finding sequence motifs with Bayesian models incorporating positional information: an application to transcription factor binding sites. BMC Bioinformatics 2008; 9:262. [PMID: 18533028 PMCID: PMC2432075 DOI: 10.1186/1471-2105-9-262] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 10/29/2007] [Accepted: 06/04/2008] [Indexed: 12/03/2022] Open
Abstract
Background Biologically active sequence motifs often have positional preferences with respect to a genomic landmark. For example, many known transcription factor binding sites (TFBSs) occur within an interval [-300, 0] bases upstream of a transcription start site (TSS). Although some programs for identifying sequence motifs exploit positional information, most of them model it only implicitly and with ad hoc methods, making them unsuitable for general motif searches. Results A-GLAM, a user-friendly computer program for identifying sequence motifs, now incorporates a Bayesian model systematically combining sequence and positional information. A-GLAM's predictions with and without positional information were compared on two human TFBS datasets, each containing sequences corresponding to the interval [-2000, 0] bases upstream of a known TSS. A rigorous statistical analysis showed that positional information significantly improved the prediction of sequence motifs, and an extensive cross-validation study showed that A-GLAM's model was robust against mild misspecification of its parameters. As expected, when sequences in the datasets were successively truncated to the intervals [-1000, 0], [-500, 0] and [-250, 0], positional information aided motif prediction less and less, but never hurt it significantly. Conclusion Although sequence truncation is a viable strategy when searching for biologically active motifs with a positional preference, a probabilistic model (used reasonably) generally provides a superior and more robust strategy, particularly when the sequence motifs' positional preferences are not well characterized.
Affiliation(s)
- Nak-Kyeong Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
38
|
Abstract
Common human diseases like obesity and diabetes are driven by complex networks of genes and any number of environmental factors. To understand this complexity in hopes of identifying targets and developing drugs against disease, a systematic approach is required to elucidate the genetic and environmental factors and interactions among and between these factors, and to establish how these factors induce changes in gene networks that in turn lead to disease. The explosion of large-scale, high-throughput technologies in the biological sciences has enabled researchers to take a more systems biology approach to study complex traits like disease. Genotyping of hundreds of thousands of DNA markers and profiling tens of thousands of molecular phenotypes simultaneously in thousands of individuals is now possible, and this scale of data is making it possible for the first time to reconstruct whole gene networks associated with disease. In the following sections, we review different approaches for integrating genetic expression and clinical data to infer causal relationships among gene expression traits and between expression and disease traits. We further review methods to integrate these data in a more comprehensive manner to identify common pathways shared by the causal factors driving disease, including the reconstruction of association and probabilistic causal networks. Particular attention is paid to integrating diverse information to refine these types of networks so that they are more predictive. To highlight these different approaches in practice, we step through an example on how Insig2 was identified as a causal factor for plasma cholesterol levels in mice.
|
39
|
Romero DG, Plonczynski MW, Welsh BL, Gomez-Sanchez CE, Zhou MY, Gomez-Sanchez EP. Gene expression profile in rat adrenal zona glomerulosa cells stimulated with aldosterone secretagogues. Physiol Genomics 2007; 32:117-27. [PMID: 17895393 DOI: 10.1152/physiolgenomics.00145.2007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022] Open
Abstract
The mineralocorticoid aldosterone, mainly produced by the adrenal gland, is essential for life, but an abnormally excessive secretion causes severe pathological effects including hypertension and target organ injury in the heart and kidney. The aim of this study was to determine the gene regulatory network triggered by aldosterone secretagogues in a nontransformed cell system. Freshly isolated rat adrenal zona glomerulosa cells were stimulated with the two main aldosterone secretagogues, angiotensin II and potassium, for 2 h and subjected to whole genome expression studies using multiple biological and bioinformatics tools. Several genes were differentially expressed by ANG II (n = 133) or potassium (n = 216). Genes belonging to the nucleic acid binding and transcription factor activity categories were significantly enriched. A subset of the most regulated genes was confirmed by real-time RT-PCR, and then their expression was analyzed in time curve studies. Differentially expressed genes were grouped according to their time response expression pattern, and their promoter regions were analyzed for common regulatory transcription factor binding sites. Finally, data mining with gene promoters, transcription factors, and literature databases was performed to generate gene interaction networks for either ANG II or potassium. This paper provides for the first time a complete study of the genes that are regulated, and the interaction between them, by aldosterone secretagogues in rat adrenal cells. Increasing our knowledge of adrenal physiology and gene regulation in nontransformed cell systems could lead us to a better approach for the discovery of candidate genes involved in pathological conditions of the adrenal cortex.
Affiliation(s)
- Damian G Romero
- Division of Endocrinology, G. V. (Sonny) Montgomery Veterans Affairs Medical Center, Jackson, MS 39216, USA.
|
40
|
Abstract
MOTIVATION Most of the available tools for transcription factor binding site prediction are based on methods which assume no sequence dependence between the binding site base positions. Our primary objective was to investigate the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and to use the resulting data to develop improved scoring functions for binding-site prediction. RESULTS Using three statistical tests, we analyzed the number of binding sites showing dependent positions. We analyzed transcription factor-DNA crystal structures for evidence of position dependence. Our final conclusions were that some factors show evidence of dependencies whereas others do not. We observed that the conformational energy (Z-score) of the transcription factor-DNA complexes was lower (better) for sequences that showed dependency than for those that did not (P < 0.02). We suggest that where evidence exists for dependencies, these should be modeled to improve binding-site predictions. However, when no significant dependency is found, this correction should be omitted. This may be done by converting any existing scoring function which assumes independence into a form which includes a dependency correction. We present an example of such an algorithm and its implementation as a web tool. AVAILABILITY http://promoterplot.fmi.ch/cgi-bin/dep.html
Affiliation(s)
- Andrija Tomovic
- Friedrich Miescher Institute for Biomedical Research, Novartis Research Foundation, Basel, Switzerland
|
41
|
In silico identification of NF-kappaB-regulated genes in pancreatic beta-cells. BMC Bioinformatics 2007; 8:55. [PMID: 17302974 PMCID: PMC1810323 DOI: 10.1186/1471-2105-8-55] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 10/05/2006] [Accepted: 02/15/2007] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Pancreatic beta-cells are the target of an autoimmune attack in type 1 diabetes mellitus (T1DM). This is mediated in part by cytokines, such as interleukin (IL)-1beta and interferon (IFN)-gamma. These cytokines modify the expression of hundreds of genes, leading to beta-cell dysfunction and death by apoptosis. Several of these cytokine-induced genes are potentially regulated by the IL-1beta-activated transcription factor (TF) nuclear factor (NF)-kappaB, and previous studies by our group have shown that cytokine-induced NF-kappaB activation is pro-apoptotic in beta-cells. To identify NF-kappaB-regulated gene networks in beta-cells, we used a discriminant analysis-based approach to predict NF-kappaB-responding genes on the basis of putative regulatory elements. RESULTS The performance of linear and quadratic discriminant analysis (LDA, QDA) in identifying NF-kappaB-responding genes was examined on a dataset of 240 positive and negative examples of NF-kappaB regulation, using stratified cross-validation with an internal leave-one-out cross-validation (LOOCV) loop for automated feature selection and noise reduction. LDA performed slightly better than QDA, achieving 61% sensitivity, 91% specificity and 87% positive predictive value, and allowing the identification of 231, 251 and 580 putative NF-kappaB target genes in insulin-producing INS-1E cells, primary rat beta-cells and human pancreatic islets, respectively. Predicted NF-kappaB targets had a significant enrichment in genes regulated by cytokines (IL-1beta or IL-1beta + IFN-gamma) and double-stranded RNA (dsRNA), as compared to genes not regulated by these NF-kappaB-dependent stimuli. We increased the confidence of the predictions by selecting only evolutionarily stable genes, i.e. genes with homologs predicted as NF-kappaB targets in rat, mouse, human and chimpanzee.
CONCLUSION The present in silico analysis allowed us to identify novel regulatory targets of NF-kappaB using a supervised classification method based on putative binding motifs. This provides new insights into the gene networks regulating cytokine-induced beta-cell dysfunction and death.
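For readers less familiar with the three performance figures quoted above, the following minimal sketch shows how sensitivity, specificity and positive predictive value are derived from a confusion matrix. The counts are purely illustrative and are not taken from the paper, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true targets recovered
        "specificity": tn / (tn + fp),  # fraction of non-targets correctly rejected
        "ppv": tp / (tp + fp),          # fraction of predicted targets that are real
    }

# Illustrative counts only (assumption, not the study's data).
metrics = classification_metrics(tp=30, fp=10, tn=90, fn=20)
print(metrics)
```

When the aim is a short list of candidate targets for follow-up, a high PPV usually matters more than sensitivity, because each false positive costs validation effort.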
|
42
|
Karmaker A, Harris SE, Kwek S. Constructing human transcriptional regulatory subnets from crossgenome comparison and gene expression profile analysis. OMICS: A Journal of Integrative Biology 2007; 11:397-412. [PMID: 18092911 DOI: 10.1089/omi.2007.0028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
With the completion of the Human Genome Project (HGP), comprehensively understanding the complex interaction between trans- and cis-regulatory elements and identifying these potential functional elements have become fundamental problems in functional genomics. Although many computational approaches have been developed for lower eukaryotes and prokaryotes, most do not generalize to vertebrates. Here, we use a decay function to characterize transcriptional behavior and analyze correlations in human and mouse gene expression profiles to construct coregulated gene groups. Using these two closely related species, we perform comparative genome analysis and identify target genes and conserved functional cis-regulatory elements by motif overrepresentation. Moreover, we present experimental evidence (ChIP-chip) for E2F to support our findings.
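The abstract above does not say which statistic was used to call a motif overrepresented. As a hedged illustration only, one common choice is a one-sided hypergeometric test. It asks how likely it is to see at least the observed number of motif-bearing promoters in a coregulated group if the group were drawn at random from the genome. The function name `hypergeom_tail` and all numbers are hypothetical.

```python
from math import comb

def hypergeom_tail(population, motif_total, group_size, motif_in_group):
    """P(X >= motif_in_group) when drawing group_size genes without replacement
    from a population in which motif_total genes carry the motif."""
    upper = min(group_size, motif_total)
    numerator = sum(
        comb(motif_total, i) * comb(population - motif_total, group_size - i)
        for i in range(motif_in_group, upper + 1)
    )
    return numerator / comb(population, group_size)

# Made-up example: 1000 genes, 100 carry the motif, and a coregulated
# group of 20 genes contains 8 motif carriers.
p_value = hypergeom_tail(population=1000, motif_total=100,
                         group_size=20, motif_in_group=8)
print(p_value)
```

A small tail probability suggests the motif is enriched in the group beyond chance. In practice the p-values would also need correction for the number of motifs tested.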
Affiliation(s)
- Amitava Karmaker
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, USA.
|