For: Hannenhalli S. Eukaryotic transcription factor binding sites--modeling and integrative search methods. Bioinformatics 2008;24:1325-31. [PMID: 18426806 DOI: 10.1093/bioinformatics/btn198] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Indexed: 01/17/2023] Open

Cited by Other Article(s)

Protein tagging for chromatin immunoprecipitation from Arabidopsis. Methods Mol Biol 2011;678:199-210. [PMID: 20931382 DOI: 10.1007/978-1-60761-682-5_15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/07/2023]

Li G, Chan TM, Leung KS, Lee KH. A cluster refinement algorithm for motif discovery. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2010;7:654-668. [PMID: 21030733 DOI: 10.1109/tcbb.2009.25] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 05/30/2023]

Transcription factor binding variation in the evolution of gene regulation. Trends Genet 2010;26:468-75. [PMID: 20864205 DOI: 10.1016/j.tig.2010.08.005] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Received: 08/09/2010] [Revised: 08/22/2010] [Accepted: 08/22/2010] [Indexed: 01/17/2023]

Li MJ, Sham PC, Wang J. FastPval: a fast and memory efficient program to calculate very low P-values from empirical distribution. Bioinformatics 2010;26:2897-9. [PMID: 20861029 PMCID: PMC2971576 DOI: 10.1093/bioinformatics/btq540] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022]

Piechota M, Korostynski M, Przewlocki R. Identification of cis-regulatory elements in the mammalian genome: the cREMaG database. PLoS One 2010;5:e12465. [PMID: 20824209 PMCID: PMC2930848 DOI: 10.1371/journal.pone.0012465] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 02/10/2010] [Accepted: 08/02/2010] [Indexed: 12/20/2022] Open

Ramsey SA, Knijnenburg TA, Kennedy KA, Zak DE, Gilchrist M, Gold ES, Johnson CD, Lampano AE, Litvak V, Navarro G, Stolyar T, Aderem A, Shmulevich I. Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites. Bioinformatics 2010;26:2071-5. [PMID: 20663846 PMCID: PMC2922897 DOI: 10.1093/bioinformatics/btq405] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022]

Tagu D, Dugravot S, Outreman Y, Rispe C, Simon JC, Colella S. The anatomy of an aphid genome: From sequence to biology. C R Biol 2010;333:464-73. [DOI: 10.1016/j.crvi.2010.03.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]

Laurila K, Yli-Harja O, Lähdesmäki H. A protein-protein interaction guided method for competitive transcription factor binding improves target predictions. Nucleic Acids Res 2010;37:e146. [PMID: 19786498 PMCID: PMC2794167 DOI: 10.1093/nar/gkp789] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/19/2022] Open

Reid JE, Evans KJ, Dyer N, Wernisch L, Ott S. Variable structure motifs for transcription factor binding sites. BMC Genomics 2010;11:30. [PMID: 20074339 PMCID: PMC2824720 DOI: 10.1186/1471-2164-11-30] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 08/11/2009] [Accepted: 01/14/2010] [Indexed: 02/06/2023] Open

Abstract

Background

Classically, models of DNA-transcription factor binding sites (TFBSs) have been based on relatively few known instances and have treated them as sites of fixed length using position weight matrices (PWMs). Various extensions to this model have been proposed, most of which take account of dependencies between the bases in the binding sites. However, some transcription factors are known to exhibit some flexibility and bind to DNA in more than one possible physical configuration. In some cases this variation is known to affect the function of binding sites. With the increasing volume of ChIP-seq data available it is now possible to investigate models that incorporate this flexibility. Previous work on variable length models has been constrained by: a focus on specific zinc finger proteins in yeast using restrictive models; a reliance on hand-crafted models for just one transcription factor at a time; and a lack of evaluation on realistically sized data sets.
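As a point of reference for the fixed-length PWM model described above, the following is a minimal Python sketch of log-odds scoring. The count matrix, the uniform background, the pseudocount, and the function names (`build_pwm`, `score_site`, `best_hit`) are all illustrative assumptions and are not taken from the cited paper; the variable-length (gapped) extension is not shown.

```python
import math

# Hypothetical base counts for a 4-bp motif (one dict per position).
# These numbers are made up for illustration only.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def build_pwm(counts, background, pseudocount=1.0):
    """Turn per-position counts into log2-odds weights against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return pwm


def score_site(pwm, site):
    """Sum the per-position weights; positions are treated as independent."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal the PWM width")
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring fixed-length window."""
    width = len(pwm)
    return max(
        (score_site(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


pwm = build_pwm(COUNTS, BACKGROUND)
score, start = best_hit(pwm, "TTACGTTT")
print(f"best window starts at {start} with score {score:.2f}")
```

Because every window has the same width and each position contributes independently, this sketch cannot represent a factor that binds in more than one spacing, which is the limitation the abstract goes on to address.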

Results

We re-analysed binding sites from the TRANSFAC database and found motivating examples where our new variable length model provides a better fit. We analysed several ChIP-seq data sets with a novel motif search algorithm and compared the results to one of the best standard PWM finders and a recently developed alternative method for finding motifs of variable structure. All the methods performed comparably in held-out cross validation tests. Known motifs of variable structure were recovered for p53, Stat5a and Stat5b. In addition our method recovered a novel generalised version of an existing PWM for Sp1 that allows for variable length binding. This motif improved classification performance.

Conclusions

We have presented a new gapped PWM model for variable length DNA binding sites that is neither too restrictive nor over-parameterised. Our comparison with existing tools shows that on average it does not have better predictive accuracy than existing methods. However, it does provide more interpretable models of motifs of variable structure that are suitable for follow-up structural studies. To our knowledge, we are the first to apply variable length motif models to eukaryotic ChIP-seq data sets and consequently the first to show their value in this domain. The results include a novel motif for the ubiquitous transcription factor Sp1.

Huttenhower C, Mutungu KT, Indik N, Yang W, Schroeder M, Forman JJ, Troyanskaya OG, Coller HA. Detailing regulatory networks through large scale data integration. Bioinformatics 2009;25:3267-74. [PMID: 19825796 DOI: 10.1093/bioinformatics/btp588] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open

Oh YM, Kim JK, Choi Y, Choi S, Yoo JY. Prediction and experimental validation of novel STAT3 target genes in human cancer cells. PLoS One 2009;4:e6911. [PMID: 19730699 PMCID: PMC2731854 DOI: 10.1371/journal.pone.0006911] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 04/02/2009] [Accepted: 08/03/2009] [Indexed: 11/23/2022] Open

Marco A, Konikoff C, Karr TL, Kumar S. Relationship between gene co-expression and sharing of transcription factor binding sites in Drosophila melanogaster. Bioinformatics 2009;25:2473-7. [PMID: 19633094 DOI: 10.1093/bioinformatics/btp462] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 01/16/2023] Open

Marschall T, Rahmann S. Efficient exact motif discovery. Bioinformatics 2009;25:i356-64. [PMID: 19478010 PMCID: PMC2687942 DOI: 10.1093/bioinformatics/btp188] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 01/04/2023] Open

Abstract

Motivation: The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif.

Results: We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis.
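The two occurrence statistics named in point (ii) can be illustrated with a short Python sketch that counts matches of an IUPAC pattern. This is only a counting helper under stated assumptions: the function names are hypothetical, only the forward strand is scanned, overlapping matches are counted, and the compound Poisson p-value computation described in the abstract is not implemented here.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC string into a regex that finds overlapping matches."""
    body = "".join("[" + IUPAC[symbol] + "]" for symbol in pattern.upper())
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return re.compile("(?=" + body + ")")


def occurrence_counts(pattern, sequences):
    """Return (total occurrences, number of sequences with at least one)."""
    regex = iupac_to_regex(pattern)
    per_sequence = [
        sum(1 for _ in regex.finditer(sequence.upper()))
        for sequence in sequences
    ]
    total = sum(per_sequence)
    sequences_with_hit = sum(1 for count in per_sequence if count > 0)
    return total, sequences_with_hit


# Toy input, invented for illustration.
total, with_hit = occurrence_counts("ANA", ["AAAA", "CCCC", "ATAGA"])
print(total, with_hit)
```

Either count would then be compared against its null distribution to obtain a p-value; how that null distribution is derived is the substance of the cited method and is outside this sketch.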

Availability and Implementation: The method has been implemented in Java. It can be obtained from http://ls11-www.cs.tu-dortmund.de/people/marschal/paa_md/

Contact: tobias.marschall@tu-dortmund.de; sven.rahmann@tu-dortmund.de

Hou L, Qian MP, Zhu YP, Deng MH. Advances on bioinformatic research in transcription factor binding sites. Yi Chuan = Hereditas 2009;31:365-73. [DOI: 10.3724/sp.j.1005.2009.00365] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]

Narlikar L, Ovcharenko I. Identifying regulatory elements in eukaryotic genomes. Briefings in Functional Genomics and Proteomics 2009;8:215-30. [PMID: 19498043 DOI: 10.1093/bfgp/elp014] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 12/30/2022]

Luca F, Kashyap S, Southard C, Zou M, Witonsky D, Di Rienzo A, Conzen SD. Adaptive variation regulates the expression of the human SGK1 gene in response to stress. PLoS Genet 2009;5:e1000489. [PMID: 19461886 PMCID: PMC2679193 DOI: 10.1371/journal.pgen.1000489] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 02/25/2009] [Accepted: 04/22/2009] [Indexed: 12/22/2022] Open

Abstract

The Serum and Glucocorticoid-regulated Kinase1 (SGK1) gene is a target of the glucocorticoid receptor (GR) and is central to the stress response in many human tissues. Because environmental stress varies across habitats, we hypothesized that natural selection shaped the geographic distribution of genetic variants regulating the level of SGK1 expression following GR activation. By combining population genetics and molecular biology methods, we identified a variant (rs9493857) with marked allele frequency differences between populations of African and European ancestry and with a strong correlation between allele frequency and latitude in worldwide population samples. This SNP is located in a GR-binding region upstream of SGK1 that was identified using a GR ChIP-chip. SNP rs9493857 also lies within a predicted binding site for Oct1, a transcription factor known to cooperate with the GR in the transactivation of target genes. Using ChIP assays, we show that both GR and Oct1 bind to this region and that the ancestral allele at rs9493857 binds the GR-Oct1 complex more efficiently than the derived allele. Finally, using a reporter gene assay, we demonstrate that the ancestral allele is associated with increased glucocorticoid-dependent gene expression when compared to the derived allele. Our results suggest a novel paradigm in which hormonal responsiveness is modulated by sequence variation in the regulatory regions of nuclear receptor target genes. Identifying such functional variants may shed light on the mechanisms underlying inter-individual variation in response to environmental stressors and to hormonal therapy, as well as in the susceptibility to hormone-dependent diseases.

Susceptibility to many common human diseases including hypertension, heart disease, and the metabolic syndrome is associated with increased neuroendocrine signaling in response to environmental stressors. A key component of the human stress response involves increased systemic glucocorticoid secretion that in turn leads to glucocorticoid receptor (GR) activation. As a result, a variety of GR-expressing cell types undergo gene expression changes, thereby providing an integrated physiological response to stress. The SGK1 gene is a well-established GR target that promotes cellular homeostasis in response to stress. Here, we use a combination of population genetics and molecular biology approaches to identify an SNP (rs9493857) in a distant SGK1 GR-binding region with unusually large differences in allele frequency between populations of European and African ancestry. Furthermore, rs9493857 shows a strong correlation between allele frequency and distance from the equator, a pattern consistent with a varying selective advantage across environments. Indeed, the ancestral allele at rs9493857 results in increased GR-binding and glucocorticoid-regulated gene expression, suggesting that an increased stress response (i.e., glucocorticoid responsiveness) was advantageous in ancestral human populations. We speculate that, in modern times, such variation could favor the negative effects of a heightened glucocorticoid response, potentially predisposing individuals to chronic diseases such as metabolic syndrome and hypertension.

Courchesne NMD, Parisien A, Wang B, Lan CQ. Enhancement of lipid production using biochemical, genetic and transcription factor engineering approaches. J Biotechnol 2009;141:31-41. [PMID: 19428728 DOI: 10.1016/j.jbiotec.2009.02.018] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Received: 08/19/2008] [Revised: 02/15/2009] [Accepted: 02/20/2009] [Indexed: 01/03/2023]

Das D, Pellegrini M, Gray JW. A primer on regression methods for decoding cis-regulatory logic. PLoS Comput Biol 2009;5:e1000269. [PMID: 19180174 PMCID: PMC2607548 DOI: 10.1371/journal.pcbi.1000269] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/17/2022] Open

Hao P, Yu Y, Zhang X, Tu K, Fan H, Zhong Y. The contribution of cis-regulatory elements to head-to-head gene pairs' co-expression pattern. Science in China. Series C, Life Sciences 2009;52:74-9. [PMID: 19152086 DOI: 10.1007/s11427-009-0004-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2008] [Accepted: 08/06/2008] [Indexed: 10/21/2022]

Cai Y, He J, Li X, Lu L, Yang X, Feng K, Lu W, Kong X. A Novel Computational Approach To Predict Transcription Factor DNA Binding Preference. J Proteome Res 2008;8:999-1003. [DOI: 10.1021/pr800717y] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]

Affiliation(s)

Yudong Cai, JianFeng He, XinLei Li, Lin Lu, XinYi Yang, KaiYan Feng, WenCong Lu, XiangYin Kong: CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China, Department of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200040, People’s Republic of China, Institute of Health Sciences, Shanghai Jiao Tong University School of Medicine and Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200025, China, Division of Imaging Science & Biomedical

Persikov AV, Osada R, Singh M. Predicting DNA recognition by Cys2His2 zinc finger proteins. Bioinformatics 2008;25:22-9. [PMID: 19008249 DOI: 10.1093/bioinformatics/btn580] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.4] [Indexed: 11/13/2022]

Abstract

MOTIVATION

Cys2His2 zinc finger (ZF) proteins represent the largest class of eukaryotic transcription factors. Their modular structure and well-conserved protein-DNA interface allow the development of computational approaches for predicting their DNA-binding preferences even when no binding sites are known for a particular protein. The 'canonical model' for ZF protein-DNA interaction consists of only four amino acid-nucleotide contacts per zinc finger domain.

RESULTS

We present an approach for predicting ZF binding based on support vector machines (SVMs). While most previous computational approaches have been based solely on examples of known ZF protein-DNA interactions, ours additionally incorporates information about protein-DNA pairs known to bind weakly or not at all. Moreover, SVMs with a linear kernel can naturally incorporate constraints about the relative binding affinities of protein-DNA pairs; this type of information has not been used previously in predicting ZF protein-DNA binding. Here, we build a high-quality literature-derived experimental database of ZF-DNA binding examples and utilize it to test both linear and polynomial kernels for predicting ZF protein-DNA binding on the basis of the canonical binding model. The polynomial SVM outperforms previously published prediction procedures as well as the linear SVM. This may indicate the presence of dependencies between contacts in the canonical binding model and suggests that modification of the underlying structural model may result in further improved performance in predicting ZF protein-DNA binding. Overall, this work demonstrates that methods incorporating information about non-binding and relative binding of protein-DNA pairs have great potential for effective prediction of protein-DNA interactions.
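The remark that a linear kernel can absorb relative binding affinities can be made concrete with a standard pairwise-ranking formulation. This is a generic sketch and not necessarily the formulation used in the cited paper; all symbols are assumptions introduced here. Let $x_i$ be the feature vector of protein-DNA pair $i$, $w$ the weight vector of a linear scoring function $f(x) = w^{\top} x$, and $P$ the set of ordered pairs $(i, j)$ for which pair $i$ is known to bind more strongly than pair $j$.

```latex
\begin{aligned}
\min_{w,\;\xi \ge 0}\quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{(i,j) \in P} \xi_{ij} \\
\text{subject to}\quad & w^{\top}(x_i - x_j) \ge 1 - \xi_{ij}
  \qquad \text{for all } (i,j) \in P .
\end{aligned}
```

Because $f$ is linear, $f(x_i) - f(x_j) = w^{\top}(x_i - x_j)$, so each affinity comparison becomes an ordinary margin constraint on the difference vector $x_i - x_j$, with slack $\xi_{ij}$ and penalty $C$ playing their usual soft-margin roles. With a non-linear kernel the score is no longer a plain dot product with $w$, so this rewriting in terms of explicit difference vectors does not apply directly.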

AVAILABILITY

An online tool for predicting ZF DNA binding is available at http://compbio.cs.princeton.edu/zf/.
