1
Tahara S, Tsuchiya T, Matsumoto H, Ozaki H. Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans. BMC Genomics 2023; 24:597. [PMID: 37805453 PMCID: PMC10560430 DOI: 10.1186/s12864-023-09692-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/21/2023] [Indexed: 10/09/2023]
Abstract
BACKGROUND Transcription factors (TFs) exhibit heterogeneous DNA-binding specificities in individual cells and whole organisms under natural conditions, and de novo motif discovery usually provides multiple motifs, even from a single chromatin immunoprecipitation-sequencing (ChIP-seq) sample. Despite the accumulation of ChIP-seq data and ChIP-seq-derived motifs, the diversity of DNA-binding specificities across different TFs and cell types remains largely unexplored.
RESULTS Here, we applied MOCCS2, our k-mer-based motif discovery method, to a collection of human TF ChIP-seq samples across diverse TFs and cell types, and systematically computed profiles of TF-binding specificity scores for all k-mers. After quality control, we compiled a set of TF-binding specificity score profiles for 2,976 high-quality ChIP-seq samples, comprising 473 TFs and 398 cell types. Using these high-quality samples, we confirmed that the k-mer-based TF-binding specificity profiles reflected TF- or TF-family dependent DNA-binding specificities. We then compared the binding specificity scores of ChIP-seq samples with the same TFs but with different cell type classes and found that half of the analyzed TFs exhibited differences in DNA-binding specificities across cell type classes. Additionally, we devised a method to detect differentially bound k-mers between two ChIP-seq samples and detected k-mers exhibiting statistically significant differences in binding specificity scores. Moreover, we demonstrated that differences in the binding specificity scores between k-mers on the reference and alternative alleles could be used to predict the effect of variants on TF binding, as validated by in vitro and in vivo assay datasets. Finally, we demonstrated that binding specificity score differences can be used to interpret disease-associated non-coding single-nucleotide polymorphisms (SNPs) as TF-affecting SNPs and provide candidates responsible for TFs and cell types.
CONCLUSIONS Our study provides a basis for investigating the regulation of gene expression in a TF-, TF family-, or cell-type-dependent manner. Furthermore, our differential analysis of binding-specificity scores highlights noncoding disease-associated variants in humans.
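As a rough illustration of the allele comparison mentioned in the RESULTS paragraph, the sketch below sums the score differences between the k-mers that cover a SNP on the alternative and reference alleles. It is a minimal sketch under stated assumptions, not the MOCCS2 implementation: the `scores` dictionary, the function names `kmers_overlapping` and `delta_score`, the choice to sum over all overlapping k-mers, and the default of 0.0 for unscored k-mers are all hypothetical.

```python
def kmers_overlapping(seq, pos, k):
    """Return every k-mer of `seq` that covers the 0-based index `pos`."""
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(seq) - k)
    return [seq[s:s + k] for s in range(first_start, last_start + 1)]


def delta_score(ref_seq, pos, alt_base, scores, k):
    """Sum of (alt score - ref score) over k-mers covering the variant.

    `scores` is a hypothetical mapping from k-mer to a binding
    specificity score; k-mers missing from it are treated as 0.0
    (an assumption made only for this illustration).
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    ref_kmers = kmers_overlapping(ref_seq, pos, k)
    alt_kmers = kmers_overlapping(alt_seq, pos, k)
    return sum(
        scores.get(alt, 0.0) - scores.get(ref, 0.0)
        for ref, alt in zip(ref_kmers, alt_kmers)
    )


# Toy example with made-up scores: a T>A change at index 3.
toy_scores = {"GTA": 2.0, "GAA": 0.5}
toy_delta = delta_score("ACGTACG", 3, "A", toy_scores, k=3)
```

In this toy setup a negative value would indicate that the alternative allele carries lower-scoring k-mers than the reference allele; how such a difference is normalized or tested for significance is not specified here.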
Affiliation(s)
- Saeko Tahara
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - School of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Takaho Tsuchiya
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Hirotaka Matsumoto
  - School of Information and Data Sciences, Nagasaki University, 1-14, Bunkyo-Machi, Nagasaki City, Nagasaki, 852-8521, Japan
  - Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan
- Haruka Ozaki
  - Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
  - Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan
2
Yoshitane H, Asano Y, Sagami A, Sakai S, Suzuki Y, Okamura H, Iwasaki W, Ozaki H, Fukada Y. Functional D-box sequences reset the circadian clock and drive mRNA rhythms. Commun Biol 2019; 2:300. [PMID: 31428688 PMCID: PMC6687812 DOI: 10.1038/s42003-019-0522-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 01/11/2019] [Accepted: 06/28/2019] [Indexed: 01/12/2023]
Abstract
The circadian clock drives gene expression rhythms, leading to daily changes in physiology and behavior. In mammals, Albumin D-site-Binding Protein (DBP) rhythmically activates transcription of various genes through a DNA cis-element, D-box. The DBP-dependent transactivation is repressed by competitive binding of E4BP4 to the D-box. Despite the elaborate regulation, physiological roles of the D-box in the circadian clockwork are still elusive. Here we identified 1490 genomic regions recognized commonly by DBP and E4BP4 in the mouse liver. We comprehensively defined functional D-box sequences using an improved bioinformatics method, MOCCS2. In RNA-Seq analysis of E4bp4-knockout and wild type liver, we showed the importance of E4BP4-mediated circadian repression in gene expression rhythms. In addition to the circadian control, we found that environmental stimuli caused acute induction of E4BP4 protein, evoking phase-dependent phase shifts of cellular circadian rhythms and resetting the clock. Collectively, D-box-mediated transcriptional regulation plays pivotal roles in input and output in the circadian clock system.
Affiliation(s)
- Hikari Yoshitane
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
- Yoshimasa Asano
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
- Aya Sagami
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
- Seinosuke Sakai
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
- Yutaka Suzuki
  - Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwanoha 5-1-5, Kashiwa Chiba, 277-8568 Japan
- Hitoshi Okamura
  - Department of Systems Biology, Graduate School of Pharmaceutical Sciences, Kyoto University, Yoshida-Shimo-Adachi-cho 46-29, Kyoto, 606-8501 Japan
- Wataru Iwasaki
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
- Haruka Ozaki
  - Bioinformatics Laboratory, Faculty of Medicine, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki, 305-8575 Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki, 305-8577 Japan
- Yoshitaka Fukada
  - Department of Biological Sciences, School of Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku Tokyo, 113-0033 Japan
3
Six6 and Six7 coordinately regulate expression of middle-wavelength opsins in zebrafish. Proc Natl Acad Sci U S A 2019; 116:4651-4660. [PMID: 30765521 PMCID: PMC6410792 DOI: 10.1073/pnas.1812884116] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/06/2023]
Abstract
Color discrimination in the vertebrate retina is mediated by a combination of cone cell types expressing UV (SWS1), blue (SWS2), green (RH2), and red (LWS) opsins. Although the tetrachromatic cone system is retained in most nonmammalian vertebrate lineages, the transcriptional mechanism underlying gene expression of cone opsins remains elusive. Here, we found that the retinal transcription factors, sine oculis homeobox 6 (Six6b) and Six7, synergistically and positively regulate gene expression of zebrafish SWS2 and RH2 opsins. Larvae deficient for both of these transcription factors showed heavily impaired visually driven foraging behavior and were unable to compete for food when reared in a group with normal siblings. The results suggest that six6b and six7 play a pivotal role in blue- and green-light sensitivity and daylight vision.

Color discrimination in the vertebrate retina is mediated by a combination of spectrally distinct cone photoreceptors, each expressing one of multiple cone opsins. The opsin genes diverged early in vertebrate evolution into four classes maximally sensitive to varying wavelengths of light: UV (SWS1), blue (SWS2), green (RH2), and red (LWS) opsins. Although the tetrachromatic cone system is retained in most nonmammalian vertebrate lineages, the transcriptional mechanism underlying gene expression of the cone opsins remains elusive, particularly for SWS2 and RH2 opsins, both of which have been lost in the mammalian lineage. In zebrafish, which have all four cone subtypes, rh2 opsin gene expression depends on a homeobox transcription factor, sine oculis homeobox 7 (Six7). However, the six7 gene is found only in the ray-finned fish lineage, suggesting the existence of another evolutionarily conserved transcriptional factor(s) controlling rh2 opsin expression in vertebrates. Here, we found that the reduced rh2 expression caused by six7 deficiency was rescued by forced expression of six6b, which is a six7-related transcription factor conserved widely among vertebrates. The compensatory role of six6b was reinforced by ChIP-sequencing analysis, which revealed a similar pattern of Six6b- and Six7-binding sites within and near the cone opsin genes. TAL effector nuclease-induced genetic ablation of six6b and six7 revealed that they coordinately regulate SWS2 opsin gene expression. Mutant larvae deficient for these transcription factors showed severely impaired visually driven foraging behavior. These results demonstrate that in zebrafish, six6b and six7 govern expression of the SWS2 and RH2 opsins responsible for middle-wavelength sensitivity, which would be physiologically important for daylight vision.