Synthetic STARR-seq reveals how DNA shape and sequence modulate transcriptional output and noise.
PLoS Genet 2018; 14:e1007793. [PMID: 30427832 PMCID: PMC6261644 DOI: 10.1371/journal.pgen.1007793]
[Received: 06/14/2018] [Revised: 11/28/2018] [Accepted: 10/26/2018] [Indexed: 12/29/2022]
Abstract
The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach resulted in the identification of a novel, highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR’s core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed specific DNA shape characteristics, indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single-cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.
The expression level of genes is controlled by transcription factors, which are proteins that bind to genomic response elements that contain their recognition DNA sequence. Importantly, genes are not simply turned on but need to be expressed at the right level. This is, at least in part, assured by the sequence composition of genomic response elements. Here, we studied how the recognition DNA sequence influences gene regulation by a transcription factor called the glucocorticoid receptor. Specifically, we developed a method to test the activity of variants in a highly parallelized setting where everything is kept identical except for the sequence of the binding site. The systematic analysis of tens of thousands of sequence variants facilitated the identification of a previously unknown sequence variant with high activity. Moreover, we report how sequence variation of the response element influences cell-to-cell variability in expression levels. Finally, we observe similar activity profiles for distinct sequence variants that share similar three-dimensional DNA shape characteristics, arguing that the three-dimensional perception of DNA by the glucocorticoid receptor modulates its activity towards individual target genes.