1
He AY, Danko CG. Dissection of core promoter syntax through single nucleotide resolution modeling of transcription initiation. bioRxiv 2024:2024.03.13.583868. [PMID: 38559255 PMCID: PMC10979970 DOI: 10.1101/2024.03.13.583868] [Indexed: 04/04/2024]
Abstract
How the DNA sequence of cis-regulatory elements encodes transcription initiation patterns remains poorly understood. Here we introduce CLIPNET, a deep learning model trained on population-scale PRO-cap data that predicts the position and quantity of transcription initiation with single nucleotide resolution from DNA sequence more accurately than existing approaches. Interpretation of CLIPNET revealed a complex regulatory syntax consisting of DNA-protein interactions in five major positions between -200 and +50 bp relative to the transcription start site, as well as more subtle positional preferences among transcriptional activators. Transcriptional activator and core promoter motifs work non-additively to encode distinct aspects of initiation, with the former driving initiation quantity and the latter initiation position. We identified core promoter motifs that explain initiation patterns in the majority of promoters and enhancers, including DPR motifs and AT-rich TBP binding sequences in TATA-less promoters. Our results provide insights into the sequence architecture governing transcription initiation.
Affiliation(s)
- Adam Y. He
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University
- Graduate Field of Computational Biology, Cornell University
- Charles G. Danko
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University
- Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University
2
Su BG, Vos SM. Distinct negative elongation factor conformations regulate RNA polymerase II promoter-proximal pausing. Mol Cell 2024; 84:1243-1256.e5. [PMID: 38401543 PMCID: PMC10997474 DOI: 10.1016/j.molcel.2024.01.023] [Received: 10/11/2023] [Revised: 12/17/2023] [Accepted: 01/25/2024] [Indexed: 02/26/2024]
Abstract
Metazoan gene expression regulation involves pausing of RNA polymerase II (Pol II) in the promoter-proximal region of genes, a state that is stabilized by DSIF and NELF. Upon depletion of elongation factors, NELF appears to accompany elongating Pol II past pause sites; however, prior work indicates that NELF prevents Pol II elongation. Here, we report cryoelectron microscopy structures of Pol II-DSIF-NELF complexes with NELF in two distinct conformations corresponding to paused and poised states. The paused NELF state supports Pol II stalling, whereas the poised NELF state enables transcription elongation as it does not support a tilted RNA-DNA hybrid. Further, the poised NELF state can accommodate TFIIS binding to Pol II, allowing for Pol II reactivation at paused or backtracking sites. Finally, we observe that the NELF-A tentacle interacts with the RPB2 protrusion and is necessary for pausing. Our results define how NELF can support pausing, reactivation, and elongation by Pol II.
Affiliation(s)
- Bonnie G Su
- Department of Biology, Massachusetts Institute of Technology, Building 68, 31 Ames St., Cambridge, MA 02139, USA
- Seychelle M Vos
- Department of Biology, Massachusetts Institute of Technology, Building 68, 31 Ames St., Cambridge, MA 02139, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
3
Zhao Y, Liu L, Hassett R, Siepel A. Model-based characterization of the equilibrium dynamics of transcription initiation and promoter-proximal pausing in human cells. Nucleic Acids Res 2023; 51:e106. [PMID: 37889042 PMCID: PMC10681744 DOI: 10.1093/nar/gkad843] [Received: 03/06/2023] [Revised: 09/13/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023]
Abstract
In metazoans, both transcription initiation and the escape of RNA polymerase (RNAP) from promoter-proximal pausing are key rate-limiting steps in gene expression. These processes play out at physically proximal sites on the DNA template and appear to influence one another through steric interactions. Here, we examine the dynamics of these processes using a combination of statistical modeling, simulation, and analysis of real nascent RNA sequencing data. We develop a simple probabilistic model that jointly describes the kinetics of transcription initiation, pause-escape, and elongation, and the generation of nascent RNA sequencing read counts under steady-state conditions. We then extend this initial model to allow for variability across cells in promoter-proximal pause site locations and steric hindrance of transcription initiation from paused RNAPs. In an extensive series of simulations, we show that this model enables accurate estimation of initiation and pause-escape rates. Furthermore, we show by simulation and analysis of real data that pause-escape is often strongly rate-limiting and that steric hindrance can dramatically reduce initiation rates. Our modeling framework is applicable to a variety of inference problems, and our software for estimation and simulation is freely available.
Affiliation(s)
- Yixin Zhao
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Lingjie Liu
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Graduate Program in Genetics, Stony Brook University, Stony Brook, NY, USA
- Rebecca Hassett
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Graduate Program in Genetics, Stony Brook University, Stony Brook, NY, USA
4
Zhu P, Schon M, Questa J, Nodine M, Dean C. Causal role of a promoter polymorphism in natural variation of the Arabidopsis floral repressor gene FLC. Curr Biol 2023; 33:4381-4391.e3. [PMID: 37729909 DOI: 10.1016/j.cub.2023.08.079] [Received: 02/01/2023] [Revised: 07/06/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023]
Abstract
Noncoding polymorphism frequently associates with phenotypic variation, but causation and mechanism are rarely established. Noncoding single-nucleotide polymorphisms (SNPs) characterize the major haplotypes of the Arabidopsis thaliana floral repressor gene FLOWERING LOCUS C (FLC). This noncoding polymorphism generates a range of FLC expression levels, determining the requirement for and the response to winter cold. The major adaptive determinant of these FLC haplotypes was shown to be the autumnal levels of FLC expression. Here, we investigate how noncoding SNPs influence FLC transcriptional output. We identify an upstream transcription start site (uTSS) cluster at FLC, whose usage is increased by an A variant at the promoter SNP-230. This variant is present in relatively few Arabidopsis accessions, with the majority containing G at this site. We demonstrate a causal role for the A variant at -230 in reduced FLC transcriptional output. The G variant upregulates FLC expression redundantly with the major transcriptional activator FRIGIDA (FRI). We demonstrate an additive interaction of SNP-230 with an intronic SNP+259, which also differentially influences uTSS usage. Combinatorial interactions between noncoding SNPs and transcriptional activators thus generate quantitative variation in FLC transcription that has facilitated the adaptation of Arabidopsis accessions to distinct climates.
Affiliation(s)
- Pan Zhu
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Michael Schon
- Laboratory of Molecular Biology, Wageningen University, Wageningen 6708 PB, the Netherlands; Gregor Mendel Institute, Vienna Biocenter, Vienna 1030, Austria
- Julia Questa
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Michael Nodine
- Laboratory of Molecular Biology, Wageningen University, Wageningen 6708 PB, the Netherlands; Gregor Mendel Institute, Vienna Biocenter, Vienna 1030, Austria
- Caroline Dean
- John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
5
Bernardini A, Hollinger C, Willgenss D, Müller F, Devys D, Tora L. Transcription factor IID parks and drives preinitiation complexes at sharp or broad promoters. Trends Biochem Sci 2023; 48:839-848. [PMID: 37574371 PMCID: PMC10529448 DOI: 10.1016/j.tibs.2023.07.009] [Received: 04/19/2023] [Revised: 07/19/2023] [Accepted: 07/19/2023] [Indexed: 08/15/2023]
Abstract
Core promoters are sites where transcriptional regulatory inputs of a gene are integrated to direct the assembly of the preinitiation complex (PIC) and RNA polymerase II (Pol II) transcription output. Until now, core promoter functions have been investigated by distinct methods, including Pol II transcription initiation site mappings and structural characterization of PICs on distinct promoters. Here, we bring together these previously unconnected observations and hypothesize how, on metazoan TATA promoters, the precisely structured building up of transcription factor (TF) IID-based PICs results in sharp transcription start site (TSS) selection; or, in contrast, how the less strictly controlled positioning of the TATA-less promoter DNA relative to TFIID-core PIC components results in alternative broad TSS selections by Pol II.
Affiliation(s)
- Andrea Bernardini
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67404 Illkirch, France; Centre National de la Recherche Scientifique, UMR7104, 67404 Illkirch, France; Institut National de la Santé et de la Recherche Médicale, U1258, 67404 Illkirch, France; Université de Strasbourg, 67404 Illkirch, France
- Ferenc Müller
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Didier Devys
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67404 Illkirch, France; Centre National de la Recherche Scientifique, UMR7104, 67404 Illkirch, France; Institut National de la Santé et de la Recherche Médicale, U1258, 67404 Illkirch, France; Université de Strasbourg, 67404 Illkirch, France
- László Tora
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67404 Illkirch, France; Centre National de la Recherche Scientifique, UMR7104, 67404 Illkirch, France; Institut National de la Santé et de la Recherche Médicale, U1258, 67404 Illkirch, France; Université de Strasbourg, 67404 Illkirch, France