1
Yang JH, Hansen AS. Enhancer selectivity in space and time: from enhancer-promoter interactions to promoter activation. Nat Rev Mol Cell Biol 2024; 25:574-591. [PMID: 38413840 DOI: 10.1038/s41580-024-00710-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/30/2024] [Indexed: 02/29/2024]
Abstract
The primary regulators of metazoan gene expression are enhancers, originally functionally defined as DNA sequences that can activate transcription at promoters in an orientation-independent and distance-independent manner. Despite being crucial for gene regulation in animals, what mechanisms underlie enhancer selectivity for promoters, and more fundamentally, how enhancers interact with promoters and activate transcription, remain poorly understood. In this Review, we first discuss current models of enhancer-promoter interactions in space and time and how enhancers affect transcription activation. Next, we discuss different mechanisms that mediate enhancer selectivity, including repression, biochemical compatibility and regulation of 3D genome structure. Through 3D polymer simulations, we illustrate how the ability of 3D genome folding mechanisms to mediate enhancer selectivity strongly varies for different enhancer-promoter interaction mechanisms. Finally, we discuss how recent technical advances may provide new insights into mechanisms of enhancer-promoter interactions and how technical biases in methods such as Hi-C and Micro-C and imaging techniques may affect their interpretation.
Affiliation(s)
- Jin H Yang
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Koch Institute for Integrative Cancer Research, Cambridge, MA, USA
- Anders S Hansen
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Koch Institute for Integrative Cancer Research, Cambridge, MA, USA
2
Fischer M, Sammons MA. Determinants of p53 DNA binding, gene regulation, and cell fate decisions. Cell Death Differ 2024; 31:836-843. [PMID: 38951700 PMCID: PMC11239874 DOI: 10.1038/s41418-024-01326-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 06/05/2024] [Accepted: 06/10/2024] [Indexed: 07/03/2024] [Open Access]
Abstract
The extent to which transcription factors read and respond to specific information content within short DNA sequences remains an important question that the tumor suppressor p53 is helping us answer. We discuss recent insights into how local information content at p53 binding sites might control modes of p53 target gene activation and cell fate decisions. Significant prior work has yielded data supporting two potential models of how p53 determines cell fate through its target genes: a selective target gene binding and activation model and a p53 level threshold model. Both of these models largely revolve around an analogy of whether p53 is acting in a "smart" or "dumb" manner. Here, we synthesize recent and past studies on p53 decoding of DNA sequence, chromatin context, and cellular signaling cascades to elicit variable cell fates critical in human development, homeostasis, and disease.
Affiliation(s)
- Martin Fischer
  - Computational Biology Group, Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Beutenbergstraße 11, 07745, Jena, Germany
- Morgan A Sammons
  - Department of Biological Sciences and The RNA Institute, The State University of New York at Albany, 1400 Washington Avenue, Albany, NY, 12222, USA
3
Fischer M. Gene regulation by the tumor suppressor p53 - The omics era. Biochim Biophys Acta Rev Cancer 2024; 1879:189111. [PMID: 38740351 DOI: 10.1016/j.bbcan.2024.189111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 05/08/2024] [Accepted: 05/10/2024] [Indexed: 05/16/2024]
Abstract
The transcription factor p53 is activated in response to a variety of cellular stresses and serves as a prominent and potent tumor suppressor. Since its discovery, we have sought to understand how p53 functions as both a transcription factor and a tumor suppressor. Two decades ago, the field of gene regulation entered the omics era and began to study the regulation of entire genomes. The omics perspective has greatly expanded our understanding of p53 functions and has begun to reveal its gene regulatory network. In this mini-review, I discuss recent insights into the p53 transcriptional program from high-throughput analyses.
Affiliation(s)
- Martin Fischer
  - Computational Biology Group, Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Beutenbergstraße 11, 07745 Jena, Germany
4
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Ioannis Mouratidis
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
- Nikol Chantzi
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Yasin Uzun
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Ilias Georgakopoulos-Soares
  - Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
5
Iñiguez-Muñoz S, Llinàs-Arias P, Ensenyat-Mendez M, Bedoya-López AF, Orozco JIJ, Cortés J, Roy A, Forsberg-Nilsson K, DiNome ML, Marzese DM. Hidden secrets of the cancer genome: unlocking the impact of non-coding mutations in gene regulatory elements. Cell Mol Life Sci 2024; 81:274. [PMID: 38902506 DOI: 10.1007/s00018-024-05314-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/07/2023] [Accepted: 06/06/2024] [Indexed: 06/22/2024]
Abstract
Discoveries in the field of genomics have revealed that non-coding genomic regions are not merely "junk DNA", but rather comprise critical elements involved in gene expression. These gene regulatory elements (GREs) include enhancers, insulators, silencers, and gene promoters. Notably, new evidence shows how mutations within these regions substantially influence gene expression programs, especially in the context of cancer. Advances in high-throughput sequencing technologies have accelerated the identification of somatic and germline single nucleotide mutations in non-coding genomic regions. This review provides an overview of somatic and germline non-coding single nucleotide alterations affecting transcription factor binding sites in GREs, specifically involved in cancer biology. It also summarizes the technologies available for exploring GREs and the challenges associated with studying and characterizing non-coding single nucleotide mutations. Understanding the role of GRE alterations in cancer is essential for improving diagnostic and prognostic capabilities in the precision medicine era, leading to enhanced patient-centered clinical outcomes.
Affiliation(s)
- Sandra Iñiguez-Muñoz
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Pere Llinàs-Arias
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Miquel Ensenyat-Mendez
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Andrés F Bedoya-López
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Javier I J Orozco
  - Saint John's Cancer Institute, Providence Saint John's Health Center, Santa Monica, CA, USA
- Javier Cortés
  - International Breast Cancer Center (IBCC), Pangaea Oncology, Quiron Group, 08017, Barcelona, Spain
  - Medica Scientia Innovation Research SL (MEDSIR), 08018, Barcelona, Spain
  - Faculty of Biomedical and Health Sciences, Department of Medicine, Universidad Europea de Madrid, 28670, Madrid, Spain
- Ananya Roy
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Karin Forsberg-Nilsson
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  - University of Nottingham Biodiscovery Institute, Nottingham, UK
- Maggie L DiNome
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA
- Diego M Marzese
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA
6
Hwang H, Jeon H, Yeo N, Baek D. Big data and deep learning for RNA biology. Exp Mol Med 2024:10.1038/s12276-024-01243-w. [PMID: 38871816 DOI: 10.1038/s12276-024-01243-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 06/15/2024] [Open Access]
Abstract
The exponential growth of big data in RNA biology (RB) has led to the development of deep learning (DL) models that have driven crucial discoveries. As constantly evidenced by DL studies in other fields, the successful implementation of DL in RB depends heavily on the effective utilization of large-scale datasets from public databases. In achieving this goal, data encoding methods, learning algorithms, and techniques that align well with biological domain knowledge have played pivotal roles. In this review, we provide guiding principles for applying these DL concepts to various problems in RB by demonstrating successful examples and associated methodologies. We also discuss the remaining challenges in developing DL models for RB and suggest strategies to overcome these challenges. Overall, this review aims to illuminate the compelling potential of DL for RB and ways to apply this powerful technology to investigate the intriguing biology of RNA more effectively.
Affiliation(s)
- Hyeonseo Hwang
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Hyeonseong Jeon
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - Genome4me Inc., Seoul, Republic of Korea
- Nagyeong Yeo
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
- Daehyun Baek
  - School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
  - Genome4me Inc., Seoul, Republic of Korea
7
Yin C, Hair SC, Byeon GW, Bromley P, Meuleman W, Seelig G. Iterative deep learning-design of human enhancers exploits condensed sequence grammar to achieve cell type-specificity. bioRxiv 2024:2024.06.14.599076. [PMID: 38915713 PMCID: PMC11195158 DOI: 10.1101/2024.06.14.599076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show enhancers as short as 50bp can maintain specificity.
Affiliation(s)
- Christopher Yin
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Gun Woo Byeon
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Peter Bromley
  - Altius Institute for Biomedical Sciences, Seattle, WA
- Wouter Meuleman
  - Altius Institute for Biomedical Sciences, Seattle, WA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
- Georg Seelig
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
8
Lalanne JB, Regalado SG, Domcke S, Calderon D, Martin BK, Li X, Li T, Suiter CC, Lee C, Trapnell C, Shendure J. Multiplex profiling of developmental cis-regulatory elements with quantitative single-cell expression reporters. Nat Methods 2024; 21:983-993. [PMID: 38724692 PMCID: PMC11166576 DOI: 10.1038/s41592-024-02260-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 03/22/2024] [Indexed: 06/13/2024]
Abstract
The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
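As a rough illustration of the Poisson counting limit mentioned in this abstract: a Poisson-distributed count with mean N has standard deviation sqrt(N), so the smallest achievable coefficient of variation is 1/sqrt(N). The sketch below is not taken from the paper; the function name `poisson_cv` and the example counts are assumptions made purely for illustration.

```python
import math


def poisson_cv(mean_count: float) -> float:
    """Coefficient of variation (std / mean) of a Poisson count.

    For a Poisson variable the variance equals the mean, so
    CV = sqrt(mean) / mean = 1 / sqrt(mean).
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


# Hypothetical mean reporter counts per cell (illustrative values only).
for n in (1, 100, 10_000):
    print(f"mean count {n:>6}: Poisson-limited CV = {poisson_cv(n):.3f}")
```

Under this assumption, a measurement whose observed variability is close to `poisson_cv(mean)` is limited by counting statistics rather than by the assay itself.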
Affiliation(s)
- Samuel G Regalado
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Silvia Domcke
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Diego Calderon
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Beth K Martin
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Xiaoyi Li
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Tony Li
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Chase C Suiter
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
- Choli Lee
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Cole Trapnell
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
  - Howard Hughes Medical Institute, Seattle, WA, USA
9
Bell CC, Balic JJ, Talarmain L, Gillespie A, Scolamiero L, Lam EYN, Ang CS, Faulkner GJ, Gilan O, Dawson MA. Comparative cofactor screens show the influence of transactivation domains and core promoters on the mechanisms of transcription. Nat Genet 2024; 56:1181-1192. [PMID: 38769457 DOI: 10.1038/s41588-024-01749-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 04/09/2024] [Indexed: 05/22/2024]
Abstract
Eukaryotic transcription factors (TFs) activate gene expression by recruiting cofactors to promoters. However, the relationships between TFs, promoters and their associated cofactors remain poorly understood. Here we combine GAL4-transactivation assays with comparative CRISPR-Cas9 screens to identify the cofactors used by nine different TFs and core promoters in human cells. Using this dataset, we associate TFs with cofactors, classify cofactors as ubiquitous or specific and discover transcriptional co-dependencies. Through a reductionistic, comparative approach, we demonstrate that TFs do not display discrete mechanisms of activation. Instead, each TF depends on a unique combination of cofactors, which influences distinct steps in transcription. By contrast, the influence of core promoters appears relatively discrete. Different promoter classes are constrained by either initiation or pause-release, which influences their dynamic range and compatibility with cofactors. Overall, our comparative cofactor screens characterize the interplay between TFs, cofactors and core promoters, identifying general principles by which they influence transcription.
Affiliation(s)
- Charles C Bell
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
  - Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, Queensland, Australia
- Jesse J Balic
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Laure Talarmain
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Andrea Gillespie
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Laura Scolamiero
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Enid Y N Lam
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Ching-Seng Ang
  - Bio21 Mass Spectrometry and Proteomics Facility, The University of Melbourne, Parkville, Victoria, Australia
- Geoffrey J Faulkner
  - Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, Queensland, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Omer Gilan
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Australian Centre for Blood Diseases, Monash University, Melbourne, Victoria, Australia
- Mark A Dawson
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
  - Department of Haematology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Centre for Cancer Research, University of Melbourne, Melbourne, Victoria, Australia
10
McCann AA, Baniulyte G, Woodstock DL, Sammons MA. Context dependent activity of p63-bound gene regulatory elements. bioRxiv 2024:2024.05.09.593326. [PMID: 38766006 PMCID: PMC11100809 DOI: 10.1101/2024.05.09.593326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
The p53 family of transcription factors regulates numerous organismal processes including the development of skin and limbs, ciliogenesis, and preservation of genetic integrity and tumor suppression. p53 family members control these processes and gene expression networks through engagement with DNA sequences within gene regulatory elements. Whereas p53 binding to its cognate recognition sequence is strongly associated with transcriptional activation, p63 can mediate both activation and repression. How the DNA sequence of p63-bound gene regulatory elements is linked to these varied activities is not yet understood. Here, we use massively parallel reporter assays (MPRA) in a range of cellular and genetic contexts to investigate the influence of DNA sequence on p63-mediated transcription. Most regulatory elements with a p63 response element motif (p63RE) activate transcription, with those sites bound by p63 more frequently or adhering closer to canonical p53 family response element sequences driving higher transcriptional output. The most active regulatory elements are those also capable of binding p53. Elements uniquely bound by p63 have varied activity, with p63RE-mediated repression associated with lower overall GC content in flanking sequences. Comparison of activity across cell lines suggests differential activity of elements may be regulated by a combination of p63 abundance or context-specific cofactors. Finally, changes in p63 isoform expression dramatically alter regulatory element activity, primarily shifting inactive elements towards a strong p63-dependent activity. Our analysis of p63-bound gene regulatory elements provides new insight into how sequence, cellular context, and other transcription factors influence p63-dependent transcription. These studies provide a framework for understanding how p63 genomic binding locally regulates transcription. Additionally, these results can be extended to investigate the influence of sequence content, genomic context, and chromatin structure on the interplay between p63 isoforms and p53 family paralogs.
Affiliation(s)
- Abby A. McCann
  - Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York. 1400 Washington Ave, Albany, NY 12222
- Gabriele Baniulyte
  - Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York. 1400 Washington Ave, Albany, NY 12222
- Dana L. Woodstock
  - Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York. 1400 Washington Ave, Albany, NY 12222
- Morgan A. Sammons
  - Department of Biological Sciences and The RNA Institute, University at Albany, State University of New York. 1400 Washington Ave, Albany, NY 12222
11
Maturana CJ. Engineered compact pan-neuronal promoter from Alphaherpesvirus LAP2 enhances target gene expression in the mouse brain and reduces tropism in the liver. Gene Ther 2024; 31:335-344. [PMID: 38012300 PMCID: PMC11090813 DOI: 10.1038/s41434-023-00430-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 10/29/2023] [Accepted: 11/09/2023] [Indexed: 11/29/2023]
Abstract
Small promoters capable of driving potent neuron-restricted gene expression are required to support successful brain circuitry and clinical gene therapy studies. However, converting large promoters into functional MiniPromoters, which can be used in vectors with limited capacity, remains challenging. In this study, we describe the generation of a novel version of alphaherpesvirus latency-associated promoter 2 (LAP2), which facilitates precise transgene expression exclusively in the neurons of the mouse brain while minimizing undesired targeting in peripheral tissues. Additionally, we aimed to create a compact neural promoter to facilitate packaging of larger transgenes. Our results revealed that MiniLAP2 (278 bp) drives potent transgene expression in all neurons in the mouse brain, with little to no expression in glial cells. In contrast to the native promoter, MiniLAP2 reduced tropism in the spinal cord and liver. No expression was detected in the kidney or skeletal muscle. In summary, we developed a minimal pan-neuronal promoter that drives specific and robust transgene expression in the mouse brain when delivered intravenously via AAV-PHP.eB vector. The use of this novel MiniPromoter may broaden the range of deliverable therapeutics and improve their safety and efficacy by minimizing the potential for off-target effects.
Affiliation(s)
- Carola J Maturana
  - Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
12
Tak YE, Hsu JY, Shih J, Schultz HT, Nguyen IT, Lam KC, Pinello L, Keith Joung J. CRISPR PERSIST-On enables heritable and fine-tunable human gene activation. bioRxiv 2024:2024.04.26.590475. [PMID: 38712303 PMCID: PMC11071488 DOI: 10.1101/2024.04.26.590475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Current technologies for upregulation of endogenous genes use targeted artificial transcriptional activators, but stable gene activation requires persistent expression of these synthetic factors. Although general "hit-and-run" strategies exist for inducing long-term silencing of endogenous genes using targeted artificial transcriptional repressors, to our knowledge no equivalent approach for gene activation has been described to date. Here we show stable gene activation can be achieved by harnessing endogenous transcription factors (EndoTFs) that are normally expressed in human cells. Specifically, EndoTFs can be recruited to activate endogenous human genes of interest by using CRISPR-based gene editing to introduce EndoTF DNA binding motifs into a target gene promoter. This Precision Editing of Regulatory Sequences to Induce Stable Transcription-On (PERSIST-On) approach results in stable long-term gene activation, which we show is durable for at least five months. Using a high-throughput CRISPR prime editing pooled screening method, we also show that the magnitude of gene activation can be finely tuned either by using binding sites for different EndoTFs or by introducing specific mutations within such sites. Our results delineate a generalizable framework for using PERSIST-On to induce heritable and fine-tunable gene activation in a hit-and-run fashion, thereby enabling a wide range of research and therapeutic applications that require long-term upregulation of a target gene.
13
Mononen J, Taipale M, Malinen M, Velidendla B, Niskanen E, Levonen AL, Ruotsalainen AK, Heikkinen S. Genetic variation is a key determinant of chromatin accessibility and drives differences in the regulatory landscape of C57BL/6J and 129S1/SvImJ mice. Nucleic Acids Res 2024; 52:2904-2923. [PMID: 38153160 PMCID: PMC11014276 DOI: 10.1093/nar/gkad1225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 11/09/2023] [Accepted: 12/12/2023] [Indexed: 12/29/2023] [Open Access]
Abstract
Most common genetic variants associated with disease are located in non-coding regions of the genome. One mechanism by which they function is through altering transcription factor (TF) binding. In this study, we explore how genetic variation is connected to differences in the regulatory landscape of livers from C57BL/6J and 129S1/SvImJ mice fed either chow or a high-fat diet. To identify sites where regulatory variation affects TF binding and nearby gene expression, we employed an integrative analysis of H3K27ac ChIP-seq (active enhancers), ATAC-seq (chromatin accessibility) and RNA-seq (gene expression). We show that, across all these assays, the genetically driven (i.e. strain-specific) differences in the regulatory landscape are more pronounced than those modified by diet. Most notably, our analysis revealed that differentially accessible regions (DARs, N = 29635, FDR < 0.01 and fold change > 50%) are almost always strain-specific and enriched with genetic variation. Moreover, proximal DARs are highly correlated with differentially expressed genes. We also show that TF binding is affected by genetic variation, which we validate experimentally using ChIP-seq for TCF7L2 and CTCF. This study provides detailed insights into how non-coding genetic variation alters the gene regulatory landscape, and demonstrates how this can be used to study the regulatory variation influencing TF binding.
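The thresholds quoted in this abstract for differentially accessible regions (FDR < 0.01 and fold change > 50%) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: it assumes "fold change > 50%" means a ratio above 1.5 in either direction, that is |log2 fold change| > log2(1.5), and the function name `is_dar` and the example regions are hypothetical.

```python
import math


def is_dar(log2_fold_change: float, fdr: float,
           fdr_cutoff: float = 0.01, min_change: float = 0.5) -> bool:
    """Return True if a region passes both the FDR and fold-change cutoffs.

    Assumption: a "fold change > 50%" is read as a ratio above 1.5
    (or below 1/1.5), i.e. |log2FC| > log2(1 + min_change).
    """
    return fdr < fdr_cutoff and abs(log2_fold_change) > math.log2(1 + min_change)


# Hypothetical ATAC-seq regions with made-up statistics.
regions = [
    {"id": "peak_1", "log2fc": 1.2, "fdr": 0.001},    # passes both cutoffs
    {"id": "peak_2", "log2fc": 0.3, "fdr": 0.0001},   # fold change too small
    {"id": "peak_3", "log2fc": -0.9, "fdr": 0.2},     # FDR too high
    {"id": "peak_4", "log2fc": -0.7, "fdr": 0.005},   # passes (less accessible)
]

dars = [r["id"] for r in regions if is_dar(r["log2fc"], r["fdr"])]
print(dars)
```

Working in log2 space keeps the cutoff symmetric for regions that gain and lose accessibility.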
Affiliation(s)
- Juho Mononen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Mari Taipale
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Marjo Malinen
- Department of Environmental and Biological Sciences, Faculty of Science and Forestry, University of Eastern Finland, Joensuu FI-80101, Finland
- Department of Forestry and Environmental Engineering, South-Eastern Finland University of Applied Sciences, Kouvola FI-45100, Finland
| | - Bharadwaja Velidendla
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Einari Niskanen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Anna-Liisa Levonen
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Anna-Kaisa Ruotsalainen
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Sami Heikkinen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
|
14
|
Choudalakis M, Bashtrykov P, Jeltsch A. RepEnTools: an automated repeat enrichment analysis package for ChIP-seq data reveals hUHRF1 Tandem-Tudor domain enrichment in young repeats. Mob DNA 2024; 15:6. [PMID: 38570859 PMCID: PMC10988844 DOI: 10.1186/s13100-024-00315-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 03/05/2024] [Indexed: 04/05/2024] [Open Access]
Abstract
BACKGROUND: Repeat elements (REs) play important roles for cell function in health and disease. However, RE enrichment analysis in short-read high-throughput sequencing (HTS) data, such as ChIP-seq, is a challenging task.
RESULTS: Here, we present RepEnTools, a software package for genome-wide RE enrichment analysis of ChIP-seq and similar chromatin pulldown experiments. Our analysis package bundles together various software with carefully chosen and validated settings to provide a complete solution for RE analysis, starting from raw input files to tabular and graphical outputs. RepEnTools implementations are easily accessible even with minimal IT skills (Galaxy/UNIX). To demonstrate the performance of RepEnTools, we analysed chromatin pulldown data by the human UHRF1 TTD protein domain and discovered enrichment of TTD binding on young primate and hominid specific polymorphic repeats (SVA, L1PA1/L1HS) overlapping known enhancers and decorated with H3K4me1-K9me2/3 modifications. We corroborated these new bioinformatic findings with experimental data by qPCR assays using newly developed primate and hominid specific qPCR assays which complement similar research tools. Finally, we analysed mouse UHRF1 ChIP-seq data with RepEnTools and showed that the endogenous mUHRF1 protein colocalizes with H3K4me1-H3K9me3 on promoters of REs which were silenced by UHRF1. These new data suggest a functional role for UHRF1 in silencing of REs that is mediated by TTD binding to the H3K4me1-K9me3 double mark and conserved in two mammalian species.
CONCLUSIONS: RepEnTools improves the previously available programmes for RE enrichment analysis in chromatin pulldown studies by leveraging new tools, enhancing accessibility and adding some key functions. RepEnTools can analyse RE enrichment rapidly, efficiently, and accurately, providing the community with an up-to-date, reliable and accessible tool for this important type of analysis.
Affiliation(s)
- Michel Choudalakis
- Department of Biochemistry, Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569, Stuttgart, Germany
| | - Pavel Bashtrykov
- Department of Biochemistry, Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569, Stuttgart, Germany.
| | - Albert Jeltsch
- Department of Biochemistry, Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569, Stuttgart, Germany.
|
15
|
Karollus A, Hingerl J, Gankin D, Grosshauser M, Klemon K, Gagneur J. Species-aware DNA language models capture regulatory elements and their evolution. Genome Biol 2024; 25:83. [PMID: 38566111 PMCID: PMC10985990 DOI: 10.1186/s13059-024-03221-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 03/20/2024] [Indexed: 04/04/2024] [Open Access]
Abstract
BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution.
RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery.
CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.
Affiliation(s)
- Alexander Karollus
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Munich Center for Machine Learning, Munich, Germany
| | - Johannes Hingerl
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Dennis Gankin
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Martin Grosshauser
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Kristian Klemon
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Julien Gagneur
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Munich Center for Machine Learning, Munich, Germany.
- Institute of Human Genetics, School of Medicine and Health, Technical University of Munich, Munich, Germany.
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany.
- Munich Data Science Institute, Technical University of Munich, Garching, Germany.
|
16
|
Yao D, Tycko J, Oh JW, Bounds LR, Gosai SJ, Lataniotis L, Mackay-Smith A, Doughty BR, Gabdank I, Schmidt H, Guerrero-Altamirano T, Siklenka K, Guo K, White AD, Youngworth I, Andreeva K, Ren X, Barrera A, Luo Y, Yardımcı GG, Tewhey R, Kundaje A, Greenleaf WJ, Sabeti PC, Leslie C, Pritykin Y, Moore JE, Beer MA, Gersbach CA, Reddy TE, Shen Y, Engreitz JM, Bassik MC, Reilly SK. Multicenter integrated analysis of noncoding CRISPRi screens. Nat Methods 2024; 21:723-734. [PMID: 38504114 PMCID: PMC11009116 DOI: 10.1038/s41592-024-02216-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 02/18/2024] [Indexed: 03/21/2024]
Abstract
The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.
Affiliation(s)
- David Yao
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
| | - Jin Woo Oh
- Departments of Biomedical Engineering and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Lexi R Bounds
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Sager J Gosai
- Broad Institute of Harvard & MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Center for System Biology, Harvard University, Cambridge, MA, USA
- Harvard Graduate Program in Biological and Biomedical Science, Boston, MA, USA
| | - Lazaros Lataniotis
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Ava Mackay-Smith
- University Program in Genetics and Genomics, Duke University School of Medicine, Durham, NC, USA
| | | | - Idan Gabdank
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Henri Schmidt
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Tania Guerrero-Altamirano
- University Program in Genetics and Genomics, Duke University School of Medicine, Durham, NC, USA
- Department of Biology, Duke University, Durham, NC, USA
| | - Keith Siklenka
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Katherine Guo
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Alexander D White
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
| | | | - Kalina Andreeva
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Xingjie Ren
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Alejandro Barrera
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Yunhai Luo
- Department of Genetics, Stanford University, Stanford, CA, USA
| | | | | | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Pardis C Sabeti
- Broad Institute of Harvard & MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Center for System Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Christina Leslie
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yuri Pritykin
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Michael A Beer
- Departments of Biomedical Engineering and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Charles A Gersbach
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Timothy E Reddy
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Yin Shen
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Jesse M Engreitz
- Department of Genetics, Stanford University, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Steven K Reilly
- Department of Genetics, Yale University, New Haven, CT, USA.
|
17
|
Vaknin I, Willinger O, Mandl J, Heuberger H, Ben-Ami D, Zeng Y, Goldberg S, Orenstein Y, Amit R. A universal system for boosting gene expression in eukaryotic cell-lines. Nat Commun 2024; 15:2394. [PMID: 38493141 PMCID: PMC10944472 DOI: 10.1038/s41467-024-46573-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 03/04/2024] [Indexed: 03/18/2024] [Open Access]
Abstract
We demonstrate a transcriptional regulatory design algorithm that can boost expression in yeast and mammalian cell lines. The system consists of a simplified transcriptional architecture composed of a minimal core promoter and a synthetic upstream regulatory region (sURS) composed of up to three motifs selected from a list of 41 motifs conserved in the eukaryotic lineage. The sURS system was first characterized using an oligo-library containing 189,990 variants. We validate the resultant expression model using a set of 43 unseen sURS designs. The validation sURS experiments indicate that a generic set of grammar rules for boosting and attenuation may exist in yeast cells. Finally, we demonstrate that this generic set of grammar rules functions similarly in mammalian CHO-K1 and HeLa cells. Consequently, our work provides a design algorithm for boosting the expression of promoters used for expressing industrially relevant proteins in yeast and mammalian cell lines.
Affiliation(s)
- Inbal Vaknin
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
| | - Or Willinger
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
| | - Jonathan Mandl
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
| | - Hadar Heuberger
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
| | - Dan Ben-Ami
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
| | - Yi Zeng
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
| | - Sarah Goldberg
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
| | - Yaron Orenstein
- Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| | - Roee Amit
- Department of Biotechnology and Food Engineering, Technion, Haifa, Israel.
- The Russell Berrie Nanotechnology Institute, Technion, Haifa, Israel.
|
18
|
Monteagudo-Sánchez A, Noordermeer D, Greenberg MVC. The impact of DNA methylation on CTCF-mediated 3D genome organization. Nat Struct Mol Biol 2024; 31:404-412. [PMID: 38499830 DOI: 10.1038/s41594-024-01241-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 02/05/2024] [Indexed: 03/20/2024]
Abstract
Cytosine DNA methylation is a highly conserved epigenetic mark in eukaryotes. Although the role of DNA methylation at gene promoters and repetitive elements has been extensively studied, the function of DNA methylation in other genomic contexts remains less clear. In the nucleus of mammalian cells, the genome is spatially organized at different levels, and strongly influences myriad genomic processes. There are a number of factors that regulate the three-dimensional (3D) organization of the genome, with the CTCF insulator protein being among the most well-characterized. Pertinently, CTCF binding has been reported as being DNA methylation-sensitive in certain contexts, perhaps most notably in the process of genomic imprinting. Therefore, it stands to reason that DNA methylation may play a broader role in the regulation of chromatin architecture. Here we summarize the current understanding that is relevant to both the mammalian DNA methylation and chromatin architecture fields and attempt to assess the extent to which DNA methylation impacts the folding of the genome. The focus is in early embryonic development and cellular transitions when the epigenome is in flux, but we also describe insights from pathological contexts, such as cancer, in which the epigenome and 3D genome organization are misregulated.
Affiliation(s)
| | - Daan Noordermeer
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
| | | |
|
19
|
Luthra I, Jensen C, Chen XE, Salaudeen AL, Rafi AM, de Boer CG. Regulatory activity is the default DNA state in eukaryotes. Nat Struct Mol Biol 2024; 31:559-567. [PMID: 38448573 DOI: 10.1038/s41594-024-01235-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2023] [Accepted: 01/29/2024] [Indexed: 03/08/2024]
Abstract
Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.
Affiliation(s)
- Ishika Luthra
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
| | - Cassandra Jensen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
| | - Xinyi E Chen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
| | - Asfar Lathif Salaudeen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
| | - Abdul Muntakim Rafi
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
| | - Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
|
20
|
Rafi AM, Nogina D, Penzar D, Lee D, Lee D, Kim N, Kim S, Kim D, Shin Y, Kwak IY, Meshcheryakov G, Lando A, Zinkevich A, Kim BC, Lee J, Kang T, Vaishnav ED, Yadollahpour P, Kim S, Albrecht J, Regev A, Gong W, Kulakovskiy IV, Meyer P, de Boer C. Evaluation and optimization of sequence-based gene regulatory deep learning models. bioRxiv 2024:2023.04.26.538471. [PMID: 38405704 PMCID: PMC10888977 DOI: 10.1101/2023.04.26.538471] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/27/2024]
Abstract
Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to best capture the relationship between regulatory DNA and gene expression. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. While some benchmarks produced similar results across the top-performing models, others differed substantially. All top-performing models used neural networks, but diverged in architectures and novel training strategies, tailored to genomics sequence data. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide any given model into logically equivalent building blocks. We tested all possible combinations for the top three models and observed performance improvements for each. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets. Overall, we demonstrate that high-quality gold-standard genomics datasets can drive significant progress in model development.
Affiliation(s)
| | - Daria Nogina
- Lomonosov Moscow State University, Moscow, Russia
| | - Dmitry Penzar
- Lomonosov Moscow State University, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Dohoon Lee
- Seoul National University, Seoul, South Korea
| | | | - Nayeon Kim
- Seoul National University, Seoul, South Korea
| | | | - Dohyeon Kim
- Seoul National University, Seoul, South Korea
| | - Yeojin Shin
- Seoul National University, Seoul, South Korea
| | | | | | | | - Arsenii Zinkevich
- Lomonosov Moscow State University, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | | | - Juhyun Lee
- Chung-Ang University, Seoul, South Korea
| | - Taein Kang
- Chung-Ang University, Seoul, South Korea
| | | | | | - Sun Kim
- Seoul National University, Seoul, South Korea
| | | | - Aviv Regev
- Broad Institute of MIT and Harvard, Massachusetts, United States
- Genentech, South San Francisco, CA, USA
| | - Wuming Gong
- University of Minnesota, Minneapolis, United States
| | - Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
| | | | - Carl de Boer
- University of British Columbia, Vancouver, BC, Canada
|
21
|
Lack N, Altintas UB, Seo JH, Giambartolomei C, Ozturan D, Fortunato B, Nelson G, Goldman S, Adelman K, Hach F, Freedman M. Decoding the Epigenetics and Chromatin Loop Dynamics of Androgen Receptor-Mediated Transcription. Research Square 2024:rs.3.rs-3854707. [PMID: 38352568 PMCID: PMC10862967 DOI: 10.21203/rs.3.rs-3854707/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Androgen receptor (AR)-mediated transcription plays a critical role in normal prostate development and prostate cancer growth. AR drives gene expression by binding to thousands of cis-regulatory elements (CRE) that loop to hundreds of target promoters. With multiple CREs interacting with a single promoter, it remains unclear how individual AR bound CREs contribute to gene expression. To characterize the involvement of these CREs, we investigated the AR-driven epigenetic and chromosomal chromatin looping changes. We collected a kinetic multi-omic dataset comprised of steady-state mRNA, chromatin accessibility, transcription factor binding, histone modifications, chromatin looping, and nascent RNA. Using an integrated regulatory network, we found that AR binding induces sequential changes in the epigenetic features at CREs, independent of gene expression. Further, we showed that binding of AR does not result in a substantial rewiring of chromatin loops, but instead increases the contact frequency of pre-existing loops to target promoters. Our results show that gene expression strongly correlates to the changes in contact frequency. We then proposed and experimentally validated an unbalanced multi-enhancer model where the impact on gene expression of AR-bound enhancers is heterogeneous, and is proportional to their contact frequency with target gene promoters. Overall, these findings provide new insight into AR-mediated gene expression upon acute androgen stimulation and develop a mechanistic framework to investigate nuclear receptor mediated perturbations.
|
22
|
Taskiran II, Spanier KI, Dickmänken H, Kempynck N, Pančíková A, Ekşi EC, Hulselmans G, Ismail JN, Theunis K, Vandepoel R, Christiaens V, Mauduit D, Aerts S. Cell-type-directed design of synthetic enhancers. Nature 2024; 626:212-220. [PMID: 38086419 PMCID: PMC10830415 DOI: 10.1038/s41586-023-06936-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/06/2022] [Accepted: 12/05/2023] [Indexed: 01/19/2024]
Abstract
Transcriptional enhancers act as docking stations for combinations of transcription factors and thereby regulate spatiotemporal activation of their target genes [1]. It has been a long-standing goal in the field to decode the regulatory logic of an enhancer and to understand the details of how spatiotemporal gene expression is encoded in an enhancer sequence. Here we show that deep learning models [2-6] can be used to efficiently design synthetic, cell-type-specific enhancers, starting from random sequences, and that this optimization process allows detailed tracing of enhancer features at single-nucleotide resolution. We evaluate the function of fully synthetic enhancers to specifically target Kenyon cells or glial cells in the fruit fly brain using transgenic animals. We further exploit enhancer design to create 'dual-code' enhancers that target two cell types and minimal enhancers smaller than 50 base pairs that are fully functional. By examining the state space searches towards local optima, we characterize enhancer codes through the strength, combination and arrangement of transcription factor activator and transcription factor repressor motifs. Finally, we apply the same strategies to successfully design human enhancers, which adhere to enhancer rules similar to those of Drosophila enhancers. Enhancer design guided by deep learning leads to better understanding of how enhancers work and shows that their code can be exploited to manipulate cell states.
Affiliation(s)
- Ibrahim I Taskiran
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Katina I Spanier
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Hannah Dickmänken
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Niklas Kempynck
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Alexandra Pančíková
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- VIB-KULeuven Center for Cancer Biology, Leuven, Belgium
| | - Eren Can Ekşi
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Gert Hulselmans
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Joy N Ismail
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- UK Dementia Research Institute at Imperial College London, London, UK
| | - Koen Theunis
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Roel Vandepoel
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Valerie Christiaens
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - David Mauduit
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Stein Aerts
- Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium.
- VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium.
- Department of Human Genetics, KU Leuven, Leuven, Belgium.
| |
|
23
|
Kim S, Morgunova E, Naqvi S, Goovaerts S, Bader M, Koska M, Popov A, Luong C, Pogson A, Swigut T, Claes P, Taipale J, Wysocka J. DNA-guided transcription factor cooperativity shapes face and limb mesenchyme. Cell 2024; 187:692-711.e26. [PMID: 38262408 PMCID: PMC10872279 DOI: 10.1016/j.cell.2023.12.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 10/23/2023] [Accepted: 12/27/2023] [Indexed: 01/25/2024]
Abstract
Transcription factors (TFs) can define distinct cellular identities despite nearly identical DNA-binding specificities. One mechanism for achieving regulatory specificity is DNA-guided TF cooperativity. Although in vitro studies suggest that it may be common, examples of such cooperativity remain scarce in cellular contexts. Here, we demonstrate how "Coordinator," a long DNA motif composed of common motifs bound by many basic helix-loop-helix (bHLH) and homeodomain (HD) TFs, uniquely defines the regulatory regions of embryonic face and limb mesenchyme. Coordinator guides cooperative and selective binding between the bHLH family mesenchymal regulator TWIST1 and a collective of HD factors associated with regional identities in the face and limb. TWIST1 is required for HD binding and open chromatin at Coordinator sites, whereas HD factors stabilize TWIST1 occupancy at Coordinator and titrate it away from HD-independent sites. This cooperativity results in the shared regulation of genes involved in cell-type and positional identities and ultimately shapes facial morphology and evolution.
Affiliation(s)
- Seungsoo Kim
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford, CA 94305, USA
| | - Ekaterina Morgunova
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden
| | - Sahin Naqvi
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Seppe Goovaerts
- Medical Imaging Research Center, UZ Leuven, Leuven, Belgium; Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Maram Bader
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA
| | - Mervenaz Koska
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA
| | | | - Christy Luong
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
| | - Angela Pogson
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA
| | - Tomek Swigut
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford, CA 94305, USA
| | - Peter Claes
- Medical Imaging Research Center, UZ Leuven, Leuven, Belgium; Department of Human Genetics, KU Leuven, Leuven, Belgium; Department of Electrical Engineering, ESAT/PSI, KU Leuven, Leuven, Belgium
| | - Jussi Taipale
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden; Department of Biochemistry, University of Cambridge, Cambridge, UK; Applied Tumor Genomics Program, University of Helsinki, Helsinki, Finland
| | - Joanna Wysocka
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA; Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA; Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford, CA 94305, USA.
| |
|
24
|
Saha K, Nielsen GI, Nandani R, Kong L, Ye P, An W. YY1 is a transcriptional activator of mouse LINE-1 Tf subfamily. bioRxiv 2024:2024.01.03.573552. [PMID: 38260579 PMCID: PMC10802269 DOI: 10.1101/2024.01.03.573552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Long interspersed element type 1 (LINE-1, L1) is an active autonomous transposable element (TE) in the human genome. The first step of L1 replication is transcription, which is controlled by an internal RNA polymerase II promoter in the 5' untranslated region (UTR) of a full-length L1. It has been shown that transcription factor YY1 binds to a conserved sequence motif at the 5' end of the human L1 5'UTR and dictates where transcription initiates but not the level of transcription. Putative YY1-binding motifs have been predicted in the 5'UTRs of two distinct mouse L1 subfamilies, Tf and Gf. Using site-directed mutagenesis, in vitro binding, and gene knockdown assays, we experimentally tested the role of YY1 in mouse L1 transcription. Our results indicate that the Tf, but not the Gf, subfamily harbors functional YY1-binding sites in its 5'UTR monomers. In contrast to its role in human L1, YY1 functions as a transcriptional activator for the mouse Tf subfamily. Furthermore, YY1-binding motifs are solely responsible for the synergistic interaction between monomers, consistent with a model wherein distant monomers act as enhancers for mouse L1 transcription. The abundance of YY1-binding sites in Tf elements also raises important implications for gene regulation at the genomic level.
Affiliation(s)
- Karabi Saha
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Grace I. Nielsen
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Raj Nandani
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Lingqi Kong
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Ping Ye
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Wenfeng An
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| |
|
25
|
de Boer CG, Taipale J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 2024; 625:41-50. [PMID: 38093018 DOI: 10.1038/s41586-023-06661-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 09/20/2023] [Indexed: 01/05/2024]
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Affiliation(s)
- Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
| | - Jussi Taipale
- Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.
- Department of Biochemistry, University of Cambridge, Cambridge, UK.
| |
|
26
|
Loell KJ, Friedman RZ, Myers CA, Corbo JC, Cohen BA, White MA. Transcription factor interactions explain the context-dependent activity of CRX binding sites. PLoS Comput Biol 2024; 20:e1011802. [PMID: 38227575 PMCID: PMC10817189 DOI: 10.1371/journal.pcbi.1011802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 01/26/2024] [Accepted: 01/06/2024] [Indexed: 01/18/2024] Open
Abstract
The effects of transcription factor binding sites (TFBSs) on the activity of a cis-regulatory element (CRE) depend on the local sequence context. In rod photoreceptors, binding sites for the transcription factor (TF) Cone-rod homeobox (CRX) occur in both enhancers and silencers, but the sequence context that determines whether CRX binding sites contribute to activation or repression of transcription is not understood. To investigate the context-dependent activity of CRX sites, we fit neural network-based models to the activities of synthetic CREs composed of photoreceptor TFBSs. The models revealed that CRX binding sites consistently make positive, independent contributions to CRE activity, while negative homotypic interactions between sites cause CREs composed of multiple CRX sites to function as silencers. The effects of negative homotypic interactions can be overcome by the presence of other TFBSs that either interact cooperatively with CRX sites or make independent positive contributions to activity. The context-dependent activity of CRX sites is thus determined by the balance between positive heterotypic interactions, independent contributions of TFBSs, and negative homotypic interactions. Our findings explain observed patterns of activity among genomic CRX-bound enhancers and silencers, and suggest that enhancers may require diverse TFBSs to overcome negative homotypic interactions between TFBSs.
Affiliation(s)
- Kaiser J. Loell
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| | - Ryan Z. Friedman
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| | - Connie A. Myers
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| | - Joseph C. Corbo
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| | - Barak A. Cohen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| | - Michael A. White
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, United States of America
| |
|
27
|
Blayney JW, Francis H, Rampasekova A, Camellato B, Mitchell L, Stolper R, Cornell L, Babbs C, Boeke JD, Higgs DR, Kassouf M. Super-enhancers include classical enhancers and facilitators to fully activate gene expression. Cell 2023; 186:5826-5839.e18. [PMID: 38101409 PMCID: PMC10858684 DOI: 10.1016/j.cell.2023.11.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 07/06/2023] [Accepted: 11/27/2023] [Indexed: 12/17/2023]
Abstract
Super-enhancers are compound regulatory elements that control expression of key cell identity genes. They recruit high levels of tissue-specific transcription factors and co-activators such as the Mediator complex and contact target gene promoters with high frequency. Most super-enhancers contain multiple constituent regulatory elements, but it is unclear whether these elements have distinct roles in activating target gene expression. Here, by rebuilding the endogenous multipartite α-globin super-enhancer, we show that it contains bioinformatically equivalent but functionally distinct element types: classical enhancers and facilitator elements. Facilitators have no intrinsic enhancer activity, yet in their absence, classical enhancers are unable to fully upregulate their target genes. Without facilitators, classical enhancers exhibit reduced Mediator recruitment, enhancer RNA transcription, and enhancer-promoter interactions. Facilitators are interchangeable but display functional hierarchy based on their position within a multipartite enhancer. Facilitators thus play an important role in potentiating the activity of classical enhancers and ensuring robust activation of target genes.
Affiliation(s)
- Joseph W Blayney
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Helena Francis
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Alexandra Rampasekova
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Brendan Camellato
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA
| | - Leslie Mitchell
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA
| | - Rosa Stolper
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Lucy Cornell
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Christian Babbs
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK
| | - Jef D Boeke
- Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016, USA; Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA.
| | - Douglas R Higgs
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK; Chinese Academy of Medical Sciences Oxford Institute, Oxford OX3 7BN, UK.
| | - Mira Kassouf
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DS, UK.
| |
|
28
|
Martyn GE, Montgomery MT, Jones H, Guo K, Doughty BR, Linder J, Chen Z, Cochran K, Lawrence KA, Munson G, Pampari A, Fulco CP, Kelley DR, Lander ES, Kundaje A, Engreitz JM. Rewriting regulatory DNA to dissect and reprogram gene expression. bioRxiv 2023:2023.12.20.572268. [PMID: 38187584 PMCID: PMC10769263 DOI: 10.1101/2023.12.20.572268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). 
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.
Affiliation(s)
- Gabriella E Martyn
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
| | - Michael T Montgomery
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
| | - Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
| | - Katherine Guo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
| | - Benjamin R Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Ziwei Chen
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Kelly Cochran
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Kathryn A Lawrence
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Glen Munson
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Anusri Pampari
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Sanofi, Cambridge, MA, USA
| | | | - Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jesse M Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Science and Engineering Initiative, Stanford Children's Health, Betty Irene Moore Children's Heart Center, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
| |
|
29
|
Wen C, Yuan Z, Zhang X, Chen H, Luo L, Li W, Li T, Ma N, Mao F, Lin D, Lin Z, Lin C, Xu T, Lü P, Lin J, Zhu F. Sea-ATI unravels novel vocabularies of plant active cistrome. Nucleic Acids Res 2023; 51:11568-11583. [PMID: 37850650 PMCID: PMC10681729 DOI: 10.1093/nar/gkad853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Revised: 08/11/2023] [Accepted: 09/25/2023] [Indexed: 10/19/2023] Open
Abstract
The cistrome consists of all cis-acting regulatory elements recognized by transcription factors (TFs). However, only a portion of the cistrome is active for TF binding in a specific tissue. Resolving the active cistrome in plants remains challenging. In this study, we report the assay sequential extraction assisted-active TF identification (sea-ATI), a low-input method that profiles the DNA sequences recognized by TFs in a target tissue. We applied sea-ATI to seven plant tissues to survey their active cistrome and generated 41 motif models, including 15 new models that represent previously unidentified cis-regulatory vocabularies. ATAC-seq and RNA-seq analyses confirmed the functionality of the cis-elements from the new models, in that they are actively bound in vivo, located near the transcription start site, and influence chromatin accessibility and transcription. Furthermore, comparing dimeric WRKY CREs between sea-ATI and DAP-seq libraries revealed that thermodynamics and genetic drifts cooperatively shaped their evolution. Notably, sea-ATI can identify not only positive but also negative regulatory cis-elements, thereby providing unique insights into the functional non-coding genome of plants.
Affiliation(s)
- Chenjin Wen
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Zhen Yuan
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Xiaotian Zhang
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Hao Chen
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Lin Luo
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Wanying Li
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Tian Li
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Nana Ma
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Fei Mao
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Dongmei Lin
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Zhanxi Lin
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Chentao Lin
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Tongda Xu
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Peitao Lü
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Juncheng Lin
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| | - Fangjie Zhu
- College of Life Science, Haixia Institute of Science and Technology, National Engineering Research Center of JUNCAO, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
| |
|
30
|
Kielich N, Mazur O, Musidlak O, Gracz-Bernaciak J, Nawrot R. Herbgenomics meets Papaveraceae: a promising -omics perspective on medicinal plant research. Brief Funct Genomics 2023:elad050. [PMID: 37952099 DOI: 10.1093/bfgp/elad050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 10/09/2023] [Accepted: 10/20/2023] [Indexed: 11/14/2023] Open
Abstract
Herbal medicines were widely used in ancient and modern societies as remedies for human ailments. Notably, the Papaveraceae family includes well-known species, such as Papaver somniferum and Chelidonium majus, which possess medicinal properties due to their latex content. Latex-bearing plants are a rich source of diverse bioactive compounds, with applications ranging from narcotics to analgesics and relaxants. With the advent of high-throughput technologies and advancements in sequencing tools, an opportunity exists to bridge the knowledge gap between the genetic information of herbs and the regulatory networks underlying their medicinal activities. This emerging discipline, known as herbgenomics, combines genomic information with other -omics studies to unravel the genetic foundations, including essential gene functions and secondary metabolite biosynthesis pathways. Furthermore, exploring the genomes of various medicinal plants enables the utilization of modern genetic manipulation techniques, such as Clustered Regularly-Interspaced Short Palindromic Repeats (CRISPR/Cas9) or RNA interference. This technological revolution has facilitated systematic studies of model herbs, targeted breeding of medicinal plants, the establishment of gene banks and the adoption of synthetic biology approaches. In this article, we provide a comprehensive overview of the recent advances in genomic, transcriptomic, proteomic and metabolomic research on species within the Papaveraceae family. Additionally, we briefly explore the potential applications and key opportunities offered by the -omics perspective in the pharmaceutical industry and the agrobiotechnology field.
Affiliation(s)
- Natalia Kielich
- Department of Molecular Virology, Institute of Experimental Biology, Adam Mickiewicz University, Poznań, Poland
| | - Oliwia Mazur
- Department of Molecular Virology, Institute of Experimental Biology, Adam Mickiewicz University, Poznań, Poland
| | - Oskar Musidlak
- Department of Molecular Virology, Institute of Experimental Biology, Adam Mickiewicz University, Poznań, Poland
| | - Joanna Gracz-Bernaciak
- Department of Molecular Virology, Institute of Experimental Biology, Adam Mickiewicz University, Poznań, Poland
| | - Robert Nawrot
- Department of Molecular Virology, Institute of Experimental Biology, Adam Mickiewicz University, Poznań, Poland
| |
|
31
|
Arnold M, Stengel KR. Emerging insights into enhancer biology and function. Transcription 2023; 14:68-87. [PMID: 37312570 PMCID: PMC10353330 DOI: 10.1080/21541264.2023.2222032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Revised: 05/30/2023] [Accepted: 06/01/2023] [Indexed: 06/15/2023] Open
Abstract
Cell type-specific gene expression is coordinated by DNA-encoded enhancers and the transcription factors (TFs) that bind to them in a sequence-specific manner. As such, these enhancers and TFs are critical mediators of normal development and altered enhancer or TF function is associated with the development of diseases such as cancer. While initially defined by their ability to activate gene transcription in reporter assays, putative enhancer elements are now frequently defined by their unique chromatin features including DNase hypersensitivity and transposase accessibility, bidirectional enhancer RNA (eRNA) transcription, CpG hypomethylation, high H3K27ac and H3K4me1, sequence-specific transcription factor binding, and co-factor recruitment. Identification of these chromatin features through sequencing-based assays has revolutionized our ability to identify enhancer elements on a genome-wide scale, and genome-wide functional assays are now capitalizing on this information to greatly expand our understanding of how enhancers function to provide spatiotemporal coordination of gene expression programs. Here, we highlight recent technological advances that are providing new insights into the molecular mechanisms by which these critical cis-regulatory elements function in gene control. We pay particular attention to advances in our understanding of enhancer transcription, enhancer-promoter syntax, 3D organization and biomolecular condensates, transcription factor and co-factor dependencies, and the development of genome-wide functional enhancer screens.
Affiliation(s)
- Mirjam Arnold
  - Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Kristy R. Stengel
  - Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY, USA
  - Montefiore Einstein Cancer Center, Albert Einstein College of Medicine-Montefiore Health System, Bronx, NY, USA
  - Ruth L. and David S. Gottesman Institute for Stem Cell and Regenerative Medicine Research, Albert Einstein College of Medicine, Bronx, NY, USA
32
Trauernicht M, Rastogi C, Manzo S, Bussemaker H, van Steensel B. Optimisation of TP53 reporters by systematic dissection of synthetic TP53 response elements. Nucleic Acids Res 2023; 51:9690-9702. [PMID: 37650627 PMCID: PMC10570033 DOI: 10.1093/nar/gkad718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 07/24/2023] [Accepted: 08/22/2023] [Indexed: 09/01/2023] Open
Abstract
TP53 is a transcription factor that controls multiple cellular processes, including cell cycle arrest, DNA repair and apoptosis. The relation between TP53 binding site architecture and transcriptional output is still not fully understood. Here, we systematically examined in three different cell lines the effects of binding site affinity and copy number on TP53-dependent transcriptional output, and also probed the impact of spacer length and sequence between adjacent binding sites, and of core promoter identity. Paradoxically, we found that high-affinity TP53 binding sites are less potent than medium-affinity sites. TP53 achieves supra-additive transcriptional activation through optimally spaced adjacent binding sites, suggesting a cooperative mechanism. Optimally spaced adjacent binding sites have a ∼10-bp periodicity, suggesting a role for spatial orientation along the DNA double helix. We leveraged these insights to construct a log-linear model that explains activity from sequence features, and to identify new highly active and sensitive TP53 reporters.
Affiliation(s)
- Max Trauernicht
  - Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Chaitanya Rastogi
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Stefano G Manzo
  - Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Department of Biosciences, University of Milan “La Statale”, 20133 Milan, Italy
- Harmen J Bussemaker
  - Department of Biological Sciences, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
- Bas van Steensel
  - Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Oncode Institute, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
33
Malfait J, Wan J, Spicuglia S. Epromoters are new players in the regulatory landscape with potential pleiotropic roles. Bioessays 2023; 45:e2300012. [PMID: 37246247 DOI: 10.1002/bies.202300012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 05/11/2023] [Accepted: 05/15/2023] [Indexed: 05/30/2023]
Abstract
Precise spatiotemporal control of gene expression during normal development and cell differentiation is achieved by the combined action of proximal (promoters) and distal (enhancers) cis-regulatory elements. Recent studies have reported that a subset of promoters, termed Epromoters, works also as enhancers to regulate distal genes. This new paradigm opens novel questions regarding the complexity of our genome and raises the possibility that genetic variation within Epromoters has pleiotropic effects on various physiological and pathological traits by differentially impacting multiple proximal and distal genes. Here, we discuss the different observations pointing to an important role of Epromoters in the regulatory landscape and summarize the evidence supporting a pleiotropic impact of these elements in disease. We further hypothesize that Epromoters might represent a major contributor to phenotypic variation and disease.
Affiliation(s)
- Juliette Malfait
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
- Jing Wan
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
- Salvatore Spicuglia
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
34
Fei L, Zhang K, Poddar N, Hautaniemi S, Sahu B. Single-cell epigenome analysis identifies molecular events controlling direct conversion of human fibroblasts to pancreatic ductal-like cells. Dev Cell 2023; 58:1701-1715.e8. [PMID: 37751683 DOI: 10.1016/j.devcel.2023.08.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 07/13/2023] [Accepted: 08/16/2023] [Indexed: 09/28/2023]
Abstract
Cell fate can be reprogrammed by ectopic expression of lineage-specific transcription factors (TFs). However, the exact cell state transitions during transdifferentiation are still poorly understood. Here, we have generated pancreatic exocrine cells of ductal epithelial identity from human fibroblasts using a set of six TFs. We mapped the molecular determinants of lineage dynamics using a factor-indexing method based on single-nuclei multiome sequencing (FI-snMultiome-seq) that enables dissecting the role of each individual TF and pool of TFs in cell fate conversion. We show that transition from mesenchymal fibroblast identity to epithelial pancreatic exocrine fate involves two deterministic steps: an endodermal progenitor state defined by activation of HHEX with FOXA2 and SOX17 and a temporal GATA4 activation essential for the maintenance of pancreatic cell fate program. Collectively, our data suggest that transdifferentiation, although being considered a direct cell fate conversion method, occurs through transient progenitor states orchestrated by stepwise activation of distinct TFs.
Affiliation(s)
- Liangru Fei
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
- Kaiyang Zhang
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
- Nikita Poddar
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
- Sampsa Hautaniemi
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
- Biswajyoti Sahu
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
  - Medicum, Faculty of Medicine, University of Helsinki, Haartmaninkatu 8, Helsinki 00014, Finland
  - Centre for Molecular Medicine Norway, Faculty of Medicine, University of Oslo, Gaustadelléen 21, 0349 Oslo, Norway.
35
Eggers N, Gkountromichos F, Krause S, Campos-Sparr A, Becker P. Physical interaction between MSL2 and CLAMP assures direct cooperativity and prevents competition at composite binding sites. Nucleic Acids Res 2023; 51:9039-9054. [PMID: 37602401 PMCID: PMC10516644 DOI: 10.1093/nar/gkad680] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/05/2023] [Revised: 07/13/2023] [Accepted: 08/09/2023] [Indexed: 08/22/2023] Open
Abstract
MSL2, the DNA-binding subunit of the Drosophila dosage compensation complex, cooperates with the ubiquitous protein CLAMP to bind MSL recognition elements (MREs) on the X chromosome. We explore the nature of the cooperative binding to these GA-rich, composite sequence elements in reconstituted naïve embryonic chromatin. We found that the cooperativity requires physical interaction between both proteins. Remarkably, disruption of this interaction does not lead to indirect, nucleosome-mediated cooperativity as expected, but to competition. The protein interaction apparently not only increases the affinity for composite binding sites, but also locks both proteins in a defined dimeric state that prevents competition. High Affinity Sites of MSL2 on the X chromosome contain variable numbers of MREs. We find that the cooperation between MSL2/CLAMP is not influenced by MRE clustering or arrangement, but happens largely at the level of individual MREs. The sites where MSL2/CLAMP bind strongly in vitro locate to all chromosomes and show little overlap to an expanded set of X-chromosomal MSL2 in vivo binding sites generated by CUT&RUN. Apparently, the intrinsic MSL2/CLAMP cooperativity is limited to a small selection of potential sites in vivo. This restriction must be due to components missing in our reconstitution, such as roX2 lncRNA.
Affiliation(s)
- Nikolas Eggers
  - Biomedical Center, Molecular Biology Division, LMU, Munich, Germany
- Silke Krause
  - Biomedical Center, Molecular Biology Division, LMU, Munich, Germany
- Peter B Becker
  - Biomedical Center, Molecular Biology Division, LMU, Munich, Germany
36
Zhu W, Huang H, Ming W, Zhang R, Gu Y, Bai Y, Liu X, Liu H, Liu Y, Gu W, Sun X. Delineating highly transcribed noncoding elements landscape in breast cancer. Comput Struct Biotechnol J 2023; 21:4432-4445. [PMID: 37731598 PMCID: PMC10507584 DOI: 10.1016/j.csbj.2023.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 08/27/2023] [Accepted: 09/10/2023] [Indexed: 09/22/2023] Open
Abstract
Highly transcribed noncoding elements (HTNEs) are critical noncoding elements with high levels of transcriptional capacity in particular cohorts involved in multiple cellular biological processes. Investigation of HTNEs with persistent aberrant expression in abnormal tissues could be of benefit in exploring their roles in disease occurrence and progression. Breast cancer is a highly heterogeneous disease for which early screening and prognosis are exceedingly crucial. In this study, we developed a HTNE identification framework to systematically investigate HTNE landscapes in breast cancer patients and identified over ten thousand HTNEs. The robustness and rationality of our framework were demonstrated via public datasets. We revealed that HTNEs had significant chromatin characteristics of enhancers and long noncoding RNAs (lncRNAs) and were significantly enriched with RNA-binding proteins as well as targeted by miRNAs. Further, HTNE-associated genes were significantly overexpressed and exhibited strong correlations with breast cancer. Ultimately, we explored the subtype-specific transcriptional processes associated with HTNEs and uncovered the HTNE signatures that could classify breast cancer subtypes based on the properties of hormone receptors. Our results highlight that the identified HTNEs as well as their associated genes play crucial roles in breast cancer progression and correlate with subtype-specific transcriptional processes of breast cancer.
Affiliation(s)
- Wenyong Zhu
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Hao Huang
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Wenlong Ming
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Rongxin Zhang
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yu Gu
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yunfei Bai
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Xiaoan Liu
  - Department of Breast Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Hongde Liu
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Yun Liu
  - Department of Information, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Wanjun Gu
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
  - Collaborative Innovation Center of Jiangsu Province of Cancer Prevention and Treatment of Chinese Medicine, School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, China
- Xiao Sun
  - State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
37
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Cheng Xu
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA.
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA.
  - Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA.
38
Karttunen K, Patel D, Xia J, Fei L, Palin K, Aaltonen L, Sahu B. Transposable elements as tissue-specific enhancers in cancers of endodermal lineage. Nat Commun 2023; 14:5313. [PMID: 37658059 PMCID: PMC10474299 DOI: 10.1038/s41467-023-41081-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/20/2022] [Accepted: 08/23/2023] [Indexed: 09/03/2023] Open
Abstract
Transposable elements (TE) are repetitive genomic elements that harbor binding sites for human transcription factors (TF). A regulatory role for TEs has been suggested in embryonal development and diseases such as cancer but systematic investigation of their functions has been limited by their widespread silencing in the genome. Here, we utilize unbiased massively parallel reporter assay data using a whole human genome library to identify TEs with functional enhancer activity in two human cancer types of endodermal lineage, colorectal and liver cancers. We show that the identified TE enhancers are characterized by genomic features associated with active enhancers, such as epigenetic marks and TF binding. Importantly, we identify distinct TE subfamilies that function as tissue-specific enhancers, namely MER11- and LTR12-elements in colon and liver cancers, respectively. These elements are bound by distinct TFs in each cell type, and they have predicted associations to differentially expressed genes. In conclusion, these data demonstrate how different cancer types can utilize distinct TEs as tissue-specific enhancers, paving the way for comprehensive understanding of the role of TEs as bona fide enhancers in the cancer genomes.
Affiliation(s)
- Konsta Karttunen
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Divyesh Patel
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland
- Jihan Xia
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland
- Liangru Fei
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Kimmo Palin
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland
- Lauri Aaltonen
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland
- Biswajyoti Sahu
  - Applied Tumor Genomics Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
  - iCAN Digital Precision Cancer Medicine Flagship, University of Helsinki, Helsinki, Finland.
  - Medicum, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
  - Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway.
39
Guzman C, Duttke S, Zhu Y, De Arruda Saldanha C, Downes N, Benner C, Heinz S. Combining TSS-MPRA and sensitive TSS profile dissimilarity scoring to study the sequence determinants of transcription initiation. Nucleic Acids Res 2023; 51:e80. [PMID: 37403796 PMCID: PMC10450201 DOI: 10.1093/nar/gkad562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 06/13/2023] [Accepted: 06/20/2023] [Indexed: 07/06/2023] Open
Abstract
Cis-regulatory elements (CREs) can be classified by the shapes of their transcription start site (TSS) profiles, which are indicative of distinct regulatory mechanisms. Massively parallel reporter assays (MPRAs) are increasingly being used to study CRE regulatory mechanisms, yet the degree to which MPRAs replicate individual endogenous TSS profiles has not been determined. Here, we present a new low-input MPRA protocol (TSS-MPRA) that enables measuring TSS profiles of episomal reporters as well as after lentiviral reporter chromatinization. To sensitively compare MPRA and endogenous TSS profiles, we developed a novel dissimilarity scoring algorithm (WIP score) that outperforms the frequently used earth mover's distance on experimental data. Using TSS-MPRA and WIP scoring on 500 unique reporter inserts, we found that short (153 bp) MPRA promoter inserts replicate the endogenous TSS patterns of ∼60% of promoters. Lentiviral reporter chromatinization did not improve fidelity of TSS-MPRA initiation patterns, and increasing insert size frequently led to activation of extraneous TSS in the MPRA that are not active in vivo. We discuss the implications of our findings, which highlight important caveats when using MPRAs to study transcription mechanisms. Finally, we illustrate how TSS-MPRA and WIP scoring can provide novel insights into the impact of transcription factor motif mutations and genetic variants on TSS patterns and transcription levels.
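The abstract above names the earth mover's distance as the baseline that the WIP score is compared against, but it does not define the WIP score itself. The following is therefore only a minimal sketch of that baseline, assuming two TSS profiles are given as read counts over the same ordered positions with unit spacing. The function name `tss_profile_emd` and the toy profiles are hypothetical and do not come from the paper.

```python
from itertools import accumulate


def tss_profile_emd(profile_a, profile_b):
    """Earth mover's distance between two TSS profiles on the same positions.

    Hypothetical helper for illustration only; not the paper's WIP score.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same positions")
    total_a, total_b = sum(profile_a), sum(profile_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each profile needs at least one read")
    # Normalise counts to probabilities, then build cumulative distributions.
    cdf_a = accumulate(count / total_a for count in profile_a)
    cdf_b = accumulate(count / total_b for count in profile_b)
    # In one dimension with unit spacing, the earth mover's distance equals
    # the summed absolute difference between the two cumulative distributions.
    return sum(abs(p - q) for p, q in zip(cdf_a, cdf_b))


# Toy read counts (invented): a focused TSS, the same TSS shifted by 1 bp,
# and a broad, dispersed initiation pattern.
focused = [0, 0, 10, 0, 0]
shifted = [0, 0, 0, 10, 0]
broad = [2, 2, 2, 2, 2]

print(tss_profile_emd(focused, shifted))
print(tss_profile_emd(focused, broad))
```

In this toy case, the one-position shift and the fully dispersed profile land at similar distances from the focused profile, even though they are qualitatively very different initiation patterns.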
Affiliation(s)
- Carlos Guzman
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
  - Department of Bioengineering, Graduate Program in Bioinformatics & Systems Biology, U.C. San Diego, La Jolla, CA 92093, USA
- Sascha Duttke
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Yixin Zhu
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Camila De Arruda Saldanha
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Nicholas L Downes
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Christopher Benner
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
- Sven Heinz
  - Department of Medicine, Division of Endocrinology, U.C. San Diego School of Medicine, La Jolla, CA 92093, USA
40
Friedman RZ, Ramu A, Lichtarge S, Myers CA, Granas DM, Gause M, Corbo JC, Cohen BA, White MA. Active learning of enhancer and silencer regulatory grammar in photoreceptors. bioRxiv 2023:2023.08.21.554146. [PMID: 37662358 PMCID: PMC10473580 DOI: 10.1101/2023.08.21.554146] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/05/2023]
Abstract
Cis-regulatory elements (CREs) direct gene expression in health and disease, and models that can accurately predict their activities from DNA sequences are crucial for biomedicine. Deep learning represents one emerging strategy to model the regulatory grammar that relates CRE sequence to function. However, these models require training data on a scale that exceeds the number of CREs in the genome. We address this problem using active machine learning to iteratively train models on multiple rounds of synthetic DNA sequences assayed in live mammalian retinas. During each round of training the model actively selects sequence perturbations to assay, thereby efficiently generating informative training data. We iteratively trained a model that predicts the activities of sequences containing binding motifs for the photoreceptor transcription factor Cone-rod homeobox (CRX) using an order of magnitude less training data than current approaches. The model's internal confidence estimates of its predictions are reliable guides for designing sequences with high activity. The model correctly identified critical sequence differences between active and inactive sequences with nearly identical transcription factor binding sites, and revealed order and spacing preferences for combinations of motifs. Our results establish active learning as an effective method to train accurate deep learning models of cis-regulatory function after exhausting naturally occurring training examples in the genome.
Affiliation(s)
- Ryan Z. Friedman
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Avinash Ramu
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Sara Lichtarge
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Connie A. Myers
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- David M. Granas
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Maria Gause
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Joseph C. Corbo
  - Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
- Barak A. Cohen
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
- Michael A. White
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
41
Mehta P, Chattopadhyay P, Ravi V, Tarai B, Budhiraja S, Pandey R. SARS-CoV-2 infection severity and mortality is modulated by repeat-mediated regulation of alternative splicing. Microbiol Spectr 2023; 11:e0135123. [PMID: 37604131 PMCID: PMC10580830 DOI: 10.1128/spectrum.01351-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/16/2023] [Indexed: 08/23/2023] Open
Abstract
Like single-stranded RNA viruses, SARS-CoV-2 hijacks the host transcriptional machinery for its own replication. Numerous traditional differential gene expression-based investigations have examined the diverse clinical symptoms caused by SARS-CoV-2 infection. The virus, on the other hand, also affects the host splicing machinery, causing host transcriptional dysregulation, which can lead to diverse clinical outcomes. Hence, in this study, we performed host transcriptome sequencing of 125 hospital-admitted COVID-19 patients to understand the transcriptomic differences between the severity sub-phenotypes of mild, moderate, severe, and mortality. We performed transcript-level differential expression analysis, investigated differential isoform usage, looked at the splicing patterns within the differentially expressed transcripts (DET), and elucidated the possible genome regulatory features. Our DTE analysis showed evidence of diminished transcript length and diversity as well as altered promoter site usage in the differentially expressed protein-coding transcripts in the COVID-19 mortality patients. We also investigated the potential mechanisms driving the alternate splicing and discovered a compelling differential enrichment of repeats in the promoter region and a specific enrichment of SINE (Alu) near the splicing sites of differentially expressed transcripts. These findings suggested a repeat-mediated plausible regulation of alternative splicing as a potential modulator of COVID-19 disease severity. In this work, we emphasize the scarcely elucidated functional role of alternative splicing in influencing COVID-19 disease severity sub-phenotypes, clinical outcomes, and its putative mechanism.
IMPORTANCE: The wide range of clinical symptoms reported during the COVID-19 pandemic inherently highlights the numerous factors that influence the progression and prognosis of SARS-CoV-2 infection. While several studies have investigated the host response and discovered immunological dysregulation during severe infection, most of them have the common theme of focusing only up to the gene level. Viruses, especially RNA viruses, are renowned for hijacking the host splicing machinery for their own proliferation, which inadvertently puts pressure on the host transcriptome, exposing another side of the host response to the pathogen challenge. Therefore, in this study, we examine host response at the transcript-level to discover a transcriptional difference that culminates in differential gene-level expression. Importantly, this study highlights diminished transcript diversity and possible regulation of transcription by differentially abundant repeat elements near the promoter region and splicing sites in COVID-19 mortality patients, which together with differentially expressed isoforms hold the potential to elaborate disease severity and outcome.
Affiliation(s)
- Priyanka Mehta
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Partha Chattopadhyay
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Varsha Ravi
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
- Bansidhar Tarai
  - Max Super Speciality Hospital (A Unit of Devki Devi Foundation), Max Healthcare, Delhi, India
- Sandeep Budhiraja
  - Max Super Speciality Hospital (A Unit of Devki Devi Foundation), Max Healthcare, Delhi, India
- Rajesh Pandey
  - Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
42
Benarroch L, Madsen-Østerbye J, Abdelhalim M, Mamchaoui K, Ohana J, Bigot A, Mouly V, Bonne G, Bertrand AT, Collas P. Cellular and Genomic Features of Muscle Differentiation from Isogenic Fibroblasts and Myoblasts. Cells 2023; 12:1995. [PMID: 37566074 PMCID: PMC10417614 DOI: 10.3390/cells12151995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 07/27/2023] [Accepted: 07/31/2023] [Indexed: 08/12/2023] Open
Abstract
The ability to recapitulate muscle differentiation in vitro enables the exploration of mechanisms underlying myogenesis and muscle diseases. However, obtaining myoblasts from patients with neuromuscular diseases or from healthy subjects poses ethical and procedural challenges that limit such investigations. An alternative is to convert skin fibroblasts into myogenic cells by forcing the expression of the myogenic regulator MYOD. Here, we directly compared cellular phenotype, transcriptome, and nuclear lamina-associated domains (LADs) in myo-converted human fibroblasts and myotubes differentiated from myoblasts. We used isogenic cells from a 16-year-old donor, ruling out, for the first time to our knowledge, genetic factors as a source of variation between the two myogenic models. We show that myo-conversion of fibroblasts upregulates genes controlling myogenic pathways, leading to multinucleated cells expressing muscle cell markers. However, myotubes are more advanced in myogenesis than myo-converted fibroblasts at the phenotypic and transcriptomic levels. While most LADs are shared between the two cell types, each also displays unique domains of lamin A/C interactions. Furthermore, myotube-specific LADs are more gene-rich and less heterochromatic than shared LADs or LADs unique to myo-converted fibroblasts, and they uniquely sequester developmental genes. Thus, myo-converted fibroblasts and myotubes retain cell type-specific features of radial and functional genome organization. Our results favor a view of myo-converted fibroblasts as a practical model to investigate the phenotypic and genomic properties of muscle cell differentiation in normal and pathological contexts, but also highlight current limitations in using fibroblasts as a source of myogenic cells.
Affiliation(s)
- Louise Benarroch
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Julia Madsen-Østerbye
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, 0372 Oslo, Norway; (J.M.-Ø.); (M.A.)
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, 0372 Oslo, Norway
| | - Mohamed Abdelhalim
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, 0372 Oslo, Norway; (J.M.-Ø.); (M.A.)
| | - Kamel Mamchaoui
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Jessica Ohana
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Anne Bigot
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Vincent Mouly
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Gisèle Bonne
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Anne T. Bertrand
- Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en Myologie, 75013 Paris, France; (L.B.); (K.M.); (J.O.); (A.B.); (V.M.); (G.B.)
| | - Philippe Collas
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, 0372 Oslo, Norway; (J.M.-Ø.); (M.A.)
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, 0372 Oslo, Norway
| |
|
43
|
Mahendrawada L, Warfield L, Donczew R, Hahn S. Surprising connections between DNA binding and function for the near-complete set of yeast transcription factors. bioRxiv 2023:2023.07.25.550593. [PMID: 37546716 PMCID: PMC10402042 DOI: 10.1101/2023.07.25.550593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
DNA sequence-specific transcription factors (TFs) modulate transcription and chromatin architecture, acting from regulatory sites in enhancers and promoters of eukaryotic genes. How TFs locate their DNA targets and how multiple TFs cooperate to regulate individual genes are still unclear. Most yeast TFs are thought to regulate transcription via binding to upstream activating sequences, situated within a few hundred base pairs upstream of the regulated gene. While this model has been validated for individual TFs and specific genes, it has not been tested in a systematic way with the large set of yeast TFs. Here, we have integrated information on the binding and expression targets for the near-complete set of yeast TFs. While we found many instances of functional TF binding sites in upstream regulatory regions, we found many more instances that do not fit this model. In many cases, rapid TF depletion affects gene expression where there is no detectable binding of that TF to the upstream region of the affected gene. In addition, for most TFs, only a small fraction of bound TFs regulates the nearby gene, showing that TF binding does not automatically correspond to regulation of the linked gene. Finally, we found that only a small percentage of TFs are exclusively strong activators or repressors, with most TFs having dual function. Overall, our comprehensive mapping of TF binding and regulatory targets has both confirmed known TF relationships and revealed surprising properties of TF function.
|
44
|
Penzar D, Nogina D, Noskova E, Zinkevich A, Meshcheryakov G, Lando A, Rafi AM, de Boer C, Kulakovskiy IV. LegNet: a best-in-class deep learning model for short DNA regulatory regions. Bioinformatics 2023; 39:btad457. [PMID: 37490428 PMCID: PMC10400376 DOI: 10.1093/bioinformatics/btad457] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/22/2022] [Revised: 05/28/2023] [Accepted: 07/24/2023] [Indexed: 07/27/2023] Open
Abstract
MOTIVATION: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep-learning approaches for modeling DNA regulatory grammar. RESULTS: Here, we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, LegNet secured first place for the autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. Using published data, here, we demonstrate that LegNet outperforms existing models and accurately predicts gene expression per se as well as the effects of single-nucleotide variants. Furthermore, we show how LegNet can be used in a diffusion network manner for the rational design of promoter sequences yielding the desired expression level. AVAILABILITY AND IMPLEMENTATION: https://github.com/autosome-ru/LegNet. The GitHub repository includes Jupyter Notebook tutorials and Python scripts under the MIT license to reproduce the results presented in the study.
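The phrase "regression as a soft classification task" can be illustrated with a minimal sketch. It is not taken from the LegNet repository: the function names, the integer bin layout and the two-nearest-bins weighting are all assumptions chosen for illustration. A scalar expression value is turned into a probability vector over bins, and a prediction is read back as the expected bin index.

```python
import math

def soft_labels(y, n_bins):
    """Spread a scalar target over its two nearest integer bins.

    Hypothetical scheme for illustration only; the binning used by the
    actual model may differ.
    """
    # Clamp the target into the valid bin range [0, n_bins - 1].
    y = min(max(float(y), 0.0), float(n_bins - 1))
    lo = math.floor(y)
    hi = min(lo + 1, n_bins - 1)
    frac = y - lo
    probs = [0.0] * n_bins
    probs[lo] += 1.0 - frac   # weight on the lower bin
    probs[hi] += frac         # weight on the upper bin
    return probs

def expected_value(probs):
    """Collapse a bin distribution back to a scalar prediction."""
    return sum(i * p for i, p in enumerate(probs))

if __name__ == "__main__":
    target = soft_labels(3.25, n_bins=18)
    print(target[3], target[4])
    print(expected_value(target))
```

With this encoding a network can be trained with a classification-style loss against `soft_labels(...)`, while `expected_value(...)` of its output distribution still yields a continuous expression estimate.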
Affiliation(s)
- Dmitry Penzar
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Protein Research, Pushchino 142290, Russia
- Institute of Translational Medicine, Pirogov Russian National Research Medical University, Moscow 117997, Russia
| | - Daria Nogina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119991, Russia
| | - Elizaveta Noskova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119991, Russia
| | - Arsenii Zinkevich
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119991, Russia
| | - Abdul Muntakim Rafi
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Carl de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Moscow 119991, Russia
- Institute of Protein Research, Pushchino 142290, Russia
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420008, Russia
| |
|
45
|
Oliveros W, Delfosse K, Lato DF, Kiriakopulos K, Mokhtaridoost M, Said A, McMurray BJ, Browning JW, Mattioli K, Meng G, Ellis J, Mital S, Melé M, Maass PG. Systematic characterization of regulatory variants of blood pressure genes. Cell Genomics 2023; 3:100330. [PMID: 37492106 PMCID: PMC10363820 DOI: 10.1016/j.xgen.2023.100330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 03/29/2023] [Accepted: 04/28/2023] [Indexed: 07/27/2023]
Abstract
High blood pressure (BP) is the major risk factor for cardiovascular disease. Genome-wide association studies have identified genetic variants for BP, but functional insights into causality and related molecular mechanisms lag behind. We functionally characterize 4,608 genetic variants in linkage with 135 BP loci in vascular smooth muscle cells and cardiomyocytes by massively parallel reporter assays. High densities of regulatory variants at BP loci (i.e., ULK4, MAP4, CFDP1, PDE5A) indicate that multiple variants drive genetic association. Regulatory variants are enriched in repeats, alter cardiovascular-related transcription factor motifs, and spatially converge with genes controlling specific cardiovascular pathways. Using heuristic scoring, we define likely causal variants, and CRISPR prime editing finally determines causal variants for KCNK9, SFXN2, and PCGF6, which are candidates for developing high BP. Our systems-level approach provides a catalog of functionally relevant variants and their genomic architecture in two trait-relevant cell lines for a better understanding of BP gene regulation.
Affiliation(s)
- Winona Oliveros
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
| | - Kate Delfosse
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Daniella F. Lato
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Katerina Kiriakopulos
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Milad Mokhtaridoost
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Abdelrahman Said
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Brandon J. McMurray
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Jared W.L. Browning
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Kaia Mattioli
- Division of Genetics, Department of Medicine, Brigham & Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Guoliang Meng
- Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - James Ellis
- Developmental and Stem Cell Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Seema Mital
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Ted Rogers Centre for Heart Research, Toronto, ON M5G 1X8, Canada
- Department of Pediatrics, The Hospital for Sick Children, University of Toronto, Toronto, ON M5G 0A4, Canada
| | - Marta Melé
- Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Catalonia, Spain
| | - Philipp G. Maass
- Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| |
|
46
|
Zhang Z, Feng F, Qiu Y, Liu J. A generalizable framework to comprehensively predict epigenome, chromatin organization, and transcriptome. Nucleic Acids Res 2023; 51:5931-5947. [PMID: 37224527 PMCID: PMC10325920 DOI: 10.1093/nar/gkad436] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/20/2022] [Revised: 03/31/2023] [Accepted: 05/09/2023] [Indexed: 05/26/2023] Open
Abstract
Many deep learning approaches have been proposed to predict epigenetic profiles, chromatin organization, and transcription activity. While these approaches achieve satisfactory performance in predicting one modality from another, the learned representations are not generalizable across predictive tasks or across cell types. In this paper, we propose a deep learning approach named EPCOT, which employs a pre-training and fine-tuning framework and is able to accurately and comprehensively predict multiple modalities, including epigenome, chromatin organization, transcriptome, and enhancer activity, for new cell types, requiring only cell-type specific chromatin accessibility profiles. Many of these predicted modalities, such as Micro-C and ChIA-PET, are expensive to obtain in practice, so in silico prediction from EPCOT should be helpful. Furthermore, this pre-training and fine-tuning framework allows EPCOT to identify generic representations generalizable across different predictive tasks. Interpreting EPCOT models also provides biological insights, including mapping between different genomic modalities, identifying TF sequence binding patterns, and analyzing cell-type specific TF impacts on enhancer activity.
Affiliation(s)
- Zhenhao Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 500 S. State St, Ann Arbor, MI 48109, USA
| | - Fan Feng
- Department of Computational Medicine and Bioinformatics, University of Michigan, 500 S. State St, Ann Arbor, MI 48109, USA
| | - Yiyang Qiu
- Department of Computer Science and Engineering, University of Michigan, 500 S. State St, Ann Arbor, MI 48109, USA
| | - Jie Liu
- Department of Computational Medicine and Bioinformatics, University of Michigan, 500 S. State St, Ann Arbor, MI 48109, USA
- Department of Computer Science and Engineering, University of Michigan, 500 S. State St, Ann Arbor, MI 48109, USA
| |
|
47
|
Catta-Preta R, Lindtner S, Ypsilanti A, Price J, Abnousi A, Su-Feher L, Wang Y, Juric I, Jones IR, Akiyama JA, Hu M, Shen Y, Visel A, Pennacchio LA, Dickel D, Rubenstein JLR, Nord AS. Combinatorial transcription factor binding encodes cis-regulatory wiring of forebrain GABAergic neurogenesis. bioRxiv 2023:2023.06.28.546894. [PMID: 37425940 PMCID: PMC10327028 DOI: 10.1101/2023.06.28.546894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Transcription factors (TFs) bind combinatorially to genomic cis-regulatory elements (cREs), orchestrating transcription programs. While studies of chromatin state and chromosomal interactions have revealed dynamic neurodevelopmental cRE landscapes, parallel understanding of the underlying TF binding lags. To elucidate the combinatorial TF-cRE interactions driving mouse basal ganglia development, we integrated ChIP-seq for twelve TFs, H3K4me3-associated enhancer-promoter interactions, chromatin and transcriptional state, and transgenic enhancer assays. We identified TF-cRE modules with distinct chromatin features and enhancer activity that have complementary roles driving GABAergic neurogenesis and suppressing other developmental fates. While the majority of distal cREs were bound by one or two TFs, a small proportion were extensively bound, and these enhancers also exhibited exceptional evolutionary conservation, motif density, and complex chromosomal interactions. Our results provide new insights into how modules of combinatorial TF-cRE interactions activate and repress developmental expression programs and demonstrate the value of TF binding data in modeling gene regulatory wiring.
Affiliation(s)
- Rinaldo Catta-Preta
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
- Current Address: Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Susan Lindtner
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Athena Ypsilanti
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - James Price
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Armen Abnousi
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
- Current Address: NovaSignal, Los Angeles, CA 90064, USA
| | - Linda Su-Feher
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Yurong Wang
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| | - Ivan Juric
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Ian R Jones
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
| | - Jennifer A Akiyama
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH 44106, USA
| | - Yin Shen
- Institute for Human Genetics, Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Neurology, University of California, San Francisco, CA 94143, USA
| | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- School of Natural Sciences, University of California, Merced, Merced, CA 95343, USA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Diane Dickel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - John L R Rubenstein
- Nina Ireland Laboratory of Developmental Neurobiology, Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Alex S Nord
- Department of Neurobiology, Physiology and Behavior, and Department of Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA 95618, USA
| |
|
48
|
Mach P, Giorgetti L. Integrative approaches to study enhancer-promoter communication. Curr Opin Genet Dev 2023; 80:102052. [PMID: 37257410 PMCID: PMC10293802 DOI: 10.1016/j.gde.2023.102052] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/24/2023] [Revised: 04/21/2023] [Accepted: 04/22/2023] [Indexed: 06/02/2023]
Abstract
The spatiotemporal control of gene expression in complex multicellular organisms relies on noncoding regulatory sequences such as enhancers, which activate transcription of target genes often over large genomic distances. Despite the advances in the identification and characterization of enhancers, the principles and mechanisms by which enhancers select and control their target genes remain largely unknown. Here, we review recent interdisciplinary and quantitative approaches based on emerging techniques that aim to address open questions in the field, notably how regulatory information is encoded in the DNA sequence, how this information is transferred from enhancers to promoters, and how these processes are regulated in time.
Affiliation(s)
- Pia Mach
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; University of Basel, Basel, Switzerland. https://twitter.com/@MachPia
| | - Luca Giorgetti
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
| |
|
49
|
Kim S, Morgunova E, Naqvi S, Bader M, Koska M, Popov A, Luong C, Pogson A, Claes P, Taipale J, Wysocka J. DNA-guided transcription factor cooperativity shapes face and limb mesenchyme. bioRxiv 2023:2023.05.29.541540. [PMID: 37398193 PMCID: PMC10312427 DOI: 10.1101/2023.05.29.541540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Transcription factors (TFs) can define distinct cellular identities despite nearly identical DNA-binding specificities. One mechanism for achieving regulatory specificity is DNA-guided TF cooperativity. Although in vitro studies suggest it may be common, examples of such cooperativity remain scarce in cellular contexts. Here, we demonstrate how 'Coordinator', a long DNA motif comprised of common motifs bound by many basic helix-loop-helix (bHLH) and homeodomain (HD) TFs, uniquely defines regulatory regions of embryonic face and limb mesenchyme. Coordinator guides cooperative and selective binding between the bHLH family mesenchymal regulator TWIST1 and a collective of HD factors associated with regional identities in the face and limb. TWIST1 is required for HD binding and open chromatin at Coordinator sites, while HD factors stabilize TWIST1 occupancy at Coordinator and titrate it away from HD-independent sites. This cooperativity results in shared regulation of genes involved in cell-type and positional identities, and ultimately shapes facial morphology and evolution.
Affiliation(s)
- Seungsoo Kim
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305
- Howard Hughes Medical Institute, Stanford, CA 94305
| | - Ekaterina Morgunova
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden
| | - Sahin Naqvi
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305
- Department of Genetics, Stanford University, Stanford, CA 94305
| | - Maram Bader
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305
| | - Mervenaz Koska
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
| | - Christy Luong
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305
| | - Angela Pogson
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
| | - Peter Claes
- Department of Electrical Engineering, ESAT/PSI, KU Leuven, Leuven, Belgium
- Medical Imaging Research Center, UZ Leuven, Leuven, Belgium
- Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Jussi Taipale
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Applied Tumor Genomics Program, University of Helsinki, Helsinki, Finland
| | - Joanna Wysocka
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305
- Department of Developmental Biology, Stanford University, Stanford, CA 94305
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305
- Howard Hughes Medical Institute, Stanford, CA 94305
| |
|
50
|
Ziyani C, Delaneau O, Ribeiro DM. Multimodal single cell analysis infers widespread enhancer co-activity in a lymphoblastoid cell line. Commun Biol 2023; 6:563. [PMID: 37237005 DOI: 10.1038/s42003-023-04954-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
Non-coding regulatory elements such as enhancers are key in controlling the cell-type specificity and spatio-temporal expression of genes. To drive stable and precise gene transcription robust to genetic variation and environmental stress, genes are often targeted by multiple enhancers with redundant action. However, it is unknown whether enhancers targeting the same gene display simultaneous activity or whether some enhancer combinations are more often co-active than others. Here, we take advantage of recent developments in single cell technology that permit assessing chromatin status (scATAC-seq) and gene expression (scRNA-seq) in the same single cells to correlate gene expression to the activity of multiple enhancers. Measuring activity patterns across 24,844 human lymphoblastoid single cells, we find that the majority of enhancers associated with the same gene display significant correlation in their chromatin profiles. For 6944 expressed genes associated with enhancers, we predict 89,885 significant enhancer-enhancer associations between nearby enhancers. We find that associated enhancers share similar transcription factor binding profiles and that gene essentiality is linked with higher enhancer co-activity. We provide a set of predicted enhancer-enhancer associations based on correlation derived from a single cell line, which can be further investigated for functional relevance.
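The idea of scoring enhancer co-activity by correlating chromatin profiles across single cells can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: the enhancer names, the binary accessibility vectors and the fixed correlation cutoff are hypothetical, and the significance testing described in the abstract is not reproduced.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-activity signal
    return cov / math.sqrt(var_x * var_y)

def coactive_pairs(activity, threshold):
    """Return (enhancer_a, enhancer_b, r) for pairs with r >= threshold.

    `activity` maps an enhancer name to its per-cell accessibility values;
    `threshold` is a hypothetical stand-in for a proper significance test.
    """
    names = sorted(activity)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(activity[a], activity[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs

if __name__ == "__main__":
    # Toy accessibility calls for three enhancers of one gene in five cells.
    toy = {
        "enhA": [1, 0, 1, 1, 0],
        "enhB": [1, 0, 1, 1, 0],
        "enhC": [0, 1, 0, 0, 1],
    }
    for a, b, r in coactive_pairs(toy, threshold=0.5):
        print(a, b, round(r, 3))
```

In this toy input only `enhA` and `enhB` open and close together across cells, so they form the single reported pair, while `enhC` is anti-correlated with both.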
Affiliation(s)
- Chaymae Ziyani
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Diogo M Ribeiro
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
| |
|