1
Hollingsworth EW, Liu TA, Alcantara JA, Chen CX, Jacinto SH, Kvon EZ. Rapid and quantitative functional interrogation of human enhancer variant activity in live mice. Nat Commun 2025; 16:409. [PMID: 39762235 PMCID: PMC11704014 DOI: 10.1038/s41467-024-55500-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 12/13/2024] [Indexed: 01/11/2025] Open
Abstract
Functional analysis of non-coding variants associated with congenital disorders remains challenging due to the lack of efficient in vivo models. Here we introduce dual-enSERT, a robust Cas9-based two-color fluorescent reporter system which enables rapid, quantitative comparison of enhancer allele activities in live mice in less than two weeks. We use this technology to examine and measure the gain- and loss-of-function effects of enhancer variants previously linked to limb polydactyly, autism spectrum disorder, and craniofacial malformation. By combining dual-enSERT with single-cell transcriptomics, we characterise gene expression in cells where the enhancer is normally and ectopically active, revealing candidate pathways that may lead to enhancer misregulation. Finally, we demonstrate the widespread utility of dual-enSERT by testing the effects of fifteen previously uncharacterised rare and common non-coding variants linked to neurodevelopmental disorders. In doing so we identify variants that reproducibly alter the in vivo activity of OTX2 and MIR9-2 brain enhancers, implicating them in autism. Dual-enSERT thus allows researchers to go from identifying candidate enhancer variants to analysis of comparative enhancer activity in live embryos in under two weeks.
Affiliation(s)
- Ethan W Hollingsworth
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
  - Medical Scientist Training Program, University of California, Irvine School of Medicine, Irvine, CA, USA
- Taryn A Liu
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Joshua A Alcantara
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Cindy X Chen
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Sandra H Jacinto
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA
- Evgeny Z Kvon
  - Department of Developmental and Cell Biology, University of California, Irvine, CA, USA.
2
Wang Z, Sarkar A, Ge X. De novo functional discovery of peptide-MHC restricted CARs from recombinase-constructed large-diversity monoclonal T cell libraries. bioRxiv 2024:2024.11.27.625413. [PMID: 39651191 PMCID: PMC11623653 DOI: 10.1101/2024.11.27.625413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
Chimeric antigen receptors (CARs) that mimic T cell receptors (TCRs) in eliciting peptide-major histocompatibility complex (pMHC)-specific T cell responses hold great promise in the development of immunotherapies against solid tumors, infections, and autoimmune diseases. However, broad applications of TCR-mimic (TCRm) CARs have been hindered to date, largely due to the lack of a facile approach for the effective isolation of TCRm CARs. Here, we establish a highly efficient process for de novo discovery of TCRm CARs from human naïve antibody repertoires by combining recombinase-mediated large-diversity monoclonal library construction with T cell activation-based positive and negative screenings. Panels of highly functional TCRm CARs with peptide-specific recognition, minimal cross-reactivity, and low tonic signaling were rapidly identified towards MHC-restricted intracellular tumor-associated antigens MAGE-A3, NY-ESO-1, and MART-1. Transduced TCRm CAR-T cells exhibited pMHC-specific functional avidity, potent cytokine release, and efficacious and persistent cytotoxicity. The developed approach could be used to generate safe and potent immunotherapies targeting MHC-restricted antigens.
3
Rezaei N, Dormiani K, Kiani-Esfahani A, Mirdamadian S, Rahmani M, Jafarpour F, Nasr-Esfahani MH. Characterization and functional evaluation of goat PDX1 regulatory modules through comparative analysis of conserved interspecies homologs. Sci Rep 2024; 14:26755. [PMID: 39500950 PMCID: PMC11538457 DOI: 10.1038/s41598-024-77614-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2024] [Accepted: 10/23/2024] [Indexed: 11/08/2024] Open
Abstract
PDX1 is a crucial transcription factor in pancreas development and mature β-cell function. However, the regulation of PDX1 expression in larger animals mirroring human pancreas morphogenesis and endocrine maturation remains poorly understood. Therefore, we conducted a comparative analysis to characterize regulatory regions of the goat PDX1 gene and assessed their transcriptional activity by transient transfection of several transgenic EGFP constructs in β- and non-β cell lines. We recognized several highly conserved regions encompassing the promoter and cis-regulatory elements (Area I-IV) in the 5' flanking sequence of the genes. Within the promoter, we identified that a key E-box and nearby CAAT element synergistically drive transcription, constituting the basal promoter of the goat PDX1 gene. Furthermore, each recognized regulatory area separately enhances this basal promoter activity in β-cells compared to non-β cells; however, cooperatively, they exhibit a bifunctional regulatory effect on transcription. Additionally, the intact ~3 kb upstream region (Area I-III) functions as the most efficient reporter transgene in vitro and shows islet-specific expression in native rat pancreas. Together, our findings suggest that the regulation of the goat PDX1 gene is governed by conserved regions similar to other mammals, while both structurally and functionally, these regions exhibit a closer resemblance to those found in humans.
Affiliation(s)
- Naeimeh Rezaei
  - Department of Animal Biotechnology, Cell Science Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran
- Kianoush Dormiani
  - Department of Animal Biotechnology, Cell Science Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran.
- Abbas Kiani-Esfahani
  - Department of Animal Biotechnology, Cell Science Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran
- Somayeh Mirdamadian
  - Department of Animal Biotechnology, Cell Science Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran
- Mohsen Rahmani
  - Department of Animal Biotechnology, Reproductive Biomedicine Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran
- Farnoosh Jafarpour
  - Department of Animal Biotechnology, Reproductive Biomedicine Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran
- Mohammad Hossein Nasr-Esfahani
  - Department of Animal Biotechnology, Cell Science Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran.
  - Department of Animal Biotechnology, Reproductive Biomedicine Research Center, Royan Institute for Biotechnology, ACECR, Isfahan, Iran.
4
Maritato R, Medugno A, D'Andretta E, De Riso G, Lupo M, Botta S, Marrocco E, Renda M, Sofia M, Mussolino C, Bacci ML, Surace EM. A DNA base-specific sequence interposed between CRX and NRL contributes to RHODOPSIN expression. Sci Rep 2024; 14:26313. [PMID: 39487168 PMCID: PMC11530525 DOI: 10.1038/s41598-024-76664-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Accepted: 10/15/2024] [Indexed: 11/04/2024] Open
Abstract
Gene expression emerges from DNA sequences through the interaction of transcription factors (TFs) with DNA cis-regulatory sequences. In eukaryotes, TFs bind to transcription factor binding sites (TFBSs) with differential affinities, enabling cell-specific gene expression. In this view, DNA enables TF binding along a continuum ranging from low to high affinity depending on its sequence composition; however, it is not known whether evolution has entailed a further level of entanglement between DNA-protein interaction. Here we found that the composition and length (22 bp) of the DNA sequence interposed between the CRX and NRL retinal TFs in the proximal promoter of RHODOPSIN (RHO) largely controls the expression levels of RHO. Mutagenesis of CRX-NRL DNA linking sequences (here termed "DNA-linker") results in uncorrelated gene expression variation. In contrast, mutual exchange of naturally occurring divergent human and mouse Rho cis-regulatory elements conferred similar yet species-specific Rho expression levels. Two orthogonal DNA-binding proteins targeted to the DNA-linker either activate or repress the expression of Rho depending on the DNA-linker orientation relative to the CRX and NRL binding sites. These results argue that, in this instance, DNA itself contributes to CRX and NRL activities through a code based on specific base sequences of a defined length, ultimately determining optimal RHO expression levels.
Affiliation(s)
- Rosa Maritato
  - Department of Translational Medicine, University of Naples Federico II, Naples, Italy
- Alessia Medugno
  - Department of Translational Medicine, University of Naples Federico II, Naples, Italy
- Emanuela D'Andretta
  - Department of Translational Medicine, University of Naples Federico II, Naples, Italy
- Giulia De Riso
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
  - AOU Federico II, Naples, Italy
- Mariangela Lupo
  - Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, Italy
- Salvatore Botta
  - Department of Translational Medical Science, University of Campania Luigi Vanvitelli, Naples, Italy
- Elena Marrocco
  - Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, Italy
- Mario Renda
  - Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, Italy
- Martina Sofia
  - Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, Italy
- Maria Laura Bacci
  - Department of Veterinary Medical Sciences, University of Bologna, Bologna, Italy
- Enrico Maria Surace
  - Department of Translational Medicine, University of Naples Federico II, Naples, Italy.
5
Martinez-Ara M, Comoglio F, van Steensel B. Large-scale analysis of the integration of enhancer-enhancer signals by promoters. eLife 2024; 12:RP91994. [PMID: 39466837 PMCID: PMC11517252 DOI: 10.7554/elife.91994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2024] Open
Abstract
Genes are often regulated by multiple enhancers. It is poorly understood how the individual enhancer activities are combined to control promoter activity. Anecdotal evidence has shown that enhancers can combine sub-additively, additively, synergistically, or redundantly. However, it is not clear which of these modes are more frequent in mammalian genomes. Here, we systematically tested how pairs of enhancers activate promoters using a three-way combinatorial reporter assay in mouse embryonic stem cells. By assaying about 69,000 enhancer-enhancer-promoter combinations we found that enhancer pairs generally combine near-additively. This behaviour was conserved across seven developmental promoters tested. Surprisingly, these promoters scale the enhancer signals in a non-linear manner that depends on promoter strength. A housekeeping promoter showed an overall different response to enhancer pairs, and a smaller dynamic range. Thus, our data indicate that enhancers mostly act additively, but promoters transform their collective effect non-linearly.
Affiliation(s)
- Miguel Martinez-Ara
  - Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam, Netherlands
  - Oncode Institute, Amsterdam, Netherlands
- Federico Comoglio
  - Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam, Netherlands
- Bas van Steensel
  - Division of Gene Regulation, Netherlands Cancer Institute, Amsterdam, Netherlands
  - Oncode Institute, Amsterdam, Netherlands
  - Division of Molecular Genetics, Netherlands Cancer Institute, Amsterdam, Netherlands
6
La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
7
Hong CKY, Wu Y, Erickson AA, Li J, Federico AJ, Cohen BA. Massively parallel characterization of insulator activity across the genome. Nat Commun 2024; 15:8350. [PMID: 39333469 PMCID: PMC11436800 DOI: 10.1038/s41467-024-52599-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 09/15/2024] [Indexed: 09/29/2024] Open
Abstract
A key question in regulatory genomics is whether cis-regulatory elements (CREs) are modular elements that can function anywhere in the genome, or whether they are adapted to certain genomic locations. To distinguish between these possibilities we develop MPIRE (Massively Parallel Integrated Regulatory Elements), a technology for recurrently assaying CREs at thousands of defined locations across the genome in parallel. MPIRE allows us to separate the intrinsic activity of CREs from the effects of their genomic environments. We apply MPIRE to assay three insulator sequences at thousands of genomic locations and find that each insulator functions in locations with distinguishable properties. All three insulators can block enhancers, but each insulator blocks specific enhancers at specific locations. However, only ALOXE3 appears to block heterochromatin silencing. We conclude that insulator function is highly context dependent and that MPIRE is a robust method for revealing the context dependencies of CREs.
Affiliation(s)
- Clarice K Y Hong
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
- Yawei Wu
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
- Alyssa A Erickson
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
- Jie Li
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
- Arnold J Federico
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA
- Barak A Cohen
  - The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA.
  - Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, MO, 63110, USA.
8
Lalanne JB, Regalado SG, Domcke S, Calderon D, Martin BK, Li X, Li T, Suiter CC, Lee C, Trapnell C, Shendure J. Multiplex profiling of developmental cis-regulatory elements with quantitative single-cell expression reporters. Nat Methods 2024; 21:983-993. [PMID: 38724692 PMCID: PMC11166576 DOI: 10.1038/s41592-024-02260-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 03/22/2024] [Indexed: 06/13/2024]
Abstract
The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
Affiliation(s)
- Samuel G Regalado
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Silvia Domcke
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Diego Calderon
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Beth K Martin
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Xiaoyi Li
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Tony Li
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Chase C Suiter
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
- Choli Lee
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Cole Trapnell
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA.
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
  - Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA.
  - Howard Hughes Medical Institute, Seattle, WA, USA.
9
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Zachary A Gardell
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
10
Ordoñez R, Zhang W, Ellis G, Zhu Y, Ashe HJ, Ribeiro-Dos-Santos AM, Brosh R, Huang E, Hogan MS, Boeke JD, Maurano MT. Genomic context sensitizes regulatory elements to genetic disruption. Mol Cell 2024; 84:1842-1854.e7. [PMID: 38759624 PMCID: PMC11104518 DOI: 10.1016/j.molcel.2024.04.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 03/11/2024] [Accepted: 04/18/2024] [Indexed: 05/19/2024]
Abstract
Genomic context critically modulates regulatory function but is difficult to manipulate systematically. The murine insulin-like growth factor 2 (Igf2)/H19 locus is a paradigmatic model of enhancer selectivity, whereby CTCF occupancy at an imprinting control region directs downstream enhancers to activate either H19 or Igf2. We used synthetic regulatory genomics to repeatedly replace the native locus with 157-kb payloads, and we systematically dissected its architecture. Enhancer deletion and ectopic delivery revealed previously uncharacterized long-range regulatory dependencies at the native locus. Exchanging the H19 enhancer cluster with the Sox2 locus control region (LCR) showed that the H19 enhancers relied on their native surroundings while the Sox2 LCR functioned autonomously. Analysis of regulatory DNA actuation across cell types revealed that these enhancer clusters typify broader classes of context sensitivity genome wide. These results show that unexpected dependencies influence even well-studied loci, and our approach permits large-scale manipulation of complete loci to investigate the relationship between regulatory architecture and function.
Affiliation(s)
- Raquel Ordoñez
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Weimin Zhang
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Gwen Ellis
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Yinan Zhu
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Hannah J Ashe
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Ran Brosh
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Emily Huang
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Megan S Hogan
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Jef D Boeke
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Department of Biochemistry and Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
  - Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
- Matthew T Maurano
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Department of Pathology, NYU School of Medicine, New York, NY 10016, USA.
11
Ordoñez R, Zhang W, Ellis G, Zhu Y, Ashe HJ, Ribeiro-dos-Santos AM, Brosh R, Huang E, Hogan MS, Boeke JD, Maurano MT. Genomic context sensitizes regulatory elements to genetic disruption. bioRxiv 2024:2023.07.02.547201. [PMID: 37781588 PMCID: PMC10541140 DOI: 10.1101/2023.07.02.547201] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/03/2023]
Abstract
Enhancer function is frequently investigated piecemeal using truncated reporter assays or single deletion analysis. Thus it remains unclear to what extent enhancer function at native loci relies on surrounding genomic context. Using the Big-IN technology for targeted integration of large DNAs, we analyzed the regulatory architecture of the murine Igf2/H19 locus, a paradigmatic model of enhancer selectivity. We assembled payloads containing a 157-kb functional Igf2/H19 locus and engineered mutations to genetically direct CTCF occupancy at the imprinting control region (ICR) that switches the target gene of the H19 enhancer cluster. Contrasting activity of payloads delivered at the endogenous Igf2/H19 locus or ectopically at Hprt revealed that the Igf2/H19 locus includes additional, previously unknown long-range regulatory elements. Exchanging components of the Igf2/H19 locus with the well-studied Sox2 locus showed that the H19 enhancer cluster functioned poorly out of context, and required its native surroundings to activate Sox2 expression. Conversely, the Sox2 locus control region (LCR) could activate both Igf2 and H19 outside its native context, but its activity was only partially modulated by CTCF occupancy at the ICR. Analysis of regulatory DNA actuation across different cell types revealed that, while the H19 enhancers are tightly coordinated within their native locus, the Sox2 LCR acts more independently. We show that these enhancer clusters typify broader classes of loci genome-wide. Our results show that unexpected dependencies may influence even the most studied functional elements, and our synthetic regulatory genomics approach permits large-scale manipulation of complete loci to investigate the relationship between locus architecture and function.
Affiliation(s)
- Raquel Ordoñez
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - These authors contributed equally
- Weimin Zhang
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - These authors contributed equally
- Gwen Ellis
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Present address: Department of Biology, University of Vermont, Burlington, VT 05405, USA
- Yinan Zhu
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Hannah J. Ashe
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Present address: School of Medicine, University of Maryland, Baltimore, MD 21201, USA
- Ran Brosh
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
- Emily Huang
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Present address: Highmark Health, Pittsburgh, PA 15222, USA
- Megan S. Hogan
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Present address: Neochromosome Inc., Long Island City, NY 11101, USA
- Jef D. Boeke
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Department of Biochemistry and Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
  - Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201, USA
- Matthew T. Maurano
  - Institute for Systems Genetics, NYU School of Medicine, New York, NY 10016, USA
  - Department of Pathology, NYU School of Medicine, New York, NY 10016, USA
  - Lead contact
12
Zheng Y, Chen S. Transcriptional precision in photoreceptor development and diseases - Lessons from 25 years of CRX research. Front Cell Neurosci 2024; 18:1347436. [PMID: 38414750 PMCID: PMC10896975 DOI: 10.3389/fncel.2024.1347436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 01/19/2024] [Indexed: 02/29/2024] Open
Abstract
The vertebrate retina is made up of six specialized neuronal cell types and one glia that are generated from a common retinal progenitor. The development of these distinct cell types is programmed by transcription factors that regulate the expression of specific genes essential for cell fate specification and differentiation. Because of the complex nature of transcriptional regulation, understanding transcription factor functions in development and disease is challenging. Research on the Cone-rod homeobox transcription factor CRX provides an excellent model to address these challenges. In this review, we reflect on 25 years of mammalian CRX research and discuss recent progress in elucidating the distinct pathogenic mechanisms of four CRX coding variant classes. We highlight how in vitro biochemical studies of CRX protein functions facilitate understanding CRX regulatory principles in animal models. We conclude with a brief discussion of the emerging systems biology approaches that could accelerate precision medicine for CRX-linked diseases and beyond.
Affiliation(s)
- Yiqiao Zheng
  - Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
  - Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
- Shiming Chen
  - Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
  - Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
  - Department of Developmental Biology, Washington University in St. Louis, Saint Louis, MO, United States
13
Kang CK, Kim AR. Deep molecular learning of transcriptional control of a synthetic CRE enhancer and its variants. iScience 2024; 27:108747. [PMID: 38222110 PMCID: PMC10784702 DOI: 10.1016/j.isci.2023.108747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/29/2023] [Accepted: 12/12/2023] [Indexed: 01/16/2024] Open
Abstract
Massively parallel reporter assays measure the transcriptional activities of various cis-regulatory modules (CRMs) in a single experiment. We developed a thermodynamic computational model framework that calculates quantitative levels of gene expression directly from regulatory DNA sequences. Using the framework, we investigated the molecular mechanisms of cis-regulatory mutations of a synthetic enhancer that cause abnormal gene expression. We found that, in a human cell line, competitive binding between family transcription factors (TFs) with slightly different binding preferences significantly increases the accuracy of recapitulating the transcriptional effects of thousands of single- or multi-mutations. We also discovered that even if various harmful mutations occurred in an activator binding site, a CRM could stably maintain or even increase gene expression through a certain form of competitive binding between family TFs. These findings enhance understanding of the effects of SNPs and indels on CRMs and could help in building robust custom-designed CRMs for biologics production and gene therapy.
Affiliation(s)
- Chan-Koo Kang
  - School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
- Ah-Ram Kim
  - School of Life Science, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Department of Advanced Convergence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - School of Applied Artificial Intelligence, Handong Global University, Pohang, Gyeong-Buk 37554, South Korea
14
Chen C, Wang Z, Kang M, Lee KB, Ge X. High-fidelity large-diversity monoclonal mammalian cell libraries by cell cycle arrested recombinase-mediated cassette exchange. Nucleic Acids Res 2023; 51:e113. [PMID: 37941133 PMCID: PMC10711435 DOI: 10.1093/nar/gkad1001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 09/26/2023] [Accepted: 10/18/2023] [Indexed: 11/10/2023] Open
Abstract
Mammalian cells carrying defined genetic variations have shown great potential in both fundamental research and therapeutic development. However, their full use has been limited by the lack of a robust method to construct large, monoclonal, high-quality combinatorial libraries. This study developed cell cycle arrested recombinase-mediated cassette exchange (aRMCE), which provides monoclonality, precise genomic integration and uniform transgene expression. Via optimized nocodazole-mediated mitotic arrest, 20% target gene replacement efficiency was achieved without antibiotic selection, and the improved aRMCE efficiency was applicable to a variety of tested cell clones, transgene targets and transfection methods. As a demonstration of this versatile method, we performed directed evolution of fragment crystallizable (Fc), for which error-prone libraries of over 10^7 variants were constructed and displayed as IgG on the surface of CHO cells. Diversities of constructed libraries were validated by deep sequencing, and panels of novel Fc mutants were identified showing improved binding towards specific Fc gamma receptors and enhanced effector functions. Due to its large cargo capacity and compatibility with different mutagenesis approaches, we expect this mammalian cell platform technology to have broad applications for directed evolution, multiplex genetic assays, cell line development and stem cell engineering.
Affiliation(s)
- Chuan Chen
  - Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA 92521, USA
- Zening Wang
  - Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA 92521, USA
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Minhyo Kang
  - Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA 92521, USA
- Ki Baek Lee
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xin Ge
  - Department of Chemical and Environmental Engineering, University of California Riverside, Riverside, CA 92521, USA
  - Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
15
Hollingsworth EW, Liu TA, Jacinto SH, Chen CX, Alcantara JA, Kvon EZ. Rapid and Quantitative Functional Interrogation of Human Enhancer Variant Activity in Live Mice. bioRxiv 2023:2023.12.10.570890. [PMID: 38105996 PMCID: PMC10723448 DOI: 10.1101/2023.12.10.570890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Functional analysis of non-coding variants associated with human congenital disorders remains challenging due to the lack of efficient in vivo models. Here we introduce dual-enSERT, a robust Cas9-based two-color fluorescent reporter system which enables rapid, quantitative comparison of enhancer allele activities in live mice of any genetic background. We use this new technology to examine and measure the gain- and loss-of-function effects of enhancer variants linked to limb polydactyly, autism, and craniofacial malformation. By combining dual-enSERT with single-cell transcriptomics, we characterize variant enhancer alleles at cellular resolution, thereby implicating candidate molecular pathways in pathogenic enhancer misregulation. We further show that independent, polydactyly-linked enhancer variants lead to ectopic expression in the same cell populations, indicating shared genetic mechanisms underlying non-coding variant pathogenesis. Finally, we streamline dual-enSERT for analysis in F0 animals by placing both reporters on the same transgene separated by a synthetic insulator. Dual-enSERT allows researchers to go from identifying candidate enhancer variants to analysis of comparative enhancer activity in live embryos in under two weeks.
Affiliation(s)
- Ethan W. Hollingsworth
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
  - Medical Scientist Training Program, University of California, Irvine School of Medicine, Irvine, CA 92697, USA
- Taryn A. Liu
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Sandra H. Jacinto
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Cindy X. Chen
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Joshua A. Alcantara
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
- Evgeny Z. Kvon
  - Department of Developmental and Cell Biology, University of California, Irvine, CA 92697, USA
16
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. Cell Rep Methods 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
17
Chen Z, Javed N, Moore M, Wu J, Sun G, Vinyard M, Collins A, Pinello L, Najm FJ, Bernstein BE. Integrative dissection of gene regulatory elements at base resolution. Cell Genom 2023; 3:100318. [PMID: 37388913 PMCID: PMC10300548 DOI: 10.1016/j.xgen.2023.100318] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/25/2022] [Revised: 02/21/2023] [Accepted: 03/31/2023] [Indexed: 07/01/2023]
Abstract
Although vast numbers of putative gene regulatory elements have been cataloged, the sequence motifs and individual bases that underlie their functions remain largely unknown. Here, we combine epigenetic perturbations, base editing, and deep learning to dissect regulatory sequences within the exemplar immune locus encoding CD69. We converge on a ∼170 base interval within a differentially accessible and acetylated enhancer critical for CD69 induction in stimulated Jurkat T cells. Individual C-to-T base edits within the interval markedly reduce element accessibility and acetylation, with corresponding reduction of CD69 expression. The most potent base edits may be explained by their effect on regulatory interactions between the transcriptional activators GATA3 and TAL1 and the repressor BHLHE40. Systematic analysis suggests that the interplay between GATA3 and BHLHE40 plays a general role in rapid T cell transcriptional responses. Our study provides a framework for parsing regulatory elements in their endogenous chromatin contexts and identifying operative artificial variants.
Affiliation(s)
- Zeyu Chen
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA, USA
- Nauman Javed
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA, USA
- Molly Moore
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
- Jingyi Wu
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA, USA
- Gary Sun
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA, USA
- Michael Vinyard
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Luca Pinello
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Fadi J. Najm
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
- Bradley E. Bernstein
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Gene Regulation Observatory, Broad Institute, Cambridge, MA, USA
  - Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA, USA
18
Alakuş TB. A Novel Repetition Frequency-Based DNA Encoding Scheme to Predict Human and Mouse DNA Enhancers with Deep Learning. Biomimetics (Basel) 2023; 8:218. [PMID: 37366813 DOI: 10.3390/biomimetics8020218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/28/2023] Open
Abstract
Recent studies have shown that DNA enhancers have an important role in the regulation of gene expression. They are responsible for different important biological elements and processes such as development, homeostasis, and embryogenesis. However, experimental prediction of these DNA enhancers is time-consuming and costly as it requires laboratory work. Therefore, researchers started to look for alternative ways and started to apply computation-based deep learning algorithms to this field. Yet, the inconsistency and unsuccessful prediction performance of computational-based approaches among various cell lines led to the investigation of these approaches as well. Therefore, in this study, a novel DNA encoding scheme was proposed, and solutions were sought to the problems mentioned and DNA enhancers were predicted with BiLSTM. The study consisted of four different stages for two scenarios. In the first stage, DNA enhancer data were obtained. In the second stage, DNA sequences were converted to numerical representations by both the proposed encoding scheme and various DNA encoding schemes including EIIP, integer number, and atomic number. In the third stage, the BiLSTM model was designed, and the data were classified. In the final stage, the performance of DNA encoding schemes was determined by accuracy, precision, recall, F1-score, CSI, MCC, G-mean, Kappa coefficient, and AUC scores. In the first scenario, it was determined whether the DNA enhancers belonged to humans or mice. As a result of the prediction process, the highest performance was achieved with the proposed DNA encoding scheme, and an accuracy of 92.16% and an AUC score of 0.85 were calculated, respectively. The closest accuracy score to the proposed scheme was obtained with the EIIP DNA encoding scheme and the result was observed as 89.14%. The AUC score of this scheme was measured as 0.87. 
Among the remaining DNA encoding schemes, the atomic number showed an accuracy score of 86.61%, while this rate decreased to 76.96% with the integer scheme. The AUC values of these schemes were 0.84 and 0.82, respectively. In the second scenario, it was determined whether there was a DNA enhancer and, if so, it was decided to which species this enhancer belonged. In this scenario, the highest accuracy score was obtained with the proposed DNA encoding scheme and the result was 84.59%. Moreover, the AUC score of the proposed scheme was determined as 0.92. EIIP and integer DNA encoding schemes showed accuracy scores of 77.80% and 73.68%, respectively, while their AUC scores were close to 0.90. The most ineffective prediction was performed with the atomic number and the accuracy score of this scheme was calculated as 68.27%. Finally, the AUC score of this scheme was 0.81. At the end of the study, it was observed that the proposed DNA encoding scheme was successful and effective in predicting DNA enhancers.
Affiliation(s)
- Talha Burak Alakuş
  - Department of Software Engineering, Faculty of Engineering, Kırklareli University, 39100 Kırklareli, Turkey
19
Kaur A, Chauhan APS, Aggarwal AK. Prediction of Enhancers in DNA Sequence Data using a Hybrid CNN-DLSTM Model. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1327-1336. [PMID: 35417351 DOI: 10.1109/tcbb.2022.3167090] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 05/04/2023]
Abstract
An enhancer, a distal cis-regulatory element, controls gene expression. Experimental prediction of enhancer elements is time-consuming and expensive. Consequently, various inexpensive, fast deep learning-based methods have been developed for predicting enhancers and determining their strength. In this paper, we propose a two-stage deep learning-based framework leveraging DNA structural features, natural language processing, a convolutional neural network, and long short-term memory to accurately predict enhancer elements in genomics data. In the first stage, we extract features from DNA sequence data using three feature representation techniques: k-mer based feature extraction with word2vector-based interpretation of the underlying patterns, one-hot encoding, and the DNAshape technique. In the second stage, the strength of enhancers is predicted from the extracted features using a hybrid deep learning model. The method is capable of adapting itself to varying sizes of datasets. Also, as the proposed model can capture long-range sequence patterns, the method remains robust to minor variations in the genomic sequence. The method outperforms other state-of-the-art methods at both stages in terms of prediction accuracy, specificity, Matthews correlation coefficient, and area under the ROC curve. In summary, the proposed method is a reliable method for enhancer prediction.
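Two of the feature representations named in the abstract above, k-mer tokenization and one-hot encoding, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `kmers` and `one_hot`, the choice of k = 3, the overlapping stride of 1, and the all-zero vector for ambiguous bases are assumptions made here.

```python
# Hypothetical helpers for illustration; names and defaults are assumptions.

def kmers(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# One row per base in the fixed order A, C, G, T.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}

def one_hot(seq):
    """Encode each base as a 4-element vector; unknown bases become all zeros."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in seq.upper()]

tokens = kmers("ACGTA", k=3)   # k-mer "words" that a word2vec-style model could embed
matrix = one_hot("ACGN")       # length-by-4 matrix suitable as CNN input
```

The k-mer tokens play the role of words for an embedding model, while the one-hot matrix is the usual direct input to a convolutional layer.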
20
Kreibich E, Krebs AR. Cofactors: a new layer of specificity to enhancer regulation. Trends Biochem Sci 2022; 47:993-995. [PMID: 35970663 DOI: 10.1016/j.tibs.2022.07.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/21/2022] [Accepted: 07/29/2022] [Indexed: 12/24/2022]
Abstract
Cofactors are essential effectors of the transcription control machinery. How this functionally diverse group of factors is used in the genome remains elusive. A recent study by Neumayr, Haberle et al. sheds light on this question, showing that enhancers depend on defined combinations of cofactors for their activation.
Affiliation(s)
- Elisa Kreibich
  - EMBL Heidelberg, Meyerhofstraße 1, 69117 Heidelberg, Germany
  - Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
- Arnaud R Krebs
  - EMBL Heidelberg, Meyerhofstraße 1, 69117 Heidelberg, Germany
21
Leung AKY, Yao L, Yu H. Functional genomic assays to annotate enhancer-promoter interactions genome wide. Hum Mol Genet 2022; 31:R97-R104. [PMID: 36018818 PMCID: PMC9585677 DOI: 10.1093/hmg/ddac204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/08/2022] [Revised: 08/16/2022] [Accepted: 08/17/2022] [Indexed: 11/14/2022] Open
Abstract
Enhancers are pivotal for regulating gene transcription that occurs at promoters. Identification of the interacting enhancer-promoter pairs and understanding the mechanisms behind how they interact and how enhancers modulate transcription can provide fundamental insight into gene regulatory networks. Recently, advances in high-throughput methods in three major areas have furthered our knowledge about transcription: chromosome conformation capture assays, such as Hi-C, to study basic chromatin architecture; ectopic reporter experiments, such as self-transcribing active regulatory region sequencing (STARR-seq), to quantify promoter and enhancer activity; and endogenous perturbations, such as clustered regularly interspaced short palindromic repeat interference (CRISPRi), to identify enhancer-promoter compatibility. In this review, we will discuss the major method developments and key findings from these assays.
Affiliation(s)
- Alden King-Yung Leung
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Genomics and Proteomics Technology Development (CGPT), Cornell University, Ithaca, NY 14853, USA
- Li Yao
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Genomics and Proteomics Technology Development (CGPT), Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Center for Genomics and Proteomics Technology Development (CGPT), Cornell University, Ithaca, NY 14853, USA
22
Du AY, Zhuo X, Sundaram V, Jensen NO, Chaudhari HG, Saccone NL, Cohen BA, Wang T. Functional characterization of enhancer activity during a long terminal repeat's evolution. Genome Res 2022; 32:1840-1851. [PMID: 36192170 PMCID: PMC9712623 DOI: 10.1101/gr.276863.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Accepted: 08/23/2022] [Indexed: 11/24/2022]
Abstract
Many transposable elements (TEs) contain transcription factor binding sites and are implicated as potential regulatory elements. However, TEs are rarely functionally tested for regulatory activity, which in turn limits our understanding of how TE regulatory activity has evolved. We systematically tested the human LTR18A subfamily for regulatory activity using massively parallel reporter assay (MPRA) and found AP-1- and CEBP-related binding motifs as drivers of enhancer activity. Functional analysis of evolutionarily reconstructed ancestral sequences revealed that LTR18A elements have generally lost regulatory activity over time through sequence changes, with the largest effects occurring owing to mutations in the AP-1 and CEBP motifs. We observed that the two motifs are conserved at higher rates than expected based on neutral evolution. Finally, we identified LTR18A elements as potential enhancers in the human genome, primarily in epithelial cells. Together, our results provide a model for the origin, evolution, and co-option of TE-derived regulatory elements.
Affiliation(s)
- Alan Y Du
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Xiaoyu Zhuo
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Vasavi Sundaram
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Nicholas O Jensen
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Hemangi G Chaudhari
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Nancy L Saccone
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Barak A Cohen
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
- Ting Wang
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA
  - The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
23
McAfee JC, Bell JL, Krupa O, Matoba N, Stein JL, Won H. Focus on your locus with a massively parallel reporter assay. J Neurodev Disord 2022; 14:50. [PMID: 36085003 PMCID: PMC9463819 DOI: 10.1186/s11689-022-09461-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/19/2021] [Accepted: 09/01/2022] [Indexed: 01/01/2023] Open
Abstract
A growing number of variants associated with risk for neurodevelopmental disorders have been identified by genome-wide association and whole genome sequencing studies. As common risk variants often fall within large haplotype blocks covering long stretches of the noncoding genome, the causal variants within an associated locus are often unknown. Similarly, the effect of rare noncoding risk variants identified by whole genome sequencing on molecular traits is seldom known without functional assays. A massively parallel reporter assay (MPRA) is an assay that can functionally validate thousands of regulatory elements simultaneously using high-throughput sequencing and barcode technology. MPRA has been adapted to various experimental designs that measure gene regulatory effects of genetic variants within cis- and trans-regulatory elements as well as posttranscriptional processes. This review discusses different MPRA designs that have been or could be used in the future to experimentally validate genetic variants associated with neurodevelopmental disorders. Though MPRA has limitations such as it does not model genomic context, this assay can help narrow down the underlying genetic causes of neurodevelopmental disorders by screening thousands of sequences in one experiment. We conclude by describing future directions of this technique such as applications of MPRA for gene-by-environment interactions and pharmacogenetics.
Affiliation(s)
- Jessica C McAfee
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jessica L Bell
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Oleh Krupa
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Nana Matoba
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jason L Stein
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Hyejung Won
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
24
Yao L, Liang J, Ozer A, Leung AKY, Lis JT, Yu H. A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nat Biotechnol 2022; 40:1056-1065. [PMID: 35177836 PMCID: PMC9288987 DOI: 10.1038/s41587-022-01211-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 06/03/2021] [Accepted: 01/06/2022] [Indexed: 01/15/2023]
Abstract
Mounting evidence supports the idea that transcriptional patterns serve as more specific identifiers of active enhancers than histone marks; however, the optimal strategy to identify active enhancers both experimentally and computationally has not been determined. Here, we compared 13 genome-wide RNA sequencing (RNA-seq) assays in K562 cells and show that nuclear run-on followed by cap-selection assay (GRO/PRO-cap) has advantages in enhancer RNA detection and active enhancer identification. We also introduce a tool, peak identifier for nascent transcript starts (PINTS), to identify active promoters and enhancers genome wide and pinpoint the precise location of 5' transcription start sites. Finally, we compiled a comprehensive enhancer candidate compendium based on the detected enhancer RNA (eRNA) transcription start sites (TSSs) available in 120 cell and tissue types, which can be accessed at https://pints.yulab.org. With knowledge of the best available assays and pipelines, this large-scale annotation of candidate enhancers will pave the way for selection and characterization of their functions in a time- and labor-efficient manner.
Affiliation(s)
- Li Yao
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - Abdullah Ozer
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
| | - Alden King-Yung Leung
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
| | - John T Lis
- Department of Computational Biology, Cornell University, Ithaca, NY, USA.
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA.
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, USA.
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA.
| |
|
25
|
Bergman DT, Jones TR, Liu V, Ray J, Jagoda E, Siraj L, Kang HY, Nasser J, Kane M, Rios A, Nguyen TH, Grossman SR, Fulco CP, Lander ES, Engreitz JM. Compatibility rules of human enhancer and promoter sequences. Nature 2022; 607:176-184. [PMID: 35594906 PMCID: PMC9262863 DOI: 10.1038/s41586-022-04877-w] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Received: 09/28/2021] [Accepted: 05/17/2022] [Indexed: 01/03/2023]
Abstract
Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters. A proposed model for this specificity is that promoters have sequence-encoded preferences for certain enhancers, for example, mediated by interacting sets of transcription factors or cofactors. This 'biochemical compatibility' model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila. However, the degree to which human enhancers and promoters are intrinsically compatible has not yet been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we design a high-throughput reporter assay called enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) and apply it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility, whereby most enhancers activate all promoters by similar amounts, and intrinsic enhancer and promoter activities multiplicatively combine to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters show subtle preferential effects. Promoters of housekeeping genes contain built-in activating motifs for factors such as GABPA and YY1, which decrease the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lack these motifs and show stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.
Affiliation(s)
- Drew T Bergman
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | | | - Vincent Liu
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Judhajeet Ray
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Evelyn Jagoda
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Layla Siraj
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biophysics Graduate Program, Harvard University, Cambridge, MA, USA
| | - Helen Y Kang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Michael Kane
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Antonio Rios
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Tung H Nguyen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Charles P Fulco
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bristol Myers Squibb, Cambridge, MA, USA
| | - Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, MIT, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Jesse M Engreitz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford University School of Medicine, Stanford, CA, USA.
| |
|
26
|
Fehér A, Schnúr A, Muenthaisong S, Bellák T, Ayaydin F, Várady G, Kemter E, Wolf E, Dinnyés A. Establishment and characterization of a novel human induced pluripotent stem cell line stably expressing the iRFP720 reporter. Sci Rep 2022; 12:9874. [PMID: 35701501 PMCID: PMC9198085 DOI: 10.1038/s41598-022-12956-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 05/19/2022] [Indexed: 11/27/2022] Open
Abstract
Stem cell therapy has great potential for replacing beta-cell loss in diabetic patients. However, a key obstacle to cell therapy’s success is to preserve viability and function of the engrafted cells. While several strategies have been developed to improve engrafted beta-cell survival, tools to evaluate the efficacy within the body by imaging are limited. Traditional labeling tools, such as GFP-like fluorescent proteins, have limited penetration depths in vivo due to tissue scattering and absorption. To circumvent this limitation, a near-infrared fluorescent mutant version of the DrBphP bacteriophytochrome, iRFP720, has been developed for in vivo imaging and stem/progenitor cell tracking. Here, we present the generation and characterization of an iRFP720 expressing human induced pluripotent stem cell (iPSC) line, which can be used for real-time imaging in various biological applications. To generate the transgenic cells, the CRISPR/Cas9 technology was applied. A puromycin resistance gene was inserted into the AAVS1 locus, driven by the endogenous PPP1R12C promoter, along with the CAG-iRFP720 reporter cassette, which was flanked by insulator elements. Proper integration of the transgene into the targeted genomic region was assessed by comprehensive genetic analysis, verifying precise genome editing. Stable expression of iRFP720 in the cells was confirmed and imaged by their near-infrared fluorescence. We demonstrated that the reporter iPSCs exhibit normal stem cell characteristics and can be efficiently differentiated towards the pancreatic lineage. As the genetically modified reporter cells show retained pluripotency and multilineage differentiation potential, they hold great potential as a cellular model in a variety of biological and pharmacological applications.
Affiliation(s)
- Anita Fehér
- BioTalentum Ltd, Aulich Lajos Street 26, Gödöllő, 2100, Hungary
| | - Andrea Schnúr
- BioTalentum Ltd, Aulich Lajos Street 26, Gödöllő, 2100, Hungary
| | | | - Tamás Bellák
- BioTalentum Ltd, Aulich Lajos Street 26, Gödöllő, 2100, Hungary
- Department of Anatomy, Histology and Embryology, Albert Szent-Györgyi Medical School, University of Szeged, Szeged, 6724, Hungary
| | - Ferhan Ayaydin
- Functional Cell Biology and Immunology Advanced Core Facility, Hungarian Centre of Excellence for Molecular Medicine, University of Szeged (HCEMM-USZ), Szeged, 6720, Hungary
- Laboratory of Cellular Imaging, Biological Research Centre, Eötvös Loránd Research Network, Szeged, Hungary
| | - György Várady
- Research Centre for Natural Sciences, Institute of Enzymology, Budapest, 1117, Hungary
| | - Elisabeth Kemter
- Chair for Molecular Animal Breeding and Biotechnology, Gene Centre and Department of Veterinary Sciences, LMU Munich, 81377, Munich, Germany
- Centre for Innovative Medical Models (CiMM), Department of Veterinary Sciences, LMU Munich, 85764, Oberschleißheim, Germany
- German Center for Diabetes Research (DZD), 85764, Neuherberg, Germany
| | - Eckhard Wolf
- Chair for Molecular Animal Breeding and Biotechnology, Gene Centre and Department of Veterinary Sciences, LMU Munich, 81377, Munich, Germany
- Centre for Innovative Medical Models (CiMM), Department of Veterinary Sciences, LMU Munich, 85764, Oberschleißheim, Germany
- German Center for Diabetes Research (DZD), 85764, Neuherberg, Germany
| | - András Dinnyés
- BioTalentum Ltd, Aulich Lajos Street 26, Gödöllő, 2100, Hungary
- HCEMM-USZ Stem Cell Research Group, Hungarian Centre of Excellence for Molecular Medicine, Szeged, 6723, Hungary
- Department of Cell Biology and Molecular Medicine, University of Szeged, Szeged, 6720, Hungary
- Department of Physiology and Animal Health, Institute of Physiology and Animal Nutrition, Hungarian University of Agriculture and Life Sciences, Gödöllő, 2100, Hungary
| |
|
27
|
Boldyreva LV, Andreyeva EN, Pindyurin AV. Position Effect Variegation: Role of the Local Chromatin Context in Gene Expression Regulation. Mol Biol 2022. [DOI: 10.1134/s0026893322030049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
28
|
Abstract
DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs). Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF–TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression. Analysis of massively parallel reporter assays measuring the transcriptional activity of DNA sequences indicates that most transcription factor (TF) activity is additive and does not rely on specific TF–TF interactions. Individual TFs can have different gene regulatory activities.
|
29
|
Ribeiro-Dos-Santos AM, Hogan MS, Luther RD, Brosh R, Maurano MT. Genomic context sensitivity of insulator function. Genome Res 2022; 32:425-436. [PMID: 35082140 PMCID: PMC8896466 DOI: 10.1101/gr.276449.121] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/01/2021] [Accepted: 01/25/2022] [Indexed: 11/24/2022]
Abstract
The specificity of interactions between genomic regulatory elements and potential target genes is influenced by the binding of insulator proteins such as CTCF, which can act as potent enhancer blockers when interposed between an enhancer and a promoter in a reporter assay. But not all CTCF sites genome-wide function as insulator elements, depending on cellular and genomic context. To dissect the influence of genomic context on enhancer blocker activity, we integrated reporter constructs with promoter-only, promoter and enhancer, and enhancer blocker configurations at hundreds of thousands of genomic sites using the Sleeping Beauty transposase. Deconvolution of reporter activity by genomic position reveals distinct expression patterns subject to genomic context, including a compartment of enhancer blocker reporter integrations with robust expression. The high density of integration sites permits quantitative delineation of characteristic genomic context sensitivity profiles and their decomposition into sensitivity to both local and distant DNase I hypersensitive sites. Furthermore, using a single-cell expression approach to test the effect of integrated reporters for differential expression of nearby endogenous genes reveals that CTCF insulator elements do not completely abrogate reporter effects on endogenous gene expression. Collectively, our results lend new insight into genomic regulatory compartmentalization and its influence on the determinants of promoter–enhancer specificity.
Affiliation(s)
| | - Megan S Hogan
- Institute for Systems Genetics, NYU Grossman School of Medicine
| | - Raven D Luther
- Institute for Systems Genetics, NYU Grossman School of Medicine
| | - Ran Brosh
- Institute for Systems Genetics, NYU Grossman School of Medicine
| | | |
|
30
|
Hong CKY, Cohen BA. Genomic environments scale the activities of diverse core promoters. Genome Res 2022; 32:85-96. [PMID: 34961747 PMCID: PMC8744677 DOI: 10.1101/gr.276025.121] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/21/2021] [Accepted: 11/15/2021] [Indexed: 11/24/2022]
Abstract
A classical model of gene regulation is that enhancers provide specificity whereas core promoters provide a modular site for the assembly of the basal transcriptional machinery. However, examples of core promoter specificity have led to an alternate hypothesis in which specificity is achieved by core promoters with different sequence motifs that respond differently to genomic environments containing different enhancers and chromatin landscapes. To distinguish between these models, we measured the activities of hundreds of diverse core promoters in four different genomic locations and, in a complementary experiment, six different core promoters at thousands of locations across the genome. Although genomic locations had large effects on expression, the intrinsic activities of different classes of promoters were preserved across genomic locations, suggesting that core promoters are modular regulatory elements whose activities are independently scaled up or down by different genomic locations. This scaling of promoter activities is nonlinear and depends on the genomic location and the strength of the core promoter. Our results support the classical model of regulation in which diverse core promoter motifs set the intrinsic strengths of core promoters, which are then amplified or dampened by the activities of their genomic environments.
Affiliation(s)
- Clarice K Y Hong
- The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, Missouri 63110, USA
- Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, Missouri 63110, USA
| | - Barak A Cohen
- The Edison Family Center for Genome Sciences and Systems Biology, School of Medicine, Washington University in St. Louis, Saint Louis, Missouri 63110, USA
- Department of Genetics, School of Medicine, Washington University in St. Louis, Saint Louis, Missouri 63110, USA
| |
|
31
|
Romanov SE, Kalashnikova DA, Laktionov PP. Methods of massive parallel reporter assays for investigation of enhancers. Vavilovskii Zhurnal Genet Selektsii 2021; 25:344-355. [PMID: 34901731 PMCID: PMC8627875 DOI: 10.18699/vj21.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2020] [Revised: 03/28/2021] [Accepted: 03/28/2021] [Indexed: 11/19/2022] Open
Abstract
The correct deployment of genetic programs for development and differentiation relies on finely coordinated regulation of specific gene sets. Genomic regulatory elements play an exceptional role in this process. There are a few types of gene regulatory elements, including promoters, enhancers, insulators and silencers. Alterations of gene regulatory elements may cause various pathologies, including cancer, congenital disorders and autoimmune diseases. The development of high-throughput genomic assays has made it possible to significantly accelerate the accumulation of information about the characteristic epigenetic properties of regulatory elements. In combination with high-throughput studies focused on the genome-wide distribution of epigenetic marks, regulatory proteins and the spatial structure of chromatin, this significantly expands the understanding of the principles of epigenetic regulation of genes and allows potential regulatory elements to be searched for in silico. However, common experimental approaches used to study the local characteristics of chromatin have a number of technical limitations that may reduce the reliability of computational identification of genomic regulatory sequences. Taking into account the variability of the functions of epigenetic determinants and the complex multicomponent regulation of the activity of genomic elements, their functional verification is often required. A plethora of methods have been developed to study the functional role of regulatory elements on the genome scale. Common experimental approaches for in silico identification of regulatory elements and their inherent technical limitations will be described. The present review is focused on original high-throughput methods of enhancer activity reporter analysis that are currently used to validate predicted regulatory elements and to perform de novo searches.
The methods described make it possible to assess the functional role of the nucleotide sequence of a regulatory element, to determine its exact boundaries and to assess the influence of the local state of chromatin on the activity of enhancers and gene expression. These approaches have contributed substantially to the understanding of the fundamental principles of gene regulation.
Affiliation(s)
- S E Romanov
- Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
| | - D A Kalashnikova
- Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
| | - P P Laktionov
- Novosibirsk State University, Epigenetics Laboratory, Department of Natural Sciences, Novosibirsk, Russia
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Genomics Laboratory, Novosibirsk, Russia
| |
|
32
|
A broad analysis of splicing regulation in yeast using a large library of synthetic introns. PLoS Genet 2021; 17:e1009805. [PMID: 34570750 PMCID: PMC8496845 DOI: 10.1371/journal.pgen.1009805] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/02/2021] [Revised: 10/07/2021] [Accepted: 09/03/2021] [Indexed: 11/19/2022] Open
Abstract
RNA splicing is a key process in eukaryotic gene expression, in which an intron is spliced out of a pre-mRNA molecule to eventually produce a mature mRNA. Most intron-containing genes are constitutively spliced, hence efficient splicing of an intron is crucial for efficient regulation of gene expression. Here we use a large synthetic oligo library of ~20,000 variants to explore how different intronic sequence features affect splicing efficiency and mRNA expression levels in S. cerevisiae. Introns are defined by three functional sites, the 5’ donor site, the branch site, and the 3’ acceptor site. Using a combinatorial design of synthetic introns, we demonstrate how non-consensus splice site sequences in each of these sites affect splicing efficiency. We then show that S. cerevisiae splicing machinery tends to select alternative 3’ splice sites downstream of the original site, and we suggest that this tendency created a selective pressure, leading to the avoidance of cryptic splice site motifs near introns’ 3’ ends. We further use natural intronic sequences from other yeast species, whose splicing machineries have diverged to various extents, to show how intron architectures in the various species have been adapted to the organism’s splicing machinery. We suggest that the observed tendency for cryptic splicing is a result of a loss of a specific splicing factor, U2AF1. Lastly, we show that synthetic sequences containing two introns give rise to alternative RNA isoforms in S. cerevisiae, demonstrating that merely a synthetic fusion of two introns might suffice to facilitate alternative splicing in yeast. Our study reveals novel mechanisms by which introns are shaped in evolution to allow cells to regulate their transcriptome. In addition, it provides a valuable resource to study the regulation of constitutive and alternative splicing in a model organism.
RNA splicing is a process in which parts of a new pre-mRNA are spliced out of the mRNA molecule to eventually produce a mature mRNA. Those RNA segments that are spliced out are termed introns, and they are found in most genes in eukaryotic organisms. Hence regulation of this process has a major role in the control of gene expression. The budding yeast S. cerevisiae is a popular model organism for eukaryotic cell biology, but in terms of splicing it differs, as it has only a few intron-containing genes. Nevertheless, this species has been used to study basic principles of splicing regulation based on its ~300 introns. Here we used the technology of a large synthetic genetic library to introduce many new intron-containing genes to the yeast genome, to explore splicing regulation at a wider scope than was possible so far. Reassuringly, our results confirm known regulatory mechanisms, and further expand our understanding of splicing regulation, specifically how the yeast splicing machinery interacts with the end of introns, and how through evolution introns have evolved to avoid unwanted misidentifications of this end. We further demonstrate the potential of the yeast splicing machinery to alternatively splice a two-intron gene, which is common in other eukaryotes but rare in yeast. Our work presents a first-of-its-kind resource for the systematic study of splicing in live cells.
|
33
|
Fan K, Moore JE, Zhang XO, Weng Z. Genetic and epigenetic features of promoters with ubiquitous chromatin accessibility support ubiquitous transcription of cell-essential genes. Nucleic Acids Res 2021; 49:5705-5725. [PMID: 33978759 PMCID: PMC8191798 DOI: 10.1093/nar/gkab345] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/02/2021] [Revised: 03/19/2021] [Accepted: 05/01/2021] [Indexed: 12/04/2022] Open
Abstract
Gene expression is controlled by regulatory elements within accessible chromatin. Although most regulatory elements are cell type-specific, a subset is accessible in nearly all the 517 human and 94 mouse cell and tissue types assayed by the ENCODE consortium. We systematically analyzed 9000 human and 8000 mouse ubiquitously-accessible candidate cis-regulatory elements (cCREs) with promoter-like signatures (PLSs) from ENCODE, which we denote ubi-PLSs. These are more CpG-rich than non-ubi-PLSs and correspond to genes with ubiquitously high transcription, including a majority of cell-essential genes. ubi-PLSs are enriched with motifs of ubiquitously-expressed transcription factors and preferentially bound by transcriptional cofactors regulating ubiquitously-expressed genes. They are highly conserved between human and mouse at the synteny level but exhibit frequent turnover of motif sites; accordingly, ubi-PLSs show increased variation at their centers compared with flanking regions among the ∼186 thousand human genomes sequenced by the TOPMed project. Finally, ubi-PLSs are enriched in genes implicated in Mendelian diseases, especially diseases broadly impacting most cell types, such as deficiencies in mitochondrial functions. Thus, a set of roughly 9000 mammalian promoters are actively maintained in an accessible state across cell types by a distinct set of transcription factors and cofactors to ensure the transcriptional programs of cell-essential genes.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Xiao-ou Zhang
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| |
|
34
|
Brosh R, Laurent JM, Ordoñez R, Huang E, Hogan MS, Hitchcock AM, Mitchell LA, Pinglay S, Cadley JA, Luther RD, Truong DM, Boeke JD, Maurano MT. A versatile platform for locus-scale genome rewriting and verification. Proc Natl Acad Sci U S A 2021; 118:e2023952118. [PMID: 33649239 PMCID: PMC7958457 DOI: 10.1073/pnas.2023952118] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 12/31/2022] Open
Abstract
Routine rewriting of loci associated with human traits and diseases would facilitate their functional analysis. However, existing DNA integration approaches are limited in terms of scalability and portability across genomic loci and cellular contexts. We describe Big-IN, a versatile platform for targeted integration of large DNAs into mammalian cells. CRISPR/Cas9-mediated targeting of a landing pad enables subsequent recombinase-mediated delivery of variant payloads and efficient positive/negative selection for correct clones in mammalian stem cells. We demonstrate integration of constructs up to 143 kb, and an approach for one-step scarless delivery. We developed a staged pipeline combining PCR genotyping and targeted capture sequencing for economical and comprehensive verification of engineered stem cells. Our approach should enable combinatorial interrogation of genomic functional elements and systematic locus-scale analysis of genome function.
Affiliation(s)
- Ran Brosh
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Jon M Laurent
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Raquel Ordoñez
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Emily Huang
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Megan S Hogan
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | | | - Leslie A Mitchell
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Sudarshan Pinglay
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - John A Cadley
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Raven D Luther
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - David M Truong
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
| | - Jef D Boeke
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
- Department of Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016
- Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY 11201
| | - Matthew T Maurano
- Institute for Systems Genetics, NYU Langone Health, New York, NY 10016
- Department of Pathology, NYU Langone Health, New York, NY 10016
| |
|
35
|
Gödecke N, Herrmann S, Hauser H, Mayer-Bartschmid A, Trautwein M, Wirth D. Rational Design of Single Copy Expression Cassettes in Defined Chromosomal Sites Overcomes Intraclonal Cell-to-Cell Expression Heterogeneity and Ensures Robust Antibody Production. ACS Synth Biol 2021; 10:145-157. [PMID: 33382574 DOI: 10.1021/acssynbio.0c00519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The expression of endogenous genes as well as transgenes depends on regulatory elements within and surrounding genes as well as their epigenetic modifications. Members of a cloned cell population often show pronounced cell-to-cell heterogeneity with respect to the expression of a certain gene. To investigate the heterogeneity of recombinant protein expression we targeted cassettes into two preselected chromosomal hot-spots in Chinese hamster ovary (CHO) cells. Depending on the gene of interest and the design of the expression cassette, we found strong expression variability that could be reduced by epigenetic modifiers, but not by site-specific recruitment of the modulator dCas9-VPR. In particular, the implementation of ubiquitous chromatin opening elements (UCOEs) reduced cell-to-cell heterogeneity and concomitantly increased expression. The application of this method to recombinant antibody expression confirmed that rational design of cell lines for production of transgenes with predictable and high titers is a promising approach.
Affiliation(s)
- Natascha Gödecke
- RG Model Systems for Infection and Immunity, Helmholtz Centre for Infection Research, Braunschweig 38124, Germany
| | - Sabrina Herrmann
- RG Model Systems for Infection and Immunity, Helmholtz Centre for Infection Research, Braunschweig 38124, Germany
| | - Hansjörg Hauser
- Staff Unit Scientific Strategy, Helmholtz Centre for Infection Research, Braunschweig 38124, Germany
| | | | | | - Dagmar Wirth
- RG Model Systems for Infection and Immunity, Helmholtz Centre for Infection Research, Braunschweig 38124, Germany
- Institute of Experimental Hematology, Medical University Hannover, Hannover 30625, Germany
| |
|
36
|
Mulvey B, Lagunas T, Dougherty JD. Massively Parallel Reporter Assays: Defining Functional Psychiatric Genetic Variants Across Biological Contexts. Biol Psychiatry 2021; 89:76-89. [PMID: 32843144 PMCID: PMC7938388 DOI: 10.1016/j.biopsych.2020.06.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 02/11/2020] [Revised: 06/09/2020] [Accepted: 06/10/2020] [Indexed: 12/18/2022]
Abstract
Neuropsychiatric phenotypes have long been known to be influenced by heritable risk factors, directly confirmed by the past decade of genetic studies that have revealed specific genetic variants enriched in disease cohorts. However, the initial hope that a small set of genes would be responsible for a given disorder proved false. The more complex reality is that a given disorder may be influenced by myriad small-effect noncoding variants and/or by rare but severe coding variants, many de novo. Noncoding genomic sequences, for which molecular functions cannot usually be inferred, harbor a large portion of these variants, creating a substantial barrier to understanding higher-order molecular and biological systems of disease. Fortunately, novel genetic technologies (scalable oligonucleotide synthesis, RNA sequencing, and CRISPR [clustered regularly interspaced short palindromic repeats]) have opened novel avenues to experimentally identify biologically significant variants en masse. Massively parallel reporter assays (MPRAs) are an especially versatile technique resulting from such innovations. MPRAs are powerful molecular genetics tools that can be used to screen thousands of untranscribed or untranslated sequences and their variants for functional effects in a single experiment. This approach, though underutilized in psychiatric genetics, has several useful features for the field. We review methods for assaying putatively functional genetic variants and regions, emphasizing MPRAs and the opportunities they hold for dissection of psychiatric polygenicity. We discuss literature applying functional assays in neurogenetics, highlighting strengths, caveats, and design considerations, especially regarding disease-relevant variables (cell type, neurodevelopment, and sex), and we ultimately propose applications of MPRA to both computational and experimental neurogenetics of polygenic disease risk.
Affiliation(s)
- Bernard Mulvey
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
- Tomás Lagunas
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
- Joseph D Dougherty
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri.
|
37
|
Wambach JA, Yang P, Wegner DJ, Heins HB, Luke C, Li F, White FV, Cole FS. Functional Genomics of ABCA3 Variants. Am J Respir Cell Mol Biol 2020; 63:436-443. [PMID: 32692933 DOI: 10.1165/rcmb.2020-0034ma] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/26/2022] Open
Abstract
Rare or private, biallelic variants in the ABCA3 (ATP-binding cassette transporter A3) gene are the most common monogenic cause of lethal neonatal respiratory failure and childhood interstitial lung disease. Functional characterization of fewer than 10% of over 200 disease-associated ABCA3 variants (majority missense) suggests either disruption of ABCA3 protein trafficking (type I) or of ATPase-mediated phospholipid transport (type II). Therapies remain limited and nonspecific. A scalable platform is required for functional characterization of ABCA3 variants and discovery of pharmacologic correctors. To address this need, we first silenced the endogenous ABCA3 locus in A549 cells with CRISPR/Cas9 genome editing. Next, to generate a parent cell line (A549/ABCA3-/-) with a single recombination target site for genomic integration and stable expression of individual ABCA3 missense variant cDNAs, we used lentiviral-mediated integration of a LoxFAS cassette, FACS, and dilutional cloning. To assess the fidelity of this cell-based model, we compared functional characterization (ABCA3 protein processing, ABCA3 immunofluorescence colocalization with intracellular markers, ultrastructural vesicle phenotype) of two individual ABCA3 mutants (type I mutant, p.L101P; type II mutant, p.E292V) in A549/ABCA3-/- cells and in both A549 cells and primary, human alveolar type II cells that transiently express each cDNA after adenoviral-mediated transduction. We also confirmed pharmacologic rescue of ABCA3 variant-encoded mistrafficking and vesicle diameter in A549/ABCA3-/- cells that express p.G1421R (type I mutant). A549/ABCA3-/- cells provide a scalable, genetically versatile, physiologically relevant functional genomics platform for discovery of variant-specific mechanisms that disrupt ABCA3 function and for screening of potential ABCA3 pharmacologic correctors.
Affiliation(s)
- Ping Yang
- Edward Mallinckrodt Department of Pediatrics
- Cliff Luke
- Edward Mallinckrodt Department of Pediatrics
- Fuhai Li
- Edward Mallinckrodt Department of Pediatrics.,Institute for Informatics, and
- Frances V White
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
|
38
|
Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
- Rockie Chong
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Laura Day
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Sriram Kosuri
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Frank W Albert
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
|
39
|
Hammelman J, Krismer K, Banerjee B, Gifford DK, Sherwood RI. Identification of determinants of differential chromatin accessibility through a massively parallel genome-integrated reporter assay. Genome Res 2020; 30:1468-1480. [PMID: 32973041 PMCID: PMC7605270 DOI: 10.1101/gr.263228.120] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/05/2020] [Accepted: 08/26/2020] [Indexed: 12/20/2022]
Abstract
A key mechanism in cellular regulation is the ability of the transcriptional machinery to physically access DNA. Transcription factors interact with DNA to alter the accessibility of chromatin, which enables changes to gene expression during development or disease or as a response to environmental stimuli. However, the regulation of DNA accessibility via the recruitment of transcription factors is difficult to study in the context of the native genome because every genomic site is distinct in multiple ways. Here we introduce the multiplexed integrated accessibility assay (MIAA), an assay that measures chromatin accessibility of synthetic oligonucleotide sequence libraries integrated into a controlled genomic context with low native accessibility. We apply MIAA to measure the effects of sequence motifs on cell type-specific accessibility between mouse embryonic stem cells and embryonic stem cell-derived definitive endoderm cells, screening 7905 distinct DNA sequences. MIAA recapitulates differential accessibility patterns of 100-nt sequences derived from natively differential genomic regions, identifying E-box motifs common to epithelial-mesenchymal transition driver transcription factors in stem cell-specific accessible regions that become repressed in endoderm. We show that a single binding motif for a key regulatory transcription factor is sufficient to open chromatin, and classify sets of stem cell-specific, endoderm-specific, and shared accessibility-modifying transcription factor motifs. We also show that overexpression of two definitive endoderm transcription factors, T and Foxa2, results in changes to accessibility in DNA sequences containing their respective DNA-binding motifs and identify preferential motif arrangements that influence accessibility.
Affiliation(s)
- Jennifer Hammelman
- Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Konstantin Krismer
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Budhaditya Banerjee
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- David K Gifford
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Richard I Sherwood
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Hubrecht Institute, 3584 CT Utrecht, Netherlands
|
40
|
Tobias IC, Abatti LE, Moorthy SD, Mullany S, Taylor T, Khader N, Filice MA, Mitchell JA. Transcriptional enhancers: from prediction to functional assessment on a genome-wide scale. Genome 2020; 64:426-448. [PMID: 32961076 DOI: 10.1139/gen-2020-0104] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Abstract
Enhancers are cis-regulatory sequences located distally to target genes. These sequences consolidate developmental and environmental cues to coordinate gene expression in a tissue-specific manner. Enhancer function and tissue specificity depend on the expressed set of transcription factors, which recognize binding sites and recruit cofactors that regulate local chromatin organization and gene transcription. Unlike other genomic elements, enhancers are challenging to identify because they function independently of orientation, are often distant from their promoters, have poorly defined boundaries, and display no reading frame. In addition, there are no defined genetic or epigenetic features that are unambiguously associated with enhancer activity. Over recent years there have been developments in both empirical assays and computational methods for enhancer prediction. We review genome-wide tools, CRISPR advancements, and high-throughput screening approaches that have improved our ability to both observe and manipulate enhancers in vitro at the level of primary genetic sequences, chromatin states, and spatial interactions. We also highlight contemporary animal models and their importance to enhancer validation. Together, these experimental systems and techniques complement one another and broaden our understanding of enhancer function in development, evolution, and disease.
Affiliation(s)
- Ian C Tobias
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Luis E Abatti
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Sakthi D Moorthy
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Shanelle Mullany
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Tiegh Taylor
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Nawrah Khader
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Mario A Filice
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
- Jennifer A Mitchell
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
|
41
|
Sergeeva D, Lee GM, Nielsen LK, Grav LM. Multicopy Targeted Integration for Accelerated Development of High-Producing Chinese Hamster Ovary Cells. ACS Synth Biol 2020; 9:2546-2561. [PMID: 32835482 DOI: 10.1021/acssynbio.0c00322] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 12/13/2022]
Abstract
The ever-growing biopharmaceutical industry relies on the production of recombinant therapeutic proteins in Chinese hamster ovary (CHO) cells. The traditional timelines of CHO cell line development can be significantly shortened by the use of targeted gene integration (TI). However, broad use of TI has been limited due to the low specific productivity (qP) of TI-generated clones. Here, we show a 10-fold increase in the qP of therapeutic glycoproteins in CHO cells through the development and optimization of a multicopy TI method. We used a recombinase-mediated cassette exchange (RMCE) platform to investigate the effect of gene copy number, 5' and 3' gene regulatory elements, and landing pad features on qP. We evaluated the limitations of multicopy expression from a single genomic site as well as multiple genomic sites and found that a transcriptional bottleneck can appear with an increase in gene dosage. We created a dual-RMCE system for simultaneous multicopy TI in two genomic sites and generated isogenic high-producing clones with qP of 12-14 pg/cell/day and product titer close to 1 g/L in fed-batch. Our study provides an extensive characterization of the multicopy TI method and elucidates the relationship between gene copy number and protein expression in mammalian cells. Moreover, it demonstrates that TI-generated CHO cells are capable of producing therapeutic proteins at levels that can support their industrial manufacture.
Affiliation(s)
- Daria Sergeeva
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Gyun Min Lee
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Department of Biological Sciences, KAIST, Daejeon 34141, Republic of Korea
- Lars Keld Nielsen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane 4072, Australia
- Lise Marie Grav
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
|
42
|
Zhang T, Pilko A, Wollman R. Loci specific epigenetic drug sensitivity. Nucleic Acids Res 2020; 48:4797-4810. [PMID: 32246716 PMCID: PMC7229858 DOI: 10.1093/nar/gkaa210] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/31/2019] [Revised: 02/10/2020] [Accepted: 03/27/2020] [Indexed: 12/14/2022] Open
Abstract
Therapeutic targeting of epigenetic modulators offers a novel approach to the treatment of multiple diseases. The cellular consequences of chemical compounds that target epigenetic regulators (epi-drugs) are complex. Epi-drugs affect global cellular phenotypes and cause local changes to gene expression due to alteration of a gene chromatin environment. Despite increasing use in the clinic, the mechanisms responsible for cellular changes are unclear. Specifically, to what degree the effects are a result of cell-wide changes or disease related locus specific effects is unknown. Here we developed a platform to systematically and simultaneously investigate the sensitivity of epi-drugs at hundreds of genomic locations by combining DNA barcoding, unique split-pool encoding, and single cell expression measurements. Internal controls are used to isolate locus specific effects separately from any global consequences these drugs have. Using this platform we discovered wide-spread loci specific sensitivities to epi-drugs for three distinct epi-drugs that target histone deacetylase, DNA methylation and bromodomain proteins. By leveraging ENCODE data on chromatin modification, we identified features of chromatin environments that are most likely to be affected by epi-drugs. The measurements of loci specific epi-drugs sensitivities will pave the way to the development of targeted therapy for personalized medicine.
Affiliation(s)
- Thanutra Zhang
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Anna Pilko
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Departments of Integrative Biology and Physiology and Chemistry and Biochemistry, University of California UCLA, CA, USA
- Roy Wollman
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Departments of Integrative Biology and Physiology and Chemistry and Biochemistry, University of California UCLA, CA, USA
|
43
|
Gasperini M, Tome JM, Shendure J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nat Rev Genet 2020; 21:292-310. [PMID: 31988385 PMCID: PMC7845138 DOI: 10.1038/s41576-019-0209-0] [Citation(s) in RCA: 191] [Impact Index Per Article: 38.2] [Accepted: 12/13/2019] [Indexed: 12/14/2022]
Abstract
The human gene catalogue is essentially complete, but we lack an equivalently vetted inventory of bona fide human enhancers. Hundreds of thousands of candidate enhancers have been nominated via biochemical annotations; however, only a handful of these have been validated and confidently linked to their target genes. Here we review emerging technologies for discovering, characterizing and validating human enhancers at scale. We furthermore propose a new framework for operationally defining enhancers that accommodates the heterogeneous and complementary results that are emerging from reporter assays, biochemical measurements and CRISPR screens.
Affiliation(s)
- Molly Gasperini
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Jacob M Tome
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Allen Discovery Center for Cell Lineage, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
|
44
|
Bozek M, Gompel N. Developmental Transcriptional Enhancers: A Subtle Interplay between Accessibility and Activity: Considering Quantitative Accessibility Changes between Different Regulatory States of an Enhancer Deconvolutes the Complex Relationship between Accessibility and Activity. Bioessays 2020; 42:e1900188. [PMID: 32142172 DOI: 10.1002/bies.201900188] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/11/2019] [Revised: 01/16/2020] [Indexed: 12/21/2022]
Abstract
Measurements of open chromatin in specific cell types are widely used to infer the spatiotemporal activity of transcriptional enhancers. How reliable are these predictions? In this review, it is argued that the relationship between the accessibility and activity of an enhancer is insufficiently described by simply considering open versus closed chromatin, or active versus inactive enhancers. Instead, recent studies focusing on the quantitative nature of accessibility signal reveal subtle differences between active enhancers and their different inactive counterparts: the closed silenced state and the accessible primed and repressed states. While the open structure as such is not a specific indicator of enhancer activity, active enhancers display a higher degree of accessibility than the primed and repressed states. Molecular mechanisms that may account for these quantitative differences are discussed. A model that relates molecular events at an enhancer to changes in its activity and accessibility in a developing tissue is also proposed.
Affiliation(s)
- Marta Bozek
- Department Biochemie, Ludwig-Maximilians Universität München, Genzentrum, 81377, München, Germany
- Nicolas Gompel
- Fakultät für Biologie, Ludwig-Maximilians Universität München, Biozentrum, 82152, Planegg-Martinsried, Germany
|
45
|
King DM, Hong CKY, Shepherdson JL, Granas DM, Maricque BB, Cohen BA. Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells. eLife 2020; 9:41279. [PMID: 32043966 PMCID: PMC7077988 DOI: 10.7554/elife.41279] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/20/2018] [Accepted: 02/07/2020] [Indexed: 01/08/2023] Open
Abstract
In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.
Transcription factors are proteins that flip genetic switches; their role is to control when and where genes are active. They do this by binding to short stretches of DNA called cis-regulatory sequences. Each sequence can have several binding sites for different transcription factors, but it is largely unclear whether the transcription factors binding to the same regulatory sequence actually work together. It is possible that each transcription factor may work independently and there only needs to be a critical mass of transcription factors bound to throw the genetic switch. If this is the case, the most important features of a cis-regulatory sequence should be the number of binding sites it contains, and how tightly the transcription factors bind to those sites. The more transcription factors and the more strongly they bind, the more active the gene should be.
An alternative option is that certain transcription factors may work better together, enhancing each other's effects such that the total effect is more than the sum of its parts. If this is true, the order, orientation and spacing of the binding sites within a sequence should matter more than the number. One way to distinguish between these possibilities is to study mouse embryonic stem cells, which have a core set of four transcription factors. Looking directly at a real genome, however, can be confusing and it is difficult to measure the effects of different cis-regulatory sequences because genes differ in so many other ways. To tackle this problem, King et al. created a synthetic set of cis-regulatory sequences based on the four core transcription factors found in mouse stem cells. The synthetic set had every combination of two, three or four of the binding sites, with each site either facing forwards or backwards along the DNA strand. King et al. attached each of the synthetic cis-regulatory sequences to a reporter gene to find out how well each sequence performed. This revealed that the cis-regulatory sequences with the most binding sites and the tightest binding affinities work best, suggesting that transcription factors mainly work independently. There was evidence of some interaction between some transcription factors, because, of the synthetic sequences with four binding sites, some worked better than others, and there were patterns in the most effective binding site combinations. However, these effects were small and when King et al. went on to test sequences from the real mouse genome, the most important factor by far was the number of binding sites. Synthetic libraries of DNA sequences allow researchers to examine gene regulation more clearly than is possible in real genomes. Yet this approach does have its limitations and it is impossible to capture every type of cis-regulatory sequence in one library.
The next step to extend this work is to combine the two approaches, taking sequences from the real genome and manipulating them one by one. This could help to unravel the rules that govern how cis-regulatory sequences work in real cells.
Affiliation(s)
- Dana M King
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Clarice Kit Yee Hong
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- James L Shepherdson
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- David M Granas
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Brett B Maricque
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Barak A Cohen
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
|
46
|
de Jongh RP, van Dijk AD, Julsing MK, Schaap PJ, de Ridder D. Designing Eukaryotic Gene Expression Regulation Using Machine Learning. Trends Biotechnol 2020; 38:191-201. [DOI: 10.1016/j.tibtech.2019.07.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/08/2019] [Revised: 07/12/2019] [Accepted: 07/19/2019] [Indexed: 12/11/2022]
|
47
|
Frochaux MV, Bou Sleiman M, Gardeux V, Dainese R, Hollis B, Litovchenko M, Braman VS, Andreani T, Osman D, Deplancke B. cis-regulatory variation modulates susceptibility to enteric infection in the Drosophila genetic reference panel. Genome Biol 2020; 21:6. [PMID: 31948474 PMCID: PMC6966807 DOI: 10.1186/s13059-019-1912-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/15/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Resistance to enteric pathogens is a complex trait at the crossroads of multiple biological processes. We have previously shown in the Drosophila Genetic Reference Panel (DGRP) that resistance to infection is highly heritable, but our understanding of how the effects of genetic variants affect different molecular mechanisms to determine gut immunocompetence is still limited. RESULTS To address this, we perform a systems genetics analysis of the gut transcriptomes from 38 DGRP lines that were orally infected with Pseudomonas entomophila. We identify a large number of condition-specific, expression quantitative trait loci (local-eQTLs) with infection-specific ones located in regions enriched for FOX transcription factor motifs. By assessing the allelic imbalance in the transcriptomes of 19 F1 hybrid lines from a large round robin design, we independently attribute a robust cis-regulatory effect to only 10% of these detected local-eQTLs. However, additional analyses indicate that many local-eQTLs may act in trans instead. Comparison of the transcriptomes of DGRP lines that were either susceptible or resistant to Pseudomonas entomophila infection reveals nutcracker as the only differentially expressed gene. Interestingly, we find that nutcracker is linked to infection-specific eQTLs that correlate with its expression level and to enteric infection susceptibility. Further regulatory analysis reveals one particular eQTL that significantly decreases the binding affinity for the repressor Broad, driving differential allele-specific nutcracker expression. CONCLUSIONS Our collective findings point to a large number of infection-specific cis- and trans-acting eQTLs in the DGRP, including one common non-coding variant that lowers enteric infection susceptibility.
Affiliation(s)
- Michael V. Frochaux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Maroun Bou Sleiman
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Current Address: Laboratory of Integrative Systems Physiology, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Vincent Gardeux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Riccardo Dainese
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Brian Hollis
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Current Address: Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, USA
- Maria Litovchenko
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Virginie S. Braman
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Tommaso Andreani
- Computational Biology and Data Mining Group, Institute of Molecular Biology, Johannes Gutenberg-Universität Mainz, Mainz, Germany
- Dani Osman
- Faculty of Sciences III and Azm Center for Research in Biotechnology and its Applications, LBA3B, EDST, Lebanese University, Tripoli, 1300 Lebanon
- Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
|
48
|
Neurobiological functions of transcriptional enhancers. Nat Neurosci 2019; 23:5-14. [PMID: 31740812 DOI: 10.1038/s41593-019-0538-5] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 03/18/2019] [Accepted: 10/16/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional enhancers are regulatory DNA elements that underlie the specificity and dynamic patterns of gene expression. Over the past decade, large-scale functional genomics projects have driven transformative progress in our understanding of enhancers. These data have relevance for identifying mechanisms of gene regulation in the CNS, elucidating the function of non-coding regulatory sequences in neurobiology and linking sequence variation within enhancers to genetic risk for neurological and psychiatric disorders. However, the sheer volume and complexity of genomic data presents a challenge to interpreting enhancer function in normal and pathogenic neurobiological processes. Here, to advance the application of genome-scale enhancer data, we offer a primer on current models of enhancer function in the CNS, we review how enhancers regulate gene expression across the neuronal lifespan, and we suggest how emerging findings regarding the role of non-coding sequence variation offer opportunities for understanding brain disorders and developing new technologies for neuroscience.
|
49
|
Deciphering Gene Regulation Using Massively Parallel Reporter Assays. Trends Biochem Sci 2019; 45:90-91. [PMID: 31727407 DOI: 10.1016/j.tibs.2019.10.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/04/2019] [Accepted: 10/14/2019] [Indexed: 12/21/2022]
|
50
|
Perenthaler E, Yousefi S, Niggl E, Barakat TS. Beyond the Exome: The Non-coding Genome and Enhancers in Neurodevelopmental Disorders and Malformations of Cortical Development. Front Cell Neurosci 2019; 13:352. [PMID: 31417368 PMCID: PMC6685065 DOI: 10.3389/fncel.2019.00352] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 05/31/2019] [Accepted: 07/16/2019] [Indexed: 12/22/2022] Open
Abstract
The development of the human cerebral cortex is a complex and dynamic process, in which neural stem cell proliferation, neuronal migration, and post-migratory neuronal organization need to occur in a well-organized fashion. Alterations at any of these crucial stages can result in malformations of cortical development (MCDs), a group of genetically heterogeneous neurodevelopmental disorders that present with developmental delay, intellectual disability and epilepsy. Recent progress in genetic technologies, such as next generation sequencing, most often focusing on all protein-coding exons (e.g., whole exome sequencing), has allowed the discovery of more than 100 genes associated with various types of MCDs. Although this has considerably increased the diagnostic yield, most MCD cases remain unexplained. As whole exome sequencing investigates only a minor part of the human genome (1-2%), it is likely that patients in whom no disease-causing mutation has been identified could harbor mutations in genomic regions beyond the exome. Even though functional annotation of non-coding regions is still lagging behind that of protein-coding genes, tremendous progress has been made in the field of gene regulation. One group of non-coding regulatory regions are enhancers, which can be distantly located upstream or downstream of genes and which can mediate temporal and tissue-specific transcriptional control via long-distance interactions with promoter regions. Although some examples exist in the literature that link alterations of enhancers to genetic disorders, a widespread appreciation of the putative roles of these sequences in MCDs is still lacking. Here, we summarize the current state of knowledge on cis-regulatory regions and discuss novel technologies such as massively-parallel reporter assay systems, CRISPR-Cas9-based screens and computational approaches that help to further elucidate the emerging role of the non-coding genome in disease.
Moreover, we discuss existing literature on mutations or copy number alterations of regulatory regions involved in brain development. We foresee that the future implementation of the knowledge obtained through ongoing gene regulation studies will benefit patients and will provide an explanation to part of the missing heritability of MCDs and other genetic disorders.
Affiliation(s)
- Tahsin Stefan Barakat
- Department of Clinical Genetics, Erasmus MC – University Medical Center, Rotterdam, Netherlands
|