1. Refael T, Sudman M, Golan G, Pnueli L, Naik S, Preger-Ben Noon E, Henn A, Kaplan A, Melamed P. An i-motif-regulated enhancer, eRNA and adjacent lncRNA affect Lhb expression through distinct mechanisms in a sex-specific context. Cell Mol Life Sci 2024; 81:361. [PMID: 39158745 DOI: 10.1007/s00018-024-05398-7] [Received: 05/23/2024] [Revised: 07/21/2024] [Accepted: 08/05/2024] [Indexed: 08/20/2024]
Abstract
Genome-wide studies have demonstrated regulatory roles for diverse non-coding elements, but their precise and interrelated functions have often remained enigmatic. Addressing the need for mechanistic insight, we studied their roles in expression of Lhb, which encodes the pituitary gonadotropic hormone that controls reproduction. We identified a bi-directional enhancer in gonadotrope-specific open chromatin, whose functional eRNA (eRNA2) supports permissive chromatin at the Lhb locus. The central untranscribed region of the enhancer contains an iMotif (iM), and is bound by Hmgb2, which stabilizes the iM and directs transcription specifically towards the functional eRNA2. A distinct downstream lncRNA, associated with an inducible G-quadruplex (G4) and iM, also facilitates Lhb expression, following its splicing in situ. GnRH, which activates Lhb transcription, increased levels of all three RNAs, with eRNA2 showing the highest response, while estradiol, which inhibits Lhb, repressed levels of eRNA2 and the lncRNA. The levels of these regulatory RNAs and Lhb mRNA correlate highly in female mice, though strikingly not in males, suggesting a female-specific function. Our findings, which shed new light on the workings of non-coding elements and non-canonical DNA structures, reveal novel mechanisms regulating transcription which have implications not only in the central control of reproduction but also for other inducible genes.
Affiliation(s)
- Tal Refael
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Maya Sudman
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Gil Golan
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Lilach Pnueli
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Sujay Naik
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion-Israel Institute of Technology, Haifa, 3109601, Israel
- Ella Preger-Ben Noon
  - Department of Genetics and Developmental Biology, The Rappaport Faculty of Medicine and Research Institute, Technion-Israel Institute of Technology, Haifa, 3109601, Israel
- Arnon Henn
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Ariel Kaplan
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Philippa Melamed
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
2. Lin J, Luo R, Pinello L. EPInformer: a scalable deep learning framework for gene expression prediction by integrating promoter-enhancer sequences with multimodal epigenomic data. bioRxiv 2024:2024.08.01.606099. [PMID: 39131276 PMCID: PMC11312614 DOI: 10.1101/2024.08.01.606099] [Indexed: 08/13/2024]
Abstract
Transcriptional regulation, critical for cellular differentiation and adaptation to environmental changes, involves coordinated interactions among DNA sequences, regulatory proteins, and chromatin architecture. Despite extensive data from consortia like ENCODE, understanding the dynamics of cis-regulatory elements (CREs) in gene expression remains challenging. Deep learning is a powerful tool for learning gene expression and epigenomic signals from DNA sequences, exhibiting superior performance compared to conventional machine learning approaches. However, even the most advanced deep learning-based methods may fall short in capturing the regulatory effects of distal elements such as enhancers, limiting their predictive accuracy. In addition, these methods may require significant resources to train or to adapt to newly generated data. To address these challenges, we present EPInformer, a scalable deep-learning framework for predicting gene expression by integrating promoter-enhancer interactions with their sequences, epigenomic signals, and chromatin contacts. Our model outperforms existing gene expression prediction models in rigorous cross-chromosome validation, accurately recapitulates enhancer-gene interactions validated by CRISPR perturbation experiments, and identifies crucial transcription factor motifs within regulatory sequences. EPInformer is available as open-source software at https://github.com/pinellolab/EPInformer.
Affiliation(s)
- Jiecong Lin
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Department of Pathology, Harvard Medical School, Boston, Massachusetts 02129, USA
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Hong Kong, China
- Luca Pinello
  - Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Department of Pathology, Harvard Medical School, Boston, Massachusetts 02129, USA
3. Trauernicht M, Filipovska T, Rastogi C, van Steensel B. Optimized reporters for multiplexed detection of transcription factor activity. bioRxiv 2024:2024.07.26.605239. [PMID: 39091757 PMCID: PMC11291157 DOI: 10.1101/2024.07.26.605239] [Indexed: 08/04/2024]
Abstract
In any given cell type, dozens of transcription factors (TFs) act in concert to control the activity of the genome by binding to specific DNA sequences in regulatory elements. Despite their considerable importance in determining cell identity and their pivotal role in numerous disorders, we currently lack simple tools to directly measure the activity of many TFs in parallel. Massively parallel reporter assays (MPRAs) allow the detection of TF activities in a multiplexed fashion; however, we lack basic understanding to rationally design sensitive reporters for many TFs. Here, we use an MPRA to systematically optimize transcriptional reporters for 86 TFs and evaluate the specificity of all reporters across a wide array of TF perturbation conditions. We thus identified critical TF reporter design features and obtained highly sensitive and specific reporters for 60 TFs, many of which outperform available reporters. The resulting collection of "prime" TF reporters can be used to uncover TF regulatory networks and to illuminate signaling pathways.
HIGHLIGHTS
- Systematic design and optimization of transcriptional reporters for 86 TFs
- Characterization of TF-specific reporter design optimization rules
- Evaluation of reporter TF-specificity across a wide array of TF perturbations
- Identification of a collection of 60 "prime" TF reporters with optimized performance
4. Dai Y, Itai T, Pei G, Yan F, Chu Y, Jiang X, Weinberg SM, Mukhopadhyay N, Marazita ML, Simon LM, Jia P, Zhao Z. DeepFace: Deep-learning-based framework to contextualize orofacial-cleft-related variants during human embryonic craniofacial development. HGG Adv 2024; 5:100312. [PMID: 38796699 PMCID: PMC11193024 DOI: 10.1016/j.xhgg.2024.100312] [Received: 02/08/2024] [Revised: 05/23/2024] [Accepted: 05/23/2024] [Indexed: 05/28/2024]
Abstract
Orofacial clefts (OFCs) are among the most common human congenital birth defects. Previous multiethnic studies have identified dozens of associated loci for both cleft lip with or without cleft palate (CL/P) and cleft palate alone (CP). Although several nearby genes have been highlighted, the "causal" variants are largely unknown. Here, we developed DeepFace, a convolutional neural network model, to assess the functional impact of variants by SNP activity difference (SAD) scores. The DeepFace model is trained with 204 epigenomic assays from crucial human embryonic craniofacial developmental stages of post-conception week (pcw) 4 to pcw 10. The Pearson correlation coefficient between the predicted and actual values for 12 epigenetic features achieved a median range of 0.50-0.83. Specifically, our model revealed that SNPs significantly associated with OFCs tended to exhibit higher SAD scores across various variant categories compared to less related groups, indicating a context-specific impact of OFC-related SNPs. Notably, we identified six SNPs with a significant linear relationship to SAD scores throughout developmental progression, suggesting that these SNPs could play a temporal regulatory role. Furthermore, our cell-type specificity analysis pinpointed the trophoblast cell as having the highest enrichment of risk signals associated with OFCs. Overall, DeepFace can harness distal regulatory signals from extensive epigenomic assays, offering new perspectives for prioritizing OFC variants using contextualized functional genomic features. We expect DeepFace to be instrumental in assessing and predicting the regulatory roles of variants associated with OFCs, and the model can be extended to study other complex diseases or traits.
Affiliation(s)
- Yulin Dai
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Toshiyuki Itai
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Guangsheng Pei
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Fangfang Yan
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yan Chu
  - Center for Secure Artificial Intelligence for Healthcare, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaoqian Jiang
  - Center for Secure Artificial Intelligence for Healthcare, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Seth M Weinberg
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Nandita Mukhopadhyay
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Mary L Marazita
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA; Clinical and Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Lukas M Simon
  - Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX 77030, USA
- Peilin Jia
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
5. Bittner N, Shi C, Zhao D, Ding J, Southam L, Swift D, Kreitmaier P, Tutino M, Stergiou O, Cheung JTS, Katsoula G, Hankinson J, Wilkinson JM, Orozco G, Zeggini E. Primary osteoarthritis chondrocyte map of chromatin conformation reveals novel candidate effector genes. Ann Rheum Dis 2024; 83:1048-1059. [PMID: 38479789 PMCID: PMC11287644 DOI: 10.1136/ard-2023-224945] [Received: 09/01/2023] [Accepted: 02/29/2024] [Indexed: 07/17/2024]
Abstract
OBJECTIVES Osteoarthritis is a complex disease with a huge public health burden. Genome-wide association studies (GWAS) have identified hundreds of osteoarthritis-associated sequence variants, but the effector genes underpinning these signals remain largely elusive. Understanding chromosome organisation in three-dimensional (3D) space is essential for identifying long-range contacts between distant genomic features (e.g., between genes and regulatory elements), in a tissue-specific manner. Here, we generate the first whole genome chromosome conformation analysis (Hi-C) map of primary osteoarthritis chondrocytes and identify novel candidate effector genes for the disease.
METHODS Primary chondrocytes collected from 8 patients with knee osteoarthritis underwent Hi-C analysis to link chromosomal structure to genomic sequence. The identified loops were then combined with osteoarthritis GWAS results and epigenomic data from primary knee osteoarthritis chondrocytes to identify variants involved in gene regulation via enhancer-promoter interactions.
RESULTS We identified 345 genetic variants residing within chromatin loop anchors that are associated with 77 osteoarthritis GWAS signals. Ten of these variants reside directly in enhancer regions of 10 newly described active enhancer-promoter loops, identified with multiomics analysis of publicly available chromatin immunoprecipitation sequencing (ChIP-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) data from primary knee chondrocyte cells, pointing to two new candidate effector genes SPRY4 and PAPPA (pregnancy-associated plasma protein A) as well as further support for the gene SLC44A2 known to be involved in osteoarthritis. For example, PAPPA is directly associated with the turnover of insulin-like growth factor 1 (IGF-1) proteins, and IGF-1 is an important factor in the repair of damaged chondrocytes.
CONCLUSIONS We have constructed the first Hi-C map of primary human chondrocytes and have made it available as a resource for the scientific community. By integrating 3D genomics with large-scale genetic association and epigenetic data, we identify novel candidate effector genes for osteoarthritis, which enhance our understanding of disease and can serve as putative high-value novel drug targets.
Affiliation(s)
- Norbert Bittner
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Chenfu Shi
  - Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
- Danyun Zhao
  - Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
- James Ding
  - Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
- Lorraine Southam
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Diane Swift
  - Department of Oncology and Metabolism, The University of Sheffield, Sheffield, UK
- Peter Kreitmaier
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
  - Graduate School of Experimental Medicine, Technical University of Munich, München, Germany
  - TUM School of Medicine and Health, Technical University of Munich and Klinikum Rechts der Isar, München, Germany
- Mauro Tutino
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Odysseas Stergiou
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Georgia Katsoula
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
  - Graduate School of Experimental Medicine, Technical University of Munich, München, Germany
  - TUM School of Medicine and Health, Technical University of Munich and Klinikum Rechts der Isar, München, Germany
- Jenny Hankinson
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Gisela Orozco
  - Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK
  - NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK
- Eleftheria Zeggini
  - Institute of Translational Genomics, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
  - TUM School of Medicine and Health, Technical University of Munich and Klinikum Rechts der Isar, München, Germany
6. Farhangi S, Gòdia M, Derks MFL, Harlizius B, Dibbits B, González-Prendes R, Crooijmans RPMA, Madsen O, Groenen MAM. Expression genome-wide association study identifies key regulatory variants enriched with metabolic and immune functions in four porcine tissues. BMC Genomics 2024; 25:684. [PMID: 38992576 PMCID: PMC11238464 DOI: 10.1186/s12864-024-10583-w] [Received: 02/02/2024] [Accepted: 07/01/2024] [Indexed: 07/13/2024]
Abstract
BACKGROUND Integration of high-throughput DNA genotyping and RNA-sequencing data enables the discovery of genomic regions that regulate gene expression, known as expression quantitative trait loci (eQTL). In pigs, efforts to date have mainly focused on purebred lines for traits with commercial relevance, such as growth and meat quality. However, little is known about the genetic variants and mechanisms associated with the robustness of an animal, and thus its overall health status. Here, the liver, lung, spleen, and muscle transcriptomes of 100 three-way crossbred female finishers were studied, with the aim of identifying novel eQTL regulatory regions and transcription factors (TFs) associated with regulation of porcine metabolism and health-related traits.
RESULTS An expression genome-wide association study with 535,896 genotypes and the expression of 12,680 genes in liver, 13,310 genes in lung, 12,650 genes in spleen, and 12,595 genes in muscle resulted in 4,293, 10,630, 4,533, and 6,871 eQTL regions for each of these tissues, respectively. Although only a small fraction of the eQTLs were annotated as cis-eQTLs, these presented a higher number of polymorphisms per region and significantly stronger associations with their target gene compared to trans-eQTLs. Between 20 and 115 eQTL hotspots were identified across the four tissues. Interestingly, these were all enriched for immune-related biological processes. In spleen, two TFs were identified: ERF and ZNF45, with key roles in regulation of gene expression.
CONCLUSIONS This study provides a comprehensive analysis with more than 26,000 eQTL regions identified that are now publicly available. The genomic regions and their variants were mostly associated with tissue-specific regulatory roles. However, some shared regions provide new insights into the complex regulation of genes and their interactions that are involved with important traits related to metabolism and immunity.
Affiliation(s)
- Samin Farhangi
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Marta Gòdia
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Martijn F L Derks
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
  - Topigs Norsvin Research Center, 's-Hertogenbosch, The Netherlands
- Bert Dibbits
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Rayner González-Prendes
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
  - Ausnutria BV, Zwolle, The Netherlands
- Ole Madsen
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
- Martien A M Groenen
  - Animal Breeding and Genomics, Wageningen University and Research, Wageningen, The Netherlands
7. Li J, Fu L, Li Y, Sun W, Yi Y, Jia W, Li H, Liu H, Guo P, Wang Y, Shen Y, Zhang X, Lv Y, Qin B, Li W, Liu C, Liu L, Mazid MA, Lai Y, Esteban MA, Jiang Y, Wu L. A single-cell chromatin accessibility dataset of human primed and naïve pluripotent stem cell-derived teratoma. Sci Data 2024; 11:725. [PMID: 38956385 PMCID: PMC11220047 DOI: 10.1038/s41597-024-03558-9] [Received: 02/28/2024] [Accepted: 06/20/2024] [Indexed: 07/04/2024]
Abstract
Teratoma, due to its remarkable ability to differentiate into multiple cell lineages, is a valuable model for studying human embryonic development. The similarity of the gene expression and chromatin accessibility patterns in these cells to those observed in vivo further underscores its potential as a research tool. Notably, teratomas derived from human naïve (pre-implantation epiblast-like) pluripotent stem cells (PSCs) have larger embryonic cell diversity and contain extraembryonic lineages, making them more suitable to study developmental processes. However, the cell type-specific epigenetic profiles of naïve PSC teratomas have not yet been characterized. Using single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq), we analyzed 66,384 cell profiles from five teratomas derived from human naïve PSCs and their post-implantation epiblast-like (primed) counterparts. We observed 17 distinct cell types from both embryonic and extraembryonic lineages, resembling the corresponding cell types in human fetal tissues. Additionally, we identified key transcription factors specific to different cell types. Our dataset provides a resource for investigating gene regulatory programs in a relevant model of human embryonic development.
Affiliation(s)
- Jinxiu Li
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Lixin Fu
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Yunpan Li
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Wei Sun
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
  - College of Life Sciences, Nankai University, Tianjin, 300071, China
- Yao Yi
  - MRC Metabolic Diseases Unit, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, CB2 0QQ, UK
- Wenqi Jia
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Haiwei Li
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
  - Joint School of Life Sciences, Guangzhou Institutes of Biomedicine and Health and Guangzhou Medical University, Guangzhou, Guangdong, 510530, China
- Hao Liu
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Pengcheng Guo
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Yang Wang
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Hangzhou, 310030, China
- Yue Shen
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Changzhou, 213299, China
- Xiuqing Zhang
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Shenzhen, 518083, China
- Yuan Lv
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Baoming Qin
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Wenjuan Li
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Chuanyu Liu
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Longqi Liu
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
- Md Abdul Mazid
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Yiwei Lai
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
  - 3DCStar lab, BGI, Shenzhen, 518083, China
- Miguel A Esteban
  - BGI Research, Shenzhen, 518083, China
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
  - 3DCStar lab, BGI, Shenzhen, 518083, China
- Yu Jiang
  - BGI Research, Shenzhen, 518083, China
  - BGI Research, Hangzhou, 310030, China
  - State Key Laboratory for Diagnosis and Treatment of Severe Zoonotic Infectious Diseases, Key Laboratory for Zoonosis Research of the Ministry of Education, Institute of Zoonosis, and College of Veterinary Medicine, Jilin University, Changchun, 130062, China
- Liang Wu
  - Laboratory of Integrative Biology, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
8. Li Y, Tan M, Akkari-Henić A, Zhang L, Kip M, Sun S, Sepers JJ, Xu N, Ariyurek Y, Kloet SL, Davis RP, Mikkers H, Gruber JJ, Snyder MP, Li X, Pang B. Genome-wide Cas9-mediated screening of essential non-coding regulatory elements via libraries of paired single-guide RNAs. Nat Biomed Eng 2024; 8:890-908. [PMID: 38778183 PMCID: PMC11310080 DOI: 10.1038/s41551-024-01204-8] [Received: 09/05/2023] [Accepted: 03/27/2024] [Indexed: 05/25/2024]
Abstract
The functions of non-coding regulatory elements (NCREs), which constitute a major fraction of the human genome, have not been systematically studied. Here we report a method involving libraries of paired single-guide RNAs targeting both ends of an NCRE as a screening system for the Cas9-mediated deletion of thousands of NCREs genome-wide to study their functions in distinct biological contexts. By using K562 and 293T cell lines and human embryonic stem cells, we show that NCREs can have redundant functions, and that many ultra-conserved elements have silencer activity and play essential roles in cell growth and in cellular responses to drugs (notably, the ultra-conserved element PAX6_Tarzan may be critical for heart development, as removing it from human embryonic stem cells led to defects in cardiomyocyte differentiation). The high-throughput screen, which is compatible with single-cell sequencing, may allow for the identification of druggable NCREs.
Affiliation(s)
- Yufeng Li
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Minkang Tan
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Almira Akkari-Henić
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Limin Zhang
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Maarten Kip
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Shengnan Sun
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Jorian J Sepers
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Ningning Xu
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Yavuz Ariyurek
  - Leiden Genome Technology Center, Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Susan L Kloet
  - Leiden Genome Technology Center, Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Richard P Davis
  - Department of Anatomy and Embryology, The Novo Nordisk Foundation Center for Stem Cell Medicine (reNEW), Leiden University Medical Center, Leiden, the Netherlands
- Harald Mikkers
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
- Joshua J Gruber
  - Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Xiao Li
  - Department of Biochemistry, The Center for RNA Science and Therapeutics, Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH, USA
- Baoxu Pang
  - Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, the Netherlands
9. Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 DOI: 10.1002/bies.202300210] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
| | - Ioannis Mouratidis
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
| | - Nikol Chantzi
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
| | - Yasin Uzun
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
| | - Ilias Georgakopoulos-Soares
- Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, USA
|
10
|
Kaplan SJ, Wong W, Yan J, Pulecio J, Cho HS, Li Q, Zhao J, Leslie-Iyer J, Kazakov J, Murphy D, Luo R, Dey KK, Apostolou E, Leslie CS, Huangfu D. CRISPR Screening Uncovers a Long-Range Enhancer for ONECUT1 in Pancreatic Differentiation and Links a Diabetes Risk Variant. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.26.591412. [PMID: 38746154 PMCID: PMC11092487 DOI: 10.1101/2024.04.26.591412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Functional enhancer annotation is a valuable first step for understanding tissue-specific transcriptional regulation and prioritizing disease-associated non-coding variants for investigation. However, unbiased enhancer discovery in physiologically relevant contexts remains a major challenge. To discover regulatory elements pertinent to diabetes, we conducted a CRISPR interference screen in the human pluripotent stem cell (hPSC) pancreatic differentiation system. Among the enhancers uncovered, we focused on a long-range enhancer ∼664 kb from the ONECUT1 promoter, since coding mutations in ONECUT1 cause pancreatic hypoplasia and neonatal diabetes. Homozygous enhancer deletion in hPSCs was associated with a near-complete loss of ONECUT1 gene expression and compromised pancreatic differentiation. This enhancer contains a confidently fine-mapped type 2 diabetes associated variant (rs528350911) which disrupts a GATA motif. Introduction of the risk variant into hPSCs revealed substantially reduced binding of key pancreatic transcription factors (GATA4, GATA6 and FOXA2) on the edited allele, accompanied by a slight reduction of ONECUT1 transcription, supporting a causal role for this risk variant in metabolic disease. This work expands our knowledge about transcriptional regulation in pancreatic development through the characterization of a long-range enhancer and highlights the utility of enhancer discovery in disease-relevant settings for understanding monogenic and complex disease.
|
11
|
Yin C, Hair SC, Byeon GW, Bromley P, Meuleman W, Seelig G. Iterative deep learning-design of human enhancers exploits condensed sequence grammar to achieve cell type-specificity. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.14.599076. [PMID: 38915713 PMCID: PMC11195158 DOI: 10.1101/2024.06.14.599076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show enhancers as short as 50bp can maintain specificity.
Affiliation(s)
- Christopher Yin
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
| | | | - Gun Woo Byeon
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
| | - Peter Bromley
- Altius Institute for Biomedical Sciences, Seattle, WA
| | - Wouter Meuleman
- Altius Institute for Biomedical Sciences, Seattle, WA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
| | - Georg Seelig
- Department of Electrical & Computer Engineering, University of Washington, Seattle, WA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA
|
12
|
Lalanne JB, Regalado SG, Domcke S, Calderon D, Martin BK, Li X, Li T, Suiter CC, Lee C, Trapnell C, Shendure J. Multiplex profiling of developmental cis-regulatory elements with quantitative single-cell expression reporters. Nat Methods 2024; 21:983-993. [PMID: 38724692 PMCID: PMC11166576 DOI: 10.1038/s41592-024-02260-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 03/22/2024] [Indexed: 06/13/2024]
Abstract
The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
Affiliation(s)
| | - Samuel G Regalado
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Silvia Domcke
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Diego Calderon
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Beth K Martin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Xiaoyi Li
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Tony Li
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Chase C Suiter
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
| | - Choli Lee
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Cole Trapnell
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Allen Discovery Center for Cell Lineage Tracing, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
|
13
|
Ni P, Wu S, Su Z. Validated Negative Regions (VNRs) in the VISTA Database might be Truncated Forms of Bona Fide Enhancers. ADVANCED GENETICS (HOBOKEN, N.J.) 2024; 5:2300209. [PMID: 38884049 PMCID: PMC11170074 DOI: 10.1002/ggn2.202300209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 03/16/2024] [Indexed: 06/18/2024]
Abstract
The VISTA enhancer database is a valuable resource for evaluating predicted enhancers in humans and mice. In addition to thousands of validated positive regions (VPRs) in the human and mouse genomes, the database also contains similar numbers of validated negative regions (VNRs). It was previously shown that the VPRs are on average half as long as the predicted, highly conserved enhancers they overlap, leading to the hypothesis that the VPRs may be truncated forms of long bona fide enhancers. Here, it is shown that, like the VPRs, the VNRs are also under strong evolutionary constraints and overlap predicted enhancers in the genomes. The VNRs are also on average half as long as predicted overlapping enhancers that are highly conserved. Moreover, the VNRs and the VPRs display similar cell/tissue-specific modification patterns of key epigenetic marks of active enhancers. Furthermore, the VNRs and the VPRs show similar impact score spectra of in silico mutagenesis. These highly similar properties between the VPRs and the VNRs suggest that, like the VPRs, the VNRs may also be truncated forms of long bona fide enhancers.
Affiliation(s)
- Pengyu Ni
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Present address: Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Siwen Wu
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Zhengchang Su
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
14
|
Rottenberg JT, Taslim TH, Soto-Ugaldi LF, Martinez-Cuesta L, Martinez-Calejman C, Fuxman Bass JI. Viral cis-regulatory elements as sensors of cellular states and environmental cues. Trends Genet 2024:S0168-9525(24)00108-2. [PMID: 38821843 DOI: 10.1016/j.tig.2024.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 05/11/2024] [Accepted: 05/13/2024] [Indexed: 06/02/2024]
Abstract
To withstand a hostile cellular environment and replicate, viruses must sense, interpret, and respond to many internal and external cues. Retroviruses and DNA viruses can intercept these cues impinging on host transcription factors via cis-regulatory elements (CREs) in viral genomes, allowing them to sense and coordinate context-specific responses to varied signals. Here, we explore the characteristics of viral CREs, the classes of signals and host transcription factors that regulate them, and how this informs outcomes of viral replication, immune evasion, and latency. We propose that viral CREs constitute central hubs for signal integration from multiple pathways and that sequence variation between viral isolates can rapidly rewire sensing mechanisms, contributing to the variability observed in patient outcomes.
Affiliation(s)
| | - Tommy H Taslim
- Department of Biology, Boston University, Boston, MA, USA; Molecular and Cellular Biology and Biochemistry Program, Boston University, Boston, MA, USA
| | - Luis F Soto-Ugaldi
- Tri-Institutional Program in Computational Biology and Medicine, New York, NY, USA
| | - Lucia Martinez-Cuesta
- Department of Biology, Boston University, Boston, MA, USA; Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
| | | | - Juan I Fuxman Bass
- Department of Biology, Boston University, Boston, MA, USA; Molecular and Cellular Biology and Biochemistry Program, Boston University, Boston, MA, USA.
|
15
|
Chardon FM, McDiarmid TA, Page NF, Daza RM, Martin B, Domcke S, Regalado SG, Lalanne JB, Calderon D, Li X, Starita LM, Sanders SJ, Ahituv N, Shendure J. Multiplex, single-cell CRISPRa screening for cell type specific regulatory elements. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.28.534017. [PMID: 37034704 PMCID: PMC10081248 DOI: 10.1101/2023.03.28.534017] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/09/2023]
Abstract
CRISPR-based gene activation (CRISPRa) is a promising therapeutic approach for gene therapy, upregulating gene expression by targeting promoters or enhancers in a tissue/cell-type specific manner. Here, we describe an experimental framework that combines highly multiplexed perturbations with single-cell RNA sequencing (sc-RNA-seq) to identify cell-type-specific, CRISPRa-responsive cis-regulatory elements and the gene(s) they regulate. Random combinations of many gRNAs are introduced to each of many cells, which are then profiled and partitioned into test and control groups to test for effect(s) of CRISPRa perturbations of both enhancers and promoters on the expression of neighboring genes. Applying this method to a library of 493 gRNAs targeting candidate cis-regulatory elements in both K562 cells and iPSC-derived excitatory neurons, we identify gRNAs capable of specifically upregulating intended target genes and no other neighboring genes within 1 Mb, including gRNAs yielding upregulation of six autism spectrum disorder (ASD) and neurodevelopmental disorder (NDD) risk genes in neurons. A consistent pattern is that the responsiveness of individual enhancers to CRISPRa is restricted by cell type, implying a dependency on either chromatin landscape and/or additional trans-acting factors for successful gene activation. The approach outlined here may facilitate large-scale screens for gRNAs that activate therapeutically relevant genes in a cell type-specific manner.
|
16
|
Castillo H, Hanna P, Sachs LM, Buisine N, Godoy F, Gilbert C, Aguilera F, Muñoz D, Boisvert C, Debiais-Thibaud M, Wan J, Spicuglia S, Marcellini S. Xenopus tropicalis osteoblast-specific open chromatin regions reveal promoters and enhancers involved in human skeletal phenotypes and shed light on early vertebrate evolution. Cells Dev 2024:203924. [PMID: 38692409 DOI: 10.1016/j.cdev.2024.203924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 04/18/2024] [Accepted: 04/26/2024] [Indexed: 05/03/2024]
Abstract
While understanding the genetic underpinnings of osteogenesis has far-reaching implications for skeletal diseases and evolution, a comprehensive characterization of the osteoblastic regulatory landscape in non-mammalian vertebrates is still lacking. Here, we compared the ATAC-Seq profile of Xenopus tropicalis (Xt) osteoblasts to a variety of non-mineralizing control tissues, and identified osteoblast-specific nucleosome free regions (NFRs) at 527 promoters and 6747 distal regions. Sequence analyses, Gene Ontology, RNA-Seq and ChIP-Seq against four key histone marks confirmed that the distal regions correspond to bona fide osteogenic transcriptional enhancers exhibiting a shared regulatory logic with mammals. We report 425 regulatory regions conserved with human and globally associated with skeletogenic genes. Of these, 35 regions have been shown to impact human skeletal phenotypes by GWAS, including one trps1 enhancer and the runx2 promoter, two genes which are respectively involved in trichorhinophalangeal syndrome type I and cleidocranial dysplasia. Intriguingly, 60 osteoblastic NFRs also align to the genome of the elephant shark, a species lacking osteoblasts and bone tissue. To tackle this paradox, we chose to focus on dlx5 because its conserved promoter, known to integrate regulatory inputs during mammalian osteogenesis, harbours an osteoblast-specific NFR in both frog and human. Hence, we show that dlx5 is expressed in Xt and elephant shark odontoblasts, supporting a common cellular and genetic origin of bone and dentine. Taken together, our work (i) unravels the Xt osteogenic regulatory landscape, (ii) illustrates how cross-species comparisons harvest data relevant to human biology and (iii) reveals that a set of genes including bnc2, dlx5, ebf3, mir199a, nfia, runx2 and zfhx4 drove the development of a primitive form of mineralized skeletal tissue deep in the vertebrate lineage.
Affiliation(s)
- Héctor Castillo
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile.
| | - Patricia Hanna
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
| | - Laurent M Sachs
- UMR7221, Physiologie Moléculaire et Adaptation, CNRS, MNHN, Paris Cedex 05, France
| | - Nicolas Buisine
- UMR7221, Physiologie Moléculaire et Adaptation, CNRS, MNHN, Paris Cedex 05, France
| | - Francisco Godoy
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
| | - Clément Gilbert
- Université Paris-Saclay, CNRS, IRD, UMR Évolution, Génomes, Comportement et Écologie, 12 route 128, 91190 Gif-sur-Yvette, France
| | - Felipe Aguilera
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
| | - David Muñoz
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile
| | - Catherine Boisvert
- School of Molecular and Life Sciences, Curtin University, Perth, WA, Australia
| | - Mélanie Debiais-Thibaud
- Institut des Sciences de l'Evolution de Montpellier, ISEM, Univ Montpellier, CNRS, IRD, Montpellier, France
| | - Jing Wan
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France; Equipe Labelisée LIGUE contre le Cancer, Marseille, France
| | - Salvatore Spicuglia
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France; Equipe Labelisée LIGUE contre le Cancer, Marseille, France
| | - Sylvain Marcellini
- Group for the Study of Developmental Processes (GDeP), School of Biological Sciences, University of Concepción, Chile.
|
17
|
Kosicki M, Cintrón DL, Page NF, Georgakopoulos-Soares I, Akiyama JA, Plajzer-Frick I, Novak CS, Kato M, Hunter RD, von Maydell K, Barton S, Godfrey P, Beckman E, Sanders SJ, Pennacchio LA, Ahituv N. Massively parallel reporter assays and mouse transgenic assays provide complementary information about neuronal enhancer activity. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.22.590634. [PMID: 38712228 PMCID: PMC11071441 DOI: 10.1101/2024.04.22.590634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Genetic studies find hundreds of thousands of noncoding variants associated with psychiatric disorders. Massively parallel reporter assays (MPRAs) and in vivo transgenic mouse assays can be used to assay the impact of these variants. However, the relevance of MPRAs to in vivo function is unknown and transgenic assays suffer from low throughput. Here, we studied the utility of combining the two assays to study the impact of non-coding variants. We carried out an MPRA on over 50,000 sequences derived from enhancers validated in transgenic mouse assays and from multiple fetal neuronal ATAC-seq datasets. We also tested over 20,000 variants, including synthetic mutations in highly active neuronal enhancers and 177 common variants associated with psychiatric disorders. Variants with a high impact on MPRA activity were further tested in mice. We found a strong and specific correlation between MPRA and mouse neuronal enhancer activity including changes in neuronal enhancer activity in mouse embryos for variants with strong MPRA effects. Mouse assays also revealed pleiotropic variant effects that could not be observed in MPRA. Our work provides a large catalog of functional neuronal enhancers and variant effects and highlights the effectiveness of combining MPRAs and mouse transgenic assays.
Affiliation(s)
- Michael Kosicki
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Dianne Laboy Cintrón
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
| | - Nicholas F. Page
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Jennifer A. Akiyama
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ingrid Plajzer-Frick
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Catherine S. Novak
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Momoe Kato
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Riana D. Hunter
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Kianna von Maydell
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Sarah Barton
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Patrick Godfrey
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Erik Beckman
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Stephan J. Sanders
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Department of Psychiatry and Behavioral Sciences, Kavli Institute for Fundamental Neuroscience, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
- Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, OX3 7TY, UK
| | - Len A. Pennacchio
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
|
18
|
Golov AK, Gavrilov AA. Cohesin-Dependent Loop Extrusion: Molecular Mechanics and Role in Cell Physiology. BIOCHEMISTRY. BIOKHIMIIA 2024; 89:601-625. [PMID: 38831499 DOI: 10.1134/s0006297924040023] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/21/2023] [Revised: 12/29/2023] [Accepted: 02/15/2024] [Indexed: 06/05/2024]
Abstract
The most prominent representatives of multisubunit SMC complexes, cohesin and condensin, are best known as structural components of mitotic chromosomes. It turned out that these complexes, as well as their bacterial homologues, are molecular motors: their ATP-dependent movement along DNA threads leads to the formation of DNA loops. In recent years, we have witnessed an avalanche-like accumulation of data on the process of SMC-dependent DNA looping, also known as loop extrusion. This review briefly summarizes the current understanding of the place and role of cohesin-dependent extrusion in cell physiology and presents the models that most compellingly describe the potential molecular mechanism of extrusion. We conclude the review with a discussion of how the capacity of cohesin to extrude DNA loops may be mechanistically linked to its involvement in sister chromatid cohesion.
Affiliation(s)
- Arkadiy K Golov
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia.
- Technion - Israel Institute of Technology, Haifa, 3525433, Israel
| | - Alexey A Gavrilov
- Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia.
|
19
|
Yao D, Tycko J, Oh JW, Bounds LR, Gosai SJ, Lataniotis L, Mackay-Smith A, Doughty BR, Gabdank I, Schmidt H, Guerrero-Altamirano T, Siklenka K, Guo K, White AD, Youngworth I, Andreeva K, Ren X, Barrera A, Luo Y, Yardımcı GG, Tewhey R, Kundaje A, Greenleaf WJ, Sabeti PC, Leslie C, Pritykin Y, Moore JE, Beer MA, Gersbach CA, Reddy TE, Shen Y, Engreitz JM, Bassik MC, Reilly SK. Multicenter integrated analysis of noncoding CRISPRi screens. Nat Methods 2024; 21:723-734. [PMID: 38504114 PMCID: PMC11009116 DOI: 10.1038/s41592-024-02216-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 02/18/2024] [Indexed: 03/21/2024]
Abstract
The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.
Affiliation(s)
- David Yao
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Neurobiology, Harvard Medical School, Boston, MA, USA.
| | - Jin Woo Oh
- Departments of Biomedical Engineering and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Lexi R Bounds
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Sager J Gosai
- Broad Institute of Harvard & MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Center for System Biology, Harvard University, Cambridge, MA, USA
- Harvard Graduate Program in Biological and Biomedical Science, Boston, MA, USA
| | - Lazaros Lataniotis
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Ava Mackay-Smith
- University Program in Genetics and Genomics, Duke University School of Medicine, Durham, NC, USA
| | | | - Idan Gabdank
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Henri Schmidt
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Tania Guerrero-Altamirano
- University Program in Genetics and Genomics, Duke University School of Medicine, Durham, NC, USA
- Department of Biology, Duke University, Durham, NC, USA
| | - Keith Siklenka
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Katherine Guo
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Alexander D White
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
| | | | - Kalina Andreeva
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Xingjie Ren
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Alejandro Barrera
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Yunhai Luo
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Pardis C Sabeti
- Broad Institute of Harvard & MIT, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Center for System Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Christina Leslie
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yuri Pritykin
- Department of Computer Science, Princeton University, Princeton, NJ, USA
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, RNA Therapeutics Institute, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Michael A Beer
- Departments of Biomedical Engineering and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Charles A Gersbach
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Timothy E Reddy
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
| | - Yin Shen
- Department of Neurology, Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Jesse M Engreitz
- Department of Genetics, Stanford University, Stanford, CA, USA
- BASE Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Steven K Reilly
- Department of Genetics, Yale University, New Haven, CT, USA.
|
20
|
Gunamalai L, Singh P, Berg B, Shi L, Sanchez E, Smith A, Breton G, Bedford MT, Balciunas D, Kapoor A. Functional characterization of QT interval associated SCN5A enhancer variants identify combined additive effects. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.11.584440. [PMID: 38559211 PMCID: PMC10979898 DOI: 10.1101/2024.03.11.584440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Several empirical and theoretical studies suggest the presence of multiple enhancers per gene that collectively regulate gene expression, and that common sequence variation impacting on the activities of these enhancers is a major source of inter-individual variability in gene expression. However, for the vast majority of genes, enhancers and the underlying regulatory variation remain unknown. Even for the genes with well-characterized enhancers, the nature of the combined effects from multiple enhancers and their variants, when known, on gene expression regulation remains unexplored. Here, we have evaluated the combined effects from five SCN5A enhancers and their regulatory variants that are known to collectively correlate with SCN5A cardiac expression and underlie QT interval association in the general population. Using small deletions centered at the regulatory variants in episomal reporter assays in a mouse cardiomyocyte cell line, we demonstrate that the variants and their flanking sequences play a critical role in individual enhancer activities, likely being a transcription factor (TF) binding site. By performing oligonucleotide-based pulldown assays on predicted TFs, we identify the TFs likely driving allele-specific enhancer activities. Using all 32 possible allelic synthetic constructs in reporter assays, representing the five biallelic enhancers in tandem in their genomic order, we demonstrate combined additive effects on overall enhancer activities. Using transient enhancer assays in developing zebrafish embryos, we demonstrate that four out of the five enhancer elements act as enhancers in vivo. Together, these studies extend the previous findings to uncover the TFs driving the enhancer activities of QT interval associated SCN5A regulatory variants, reveal the additive effects from allelic combinations of these regulatory variants, and prove their potential to act as enhancers in vivo.
|
21
|
Li J, Zhang Y, You Y, Huang Z, Wu L, Liang C, Weng B, Pan L, Huang Y, Huang Y, Yang M, Lu M, Li R, Yan X, Liu Q, Deng S. Unraveling the mechanisms of NK cell dysfunction in aging and Alzheimer's disease: insights from GWAS and single-cell transcriptomics. Front Immunol 2024; 15:1360687. [PMID: 38464521 PMCID: PMC10920339 DOI: 10.3389/fimmu.2024.1360687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 02/06/2024] [Indexed: 03/12/2024] Open
Abstract
Background: Aging is an important factor in the development of Alzheimer's disease (AD). The senescent cells can be recognized and removed by NK cells. However, NK cell function is gradually inactivated with age. Therefore, this study used senescence as an entry point to investigate how NK cells affect AD. Methods: The study validated the correlation between cognition and aging through a prospective cohort of the National Health and Nutrition Examination Survey database. A cellular trajectory analysis of the aging population was performed using single-cell nuclear transcriptome sequencing data from patients with AD and different ages. The genome-wide association study (GWAS) cohort of AD patients was used as the outcome event, and the expression quantitative trait locus was used as an instrumental variable. Causal associations between genes and AD were analyzed by bidirectional Mendelian randomization (MR) and co-localization. Finally, clinical cohorts were constructed to validate the expression of key genes. Results: A correlation between cognition and aging was demonstrated using 2,171 older adults over 60 years of age. Gene regulation analysis revealed that most of the highly active transcription factors were concentrated in the NK cell subpopulation of AD. NK cell trajectories were constructed for different age populations. MR and co-localization analyses revealed that CHD6 may be one of the factors influencing AD. Conclusion: We explored different levels of AD and aging from population cohorts, single-cell data, and GWAS cohorts and found that there may be some correlations of NK cells between aging and AD. It also provides some basis for potential causation.
Affiliation(s)
- Jinwei Li
- Department of Neurosurgery, Liuzhou Workers Hospital, Liuzhou, China
- Department of Neurosurgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
| | - Yang Zhang
- Department of Vascular Surgery, Fuwai Yunnan Cardiovascular Hospital, Affiliated Cardiovascular Hospital of Kunming Medical University, Kunming, Yunnan, China
| | - Yanwei You
- Division of Sports Science and Physical Education, Tsinghua University, Beijing, China
| | - Zhiwei Huang
- Department of Neurosurgery, Liuzhou Workers Hospital, Liuzhou, China
| | - Liya Wu
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
| | - Cong Liang
- Department of Pharmacy, Liuzhou Workers Hospital, Liuzhou, China
| | - Baohui Weng
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
| | - Liya Pan
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
| | - Yan Huang
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
| | - Yushen Huang
- Department of Pharmacy, Liuzhou Workers Hospital, Liuzhou, China
| | - Mengqi Yang
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
| | - Mengting Lu
- Department of Dermatology, Liuzhou Workers Hospital, Liuzhou, China
| | - Rui Li
- Department of Medical Imaging, Liuzhou Workers Hospital, Liuzhou, China
| | - Xianlei Yan
- Department of Neurosurgery, Liuzhou Workers Hospital, Liuzhou, China
| | - Quan Liu
- Department of Neurosurgery, Liuzhou Workers Hospital, Liuzhou, China
| | - Shan Deng
- Department of Neurology, Liuzhou Workers Hospital, Liuzhou, China
|
22
|
Alda-Catalinas C, Ibarra-Soria X, Flouri C, Gordillo JE, Cousminer D, Hutchinson A, Sun B, Pembroke W, Ullrich S, Krejci A, Cortes A, Acevedo A, Malla S, Fishwick C, Drewes G, Rapiteanu R. Mapping the functional impact of non-coding regulatory elements in primary T cells through single-cell CRISPR screens. Genome Biol 2024; 25:42. [PMID: 38308274 PMCID: PMC10835965 DOI: 10.1186/s13059-024-03176-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 01/18/2024] [Indexed: 02/04/2024] Open
Abstract
BACKGROUND: Drug targets with genetic evidence are expected to increase clinical success by at least twofold. Yet, translating disease-associated genetic variants into functional knowledge remains a fundamental challenge of drug discovery. A key issue is that the vast majority of complex disease associations cannot be cleanly mapped to a gene. Immune disease-associated variants are enriched within regulatory elements found in T-cell-specific open chromatin regions. RESULTS: To identify genes and molecular programs modulated by these regulatory elements, we develop a CRISPRi-based single-cell functional screening approach in primary human T cells. Our pipeline enables the interrogation of transcriptomic changes induced by the perturbation of regulatory elements at scale. We first optimize an efficient CRISPRi protocol in primary CD4+ T cells via CROPseq vectors. Subsequently, we perform a screen targeting 45 non-coding regulatory elements and 35 transcription start sites and profile approximately 250,000 T-cell single-cell transcriptomes. We develop a bespoke analytical pipeline for element-to-gene (E2G) mapping and demonstrate that our method can identify both previously annotated and novel E2G links. Lastly, we integrate genetic association data for immune-related traits and demonstrate how our platform can aid in the identification of effector genes for GWAS loci. CONCLUSIONS: We describe "primary T cell crisprQTL" - a scalable, single-cell functional genomics approach for mapping regulatory elements to genes in primary human T cells. We show how this framework can facilitate the interrogation of immune disease GWAS hits and propose that the combination of experimental and QTL-based techniques is likely to address the variant-to-function problem.
Affiliation(s)
- Bin Sun
- Genomic Sciences, GSK, Stevenage, UK
| | - Gerard Drewes
- Genomic Sciences, GSK, Stevenage, UK
- Genomic Sciences, GSK, Collegeville, PA, USA
|
23
|
Venema WJ, Hiddingh S, van Loosdregt J, Bowes J, Balliu B, de Boer JH, Ossewaarde-van Norel J, Thompson SD, Langefeld CD, de Ligt A, van der Veken LT, Krijger PHL, de Laat W, Kuiper JJW. A cis-regulatory element regulates ERAP2 expression through autoimmune disease risk SNPs. CELL GENOMICS 2024; 4:100460. [PMID: 38190099 PMCID: PMC10794781 DOI: 10.1016/j.xgen.2023.100460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 10/04/2023] [Accepted: 11/09/2023] [Indexed: 01/09/2024]
Abstract
Single-nucleotide polymorphisms (SNPs) near the ERAP2 gene are associated with various autoimmune conditions, as well as protection against lethal infections. Due to high linkage disequilibrium, numerous trait-associated SNPs are correlated with ERAP2 expression; however, their functional mechanisms remain unidentified. We show by reciprocal allelic replacement that ERAP2 expression is directly controlled by the splice region variant rs2248374. However, disease-associated variants in the downstream LNPEP gene promoter are independently associated with ERAP2 expression. Allele-specific conformation capture assays revealed long-range chromatin contacts between the gene promoters of LNPEP and ERAP2 and showed that interactions were stronger in patients carrying the alleles that increase susceptibility to autoimmune diseases. Replacing the SNPs in the LNPEP promoter by reference sequences lowered ERAP2 expression. These findings show that multiple SNPs act in concert to regulate ERAP2 expression and that disease-associated variants can convert a gene promoter region into a potent enhancer of a distal gene.
Affiliation(s)
- Wouter J Venema
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Sanne Hiddingh
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Jorg van Loosdregt
- Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - John Bowes
- Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Joke H de Boer
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Susan D Thompson
- Department of Pediatrics, University of Cincinnati College of Medicine, Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Carl D Langefeld
- Department of Biostatistics and Data Science, and Center for Precision Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Aafke de Ligt
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Lars T van der Veken
- Department of Genetics, Division Laboratories, Pharmacy and Biomedical Genetics, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Peter H L Krijger
- Oncode Institute, Hubrecht Institute-KNAW and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands
| | - Wouter de Laat
- Oncode Institute, Hubrecht Institute-KNAW and University Medical Center Utrecht, 3584 CT Utrecht, the Netherlands
| | - Jonas J W Kuiper
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands; Center for Translational Immunology, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands.
|
24
|
Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res 2024; 52:D1143-D1154. [PMID: 38183205 PMCID: PMC10767851 DOI: 10.1093/nar/gkad989] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 01/07/2024] Open
Abstract
Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
| | - Thorben Maass
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
| | - Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
| | - Sebastian Röner
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
| | - Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
|
25
|
Zhu X, Ma S, Wong WH. Genetic effects of sequence-conserved enhancer-like elements on human complex traits. Genome Biol 2024; 25:1. [PMID: 38167462 PMCID: PMC10759394 DOI: 10.1186/s13059-023-03142-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/02/2022] [Accepted: 12/08/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND: The vast majority of findings from human genome-wide association studies (GWAS) map to non-coding sequences, complicating their mechanistic interpretations and clinical translations. Non-coding sequences that are evolutionarily conserved and biochemically active could offer clues to the mechanisms underpinning GWAS discoveries. However, genetic effects of such sequences have not been systematically examined across a wide range of human tissues and traits, hampering progress to fully understand regulatory causes of human complex traits. RESULTS: Here we develop a simple yet effective strategy to identify functional elements exhibiting high levels of human-mouse sequence conservation and enhancer-like biochemical activity, which scales well to 313 epigenomic datasets across 106 human tissues and cell types. Combined with 468 GWAS of European (EUR) and East Asian (EAS) ancestries, these elements show tissue-specific enrichments of heritability and causal variants for many traits, which are significantly stronger than enrichments based on enhancers without sequence conservation. These elements also help prioritize candidate genes that are functionally relevant to body mass index (BMI) and schizophrenia but were not reported in previous GWAS with large sample sizes. CONCLUSIONS: Our findings provide a comprehensive assessment of how sequence-conserved enhancer-like elements affect complex traits in diverse tissues and demonstrate a generalizable strategy of integrating evolutionary and biochemical data to elucidate human disease genetics.
Affiliation(s)
- Xiang Zhu
- Department of Statistics, The Pennsylvania State University, 326 Thomas Building, University Park, 16802, PA, USA.
- Huck Institutes of the Life Sciences, The Pennsylvania State University, 201 Huck Life Sciences Building, University Park, 16802, PA, USA.
- Department of Statistics, Stanford University, 390 Jane Stanford Way, Stanford, 94305, CA, USA.
| | - Shining Ma
- Department of Statistics, Stanford University, 390 Jane Stanford Way, Stanford, 94305, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road MC5464, Stanford, 94305, CA, USA
| | - Wing Hung Wong
- Department of Statistics, Stanford University, 390 Jane Stanford Way, Stanford, 94305, CA, USA.
- Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road MC5464, Stanford, 94305, CA, USA.
|
26
|
Yoshida H. Dissecting the Immune System through Gene Regulation. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2024; 1444:219-235. [PMID: 38467983 DOI: 10.1007/978-981-99-9781-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
The immune system plays a dual role in human health, functioning both as a protector against pathogens and, at times, as a contributor to disease. This feature emphasizes the importance of uncovering the underlying causes of its malfunctions, necessitating an in-depth analysis in both pathological and physiological conditions to better understand the immune system and immune disorders. Recent advances in scientific technology have enabled extensive investigations into gene regulation, a crucial mechanism governing cellular functionality. Studying gene regulatory mechanisms within the immune system is a promising avenue for enhancing our understanding of immune cells and the immune system as a whole. The gene regulatory mechanisms, revealed through various methodologies, and their implications in the field of immunology are discussed in this chapter.
Affiliation(s)
- Hideyuki Yoshida
- YCI Laboratory for Immunological Transcriptomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
|
27
|
Lee AS, Ayers LJ, Kosicki M, Chan WM, Fozo LN, Pratt BM, Collins TE, Zhao B, Rose MF, Sanchis-Juan A, Fu JM, Wong I, Zhao X, Tenney AP, Lee C, Laricchia KM, Barry BJ, Bradford VR, Lek M, MacArthur DG, Lee EA, Talkowski ME, Brand H, Pennacchio LA, Engle EC. A cell type-aware framework for nominating non-coding variants in Mendelian regulatory disorders. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.12.22.23300468. [PMID: 38234731 PMCID: PMC10793524 DOI: 10.1101/2023.12.22.23300468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Affiliation(s)
- Arthur S Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Lauren J Ayers
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Michael Kosicki
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Wai-Man Chan
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Lydia N Fozo
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Brandon M Pratt
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Thomas E Collins
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Boxun Zhao
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
| | - Matthew F Rose
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Department of Pathology, Boston Children's Hospital, Boston, MA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
| | - Alba Sanchis-Juan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Jack M Fu
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Isaac Wong
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Xuefang Zhao
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Alan P Tenney
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Cassia Lee
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Harvard College, Cambridge, MA
| | - Kristen M Laricchia
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Brenda J Barry
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
| | - Victoria R Bradford
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
| | - Monkol Lek
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Daniel G MacArthur
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Eunjung Alice Lee
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Department of Genetics, Harvard Medical School, Boston, MA
| | - Michael E Talkowski
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Harrison Brand
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
- Pediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA
| | - Elizabeth C Engle
- Department of Neurology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Kirby Neurobiology Center, Boston Children's Hospital, Boston, MA
- Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
- Howard Hughes Medical Institute, Chevy Chase, MD
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Medical Genetics Training Program, Harvard Medical School, Boston, MA
- Department of Ophthalmology, Boston Children's Hospital and Harvard Medical School, Boston, MA
|
28
|
Sui JY, Eichenfield DZ, Sun BK. The role of enhancers in psoriasis and atopic dermatitis. Br J Dermatol 2023; 190:10-19. [PMID: 37658835 DOI: 10.1093/bjd/ljad321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 08/25/2023] [Accepted: 08/27/2023] [Indexed: 09/05/2023]
Abstract
Regulatory elements, particularly enhancers, play a crucial role in disease susceptibility and progression. Enhancers are DNA sequences that activate gene expression and can be affected by epigenetic modifications, interactions with transcription factors (TFs) or changes to the enhancer DNA sequence itself. Altered enhancer activity impacts gene expression and contributes to disease. In this review, we define enhancers and the experimental techniques used to identify and characterize them. We also discuss recent studies that examine how enhancers contribute to atopic dermatitis (AD) and psoriasis. Articles in the PubMed database were identified (from 1 January 2010 to 28 February 2023) that were relevant to enhancer variants, enhancer-associated TFs and enhancer histone modifications in psoriasis or AD. Most enhancers associated with these conditions regulate genes affecting epidermal homeostasis or immune function. These discoveries present potential therapeutic targets to complement existing treatment options for AD and psoriasis.
Affiliation(s)
- Jennifer Y Sui
- Department of Dermatology, University of California San Diego School of Medicine, CA, USA
- Division of Pediatric and Adolescent Dermatology, Rady Children's Hospital of San Diego, CA, USA
- Dawn Z Eichenfield
- Department of Dermatology, University of California San Diego School of Medicine, CA, USA
- Division of Pediatric and Adolescent Dermatology, Rady Children's Hospital of San Diego, CA, USA
- Bryan K Sun
- Department of Dermatology, University of California San Diego School of Medicine, CA, USA
29
Gschwind AR, Mualim KS, Karbalayghareh A, Sheth MU, Dey KK, Jagoda E, Nurtdinov RN, Xi W, Tan AS, Jones H, Ma XR, Yao D, Nasser J, Avsec Ž, James BT, Shamim MS, Durand NC, Rao SSP, Mahajan R, Doughty BR, Andreeva K, Ulirsch JC, Fan K, Perez EM, Nguyen TC, Kelley DR, Finucane HK, Moore JE, Weng Z, Kellis M, Bassik MC, Price AL, Beer MA, Guigó R, Stamatoyannopoulos JA, Lieberman Aiden E, Greenleaf WJ, Leslie CS, Steinmetz LM, Kundaje A, Engreitz JM. An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv 2023:2023.11.09.563812. [PMID: 38014075 PMCID: PMC10680627 DOI: 10.1101/2023.11.09.563812] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/29/2023]
Abstract
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
Affiliation(s)
- Andreas R. Gschwind
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Kristy S. Mualim
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Plant Biology, Carnegie Institute of Science, Stanford, CA, USA
- Alireza Karbalayghareh
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Maya U. Sheth
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Department of Bioengineering, Stanford University, Stanford, CA, USA
- Kushal K. Dey
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Evelyn Jagoda
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ramil N. Nurtdinov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Wang Xi
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Anthony S. Tan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Hank Jones
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- X. Rosa Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- David Yao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Benjamin T. James
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Muhammad S. Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, Texas, USA
- Neva C. Durand
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
- Suhas S. P. Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
- Ragini Mahajan
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Biosciences, Rice University, Houston, TX, USA
- Benjamin R. Doughty
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Kalina Andreeva
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Jacob C. Ulirsch
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present Address: Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA, USA
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Present Address: Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
- Tri C. Nguyen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- Hilary K. Finucane
- Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jill E. Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Michael C. Bassik
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Alkes L. Price
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Michael A. Beer
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- John A. Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Clinical Research Division, Fred Hutch Cancer Center, Seattle, WA, USA
- Erez Lieberman Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
- William J. Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Lars M. Steinmetz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
- Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA
30
Della Chiara G, Jiménez C, Virdi M, Crosetto N, Bienko M. Enhancers dysfunction in the 3D genome of cancer cells. Front Cell Dev Biol 2023; 11:1303862. [PMID: 38020908 PMCID: PMC10657884 DOI: 10.3389/fcell.2023.1303862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023] [Open Access]
Abstract
Eukaryotic genomes are spatially organized inside the cell nucleus, forming a three-dimensional (3D) architecture that allows for spatial separation of nuclear processes and for controlled expression of genes required for cell identity specification and tissue homeostasis. Hence, it is no surprise that mis-regulation of genome architecture, through rearrangements of the linear genome sequence or epigenetic perturbations, is often linked to aberrant gene expression programs in tumor cells. Increasing research efforts have shed light on the causes and consequences of alterations of 3D genome organization. In this review, we summarize the current knowledge on how 3D genome architecture is dysregulated in cancer, with a focus on enhancer hijacking events and their contribution to tumorigenesis. Studying the functional effects of genome architecture perturbations on gene expression in cancer offers a unique opportunity for a deeper understanding of tumor biology and sets the basis for the discovery of novel therapeutic targets.
Affiliation(s)
- Nicola Crosetto
- Human Technopole, Milan, Italy
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Science for Life Laboratory, Solna, Sweden
- Magda Bienko
- Human Technopole, Milan, Italy
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Science for Life Laboratory, Solna, Sweden
31
Badia-I-Mompel P, Wessels L, Müller-Dott S, Trimbour R, Ramirez Flores RO, Argelaguet R, Saez-Rodriguez J. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023; 24:739-754. [PMID: 37365273 DOI: 10.1038/s41576-023-00618-5] [Citation(s) in RCA: 48] [Impact Index Per Article: 48.0] [Accepted: 05/12/2023] [Indexed: 06/28/2023]
Abstract
The interplay between chromatin, transcription factors and genes generates complex regulatory circuits that can be represented as gene regulatory networks (GRNs). The study of GRNs is useful to understand how cellular identity is established, maintained and disrupted in disease. GRNs can be inferred from experimental data - historically, bulk omics data - and/or from the literature. The advent of single-cell multi-omics technologies has led to the development of novel computational methods that leverage genomic, transcriptomic and chromatin accessibility information to infer GRNs at an unprecedented resolution. Here, we review the key principles of inferring GRNs that encompass transcription factor-gene interactions from transcriptomics and chromatin accessibility data. We focus on the comparison and classification of methods that use single-cell multimodal data. We highlight challenges in GRN inference, in particular with respect to benchmarking, and potential further developments using additional data modalities.
Affiliation(s)
- Pau Badia-I-Mompel
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Lorna Wessels
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Department of Vascular Biology and Tumor Angiogenesis, European Center for Angioscience, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Sophia Müller-Dott
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Rémi Trimbour
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Institut Pasteur, Université Paris Cité, CNRS UMR 3738, Machine Learning for Integrative Genomics Group, Paris, France
- Ricardo O Ramirez Flores
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Julio Saez-Rodriguez
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
32
Mulet-Lazaro R, Delwel R. From Genotype to Phenotype: How Enhancers Control Gene Expression and Cell Identity in Hematopoiesis. Hemasphere 2023; 7:e969. [PMID: 37953829 PMCID: PMC10635615 DOI: 10.1097/hs9.0000000000000969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 09/11/2023] [Indexed: 11/14/2023] [Open Access]
Abstract
Blood comprises a wide array of specialized cells, all of which share the same genetic information and ultimately derive from the same precursor, the hematopoietic stem cell (HSC). This diversity of phenotypes is underpinned by unique transcriptional programs gradually acquired in the process known as hematopoiesis. Spatiotemporal regulation of gene expression depends on many factors, but critical among them are enhancers: sequences of DNA that bind transcription factors and increase transcription of genes under their control. Thus, hematopoiesis involves the activation of specific enhancer repertoires in HSCs and their progeny, driving the expression of sets of genes that collectively determine morphology and function. Disruption of this tightly regulated process can have catastrophic consequences: in hematopoietic malignancies, dysregulation of transcriptional control by enhancers leads to misexpression of oncogenes that ultimately drive transformation. This review attempts to provide a basic understanding of enhancers and their role in transcriptional regulation, with a focus on normal and malignant hematopoiesis. We present examples of enhancers controlling master regulators of hematopoiesis and discuss the main mechanisms leading to enhancer dysregulation in leukemia and lymphoma.
Affiliation(s)
- Roger Mulet-Lazaro
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
- Ruud Delwel
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
33
Arnold M, Stengel KR. Emerging insights into enhancer biology and function. Transcription 2023; 14:68-87. [PMID: 37312570 PMCID: PMC10353330 DOI: 10.1080/21541264.2023.2222032] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Revised: 05/30/2023] [Accepted: 06/01/2023] [Indexed: 06/15/2023] [Open Access]
Abstract
Cell type-specific gene expression is coordinated by DNA-encoded enhancers and the transcription factors (TFs) that bind to them in a sequence-specific manner. As such, these enhancers and TFs are critical mediators of normal development and altered enhancer or TF function is associated with the development of diseases such as cancer. While initially defined by their ability to activate gene transcription in reporter assays, putative enhancer elements are now frequently defined by their unique chromatin features including DNase hypersensitivity and transposase accessibility, bidirectional enhancer RNA (eRNA) transcription, CpG hypomethylation, high H3K27ac and H3K4me1, sequence-specific transcription factor binding, and co-factor recruitment. Identification of these chromatin features through sequencing-based assays has revolutionized our ability to identify enhancer elements on a genome-wide scale, and genome-wide functional assays are now capitalizing on this information to greatly expand our understanding of how enhancers function to provide spatiotemporal coordination of gene expression programs. Here, we highlight recent technological advances that are providing new insights into the molecular mechanisms by which these critical cis-regulatory elements function in gene control. We pay particular attention to advances in our understanding of enhancer transcription, enhancer-promoter syntax, 3D organization and biomolecular condensates, transcription factor and co-factor dependencies, and the development of genome-wide functional enhancer screens.
Affiliation(s)
- Mirjam Arnold
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Kristy R. Stengel
- Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY, USA
- Montefiore Einstein Cancer Center, Albert Einstein College of Medicine-Montefiore Health System, Bronx, NY, USA
- Ruth L. and David S. Gottesman Institute for Stem Cell and Regenerative Medicine Research, Albert Einstein College of Medicine, Bronx, NY, USA
34
Umarov R, Hon CC. Enhancer target prediction: state-of-the-art approaches and future prospects. Biochem Soc Trans 2023; 51:1975-1988. [PMID: 37830459 DOI: 10.1042/bst20230917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 10/02/2023] [Accepted: 10/02/2023] [Indexed: 10/14/2023]
Abstract
Enhancers are genomic regions that regulate gene transcription and are located far away from the transcription start sites of their target genes. Enhancers are highly enriched in disease-associated variants and thus deciphering the interactions between enhancers and genes is crucial to understanding the molecular basis of genetic predispositions to diseases. Experimental validations of enhancer targets can be laborious. Computational methods have thus emerged as a valuable alternative for studying enhancer-gene interactions. A variety of computational methods have been developed to predict enhancer targets by incorporating genomic features (e.g. conservation, distance, and sequence), epigenomic features (e.g. histone marks and chromatin contacts) and activity measurements (e.g. covariations of enhancer activity and gene expression). With the recent advances in genome perturbation and chromatin conformation capture technologies, data on experimentally validated enhancer targets are becoming available for supervised training of these methods and evaluation of their performance. In this review, we categorize enhancer target prediction methods based on their rationales and approaches. Then we discuss their merits and limitations and highlight future directions for enhancer target prediction.
Affiliation(s)
- Ramzan Umarov
- RIKEN Centre for Integrative Medical Sciences, Yokohama RIKEN Institute, Yokohama, Japan
- Chung-Chau Hon
- RIKEN Centre for Integrative Medical Sciences, Yokohama RIKEN Institute, Yokohama, Japan
35
Yang Y, Li X, Meng Z, Liu Y, Qian K, Chu M, Pan Z. A body map of super-enhancers and their function in pig. Front Vet Sci 2023; 10:1239965. [PMID: 37869495 PMCID: PMC10587440 DOI: 10.3389/fvets.2023.1239965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 09/26/2023] [Indexed: 10/24/2023] [Open Access]
Abstract
Introduction: Super-enhancers (SEs) are clusters of enhancers that act synergistically to drive the high-level expression of genes involved in cell identity and function. Although SEs have been extensively investigated in humans and mice, they have not been well characterized in pigs. Methods: Here, we identified 42,380 SEs in 14 pig tissues using chromatin immunoprecipitation sequencing, summarized their overall statistics, studied the composition and characteristics of SEs, and explored the influence of SE characteristics on gene expression. Results: We observed that approximately 40% of normal enhancers (NEs) form SEs. Compared to NEs, we found that SEs were more likely to be enriched with an activated enhancer and show activated functions. Interestingly, SEs showed X chromosome depletion and short interspersed nuclear element enrichment, implying that SEs play an important role in sex traits and repeat evolution. Additionally, SE-associated genes exhibited higher expression levels and stronger conservation than NE-associated genes. Genes with the largest SEs also had higher expression levels than those with the smallest SEs, indicating that SE size may influence gene expression. Moreover, we observed a negative correlation between SE gene distance and gene expression, indicating that the proximity of SEs can affect gene activity. Gene ontology enrichment and motif analysis revealed that SEs have strong tissue-specific activity. For example, the CORO2B gene with a brain-specific SE shows strong brain-specific expression, and the phenylalanine hydroxylase gene with liver-specific SEs shows strong liver-specific expression. Discussion: In this study, we illustrated a body map of SEs and explored their functions in pigs, providing information on the composition and tissue-specific patterns of SEs. This study can serve as a valuable resource for gene regulatory and comparative analyses for the scientific community and provides a theoretical reference for genetic control mechanisms of important traits in pigs.
Affiliation(s)
- Youbing Yang
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Xinyue Li
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Zhu Meng
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Yongjian Liu
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Kaifeng Qian
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Mingxing Chu
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Zhangyuan Pan
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
36
Zhu I, Landsman D. Clustered and diverse transcription factor binding underlies cell type specificity of enhancers for housekeeping genes. Genome Res 2023; 33:1662-1672. [PMID: 37884340 PMCID: PMC10691539 DOI: 10.1101/gr.278130.123] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/26/2023] [Accepted: 09/12/2023] [Indexed: 10/28/2023]
Abstract
Housekeeping genes are considered to be regulated by common enhancers across different tissues. Here we report that most of the commonly expressed mouse or human genes across different cell types, including more than half of the previously identified housekeeping genes, are associated with cell type-specific enhancers. Furthermore, the binding of most transcription factors (TFs) is cell type-specific. We reason that these cell type specificities are causally related to the collective TF recruitment at regulatory sites, as TFs tend to bind to regions associated with many other TFs and each cell type has a unique repertoire of expressed TFs. Based on binding profiles of hundreds of TFs from HepG2, K562, and GM12878 cells, we show that 80% of all TF peaks overlapping H3K27ac signals are in the top 20,000-23,000 most TF-enriched H3K27ac peak regions, and approximately 12,000-15,000 of these peaks are enhancers (nonpromoters). Those enhancers are mainly cell type-specific and include those linked to the majority of commonly expressed genes. Moreover, we show that the top 15,000 most TF-enriched regulatory sites in HepG2 cells, associated with about 200 TFs, can be predicted largely from the binding profile of as few as 30 TFs. Through motif analysis, we show that major enhancers harbor diverse and clustered motifs from a combination of available TFs uniquely present in each cell type. We propose a mechanism that explains how the highly focused TF binding at regulatory sites results in cell type specificity of enhancers for housekeeping and commonly expressed genes.
Affiliation(s)
- Iris Zhu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
37
Malfait J, Wan J, Spicuglia S. Epromoters are new players in the regulatory landscape with potential pleiotropic roles. Bioessays 2023; 45:e2300012. [PMID: 37246247 DOI: 10.1002/bies.202300012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 05/11/2023] [Accepted: 05/15/2023] [Indexed: 05/30/2023]
Abstract
Precise spatiotemporal control of gene expression during normal development and cell differentiation is achieved by the combined action of proximal (promoters) and distal (enhancers) cis-regulatory elements. Recent studies have reported that a subset of promoters, termed Epromoters, also work as enhancers to regulate distal genes. This new paradigm opens novel questions regarding the complexity of our genome and raises the possibility that genetic variation within Epromoters has pleiotropic effects on various physiological and pathological traits by differentially impacting multiple proximal and distal genes. Here, we discuss the different observations pointing to an important role of Epromoters in the regulatory landscape and summarize the evidence supporting a pleiotropic impact of these elements in disease. We further hypothesize that Epromoters might represent a major contributor to phenotypic variation and disease.
Affiliation(s)
- Juliette Malfait
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
- Jing Wan
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
- Salvatore Spicuglia
- Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
- Equipe Labélisée Ligue Contre le Cancer, LIGUE, Marseille, France
38
Gonçalves TM, Stewart CL, Baxley SD, Xu J, Li D, Gabel HW, Wang T, Avraham O, Zhao G. Towards a comprehensive regulatory map of Mammalian Genomes. Research Square 2023:rs.3.rs-3294408. [PMID: 37841836 PMCID: PMC10571623 DOI: 10.21203/rs.3.rs-3294408/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Genome mapping studies have generated a nearly complete collection of genes for the human genome, but we still lack an equivalently vetted inventory of human regulatory sequences. Cis-regulatory modules (CRMs) play important roles in controlling when, where, and how much a gene is expressed. We developed a training data-free CRM-prediction algorithm, the Mammalian Regulatory MOdule Detector (MrMOD), for accurate CRM prediction in mammalian genomes. MrMOD provides genome position-fixed CRM models, similar to the fixed gene models, for the mouse and human genomes using only genomic sequences as the inputs, with one adjustable parameter: the significance p-value. Importantly, MrMOD predicts a comprehensive set of high-resolution CRMs in the mouse and human genomes, including all types of regulatory modules, not limited to any tissue, cell type, developmental stage, or condition. We computationally validated MrMOD predictions using a compendium of 21 orthogonal experimental data sets, including thousands of experimentally defined CRMs and millions of putative regulatory elements derived from hundreds of different tissues, cell types, and stimulus conditions obtained from multiple databases. An in ovo transgenic reporter assay demonstrates the power of our predictions in guiding experimental design. We analyzed CRMs located on chromosome 17 using unsupervised machine learning and identified groups of CRMs with multiple lines of evidence supporting their functionality, linking CRMs with upstream binding transcription factors and downstream target genes. Our work provides a comprehensive base pair resolution annotation of the functional regulatory elements and non-functional regions in mammalian genomes.
Affiliation(s)
- Jason Xu
- Missouri University of Science & Technology
- Daofeng Li
- Washington University School of Medicine
- Ting Wang
- Washington University School of Medicine
39
Ying P, Chen C, Lu Z, Chen S, Zhang M, Cai Y, Zhang F, Huang J, Fan L, Ning C, Li Y, Wang W, Geng H, Liu Y, Tian W, Yang Z, Liu J, Huang C, Yang X, Xu B, Li H, Zhu X, Li N, Li B, Wei Y, Zhu Y, Tian J, Miao X. Genome-wide enhancer-gene regulatory maps link causal variants to target genes underlying human cancer risk. Nat Commun 2023; 14:5958. [PMID: 37749132 PMCID: PMC10520073 DOI: 10.1038/s41467-023-41690-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 09/14/2023] [Indexed: 09/27/2023] Open
Abstract
Genome-wide association studies have identified numerous variants associated with human complex traits, most of which reside in non-coding regions, but the underlying biological mechanisms remain unclear, and assigning function to non-coding elements is still challenging. Here we apply the Activity-by-Contact (ABC) model to evaluate enhancer-gene regulatory effects by integrating multi-omics data and identify 544,849 connections across 20 cancer types. The ABC model outperforms previous approaches in linking regulatory variants to target genes. Furthermore, we identify over 30,000 enhancer-gene connections in colorectal cancer (CRC) tissues. By integrating large-scale population cohorts (23,813 cases and 29,973 controls) and multipronged functional assays, we demonstrate that an ABC regulatory variant, rs4810856, is associated with CRC risk (Odds Ratio = 1.11, 95% CI = 1.05-1.16, P = 4.02 × 10⁻⁵), acting as an allele-specific enhancer to distally facilitate PREX1, CSE1L and STAU1 expression, which synergistically activate p-AKT signaling. Our study provides comprehensive regulation maps and illuminates a single variant regulating multiple genes, providing insights into cancer etiology.
Grants
- Distinguished Young Scholars of China (NSFC-81925032), Key Program of National Natural Science Foundation of China (NSFC-82130098), the Fundamental Research Funds for the Central Universities (2042022rc0026, 2042023kf1005), Knowledge Innovation Program of Wuhan (2023020201010060).
- Youth Program of National Natural Science Foundation of China (NSFC-82003547), Program of Health Commission of Hubei Province (WJ2023M045) and Fundamental Research Funds for the Central Universities (WHU: 2042022kf1031).
- The National Science Fund for Excellent Young Scholars (NSFC-82322058), Program of National Natural Science Foundation of China (NSFC-82103929, NSFC-82273713), Young Elite Scientists Sponsorship Program by cst(2022QNRC001), National Science Fund for Distinguished Young Scholars of Hubei Province of China (2023AFA046), Fundamental Research Funds for the Central Universities (WHU:2042022kf1205) and Knowledge Innovation Program of Wuhan (whkxjsj011, 2023020201010073).
Affiliation(s)
- Pingting Ying
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Department of Radiation Oncology, Renmin Hospital of Wuhan University, Wuhan, 430071, China
- Can Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Department of Radiation Oncology, Renmin Hospital of Wuhan University, Wuhan, 430071, China
- Zequn Lu
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Department of Radiation Oncology, Renmin Hospital of Wuhan University, Wuhan, 430071, China
- Shuoni Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Ming Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Yimin Cai
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Fuwei Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Jinyu Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Linyun Fan
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Caibo Ning
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Yanmin Li
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Wenzhuo Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Hui Geng
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Yizhuo Liu
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Wen Tian
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Zhiyong Yang
- Department of Hepatobiliary and Pancreatic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Jiuyang Liu
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Chaoqun Huang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Xiaojun Yang
- Department of Gastrointestinal Surgery, Zhongnan Hospital of Wuhan University, Wuhan University, Wuhan, 430071, China
- Bin Xu
- Cancer Center, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, 430060, China
- Heng Li
- Department of Urology, Tongji Hospital of Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Xu Zhu
- Department of Gastrointestinal Surgery, Renmin Hospital of Wuhan University, Wuhan, 430071, China
- Ni Li
- Office of Cancer Screening, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Bin Li
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Yongchang Wei
- Department of Gastrointestinal Oncology, Hubei Cancer Clinical Study Center, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Ying Zhu
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China
- Jianbo Tian
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China.
- Department of Radiation Oncology, Renmin Hospital of Wuhan University, Wuhan, 430071, China.
- Xiaoping Miao
- Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, 430071, China.
- Department of Gastrointestinal Oncology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China.
- Department of Radiation Oncology, Renmin Hospital of Wuhan University, Wuhan, 430071, China.
- Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Sciences and Technology, Wuhan, 430030, China.
40
Cerda-Smith CG, Hutchinson HM, Liu A, Goel VY, Sept C, Kim H, Casaní-Galdón S, Burkman KG, Bassil CF, Hansen AS, Aryee MJ, Johnstone SE, Eyler CE, Wood KC. Integrative PTEN Enhancer Discovery Reveals a New Model of Enhancer Organization. bioRxiv 2023:2023.09.20.558459. [PMID: 37786671 PMCID: PMC10541578 DOI: 10.1101/2023.09.20.558459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
Enhancers possess both structural elements mediating promoter looping and functional elements mediating gene expression. Traditional models of enhancer-mediated gene regulation imply genomic overlap or immediate adjacency of these elements. We test this model by combining densely-tiled CRISPRa screening with nucleosome-resolution Region Capture Micro-C topology analysis. Using this integrated approach, we comprehensively define the cis-regulatory landscape for the tumor suppressor PTEN, identifying and validating 10 distinct enhancers and defining their 3D spatial organization. Unexpectedly, we identify several long-range functional enhancers whose promoter proximity is facilitated by chromatin loop anchors several kilobases away, and demonstrate that accounting for this spatial separation improves the computational prediction of validated enhancers. Thus, we propose a new model of enhancer organization incorporating spatial separation of essential functional and structural components.
Affiliation(s)
- Christian G. Cerda-Smith
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine; Durham, NC 27710, USA
- Haley M. Hutchinson
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine; Durham, NC 27710, USA
- Annie Liu
- Department of Surgery, Duke University School of Medicine; Durham, NC 27710, USA
- Viraat Y. Goel
- Department of Biological Engineering, Massachusetts Institute of Technology; Cambridge, 02139, USA
- Broad Institute; Cambridge, MA 02139, USA
- Koch Institute for Integrative Cancer Research; Cambridge, MA, 02139, USA
- Corriene Sept
- Broad Institute; Cambridge, MA 02139, USA
- Department of Biostatistics, Harvard School of Public Health; Boston, MA 02215, USA
- Holly Kim
- Department of Radiation Oncology, Duke University School of Medicine; Durham, NC 27710, USA
- Salvador Casaní-Galdón
- Broad Institute; Cambridge, MA 02139, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute; Boston, MA 02215, USA
- Departments of Cell Biology and Pathology, Harvard Medical School; Boston, MA 02114, USA
- Katherine G. Burkman
- Department of Radiation Oncology, Duke University School of Medicine; Durham, NC 27710, USA
- Christopher F. Bassil
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine; Durham, NC 27710, USA
- Anders S. Hansen
- Department of Biological Engineering, Massachusetts Institute of Technology; Cambridge, 02139, USA
- Broad Institute; Cambridge, MA 02139, USA
- Koch Institute for Integrative Cancer Research; Cambridge, MA, 02139, USA
- Martin J. Aryee
- Broad Institute; Cambridge, MA 02139, USA
- Department of Pathology, Harvard Medical School; Boston, MA 02114, USA
- Department of Data Science, Dana-Farber Cancer Institute; Boston, MA 02215, USA
- Sarah E. Johnstone
- Broad Institute; Cambridge, MA 02139, USA
- Department of Pathology, Dana-Farber Cancer Institute; Boston, MA 02215, USA
- Christine E. Eyler
- Department of Radiation Oncology, Duke University School of Medicine; Durham, NC 27710, USA
- Duke Cancer Institute, Duke University School of Medicine; Durham, NC 27710, USA
- Kris C. Wood
- Department of Pharmacology and Cancer Biology, Duke University School of Medicine; Durham, NC 27710, USA
- Duke Cancer Institute, Duke University School of Medicine; Durham, NC 27710, USA
41
Ni P, Wu S, Su Z. Underlying causes for prevalent false positives and false negatives in STARR-seq data. NAR Genom Bioinform 2023; 5:lqad085. [PMID: 37745976 PMCID: PMC10516709 DOI: 10.1093/nargab/lqad085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 08/23/2023] [Accepted: 09/12/2023] [Indexed: 09/26/2023] Open
Abstract
Self-transcribing active regulatory region sequencing (STARR-seq) and its variants have been widely used to characterize enhancers. However, it has been reported that up to 87% of STARR-seq peaks are located in repressive chromatin and are not functional in the tested cells. While some of the STARR-seq peaks in repressive chromatin might be active in other cell/tissue types, some others might be false positives. Meanwhile, many active enhancers may not be identified by the current STARR-seq methods. Although methods have been proposed to mitigate systematic errors caused by the use of plasmid vectors, the artifacts due to the intrinsic limitations of current STARR-seq methods are still prevalent and the underlying causes are not fully understood. Based on predicted cis-regulatory modules (CRMs) and non-CRMs in the human genome as well as predicted active CRMs and non-active CRMs in a few human cell lines/tissues with STARR-seq data available, we reveal prevalent false positives and false negatives in STARR-seq peaks generated by major variants of STARR-seq methods and possible underlying causes. Our results will help design strategies to improve STARR-seq methods and interpret the results.
Affiliation(s)
- Pengyu Ni
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Siwen Wu
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Zhengchang Su
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA
42
Gosai SJ, Castro RI, Fuentes N, Butts JC, Kales S, Noche RR, Mouri K, Sabeti PC, Reilly SK, Tewhey R. Machine-guided design of synthetic cell type-specific cis-regulatory elements. bioRxiv 2023:2023.08.08.552077. [PMID: 37609287 PMCID: PMC10441439 DOI: 10.1101/2023.08.08.552077] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 08/24/2023]
Abstract
Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing, and stimulus responses, which collectively define the thousands of unique cell types in the body. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for an intended purpose has arisen naturally through evolution. Here, we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell type specificity. We leverage innovations in deep neural network modeling of CRE activity across three cell types, efficient in silico optimization, and massively parallel reporter assays (MPRAs) to design and empirically test thousands of CREs. Through in vitro and in vivo validation, we show that synthetic sequences outperform natural sequences from the human genome in driving cell type-specific expression. Synthetic sequences leverage unique sequence syntax to promote activity in the on-target cell type and simultaneously reduce activity in off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs and demonstrate the required literacy to write regulatory code that is fit-for-purpose in vivo across vertebrates.
Affiliation(s)
- SJ Gosai
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Graduate Program in Biological and Biomedical Science, Boston MA
- Department Of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- RI Castro
- The Jackson Laboratory, Bar Harbor, ME, USA
- N Fuentes
- The Jackson Laboratory, Bar Harbor, ME, USA
- Harvard College, Harvard University, Cambridge, MA, USA
- JC Butts
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- S Kales
- The Jackson Laboratory, Bar Harbor, ME, USA
- RR Noche
- Department of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA
- Yale Zebrafish Research Core, Yale School of Medicine, New Haven, CT, USA
- K Mouri
- The Jackson Laboratory, Bar Harbor, ME, USA
- PC Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department Of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- SK Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
- R Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
43
Luo R, Yan J, Oh JW, Xi W, Shigaki D, Wong W, Cho HS, Murphy D, Cutler R, Rosen BP, Pulecio J, Yang D, Glenn RA, Chen T, Li QV, Vierbuchen T, Sidoli S, Apostolou E, Huangfu D, Beer MA. Dynamic network-guided CRISPRi screen identifies CTCF-loop-constrained nonlinear enhancer gene regulatory activity during cell state transitions. Nat Genet 2023; 55:1336-1346. [PMID: 37488417 PMCID: PMC11012226 DOI: 10.1038/s41588-023-01450-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/15/2022] [Accepted: 06/20/2023] [Indexed: 07/26/2023]
Abstract
Comprehensive enhancer discovery is challenging because most enhancers, especially those contributing to complex diseases, have weak effects on gene expression. Our gene regulatory network modeling identified that nonlinear enhancer gene regulation during cell state transitions can be leveraged to improve the sensitivity of enhancer discovery. Using human embryonic stem cell definitive endoderm differentiation as a dynamic transition system, we conducted a mid-transition CRISPRi-based enhancer screen. We discovered a comprehensive set of enhancers for each of the core endoderm-specifying transcription factors. Many enhancers had strong effects mid-transition but weak effects post-transition, consistent with the nonlinear temporal responses to enhancer perturbation predicted by the modeling. Integrating three-dimensional genomic information, we were able to develop a CTCF-loop-constrained Interaction Activity model that can better predict functional enhancers compared to models that rely on Hi-C-based enhancer-promoter contact frequency. Our study provides generalizable strategies for sensitive and systematic enhancer discovery in both normal and pathological cell state transitions.
Affiliation(s)
- Renhe Luo
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Jielin Yan
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Jin Woo Oh
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Wang Xi
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Dustin Shigaki
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Wilfred Wong
- Computational & Systems Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
- Hyein S Cho
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Dylan Murphy
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
- Department of Medicine, Weill Cornell Medicine, New York City, NY, USA
- Ronald Cutler
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
- Bess P Rosen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
- Julian Pulecio
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Dapeng Yang
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Rachel A Glenn
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Weill Cornell Graduate School of Medical Sciences, Weill Cornell Medicine, New York City, NY, USA
- Tingxu Chen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Qing V Li
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York City, NY, USA
- Thomas Vierbuchen
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA
- Simone Sidoli
- Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY, USA
- Effie Apostolou
- Department of Medicine, Weill Cornell Medicine, New York City, NY, USA
- Danwei Huangfu
- Developmental Biology Program, Sloan Kettering Institute, New York City, NY, USA.
- Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
44
Wu X, Wu X, Xie W. Activation, decommissioning, and dememorization: enhancers in a life cycle. Trends Biochem Sci 2023; 48:673-688. [PMID: 37221124 DOI: 10.1016/j.tibs.2023.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/05/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 05/25/2023]
Abstract
Spatiotemporal regulation of cell type-specific gene expression is essential to convert a zygote into a complex organism that contains hundreds of distinct cell types. A class of cis-regulatory elements called enhancers, which have the potential to enhance target gene transcription, are crucial for precise gene expression programs during development. Following decades of research, many enhancers have been discovered and how enhancers become activated has been extensively studied. However, the mechanisms underlying enhancer silencing are less well understood. We review current understanding of enhancer decommissioning and dememorization, both of which enable enhancer silencing. We highlight recent progress from genome-wide perspectives that have revealed the life cycle of enhancers and how its dynamic regulation underlies cell fate transition, development, cell regeneration, and epigenetic reprogramming.
Affiliation(s)
- Xiaotong Wu
- Tsinghua-Peking Center for Life Sciences, New Cornerstone Science Laboratory, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China; Laboratory of Molecular Developmental Biology, State Key Laboratory of Membrane Biology, Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Xi Wu
- Tsinghua-Peking Center for Life Sciences, New Cornerstone Science Laboratory, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Wei Xie
- Tsinghua-Peking Center for Life Sciences, New Cornerstone Science Laboratory, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China.
45
Weiner L, Brissette JL. Finding meaning in chaos: a selection signature for functional interactions and its use in molecular biology. FEBS J 2023; 290:3914-3927. [PMID: 35653424 DOI: 10.1111/febs.16542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2022] [Revised: 04/18/2022] [Accepted: 06/01/2022] [Indexed: 11/28/2022]
Abstract
A primary goal of biomedical research is to elucidate molecular mechanisms, particularly those responsible for human traits, either normal or pathological. Yet achieving this goal is difficult if not impossible when the traits of interest lack tractable models and so cannot be dissected through time-honoured approaches like forward genetics or reconstitution. Arguably, no biological problem has hindered scientific progress more than this: the inability to dissect a trait's mechanism without a tractable likeness of the trait. At root, forward genetics and reconstitution are powerful approaches because they assay for specific molecular functions. Here, we discuss an alternative way to uncover important mechanistic interactions, namely, to assay for positive natural selection. If an interaction has been selected for, then it must perform an important function, a function that significantly promotes reproductive success. Accordingly, selection is a consequence and indicator of function, and uncovering multimolecular selection will reveal important functional interactions. We propose a selection signature for interactions and review recent selection-based approaches through which to dissect traits that are not inherently tractable. The review includes proof-of-principle studies in which important interactions were uncovered by screening for selection. In sum, screens for selection appear feasible when screens for specific functions are not. Selection screens thus constitute a novel tool through which to reveal the mechanisms that shape the fates of organisms.
Affiliation(s)
- Lorin Weiner
- Department of Cell Biology, State University of New York Downstate Health Sciences University, Brooklyn, NY, USA
- Janice L Brissette
- Department of Cell Biology, State University of New York Downstate Health Sciences University, Brooklyn, NY, USA
46
Armendariz DA, Sundarrajan A, Hon GC. Breaking enhancers to gain insights into developmental defects. eLife 2023; 12:e88187. [PMID: 37497775 PMCID: PMC10374278 DOI: 10.7554/elife.88187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/19/2023] [Indexed: 07/28/2023] Open
Abstract
Despite ground-breaking genetic studies that have identified thousands of risk variants for developmental diseases, how these variants lead to molecular and cellular phenotypes remains a gap in knowledge. Many of these variants are non-coding and occur at enhancers, which orchestrate key regulatory programs during development. The prevailing paradigm is that non-coding variants alter the activity of enhancers, impacting gene expression programs, and ultimately contributing to disease risk. A key obstacle to progress is the systematic functional characterization of non-coding variants at scale, especially since enhancer activity is highly specific to cell type and developmental stage. Here, we review the foundational studies of enhancers in developmental disease and current genomic approaches to functionally characterize developmental enhancers and their variants at scale. In the coming decade, we anticipate systematic enhancer perturbation studies to link non-coding variants to molecular mechanisms, changes in cell state, and disease phenotypes.
Affiliation(s)
- Daniel A Armendariz
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Anjana Sundarrajan
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Gary C Hon
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, United States
- Lyda Hill Department of Bioinformatics, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, United States
47
Bejjani F, Evanno E, Mahfoud S, Tolza C, Zibara K, Piechaczyk M, Jariel-Encontre I. Multiple Fra-1-bound enhancers showing different molecular and functional features can cooperate to repress gene transcription. Cell Biosci 2023; 13:129. [PMID: 37464380 DOI: 10.1186/s13578-023-01077-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 06/26/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND: How transcription factors (TFs) down-regulate gene expression remains ill-understood, especially when they bind to multiple enhancers contacting the same gene promoter. In particular, it is not known whether they exert similar or significantly different molecular effects at these enhancers.
RESULTS: To address this issue, we used a particularly well-suited study model consisting of the down-regulation of the TGFB2 gene by the TF Fra-1 in Fra-1-overexpressing cancer cells, as Fra-1 binds to multiple enhancers interacting with the TGFB2 promoter. We show that Fra-1 does not repress TGFB2 transcription via reducing RNA Pol II recruitment at the gene promoter but by decreasing the formation of its transcription-initiating form. This is associated with complex long-range chromatin interactions implicating multiple molecularly and functionally heterogeneous Fra-1-bound transcriptional enhancers distal to the TGFB2 transcriptional start site. In particular, the latter display differential requirements upon the presence and the activity of the lysine acetyltransferase p300/CBP. Furthermore, the final transcriptional output of the TGFB2 gene seems to depend on a balance between the positive and negative effects of Fra-1 at these enhancers.
CONCLUSION: Our work unveils complex molecular mechanisms underlying the repressive actions of Fra-1 on TGFB2 gene expression. This has consequences for our general understanding of the functioning of the ubiquitous transcriptional complex AP-1, of which Fra-1 is the most documented component for prooncogenic activities. In addition, it raises the general question of the heterogeneity of the molecular functions of TFs binding to different enhancers regulating the same gene.
Affiliation(s)
- Fabienne Bejjani
  - IGMM, Univ Montpellier, CNRS, Montpellier, France
  - DSST, ER045, PRASE, Lebanese University, Beirut, Lebanon
- Samantha Mahfoud
  - IGMM, Univ Montpellier, CNRS, Montpellier, France
  - DSST, ER045, PRASE, Lebanese University, Beirut, Lebanon
- Claire Tolza
  - IGMM, Univ Montpellier, CNRS, Montpellier, France
- Kazem Zibara
  - DSST, ER045, PRASE, Lebanese University, Beirut, Lebanon
  - Biology Department, Faculty of Sciences-I, Lebanese University, Beirut, Lebanon
- Isabelle Jariel-Encontre
  - IGMM, Univ Montpellier, CNRS, Montpellier, France
  - Institut de Recherche en Cancérologie de Montpellier, IRCM, INSERM U1194, ICM, Université de Montpellier, Montpellier, France
48
Dincer TU, Ernst J. Integrative epigenomic and functional characterization assay based annotation of regulatory activity across diverse human cell types. bioRxiv 2023:2023.07.14.549056. [PMID: 37503240 PMCID: PMC10369970 DOI: 10.1101/2023.07.14.549056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
We introduce ChromActivity, a computational framework for predicting and annotating regulatory activity across the genome through integration of multiple epigenomic maps and various functional characterization datasets. ChromActivity generates genome-wide predictions of regulatory activity associated with each functional characterization dataset across many cell types based on available epigenomic data. For each cell type, it then produces (1) ChromScoreHMM genome annotations based on the combinatorial and spatial patterns within these predictions and (2) ChromScore tracks of overall predicted regulatory activity. ChromActivity provides a resource for analyzing and interpreting the human regulatory genome across diverse cell types.
Affiliation(s)
- Tevfik Umut Dincer
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, 90095, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, 90095, USA
- Jason Ernst
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, 90095, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, 90095, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at University of California, Los Angeles, CA, 90095, USA
  - Computer Science Department, University of California, Los Angeles, CA, 90095, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, 90095, USA
  - Molecular Biology Institute, University of California, Los Angeles, CA, 90095, USA
  - Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA
49
Blotas C, Férec C, Moisan S. Tissue-Specific Regulation of CFTR Gene Expression. Int J Mol Sci 2023; 24:10678. [PMID: 37445855 DOI: 10.3390/ijms241310678] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/22/2023] [Revised: 06/21/2023] [Accepted: 06/23/2023] [Indexed: 07/15/2023] Open
Abstract
More than 2000 variations have been described within the CFTR (Cystic Fibrosis Transmembrane Conductance Regulator) gene and are associated with a wide range of clinical conditions, from cystic fibrosis to mono-organ diseases. Although these CFTR-associated diseases have been well documented, a large phenotype spectrum is observed and correlations between phenotypes and genotypes are still not well established. To address this issue, we present several regulatory elements that can modulate CFTR gene expression in a tissue-specific manner. Among them, cis-regulatory elements act through chromatin looping and take part in three-dimensional structured organization. With tissue-specific transcription factors, they form chromatin modules and can regulate gene expression. Alterations of specific regulations can impact and modulate disease expression. Understanding all those mechanisms highlights the need to expand research outside the gene to enhance our knowledge.
Affiliation(s)
- Clara Blotas
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Claude Férec
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Stéphanie Moisan
  - Univ Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
  - Laboratoire de Génétique Moléculaire et d'Histocompatibilité, CHU Brest, F-29200 Brest, France
50
Hussain S, Sadouni N, van Essen D, Dao LTM, Ferré Q, Charbonnier G, Torres M, Gallardo F, Lecellier CH, Sexton T, Saccani S, Spicuglia S. Short tandem repeats are important contributors to silencer elements in T cells. Nucleic Acids Res 2023; 51:4845-4866. [PMID: 36929452 PMCID: PMC10250210 DOI: 10.1093/nar/gkad187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 02/26/2023] [Accepted: 03/15/2023] [Indexed: 03/18/2023] Open
Abstract
The action of cis-regulatory elements with either activation or repression functions underpins the precise regulation of gene expression during normal development and cell differentiation. Gene activation by the combined activities of promoters and distal enhancers has been extensively studied in normal and pathological contexts. In sharp contrast, gene repression by cis-acting silencers, defined as genetic elements that negatively regulate gene transcription in a position-independent fashion, is less well understood. Here, we repurpose the STARR-seq approach as a novel high-throughput reporter strategy to quantitatively assess silencer activity in mammals. We assessed silencer activity from DNase I hypersensitive sites in a mouse T cell line. Identified silencers were associated with either repressive or active chromatin marks and enriched for binding motifs of known transcriptional repressors. CRISPR-mediated genomic deletions validated the repressive function of distinct silencers involved in the repression of non-T cell genes and genes regulated during T cell differentiation. Finally, we unravel an association of silencer activity with short tandem repeats, highlighting the role of repetitive elements in silencer activity. Our results provide a general strategy for genome-wide identification and characterization of silencer elements.
Affiliation(s)
- Saadat Hussain
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Nori Sadouni
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Dominic van Essen
  - Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
- Lan T M Dao
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Quentin Ferré
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Guillaume Charbonnier
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Magali Torres
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Frederic Gallardo
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France
- Charles-Henri Lecellier
  - Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France
  - LIRMM, University of Montpellier, CNRS, Montpellier, France
- Tom Sexton
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire – IGBMC (CNRS UMR 7104, INSERM U1258, Université de Strasbourg), 67404 Illkirch, France
- Simona Saccani
  - Institute for Research on Cancer and Ageing, IRCAN, 06107 Nice, France
- Salvatore Spicuglia
  - Aix-Marseille University, Inserm, TAGC, UMR1090, Marseille, France
  - Equipe Labélisée Ligue Contre le Cancer, Marseille, France