1
|
Kwait R, Pinsky ML, Gignoux‐Wolfsohn S, Eskew EA, Kerwin K, Maslo B. Impact of putatively beneficial genomic loci on gene expression in little brown bats (Myotis lucifugus, Le Conte, 1831) affected by white-nose syndrome. Evol Appl 2024; 17:e13748. [PMID: 39310794 PMCID: PMC11413065 DOI: 10.1111/eva.13748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 06/06/2024] [Accepted: 06/19/2024] [Indexed: 09/25/2024] Open
Abstract
Genome-wide scans for selection have become a popular tool for investigating evolutionary responses in wildlife to emerging diseases. However, genome scans are susceptible to false positives and do little to demonstrate specific mechanisms by which loci impact survival. Linking putatively resistant genotypes to observable phenotypes increases confidence in genome scan results and provides evidence of survival mechanisms that can guide conservation and management efforts. Here we used an expression quantitative trait loci (eQTL) analysis to uncover relationships between gene expression and alleles associated with the survival of little brown bats (Myotis lucifugus) despite infection with the causative agent of white-nose syndrome. We found that 25 of the 63 single-nucleotide polymorphisms (SNPs) associated with survival were related to gene expression in wing tissue. The differentially expressed genes have functional annotations associated with the innate immune system, metabolism, circadian rhythms, and the cellular response to stress. In addition, we observed differential expression of multiple genes with survival implications related to loci in linkage disequilibrium with focal SNPs. Together, these findings support the selective function of these loci and suggest that part of the mechanism driving survival may be the alteration of immune and other responses in epithelial tissue.
Affiliation(s)
- Robert Kwait
- Department of Ecology, Evolution and Natural Resources, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Malin L. Pinsky
- Department of Ecology, Evolution and Natural Resources, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, California, USA
| | | | - Evan A. Eskew
- Institute for Interdisciplinary Data Sciences, University of Idaho, Moscow, Idaho, USA
| | - Kathleen Kerwin
- Department of Ecology, Evolution and Natural Resources, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Brooke Maslo
- Department of Ecology, Evolution and Natural Resources, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| |
|
2
|
DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A. Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation. Genome Biol 2024; 25:221. [PMID: 39143563 PMCID: PMC11323586 DOI: 10.1186/s13059-024-03365-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 08/01/2024] [Indexed: 08/16/2024] Open
Abstract
BACKGROUND Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of the regulatory programs this variation affects can shed light on the apparatuses of human diseases. RESULTS We collect epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we construct networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks serve as the base for a rich series of analyses, through which we demonstrate their temporal dynamics and enrichment for various disease-associated variants. We apply the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrate methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. CONCLUSIONS Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes; this includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Affiliation(s)
- William DeGroat
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ, 08854, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA, 94720, USA
| | - Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot, 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA, 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA, 02139, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, 513 Parnassus Ave, San Francisco, CA, 94143, USA
- Institute for Human Genetics, University of California, 513 Parnassus Ave, San Francisco, CA, 94143, USA
| | - Anat Kreimer
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ, 08854, USA.
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ, 08854, USA.
| |
|
3
|
Ichiyama-Kobayashi S, Hata K, Wakamori K, Takahata Y, Murakami T, Yamanaka H, Takano H, Yao R, Uzawa N, Nishimura R. Chromatin profiling identifies chondrocyte-specific Sox9 enhancers important for skeletal development. JCI Insight 2024; 9:e175486. [PMID: 38855864 PMCID: PMC11382882 DOI: 10.1172/jci.insight.175486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 05/01/2024] [Indexed: 06/11/2024] Open
Abstract
The transcription factor SRY-related HMG box 9 (Sox9) is essential for chondrogenesis. Mutations in and around SOX9 cause campomelic dysplasia (CD) characterized by skeletal malformations. Although the function of Sox9 in this context is well studied, the mechanisms that regulate Sox9 expression in chondrocytes remain to be elucidated. Here, we have used genome-wide profiling to identify 2 Sox9 enhancers located in a proximal breakpoint cluster responsible for CD. Enhancer activity of E308 (located 308 kb 5' upstream) and E160 (located 160 kb 5' upstream) correlated with Sox9 expression levels, and both enhancers showed a synergistic effect in vitro. While single deletions in mice had no apparent effect, simultaneous deletion of both E308 and E160 caused a dwarf phenotype, concomitant with a reduction of Sox9 expression in chondrocytes. Moreover, bone morphogenetic protein 2-dependent chondrocyte differentiation of limb bud mesenchymal cells was severely attenuated in E308/E160 deletion mice. Finally, we found that an open chromatin region upstream of the Sox9 gene was reorganized in the E308/E160 deletion mice to partially compensate for the loss of E308 and E160. In conclusion, our findings reveal a mechanism of Sox9 gene regulation in chondrocytes that might aid in our understanding of the pathophysiology of skeletal disorders.
Affiliation(s)
- Sachi Ichiyama-Kobayashi
- Department of Molecular and Cellular Biochemistry
- Department of Oral and Maxillofacial Oncology and Surgery, and
| | - Kenji Hata
- Department of Molecular and Cellular Biochemistry
| | - Kanta Wakamori
- Department of Molecular and Cellular Biochemistry
- Department of Oral and Maxillofacial Oncology and Surgery, and
| | - Yoshifumi Takahata
- Department of Molecular and Cellular Biochemistry
- Genome Editing Research and Development Unit, Osaka University Graduate School of Dentistry, Suita, Osaka, Japan
| | | | - Hitomi Yamanaka
- Department of Cell Biology, Cancer Institute, Japanese Foundation for Cancer Research, Koto-ku, Tokyo, Japan
| | - Hiroshi Takano
- Department of Cell Biology, Cancer Institute, Japanese Foundation for Cancer Research, Koto-ku, Tokyo, Japan
| | - Ryoji Yao
- Department of Cell Biology, Cancer Institute, Japanese Foundation for Cancer Research, Koto-ku, Tokyo, Japan
| | - Narikazu Uzawa
- Department of Oral and Maxillofacial Oncology and Surgery, and
| | | |
|
4
|
DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A. Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.22.595375. [PMID: 38826254 PMCID: PMC11142193 DOI: 10.1101/2024.05.22.595375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Background Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of regulatory programs this variation affects can shed light on the apparatuses of human diseases. Results We collected epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we constructed networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks served as the base for a rich series of analyses, through which we demonstrated their temporal dynamics and enrichment for various disease-associated variants. We applied the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrated methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Conclusions Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes. This includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Affiliation(s)
- William DeGroat
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
| | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Tal Ashuach
- Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California, Berkeley, 387 Soda Hall, Berkeley, CA 94720, USA
| | - Nir Yosef
- Department of Systems Immunology, Weizmann Institute of Science, 234 Herzl Street, Rehovot 7610001, Israel
- Chan-Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
- Department of Systems Immunology, Ragon Institute of MGH, MIT, and Harvard Institute of Science, 400 Technology Square, Cambridge, MA 02139, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
- Institute for Human Genetics, University of California, San Francisco, 513 Parnassus Ave, CA 94143, USA
| | - Anat Kreimer
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, 679 Hoes Lane West, Piscataway, NJ 08854, USA
- Department of Biochemistry and Molecular Biology, Rutgers, The State University of New Jersey, 604 Allison Road, Piscataway, NJ 08854, USA
| |
|
5
|
Brennan KJ, Weilert M, Krueger S, Pampari A, Liu HY, Yang AWH, Morrison JA, Hughes TR, Rushlow CA, Kundaje A, Zeitlinger J. Chromatin accessibility in the Drosophila embryo is determined by transcription factor pioneering and enhancer activation. Dev Cell 2023; 58:1898-1916.e9. [PMID: 37557175 PMCID: PMC10592203 DOI: 10.1016/j.devcel.2023.07.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/28/2022] [Revised: 05/09/2023] [Accepted: 07/13/2023] [Indexed: 08/11/2023]
Abstract
Chromatin accessibility is integral to the process by which transcription factors (TFs) read out cis-regulatory DNA sequences, but it is difficult to differentiate between TFs that drive accessibility and those that do not. Deep learning models that learn complex sequence rules provide an unprecedented opportunity to dissect this problem. Using zygotic genome activation in Drosophila as a model, we analyzed high-resolution TF binding and chromatin accessibility data with interpretable deep learning and performed genetic validation experiments. We identify a hierarchical relationship between the pioneer TF Zelda and the TFs involved in axis patterning. Zelda consistently pioneers chromatin accessibility proportional to motif affinity, whereas patterning TFs augment chromatin accessibility in sequence contexts where they mediate enhancer activation. We conclude that chromatin accessibility occurs in two tiers: one through pioneering, which makes enhancers accessible but not necessarily active, and the second when the correct combination of TFs leads to enhancer activation.
Affiliation(s)
- Kaelan J Brennan
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Sabrina Krueger
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Anusri Pampari
- Department of Computer Science, Stanford University, Palo Alto, CA 94305, USA
| | - Hsiao-Yun Liu
- Department of Biology, New York University, New York, NY 10003, USA
| | - Ally W H Yang
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | - Jason A Morrison
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Timothy R Hughes
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | | | - Anshul Kundaje
- Department of Computer Science, Stanford University, Palo Alto, CA 94305, USA; Department of Genetics, Stanford University, Palo Alto, CA 94305, USA
| | - Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA; Department of Pathology & Laboratory Medicine, The University of Kansas Medical Center, Kansas City, KS 66160, USA.
| |
|
6
|
Yang Y, Li X, Meng Z, Liu Y, Qian K, Chu M, Pan Z. A body map of super-enhancers and their function in pig. Front Vet Sci 2023; 10:1239965. [PMID: 37869495 PMCID: PMC10587440 DOI: 10.3389/fvets.2023.1239965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 09/26/2023] [Indexed: 10/24/2023] Open
Abstract
Introduction Super-enhancers (SEs) are clusters of enhancers that act synergistically to drive the high-level expression of genes involved in cell identity and function. Although SEs have been extensively investigated in humans and mice, they have not been well characterized in pigs. Methods Here, we identified 42,380 SEs in 14 pig tissues using chromatin immunoprecipitation sequencing, summarized their overall statistics, studied the composition and characteristics of SEs, and explored the influence of SE characteristics on gene expression. Results We observed that approximately 40% of normal enhancers (NEs) form SEs. Compared to NEs, we found that SEs were more likely to be enriched with an activated enhancer and show activated functions. Interestingly, SEs showed X chromosome depletion and short interspersed nuclear element enrichment, implying that SEs play an important role in sex traits and repeat evolution. Additionally, SE-associated genes exhibited higher expression levels and stronger conservation than NE-associated genes. Furthermore, genes with the largest SEs had higher expression levels than those with the smallest SEs, indicating that SE size may influence gene expression. Moreover, we observed a negative correlation between SE gene distance and gene expression, indicating that the proximity of SEs can affect gene activity. Gene ontology enrichment and motif analysis revealed that SEs have strong tissue-specific activity. For example, the CORO2B gene with a brain-specific SE shows strong brain-specific expression, and the phenylalanine hydroxylase gene with liver-specific SEs shows strong liver-specific expression. Discussion In this study, we illustrated a body map of SEs and explored their functions in pigs, providing information on the composition and tissue-specific patterns of SEs. This study can serve as a valuable resource for gene regulatory and comparative analyses for the scientific community and provides a theoretical reference for genetic control mechanisms of important traits in pigs.
Affiliation(s)
- Youbing Yang
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Xinyue Li
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zhu Meng
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yongjian Liu
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Kaifeng Qian
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
| | - Mingxing Chu
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zhangyuan Pan
- College of Animal Science and Technology, Henan University of Science and Technology, Luoyang, China
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| |
|
7
|
Galupa R, Alvarez-Canales G, Borst NO, Fuqua T, Gandara L, Misunou N, Richter K, Alves MRP, Karumbi E, Perkins ML, Kocijan T, Rushlow CA, Crocker J. Enhancer architecture and chromatin accessibility constrain phenotypic space during Drosophila development. Dev Cell 2023; 58:51-62.e4. [PMID: 36626871 PMCID: PMC9860173 DOI: 10.1016/j.devcel.2022.12.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/04/2022] [Revised: 10/18/2022] [Accepted: 12/07/2022] [Indexed: 01/11/2023]
Abstract
Developmental enhancers bind transcription factors and dictate patterns of gene expression during development. Their molecular evolution can underlie phenotypical evolution, but the contributions of the evolutionary pathways involved remain little understood. Here, using mutation libraries in Drosophila melanogaster embryos, we observed that most point mutations in developmental enhancers led to changes in gene expression levels but rarely resulted in novel expression outside of the native pattern. In contrast, random sequences, often acting as developmental enhancers, drove expression across a range of cell types; random sequences including motifs for transcription factors with pioneer activity acted as enhancers even more frequently. Our findings suggest that the phenotypic landscapes of developmental enhancers are constrained by enhancer architecture and chromatin accessibility. We propose that the evolution of existing enhancers is limited in its capacity to generate novel phenotypes, whereas the activity of de novo elements is a primary source of phenotypic novelty.
Affiliation(s)
- Rafael Galupa
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany.
| | | | | | - Timothy Fuqua
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Lautaro Gandara
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Natalia Misunou
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Kerstin Richter
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | | | - Esther Karumbi
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | | | - Tin Kocijan
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | | | - Justin Crocker
- European Molecular Biology Laboratory, 69117 Heidelberg, Germany.
| |
|
8
|
Fong SL, Capra JA. Function and Constraint in Enhancer Sequences with Multiple Evolutionary Origins. Genome Biol Evol 2022; 14:evac159. [PMID: 36314566 PMCID: PMC9673499 DOI: 10.1093/gbe/evac159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/22/2022] [Indexed: 11/04/2022] Open
Abstract
Thousands of human gene regulatory enhancers are composed of sequences with multiple evolutionary origins. These evolutionarily "complex" enhancers consist of older "core" sequences and younger "derived" sequences. However, the functional relationship between the sequences of different evolutionary origins within complex enhancers is poorly understood. We evaluated the function, selective pressures, and sequence variation across core and derived components of human complex enhancers. We find that both components are older than expected from the genomic background, and complex enhancers are enriched for core and derived sequences of similar evolutionary ages. Both components show strong evidence of biochemical activity in massively parallel reporter assays. However, core and derived sequences have distinct transcription factor (TF)-binding preferences that are largely similar across evolutionary origins. As expected, given these signatures of function, both core and derived sequences have substantial evidence of purifying selection. Nonetheless, derived sequences exhibit weaker purifying selection than adjacent cores. Derived sequences also tolerate more common genetic variation and are enriched compared with cores for expression quantitative trait loci associated with gene expression variability in human populations. In conclusion, both core and derived sequences have strong evidence of gene regulatory function, but derived sequences have distinct constraint profiles, TF-binding preferences, and tolerance to variation compared with cores. We propose that the step-wise integration of younger derived with older core sequences has generated regulatory substrates with robust activity and the potential for functional variation. Our analyses demonstrate that synthesizing study of enhancer evolution and function can aid interpretation of regulatory sequence activity and functional variation across human populations.
Affiliation(s)
- Sarah L Fong
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee
| | - John A Capra
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco
| |
|
9
|
Ni P, Wilson D, Su Z. A map of cis-regulatory modules and constituent transcription factor binding sites in 80% of the mouse genome. BMC Genomics 2022; 23:714. [PMID: 36261804 PMCID: PMC9583556 DOI: 10.1186/s12864-022-08933-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Accepted: 10/11/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The mouse is probably the most important model organism for studying mammalian biology and human diseases. A better understanding of the mouse genome will help in understanding the human genome, biology and diseases. However, despite recent progress, the characterization of the regulatory sequences in the mouse genome is still far from complete, limiting its use for understanding the regulatory sequences in the human genome. RESULTS Here, by integrating binding peaks in ~9,000 transcription factor (TF) ChIP-seq datasets that cover 79.9% of the mouse mappable genome using an efficient pipeline, we were able to partition these binding peak-covered genome regions into a cis-regulatory module (CRM) candidate (CRMC) set and a non-CRMC set. The CRMCs contain 912,197 putative CRMs and 38,554,729 TF binding site (TFBS) islands, covering 55.5% and 24.4% of the mappable genome, respectively. The CRMCs tend to be under strong evolutionary constraints, indicating that they are likely cis-regulatory, while the non-CRMCs are largely selectively neutral, indicating that they are unlikely to be cis-regulatory. Based on evolutionary profiles of the genome positions, we further estimated that 63.8% and 27.4% of the mouse genome might code for CRMs and TFBSs, respectively. CONCLUSIONS Validation using experimental data suggests that at least most of the CRMCs are authentic. Thus, this unprecedentedly comprehensive map of CRMs and TFBSs can be a good resource to guide experimental studies of regulatory genomes in mice and humans.
Affiliation(s)
- Pengyu Ni
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
| | - David Wilson
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
| | - Zhengchang Su
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, Charlotte, NC, 28223, USA.
| |
|
10
|
Giacoman-Lozano M, Meléndez-Ramírez C, Martinez-Ledesma E, Cuevas-Diaz Duran R, Velasco I. Epigenetics of neural differentiation: Spotlight on enhancers. Front Cell Dev Biol 2022; 10:1001701. [PMID: 36313573 PMCID: PMC9606577 DOI: 10.3389/fcell.2022.1001701] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/23/2022] [Accepted: 10/03/2022] [Indexed: 11/28/2022] Open
Abstract
Neural induction, both in vivo and in vitro, includes cellular and molecular changes that result in phenotypic specialization related to specific transcriptional patterns. These changes are achieved through the implementation of complex gene regulatory networks. Furthermore, these regulatory networks are influenced by epigenetic mechanisms that drive cell heterogeneity and cell-type specificity, in a controlled and complex manner. Epigenetic marks, such as DNA methylation and histone residue modifications, are highly dynamic and stage-specific during neurogenesis. Genome-wide assessment of these modifications has allowed the identification of distinct non-coding regulatory regions involved in neural cell differentiation, maturation, and plasticity. Enhancers are short DNA regulatory regions that bind transcription factors (TFs) and interact with gene promoters to increase transcriptional activity. They are of special interest in neuroscience because they are enriched in neurons and underlie the cell-type-specificity and dynamic gene expression profiles. Classification of the full epigenomic landscape of neural subtypes is important to better understand gene regulation in brain health and during diseases. Advances in novel next-generation high-throughput sequencing technologies, genome editing, Genome-wide association studies (GWAS), stem cell differentiation, and brain organoids are allowing researchers to study brain development and neurodegenerative diseases with an unprecedented resolution. Herein, we describe important epigenetic mechanisms related to neurogenesis in mammals. We focus on the potential roles of neural enhancers in neurogenesis, cell-fate commitment, and neuronal plasticity. We review recent findings on epigenetic regulatory mechanisms involved in neurogenesis and discuss how sequence variations within enhancers may be associated with genetic risk for neurological and psychiatric disorders.
Affiliation(s)
- Mayela Giacoman-Lozano
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, NL, Mexico
| | - César Meléndez-Ramírez
- Instituto de Fisiología Celular—Neurociencias, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico
- Laboratorio de Reprogramación Celular, Instituto Nacional de Neurología y Neurocirugía “Manuel Velasco Suárez”, Mexico City, Mexico
| | - Emmanuel Martinez-Ledesma
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, NL, Mexico
- Tecnologico de Monterrey, The Institute for Obesity Research, Monterrey, NL, Mexico
| | - Raquel Cuevas-Diaz Duran
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, NL, Mexico
- *Correspondence: Raquel Cuevas-Diaz Duran; Iván Velasco
| | - Iván Velasco
- Instituto de Fisiología Celular—Neurociencias, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico
- Laboratorio de Reprogramación Celular, Instituto Nacional de Neurología y Neurocirugía “Manuel Velasco Suárez”, Mexico City, Mexico
- *Correspondence: Raquel Cuevas-Diaz Duran; Iván Velasco
| |
|
11
|
Heller IS, Guenther CA, Meireles AM, Talbot WS, Kingsley DM. Characterization of mouse Bmp5 regulatory injury element in zebrafish wound models. Bone 2022; 155:116263. [PMID: 34826632 PMCID: PMC9007314 DOI: 10.1016/j.bone.2021.116263] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/10/2021] [Revised: 11/17/2021] [Accepted: 11/18/2021] [Indexed: 11/21/2022]
Abstract
Many key signaling molecules used to build tissues during embryonic development are re-activated at injury sites to stimulate tissue regeneration and repair. Bone morphogenetic proteins (BMPs) provide a classic example, but the mechanisms that lead to reactivation of BMPs following injury are still unknown. Previous studies have mapped a large "injury response element" (IRE) in the mouse Bmp5 gene that drives gene expression following bone fractures and other types of injury. Here we show that the large mouse IRE region is also activated in both zebrafish tail resection and mechanosensory hair cell injury models. Using the ability to test multiple constructs and image temporal and spatial dynamics following injury responses, we have narrowed the original size of the mouse IRE region by over 100-fold and identified a small 142 bp minimal enhancer that is rapidly induced in both mesenchymal and epithelial tissues after injury. These studies identify a small sequence that responds to evolutionarily conserved local signals in wounded tissues and suggest candidate pathways that contribute to BMP reactivation after injury.
Affiliation(s)
- Ian S Heller
  - Department of Developmental Biology, Stanford University School of Medicine, United States of America
- Catherine A Guenther
  - Department of Developmental Biology, Stanford University School of Medicine, United States of America; Howard Hughes Medical Institute, Stanford University School of Medicine, United States of America
- Ana M Meireles
  - Department of Developmental Biology, Stanford University School of Medicine, United States of America
- William S Talbot
  - Department of Developmental Biology, Stanford University School of Medicine, United States of America
- David M Kingsley
  - Department of Developmental Biology, Stanford University School of Medicine, United States of America; Howard Hughes Medical Institute, Stanford University School of Medicine, United States of America
|
12
|
Mauduit D, Taskiran II, Minnoye L, de Waegeneer M, Christiaens V, Hulselmans G, Demeulemeester J, Wouters J, Aerts S. Analysis of long and short enhancers in melanoma cell states. eLife 2021; 10:e71735. [PMID: 34874265 PMCID: PMC8691835 DOI: 10.7554/elife.71735] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/28/2021] [Accepted: 12/06/2021] [Indexed: 12/14/2022] Open
Abstract
Understanding how enhancers drive cell-type specificity and efficiently identifying them is essential for the development of innovative therapeutic strategies. In melanoma, the melanocytic (MEL) and the mesenchymal-like (MES) states respond differently to therapy, making the identification of state-specific enhancers highly relevant. Using massively parallel reporter assays (MPRAs) in a panel of patient-derived melanoma lines (MM lines), we set out to identify and decipher melanoma enhancers by first focusing on regions with state-specific H3K27 acetylation close to differentially expressed genes. An in-depth evaluation of those regions was then pursued by investigating the activity of overlapping ATAC-seq peaks along with a full tiling of the acetylated regions with 190 bp sequences. Activity was observed in more than 60% of the selected regions, and we were able to precisely locate the active enhancers within ATAC-seq peaks. Comparison of sequence content with activity, using the deep learning model DeepMEL2, revealed that AP-1 alone is responsible for the MES enhancer activity. In contrast, SOX10 and MITF both influence MEL enhancer function, with SOX10 being required to achieve high levels of activity. Overall, our MPRAs shed light on the relationship between long and short sequences in terms of their sequence content, enhancer activity, and specificity across melanoma cell states.
Affiliation(s)
- David Mauduit
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Ibrahim Ihsan Taskiran
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Liesbeth Minnoye
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Maxime de Waegeneer
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Valerie Christiaens
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Gert Hulselmans
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Jonas Demeulemeester
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
  - Cancer Genomics Laboratory, The Francis Crick Institute, London, United Kingdom
- Jasper Wouters
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Stein Aerts
  - VIB-KU Leuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
|
13
|
Wolfe JC, Mikheeva LA, Hagras H, Zabet NR. An explainable artificial intelligence approach for decoding the enhancer histone modifications code and identification of novel enhancers in Drosophila. Genome Biol 2021; 22:308. [PMID: 34749786 PMCID: PMC8574042 DOI: 10.1186/s13059-021-02532-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/30/2020] [Accepted: 10/29/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: Enhancers are non-coding regions of the genome that control the activity of target genes. Recent efforts to identify active enhancers experimentally and in silico have proven effective. While these tools can predict the locations of enhancers with a high degree of accuracy, the mechanisms underpinning the activity of enhancers are often unclear. RESULTS: Using machine learning (ML) and a rule-based explainable artificial intelligence (XAI) model, we demonstrate that we can predict the location of known enhancers in Drosophila with a high degree of accuracy. Most importantly, we use the rules of the XAI model to provide insight into the underlying combinatorial histone modifications code of enhancers. In addition, we identified a large set of putative enhancers that display the same epigenetic signature as enhancers identified experimentally. These putative enhancers are enriched in nascent transcription, divergent transcription and have 3D contacts with promoters of transcribed genes. However, they display only intermediary enrichment of mediator and cohesin complexes compared to previously characterised active enhancers. We also found that 10-15% of the predicted enhancers display similar characteristics to super enhancers observed in other species. CONCLUSIONS: Here, we applied an explainable AI model to predict enhancers with high accuracy. Most importantly, we identified that different combinations of epigenetic marks characterise different groups of enhancers. Finally, we discovered a large set of putative enhancers which display similar characteristics with previously characterised active enhancers.
Affiliation(s)
- Jareth C Wolfe
  - School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
  - School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK
  - Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK
- Liudmila A Mikheeva
  - School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
  - Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK
  - Department of Mathematical Sciences, University of Essex, Colchester, CO4 3SQ, UK
- Hani Hagras
  - School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK
- Nicolae Radu Zabet
  - School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
  - Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK
|
14
|
Patel ZM, Hughes TR. Global properties of regulatory sequences are predicted by transcription factor recognition mechanisms. Genome Biol 2021; 22:285. [PMID: 34620190 PMCID: PMC8496038 DOI: 10.1186/s13059-021-02503-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/31/2020] [Accepted: 09/16/2021] [Indexed: 01/07/2023] Open
Abstract
Background: Mammalian genomes contain millions of putative regulatory sequences, which are delineated by binding of multiple transcription factors. The degree to which spacing and orientation constraints among transcription factor binding sites contribute to the recognition and identity of regulatory sequence is an unresolved but important question that impacts our understanding of genome function and evolution. Global mechanisms that underlie phenomena including the size of regulatory sequences, their uniqueness, and their evolutionary turnover remain poorly described. Results: Here, we ask whether models incorporating different degrees of spacing and orientation constraints among transcription factor binding sites are broadly consistent with several global properties of regulatory sequence. These properties include length, sequence diversity, turnover rate, and dominance of specific TFs in regulatory site identity and cell type specification. Models with and without spacing and orientation constraints are generally consistent with all observed properties of regulatory sequence, and with regulatory sequences being fundamentally small (~1 nucleosome). Uniqueness of regulatory regions and their rapid evolutionary turnover are expected under all models examined. An intriguing issue we identify is that the complexity of eukaryotic regulatory sites must scale with the number of active transcription factors, in order to accomplish observed specificity. Conclusions: Models of transcription factor binding with or without spacing and orientation constraints predict that regulatory sequences should be fundamentally short, unique, and turn over rapidly. We posit that the existence of master regulators may be, in part, a consequence of evolutionary pressure to limit the complexity and increase evolvability of regulatory sites. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-021-02503-y.
Affiliation(s)
- Zain M Patel
  - Donnelly Centre for Cellular and Biomolecular Research and Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Timothy R Hughes
  - Donnelly Centre for Cellular and Biomolecular Research and Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 3E1, Canada
|
15
|
Chen Z, Zhang J, Liu J, Dai Y, Lee D, Min MR, Xu M, Gerstein M. DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays. Bioinformatics 2021; 37:i280-i288. [PMID: 34252960 PMCID: PMC8275369 DOI: 10.1093/bioinformatics/btab283] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 04/26/2021] [Indexed: 11/13/2022] Open
Abstract
Motivation: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have implemented enhancer discovery as a binary classification problem without accurate boundary detection, producing low-resolution annotations with superfluous regions and reducing the statistical power for downstream analyses (e.g. causal variant mapping and functional validations). Here, we addressed these challenges via a two-step model called Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays (DECODE). First, we employed direct enhancer-activity readouts from novel functional characterization assays, such as STARR-seq, to train a deep neural network for accurate cell-type-specific enhancer prediction. Second, to improve the annotation resolution, we implemented a weakly supervised object detection framework for enhancer localization with precise boundary detection (to a 10 bp resolution) using Gradient-weighted Class Activation Mapping. Results: Our DECODE binary classifier outperformed a state-of-the-art enhancer prediction method by 24% in transgenic mouse validation. Furthermore, the object detection framework can condense enhancer annotations to only 13% of their original size, and these compact annotations have significantly higher conservation scores and genome-wide association study variant enrichments than the original predictions. Overall, DECODE is an effective tool for enhancer classification and precise localization. Availability and implementation: DECODE source code and pre-processing scripts are available at decode.gersteinlab.org. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhanlin Chen
  - Department of Statistics & Data Science, Yale University, New Haven, CT 06520, USA
- Jing Zhang
  - Department of Computer Science, University of California, Irvine, CA 92617, USA
- Jason Liu
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Yi Dai
  - Department of Computer Science, University of California, Irvine, CA 92617, USA
- Donghoon Lee
  - Genetics and Genomic Sciences, The Icahn School of Medicine at Mount Sinai, New York, NY 10029-6574, USA
- Min Xu
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Mark Gerstein
  - Department of Statistics & Data Science, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06520, USA
|
16
|
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
|
17
|
Ni P, Su Z. Accurate prediction of cis-regulatory modules reveals a prevalent regulatory genome of humans. NAR Genom Bioinform 2021; 3:lqab052. [PMID: 34159315 PMCID: PMC8210889 DOI: 10.1093/nargab/lqab052] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/26/2021] [Revised: 05/01/2021] [Accepted: 06/14/2021] [Indexed: 02/07/2023] Open
Abstract
cis-regulatory modules (CRMs) formed by clusters of transcription factor (TF) binding sites (TFBSs) are as important as coding sequences in specifying phenotypes of humans. It is essential to categorize all CRMs and constituent TFBSs in the genome. In contrast to most existing methods that predict CRMs in specific cell types using epigenetic marks, we predict a largely cell type agnostic but more comprehensive map of CRMs and constituent TFBSs in the genome by integrating all available TF ChIP-seq datasets. Our method is able to partition the 77.47% of genome regions covered by the 6092 available datasets into a CRM candidate (CRMC) set (56.84%) and a non-CRMC set (43.16%). Intriguingly, the predicted CRMCs are under strong evolutionary constraints, while the non-CRMCs are largely selectively neutral, strongly suggesting that the CRMCs are likely cis-regulatory, while the non-CRMCs are not. Our predicted CRMs are under stronger evolutionary constraints than three state-of-the-art predictions (GeneHancer, EnhancerAtlas and ENCODE phase 3) and substantially outperform them for recalling VISTA enhancers and non-coding ClinVar variants. We estimated that the human genome might encode about 1.47M CRMs and 68M TFBSs, comprising about 55% and 22% of the genome, respectively; for both of which, we predicted 80%. Therefore, the cis-regulatory genome appears to be more prevalent than originally thought.
Affiliation(s)
- Pengyu Ni
  - Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
- Zhengchang Su
  - Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
|
18
|
Jindal GA, Farley EK. Enhancer grammar in development, evolution, and disease: dependencies and interplay. Dev Cell 2021; 56:575-587. [PMID: 33689769 PMCID: PMC8462829 DOI: 10.1016/j.devcel.2021.02.016] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/13/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 12/19/2022]
Abstract
Each language has standard books describing that language's grammatical rules. Biologists have searched for similar, albeit more complex, principles relating enhancer sequence to gene expression. Here, we review the literature on enhancer grammar. We introduce dependency grammar, a model where enhancers encode information based on dependencies between enhancer features shaped by mechanistic, evolutionary, and biological constraints. Classifying enhancers based on the types of dependencies may identify unifying principles relating enhancer sequence to gene expression. Such rules would allow us to read the instructions for development within genomes and pinpoint causal enhancer variants underlying disease and evolutionary changes.
Affiliation(s)
- Granton A Jindal
  - Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
- Emma K Farley
  - Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
|
19
|
Makashov AA, Myasnikova EM, Spirov AV. Fuzzy Linguistic Modeling of the Regulation of Drosophila Segmentation Genes. Biophysics (Nagoya-shi) 2021. [DOI: 10.1134/s0006350921010073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|
20
|
Co-option of the lineage-specific LAVA retrotransposon in the gibbon genome. Proc Natl Acad Sci U S A 2020; 117:19328-19338. [PMID: 32690705 DOI: 10.1073/pnas.2006038117] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 01/11/2023] Open
Abstract
Co-option of transposable elements (TEs) to become part of existing or new enhancers is an important mechanism for evolution of gene regulation. However, contributions of lineage-specific TE insertions to recent regulatory adaptations remain poorly understood. Gibbons present a suitable model to study these contributions as they have evolved a lineage-specific TE called LAVA (LINE-AluSz-VNTR-Alu LIKE), which is still active in the gibbon genome. The LAVA retrotransposon is thought to have played a role in the emergence of the highly rearranged structure of the gibbon genome by disrupting transcription of cell cycle genes. In this study, we investigated whether LAVA may have also contributed to the evolution of gene regulation by adopting enhancer function. We characterized fixed and polymorphic LAVA insertions across multiple gibbons and found 96 LAVA elements overlapping enhancer chromatin states. Moreover, LAVA was enriched in multiple transcription factor binding motifs, was bound by an important transcription factor (PU.1), and was associated with higher levels of gene expression in cis. We found gibbon-specific signatures of purifying/positive selection at 27 LAVA insertions. Two of these insertions were fixed in the gibbon lineage and overlapped with enhancer chromatin states, representing putative co-opted LAVA enhancers. These putative enhancers were located within genes encoding SETD2 and RAD9A, two proteins that facilitate accurate repair of DNA double-strand breaks and prevent chromosomal rearrangement mutations. Co-option of LAVA in these genes may have influenced regulation of processes that preserve genome integrity. Our findings highlight the importance of considering lineage-specific TEs in studying evolution of gene regulatory elements.
|
21
|
Rivera J, Keränen SVE, Gallo SM, Halfon MS. REDfly: the transcriptional regulatory element database for Drosophila. Nucleic Acids Res 2019; 47:D828-D834. [PMID: 30329093 PMCID: PMC6323911 DOI: 10.1093/nar/gky957] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 09/06/2018] [Accepted: 10/04/2018] [Indexed: 12/21/2022] Open
Abstract
The REDfly database provides a comprehensive curation of experimentally-validated Drosophila transcriptional cis-regulatory elements and includes information on DNA sequence, experimental evidence, patterns of regulated gene expression, and more. Now in its thirteenth year, REDfly has grown to over 23 000 records of tested reporter gene constructs and 2200 tested transcription factor binding sites. Recent developments include the start of curation of predicted cis-regulatory modules in addition to experimentally-verified ones, improved search and filtering, and increased interaction with the authors of curated papers. An expanded data model that will capture information on temporal aspects of gene regulation, regulation in response to environmental and other non-developmental cues, sexually dimorphic gene regulation, and non-endogenous (ectopic) aspects of reporter gene expression is under development and expected to be in place within the coming year. REDfly is freely accessible at http://redfly.ccr.buffalo.edu, and news about database updates and new features can be followed on Twitter at @REDfly_database.
Affiliation(s)
- John Rivera
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Steven M Gallo
  - Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S Halfon
  - New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
22
|
Gasperini M, Tome JM, Shendure J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nat Rev Genet 2020; 21:292-310. [PMID: 31988385 PMCID: PMC7845138 DOI: 10.1038/s41576-019-0209-0] [Citation(s) in RCA: 159] [Impact Index Per Article: 39.8] [Accepted: 12/13/2019] [Indexed: 12/14/2022]
Abstract
The human gene catalogue is essentially complete, but we lack an equivalently vetted inventory of bona fide human enhancers. Hundreds of thousands of candidate enhancers have been nominated via biochemical annotations; however, only a handful of these have been validated and confidently linked to their target genes. Here we review emerging technologies for discovering, characterizing and validating human enhancers at scale. We furthermore propose a new framework for operationally defining enhancers that accommodates the heterogeneous and complementary results that are emerging from reporter assays, biochemical measurements and CRISPR screens.
Affiliation(s)
- Molly Gasperini
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Jacob M Tome
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
  - Allen Discovery Center for Cell Lineage, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
23
|
Sabarís G, Laiker I, Preger-Ben Noon E, Frankel N. Actors with Multiple Roles: Pleiotropic Enhancers and the Paradigm of Enhancer Modularity. Trends Genet 2019; 35:423-433. [DOI: 10.1016/j.tig.2019.03.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 02/14/2019] [Accepted: 03/21/2019] [Indexed: 10/27/2022]
|
24
|
Combs PA, Fraser HB. Spatially varying cis-regulatory divergence in Drosophila embryos elucidates cis-regulatory logic. PLoS Genet 2018; 14:e1007631. [PMID: 30383747 PMCID: PMC6211617 DOI: 10.1371/journal.pgen.1007631] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/15/2018] [Accepted: 08/14/2018] [Indexed: 12/30/2022] Open
Abstract
Spatial patterning of gene expression is a key process in development, yet how it evolves is still poorly understood. Both cis- and trans-acting changes could participate in complex interactions, so to isolate the cis-regulatory component of patterning evolution, we measured allele-specific spatial gene expression patterns in D. melanogaster × simulans hybrid embryos. RNA-seq of cryo-sectioned slices revealed 66 genes with strong spatially varying allele-specific expression. We found that hunchback, a major regulator of developmental patterning, had reduced expression of the D. simulans allele specifically in the anterior tip of hybrid embryos. Mathematical modeling of hunchback cis-regulation suggested a candidate transcription factor binding site variant, which we verified as causal using CRISPR-Cas9 genome editing. In sum, even comparing morphologically near-identical species we identified surprisingly extensive spatial variation in gene expression, suggesting not only that development is robust to many such changes, but also that natural selection may have ample raw material for evolving new body plans via changes in spatial patterning.
Affiliation(s)
- Peter A. Combs
  - Department of Biology, Stanford University, Stanford, California, United States of America
- Hunter B. Fraser
  - Department of Biology, Stanford University, Stanford, California, United States of America
|