51
Montavon T, Duboule D. Landscapes and archipelagos: spatial organization of gene regulation in vertebrates. Trends Cell Biol 2012; 22:347-54. [PMID: 22560708 DOI: 10.1016/j.tcb.2012.04.003] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 02/02/2012] [Revised: 03/30/2012] [Accepted: 04/03/2012] [Indexed: 11/28/2022]
Abstract
Vertebrate genes controlling critical developmental processes are often regulated by complex sets of global enhancer sequences, located at a distance, within neighboring gene deserts. Recent technological advances have made it possible to investigate the spatial organization of these 'regulatory landscapes'. The integration of such datasets with information on chromatin status, transcriptional activity and nuclear localization of these loci, as well as the effects of genetic modifications thereof, may bring a more comprehensive understanding of tissue- and/or stage-specific gene regulation in both normal and pathological contexts. Here, we review the impact of recent technological advances on our understanding of large-scale gene regulation in vertebrates, by focusing on paradigmatic gene loci.
Affiliation(s)
- Thomas Montavon
- National Research Centre Frontiers in Genetics, University of Geneva, Geneva, Switzerland
52
Transcriptional enhancers in protein-coding exons of vertebrate developmental genes. PLoS One 2012; 7:e35202. [PMID: 22567096 PMCID: PMC3342275 DOI: 10.1371/journal.pone.0035202] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 12/22/2011] [Accepted: 03/10/2012] [Indexed: 11/19/2022] Open
Abstract
Many conserved noncoding sequences function as transcriptional enhancers that regulate gene expression. Here, we report that protein-coding DNA also frequently contains enhancers functioning at the transcriptional level. We tested the enhancer activity of 31 protein-coding exons, which we chose based on strong sequence conservation between zebrafish and human, and occurrence in developmental genes, using a Tol2 transposable GFP reporter assay in zebrafish. For each exon we measured GFP expression in hundreds of embryos in 10 anatomies via a novel system that implements the voice-recognition capabilities of a cellular phone. We find that 24/31 (77%) exons drive GFP expression compared to a minimal promoter control, and 14/24 are anatomy-specific (expression in four anatomies or less). GFP expression driven by these coding enhancers frequently overlaps the anatomies where the host gene is expressed (60%), suggesting self-regulation. Highly conserved coding sequences and highly conserved noncoding sequences do not significantly differ in enhancer activity (coding: 24/31 vs. noncoding: 105/147) or tissue-specificity (coding: 14/24 vs. noncoding: 50/105). Furthermore, coding and noncoding enhancers display similar levels of the enhancer-related histone modification H3K4me1 (coding: 9/24 vs noncoding: 34/81). Meanwhile, coding enhancers are over three times as likely to contain an H3K4me1 mark as other exons of the host gene. Our work suggests that developmental transcriptional enhancers do not discriminate between coding and noncoding DNA and reveals widespread dual functions in protein-coding DNA.
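The comparisons between coding and noncoding sequences reported in this abstract (enhancer activity 24/31 vs. 105/147; tissue specificity 14/24 vs. 50/105) are 2x2 contingency tables. The following is a minimal sketch of how such a comparison could be checked, assuming a two-sided Fisher's exact test; the abstract does not state which test the authors used, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumption: this is an illustrative test choice, not necessarily the
    one used in the cited study.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x "successes" in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Counts taken from the abstract: (active, inactive) per sequence class.
p_activity = fisher_exact_two_sided(24, 31 - 24, 105, 147 - 105)
# (anatomy-specific, not anatomy-specific) among active enhancers.
p_specificity = fisher_exact_two_sided(14, 24 - 14, 50, 105 - 50)

print(f"enhancer activity: p = {p_activity:.3f}")
print(f"tissue specificity: p = {p_specificity:.3f}")
```

Under this assumed test, both p-values come out above 0.05, in line with the abstract's statement that the two classes do not differ significantly.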
53
Takahashi M, Saitou N. Identification and characterization of lineage-specific highly conserved noncoding sequences in Mammalian genomes. Genome Biol Evol 2012; 4:641-57. [PMID: 22505575 PMCID: PMC3381673 DOI: 10.1093/gbe/evs035] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Accepted: 03/23/2012] [Indexed: 01/12/2023] Open
Abstract
Vertebrate genome comparisons revealed that there are highly conserved noncoding sequences (HCNSs) among a wide range of species, many of which contain regulatory elements. However, recently emerged sequences conserved in specific lineages have not been well studied. Toward this end, we identified 8,198 primate-specific and 21,128 rodent-specific HCNSs as representative ones among mammals from human-marmoset and mouse-rat comparisons, respectively. Derived allele frequency analysis of primate-specific HCNSs showed that these HCNSs were under purifying selection, indicating that they may harbor important functions. We selected the top 1,000 largest HCNSs and compared the lineage-specific HCNS-flanking genes (LHF genes) with ultraconserved element (UCE)-flanking genes. Interestingly, the majority of LHF genes were different from UCE-flanking genes. This lineage-specific set of LHF genes was more enriched in protein-binding function. Conversely, the number of LHF genes that were also shared by UCEs was small but significantly larger than random expectation, and many of these genes were involved in anatomical development as transcriptional regulators, suggesting that certain groups of genes preferentially recruit new HCNSs in addition to old HCNSs that are conserved among vertebrates. This group of LHF genes might be involved in the various levels of lineage-specific evolution among vertebrates, mammals, primates, and rodents. If so, the emergence of HCNSs in and around these two groups of LHF genes developed lineage-specific characteristics. Our results provide new insight into lineage-specific evolution through interactions between HCNSs and their LHF genes.
Affiliation(s)
- Mahoko Takahashi
- Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Japan
- Division of Population Genetics, National Institute of Genetics, Japan
- Present address: Department of Genetics, Stanford University
- Naruya Saitou
- Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Japan
- Division of Population Genetics, National Institute of Genetics, Japan
54
Chatterjee S, Lufkin T. Regulatory genomics: Insights from the zebrafish. Current Topics in Genetics 2012; 5:1-10. [PMID: 23440612 PMCID: PMC3577074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
The sequencing of many vertebrate species over the last decade has opened up the possibility of using comparative genomics as a powerful tool to elucidate regulatory elements in the vertebrate genome. The zebrafish has played a pivotal role in this process. Its genome has been used in large-scale genome comparisons to locate vertebrate specific regulatory elements and also it has been an excellent model system to test out the predicted DNA sequences for their ability to drive reporter gene expression in vivo. In spite of all the successes there have still been some issues in using the zebrafish as a model system for these kinds of assays. This review will shed some light on the successes and failures of the zebrafish in pushing forward regulatory genomics.
Affiliation(s)
- Sumantra Chatterjee
- Stem Cell and Developmental Biology, Genome Institute of Singapore, 60 Biopolis Street, 138672, Singapore
55
DNaseI hypersensitivity and ultraconservation reveal novel, interdependent long-range enhancers at the complex Pax6 cis-regulatory region. PLoS One 2011; 6:e28616. [PMID: 22220192 PMCID: PMC3248410 DOI: 10.1371/journal.pone.0028616] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 09/23/2011] [Accepted: 11/11/2011] [Indexed: 02/01/2023] Open
Abstract
The PAX6 gene plays a crucial role in development of the eye, brain, olfactory system and endocrine pancreas. Consistent with its pleiotropic role the gene exhibits a complex developmental expression pattern which is subject to strict spatial, temporal and quantitative regulation. Control of expression depends on a large array of cis-elements residing in an extended genomic domain around the coding region of the gene. The minimal essential region required for proper regulation of this complex locus has been defined through analysis of human aniridia-associated breakpoints and YAC transgenic rescue studies of the mouse smalleye mutant. We have carried out a systematic DNase I hypersensitive site (HS) analysis across 200 kb of this critical region of mouse chromosome 2E3 to identify putative regulatory elements. Mapping the identified HSs onto a percent identity plot (PIP) shows many HSs correspond to recognisable genomic features such as evolutionarily conserved sequences, CpG islands and retrotransposon derived repeats. We then focussed on a region previously shown to contain essential long range cis-regulatory information, the Pax6 downstream regulatory region (DRR), allowing comparison of mouse HS data with previous human HS data for this region. Reporter transgenic mice for two of the HS sites, HS5 and HS6, show that they function as tissue specific regulatory elements. In addition we have characterised enhancer activity of an ultra-conserved cis-regulatory region located near Pax6, termed E60. All three cis-elements exhibit multiple spatio-temporal activities in the embryo that overlap between themselves and other elements in the locus. Using a deletion set of YAC reporter transgenic mice we demonstrate functional interdependence of the elements. 
Finally, we use the HS6 enhancer as a marker for the migration of precerebellar neuro-epithelium cells to the hindbrain precerebellar nuclei along the posterior and anterior extramural streams allowing visualisation of migratory defects in both pathways in Pax6(Sey/Sey) mice.
56
Rockman MV. The QTN program and the alleles that matter for evolution: all that's gold does not glitter. Evolution 2011; 66:1-17. [PMID: 22220860 DOI: 10.1111/j.1558-5646.2011.01486.x] [Citation(s) in RCA: 462] [Impact Index Per Article: 35.5] [Indexed: 12/12/2022]
Abstract
The search for the alleles that matter, the quantitative trait nucleotides (QTNs) that underlie heritable variation within populations and divergence among them, is a popular pursuit. But what is the question to which QTNs are the answer? Although their pursuit is often invoked as a means of addressing the molecular basis of phenotypic evolution or of estimating the roles of evolutionary forces, the QTNs that are accessible to experimentalists, QTNs of relatively large effect, may be uninformative about these issues if large-effect variants are unrepresentative of the alleles that matter. Although 20th century evolutionary biology generally viewed large-effect variants as atypical, the field has recently undergone a quiet realignment toward a view of readily discoverable large-effect alleles as the primary molecular substrates for evolution. I argue that neither theory nor data justify this realignment. Models and experimental findings covering broad swaths of evolutionary phenomena suggest that evolution often acts via large numbers of small-effect polygenes, individually undetectable. Moreover, these small-effect variants are different in kind, at the molecular level, from the large-effect alleles accessible to experimentalists. Although discoverable QTNs address some fundamental evolutionary questions, they are essentially misleading about many others.
Affiliation(s)
- Matthew V Rockman
- Department of Biology and Center for Genomics and Systems Biology, New York University, 12 Waverly Place, New York, NY 10003, USA.
57
Chatterjee S, Bourque G, Lufkin T. Conserved and non-conserved enhancers direct tissue specific transcription in ancient germ layer specific developmental control genes. BMC Developmental Biology 2011; 11:63. [PMID: 22011226 PMCID: PMC3210094 DOI: 10.1186/1471-213x-11-63] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 08/31/2011] [Accepted: 10/20/2011] [Indexed: 01/29/2023]
Abstract
BACKGROUND Identifying DNA sequences (enhancers) that direct the precise spatial and temporal expression of developmental control genes remains a significant challenge in the annotation of vertebrate genomes. Locating these sequences, which in many cases lie at a great distance from the transcription start site, has been a major obstacle in deciphering gene regulation. Coupling of comparative genomics with functional validation to locate such regulatory elements has been a successful method in locating many such regulatory elements. But most of these studies looked either at a single gene only or the whole genome without focusing on any particular process. The pressing need is to integrate the tools of comparative genomics with knowledge of developmental biology to validate enhancers for developmental transcription factors in greater detail.
RESULTS Our results show that near four different genes (nkx3.2, pax9, otx1b and foxa2) in zebrafish, only 20-30% of highly conserved DNA sequences can act as developmental enhancers irrespective of the tissue the gene expresses in. We find that some genes also have multiple conserved enhancers expressing in the same tissue at the same or different time points in development. We also located non-conserved enhancers for two of the genes (pax9 and otx1b). Our modified Bacterial artificial chromosome (BACs) studies for these 4 genes revealed that many of these enhancers work in a synergistic fashion, which cannot be captured by individual DNA constructs and are not conserved at the sequence level. Our detailed biochemical and transgenic analysis revealed Foxa1 binds to the otx1b non-conserved enhancer to direct its activity in forebrain and otic vesicle of zebrafish at 24 hpf.
CONCLUSION Our results clearly indicate that high level of functional conservation of genes is not necessarily associated with sequence conservation of its regulatory elements. Moreover certain non conserved DNA elements might have role in gene regulation. The need is to bring together multiple approaches to bear upon individual genes to decipher all its regulatory elements.
Affiliation(s)
- Sumantra Chatterjee
- Stem Cell and Developmental Biology, Genome Institute of Singapore, 60 Biopolis Street, 138672, Singapore
58
Abstract
The enteric nervous system (ENS) is composed of neurons and glia that modulate many aspects of intestinal function. The ability to use both forward and reverse genetic approaches and to visualize development in living embryos and larvae has made zebrafish an attractive model in which to study mechanisms underlying ENS development. In this chapter, we review the recent work describing the development and organization of the zebrafish ENS and how this relates to intestinal motility. We also discuss the cellular, molecular, and genetic mechanisms that have been revealed by these studies and how they are providing new insights into human ENS diseases.
Affiliation(s)
- Iain Shepherd
- Department of Biology, Emory University Rollins Research Building, Atlanta, Georgia, USA
59
Royo JL, Hidalgo C, Roncero Y, Seda MA, Akalin A, Lenhard B, Casares F, Gómez-Skarmeta JL. Dissecting the transcriptional regulatory properties of human chromosome 16 highly conserved non-coding regions. PLoS One 2011; 6:e24824. [PMID: 21935474 PMCID: PMC3172297 DOI: 10.1371/journal.pone.0024824] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 07/05/2011] [Accepted: 08/18/2011] [Indexed: 12/28/2022] Open
Abstract
Non-coding DNA conservation across species has been often used as a predictor for transcriptional enhancer activity. However, only a few systematic analyses of the function of these highly conserved non-coding regions (HCNRs) have been performed. Here we use zebrafish transgenic assays to perform a systematic study of 113 HCNRs from human chromosome 16. By comparing transient and stable transgenesis, we show that the first method is highly inefficient, leading to 40% of false positives and 20% of false negatives. When analyzed in stable transgenic lines, a great majority of HCNRs were active in the central nervous system, although some of them drove expression in other organs such as the eye and the excretory system. Finally, by testing a fraction of the HCNRs lacking enhancer activity for in vivo insulator activity, we find that 20% of them may contain enhancer-blocking function. Altogether our data indicate that HCNRs may contain different types of cis-regulatory activity, including enhancer, insulators as well as other not yet discovered functions.
Affiliation(s)
- José Luis Royo
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
- Carmen Hidalgo
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
- Yolanda Roncero
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
- María Angeles Seda
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
- Altuna Akalin
- Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Bergen, Norway
- Boris Lenhard
- Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Bergen, Norway
- Sars Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
- Fernando Casares
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
- José Luis Gómez-Skarmeta
- Centro Andaluz de Biologia del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Sevilla, Spain
60
Abstract
Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers.
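The abstract does not specify which "general sequence features" the SVM uses. As one illustration of how a DNA sequence can be turned into a fixed-length feature vector suitable for a classifier, the sketch below assumes normalized k-mer frequencies. The function name `kmer_features` and the choice of k are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=3):
    """Return normalized k-mer frequencies of seq as a list of length 4**k.

    Assumption: k-mer frequencies stand in for the unspecified
    "general sequence features" mentioned in the abstract.
    """
    seq = seq.upper()
    # Fixed vocabulary so every sequence maps to the same vector layout.
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    valid = set(vocab)
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Skip windows containing ambiguous bases such as N.
    counts = Counter(w for w in windows if w in valid)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


features = kmer_features("ACGTACGT", k=2)
print(len(features))  # one entry per possible 2-mer
```

Vectors of this kind, computed for bound and unbound regions, could then be passed to any standard SVM implementation; the training step itself is omitted here.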
61
Stine ZE, McGaughey DM, Bessling SL, Li S, McCallion AS. Steroid hormone modulation of RET through two estrogen responsive enhancers in breast cancer. Hum Mol Genet 2011; 20:3746-56. [PMID: 21737465 DOI: 10.1093/hmg/ddr291] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open
Abstract
RET, a gene causatively mutated in Hirschsprung disease and cancer, has recently been implicated in breast cancer estrogen (E2) independence and tamoxifen resistance. RET displays both E2 and retinoic acid (RA)-dependent transcriptional modulation in E2-responsive breast cancers. However, the regulatory elements through which the steroid hormone transcriptional regulation of RET is mediated are poorly defined. Recent genome-wide chromatin immunoprecipitation-based studies have identified 10 putative E2 receptor-alpha (ESR1) and RA receptor alpha-binding sites at the RET locus, of which we demonstrate only two (RET -49.8 and RET +32.8) display significant E2 regulatory response when assayed independently in MCF-7 breast cancer cells. We demonstrate that endogenous RET expression and RET -49.8 regulatory activity are cooperatively regulated by E2 and RA in breast cancer cells. We identify key sequences that are required for RET -49.8 and RET +32.8 E2 responsiveness, including motifs known to be bound by ESR1, FOXA1 and TFAP2C. We also report that both RET -49.8 regulatory activity and endogenous RET expression are completely dependent on ESR1 for their (E2)-induction and that ESR1 is sufficient to mediate the E2-induced enhancer activity of RET -49.8 and RET +32.8. Finally, using zebrafish transgenesis, we also demonstrate that RET -49.8 directs reporter expression in the central nervous system and peripheral nervous system consistent with the endogenous ret expression. Taken collectively, these data suggest that RET transcription in breast cancer cells is modulated by E2 via ESR1 acting on multiple elements collectively.
Affiliation(s)
- Zachary E Stine
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, Baltimore, MD 21205, USA
62
Fulton DL, Denarier E, Friedman HC, Wasserman WW, Peterson AC. Towards resolving the transcription factor network controlling myelin gene expression. Nucleic Acids Res 2011; 39:7974-91. [PMID: 21729871 PMCID: PMC3185407 DOI: 10.1093/nar/gkr326] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/26/2022] Open
Abstract
In the central nervous system (CNS), myelin is produced from spirally-wrapped oligodendrocyte plasma membrane and, as exemplified by the debilitating effects of inherited or acquired myelin abnormalities in diseases such as multiple sclerosis, it plays a critical role in nervous system function. Myelin sheath production coincides with rapid up-regulation of numerous genes. The complexity of their subsequent expression patterns, along with recently recognized heterogeneity within the oligodendrocyte lineage, suggest that the regulatory networks controlling such genes drive multiple context-specific transcriptional programs. Conferring this nuanced level of control likely involves a large repertoire of interacting transcription factors (TFs). Here, we combined novel strategies of computational sequence analyses with in vivo functional analysis to establish a TF network model of coordinate myelin-associated gene transcription. Notably, the network model captures regulatory DNA elements and TFs known to regulate oligodendrocyte myelin gene transcription and/or oligodendrocyte development, thereby validating our approach. Further, it links to numerous TFs with previously unsuspected roles in CNS myelination and suggests collaborative relationships amongst both known and novel TFs, thus providing deeper insight into the myelin gene transcriptional network.
Affiliation(s)
- Debra L Fulton
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, V5Z 4H4, Canada
63
Taher L, McGaughey DM, Maragh S, Aneas I, Bessling SL, Miller W, Nobrega MA, McCallion AS, Ovcharenko I. Genome-wide identification of conserved regulatory function in diverged sequences. Genome Res 2011; 21:1139-49. [PMID: 21628450 PMCID: PMC3129256 DOI: 10.1101/gr.119016.110] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 12/08/2010] [Accepted: 04/19/2011] [Indexed: 01/16/2023]
Abstract
Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evolutionarily diverged loci by searching for conserved TFBS arrangements. With an estimated 5% false-positive rate (FPR) in approximately 3000 human and zebrafish syntenic loci, we detected approximately 300 pairs of diverged elements that are likely to share common ancestry and have similar regulatory activity. By analyzing a pool of experimentally validated human enhancers, we demonstrated that 7/8 (88%) of their predicted functional orthologs retained in vivo regulatory control. Moreover, in 5/7 (71%) of assayed enhancer pairs, we observed concordant expression patterns. We argue that TFBS composition is often necessary to retain and sufficient to predict regulatory function in the absence of overt sequence conservation, revealing an entire class of functionally conserved, evolutionarily diverged regulatory elements that we term "covert."
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- David M. McGaughey
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Samantha Maragh
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Biochemical Science Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA
- Ivy Aneas
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
- Seneca L. Bessling
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Webb Miller
- Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Marcelo A. Nobrega
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
- Andrew S. McCallion
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
64
Chatterjee S, Lufkin T. Fishing for function: zebrafish BAC transgenics for functional genomics. Molecular BioSystems 2011; 7:2345-51. [PMID: 21647532 DOI: 10.1039/c1mb05116d] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
Transgenics using bacterial artificial chromosomes (BACs) offers a great opportunity to look at gene regulation in a developing embryo. The modified BAC containing a reporter inserted just before the translational start site of the gene of interest allows for the visualization of spatio-temporal gene expression. Though this method has been used in the mouse model extensively, its utility in zebrafish studies is relatively new. This review aims to look at the utility of making BAC transgenics in zebrafish and its applications in functional genomics. We look at the various methods to modify the BAC, some limitations and what the future holds.
Affiliation(s)
- Sumantra Chatterjee
- Stem Cell and Developmental Biology, Genome Institute of Singapore, Singapore
65
Barrière A, Gordon KL, Ruvinsky I. Distinct functional constraints partition sequence conservation in a cis-regulatory element. PLoS Genet 2011; 7:e1002095. [PMID: 21655084 PMCID: PMC3107193 DOI: 10.1371/journal.pgen.1002095] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 10/25/2010] [Accepted: 04/07/2011] [Indexed: 11/25/2022] Open
Abstract
Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. Comparison between genome sequences of different species is a powerful tool in modern biology because important features are maintained by natural selection and are therefore conserved. However, some important sequences within genomes evolve considerably faster than others. One possible explanation is that they encode little or no function. Alternatively, they may evolve under different constraints that permit sequence turnover while maintaining function. Here we report that the promoter of the unc-47 gene of C. elegans contains two discrete elements. One has a highly conserved sequence that determines the spatial expression pattern. Another shows no sequence conservation, but it makes expression of the gene robust, that is, consistent between individuals and resilient to environmental challenges. 
Remarkably, multiple unrelated sequences are capable of promoting robust expression. Nucleotide composition of these sequences suggests that open chromatin may play a role in conferring robustness of gene expression. Because general sequence composition and therefore expression robustness can be maintained despite sequence turnover, our results offer an explanation of how rapidly diverging promoter elements can nevertheless remain functionally conserved.
Affiliation(s)
- Antoine Barrière
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America
- Kacy L. Gordon
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
|
66
|
Hemberg M, Kreiman G. Conservation of transcription factor binding events predicts gene expression across species. Nucleic Acids Res 2011; 39:7092-102. [PMID: 21622661 PMCID: PMC3167604 DOI: 10.1093/nar/gkr404] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/26/2022] Open
Abstract
Recent technological advances have made it possible to determine the genome-wide binding sites of transcription factors (TFs). Comparisons across species have suggested a relatively low degree of evolutionary conservation of experimentally defined TF binding events (TFBEs). Using binding data for six different TFs in hepatocytes and embryonic stem cells from human and mouse, we demonstrate that evolutionary conservation of TFBEs within orthologous proximal promoters is closely linked to function, defined as expression of the target genes. We show that (i) there is a significantly higher degree of conservation of TFBEs when the target gene is expressed in both species; (ii) there is increased conservation of binding events for groups of TFs compared to individual TFs; and (iii) conserved TFBEs have a greater impact on the expression of their target genes than non-conserved ones. These results link conservation of structural elements (TFBEs) to conservation of function (gene expression) and suggest a higher degree of functional conservation than implied by previous studies.
Affiliation(s)
- Martin Hemberg
- Children's Hospital Boston, Program in Biophysics and Program in Neuroscience, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
|
67
|
Bulger M, Groudine M. Functional and mechanistic diversity of distal transcription enhancers. Cell 2011; 144:327-39. [PMID: 21295696 DOI: 10.1016/j.cell.2011.01.024] [Citation(s) in RCA: 616] [Impact Index Per Article: 47.4] [Received: 10/12/2010] [Revised: 12/20/2010] [Accepted: 01/18/2011] [Indexed: 12/28/2022]
Abstract
Biological differences among metazoans and between cell types in a given organism arise in large part due to differences in gene expression patterns. Gene-distal enhancers are key contributors to these expression patterns, exhibiting both sequence diversity and cell type specificity. Studies of long-range interactions indicate that enhancers are often important determinants of nuclear organization, contributing to a general model for enhancer function that involves direct enhancer-promoter contact. However, mechanisms for enhancer function are emerging that do not fit solely within such a model, suggesting that enhancers as a class of DNA regulatory element may be functionally and mechanistically diverse.
Affiliation(s)
- Michael Bulger
- Center for Pediatric Biomedical Research, Department of Pediatrics, University of Rochester, NY 14627, USA.
|
68
|
Matsson P, Yee SW, Markova S, Morrissey K, Jenkins G, Xuan J, Jorgenson E, Kroetz DL, Giacomini KM. Discovery of regulatory elements in human ATP-binding cassette transporters through expression quantitative trait mapping. The Pharmacogenomics Journal 2011; 12:214-26. [PMID: 21383772 PMCID: PMC3325368 DOI: 10.1038/tpj.2011.8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 12/21/2022]
Abstract
ATP-Binding Cassette (ABC) membrane transporters determine the disposition of many drugs, metabolites and endogenous compounds. Coding region variation in ABC transporters is the cause of many genetic disorders, but much less is known about the genetic basis and functional outcome of ABC transporter expression level variation. We used genotype and mRNA transcript level data from human lymphoblastoid cell lines to assess population and gender differences in ABC transporter expression, and to guide the discovery of genomic regions involved in transcriptional regulation. Nineteen of 49 ABC genes were differentially expressed between individuals of African, Asian and European descent suggesting an important influence of race on expression level of ABC transporters. Twenty-four significant associations were found between transporter transcript levels and proximally located genetic variants. Several of the associations were experimentally validated in reporter assays. Through influencing ABC expression levels, these SNPs may affect disease susceptibility and response to drugs.
Affiliation(s)
- P Matsson
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
|
69
|
A systematic enhancer screen using lentivector transgenesis identifies conserved and non-conserved functional elements at the Olig1 and Olig2 locus. PLoS One 2010; 5:e15741. [PMID: 21206754 PMCID: PMC3012086 DOI: 10.1371/journal.pone.0015741] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 08/22/2010] [Accepted: 11/23/2010] [Indexed: 01/22/2023] Open
Abstract
Finding sequences that control expression of genes is central to understanding genome function. Previous studies have used evolutionary conservation as an indicator of regulatory potential. Here, we present a method for the unbiased in vivo screen of putative enhancers in large DNA regions, using the mouse as a model. We cloned a library of 142 overlapping fragments from a 200 kb-long murine BAC in a lentiviral vector expressing LacZ from a minimal promoter, and used the resulting vectors to infect fertilized murine oocytes. LacZ staining of E11 embryos obtained by first using the vectors in pools and then testing individual candidates led to the identification of 3 enhancers, only one of which shows significant evolutionary conservation. In situ hybridization and 3C/4C experiments suggest that this enhancer, which is active in the neural tube and posterior diencephalon, influences the expression of the Olig1 and/or Olig2 genes. This work provides a new approach for the large-scale in vivo screening of transcriptional regulatory sequences, and further demonstrates that evolutionary conservation alone seems too limiting a criterion for the identification of enhancers.
|
70
|
Antonellis A, Dennis MY, Burzynski G, Huynh J, Maduro V, Hodonsky CJ, Khajavi M, Szigeti K, Mukkamala S, Bessling SL, Pavan WJ, McCallion AS, Lupski JR, Green ED. A rare myelin protein zero (MPZ) variant alters enhancer activity in vitro and in vivo. PLoS One 2010; 5:e14346. [PMID: 21179557 PMCID: PMC3002941 DOI: 10.1371/journal.pone.0014346] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 07/01/2010] [Accepted: 11/26/2010] [Indexed: 01/16/2023] Open
Abstract
Background: Myelin protein zero (MPZ) is a critical structural component of myelin in the peripheral nervous system. The MPZ gene is regulated, in part, by the transcription factors SOX10 and EGR2. Mutations in MPZ, SOX10, and EGR2 have been implicated in demyelinating peripheral neuropathies, suggesting that components of this transcriptional network are candidates for harboring disease-causing mutations (or otherwise functional variants) that affect MPZ expression. Methodology: We utilized a combination of multi-species sequence comparisons, transcription factor-binding site predictions, targeted human DNA re-sequencing, and in vitro and in vivo enhancer assays to study human non-coding MPZ variants. Principal Findings: Our efforts revealed a variant within the first intron of MPZ that resides within a previously described SOX10 binding site, is associated with decreased enhancer activity, and alters binding of nuclear proteins. Additionally, the genomic segment harboring this variant directs tissue-relevant reporter gene expression in zebrafish. Conclusions: This is the first reported MPZ variant within a cis-acting transcriptional regulatory element. While we were unable to implicate this variant in disease onset, our data suggest that similar non-coding sequences should be screened for mutations in patients with neurological disease. Furthermore, our multi-faceted approach for examining the functional significance of non-coding variants can be readily generalized to study other loci important for myelin structure and function.
Affiliation(s)
- Anthony Antonellis
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
- Department of Neurology, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
- Megan Y. Dennis
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Grzegorz Burzynski
- McKusick–Nathans Institute of Genetic Medicine and Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Jimmy Huynh
- McKusick–Nathans Institute of Genetic Medicine and Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Valerie Maduro
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Chani J. Hodonsky
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
- Mehrdad Khajavi
- Department of Molecular and Human Genetics, Houston, Texas, United States of America
- Kinga Szigeti
- Department of Molecular and Human Genetics, Houston, Texas, United States of America
- Department of Neurology, Houston, Texas, United States of America
- Sandeep Mukkamala
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Seneca L. Bessling
- McKusick–Nathans Institute of Genetic Medicine and Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- NISC Comparative Sequencing Program
- NIH Intramural Sequencing Center (NISC), National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- William J. Pavan
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Andrew S. McCallion
- McKusick–Nathans Institute of Genetic Medicine and Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- James R. Lupski
- Department of Molecular and Human Genetics, Houston, Texas, United States of America
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
- Texas Children's Hospital, Houston, Texas, United States of America
- Eric D. Green
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- NIH Intramural Sequencing Center (NISC), National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|
71
|
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
|
72
|
Mannaert A, Amemiya CT, Bossuyt F. Comparative analyses of vertebrate posterior HoxD clusters reveal atypical cluster architecture in the caecilian Typhlonectes natans. BMC Genomics 2010; 11:658. [PMID: 21106068 PMCID: PMC3091776 DOI: 10.1186/1471-2164-11-658] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/11/2010] [Accepted: 11/24/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The posterior genes of the HoxD cluster play a crucial role in the patterning of the tetrapod limb. This region is under the control of a global, long-range enhancer that is present in all vertebrates. Variation in limb types, as is the case in amphibians, probably cannot be attributed solely to variation in Hox genes, but is likely also a product of differences in gene regulation. With a collection of vertebrate genome sequences available today, we used a comparative genomics approach to study the posterior HoxD cluster of amphibians. A frog and a caecilian were included in the study to compare coding sequences as well as to determine the gain and loss of putative regulatory sequences. RESULTS: We sequenced the posterior end of the HoxD cluster of a caecilian and performed comparative analyses of this region using HoxD clusters of other vertebrates. We determined the presence of conserved non-coding sequences and traced gains and losses of these footprints during vertebrate evolution, with particular focus on amphibians. We found that the caecilian HoxD cluster is almost three times larger than its mammalian counterpart. This enlargement is accompanied by the loss of one gene and the accumulation of repeats in that area. A similar phenomenon was observed in the coelacanth, where a different gene was lost and the area where the gene was lost has expanded. At least one phylogenetic footprint present in all vertebrates was lost in amphibians. This conserved region is a known regulatory element and functions as a boundary element in neural tissue to prevent expression of Hoxd genes. CONCLUSION: The posterior part of the HoxD cluster of Typhlonectes natans is among the largest known today. The loss of Hoxd-12 and the expansion of the intergenic region may exert an influence on the limb enhancer, by having to bypass a distance seven times that of regular HoxD clusters. Whether or not there is a correlation with the loss of limbs remains to be investigated. These results, together with data on other vertebrates, show that the tetrapod Hox clusters are more variable than previously thought.
Affiliation(s)
- An Mannaert
- Biology Department, ECOL, Amphibian Evolution Lab, Vrije Universiteit Brussel, Brussels, Belgium
- Chris T Amemiya
- Benaroya Research Institute at Virginia Mason and University of Washington, Seattle, USA
- Franky Bossuyt
- Biology Department, ECOL, Amphibian Evolution Lab, Vrije Universiteit Brussel, Brussels, Belgium
|
73
|
Abstract
Transcriptional regulation of gene expression plays a significant role in establishing the diversity of human cell types and biological functions from a common set of genes. The components of regulatory control in the human genome include cis-acting elements that act across immense genomic distances to influence the spatial and temporal distribution of gene expression. Here we review the established categories of distant-acting regulatory elements, discussing the classical and contemporary evidence of their regulatory potential and clinical importance. Current efforts to identify regulatory sequences throughout the genome and elucidate their biological significance depend heavily on advances in sequence conservation-based analyses and on increasingly large-scale efforts applying transgenic technologies in model organisms. We discuss the advantages and limitations of sequence conservation as a predictor of regulatory function and present complementary emerging technologies now being applied to annotate regulatory elements in vertebrate genomes.
Affiliation(s)
- James P Noonan
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA.
|
74
|
Multiple enhancers located in a 1-Mb region upstream of POU3F4 promote expression during inner ear development and may be required for hearing. Hum Genet 2010; 128:411-9. [PMID: 20668882 PMCID: PMC2939330 DOI: 10.1007/s00439-010-0864-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 06/16/2010] [Accepted: 07/13/2010] [Indexed: 01/01/2023]
Abstract
POU3F4 encodes a POU-domain transcription factor required for inner ear development. Defects in POU3F4 function are associated with X-linked deafness type 3 (DFN3). Multiple deletions affecting up to ~900 kb upstream of POU3F4 are found in DFN3 patients, suggesting the presence of essential POU3F4 enhancers in this region. Recently, an inner ear enhancer was reported that is absent in most DFN3 patients with upstream deletions. However, two indications suggest that additional enhancers in the POU3F4 upstream region are required for POU3F4 function during inner ear development. First, there is at least one DFN3 deletion that does not eliminate the reported enhancer. Second, the expression pattern driven by this enhancer does not fully recapitulate Pou3f4 expression in the inner ear. Here, we screened a 1-Mb region upstream of the POU3F4 gene for additional cis-regulatory elements and searched for novel DFN3 mutations in the identified POU3F4 enhancers. We found several novel enhancers for otic vesicle expression. Some of these also drive expression in kidney, pancreas and brain, tissues that are known to express Pou3f4. In addition, we report a new deletion in a DFN3 family, the smallest identified so far, which eliminates 3.9 kb comprising almost exclusively the previously reported inner ear enhancer. We suggest that multiple enhancers control the expression of Pou3f4 in the inner ear and that these may contribute to the phenotype observed in DFN3 patients. In addition, the novel deletion demonstrates that the previously reported enhancer, although not sufficient, is essential for POU3F4 function during inner ear development.
|
75
|
McGaughey DM, McCallion AS. Efficient discovery of ASCL1 regulatory sequences through transgene pooling. Genomics 2010; 95:363-9. [PMID: 20206680 PMCID: PMC2904508 DOI: 10.1016/j.ygeno.2010.02.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/08/2009] [Revised: 02/19/2010] [Accepted: 02/25/2010] [Indexed: 10/19/2022]
Abstract
Zebrafish transgenesis is a powerful and increasingly common strategy to assay vertebrate transcriptional regulatory control. Several challenges remain, however, to the broader application of this technique; they include increasing the rate with which transgenes can be analyzed and maximizing the informational value of the data generated. Presently, many rely on the injection of individual constructs and the analysis of resulting reporter expression in mosaic G0 embryos. Here, we contrast these approaches, examining whether injecting pooled transgene constructs can increase the efficiency with which regulatory sequences can be assayed, restricting analysis to the offspring of germ line transmitting transgenic zebrafish in an effort to reduce potential subjectivity. We selected a 64kb interval encompassing the human ASCL1 locus as our model interval and report the analysis of 9 highly conserved putative enhancers therein. We identified 32 transgene-positive zebrafish, transmitting one or more independent constructs displaying ASCL1-like regulatory control. Through examination of embryos harboring one or more transgenes, we demonstrate that five of the nine sequences account for the observed control and describe their likely roles in ASCL1 regulation. These data demonstrate the utility of this approach and its potential for further adaptation and higher throughput application.
Affiliation(s)
- David M. McGaughey
- McKusick - Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, BRB Suite 449, Baltimore, MD 21205, USA
- Andrew S. McCallion
- McKusick - Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, BRB Suite 449, Baltimore, MD 21205, USA
- Department of Molecular and Comparative Pathobiology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
|
76
|
Ritter DI, Li Q, Kostka D, Pollard KS, Guo S, Chuang JH. The importance of being cis: evolution of orthologous fish and mammalian enhancer activity. Mol Biol Evol 2010; 27:2322-32. [PMID: 20494938 DOI: 10.1093/molbev/msq128] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 12/17/2022] Open
Abstract
Conserved noncoding elements (CNEs) in vertebrate genomes often act as developmental enhancers, but a critical issue is how well orthologous CNE sequences retain the same activity in their respective species, a characteristic important for generalization of model organism studies. To quantify how well CNE enhancer activity has been preserved, we compared the anatomy-specific activities of 41 zebra fish CNEs in zebra fish embryos with the activities of orthologous human CNEs in mouse embryos. We found that 13/41 (∼30%) of the orthologous CNE pairs exhibit conserved positive activity in zebra fish and mouse. Conserved positive activity is only weakly associated with either sequence conservation or the absence of bases undergoing accelerated evolution. A stronger effect is that disparate activity is associated with transcription factor binding site divergence. To distinguish the contributions of cis- versus trans-regulatory changes, we analyzed 13 CNEs in a three-way experimental comparison: human CNE tested in zebra fish, human CNE tested in mouse, and orthologous zebra fish CNE tested in zebra fish. Both cis- and trans-changes affect a significant fraction of CNEs, although human and zebra fish sequences exhibit disparate activity in zebra fish (indicating cis regulatory changes) twice as often as human sequences show disparate activity when tested in mouse and zebra fish (indicating trans regulatory changes). In all four cases where the zebra fish and human CNE display a similar expression pattern in zebra fish, the human CNE also displays a similar expression pattern in mouse. This suggests that the endogenous enhancer activity of ∼30% of human CNEs can be determined from experiments in zebra fish alone, and to identify these CNEs, both the zebra fish and the human sequences should be tested.
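The logic of the three-way comparison described in this abstract can be illustrated with a minimal sketch. It assumes that each assay result is summarized as a set of anatomy labels already mapped to a shared vocabulary across species, and it treats any difference between two sets as "disparate activity". The function name, the labels, and that equality criterion are hypothetical and are not taken from the paper.

```python
def classify_cne(human_in_zebrafish, human_in_mouse, zebrafish_in_zebrafish):
    """Flag putative cis and trans changes for one orthologous CNE pair.

    Each argument is an iterable of anatomy labels in which reporter
    expression was observed (a hypothetical encoding for illustration).
    """
    hz = set(human_in_zebrafish)
    hm = set(human_in_mouse)
    zz = set(zebrafish_in_zebrafish)
    # Same host (zebrafish), different sequence: a difference points to cis changes.
    cis_change = hz != zz
    # Same sequence (human), different host: a difference points to trans changes.
    trans_change = hz != hm
    return {"cis_change": cis_change, "trans_change": trans_change}


# Hypothetical example: the human CNE behaves identically in both hosts,
# but the zebrafish ortholog drives an additional domain in zebrafish.
example = classify_cne(
    human_in_zebrafish={"hindbrain"},
    human_in_mouse={"hindbrain"},
    zebrafish_in_zebrafish={"hindbrain", "eye"},
)
```

Exact set equality is a deliberately strict stand-in; a real analysis would likely need a graded similarity measure and a tolerance for assay noise.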
Affiliation(s)
- Deborah I Ritter
- Department of Biology, Boston College, Chestnut Hill, Massachusetts, USA
|
77
|
Rao A, States DJ, Hero AO, Engel JD. Understanding distal transcriptional regulation from sequence, expression and interactome perspectives. J Bioinform Comput Biol 2010; 8:219-46. [PMID: 20401945 DOI: 10.1142/s0219720010004756] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/20/2009] [Revised: 10/17/2009] [Accepted: 10/17/2009] [Indexed: 11/18/2022]
Abstract
Gene regulation in eukaryotes involves a complex interplay between the proximal promoter and distal genomic elements (such as enhancers) which work in concert to drive precise spatio-temporal gene expression. The experimental localization and characterization of gene regulatory elements is a very complex and resource-intensive process. The computational identification of regulatory regions that confer spatiotemporally specific tissue-restricted expression of a gene is thus an important challenge for computational biology. One of the most popular strategies for enhancer localization from DNA sequence is the use of conservation-based prefiltering and more recently, the use of canonical (transcription factor motifs) or de novo tissue-specific sequence motifs. However, there is an ongoing effort in the computational biology community to further improve the fidelity of enhancer predictions from sequence data by integrating other, complementary genomic modalities. In this work, we propose a framework that complements existing methodologies for prospective enhancer identification. The methods in this work are derived from two key insights: (i) that chromatin modification signatures can discriminate proximal and distally located regulatory regions and (ii) the notion of promoter-enhancer cross-talk (as assayed in 3C/5C experiments) might have implications in the search for regulatory sequences that co-operate with the promoter to yield tissue-restricted, gene-specific expression.
Affiliation(s)
- Arvind Rao
- Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
|
78
|
Ernst J, Plasterer HL, Simon I, Bar-Joseph Z. Integrating multiple evidence sources to predict transcription factor binding in the human genome. Genome Res 2010; 20:526-36. [PMID: 20219943 DOI: 10.1101/gr.096305.109] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Indexed: 01/06/2023]
Abstract
Information about the binding preferences of many transcription factors is known and characterized by a sequence binding motif. However, determining regions of the genome in which a transcription factor binds based on its motif is a challenging problem, particularly in species with large genomes, since many sequences contain matches to the motif but are not bound. Several rules based on sequence conservation or location, relative to a transcription start site, have been proposed to help differentiate true binding sites from random ones. Other evidence sources may also be informative for this task. We developed a method for integrating multiple evidence sources using logistic regression classifiers. Our method works in two steps. First, we infer a score quantifying the general binding preferences of transcription factor binding at all locations based on a large set of evidence features, without using any motif-specific information. Then, we combine this general binding preference score with motif information for specific transcription factors to improve prediction of regions bound by the factor. Using cross-validation and new experimental data we show that, surprisingly, the general binding preference can be highly predictive of true locations of transcription factor binding even when no binding motif is used. When combined with motif information, our method outperforms previous methods for predicting locations of true binding.
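The two-step idea in this abstract can be sketched in a few lines. The abstract does not state how the general score and the motif information are combined, so the combination rule below (adding a motif log-likelihood ratio to the log-odds of the general score) is an assumption made for illustration, as are the feature names, weights, and function names. In practice the weights would be fitted by logistic regression on labeled binding data rather than set by hand.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def general_binding_score(features, weights, bias):
    """Step 1: motif-independent binding score from a logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def combine_with_motif(general_score, motif_log_lr):
    """Step 2 (assumed rule): shift the log-odds of the general score
    by a motif log-likelihood ratio and map back to a probability."""
    log_odds = math.log(general_score / (1.0 - general_score))
    return sigmoid(log_odds + motif_log_lr)


# Hypothetical features: [conservation, proximity to TSS, open-chromatin signal]
weights = [1.5, 2.0, 1.0]   # illustrative values, not fitted
bias = -3.0
region = [0.8, 0.9, 0.7]

general = general_binding_score(region, weights, bias)
combined = combine_with_motif(general, motif_log_lr=2.0)
```

With this rule, a motif log-likelihood ratio of zero leaves the general score unchanged, which mirrors the abstract's point that the general score can be used on its own when no motif is available.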
Affiliation(s)
- Jason Ernst
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
79
|
Cis-regulatory characterization of sequence conservation surrounding the Hox4 genes. Dev Biol 2010; 340:269-82. [PMID: 20144609 DOI: 10.1016/j.ydbio.2010.01.035] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 01/07/2009] [Revised: 01/17/2010] [Accepted: 01/30/2010] [Indexed: 01/30/2023]
Abstract
Hox genes are key regulators of anterior-posterior axis patterning and have a major role in hindbrain development. The zebrafish Hox4 paralogs have strong overlapping activities in hindbrain rhombomeres 7 and 8, in the spinal cord and in the pharyngeal arches. With the aim of predicting enhancers that act on the hoxa4a, hoxb4a, hoxc4a and hoxd4a genes, we used sequence conservation around the Hox4 genes to analyze all fish:human conserved non-coding sequences by reporter assays in stable zebrafish transgenesis. Thirty-four elements were functionally tested in GFP reporter gene constructs and more than 100 F1 lines were analyzed to establish a correlation between sequence conservation and cis-regulatory function, constituting a catalog of Hox4 CNEs. Sixteen tissue-specific enhancers could be identified. Multiple alignments of the CNEs revealed paralogous cis-regulatory sequences; however, the CNE sequence similarities were found not to correlate with tissue specificity. To identify ancestral enhancers that direct Hox4 gene activity, genome sequence alignments of mammals, teleosts, horn shark and the cephalochordate amphioxus, which is the most basal extant chordate possessing a single prototypical Hox cluster, were performed. Three elements were identified and two of them exhibited regulatory activity in transgenic zebrafish, although without revealing any specificity. Our data show that the approach of identifying cis-regulatory sequences by genome sequence alignments and subsequent testing in zebrafish transgenesis can be used to define enhancers within the Hox clusters, and that these have significantly diverged in their function during evolution.
|
80
|
Sholtis SJ, Noonan JP. Gene regulation and the origins of human biological uniqueness. Trends Genet 2010; 26:110-8. [PMID: 20106546 DOI: 10.1016/j.tig.2009.12.009] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 11/29/2009] [Revised: 12/23/2009] [Accepted: 12/23/2009] [Indexed: 02/01/2023]
Abstract
What makes us human? It is likely that changes in gene expression and regulation, in addition to those in protein-coding genes, drove the evolution of uniquely human biological traits. In this review, we discuss how efforts to annotate regulatory functions in the human genome are being combined with maps of human-specific sequence acceleration to identify cis-regulatory elements with human-specific activity. Although the evolutionary interpretation of these events is a subject of considerable debate, the technical and analytical means are now at hand to identify the set of evolutionary genetic events that shaped our species.
Affiliation(s)
- Samuel J Sholtis
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA
|
81
|
Alu and B1 repeats have been selectively retained in the upstream and intronic regions of genes of specific functional classes. PLoS Comput Biol 2009; 5:e1000610. [PMID: 20019790 PMCID: PMC2784220 DOI: 10.1371/journal.pcbi.1000610] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 06/01/2009] [Accepted: 11/13/2009] [Indexed: 11/20/2022] Open
Abstract
Alu and B1 repeats are mobile elements that originated in an initial duplication of the 7SL RNA gene prior to the primate-rodent split about 80 million years ago and currently account for a substantial fraction of the human and mouse genome, respectively. Following the primate-rodent split, Alu and B1 elements spread independently in each of the two genomes in a seemingly random manner, and, according to the prevailing hypothesis, negative selection shaped their final distribution in each genome by forcing the selective loss of certain Alu and B1 copies. In this paper, contrary to the prevailing hypothesis, we present evidence that Alu and B1 elements have been selectively retained in the upstream and intronic regions of genes belonging to specific functional classes. At the same time, we found no evidence for selective loss of these elements in any functional class. A subset of the functional links we discovered corresponds to functions where Alu involvement has actually been experimentally validated, whereas the majority of the functional links we report are novel. Finally, the unexpected finding that Alu and B1 elements show similar biases in their distribution across functional classes, despite having spread independently in their respective genomes, further supports our claim that the extant instances of Alu and B1 elements are the result of positive selection.
Despite their fundamental role in cell regulation, genes account for less than 1% of the human genome. Recent studies have shown that non-genic regions of our DNA may also play an important functional role in human cells. In this paper, we study Alu and B1 elements, a specific class of such non-genic elements that account for ∼10% of the human genome and ∼7% of the mouse genome respectively. We show that, contrary to the prevailing hypothesis, Alu and B1 elements have been preferentially retained in the proximity of genes that perform specific functions in the cell. In contrast, we found no evidence for selective loss of these elements in any functional class. Several of the functional classes that we have linked to Alu and B1 elements are central to the proper working of the cell, and their disruption has previously been shown to lead to the onset of disease. Interestingly, the DNA sequences of Alu and B1 elements differ substantially between human and mouse, thus hinting at the existence of a potentially large number of non-conserved regulatory elements.
|
82
|
Meireles-Filho ACA, Stark A. Comparative genomics of gene regulation - conservation and divergence of cis-regulatory information. Curr Opin Genet Dev 2009; 19:565-70. [PMID: 19913403 DOI: 10.1016/j.gde.2009.10.006] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Received: 08/24/2009] [Revised: 10/06/2009] [Accepted: 10/06/2009] [Indexed: 01/13/2023]
Abstract
We recently witnessed a tremendous increase in genomics studies on gene regulation and in entirely sequenced genomes from closely related species. This has triggered analyses that suggest a wide range of evolutionary dynamics of gene regulation, from rapid turnover of transcription-factor binding sites to conservation of enhancer function across large evolutionary distances. Many examples show that enhancers can evolve beyond recognizable sequence similarity while retaining function. However, bioinformatics approaches are increasingly able to detect conserved regulatory elements through characteristic evolutionary sequence signatures. Cis-regulatory changes are also a major source of morphological evolution, which might be facilitated by many biochemically functional elements that are selectively neutral and by the buffering function of redundant enhancers and 'shadow' enhancers.
|
83
|
Abstract
In contrast to changes in protein-coding sequences, the significance of noncoding DNA variation in human disease has been minimally explored. A recent torrent of genome-wide association studies suggests that noncoding variation represents a significant risk factor for common disorders, but the mechanisms by which they contribute to disease remain largely obscure. Distant-acting transcriptional enhancers - a major category of functional noncoding DNA - are likely involved in many developmental and disease-relevant processes. Genome-wide approaches for their discovery and functional characterization are now available and provide a growing knowledgebase for the systematic exploration of their role in human biology and disease susceptibility.
Affiliation(s)
- Axel Visel
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
|
84
|
Elgar G. Pan-vertebrate conserved non-coding sequences associated with developmental regulation. Briefings in Functional Genomics and Proteomics 2009; 8:256-65. [DOI: 10.1093/bfgp/elp033] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
|
85
|
Parker SCJ, Hansen L, Abaan HO, Tullius TD, Margulies EH. Local DNA topography correlates with functional noncoding regions of the human genome. Science 2009; 324:389-92. [PMID: 19286520 DOI: 10.1126/science.1169050] [Citation(s) in RCA: 169] [Impact Index Per Article: 11.3] [Indexed: 01/19/2023]
Abstract
The three-dimensional molecular structure of DNA, specifically the shape of the backbone and grooves of genomic DNA, can be dramatically affected by nucleotide changes, which can cause differences in protein-binding affinity and phenotype. We developed an algorithm to measure constraint on the basis of similarity of DNA topography among multiple species, using hydroxyl radical cleavage patterns to interrogate the solvent-accessible surface area of DNA. This algorithm found that 12% of bases in the human genome are evolutionarily constrained, double the number detected by nucleotide sequence-based algorithms. Topography-informed constrained regions correlated with functional noncoding elements, including enhancers, better than did regions identified solely on the basis of nucleotide sequence. These results support the idea that the molecular shape of DNA is under selection and can identify evolutionary history.
|
86
|
ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 2009; 457:854-8. [PMID: 19212405 DOI: 10.1038/nature07730] [Citation(s) in RCA: 1274] [Impact Index Per Article: 84.9] [Received: 10/16/2008] [Accepted: 12/18/2008] [Indexed: 12/22/2022]
Abstract
A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover because they are scattered among the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here we present the results of chromatin immunoprecipitation with the enhancer-associated protein p300 followed by massively parallel sequencing, and map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases demonstrated reproducible enhancer activity in the tissues that were predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities, and suggest that such data sets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.
|
87
|
Burzynski G, Shepherd IT, Enomoto H. Genetic model system studies of the development of the enteric nervous system, gut motility and Hirschsprung's disease. Neurogastroenterol Motil 2009; 21:113-27. [PMID: 19215589 PMCID: PMC4041618 DOI: 10.1111/j.1365-2982.2008.01256.x] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 12/13/2022]
Abstract
The enteric nervous system (ENS) is the largest and most complicated subdivision of the peripheral nervous system. Its action is necessary to regulate many of the functions of the gastrointestinal tract, including its motility. Whilst the ENS has been studied extensively by developmental biologists, neuroscientists and physiologists for several decades, it is only since the early 1990s that the molecular and genetic basis of ENS development has begun to emerge. Central to this understanding has been the use of genetic model organisms. In this article, we will discuss recent advances that have been achieved using both mouse and zebrafish model genetic systems that have led to new insights into ENS development and the genetic basis of Hirschsprung's disease.
Affiliation(s)
- G Burzynski
- Department of Biology, Emory University, Atlanta, GA 30322, USA
|
88
|
McGaughey DM, Stine ZE, Huynh JL, Vinton RM, McCallion AS. Asymmetrical distribution of non-conserved regulatory sequences at PHOX2B is reflected at the ENCODE loci and illuminates a possible genome-wide trend. BMC Genomics 2009; 10:8. [PMID: 19128492 PMCID: PMC2630312 DOI: 10.1186/1471-2164-10-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 07/08/2008] [Accepted: 01/07/2009] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: Transcriptional regulatory elements are central to development and interspecific phenotypic variation. Current regulatory element prediction tools rely heavily upon conservation for prediction of putative elements. Recent in vitro observations from the ENCODE project combined with in vivo analyses at the zebrafish phox2b locus suggest that a significant fraction of regulatory elements may fall below commonly applied metrics of conservation. We propose to explore these observations in vivo at the human PHOX2B locus, and also evaluate the potential evidence for genome-wide applicability of these observations through a novel analysis of extant data.
RESULTS: Transposon-based transgenic analysis utilizing a tiling path proximal to human PHOX2B in zebrafish recapitulates the observations at the zebrafish phox2b locus of both conserved and non-conserved regulatory elements. Analysis of human sequences conserved with previously identified zebrafish phox2b regulatory elements demonstrates that the orthologous sequences exhibit overlapping regulatory control. Additionally, analysis of non-conserved sequences scattered over 135 kb 5' to PHOX2B provides evidence of non-conserved regulatory elements positively biased toward close proximity to the gene. Furthermore, we provide a novel analysis of data from the ENCODE project, finding a non-uniform distribution of regulatory elements consistent with our in vivo observations at PHOX2B. These observations remain largely unchanged when one accounts for the sequence repeat content of the assayed intervals, when the intervals are sub-classified by biological role (developmental versus non-developmental), or by gene density (gene desert versus non-gene desert).
CONCLUSION: While regulatory elements frequently display evidence of evolutionary conservation, a fraction appears to be undetected by current metrics of conservation. In vivo observations at the PHOX2B locus, supported by our analyses of in vitro data from the ENCODE project, suggest that the risk of excluding non-conserved sequences in a search for regulatory elements may decrease as distance from the gene increases. Our data combined with the ENCODE data suggest that this may represent a genome-wide trend.
Affiliation(s)
- David M McGaughey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, BRB Suite 449, Baltimore, MD 21205, USA
|
89
|
Kuntz SG, Schwarz EM, DeModena JA, De Buysscher T, Trout D, Shizuya H, Sternberg PW, Wold BJ. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements. Genome Res 2008; 18:1955-68. [PMID: 18981268 DOI: 10.1101/gr.085472.108] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 01/31/2023]
Abstract
To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced approximately 0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide.
Affiliation(s)
- Steven G Kuntz
- Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
|
90
|
Abstract
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.
|
91
|
Pashos EE, Kague E, Fisher S. Evaluation of cis-regulatory function in zebrafish. Briefings in Functional Genomics and Proteomics 2008; 7:465-73. [PMID: 18820318 DOI: 10.1093/bfgp/eln045] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
As increasing numbers of vertebrate genomes are sequenced, comparative genomics offers tremendous promise to unveil mechanisms of transcriptional gene regulation on a large scale. However, the challenge of analysing immense amounts of sequence data and relating primary sequence to function is daunting. Several teleost species occupy crucial niches in the world of comparative genomics, as experimental model organisms of wide utility and living roadmaps of molecular evolution. Extant species have evolved after a teleost-specific genome duplication, and offer the opportunity to examine the evolution of thousands of duplicate gene pairs. Transgenesis in zebrafish is being increasingly employed to functionally examine non-coding sequences, from fish and mammals. Here, we discuss current approaches to the study of gene regulation in teleosts, and the promise of future research.
|
92
|
Antonellis A, Huynh JL, Lee-Lin SQ, Vinton RM, Renaud G, Loftus SK, Elliot G, Wolfsberg TG, Green ED, McCallion AS, Pavan WJ. Identification of neural crest and glial enhancers at the mouse Sox10 locus through transgenesis in zebrafish. PLoS Genet 2008; 4:e1000174. [PMID: 18773071 PMCID: PMC2518861 DOI: 10.1371/journal.pgen.1000174] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Received: 05/21/2008] [Accepted: 07/17/2008] [Indexed: 11/18/2022] Open
Abstract
Sox10 is a dynamically regulated transcription factor gene that is essential for the development of neural crest-derived and oligodendroglial populations. Developmental genes often require multiple regulatory sequences that integrate discrete and overlapping functions to coordinate their expression. To identify Sox10 cis-regulatory elements, we integrated multiple model systems, including cell-based screens and transposon-mediated transgenesis in zebrafish, to scrutinize mammalian conserved, noncoding genomic segments at the mouse Sox10 locus. We demonstrate that eight of 11 Sox10 genomic elements direct reporter gene expression in transgenic zebrafish similar to patterns observed in transgenic mice, despite an absence of observable sequence conservation between mice and zebrafish. Multiple segments direct expression in overlapping populations of neural crest derivatives and glial cells, ranging from pan-Sox10 and pan-neural crest regulatory control to the modulation of expression in subpopulations of Sox10-expressing cells, including developing melanocytes and Schwann cells. Several sequences demonstrate overlapping spatial control, yet direct expression in incompletely overlapping developmental intervals. We were able to partially explain neural crest expression patterns by the presence of head-to-head SoxE family binding sites within two of the elements. Moreover, we were able to use this transcription factor binding site signature to identify the corresponding zebrafish enhancers in the absence of overall sequence homology. We demonstrate the utility of zebrafish transgenesis as a high-fidelity surrogate in the dissection of mammalian gene regulation, especially for genes with dynamically controlled developmental expression.
Affiliation(s)
- Anthony Antonellis
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Jimmy L. Huynh
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Shih-Queen Lee-Lin
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Ryan M. Vinton
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Gabriel Renaud
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Stacie K. Loftus
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Gene Elliot
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Tyra G. Wolfsberg
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Eric D. Green
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Andrew S. McCallion
- McKusick–Nathans Institute of Genetic Medicine, Department of Molecular and Comparative Pathobiology, The Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- William J. Pavan
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|
93
|
Bourque G, Leong B, Vega VB, Chen X, Lee YL, Srinivasan KG, Chew JL, Ruan Y, Wei CL, Ng HH, Liu ET. Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res 2008; 18:1752-62. [PMID: 18682548 DOI: 10.1101/gr.080663.108] [Citation(s) in RCA: 416] [Impact Index Per Article: 26.0] [Indexed: 12/22/2022]
Abstract
Identification of lineage-specific innovations in genomic control elements is critical for understanding transcriptional regulatory networks and phenotypic heterogeneity. We analyzed, from an evolutionary perspective, the binding regions of seven mammalian transcription factors (ESR1, TP53, MYC, RELA, POU5F1, SOX2, and CTCF) identified on a genome-wide scale by different chromatin immunoprecipitation approaches and found that only a minority of sites appear to be conserved at the sequence level. Instead, we uncovered a pervasive association with genomic repeats by showing that a large fraction of the bona fide binding sites for five of the seven transcription factors (ESR1, TP53, POU5F1, SOX2, and CTCF) are embedded in distinctive families of transposable elements. Using the age of the repeats, we established that these repeat-associated binding sites (RABS) have been associated with significant regulatory expansions throughout the mammalian phylogeny. We validated the functional significance of these RABS by showing that they are over-represented in proximity of regulated genes and that the binding motifs within these repeats have undergone evolutionary selection. Our results demonstrate that transcriptional regulatory networks are highly dynamic in eukaryotic genomes and that transposable elements play an important role in expanding the repertoire of binding sites.
Affiliation(s)
- Guillaume Bourque
- Computational and Mathematical Biology, Genome Institute of Singapore, Singapore 138672, Singapore.
|
94
|
Elgar G, Vavouri T. Tuning in to the signals: noncoding sequence conservation in vertebrate genomes. Trends Genet 2008; 24:344-52. [PMID: 18514361 DOI: 10.1016/j.tig.2008.04.005] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.1] [Received: 02/07/2008] [Revised: 04/14/2008] [Accepted: 04/14/2008] [Indexed: 01/25/2023]
|
95
|
Tsirigos A, Rigoutsos I. Human and mouse introns are linked to the same processes and functions through each genome's most frequent non-conserved motifs. Nucleic Acids Res 2008; 36:3484-93. [PMID: 18450818 PMCID: PMC2425492 DOI: 10.1093/nar/gkn155] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 01/03/2023] Open
Abstract
We identified the most frequent, variable-length DNA sequence motifs in the human and mouse genomes and sub-selected those with multiple recurrences in the intergenic and intronic regions and at least one additional exonic instance in the corresponding genome. We discovered that these motifs have virtually no overlap with intronic sequences that are conserved between human and mouse, and thus are genome-specific. Moreover, we found that these motifs span a substantial fraction of previously uncharacterized human and mouse intronic space. Surprisingly, we found that these genome-specific motifs are over-represented in the introns of genes belonging to the same biological processes and molecular functions in both the human and mouse genomes even though the underlying sequences are not conserved between the two genomes. In fact, the processes and functions that are linked to these genome-specific sequence-motifs are distinct from the processes and functions which are associated with intronic regions that are conserved between human and mouse. The findings show that intronic regions from different genomes are linked to the same processes and functions in the absence of underlying sequence conservation. We highlight the ramifications of this observation with a concrete example that involves the microsatellite instability gene MLH1.
Affiliation(s)
- Aristotelis Tsirigos
- Bioinformatics and Pattern Discovery Group, IBM Thomas J. Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA
|
96
|
Cooper GM, Brown CD. Qualifying the relationship between sequence conservation and molecular function. Genome Res 2008; 18:201-5. [PMID: 18245453 DOI: 10.1101/gr.7205808] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Indexed: 11/25/2022]
Abstract
Quantification of evolutionary constraints via sequence conservation can be leveraged to annotate genomic functional sequences. Recent efforts addressing the converse of this relationship have identified many sites in metazoan genomes with molecular function but without detectable conservation between related species. Here, we discuss explanations and implications for these results considering both practical and theoretical issues. In particular, phylogenetic scope influences the relationship between sequence conservation and function. Comparisons of distantly related species can detect constraint with high specificity due to the loss of conserved neutral sequence, but sensitivity is sacrificed as a result of functional changes related to lineage-specific biology. The strength of natural selection operating on functional sequence is also important. Mutations to functional sequences that result in small fitness effects are subject to weaker constraints. Therefore, particularly when comparing highly divergent species, functional sequences that are degenerate or biologically redundant will be prone to turnover, wherein functional sequences are replaced by effectively equivalent, but nonorthologous counterparts. Finally, considering the size and complexity of metazoan genomes and the fact that many nonconserved sequences are associated with sequence-degenerate, low-level molecular functions, we find it likely that there exist many biochemically functional sequences that are not under constraint. This hypothesis does not lead to the conclusion that huge amounts of vertebrate genomes are functionally important, but rather that such "functionality" represents molecular noise that has weak or no effect on organismal phenotypes.
Affiliation(s)
- Gregory M Cooper
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.
|
97
|
Abstract
The transcription of almost all developmental genes is driven by tissue- and time-specific regulatory elements. These transcriptional regulatory elements lie in the genomic DNA proximal to the gene, and hence are cis-regulatory (as opposed to trans-regulatory elements like transcription factor genes). Over the past three decades, a number of techniques have been applied to the problem of finding and characterizing these regulatory elements. In this chapter, I discuss some computational approaches that have been particularly useful in identifying developmental cis-regulatory regions, and provide a tutorial on how to apply these approaches to the study of chick development.
|