251
|
Cañestro C, Albalat R, Irimia M, Garcia-Fernàndez J. Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates. Semin Cell Dev Biol 2013; 24:83-94. [DOI: 10.1016/j.semcdb.2012.12.008] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.7] [Received: 12/12/2012] [Accepted: 12/25/2012] [Indexed: 02/06/2023]
|
252
|
Identification of two evolutionarily conserved 5' cis-elements involved in regulating spatiotemporal expression of Nolz-1 during mouse embryogenesis. PLoS One 2013; 8:e54485. [PMID: 23349903 PMCID: PMC3551757 DOI: 10.1371/journal.pone.0054485] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 04/30/2012] [Accepted: 12/12/2012] [Indexed: 01/13/2023] Open
Abstract
Proper development of vertebrate embryos depends not only on the crucial functions of key evolutionarily conserved transcriptional regulators, but also on the precise spatiotemporal expression of these transcriptional regulators. The mouse Nolz-1/Znf503/Zfp503 gene is a mammalian member of the conserved zinc-finger-containing NET family. The expression pattern of Nolz-1 in mouse embryos is highly correlated with that of its homologues in different species. To study the spatiotemporal regulation of Nolz-1, we first identified two evolutionarily conserved cis-elements, UREA and UREB, in the 5' upstream region of the mouse Nolz-1 locus. We then generated UREA-LacZ and UREB-LacZ transgenic reporter mice to characterize the putative enhancer activity of UREA and UREB. The results indicated that both UREA and UREB contained tissue-specific enhancer activity for directing LacZ expression in selective tissue organs during mouse embryogenesis. UREA directed LacZ expression preferentially in selective regions of the developing central nervous system, including the forebrain, hindbrain and spinal cord, whereas UREB directed LacZ expression mainly in other developing tissue organs such as the Nolz-1 expressing branchial arches and their derivatives, the apical ectodermal ridge of limb buds and the urogenital tissues. Both UREA and UREB directed strong LacZ expression in the lateral plate mesoderm where endogenous Nolz-1 was also expressed. Although the LacZ expression pattern did not fully recapitulate endogenous Nolz-1 expression and some mismatched expression patterns were observed, co-expression of LacZ and Nolz-1 did occur in many cells of selective tissue organs, such as in the ventrolateral cortex and ventral spinal cord of UREA-LacZ embryos, and the urogenital tubes of UREB-LacZ embryos.
Taken together, our study suggests that UREA and UREB may function as evolutionarily conserved cis-regulatory elements that coordinate with other cis-elements to regulate spatiotemporal expression of Nolz-1 in different tissue organs during mouse embryogenesis.
|
253
|
Understanding the Dynamics of Gene Regulatory Systems; Characterisation and Clinical Relevance of cis-Regulatory Polymorphisms. BIOLOGY 2013; 2:64-84. [PMID: 24832652 PMCID: PMC4009875 DOI: 10.3390/biology2010064] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/01/2012] [Revised: 12/21/2012] [Accepted: 01/04/2013] [Indexed: 12/02/2022]
Abstract
Modern genetic analysis has shown that most polymorphisms associated with human disease are non-coding. Much of the functional information contained in the non-coding genome consists of cis-regulatory sequences (CRSs) that are required to respond to signal transduction cues that direct cell specific gene expression. It has been hypothesised that many diseases may be due to polymorphisms within CRSs that alter their responses to signal transduction cues. However, identification of CRSs, and the effects of allelic variation on their ability to respond to signal transduction cues, is still at an early stage. In the current review we describe the use of comparative genomics and experimental techniques that allow for the identification of CRSs building on recent advances by the ENCODE consortium. In addition we describe techniques that allow for the analysis of the effects of allelic variation and epigenetic modification on CRS responses to signal transduction cues. Using specific examples we show that the interactions driving these elements are highly complex and the effects of disease associated polymorphisms often subtle. It is clear that gaining an understanding of the functions of CRSs, and how they are affected by SNPs and epigenetic modification, is essential to understanding the genetic basis of human disease and stratification whilst providing novel directions for the development of personalised medicine.
|
254
|
Taher L, Smith RP, Kim MJ, Ahituv N, Ovcharenko I. Sequence signatures extracted from proximal promoters can be used to predict distal enhancers. Genome Biol 2013; 14:R117. [PMID: 24156763 PMCID: PMC3983659 DOI: 10.1186/gb-2013-14-10-r117] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 06/05/2013] [Accepted: 10/24/2013] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Gene expression is controlled by proximal promoters and distal regulatory elements such as enhancers. While the activity of some promoters can be invariant across tissues, enhancers tend to be highly tissue-specific. RESULTS We compiled sets of tissue-specific promoters based on gene expression profiles of 79 human tissues and cell types. Putative transcription factor binding sites within each set of sequences were used to train a support vector machine classifier capable of distinguishing tissue-specific promoters from control sequences. We obtained reliable classifiers for 92% of the tissues, with an area under the receiver operating characteristic curve between 60% (for subthalamic nucleus promoters) and 98% (for heart promoters). We next used these classifiers to identify tissue-specific enhancers, scanning distal non-coding sequences in the loci of the 200 most highly and lowly expressed genes. Thirty percent of reliable classifiers produced consistent enhancer predictions, with significantly higher densities in the loci of the most highly expressed compared to lowly expressed genes. Liver enhancer predictions were assessed in vivo using the hydrodynamic tail vein injection assay. Fifty-eight percent of the predictions yielded significant enhancer activity in the mouse liver, whereas a control set of five sequences was completely negative. CONCLUSIONS We conclude that promoters of tissue-specific genes often contain unambiguous tissue-specific signatures that can be learned and used for the de novo prediction of enhancers.
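The area under the receiver operating characteristic curve quoted in this abstract can be read as the probability that a randomly chosen positive sequence receives a higher classifier score than a randomly chosen control. The sketch below illustrates that definition only; it is not the authors' pipeline, and the function name and the toy scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive scores higher; ties count as half a win."""
    if not scores_pos or not scores_neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, made up for illustration.
promoter_scores = [0.9, 0.8, 0.4]   # tissue-specific promoters (positives)
control_scores = [0.7, 0.3, 0.2]    # control sequences (negatives)
auc = roc_auc(promoter_scores, control_scores)
print(f"AUC = {auc:.3f}")
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect separation, which is how the reported range of 60% to 98% should be interpreted.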
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Rostock, 18057, Germany
- Robin P Smith
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94158, USA
- Mee J Kim
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94158, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, 94158, USA
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
|
255
|
Enhancer chip: detecting human copy number variations in regulatory elements. PLoS One 2012; 7:e52264. [PMID: 23284961 PMCID: PMC3527541 DOI: 10.1371/journal.pone.0052264] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/27/2012] [Accepted: 11/12/2012] [Indexed: 12/31/2022] Open
Abstract
Critical functional properties are embedded in the non-coding portion of the human genome. Recent successful studies have shown that variations in distant-acting gene enhancer sequences can contribute to disease. In fact, various disorders, such as thalassaemias, preaxial polydactyly or susceptibility to Hirschsprung’s disease, may be the result of rearrangements of enhancer elements. We have analyzed the distribution of enhancer loci in the genome and compared their localization to that of previously described copy-number variations (CNVs). These data suggest a negative selection of copy number variable enhancers. To identify CNVs covering enhancer elements, we have developed a simple and cost-effective test. Here we describe the gene selection, design strategy and experimental validation of a customized oligonucleotide Array-Based Comparative Genomic Hybridization (aCGH), designated Enhancer Chip. It has been designed to investigate CNVs, allowing the analysis of all the genome with a 300 Kb resolution and specific disease regions (telomeres, centromeres and selected disease loci) at a tenfold higher resolution. Moreover, this is the first aCGH able to test over 1,250 enhancers, in order to investigate their potential pathogenic role. Validation experiments have demonstrated that Enhancer Chip efficiently detects duplications and deletions covering enhancer loci, demonstrating that it is a powerful instrument to detect and characterize copy number variable enhancers.
|
256
|
Ariza-Cosano A, Visel A, Pennacchio LA, Fraser HB, Gómez-Skarmeta JL, Irimia M, Bessa J. Differences in enhancer activity in mouse and zebrafish reporter assays are often associated with changes in gene expression. BMC Genomics 2012; 13:713. [PMID: 23253453 PMCID: PMC3541358 DOI: 10.1186/1471-2164-13-713] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/21/2012] [Accepted: 12/14/2012] [Indexed: 01/18/2023] Open
Abstract
Background Phenotypic evolution in animals is thought to be driven in large part by differences in gene expression patterns, which can result from sequence changes in cis-regulatory elements (cis-changes) or from changes in the expression pattern or function of transcription factors (trans-changes). While isolated examples of trans-changes have been identified, the scale of their overall contribution to regulatory and phenotypic evolution remains unclear. Results Here, we attempt to examine the prevalence of trans-effects and their potential impact on gene expression patterns in vertebrate evolution by comparing the function of identical human tissue-specific enhancer sequences in two highly divergent vertebrate model systems, mouse and zebrafish. Among 47 human conserved non-coding elements (CNEs) tested in transgenic mouse embryos and in stable zebrafish lines, at least one species-specific expression domain was observed in the majority (83%) of cases, and 36% presented dramatically different expression patterns between the two species. Although some of these discrepancies may be due to the use of different transgenesis systems in mouse and zebrafish, in some instances we found an association between differences in enhancer activity and changes in the endogenous gene expression patterns between mouse and zebrafish, suggesting a potential role for trans-changes in the evolution of gene expression. Conclusions In total, our results: (i) serve as a cautionary tale for studies investigating the role of human enhancers in different model organisms, and (ii) suggest that changes in the trans environment may play a significant role in the evolution of gene expression in vertebrates.
Affiliation(s)
- Ana Ariza-Cosano
- Centro Andaluz de Biología del Desarrollo (CABD), CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Ctra. Utrera Km 1, Seville 41013, Spain
|
257
|
Dimitrieva S, Bucher P. UCNEbase--a database of ultraconserved non-coding elements and genomic regulatory blocks. Nucleic Acids Res 2012. [PMID: 23193254 PMCID: PMC3531063 DOI: 10.1093/nar/gks1092] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.3] [Indexed: 11/18/2022] Open
Abstract
UCNEbase (http://ccg.vital-it.ch/UCNEbase) is a free, web-accessible information resource on the evolution and genomic organization of ultra-conserved non-coding elements (UCNEs). It currently covers 4351 such elements in 18 different species. The majority of UCNEs are supposed to be transcriptional regulators of key developmental genes. As most of them occur as clusters near potential target genes, the database is organized along two hierarchical levels: individual UCNEs and ultra-conserved genomic regulatory blocks (UGRBs). UCNEbase introduces a coherent nomenclature for UCNEs reflecting their respective associations with likely target genes. Orthologous and paralogous UCNEs share components of their names and are systematically cross-linked. Detailed synteny maps between the human and other genomes are provided for all UGRBs. UCNEbase is managed by a relational database system and can be accessed by a variety of web-based query pages. As it relies on the UCSC genome browser as visualization platform, a large part of its data content is also available as browser viewable custom track files. UCNEbase is potentially useful to any computational, experimental or evolutionary biologist interested in conserved non-coding DNA elements in vertebrates.
Affiliation(s)
- Slavica Dimitrieva
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL) and Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
- *To whom correspondence should be addressed. Tel: +41 21 693 0956; Fax: +41 21 693 1850;
- Philipp Bucher
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL) and Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
- Correspondence may also be addressed to Slavica Dimitrieva. Tel: +41 21 693 0958; Fax: +41 21 693 1850;
|
258
|
Vrieze SI, Iacono WG, McGue M. Confluence of genes, environment, development, and behavior in a post Genome-Wide Association Study world. Dev Psychopathol 2012; 24:1195-214. [PMID: 23062291 PMCID: PMC3476066 DOI: 10.1017/s0954579412000648] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]
Abstract
This article serves to outline a research paradigm to investigate main effects and interactions of genes, environment, and development on behavior and psychiatric illness. We provide a historical context for candidate gene studies and genome-wide association studies, including benefits, limitations, and expected payoffs. Using substance use and abuse as our driving example, we then turn to the importance of etiological psychological theory in guiding genetic, environmental, and developmental research, as well as the utility of refined phenotypic measures, such as endophenotypes, in the pursuit of etiological understanding and focused tests of genetic and environmental associations. Phenotypic measurement has received considerable attention in the history of psychology and is informed by psychometrics, whereas the environment remains relatively poorly measured and is often confounded with genetic effects (i.e., gene-environment correlation). Genetically informed designs, which are no longer limited to twin and adoption studies thanks to ever-cheaper genotyping, are required to understand environmental influences. Finally, we outline the vast amount of individual difference in structural genomic variation, most of which remains to be leveraged in genetic association tests. Although the genetic data can be massive and burdensome (tens of millions of variants per person), we argue that improved understanding of genomic structure and function will provide investigators with new tools to test specific a priori hypotheses derived from etiological psychological theory, much like current candidate gene research but with less confusion and more payoff than candidate gene research has to date.
Affiliation(s)
- Scott I Vrieze
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA.
|
259
|
Goode DK, Elgar G. Capturing the regulatory interactions of eukaryote genomes. Brief Funct Genomics 2012; 12:142-60. [PMID: 23117864 DOI: 10.1093/bfgp/els041] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023] Open
Abstract
A key finding from early genomics research is the remarkable consistency in the number of protein-coding regions across diverse species. This has led many researchers to look to the cis-regulatory elements of genes as the fundamental influence behind evolving gene function and subsequent species diversification. Historically, since these elements are often located in vast intergenic and intronic regions of the genome, their identification has been recalcitrant. Now, with the deluge of whole-genome data from representatives of numerous eukaryotic lineages, various approaches have enabled us to begin to recognize features that characterize regulatory regions of the genome. Here we endeavour to collate these approaches in order to give an overview of the complexities involved in extrapolating regulatory signatures. The resource provided by the escalating richness of whole-genome datasets enables more sophisticated modelling of these regulatory signatures yet at the same time introduces increasing potential for noise. While we are only at the advent of making these discoveries, the next decade promises to be a very exciting and rewarding time for genome researchers.
Affiliation(s)
- Debbie K Goode
- Cambridge Institute for Medical Research, Department of Haematology, Addenbrooke's Hospital, Hills Road, Cambridge, UK
|
260
|
Becker TS, Rinkwitz S. Zebrafish as a genomics model for human neurological and polygenic disorders. Dev Neurobiol 2012; 72:415-28. [PMID: 21465670 DOI: 10.1002/dneu.20888] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Abstract
Whole exome sequencing and, to a lesser extent, genome-wide association studies, have provided unprecedented advances in identifying genes and candidate genomic regions involved in the development of human disease. Further progress will come from sequencing the entire genome of multiple patients and normal controls to evaluate overall mutational burden and disease risk. A major challenge will be the interpretation of the resulting data and distinguishing true pathogenic mutations from rare benign variants. While in model organisms such as the zebrafish, mutants are sought that disrupt the function of individual genes, human mutations that cause, or are associated with, the development of disease, are often not acting in a Mendelian fashion, are frequently of small effect size, are late onset, and may reside in noncoding parts of the genome. The zebrafish model is uniquely poised for understanding human coding and noncoding variants because of its sequenced genome, a large body of knowledge on gene expression and function, rapid generation time, and easy access to embryos. A critical advantage is the ease of zebrafish transgenesis, both for the testing of human regulatory DNA driving expression of fluorescent reporter proteins, and the expression of mutated disease-associated human proteins in specific neurons to rapidly model aspects of neurological disorders. The zebrafish affords progress both through its model genome and its rapidly developing transparent model vertebrate embryo.
Affiliation(s)
- Thomas S Becker
- Sydney Medical School, University of Sydney, Camperdown, Australia.
|
261
|
Xu W, Zhu Q, Wu Z, Guo H, Wu F, Mashausi DS, Zheng C, Li D. A Novel Evolutionarily Conserved Element Is a General Transcriptional Repressor of p21WAF1/CIP1. Cancer Res 2012; 72:6236-46. [DOI: 10.1158/0008-5472.can-12-1236] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
|
262
|
Artificial induction of Sox21 regulates sensory cell formation in the embryonic chicken inner ear. PLoS One 2012; 7:e46387. [PMID: 23071561 PMCID: PMC3468625 DOI: 10.1371/journal.pone.0046387] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 07/11/2012] [Accepted: 08/29/2012] [Indexed: 12/26/2022] Open
Abstract
During embryonic development, hair cells and support cells in the sensory epithelia of the inner ear derive from progenitors that express Sox2, a member of the SoxB1 family of transcription factors. Sox2 is essential for sensory specification, but high levels of Sox2 expression appear to inhibit hair cell differentiation, suggesting that factors regulating Sox2 activity could be critical for both processes. Antagonistic interactions between SoxB1 and SoxB2 factors are known to regulate cell differentiation in neural tissue, which led us to investigate the potential roles of the SoxB2 member Sox21 during chicken inner ear development. Sox21 is normally expressed by sensory progenitors within vestibular and auditory regions of the early embryonic chicken inner ear. At later stages, Sox21 is differentially expressed in the vestibular and auditory organs. Sox21 is restricted to the support cell layer of the auditory epithelium, while it is enriched in the hair cell layer of the vestibular organs. To test Sox21 function, we used two temporally distinct gain-of-function approaches. Sustained over-expression of Sox21 from early developmental stages prevented prosensory specification, and abolished the formation of both hair cells and support cells. However, later induction of Sox21 expression at the time of hair cell formation in organotypic cultures of vestibular epithelia inhibited endogenous Sox2 expression and Notch activity, and biased progenitor cells towards a hair cell fate. Interestingly, Sox21 did not promote hair cell differentiation in the immature auditory epithelium, which fits with the expression of endogenous Sox21 within mature support cells in this tissue. These results suggest that interactions among endogenous SoxB family transcription factors may regulate sensory cell formation in the inner ear, but in a context-dependent manner.
|
263
|
Hiller M, Schaar BT, Bejerano G. Hundreds of conserved non-coding genomic regions are independently lost in mammals. Nucleic Acids Res 2012; 40:11463-76. [PMID: 23042682 PMCID: PMC3526296 DOI: 10.1093/nar/gks905] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 01/10/2023] Open
Abstract
Conserved non-protein-coding DNA elements (CNEs) often encode cis-regulatory elements and are rarely lost during evolution. However, CNE losses that do occur can be associated with phenotypic changes, exemplified by pelvic spine loss in sticklebacks. Using a computational strategy to detect complete loss of CNEs in mammalian genomes while strictly controlling for artifacts, we find >600 CNEs that are independently lost in at least two mammalian lineages, including a spinal cord enhancer near GDF11. We observed several genomic regions where multiple independent CNE loss events happened; the most extreme is the DIAPH2 locus. We show that CNE losses often involve deletions and that CNE loss frequencies are non-uniform. Similar to less pleiotropic enhancers, we find that independently lost CNEs are shorter, slightly less constrained and evolutionarily younger than CNEs without detected losses. This suggests that independently lost CNEs are less pleiotropic and that pleiotropic constraints contribute to non-uniform CNE loss frequencies. We also detected 35 CNEs that are independently lost in the human lineage and in other mammals. Our study uncovers an interesting aspect of the evolution of functional DNA in mammalian genomes. Experiments are necessary to test if these independently lost CNEs are associated with parallel phenotype changes in mammals.
Affiliation(s)
- Michael Hiller
- Department of Developmental Biology, Stanford University, Stanford, California 94305, USA.
|
264
|
Functional analysis of HapMap SNPs. Gene 2012; 511:358-63. [PMID: 23041558 DOI: 10.1016/j.gene.2012.09.075] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/09/2012] [Revised: 08/07/2012] [Accepted: 09/13/2012] [Indexed: 11/20/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many genetic variants associated with complex diseases and traits. However, the functional consequences of the genetic variants studied in GWAS have not yet been fully investigated, which hinders the application of GWAS. We therefore performed a systematic functional analysis of HapMap SNPs, which have been most commonly used as the reference panel for GWAS. Our study highlights several characteristics of HapMap SNPs and identifies subsets of genetic variants with interesting functional implications. The results show that HapMap SNPs have good coverage within RefSeq genes, especially within known disease-related genes. On the other hand, only a small percentage of SNPs are non-synonymous SNPs, while many SNPs are actually located in gene deserts. Moreover, many functionally important variants are still not interrogated. A redesigned SNP reference panel with additional functionally important variants would be useful to identify disease-causal variants in future genome-wide studies.
|
265
|
Kritsas K, Wuest SE, Hupalo D, Kern AD, Wicker T, Grossniklaus U. Computational analysis and characterization of UCE-like elements (ULEs) in plant genomes. Genome Res 2012; 22:2455-66. [PMID: 22987666 PMCID: PMC3514675 DOI: 10.1101/gr.129346.111] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
Ultraconserved elements (UCEs), stretches of DNA that are identical between distantly related species, are enigmatic genomic features whose function is not well understood. First identified and characterized in mammals, UCEs have been proposed to play important roles in gene regulation, RNA processing, and maintaining genome integrity. However, because all of these functions can tolerate some sequence variation, their ultraconserved and ultraselected nature is not explained. We investigated whether there are highly conserved DNA elements without genic function in distantly related plant genomes. We compared the genomes of Arabidopsis thaliana and Vitis vinifera, species that diverged ∼115 million years ago (Mya). We identified 36 highly conserved elements with at least 85% similarity that are longer than 55 bp. Interestingly, these elements exhibit properties similar to mammalian UCEs, such that we named them UCE-like elements (ULEs). ULEs are located in intergenic or intronic regions and are depleted from segmental duplications. Like UCEs, ULEs are under strong purifying selection, suggesting a functional role for these elements. Like their mammalian counterparts, ULEs show a sharp drop of A+T content at their borders and are enriched close to genes encoding transcription factors and genes involved in development, the latter showing preferential expression in undifferentiated tissues. By comparing the genomes of Brachypodium distachyon and Oryza sativa, species that diverged ∼50 Mya, we identified a different set of ULEs with similar properties in monocots. The identification of ULEs in plant genomes offers new opportunities to study their possible roles in genome function, integrity, and regulation.
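The two thresholds stated in this abstract (at least 85% similarity, longer than 55 bp) can be expressed as a simple filter on a pairwise alignment. The sketch below is only an illustration under stated assumptions: it treats similarity as percent identity over alignment columns, counts gap columns as mismatches, and uses hypothetical function names and toy sequences. It does not reproduce the authors' genome comparison procedure.

```python
def percent_identity(aligned_a, aligned_b):
    """Fraction of alignment columns with identical, non-gap bases.
    Assumption: both inputs are rows of the same pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return matches / len(aligned_a)


def passes_ule_thresholds(aligned_a, aligned_b,
                          min_identity=0.85, min_length=55):
    """Apply the abstract's cutoffs: longer than 55 bp, at least 85% similar."""
    return (len(aligned_a) > min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


# Toy 60-column alignment with 6 mismatches (90% identity).
reference = "ACGT" * 15
variant = "N" * 6 + reference[6:]
print(passes_ule_thresholds(reference, variant))
```

A real screen would also have to exclude genic sequence and scan whole-genome alignments, which this filter does not attempt.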
Affiliation(s)
- Konstantinos Kritsas
- Institute of Plant Biology & Zürich-Basel Plant Science Center, University Zürich, CH-8008 Zürich, Switzerland
|
266
|
D'Hont A, Denoeud F, Aury JM, Baurens FC, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M, Da Silva C, Jabbari K, Cardi C, Poulain J, Souquet M, Labadie K, Jourda C, Lengellé J, Rodier-Goud M, Alberti A, Bernard M, Correa M, Ayyampalayam S, Mckain MR, Leebens-Mack J, Burgess D, Freeling M, Mbéguié-A-Mbéguié D, Chabannes M, Wicker T, Panaud O, Barbosa J, Hribova E, Heslop-Harrison P, Habas R, Rivallan R, Francois P, Poiron C, Kilian A, Burthia D, Jenny C, Bakry F, Brown S, Guignon V, Kema G, Dita M, Waalwijk C, Joseph S, Dievart A, Jaillon O, Leclercq J, Argout X, Lyons E, Almeida A, Jeridi M, Dolezel J, Roux N, Risterucci AM, Weissenbach J, Ruiz M, Glaszmann JC, Quétier F, Yahiaoui N, Wincker P. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012; 488:213-7. [PMID: 22801500 DOI: 10.1038/nature11241] [Citation(s) in RCA: 624] [Impact Index Per Article: 52.0] [Received: 02/10/2012] [Accepted: 05/18/2012] [Indexed: 01/17/2023]
Abstract
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
Affiliation(s)
- Angélique D'Hont
- Centre de coopération Internationale en Recherche Agronomique pour le Développement, UMR AGAP, F-34398 Montpellier, France. angelique.d’
|
267
|
Lowe CB, Haussler D. 29 mammalian genomes reveal novel exaptations of mobile elements for likely regulatory functions in the human genome. PLoS One 2012; 7:e43128. [PMID: 22952639 PMCID: PMC3428314 DOI: 10.1371/journal.pone.0043128] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2012] [Accepted: 07/17/2012] [Indexed: 11/18/2022] Open
Abstract
Recent research supports the view that changes in gene regulation, as opposed to changes in the genes themselves, play a significant role in morphological evolution. Gene regulation is largely dependent on transcription factor binding sites. Researchers are now able to use the available 29 mammalian genomes to measure selective constraint at the level of binding sites. This detailed map of constraint suggests that mammalian genomes co-opt fragments of mobile elements to act as gene regulatory sequence on a large scale. In the human genome we detect over 280,000 putative regulatory elements, totaling approximately 7 Mb of sequence, that originated as mobile element insertions. These putative regulatory regions are conserved non-exonic elements (CNEEs), which show considerable cross-species constraint and signatures of continued negative selection in humans, yet do not appear in a known mature transcript. These putative regulatory elements were co-opted from SINE, LINE, LTR and DNA transposon insertions. We demonstrate that at least 11%, and an estimated 20%, of gene regulatory sequence in the human genome showing cross-species conservation was co-opted from mobile elements. The location in the genome of CNEEs co-opted from mobile elements closely resembles that of CNEEs in general, except in the centers of the largest gene deserts where recognizable co-option events are relatively rare. We find that regions of certain mobile element insertions are more likely to be held under purifying selection than others. In particular, we show 6 examples where paralogous instances of an often co-opted mobile element region define a sequence motif that closely matches a transcription factor's binding profile.
Affiliation(s)
- Craig B. Lowe
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
| | - David Haussler
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, California, United States of America
| |
|
268
|
A SINE-derived element constitutes a unique modular enhancer for mammalian diencephalic Fgf8. PLoS One 2012; 7:e43785. [PMID: 22937095 PMCID: PMC3427154 DOI: 10.1371/journal.pone.0043785] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2012] [Accepted: 07/25/2012] [Indexed: 01/04/2023] Open
Abstract
Transposable elements, including short interspersed repetitive elements (SINEs), comprise nearly half the mammalian genome. Moreover, they are a major source of conserved non-coding elements (CNEs), which play important functional roles in regulating development-related genes, such as enhancing and silencing, serving for the diversification of morphological and physiological features among species. We previously reported a novel SINE family, AmnSINE1, as part of mammalian-specific CNEs. One AmnSINE1 locus, named AS071, showed an enhancer property in the developing mouse diencephalon. Indeed, AS071 appears to recapitulate the expression of diencephalic fibroblast growth factor 8 (Fgf8). Here we established three independent lines of AS071-transgenic mice and performed detailed expression profiling of AS071-enhanced lacZ in comparison with that of Fgf8 across embryonic stages. We demonstrate that AS071 is a distal enhancer that directs Fgf8 expression in the developing diencephalon. Furthermore, enhancer assays with constructs encoding partially deleted AS071 sequence revealed a unique modular organization in which AS071 contains at least three functionally distinct sub-elements that cooperatively direct the enhancer activity in three diencephalic domains, namely the dorsal midline and the lateral wall of the diencephalon, and the ventral midline of the hypothalamus. Interestingly, the AmnSINE1-derived sub-element was found to specify the enhancer activity to the ventral midline of the hypothalamus. To our knowledge, this is the first discovery of an enhancer element that could be separated into respective sub-elements that determine regional specificity and/or the core enhancing activity. These results potentiate our understanding of the evolution of retroposon-derived cis-regulatory elements as well as the basis for future studies of the molecular mechanism underlying the determination of domain-specificity of an enhancer.
|
269
|
Clarke SL, VanderMeer JE, Wenger AM, Schaar BT, Ahituv N, Bejerano G. Human developmental enhancers conserved between deuterostomes and protostomes. PLoS Genet 2012; 8:e1002852. [PMID: 22876195 PMCID: PMC3410860 DOI: 10.1371/journal.pgen.1002852] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2012] [Accepted: 06/07/2012] [Indexed: 01/10/2023] Open
Abstract
The identification of homologies, whether morphological, molecular, or genetic, is fundamental to our understanding of common biological principles. Homologies bridging the great divide between deuterostomes and protostomes have served as the basis for current models of animal evolution and development. It is now appreciated that these two clades share a common developmental toolkit consisting of conserved transcription factors and signaling pathways. These patterning genes sometimes show common expression patterns and genetic interactions, suggesting the existence of similar or even conserved regulatory apparatus. However, previous studies have found no regulatory sequence conserved between deuterostomes and protostomes. Here we describe the first such enhancers, which we call bilaterian conserved regulatory elements (Bicores). Bicores show conservation of sequence and gene synteny. Sequence conservation of Bicores reflects conserved patterns of transcription factor binding sites. We predict that Bicores act as response elements to signaling pathways, and we show that Bicores are developmental enhancers that drive expression of transcriptional repressors in the vertebrate central nervous system. Although the small number of identified Bicores suggests extensive rewiring of cis-regulation between the protostome and deuterostome clades, additional Bicores may be revealed as our understanding of cis-regulatory logic and sample of bilaterian genomes continue to grow. Flies and worms have long served as valuable model organisms for the study of human development and health. Despite the great morphological and evolutionary distance between them, humans, flies, and worms share many commonalities. Each develops from three major germ layers and is patterned along the two major spatial axes. 
At the molecular level, development in these widely diverged species is often controlled by the same signaling pathways activating members of the same transcription factor and target gene families, shared since the common ancestor of humans, flies, and worms. And yet, at the gene regulatory level, humans and flies or worms seem starkly different, with not a single regulatory region shared across the phyla. Here we discover the first two examples of developmental enhancers conserved between deuterostomes (ranging from human to sea urchins) and protostomes (a large clade that includes flies and worms). We show evidence that these ancient regulatory loci retain the capacity to respond to the same signaling pathways in these widely diverged organisms, and we show that they have been co-opted, along with the molecular pathways that control them, to pattern the vertebrate nervous systems. Our screen supports large scale regulatory rewiring, while offering the first intriguing outliers.
Affiliation(s)
- Shoa L Clarke
- Department of Genetics, Stanford University, Stanford, California, United States of America
|
270
|
Jahangiri L, Nelson AC, Wardle FC. A cis-regulatory module upstream of deltaC regulated by Ntla and Tbx16 drives expression in the tailbud, presomitic mesoderm and somites. Dev Biol 2012; 371:110-20. [PMID: 22877946 PMCID: PMC3460241 DOI: 10.1016/j.ydbio.2012.07.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2012] [Revised: 06/29/2012] [Accepted: 07/04/2012] [Indexed: 12/15/2022]
Abstract
Somites form by an iterative process from unsegmented, presomitic mesoderm (PSM). Notch pathway components, such as deltaC (dlc), have been shown to play a role in this process, while the T-box transcription factors Ntla and Tbx16 regulate somite formation upstream of this by controlling supply and movement of cells into the PSM during gastrulation and tailbud outgrowth. In this work, we report that Ntla and Tbx16 play a more explicit role in segmentation by directly regulating dlc expression. In addition, we describe a cis-regulatory module (CRM) upstream of dlc that drives expression of a reporter in the tailbud, PSM and somites during somitogenesis. This CRM is bound by both Ntla and Tbx16 at a cluster of T-box binding sites, which are required in combination for activation of the CRM.
Affiliation(s)
- Leila Jahangiri
- Department of Physiology, Development and Neuroscience, Cambridge University, Downing Street, Cambridge, CB2 3DY, UK.
|
271
|
Multiple enhancers associated with ACAN suggest highly redundant transcriptional regulation in cartilage. Matrix Biol 2012; 31:328-37. [PMID: 22820679 DOI: 10.1016/j.matbio.2012.06.001] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2012] [Revised: 06/08/2012] [Accepted: 06/29/2012] [Indexed: 12/22/2022]
Abstract
The chondroitin sulfate proteoglycan core protein aggrecan is the major protein constituent of cartilage aside from collagen, and is largely responsible for its distinctive mechanical properties. Aggrecan is required both for proper cartilage formation in development and maintenance of mature cartilage. Prominent ACAN transcription is a conserved feature of vertebrate cartilage, although little is known about its specific transcriptional regulation. We examined the genomic interval containing human ACAN for transcriptional enhancers directing expression to cartilage, using a functional assay in transgenic zebrafish. We tested 24 conserved non-coding sequences, representing ~6% of the total sequence in the interval, and identified eleven independently capable of regulating reporter gene expression in cartilage. These enhancers were widely spaced, from >100kb upstream of the gene to within the first intron. While the majority displayed broad cartilage expression in zebrafish larvae, several were restricted to a subset of cartilage cells in the craniofacial skeleton. In older fish, the enhancers displayed differential activity; some maintained expression, either in all cartilage or preferentially in articular cartilage at the joints, while others were not active. This remarkable degree of overlapping regulatory control has been highly conserved; we identified clear orthologues of six enhancers at the chicken ACAN locus, arranged in the same order relative to the gene. These were also functional in directing expression to cartilage in transgenic zebrafish. Several enhancers contain potential binding sites for Sox9, consistent with its described role as an upstream regulator of ACAN expression. However, others lacked Sox9 consensus binding sites, implicating additional pathways and transcription factors as regulators of ACAN expression in cartilage, either in development or adult tissue. 
Our identification of these enhancer sequences is the necessary first step in detailed examination of the upstream regulators of ACAN expression.
|
272
|
Vergara MN, Canto-Soler MV. Rediscovering the chick embryo as a model to study retinal development. Neural Dev 2012; 7:22. [PMID: 22738172 PMCID: PMC3541172 DOI: 10.1186/1749-8104-7-22] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2012] [Accepted: 05/22/2012] [Indexed: 01/20/2023] Open
Abstract
The embryonic chick occupies a privileged place among animal models used in developmental studies. Its rapid development and accessibility for visualization and experimental manipulation are just some of the characteristics that have made it a vertebrate model of choice for more than two millennia. Until a few years ago, the inability to perform genetic manipulations constituted a major drawback of this system. However, the completion of the chicken genome project and the development of techniques to manipulate gene expression have allowed this classic animal model to enter the molecular age. Such techniques, combined with the embryological manipulations that this system is well known for, provide a unique toolkit to study the genetic basis of neural development. A major advantage of these approaches is that they permit targeted gene misexpression with extremely high spatiotemporal resolution and over a large range of developmental stages, allowing functional analysis at a level, speed and ease that is difficult to achieve in other systems. This article provides a general overview of the chick as a developmental model focusing more specifically on its application to the study of eye development. Special emphasis is given to the state of the art of the techniques that have made gene gain- and loss-of-function studies in this model a reality. In addition, we discuss some methodological considerations derived from our own experience that we believe will be beneficial to researchers working with this system.
Affiliation(s)
- M Natalia Vergara
- Wilmer Eye Institute, The Johns Hopkins University School of Medicine, Smith Building 3023, 400 N Broadway, Baltimore, MD 21287-9257, USA
| | - M Valeria Canto-Soler
- Wilmer Eye Institute, The Johns Hopkins University School of Medicine, Smith Building 3023, 400 N Broadway, Baltimore, MD 21287-9257, USA
| |
|
273
|
Quina LA, Kuramoto T, Luquetti DV, Cox TC, Serikawa T, Turner EE. Deletion of a conserved regulatory element required for Hmx1 expression in craniofacial mesenchyme in the dumbo rat: a newly identified cause of congenital ear malformation. Dis Model Mech 2012; 5:812-22. [PMID: 22736458 PMCID: PMC3484864 DOI: 10.1242/dmm.009910] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Hmx1 is a homeodomain transcription factor expressed in the developing eye, peripheral ganglia, and branchial arches of avian and mammalian embryos. Recent studies have identified a loss-of-function allele at the HMX1 locus as the causative mutation in the oculo-auricular syndrome (OAS) in humans, characterized by ear and eye malformations. The mouse dumbo (dmbo) mutation, with similar effects on ear and eye development, also results from a loss-of-function mutation in the Hmx1 gene. A recessive dmbo mutation causing ear malformation in rats has been mapped to the chromosomal region containing the Hmx1 gene, but the nature of the causative allele is unknown. Here we show that dumbo rats and mice exhibit similar neonatal ear and eye phenotypes. In midgestation embryos, dumbo rats show a specific loss of Hmx1 expression in neural-crest-derived craniofacial mesenchyme (CM), whereas Hmx1 is expressed normally in retinal progenitors, sensory ganglia and in CM, which is derived from mesoderm. High-throughput resequencing of 1 Mb of rat chromosome 14 from dmbo/dmbo rats, encompassing the Hmx1 locus, reveals numerous divergences from the rat genomic reference sequence, but no coding changes in Hmx1. Fine genetic mapping narrows the dmbo critical region to an interval of ∼410 kb immediately downstream of the Hmx1 transcription unit. Further sequence analysis of this region reveals a 5777-bp deletion located ∼80 kb downstream in dmbo/dmbo rats that is not apparent in 137 other rat strains. The dmbo deletion region contains a highly conserved domain of ∼500 bp, which is a candidate distal enhancer and which exhibits a similar relationship to Hmx genes in all vertebrate species for which data are available. We conclude that the rat dumbo phenotype is likely to result from loss of function of an ultraconserved enhancer specifically regulating Hmx1 expression in neural-crest-derived CM. 
Dysregulation of Hmx1 expression is thus a candidate mechanism for congenital ear malformation, most cases of which remain unexplained.
Affiliation(s)
- Lely A Quina
- Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, WA 98101, USA
|
274
|
Liu Y, Nandi S, Martel A, Antoun A, Ioshikhes I, Blais A. Discovery, optimization and validation of an optimal DNA-binding sequence for the Six1 homeodomain transcription factor. Nucleic Acids Res 2012; 40:8227-39. [PMID: 22730291 PMCID: PMC3458543 DOI: 10.1093/nar/gks587] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open
Abstract
The Six1 transcription factor is a homeodomain protein involved in controlling gene expression during embryonic development. Six1 establishes gene expression profiles that enable skeletal myogenesis and nephrogenesis, among others. While several homeodomain factors have been extensively characterized with regards to their DNA-binding properties, relatively little is known of the properties of Six1. We have used the genomic binding profile of Six1 during the myogenic differentiation of myoblasts to obtain a better understanding of its preferences for recognizing certain DNA sequences. DNA sequence analyses on our genomic binding dataset, combined with biochemical characterization using binding assays, reveal that Six1 has a much broader DNA-binding sequence spectrum than had been previously determined. Moreover, using a position weight matrix optimization algorithm, we generated a highly sensitive and specific matrix that can be used to predict novel Six1-binding sites with highest accuracy. Furthermore, our results support the idea of a mode of DNA recognition by this factor where Six1 itself is sufficient for sequence discrimination, and where Six1 domains outside of its homeodomain contribute to binding site selection. Together, our results provide new light on the properties of this important transcription factor, and will enable more accurate modeling of Six1 function in bioinformatic studies.
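As a rough illustration of what scoring candidate sites with a position weight matrix involves, the sketch below builds a log-odds matrix from a few aligned sites and scans a sequence for the best-scoring window. The example sites, the pseudocount, the uniform background and all function names are hypothetical placeholders; this is not Six1 binding data and not the matrix optimization algorithm described in the abstract.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix (one dict per motif position) from aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds for a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    starts = range(len(sequence) - width + 1)
    start = max(starts, key=lambda i: score(pwm, sequence[i:i + width]))
    return start, score(pwm, sequence[start:start + width])

# Hypothetical aligned sites, chosen only to make the example concrete.
toy_sites = ["TCAGGT", "TCAGGA", "TCAGGT", "TAAGGT"]
toy_pwm = build_pwm(toy_sites)
hit_start, hit_score = best_hit(toy_pwm, "GGGTCAGGTCCC")
```

In practice a score threshold would be calibrated against known bound and unbound regions; the sketch only shows the scoring step itself.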
Affiliation(s)
- Yubing Liu
- Ottawa Institute of Systems Biology and Biochemistry, Microbiology and Immunology Department, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
|
275
|
Irimia M, Tena JJ, Alexis MS, Fernandez-Miñan A, Maeso I, Bogdanovic O, de la Calle-Mustienes E, Roy SW, Gómez-Skarmeta JL, Fraser HB. Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints. Genome Res 2012; 22:2356-67. [PMID: 22722344 PMCID: PMC3514665 DOI: 10.1101/gr.139725.112] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought—involving ∼12% of the ancestral bilaterian genome—and that cis-regulatory constraints are crucial in determining metazoan genome architecture.
Affiliation(s)
- Manuel Irimia
- Department of Biology, Stanford University, Stanford, California 94305, USA
|
276
|
Ultraconserved elements in the human genome: association and transmission analyses of highly constrained single-nucleotide polymorphisms. Genetics 2012; 192:253-66. [PMID: 22714408 DOI: 10.1534/genetics.112.141945] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
Ultraconserved elements in the human genome likely harbor important biological functions as they are dosage sensitive and are able to direct tissue-specific expression. Because they are under purifying selection, variants in these elements may have a lower frequency in the population but a higher likelihood of association with complex traits. We tested a set of highly constrained SNPs (hcSNPs) distributed genome-wide among ultraconserved and nearly ultraconserved elements for association with seven traits related to reproductive (age at natural menopause, number of children, age at first child, and age at last child) and overall [longevity, body mass index (BMI), and height] fitness. Using up to 24,047 European-American samples from the National Heart, Lung, and Blood Institute Candidate Gene Association Resource (CARe), we observed an excess of associations with BMI and height. In an independent replication panel the most strongly associated SNPs showed an 8.4-fold enrichment of associations at the nominal level, including three variants in previously identified loci and one in a locus (DENND1A) previously shown to be associated with polycystic ovary syndrome. Finally, using 1430 family trios, we showed that the transmissions from heterozygous parents to offspring of the derived alleles of rare (frequency ≤ 0.5%) hcSNPs are not biased, particularly after adjusting for the rates of genotype missingness and error in the data. The lack of transmission bias ruled out an immediately and strongly deleterious effect due to the rare derived alleles, consistent with the observation that mice homozygous for the deletion of ultraconserved elements showed no overt phenotype. Our study also illustrated the importance of carefully modeling potential technical confounders when analyzing genotype data of rare variants.
|
277
|
Abstract
Differential gene expression is the fundamental mechanism underlying animal development and cell differentiation. However, it is a challenge to identify comprehensively and accurately the DNA sequences that are required to regulate gene expression: namely, cis-regulatory modules (CRMs). Three major features, either singly or in combination, are used to predict CRMs: clusters of transcription factor binding site motifs, non-coding DNA that is under evolutionary constraint and biochemical marks associated with CRMs, such as histone modifications and protein occupancy. The validation rates for predictions indicate that identifying diagnostic biochemical marks is the most reliable method, and understanding is enhanced by the analysis of motifs and conservation patterns within those predicted CRMs.
|
278
|
Shimeld SM, Donoghue PCJ. Evolutionary crossroads in developmental biology: cyclostomes (lamprey and hagfish). Development 2012; 139:2091-9. [DOI: 10.1242/dev.074716] [Citation(s) in RCA: 113] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
Abstract
Lampreys and hagfish, which together are known as the cyclostomes or ‘agnathans’, are the only surviving lineages of jawless fish. They diverged early in vertebrate evolution, before the origin of the hinged jaws that are characteristic of gnathostome (jawed) vertebrates and before the evolution of paired appendages. However, they do share numerous characteristics with jawed vertebrates. Studies of cyclostome development can thus help us to understand when, and how, key aspects of the vertebrate body evolved. Here, we summarise the development of cyclostomes, highlighting the key species studied and experimental methods available. We then discuss how studies of cyclostomes have provided important insight into the evolution of fins, jaws, skeleton and neural crest.
Affiliation(s)
- Sebastian M. Shimeld
- Department of Zoology, University of Oxford, The Tinbergen Building, South Parks Road, Oxford OX1 3PS, UK
| | - Phillip C. J. Donoghue
- School of Earth Sciences, University of Bristol, Wills Memorial Building, Queens Road, Bristol BS8 1RJ, UK
| |
|
279
|
Birnbaum RY, Clowney EJ, Agamy O, Kim MJ, Zhao J, Yamanaka T, Pappalardo Z, Clarke SL, Wenger AM, Nguyen L, Gurrieri F, Everman DB, Schwartz CE, Birk OS, Bejerano G, Lomvardas S, Ahituv N. Coding exons function as tissue-specific enhancers of nearby genes. Genome Res 2012; 22:1059-68. [PMID: 22442009 PMCID: PMC3371700 DOI: 10.1101/gr.133546.111] [Citation(s) in RCA: 158] [Impact Index Per Article: 13.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2011] [Accepted: 03/19/2012] [Indexed: 01/17/2023]
Abstract
Enhancers are essential gene regulatory elements whose alteration can lead to morphological differences between species, developmental abnormalities, and human disease. Current strategies to identify enhancers focus primarily on noncoding sequences and tend to exclude protein coding sequences. Here, we analyzed 25 available ChIP-seq data sets that identify enhancers in an unbiased manner (H3K4me1, H3K27ac, and EP300) for peaks that overlap exons. We find that, on average, 7% of all ChIP-seq peaks overlap coding exons (after excluding for peaks that overlap with first exons). By using mouse and zebrafish enhancer assays, we demonstrate that several of these exonic enhancer (eExons) candidates can function as enhancers of their neighboring genes and that the exonic sequence is necessary for enhancer activity. Using ChIP, 3C, and DNA FISH, we further show that one of these exonic limb enhancers, Dync1i1 exon 15, has active enhancer marks and physically interacts with Dlx5/6 promoter regions 900 kb away. In addition, its removal by chromosomal abnormalities in humans could cause split hand and foot malformation 1 (SHFM1), a disorder associated with DLX5/6. These results demonstrate that DNA sequences can have a dual function, operating as coding exons in one tissue and enhancers of nearby gene(s) in another tissue, suggesting that phenotypes resulting from coding mutations could be caused not only by protein alteration but also by disrupting the regulation of another gene.
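The "fraction of peaks that overlap coding exons" figure quoted above comes down to an interval-overlap count. A minimal sketch of that calculation follows, assuming BED-style half-open intervals written as `(chrom, start, end)` tuples; the coordinates and function names are made up for illustration and are not taken from the study.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_overlapping(peaks, exons):
    """Fraction of peaks that overlap at least one exon (0.0 for an empty peak list)."""
    if not peaks:
        return 0.0
    hits = sum(1 for peak in peaks if any(overlaps(peak, exon) for exon in exons))
    return hits / len(peaks)

# Hypothetical coordinates, for illustration only.
toy_peaks = [
    ("chr1", 100, 200),   # overlaps the exon below
    ("chr1", 500, 600),   # same chromosome, no overlap
    ("chr2", 100, 200),   # different chromosome
    ("chr1", 200, 300),   # overlaps the exon below
]
toy_exons = [("chr1", 150, 250)]
toy_fraction = fraction_overlapping(toy_peaks, toy_exons)
```

The nested loop is quadratic and only suitable for small inputs; genome-scale peak sets would normally be sorted and swept, or handled with an interval tree.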
Affiliation(s)
- Ramon Y. Birnbaum
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| | - E. Josephine Clowney
- Department of Anatomy
- Program in Biomedical Sciences, University of California, San Francisco, California 94143, USA
| | - Orly Agamy
- The Morris Kahn Laboratory of Human Genetics, NIBN, Ben-Gurion University, Beer-Sheva 84105, Israel
| | - Mee J. Kim
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| | - Jingjing Zhao
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
- Key Laboratory of Advanced Control and Optimization for Chemical Processes of the Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
| | - Takayuki Yamanaka
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| | - Zachary Pappalardo
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| | | | - Aaron M. Wenger
- Department of Computer Science, Stanford University, Stanford, California 94305-5329, USA
| | - Loan Nguyen
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| | - Fiorella Gurrieri
- Istituto di Genetica Medica, Università Cattolica S. Cuore, Rome 00168, Italy
| | - David B. Everman
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, South Carolina 29646, USA
| | - Charles E. Schwartz
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, South Carolina 29646, USA
- Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, USA
| | - Ohad S. Birk
- The Morris Kahn Laboratory of Human Genetics, NIBN, Ben-Gurion University, Beer-Sheva 84105, Israel
| | - Gill Bejerano
- Department of Computer Science, Stanford University, Stanford, California 94305-5329, USA
- Department of Developmental Biology, Stanford University, Stanford, California 94305-5329, USA
| | | | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
| |
|
280
|
Evolutionary growth process of highly conserved sequences in vertebrate genomes. Gene 2012; 504:1-5. [PMID: 22580082 DOI: 10.1016/j.gene.2012.05.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2011] [Revised: 04/27/2012] [Accepted: 05/02/2012] [Indexed: 11/22/2022]
Abstract
Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfectly conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in the discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary histories, each having been frozen since a different evolutionary era along the vertebrate lineage.
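The search criterion quoted in the abstract above (100% identity over at least 100 bp) can be illustrated with a minimal sketch. It assumes the two genomes have already been aligned into equal-length strings with '-' marking gaps; the function name `exact_match_runs` and this input format are hypothetical and are not taken from the paper, whose actual pipeline is not described here.

```python
def exact_match_runs(seq_a, seq_b, min_len=100):
    """Return half-open (start, end) alignment intervals where both
    sequences are identical, ungapped A/C/G/T for at least min_len columns.

    Assumption: seq_a and seq_b are rows of a pairwise alignment
    (equal length, '-' for gaps). This is an illustration only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None  # start column of the run currently being extended
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        a, b = a.upper(), b.upper()
        identical = a == b and a in "ACGT"
        if identical:
            if start is None:
                start = i
        else:
            # A mismatch, gap, or ambiguous base closes the current run.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        runs.append((start, len(seq_a)))
    return runs
```

Lowering `min_len` or allowing a small number of mismatches would give a looser definition of conservation, closer to the 83.1% to 99.2% identities reported for the teleost-mammal pairs.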
|
281
|
Ramialison M, Reinhardt R, Henrich T, Wittbrodt B, Kellner T, Lowy CM, Wittbrodt J. Cis-regulatory properties of medaka synexpression groups. Development 2012; 139:917-28. [PMID: 22318626 DOI: 10.1242/dev.071803] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023]
Abstract
During embryogenesis, tissue specification is triggered by the expression of a unique combination of developmental genes and their expression in time and space is crucial for successful development. Synexpression groups are batteries of spatiotemporally co-expressed genes that act in shared biological processes through their coordinated expression. Although several synexpression groups have been described in numerous vertebrate species, the regulatory mechanisms that orchestrate their common complex expression pattern remain to be elucidated. Here we performed a pilot screen on 560 genes of the vertebrate model system medaka (Oryzias latipes) to systematically identify synexpression groups and investigate their regulatory properties by searching for common regulatory cues. We find that synexpression groups share DNA motifs that are arranged in various combinations into cis-regulatory modules that drive co-expression. In contrast to previous assumptions that these genes are located randomly in the genome, we discovered that genes belonging to the same synexpression group frequently occur in synexpression clusters in the genome. This work presents a first repertoire of synexpression group common signatures, a resource that will contribute to deciphering developmental gene regulatory networks.
Affiliation(s)
- Mirana Ramialison
- University of Heidelberg, Centre for Organismal Studies, Heidelberg, Germany.
|
282
|
Transcriptional enhancers in protein-coding exons of vertebrate developmental genes. PLoS One 2012; 7:e35202. [PMID: 22567096 PMCID: PMC3342275 DOI: 10.1371/journal.pone.0035202] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 12/22/2011] [Accepted: 03/10/2012] [Indexed: 11/19/2022]
Abstract
Many conserved noncoding sequences function as transcriptional enhancers that regulate gene expression. Here, we report that protein-coding DNA also frequently contains enhancers functioning at the transcriptional level. We tested the enhancer activity of 31 protein-coding exons, which we chose based on strong sequence conservation between zebrafish and human, and occurrence in developmental genes, using a Tol2 transposable GFP reporter assay in zebrafish. For each exon we measured GFP expression in hundreds of embryos in 10 anatomies via a novel system that implements the voice-recognition capabilities of a cellular phone. We find that 24/31 (77%) exons drive GFP expression compared to a minimal promoter control, and 14/24 are anatomy-specific (expression in four anatomies or less). GFP expression driven by these coding enhancers frequently overlaps the anatomies where the host gene is expressed (60%), suggesting self-regulation. Highly conserved coding sequences and highly conserved noncoding sequences do not significantly differ in enhancer activity (coding: 24/31 vs. noncoding: 105/147) or tissue-specificity (coding: 14/24 vs. noncoding: 50/105). Furthermore, coding and noncoding enhancers display similar levels of the enhancer-related histone modification H3K4me1 (coding: 9/24 vs noncoding: 34/81). Meanwhile, coding enhancers are over three times as likely to contain an H3K4me1 mark as other exons of the host gene. Our work suggests that developmental transcriptional enhancers do not discriminate between coding and noncoding DNA and reveals widespread dual functions in protein-coding DNA.
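The comparisons in the abstract above (for example coding 24/31 versus noncoding 105/147 active enhancers) can be re-derived from the quoted counts. The abstract does not say which statistical test the authors used, so the sketch below assumes a two-sided Fisher's exact test on a 2x2 table; the function name `fisher_exact_two_sided` is hypothetical and only the Python standard library is used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Counts taken from the abstract: (hits, total) for coding and noncoding.
comparisons = {
    "enhancer activity": ((24, 31), (105, 147)),
    "tissue specificity": ((14, 24), (50, 105)),
    "H3K4me1 mark": ((9, 24), (34, 81)),
}
for label, ((k1, n1), (k2, n2)) in comparisons.items():
    p = fisher_exact_two_sided(k1, n1 - k1, k2, n2 - k2)
    print(f"{label}: {k1 / n1:.1%} vs {k2 / n2:.1%}, p = {p:.3f}")
```

A p-value above a conventional 0.05 threshold for each row would be consistent with the abstract's statement that coding and noncoding elements "do not significantly differ", although the authors' own test and threshold may have been different.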
|
283
|
Direct transcriptional regulation of Six6 is controlled by SoxB1 binding to a remote forebrain enhancer. Dev Biol 2012; 366:393-403. [PMID: 22561201 DOI: 10.1016/j.ydbio.2012.04.023] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 01/10/2012] [Revised: 04/01/2012] [Accepted: 04/17/2012] [Indexed: 01/30/2023]
Abstract
Six6, a sine oculis homeobox protein, plays a crucial and conserved role in the development of the forebrain and eye. To understand how the expression of Six6 is regulated during embryogenesis, we screened ~250 kb of genomic DNA encompassing the Six6 locus for cis-regulatory elements capable of directing reporter gene expression to sites of Six6 transcription in transgenic mouse embryos. Here, we describe two novel enhancer elements that are highly conserved in vertebrate species and whose activities recapitulate Six6 expression in the ventral forebrain and eye, respectively. Cross-species comparisons of the Six6 forebrain enhancer sequences revealed highly conserved binding sites matching the consensus for homeodomain and SoxB1 transcription factors. Deletion of either of the binding sites resulted in loss of the forebrain enhancer activity in the ventral forebrain. Moreover, our studies show that members of the SoxB1 family, including Sox2 and Sox3, are expressed in the region of the ventral forebrain that overlaps with Six6 expression and can bind to the Six6 forebrain enhancer. Loss of function of SoxB1 genes in vivo further emphasizes their role in regulating Six6 forebrain enhancer activity. Thus, our data strongly suggest that SoxB1 transcription factors are direct activators of Six6 expression in the ventral forebrain.
|
284
|
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 224] [Impact Index Per Article: 18.7] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022]
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’, held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg. 38A, Bethesda, MD 20894, USA
|
285
|
Takahashi M, Saitou N. Identification and characterization of lineage-specific highly conserved noncoding sequences in Mammalian genomes. Genome Biol Evol 2012; 4:641-57. [PMID: 22505575 PMCID: PMC3381673 DOI: 10.1093/gbe/evs035] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Accepted: 03/23/2012] [Indexed: 01/12/2023]
Abstract
Vertebrate genome comparisons revealed that there are highly conserved noncoding sequences (HCNSs) among a wide range of species, many of which contain regulatory elements. However, recently emerged sequences conserved in specific lineages have not been well studied. Toward this end, we identified 8,198 primate-specific and 21,128 rodent-specific HCNSs as representative ones among mammals from human-marmoset and mouse-rat comparisons, respectively. Derived allele frequency analysis of primate-specific HCNSs showed that these HCNSs were under purifying selection, indicating that they may harbor important functions. We selected the top 1,000 largest HCNSs and compared the lineage-specific HCNS-flanking genes (LHF genes) with ultraconserved element (UCE)-flanking genes. Interestingly, the majority of LHF genes were different from UCE-flanking genes. This lineage-specific set of LHF genes was more enriched in protein-binding function. Conversely, the number of LHF genes that were also shared by UCEs was small but significantly larger than random expectation, and many of these genes were involved in anatomical development as transcriptional regulators, suggesting that certain groups of genes preferentially recruit new HCNSs in addition to old HCNSs that are conserved among vertebrates. This group of LHF genes might be involved in the various levels of lineage-specific evolution among vertebrates, mammals, primates, and rodents. If so, the emergence of HCNSs in and around these two groups of LHF genes developed lineage-specific characteristics. Our results provide new insight into lineage-specific evolution through interactions between HCNSs and their LHF genes.
Affiliation(s)
- Mahoko Takahashi
- Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Japan
- Division of Population Genetics, National Institute of Genetics, Japan
- Present address: Department of Genetics, Stanford University
| | - Naruya Saitou
- Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Japan
- Division of Population Genetics, National Institute of Genetics, Japan
|
286
|
Abstract
Producing complex recombinant proteins in the milk of transgenic animals offers several advantages: large amounts of proteins can be obtained, and in most cases, these proteins are properly folded, assembled, cleaved, and glycosylated. The level of expression of foreign genes in the mammary gland cannot be predicted in all cases, and appropriate vectors must be used. The main elements of these vectors are as follows: a well-characterized specific promoter; the coding region of the gene of interest, preferably with a homologous or heterologous intron to improve transcription efficiency; and an insulator or boundary element to counteract the chromosomal position effects at the integration site. Once high expression levels are achieved and the recombinant protein is purified, an essential step in the analysis of the final product is determining its degree of glycosylation. This is an important readout because it can affect, among other parameters, the stability and immunogenicity of the recombinant protein.
|
287
|
Royo JL, Bessa J, Hidalgo C, Fernández-Miñán A, Tena JJ, Roncero Y, Gómez-Skarmeta JL, Casares F. Identification and analysis of conserved cis-regulatory regions of the MEIS1 gene. PLoS One 2012; 7:e33617. [PMID: 22448256 PMCID: PMC3308983 DOI: 10.1371/journal.pone.0033617] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 11/02/2011] [Accepted: 02/13/2012] [Indexed: 11/22/2022]
Abstract
Meis1, a conserved transcription factor of the TALE-homeodomain class, is expressed in a wide variety of tissues during development. Its complex expression pattern is likely to be controlled by an equally complex regulatory landscape. Here we have scanned the Meis1 locus for regulatory elements and found 13 non-coding regions, highly conserved between humans and teleost fishes, that have enhancer activity in stable transgenic zebrafish lines. All these regions are syntenic in most vertebrates. The composite expression of all these enhancer elements recapitulates most of Meis1 expression during early embryogenesis, indicating they comprise a basic set of regulatory elements of the Meis1 gene. Using bioinformatic tools, we identify a number of potential binding sites for transcription factors that are compatible with the regulation of these enhancers. Specifically, HHc2:066650, which is expressed in the developing retina and optic tectum, harbors several predicted Pax6 sites. Biochemical, functional and transgenic assays indicate that pax6 genes directly regulate HHc2:066650 activity.
Affiliation(s)
| | | | | | | | | | | | - José Luis Gómez-Skarmeta
- Centro Andaluz de Biología del Desarrollo (CABD) CSIC-UPO-Junta de Andalucía, Sevilla, Spain
- * E-mail: (JLGS); (FC)
| | - Fernando Casares
- Centro Andaluz de Biología del Desarrollo (CABD) CSIC-UPO-Junta de Andalucía, Sevilla, Spain
- * E-mail: (JLGS); (FC)
|
288
|
Arkhipova V, Wendik B, Devos N, Ek O, Peers B, Meyer D. Characterization and regulation of the hb9/mnx1 beta-cell progenitor specific enhancer in zebrafish. Dev Biol 2012; 365:290-302. [PMID: 22426004 PMCID: PMC3327876 DOI: 10.1016/j.ydbio.2012.03.001] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 02/06/2012] [Revised: 02/27/2012] [Accepted: 03/01/2012] [Indexed: 11/06/2022]
Abstract
Differentiation of insulin-producing beta-cells is a genetically well-defined process that involves functions of various conserved transcription factors. Still, the transcriptional mechanisms underlying specification and determination of beta-cell fate are poorly defined. Here we provide the description of a beta-cell progenitor specific enhancer as a model to study initial steps of beta-cell differentiation. We show that evolutionarily non-conserved upstream sequences of the zebrafish hb9 gene are required and sufficient for regulating expression in beta-cells prior to the onset of insulin expression. This enhancer contains binding sites for paired-box transcription factors and two E-boxes that in EMSA studies show interaction with Pax6b and NeuroD, respectively. We show that Pax6b is a potent activator of endodermal hb9 expression and that this activation depends on the beta-cell enhancer. Using genetic approaches we show that pax6b is crucial for maintenance but not induction of pancreatic hb9 transcription. As loss of Pax6b or Hb9 independently results in the loss of insulin expression, the data reveal a novel cross-talk between the two essential regulators of early beta-cell differentiation. While we find that the known pancreatic E-box binding proteins NeuroD and Ngn3 are not required for hb9 expression, we also show that removal of both E-boxes selectively eliminates pancreatic specific reporter expression. The data provide evidence for an Ngn3-independent pathway of beta-cell specification that requires function of currently unspecified E-box binding factors.
Affiliation(s)
- Valeriya Arkhipova
- Institute for Molecular Biology/CMBI, Technikerstr. 25, University of Innsbruck, 6020 Innsbruck, Austria.
|
289
|
Busser BW, Taher L, Kim Y, Tansey T, Bloom MJ, Ovcharenko I, Michelson AM. A machine learning approach for identifying novel cell type-specific transcriptional regulators of myogenesis. PLoS Genet 2012; 8:e1002531. [PMID: 22412381 PMCID: PMC3297574 DOI: 10.1371/journal.pgen.1002531] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 03/30/2011] [Accepted: 12/23/2011] [Indexed: 12/22/2022]
Abstract
Transcriptional enhancers integrate the contributions of multiple classes of transcription factors (TFs) to orchestrate the myriad spatio-temporal gene expression programs that occur during development. A molecular understanding of enhancers with similar activities requires the identification of both their unique and their shared sequence features. To address this problem, we combined phylogenetic profiling with a DNA-based enhancer sequence classifier that analyzes the TF binding sites (TFBSs) governing the transcription of a co-expressed gene set. We first assembled a small number of enhancers that are active in Drosophila melanogaster muscle founder cells (FCs) and other mesodermal cell types. Using phylogenetic profiling, we increased the number of enhancers by incorporating orthologous but divergent sequences from other Drosophila species. Functional assays revealed that the diverged enhancer orthologs were active in largely similar patterns as their D. melanogaster counterparts, although there was extensive evolutionary shuffling of known TFBSs. We then built and trained a classifier using this enhancer set and identified additional related enhancers based on the presence or absence of known and putative TFBSs. Predicted FC enhancers were over-represented in proximity to known FC genes; and many of the TFBSs learned by the classifier were found to be critical for enhancer activity, including POU homeodomain, Myb, Ets, Forkhead, and T-box motifs. Empirical testing also revealed that the T-box TF encoded by org-1 is a previously uncharacterized regulator of muscle cell identity. Finally, we found extensive diversity in the composition of TFBSs within known FC enhancers, suggesting that motif combinatorics plays an essential role in the cellular specificity exhibited by such enhancers. 
In summary, machine learning combined with evolutionary sequence analysis is useful for recognizing novel TFBSs and for facilitating the identification of cognate TFs that coordinate cell type-specific developmental gene expression patterns.
Affiliation(s)
- Brian W. Busser
- Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Yongsok Kim
- Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Terese Tansey
- Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Molly J. Bloom
- Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- * E-mail: (IO); (AMM)
| | - Alan M. Michelson
- Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- * E-mail: (IO); (AMM)
|
290
|
Young RS, Marques AC, Tibbit C, Haerty W, Bassett AR, Liu JL, Ponting CP. Identification and properties of 1,119 candidate lincRNA loci in the Drosophila melanogaster genome. Genome Biol Evol 2012; 4:427-42. [PMID: 22403033 PMCID: PMC3342871 DOI: 10.1093/gbe/evs020] [Citation(s) in RCA: 158] [Impact Index Per Article: 13.2] [Indexed: 12/25/2022]
Abstract
The functional repertoire of long intergenic noncoding RNA (lincRNA) molecules has begun to be elucidated in mammals. Determining the biological relevance and potential gene regulatory mechanisms of these enigmatic molecules would be expedited in a more tractable model organism, such as Drosophila melanogaster. To this end, we defined a set of 1,119 putative lincRNA genes in D. melanogaster using modENCODE whole transcriptome (RNA-seq) data. A large majority (1.1 of 1.3 Mb; 85%) of these bases were not previously reported by modENCODE as being transcribed. Significant selective constraint on the sequences of these loci predicts that virtually all have sustained functionality across the Drosophila clade. We observe biases in lincRNA genomic locations and expression profiles that are consistent with some of these lincRNAs being involved in the regulation of neighboring protein-coding genes with developmental functions. We identify lincRNAs that may be important in the developing nervous system and in male-specific organs, such as the testes. LincRNA loci were also identified whose positions, relative to nearby protein-coding loci, are equivalent between D. melanogaster and mouse. This study predicts that the genomes of not only vertebrates, such as mammals, but also an invertebrate (fruit fly) harbor large numbers of lincRNA loci. Our findings now permit exploitation of Drosophila genetics for the investigation of lincRNA mechanisms, including lincRNAs with potential functional analogues in mammals.
|
291
|
Beaster-Jones L. Cis-regulation and conserved non-coding elements in amphioxus. Brief Funct Genomics 2012; 11:118-30. [DOI: 10.1093/bfgp/els006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
|
292
|
Metazoan promoters: emerging characteristics and insights into transcriptional regulation. Nat Rev Genet 2012; 13:233-45. [PMID: 22392219 DOI: 10.1038/nrg3163] [Citation(s) in RCA: 340] [Impact Index Per Article: 28.3] [Indexed: 02/08/2023]
Abstract
Promoters are crucial for gene regulation. They vary greatly in terms of associated regulatory elements, sequence motifs, the choice of transcription start sites and other features. Several technologies that harness next-generation sequencing have enabled recent advances in identifying promoters and their features, helping researchers who are investigating functional categories of promoters and their modes of regulation. Additional features of promoters that are being characterized include types of histone modifications, nucleosome positioning, RNA polymerase pausing and novel small RNAs. In this Review, we discuss recent findings relating to metazoan promoters and how these findings are leading to a revised picture of what a gene promoter is and how it works.
|
293
|
Louis A, Roest Crollius H, Robinson-Rechavi M. How much does the amphioxus genome represent the ancestor of chordates? Brief Funct Genomics 2012; 11:89-95. [PMID: 22373648 PMCID: PMC3310212 DOI: 10.1093/bfgp/els003] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
Abstract
One of the main motivations to study amphioxus is its potential for understanding the last common ancestor of chordates, which notably gave rise to the vertebrates. An important feature in this respect is the slow evolutionary rate that seems to have characterized the cephalochordate lineage, making amphioxus an interesting proxy for the chordate ancestor, as well as a key lineage to include in comparative studies. Whereas slow evolution was first noticed at the phenotypic level, it has also been described at the genomic level. Here, we examine whether the amphioxus genome is indeed a good proxy for the genome of the chordate ancestor, with a focus on protein-coding genes. We investigate genome features, such as synteny, gene duplication and gene loss, and contrast the amphioxus genome with those of other deuterostomes that are used in comparative studies, such as Ciona, Oikopleura and urchin.
Affiliation(s)
- Alexandra Louis
- Institute of Biology of the Ecole Normale Supérieure, Paris, France
|
294
|
Pauls S, Smith SF, Elgar G. Lens development depends on a pair of highly conserved Sox21 regulatory elements. Dev Biol 2012; 365:310-8. [PMID: 22387845 PMCID: PMC3480646 DOI: 10.1016/j.ydbio.2012.02.025] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 10/25/2011] [Revised: 02/16/2012] [Accepted: 02/18/2012] [Indexed: 02/03/2023]
Abstract
Highly conserved non-coding elements (CNEs) linked to genes involved in embryonic development have been hypothesised to correspond to cis-regulatory modules due to their ability to induce tissue-specific expression patterns. However, attempts to prove their requirement for normal development or for the correct expression of the genes they are associated with have yielded conflicting results. Here, we show that CNEs at the vertebrate Sox21 locus are crucial for Sox21 expression in the embryonic lens and that loss of Sox21 function interferes with normal lens development. Using different expression assays in zebrafish we find that two CNEs linked to Sox21 in all vertebrates contain lens enhancers and that their removal from a reporter BAC abolishes lens expression. Furthermore inhibition of Sox21 function after the injection of a sox21b morpholino into zebrafish leads to defects in lens development. These findings identify a direct link between sequence conservation and genomic function of regulatory sequences. In addition to this we provide evidence that putative Sox binding sites in one of the CNEs are essential for induction of lens expression as well as enhancer function in the CNS. Our results show that CNEs identified in pufferfish-mammal whole-genome comparisons are crucial developmental enhancers and hence essential components of gene regulatory networks underlying vertebrate embryogenesis.
|
295
|
Díaz-Castillo C, Xia XQ, Ranz JM. Evaluation of the role of functional constraints on the integrity of an ultraconserved region in the genus Drosophila. PLoS Genet 2012; 8:e1002475. [PMID: 22319453 PMCID: PMC3271063 DOI: 10.1371/journal.pgen.1002475] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 07/29/2011] [Accepted: 11/29/2011] [Indexed: 01/02/2023]
Abstract
Why gene order is conserved over long evolutionary timespans remains elusive. A common interpretation is that gene order conservation might reflect the existence of functional constraints that are important for organismal performance. Alteration of the integrity of genomic regions, and therefore of those constraints, would result in detrimental effects. This notion seems especially plausible in those genomes that can easily accommodate gene reshuffling via chromosomal inversions since genomic regions free of constraints are likely to have been disrupted in one or more lineages. Nevertheless, no empirical test of this notion has been performed. Here, we disrupt one of the largest conserved genomic regions of the Drosophila genome by chromosome engineering and examine the phenotypic consequences derived from such disruption. The targeted region exhibits multiple patterns of functional enrichment suggestive of the presence of constraints. The carriers of the disrupted collinear block show no defects in their viability, fertility, and parameters of general homeostasis, although their odorant perception is altered. This change in odorant perception does not correlate with modifications of the level of expression and sex bias of the genes within the genomic region disrupted. Our results indicate that even in highly rearranged genomes, like those of Diptera, unusually high levels of gene order conservation cannot be systematically attributed to functional constraints, which raises the possibility that other mechanisms can be in place and therefore the underpinnings of the maintenance of gene organization might be more diverse than previously thought.
Affiliation(s)
- Carlos Díaz-Castillo
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
- Xiao-Qin Xia
- Institute of Hydrobiology, Chinese Academy of Science, Wuhan, China
- José M. Ranz
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
|
296
|
Sakabe NJ, Savic D, Nobrega MA. Transcriptional enhancers in development and disease. Genome Biol 2012; 13:238. [PMID: 22269347 PMCID: PMC3334578 DOI: 10.1186/gb-2012-13-1-238] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.7] [Received: 10/03/2011] [Accepted: 01/13/2012] [Indexed: 01/24/2023]
Abstract
Distal transcription enhancers are cis-regulatory elements that promote gene expression, enabling spatiotemporal control of genetic programs such as those required in metazoan developmental processes. Because of their importance, their disruption can lead to disease.
Affiliation(s)
- Noboru Jo Sakabe
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
|
297
|
Mongin E, Dewar K, Blanchette M. Mapping association between long-range cis-regulatory regions and their target genes using synteny. J Comput Biol 2012; 18:1115-30. [PMID: 21899419 DOI: 10.1089/cmb.2011.0088] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/10/2023]
Abstract
In chordates, long-range cis-regulatory regions are involved in the control of transcription initiation (either as repressors or enhancers). Their main characteristics are that (i) they can be located as far as 1 Mb away from the transcription start site of the target gene, (ii) they can regulate more than one gene, and (iii) they are usually orientation-independent. Therefore, proper characterization of functional interactions between long-range cis-regulatory regions and their target genes remains problematic. We present a novel method to predict such interactions based on the analysis of rearrangements between the human and 16 other vertebrate genomes. Our method is based on the assumption that genome rearrangements that would disrupt the functional interaction between a cis-regulatory region and its target gene are likely to be deleterious. Therefore, conservation of synteny through evolution would be an indication of a functional interaction. We use our algorithm to predict the association between a set of 123,905 human candidate regulatory regions and their target gene(s). This genome-wide map of interactions has many potential applications, including the selection of candidate regions prior to in vivo experimental characterization, a better characterization of regulatory regions involved in position effect diseases, and an improved understanding of the mechanisms and importance of long-range regulation.
Affiliation(s)
- Emmanuel Mongin
- McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada
|
298
|
Weissmann S, Brutnell TP. Engineering C4 photosynthetic regulatory networks. Curr Opin Biotechnol 2012; 23:298-304. [PMID: 22261559 DOI: 10.1016/j.copbio.2011.12.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 11/18/2011] [Revised: 12/13/2011] [Accepted: 12/15/2011] [Indexed: 11/25/2022]
Abstract
C4 photosynthesis is a complex metabolic pathway responsible for carbon fixation in major feed, food and bioenergy crops. Although many enzymes driving this pathway have been identified, regulatory mechanisms underlying this system remain elusive. C4 photosynthesis contributes to photosynthetic efficiency in major bioenergy crops such as sugarcane, Miscanthus, switchgrass, maize and sorghum, and international efforts are underway to engineer C4 photosynthesis into C3 crops. A fundamental understanding of the C4 network is thus needed. New experimental and informatics methods can facilitate the accumulation and analysis of high-throughput data to define components of the C4 system. The use of new model plants, closely related to C4 crops, will also contribute to our understanding of the mechanisms that regulate this complex and important pathway.
Affiliation(s)
- Sarit Weissmann
- Boyce Thompson Institute for Plant Research, Cornell University, Tower Road, Ithaca, NY 14853, United States
|
299
|
An ancient genomic regulatory block conserved across bilaterians and its dismantling in tetrapods by retrogene replacement. Genome Res 2012; 22:642-55. [PMID: 22234889 DOI: 10.1101/gr.132233.111] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 01/01/2023]
Abstract
Developmental genes are regulated by complex, distantly located cis-regulatory modules (CRMs), often forming genomic regulatory blocks (GRBs) that are conserved among vertebrates and among insects. We have investigated GRBs associated with Iroquois homeobox genes in 39 metazoans. Despite 600 million years of independent evolution, Iroquois genes are linked to ankyrin-repeat-containing Sowah genes in nearly all studied bilaterians. We show that Iroquois-specific CRMs populate the Sowah locus, suggesting that regulatory constraints underlie the maintenance of the Iroquois-Sowah syntenic block. Surprisingly, tetrapod Sowah orthologs are intronless and not associated with Iroquois; however, teleost and elephant shark data demonstrate that this is a derived feature, and that many Iroquois-CRMs were ancestrally located within Sowah introns. Retroposition, gene, and genome duplication have allowed selective elimination of Sowah exons from the Iroquois regulatory landscape while keeping associated CRMs, resulting in large associated gene deserts. These results highlight the importance of CRMs in imposing constraints to genome architecture, even across large phylogenetic distances, and of gene duplication-mediated genetic redundancy to disentangle these constraints, increasing genomic plasticity.
|
300
|
Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst Biol 2012; 61:717-26. [PMID: 22232343 DOI: 10.1093/sysbio/sys004] [Citation(s) in RCA: 705] [Impact Index Per Article: 58.8] [Indexed: 01/17/2023]
Abstract
Although massively parallel sequencing has facilitated large-scale DNA sequencing, comparisons among distantly related species rely upon small portions of the genome that are easily aligned. Methods are needed to efficiently obtain comparable DNA fragments prior to massively parallel sequencing, particularly for biologists working with non-model organisms. We introduce a new class of molecular marker, anchored by ultraconserved genomic elements (UCEs), that universally enable target enrichment and sequencing of thousands of orthologous loci across species separated by hundreds of millions of years of evolution. Our analyses here focus on use of UCE markers in Amniota because UCEs and phylogenetic relationships are well-known in some amniotes. We perform an in silico experiment to demonstrate that sequence flanking 2030 UCEs contains information sufficient to enable unambiguous recovery of the established primate phylogeny. We extend this experiment by performing an in vitro enrichment of 2386 UCE-anchored loci from nine, non-model avian species. We then use alignments of 854 of these loci to unambiguously recover the established evolutionary relationships within and among three ancient bird lineages. Because many organismal lineages have UCEs, this type of genetic marker and the analytical framework we outline can be applied across the tree of life, potentially reshaping our understanding of phylogeny at many taxonomic levels.
Affiliation(s)
- Brant C Faircloth
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA.
|