1
Gui Y, Zhang Y, Zhang Q, Chen X, Wang F, Wu F, Gui Y, Li Q. The functional verification and analysis of Fugu promoter of cardiac gene tnni1a in zebrafish. Cells Dev 2022; 171:203801. [PMID: 35787465 DOI: 10.1016/j.cdev.2022.203801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Revised: 05/09/2022] [Accepted: 06/28/2022] [Indexed: 01/25/2023]
Abstract
Troponin I type 1b (Tnni1b) is thought to be a novel isoform that is expressed only in the zebrafish heart. Knockdown of tnni1b can lead to cardiac defects in zebrafish. Although both the zebrafish tnni1b and human troponin I1 (TNNI1) genes are thought to be closely associated with fetal cardiac development, the regulatory molecular mechanisms of these genes are poorly understood. Analyzing functionally conserved sequences, especially in the noncoding regulatory regions involved in gene expression, can help clarify these mechanisms. In this study, we isolated a 3 kb fragment upstream of Fugu tnni1a that can regulate green fluorescent protein (GFP) expression in a heart-specific manner, similar to the expression pattern of the zebrafish homologue. Three evolutionarily conserved regions (ECRs) in the 5'-flanking sequence of Fugu tnni1a were identified by sequence alignment. Deletion analysis identified ECR2 as a core sequence that affects the heart-specific expression function of the Fugu tnni1a promoter. Interestingly, both the Fugu tnni1a promoter and the ECR2 sequence were functionally conserved in zebrafish, although they shared no sequence similarity. Together, the findings of our study provide further evidence for the important role of tnni1a homologues in cardiac development and demonstrate that two functionally conserved sequences in the zebrafish and Fugu genomes may be ECRs, despite their lack of similarity.
Affiliation(s)
- Yiting Gui
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China; Cardiovascular Center, NHC Key Laboratory of Neonatal Diseases, Fudan University, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China
- Yawen Zhang
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China; Cardiovascular Center, NHC Key Laboratory of Neonatal Diseases, Fudan University, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China
- Qi Zhang
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China
- Xudong Chen
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China
- Feng Wang
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China; Cardiovascular Center, NHC Key Laboratory of Neonatal Diseases, Fudan University, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China
- Fang Wu
- Department of Neonatology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 201600, China
- Yonghao Gui
- Cardiovascular Center, NHC Key Laboratory of Neonatal Diseases, Fudan University, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China.
- Qiang Li
- Translational Medical Center for Development and Disease, Shanghai Key Laboratory of Birth Defect Prevention and Control, NHC Key Laboratory of Neonatal Diseases, Institute of Pediatrics, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China.
2
Snetkova V, Pennacchio LA, Visel A, Dickel DE. Perfect and imperfect views of ultraconserved sequences. Nat Rev Genet 2022; 23:182-194. [PMID: 34764456 PMCID: PMC8858888 DOI: 10.1038/s41576-021-00424-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Accepted: 09/30/2021] [Indexed: 12/12/2022]
Abstract
Across the human genome, there are nearly 500 'ultraconserved' elements: regions of at least 200 contiguous nucleotides that are perfectly conserved in both the mouse and rat genomes. Remarkably, the majority of these sequences are non-coding, and many can function as enhancers that activate tissue-specific gene expression during embryonic development. From their first description more than 15 years ago, their extreme conservation has both fascinated and perplexed researchers in genomics and evolutionary biology. The intrigue around ultraconserved elements only grew with the observation that they are dispensable for viability. Here, we review recent progress towards understanding the general importance and the specific functions of ultraconserved sequences in mammalian development and human disease and discuss possible explanations for their extreme conservation.
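The definition used in this abstract (at least 200 contiguous, perfectly conserved nucleotides) can be illustrated with a small sketch. It assumes the sequences have already been aligned to equal length without gaps; the function name `conserved_runs` and the toy sequences are hypothetical and are not taken from the review, and real analyses work from whole-genome alignments rather than short strings.

```python
def conserved_runs(seqs, min_len=200):
    """Return half-open (start, end) intervals in which every aligned
    sequence carries the same nucleotide for at least `min_len` positions.

    Assumption: `seqs` are pre-aligned, equal-length strings without gaps.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")

    runs = []
    start = None  # start index of the current identical run, if any
    for i in range(length):
        column = {s[i].upper() for s in seqs}
        identical = len(column) == 1 and column <= set("ACGT")
        if identical:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


if __name__ == "__main__":
    # Toy alignment with a single mismatch at position 4 in the third sequence.
    toy = ["ACGTACGTAA", "ACGTACGTAA", "ACGTTCGTAA"]
    print(conserved_runs(toy, min_len=5))
```

Lowering `min_len` in the toy call makes the shorter run before the mismatch appear as well, which shows how strongly the length threshold shapes the resulting set of elements.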
Affiliation(s)
- Valentina Snetkova
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- Len A. Pennacchio
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA; Comparative Biochemistry Program, University of California, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA 94720, USA. To whom correspondence should be addressed: L.A.P., A.V., D.E.D. (lead contact)
- Axel Visel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; US Department of Energy Joint Genome Institute, Berkeley, CA, USA; School of Natural Sciences, University of California, Merced, Merced, CA, USA
- Diane E. Dickel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
3
Franchini LF. Genetic Mechanisms Underlying Cortical Evolution in Mammals. Front Cell Dev Biol 2021; 9:591017. [PMID: 33659245 PMCID: PMC7917222 DOI: 10.3389/fcell.2021.591017] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/03/2020] [Accepted: 01/08/2021] [Indexed: 12/13/2022] Open
Abstract
The remarkable sensory, motor, and cognitive abilities of mammals mainly depend on the neocortex. Thus, the emergence of the six-layered neocortex in reptilian ancestors of mammals constitutes a fundamental evolutionary landmark. The mammalian cortex is a columnar epithelium of densely packed cells organized in layers where neurons are generated mainly in the subventricular zone in successive waves throughout development. Newborn cells move away from their site of neurogenesis through radial or tangential migration to reach their specific destination closer to the pial surface of the same or different cortical area. Interestingly, the genetic programs underlying neocortical development diversified in different mammalian lineages. In this work, I will review several recent studies that characterized how distinct transcriptional programs relate to the development and functional organization of the neocortex across diverse mammalian lineages. In some primates such as the anthropoids, the neocortex became extremely large, especially in humans where it comprises around 80% of the brain. It has been hypothesized that the massive expansion of the cortical surface and elaboration of its connections in the human lineage, has enabled our unique cognitive capacities including abstract thinking, long-term planning, verbal language and elaborated tool making capabilities. I will also analyze the lineage-specific genetic changes that could have led to the modification of key neurodevelopmental events, including regulation of cell number, neuronal migration, and differentiation into specific phenotypes, in order to shed light on the evolutionary mechanisms underlying the diversity of mammalian brains including the human brain.
Affiliation(s)
- Lucía Florencia Franchini
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
4
Sadler B, Haller G, Antunes L, Nikolov M, Amarillo I, Coe B, Dobbs MB, Gurnett CA. Rare and de novo duplications containing SHOX in clubfoot. J Med Genet 2020; 57:851-857. [PMID: 32518174 PMCID: PMC7688552 DOI: 10.1136/jmedgenet-2020-106842] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/07/2020] [Revised: 03/04/2020] [Accepted: 03/05/2020] [Indexed: 11/12/2022]
Abstract
Introduction: Congenital clubfoot is a common birth defect that affects at least 0.1% of all births. Nearly 25% of cases are familial and the remainder are sporadic. Copy number variants (CNVs) involving transcriptional regulators of limb development, including PITX1 and TBX4, have previously been shown to cause familial clubfoot, but much of the heritability remains unexplained. Methods: Exome sequence data from 816 unrelated clubfoot cases and 2645 in-house controls were analysed using coverage data to identify rare CNVs. The precise size and location of duplications were then determined using high-density Affymetrix Cytoscan chromosomal microarray (CMA). Segregation in families and de novo status were determined using quantitative PCR. Results: Chromosome Xp22.33 duplications involving SHOX were identified in 1.1% of cases (9/816) compared with 0.07% of in-house controls (2/2645) (p=7.98×10⁻⁵, OR=14.57) and 0.27% (38/13592) of Atherosclerosis Risk in Communities/the Wellcome Trust Case Control Consortium 2 controls (p=0.001, OR=3.97). CMA validation confirmed an overlapping 180.28 kb duplicated region that included SHOX exons as well as downstream non-coding regions. In four of six sporadic cases where DNA was available for unaffected parents, the duplication was de novo. The probability of four de novo mutations in SHOX by chance in a cohort of 450 sporadic clubfoot cases is 5.4×10⁻¹⁰. Conclusions: Microduplications of the pseudoautosomal chromosome Xp22.33 region (PAR1) containing SHOX and downstream enhancer elements occur in ~1% of patients with clubfoot. SHOX and its regulatory regions have previously been implicated in skeletal dysplasia as well as idiopathic short stature, but have not yet been reported in clubfoot. SHOX duplications likely contribute to clubfoot pathogenesis by altering early limb development.
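The carrier counts quoted in the Results can be turned into frequencies and crude odds ratios with a few lines of arithmetic. The sketch below is only an illustration: it computes the simple unadjusted cross-product odds ratio from the 2×2 counts, and the helper names are hypothetical. The authors' exact estimation method is not stated in the abstract, so small differences from the published OR values are to be expected.

```python
def carrier_frequency(carriers, total):
    """Fraction of individuals carrying the duplication."""
    return carriers / total


def odds_ratio(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """Unadjusted odds ratio from a 2x2 table (cross-product ratio).

    Assumption: no continuity correction or covariate adjustment is applied.
    """
    a = case_carriers                # cases with duplication
    b = case_total - case_carriers   # cases without duplication
    c = ctrl_carriers                # controls with duplication
    d = ctrl_total - ctrl_carriers   # controls without duplication
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Counts taken from the abstract.
    cases = (9, 816)
    in_house_controls = (2, 2645)
    population_controls = (38, 13592)

    print(f"case frequency: {carrier_frequency(*cases):.2%}")
    print(f"OR vs in-house controls: {odds_ratio(*cases, *in_house_controls):.2f}")
    print(f"OR vs population controls: {odds_ratio(*cases, *population_controls):.2f}")
```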
Affiliation(s)
- Brooke Sadler
- Department of Neurology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Gabe Haller
- Department of Orthopedic Surgery, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Lilian Antunes
- Department of Neurology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Momchil Nikolov
- Department of Neurology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Ina Amarillo
- Department of Pathology and Immunology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Bradley Coe
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA; Department of Pathology & Laboratory Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
- Matthew B Dobbs
- Department of Orthopedic Surgery, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
- Christina A Gurnett
- Department of Neurology, Washington University in Saint Louis School of Medicine, Saint Louis, Missouri, USA
5
Lettice LA, Devenney P, De Angelis C, Hill RE. The Conserved Sonic Hedgehog Limb Enhancer Consists of Discrete Functional Elements that Regulate Precise Spatial Expression. Cell Rep 2017; 20:1396-1408. [PMID: 28793263 PMCID: PMC5561167 DOI: 10.1016/j.celrep.2017.07.037] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 06/26/2016] [Revised: 05/17/2017] [Accepted: 07/13/2017] [Indexed: 12/21/2022] Open
Abstract
Expression of sonic hedgehog (Shh) in the limb bud is regulated by an enhancer called the zone of polarizing activity regulatory sequence (ZRS), which, in evolution, belongs to an ancient group of highly conserved cis regulators found in all classes of vertebrates. Here, we examined the endogenous ZRS in mice, using genome editing to establish the relationship between enhancer composition and embryonic phenotype. We show that enhancer activity is a consolidation of distinct activity domains. Spatial restriction of Shh expression is mediated by a discrete repressor module, whereas levels of gene expression are controlled by large overlapping domains containing varying numbers of HOXD binding sites. The number of HOXD binding sites regulates expression levels incrementally. Substantial portions of conserved sequence are dispensable, indicating the presence of sequence redundancy. We propose a collective model for enhancer activity in which function is an integration of discrete expression activities and redundant components that drive robust expression. Highlights: The ancient vertebrate enhancer, the ZRS, shows sequence plasticity. Discrete regulatory activities are assigned to specific sites in the enhancer. The number of HOXD binding sites determines the level of Shh expression. Robust expression is a collective of regulatory and redundant information.
Affiliation(s)
- Laura A Lettice
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Paul Devenney
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Carlo De Angelis
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
- Robert E Hill
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.
6
dos Santos FC, Peixoto MGCD, Fonseca PADS, Pires MDFÁ, Ventura RV, Rosse IDC, Bruneli FAT, Machado MA, Carvalho MRS. Identification of Candidate Genes for Reactivity in Guzerat (Bos indicus) Cattle: A Genome-Wide Association Study. PLoS One 2017; 12:e0169163. [PMID: 28125592 PMCID: PMC5268462 DOI: 10.1371/journal.pone.0169163] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/14/2016] [Accepted: 12/13/2016] [Indexed: 01/24/2023] Open
Abstract
Temperament is fundamental to animal production due to its direct influence on the animal-herdsman relationship. When compared to calm animals, the aggressive, anxious or fearful ones exhibit less weight gain, lower reproductive efficiency, decreased milk production and higher herd maintenance costs, all of which contribute to reduced profits. However, temperament is a trait that is complex and difficult to assess. Recently, a new quantitative system, REATEST®, for assessing reactivity, a phenotype of temperament, was developed. Herein, we describe the results of a Genome-wide association study for reactivity, assessed using REATEST® with a sample of 754 females from five dual-purpose (milk and meat production) Guzerat (Bos indicus) herds. Genotyping was performed using a 50k SNP chip and a two-step mixed model approach (Grammar-Gamma) with a one-by-one marker regression was used to identify QTLs. QTLs for reactivity were identified on chromosomes BTA1, BTA5, BTA14, and BTA25. Five intronic and two intergenic markers were significantly associated with reactivity. POU1F1, DRD3, VWA3A, ZBTB20, EPHA6, SNRPF and NTN4 were identified as candidate genes. Previous QTL reports for temperament traits, covering areas surrounding the SNPs/genes identified here, further corroborate these associations. The seven genes identified in the present study explain 20.5% of reactivity variance and give a better understanding of temperament biology.
Affiliation(s)
- Ricardo Vieira Ventura
- Center for Genetic Improvement of Livestock, University of Guelph, Guelph, Canada
- Beef Improvement Opportunities, Guelph, Canada
- Izinara da Cruz Rosse
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
7
Martinez AF, Abe Y, Hong S, Molyneux K, Yarnell D, Löhr H, Driever W, Acosta MT, Arcos-Burgos M, Muenke M. An Ultraconserved Brain-Specific Enhancer Within ADGRL3 (LPHN3) Underpins Attention-Deficit/Hyperactivity Disorder Susceptibility. Biol Psychiatry 2016; 80:943-954. [PMID: 27692237 PMCID: PMC5108697 DOI: 10.1016/j.biopsych.2016.06.026] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 12/11/2015] [Revised: 06/28/2016] [Accepted: 06/30/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Genetic factors predispose individuals to attention-deficit/hyperactivity disorder (ADHD). Previous studies have reported linkage and association to ADHD of gene variants within ADGRL3. In this study, we functionally analyzed noncoding variants in this gene as likely pathological contributors. METHODS: In silico, in vitro, and in vivo approaches were used to identify and characterize evolutionary conserved elements within the ADGRL3 linkage region (~207 kb). Family-based genetic analyses of 838 individuals (372 affected and 466 unaffected patients) identified ADHD-associated single nucleotide polymorphisms harbored in some of these conserved elements. Luciferase assays and zebrafish green fluorescent protein transgenesis tested conserved elements for transcriptional enhancer activity. Electromobility shift assays were used to verify transcription factor-binding disruption by ADHD risk alleles. RESULTS: An ultraconserved element was discovered (evolutionary conserved region 47) that functions as a transcriptional enhancer. A three-variant ADHD risk haplotype in evolutionary conserved region 47, formed by rs17226398, rs56038622, and rs2271338, reduced enhancer activity by 40% in neuroblastoma and astrocytoma cells (pBonferroni < .0001). This enhancer also drove green fluorescent protein expression in the zebrafish brain in a tissue-specific manner, sharing aspects of endogenous ADGRL3 expression. The rs2271338 risk allele disrupts binding of YY1 transcription factor, an important factor in the development and function of the central nervous system. Expression quantitative trait loci analysis of postmortem human brain tissues revealed an association between rs2271338 and reduced ADGRL3 expression in the thalamus. CONCLUSIONS: These results uncover the first functional evidence of common noncoding variants with potential implications for the pathology of ADHD.
8
The disruption of a novel limb cis-regulatory element of SHH is associated with autosomal dominant preaxial polydactyly-hypertrichosis. Eur J Hum Genet 2015; 24:37-43. [PMID: 25782671 DOI: 10.1038/ejhg.2015.53] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 08/08/2014] [Revised: 02/12/2015] [Accepted: 02/19/2015] [Indexed: 12/16/2022] Open
Abstract
The expression gradient of the morphogen Sonic Hedgehog (SHH) is crucial in establishing the number and the identity of the digits during anteroposterior patterning of the limb. Its anterior ectopic expression is responsible for preaxial polydactyly (PPD). Most of these malformations are due to the gain-of-function of the Zone of Polarizing Activity Regulatory Sequence, the only limb-specific enhancer of SHH known to date. We report a family affected with a novel condition associating PPD and hypertrichosis of the upper back, following an autosomal dominant mode of inheritance. This phenotype is consistent with deregulation of SHH expression during limb and follicle development. In affected members, we identified a 2 kb deletion located ~240 kb upstream from the SHH promoter. The deleted sequence is capable of repressing the transcriptional activity of the SHH promoter in vitro, consistent with a silencer activity. We hypothesize that the deletion of this silencer could be responsible for SHH deregulation during development, leading to a PPD-hypertrichosis phenotype.
9
Taher L, Narlikar L, Ovcharenko I. Identification and computational analysis of gene regulatory elements. Cold Spring Harb Protoc 2015; 2015:pdb.top083642. [PMID: 25561628 PMCID: PMC5885252 DOI: 10.1101/pdb.top083642] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
Over the last two decades, advances in experimental and computational technologies have greatly facilitated genomic research. Next-generation sequencing technologies have made de novo sequencing of large genomes affordable, and powerful computational approaches have enabled accurate annotations of genomic DNA sequences. Charting functional regions in genomes must account for not only the coding sequences, but also noncoding RNAs, repetitive elements, chromatin states, epigenetic modifications, and gene regulatory elements. A mix of comparative genomics, high-throughput biological experiments, and machine learning approaches has played a major role in this truly global effort. Here we describe some of these approaches and provide an account of our current understanding of the complex landscape of the human genome. We also present overviews of different publicly available, large-scale experimental data sets and computational tools, which we hope will prove beneficial for researchers working with large and complex genomes.
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, 18051 Rostock, Germany
- Leelavati Narlikar
- Chemical Engineering and Process Development Division, National Chemical Laboratory, CSIR, Pune 411008, India
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
10
Khursheed K, Wilm TP, Cashman C, Quinn JP, Bubb VJ, Moss DJ. Characterisation of multiple regulatory domains spanning the major transcriptional start site of the FUS gene, a candidate gene for motor neurone disease. Brain Res 2014; 1595:1-9. [PMID: 25451114 DOI: 10.1016/j.brainres.2014.10.056] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/19/2014] [Revised: 10/07/2014] [Accepted: 10/27/2014] [Indexed: 10/24/2022]
Abstract
Fused-In-Sarcoma (FUS) is a candidate gene for neurological disorders, including motor neurone disease and Parkinson's disease, in addition to various types of cancer. Recently it has been reported that overexpression of FUS causes motor neurone disease in mouse models; hence, mutations leading to changes in gene expression may contribute to the development of neurodegenerative disease. Genome evolutionary conservation was used to predict important cis-acting DNA regulators of the FUS gene promoter that direct transcription. The putative regulators identified were analysed in reporter gene assays in cells and in chick embryos. Our analysis indicated that, in addition to regulatory domains 5' of the transcriptional start site, an important regulatory domain resides in intron 1 of the gene itself. This intronic domain functioned both in cell lines and in vivo in the neural tube of the chick embryo, including developing motor neurones. Our data suggest that the interaction of multiple domains, including intronic domains, is involved in expression of FUS. A better understanding of the regulation of expression of FUS may give insight into how its stimulus-inducible expression may be associated with neurological disorders.
Affiliation(s)
- Kejhal Khursheed
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK
- Thomas P Wilm
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK
- Christine Cashman
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK
- John P Quinn
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK
- Vivien J Bubb
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK
- Diana J Moss
- Institute of Translational Medicine, Sherrington Buildings, Ashton St, Liverpool University, Liverpool L69 3GE, UK.
11
Jia Z, Gao S, M'Rabet N, De Geyter C, Zhang H. Sp1 is necessary for gene activation of Adamts17 by estrogen. J Cell Biochem 2014; 115:1829-39. [PMID: 24906090 DOI: 10.1002/jcb.24855] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 05/15/2014] [Accepted: 05/30/2014] [Indexed: 12/21/2022]
Abstract
Adamts17 is a member of a family of secreted metalloproteinases. In this report, we show that knockdown of Adamts17 expression induces apoptosis and inhibits breast cancer cell growth. Adamts17 expression can rapidly be induced by estrogens. siRNA knockdown of Sp1 or Myc demonstrated that Sp1 is required to induce Adamts17 gene expression in response to estrogen. Moreover, reporter assays showed that the proximal promoter and the upstream sequences were not capable of conferring estrogen responsiveness, suggesting that Sp1 elements may be located in the downstream intronic region. We further demonstrated that Sp1 and Myc binding in the proximal promoter region contributed to the Adamts17 basal expression. Furthermore, histone deacetylase (HDAC) and methylase inhibitors also induced Adamts17 expression, indicating that epigenetic alterations, such as aberrant HDAC and/or methylation are associated with dysregulated Adamts17 expression. By meta-analysis using Oncomine microarray data, we found that higher Adamts17 expression is found in several human cancer cell subtypes, especially in breast ductal carcinoma. Moreover, we found that there is an inverse correlation between higher Adamts17 expression and patients' survival. Our study suggests that Adamts17 may support breast cancer cell growth and survival.
Affiliation(s)
- Zanhui Jia
- Clinic of Gynecological Endocrinology and Reproductive Medicine, University of Basel, Spitalstrasse 21, CH-4031, Basel, Switzerland; Department of Biomedicine, University of Basel, Hebelstrasse 20, CH-4031, Basel, Switzerland; Department of Gynecology and Obstetrics, Second Hospital of Jilin University, Changchun City, Jilin Province, P.R. China
12
Wallmen B, Schrempp M, Hecht A. Intrinsic properties of Tcf1 and Tcf4 splice variants determine cell-type-specific Wnt/β-catenin target gene expression. Nucleic Acids Res 2012; 40:9455-69. [PMID: 22859735 PMCID: PMC3479169 DOI: 10.1093/nar/gks690] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 01/09/2023] Open
Abstract
T-cell factor (Tcf)/lymphoid-enhancer factor (Lef) proteins are a structurally diverse family of deoxyribonucleic acid-binding proteins that have essential nuclear functions in Wnt/β-catenin signalling. Expression of Wnt/β-catenin target genes is highly dependent on context, but the precise role of Tcf/Lef family members in the generation and maintenance of cell-type-specific Wnt/β-catenin responses is unknown. Herein, we show that induction of a subset of Wnt/β-catenin targets in embryonic stem cells depends on Tcf1 and Tcf4, whereas other co-expressed Tcf/Lef family members cannot induce these targets. The Tcf1/Tcf4-dependent gene responses to Wnt are primarily if not exclusively mediated by C-clamp-containing Tcf1E and Tcf4E splice variants. A combined knockdown of Tcf1/Tcf4 abrogates Wnt-inducible transcription but does not affect the active chromatin conformation of their targets. Thus, the transcriptionally poised state of Wnt/β-catenin targets is maintained independent of Tcf/Lef proteins. Conversely, ectopically overexpressed Tcf1E cannot invade silent chromatin and fails to initiate expression of inactive Wnt/β-catenin targets even if repressive chromatin modifications are abolished. The observed non-redundant functions of Tcf1/Tcf4 isoforms in acute transcriptional activation demonstrated that the cell-type-specific complement of Tcf/Lef proteins is a critical determinant of context-dependent Wnt/β-catenin responses. Moreover, the apparent inability to cope with chromatin uncovers an intrinsic property of Tcf/Lef proteins that prevents false ectopic induction and ensures spatiotemporal stability of Wnt/β-catenin target gene expression.
Affiliation(s)
- Britta Wallmen
- Spemann Graduate School of Biology and Medicine and Faculty of Biology, Albert-Ludwigs-University Freiburg, Albertstr. 19A, D-79104 Freiburg, Germany
13
Marek KW, Kurtz LM, Spitzer NC. cJun integrates calcium activity and tlx3 expression to regulate neurotransmitter specification. Nat Neurosci 2010; 13:944-50. [PMID: 20581840 PMCID: PMC2910808 DOI: 10.1038/nn.2582] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.5] [Received: 03/19/2010] [Accepted: 05/19/2010] [Indexed: 12/02/2022]
Abstract
Neuronal differentiation is accomplished through cascades of intrinsic genetic factors initiated in neuronal progenitors by external gradients of morphogens. Activity was thought to be important only late in development, but recent evidence indicates that activity also regulates early neuronal differentiation. Activity in post-mitotic neurons prior to synapse formation can regulate phenotypic specification, including neurotransmitter choice, but the mechanisms are not clear. Here we identify a mechanism that links endogenous calcium spike activity with an intrinsic genetic pathway to specify neurotransmitter choice in neurons in the dorsal embryonic spinal cord of Xenopus tropicalis. Early activity modulates transcription of the GABAergic/glutamatergic selection gene tlx3 and requires a variant cAMP response element (CRE) in its promoter. The cJun transcription factor binds to this CRE site, modulates transcription, and regulates neurotransmitter phenotype through its transactivation domain. Calcium signals through cJun N-terminal phosphorylation, thus integrating activity-dependent and intrinsic neurotransmitter specification. This mechanism provides a basis for early activity to regulate genetic pathways at critical decision points, switching the phenotype of developing neurons.
Affiliation(s)
- Kurt W Marek
- Neurobiology Section, Division of Biological Sciences and Center for Neural Circuits, Kavli Institute for Brain and Mind, University of California San Diego, La Jolla, California, USA.
|
14
|
Gotea V, Visel A, Westlund JM, Nobrega MA, Pennacchio LA, Ovcharenko I. Homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers. Genome Res 2010; 20:565-77. [PMID: 20363979 DOI: 10.1101/gr.104471.109] [Citation(s) in RCA: 169] [Impact Index Per Article: 12.1] [Indexed: 11/25/2022]
Abstract
Clustering of multiple transcription factor binding sites (TFBSs) for the same transcription factor (TF) is a common feature of cis-regulatory modules in invertebrate animals, but the occurrence of such homotypic clusters of TFBSs (HCTs) in the human genome has remained largely unknown. To explore whether HCTs are also common in human and other vertebrates, we used known binding motifs for vertebrate TFs and a hidden Markov model-based approach to detect HCTs in the human, mouse, chicken, and fugu genomes, and examined their association with cis-regulatory modules. We found that evolutionarily conserved HCTs occupy nearly 2% of the human genome, with experimental evidence for individual TFs supporting their binding to predicted HCTs. More than half of the promoters of human genes contain HCTs, with a distribution around the transcription start site in agreement with the experimental data from the ENCODE project. In addition, almost half of the 487 experimentally validated developmental enhancers contain them as well, a number more than 25-fold larger than expected by chance. We also found evidence of negative selection acting on TFBSs within HCTs, as the conservation of TFBSs is stronger than the conservation of sequences separating them. The important role of HCTs as components of developmental enhancers is additionally supported by a strong correlation between HCTs and the binding of the enhancer-associated coactivator protein Ep300 (also known as p300). Experimental validation of HCT-containing elements in both zebrafish and mouse suggests that HCTs could be used to predict both the presence of enhancers and their tissue specificity, and are thus a feature that can be effectively used in deciphering the gene regulatory code. In conclusion, our results indicate that HCTs are a pervasive feature of human cis-regulatory modules and suggest that they play an important role in gene regulation in the human and other vertebrate genomes.
Affiliation(s)
- Valer Gotea
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|
15
|
Loots GG, Ovcharenko I. Human variation in short regions predisposed to deep evolutionary conservation. Mol Biol Evol 2010; 27:1279-88. [PMID: 20093432 PMCID: PMC2872621 DOI: 10.1093/molbev/msq011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023] Open
Abstract
The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved, reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage, an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explains only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by minor allele frequency and population differentiation (F-statistic) measures. These observations argue for the plasticity of gene regulatory mechanisms in vertebrates, with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species.
Affiliation(s)
- Gabriela G Loots
- Biology and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
|
16
|
Itou J, Suyama M, Imamura Y, Deguchi T, Fujimori K, Yuba S, Kawarabayasi Y, Kawasaki T. Functional and comparative genomics analyses of pmp22 in medaka fish. BMC Neurosci 2009; 10:60. [PMID: 19534778 PMCID: PMC2714311 DOI: 10.1186/1471-2202-10-60] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 02/11/2009] [Accepted: 06/17/2009] [Indexed: 01/23/2023] Open
Abstract
Background: Pmp22, a member of the junction protein family Claudin/EMP/PMP22, plays an important role in myelin formation. Increased pmp22 transcription causes the peripheral neuropathy Charcot-Marie-Tooth disease type 1A (CMT1A). The pathophysiological phenotype of CMT1A is aberrant axonal myelination, which reduces nerve conduction velocity (NCV). Several CMT1A model rodents have been established by overexpressing pmp22. Thus, it is thought that pmp22 expression must be tightly regulated for correct myelin formation in mammals. Interestingly, the myelin sheath is also present in other jawed vertebrates. The purpose of this study is to analyze the evolutionary conservation of the association between pmp22 transcription level and vertebrate myelin formation, and to find conserved non-coding sequences for pmp22 regulation by comparative genomics analyses between jawed fishes and mammals. Results: A transgenic pmp22 over-expression medaka fish line was established. The transgenic fish had approximately one fifth the peripheral NCV values of controls, and aberrant myelination was observed in the peripheral nervous system (PNS) of the transgenic fish. We confirmed that medaka fish pmp22 has the same exon-intron structure as mammals, and identified some known conserved regulatory motifs. Furthermore, we found novel conserved sequences in the first intron and 3'UTR. Conclusion: Medaka fish develop abnormalities in the PNS when pmp22 transcription increases. This result indicates that an adequate pmp22 transcription level is necessary for correct myelination in jawed vertebrates. Comparison of pmp22 orthologs between distantly related species identifies evolutionarily conserved sequences that contribute to precise regulation of pmp22 expression.
Affiliation(s)
- Junji Itou
- Department of Radiation Biomedical Science IV, Radiation Biology Center, Graduate School of Medicine, Kyoto University, Kyoto 606-8501, Japan.
|
17
|
Taher L, Ovcharenko I. Variable locus length in the human genome leads to ascertainment bias in functional inference for non-coding elements. Bioinformatics 2009; 25:578-84. [PMID: 19168912 PMCID: PMC2647827 DOI: 10.1093/bioinformatics/btp043] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
Abstract
MOTIVATION: Several functional gene annotation databases have been developed in recent years and are widely used to infer the biological function of gene sets by scrutinizing the attributes that appear over- and underrepresented. However, this strategy is not directly applicable to the study of non-coding DNA, as the non-coding sequence span varies greatly among different gene loci in the human genome and longer loci have a higher likelihood of being selected purely by chance. Therefore, conclusions involving the function of non-coding elements that are drawn based on the annotation of neighboring genes are often biased. We assessed the systematic bias in several particular Gene Ontology (GO) categories using the standard hypergeometric test, by randomly sampling non-coding elements from the human genome and inferring their function based on the functional annotation of the closest genes. While no category is expected to occur significantly over- or underrepresented for a random selection of elements, categories such as 'cell adhesion', 'nervous system development' and 'transcription factor activities' appeared to be systematically overrepresented, while others, such as 'olfactory receptor activity', were underrepresented. RESULTS: Our results suggest that functional inference for non-coding elements using gene annotation databases requires a special correction. We introduce a set of correction coefficients for the probabilities of the GO categories that accounts for the variability in the length of the non-coding DNA across different loci and effectively eliminates the ascertainment bias from the functional characterization of non-coding elements. Our approach can be easily generalized to any other gene annotation database.
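The length bias described in this abstract can be illustrated with a small Python sketch. It is not the authors' method and does not reproduce their correction coefficients; the gene names, locus lengths, and the helper `expected_fraction` are hypothetical and chosen only for illustration. `hypergeom_sf` is the standard upper-tail hypergeometric probability, and the second function compares the naive expectation (every gene equally likely) with a length-weighted expectation (a random element lands in a locus in proportion to its non-coding span).

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k annotated genes among n genes
    drawn without replacement from N genes, K of which carry the annotation."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def expected_fraction(noncoding_len, annotated, weight_by_length):
    """Expected share of random elements whose closest gene is annotated."""
    if weight_by_length:
        total = sum(noncoding_len.values())
        return sum(noncoding_len[g] for g in annotated) / total
    return len(annotated) / len(noncoding_len)


# Hypothetical loci: non-coding span (kb) assigned to each gene.
noncoding_len = {"geneA": 500, "geneB": 300, "geneC": 100, "geneD": 60, "geneE": 40}
annotated = {"geneA", "geneB"}  # toy GO category that happens to sit in long loci

naive = expected_fraction(noncoding_len, annotated, weight_by_length=False)
weighted = expected_fraction(noncoding_len, annotated, weight_by_length=True)
print(naive, weighted)               # the naive background understates the category
print(hypergeom_sf(2, 5, 2, 3))      # upper-tail p-value under the naive background
```

In this toy setting the category looks enriched under the naive background simply because its genes own most of the non-coding sequence, which is the kind of ascertainment bias the abstract argues must be corrected.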
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
|
18
|
Xie HB, Irwin DM, Zhang YP. Evolution of conserved secondary structures and their function in transcriptional regulation networks. BMC Genomics 2008; 9:520. [PMID: 18976501 PMCID: PMC2584662 DOI: 10.1186/1471-2164-9-520] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 06/05/2008] [Accepted: 11/02/2008] [Indexed: 12/12/2022] Open
Abstract
Background: Many conserved secondary structures have been identified within conserved elements in the human genome, but only a small fraction of them are known to be functional RNAs. The evolutionary variations of these conserved secondary structures in human populations and their biological functions have not been fully studied. Results: We searched for polymorphisms within conserved secondary structures and identified a number of SNPs within these elements even though they are highly conserved among species. The density of SNPs in conserved secondary structures is about 65% of that of their flanking, non-conserved, sequences. Classification of sites as stems or as loops/bulges revealed that the density of SNPs in stems is about 62% of that found in loops/bulges. Analysis of derived allele frequency data indicates that sites in stems are under stronger evolutionary constraint than sites in loops/bulges. Intergenic conserved secondary structures tend to associate with transcription factor-encoding genes with genetic distance being the measure of regulator-gene associations. A substantial fraction of intergenic conserved secondary structures overlap characterized binding sites for multiple transcription factors. Conclusion: Strong purifying selection implies that secondary structures are probably important carriers of biological functions for conserved sequences. The overlap between intergenic conserved secondary structures and transcription factor binding sites further suggests that intergenic conserved secondary structures have essential roles in directing gene expression in transcriptional regulation networks.
Affiliation(s)
- Hai-Bing Xie
- State Key Laboratory of Genetic Resource and Evolution, Kunming Institute of Zoology, Kunming 650223, PR China.
|
19
|
Rose D, Hertel J, Reiche K, Stadler PF, Hackermüller J. NcDNAlign: plausible multiple alignments of non-protein-coding genomic sequences. Genomics 2008; 92:65-74. [PMID: 18511233 DOI: 10.1016/j.ygeno.2008.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/21/2007] [Revised: 04/09/2008] [Accepted: 04/09/2008] [Indexed: 10/22/2022]
Abstract
Genome-wide multiple sequence alignments (MSAs) are a necessary prerequisite for an increasingly diverse collection of comparative genomic approaches. Here we present a versatile method that generates high-quality MSAs for non-protein-coding sequences. The NcDNAlign pipeline combines pairwise BLAST alignments to create initial MSAs, which are then locally improved and trimmed. The program is optimized for speed and hence is particularly well-suited to pilot studies. We demonstrate the practical use of NcDNAlign in three case studies: the search for ncRNAs in gammaproteobacteria and the analysis of conserved noncoding DNA in nematodes and teleost fish, in the latter case focusing on the fate of duplicated ultra-conserved regions. Compared to the currently widely used genome-wide alignment program TBA, our program results in a 20- to 30-fold reduction of CPU time necessary to generate gammaproteobacterial alignments. A showcase application of bacterial ncRNA prediction based on alignments of both algorithms results in similar sensitivity, false discovery rates, and up to 100 putatively novel ncRNA structures. Similar findings hold for our application of NcDNAlign to the identification of ultra-conserved regions in nematodes and teleosts. Both approaches yield conserved sequences of unknown function, result in novel evolutionary insights into conservation patterns among these genomes, and manifest the benefits of an efficient and reliable genome-wide alignment package. The software is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/NcDNAlign/.
Affiliation(s)
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany
|
20
|
Recent papers on zebrafish and other aquarium fish models. Zebrafish 2008; 1:305-11. [PMID: 18248239 DOI: 10.1089/zeb.2004.1.305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
|
21
|
Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, Garber M, Gentles AJ, Goodstadt L, Heger A, Jurka J, Kamal M, Mauceli E, Searle SMJ, Sharpe T, Baker ML, Batzer MA, Benos PV, Belov K, Clamp M, Cook A, Cuff J, Das R, Davidow L, Deakin JE, Fazzari MJ, Glass JL, Grabherr M, Greally JM, Gu W, Hore TA, Huttley GA, Kleber M, Jirtle RL, Koina E, Lee JT, Mahony S, Marra MA, Miller RD, Nicholls RD, Oda M, Papenfuss AT, Parra ZE, Pollock DD, Ray DA, Schein JE, Speed TP, Thompson K, VandeBerg JL, Wade CM, Walker JA, Waters PD, Webber C, Weidman JR, Xie X, Zody MC, Graves JAM, Ponting CP, Breen M, Samollow PB, Lander ES, Lindblad-Toh K. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature 2007; 447:167-77. [PMID: 17495919 DOI: 10.1038/nature05805] [Citation(s) in RCA: 508] [Impact Index Per Article: 29.9] [Received: 12/05/2006] [Accepted: 04/03/2007] [Indexed: 12/15/2022]
Abstract
We report a high-quality draft of the genome sequence of the grey, short-tailed opossum (Monodelphis domestica). As the first metatherian ('marsupial') species to be sequenced, the opossum provides a unique perspective on the organization and evolution of mammalian genomes. Distinctive features of the opossum chromosomes provide support for recent theories about genome evolution and function, including a strong influence of biased gene conversion on nucleotide sequence composition, and a relationship between chromosomal characteristics and X chromosome inactivation. Comparison of opossum and eutherian genomes also reveals a sharp difference in evolutionary innovation between protein-coding and non-coding functional elements. True innovation in protein-coding genes seems to be relatively rare, with lineage-specific differences being largely due to diversification and rapid turnover in gene families involved in environmental interactions. In contrast, about 20% of eutherian conserved non-coding elements (CNEs) are recent inventions that postdate the divergence of Eutheria and Metatheria. A substantial proportion of these eutherian-specific CNEs arose from sequence inserted by transposable elements, pointing to transposons as a major creative force in the evolution of mammalian gene regulation.
Affiliation(s)
- Tarjei S Mikkelsen
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.
|
22
|
Pennacchio LA, Loots GG, Nobrega MA, Ovcharenko I. Predicting tissue-specific enhancers in the human genome. Genome Res 2007; 17:201-11. [PMID: 17210927 PMCID: PMC1781352 DOI: 10.1101/gr.5972507] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.0] [Indexed: 12/28/2022]
Abstract
Determining how transcriptional regulatory signals are encoded in vertebrate genomes is essential for understanding the origins of multicellular complexity; yet the genetic code of vertebrate gene regulation remains poorly understood. In an attempt to elucidate this code, we synergistically combined genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to define sequence signatures characteristic of candidate tissue-specific enhancers in the human genome. We applied this strategy to microarray-based gene expression profiles from 79 human tissues and identified 7187 candidate enhancers that defined their flanking gene expression, the majority of which were located outside of known promoters. We cross-validated this method for its ability to de novo predict tissue-specific gene expression and confirmed its reliability in 57 of the 79 available human tissues, with an average precision in enhancer recognition ranging from 32% to 63% and a sensitivity of 47%. We used the sequence signatures identified by this approach to successfully assign tissue-specific predictions to approximately 328,000 human-mouse conserved noncoding elements in the human genome. By overlapping these genome-wide predictions with a data set of enhancers validated in vivo, in transgenic mice, we were able to confirm our results with a 28% sensitivity and 50% precision. These results indicate the power of combining complementary genomic data sets as an initial computational foray into a global view of tissue-specific gene regulation in vertebrates.
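The sensitivity and precision figures quoted in this abstract follow the usual definitions, which the short Python sketch below makes explicit. It assumes, purely for illustration, that predicted and validated enhancers can be compared as sets of identifiers; the paper's actual comparison is based on genomic overlap and is not reproduced here. The function name and element IDs are hypothetical.

```python
def sensitivity_and_precision(predicted, validated):
    """Sensitivity = TP / validated positives; precision = TP / predictions."""
    predicted, validated = set(predicted), set(validated)
    true_pos = len(predicted & validated)
    sensitivity = true_pos / len(validated) if validated else float("nan")
    precision = true_pos / len(predicted) if predicted else float("nan")
    return sensitivity, precision


# Hypothetical element identifiers, for illustration only.
predicted = ["e1", "e2", "e3", "e4"]
validated = ["e1", "e2", "e5", "e6", "e7", "e8", "e9", "e10"]

sens, prec = sensitivity_and_precision(predicted, validated)
print(sens, prec)
```

A method can therefore have moderate precision while still missing most validated elements, which is the pattern the reported 28% sensitivity and 50% precision describe.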
Affiliation(s)
- Len A. Pennacchio
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- U.S. Department of Energy, Joint Genome Institute, Walnut Creek, California 94598, USA
- Gabriela G. Loots
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Marcelo A. Nobrega
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
- Ivan Ovcharenko
- U.S. Department of Energy, Joint Genome Institute, Walnut Creek, California 94598, USA
- Computation Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Corresponding author. Fax (925) 422-2099
|
23
|
Loots G, Ovcharenko I. ECRbase: database of evolutionary conserved regions, promoters, and transcription factor binding sites in vertebrate genomes. Bioinformatics 2006; 23:122-4. [PMID: 17090579 DOI: 10.1093/bioinformatics/btl546] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.7] [Indexed: 11/14/2022] Open
Abstract
Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. We have created a database of evolutionary conserved regions (ECRs) in vertebrate genomes, entitled ECRbase, which is constructed from a collection of whole-genome alignments produced by the ECR Browser. ECRbase features a database of syntenic blocks that recapitulate the evolution of rearrangements in vertebrates and a comprehensive collection of promoters in all vertebrate genomes generated using multiple sources of gene annotation. The database also contains a collection of annotated transcription factor binding sites (TFBSs) in evolutionary conserved and promoter elements. ECRbase currently includes human, rhesus macaque, dog, opossum, rat, mouse, chicken, frog, zebrafish and fugu genomes. It is freely accessible at http://ecrbase.dcode.org.
|
24
|
Prabhakar S, Poulin F, Shoukry M, Afzal V, Rubin EM, Couronne O, Pennacchio LA. Close sequence comparisons are sufficient to identify human cis-regulatory elements. Genome Res 2006; 16:855-63. [PMID: 16769978 PMCID: PMC1484452 DOI: 10.1101/gr.4717506] [Citation(s) in RCA: 156] [Impact Index Per Article: 8.7] [Indexed: 02/04/2023]
Abstract
Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points.
Affiliation(s)
- Shyam Prabhakar
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
- Corresponding author. Fax (510) 486-4229
- Francis Poulin
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Malak Shoukry
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Veena Afzal
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Edward M. Rubin
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
- Olivier Couronne
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
- Len A. Pennacchio
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- U.S. Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA
- Corresponding author. Fax (510) 486-4229
|
25
|
Richler E, Reichert JG, Buxbaum JD, McInnes LA. Autism and ultraconserved non-coding sequence on chromosome 7q. Psychiatr Genet 2006; 16:19-23. [PMID: 16395125 DOI: 10.1097/01.ypg.0000180683.18665.ef] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
Abstract
OBJECTIVE: Autism has been linked to a broad region on chromosome 7q that contains a large number of genes involved in transcription and development. This region is also enriched for ultraconserved non-coding elements, defined as human-rodent sequences that are 100% aligned over ≥200 base pairs, which have a high likelihood of being functional. Therefore, as only a few rare coding variants have been detected in the autism candidate genes on 7q examined to date, we decided to screen these ultraconserved elements for possible autism susceptibility alleles. METHODS: We used denaturing high-performance liquid chromatography and DNA sequencing to perform variant detection in a total of 146 cases with autism, 96 from the Autism Genetic Resource Exchange and 50 from the Central Valley of Costa Rica, as well as 124 controls from the Polymorphism Discovery Resource Panel. We screened 10 consecutive ultraconserved elements in, or flanking, the genes DLX5/6, AUTS2 and FOXP2 on chromosome 7q. RESULTS: Although we did find several rare variants in autism cases that were not present in controls, we also observed rare variants present in controls and not cases. The most common variant occurred in controls at a frequency of 3.3%. Interestingly, two ultraconserved elements each harbored three independent variants and one ultraconserved element harbored two independent variants, suggesting that ultraconservation is maintained chiefly by a decreased tendency toward fixation, rather than a significantly lower mutation rate. CONCLUSIONS: Our results show that these sequences are unlikely to harbor major autism susceptibility alleles.
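The definition of an ultraconserved element used in this abstract (100% identity over at least 200 base pairs) can be expressed as a simple scan over a pairwise alignment. The Python sketch below is an illustration under stated assumptions, not the procedure used in the study: it assumes two pre-aligned, equal-length, uppercase sequences with `-` marking gaps, and the function name `identical_runs` is hypothetical.

```python
def identical_runs(aligned_a, aligned_b, min_len=200):
    """Return (start, end) column spans, end exclusive, where the two aligned
    sequences are identical and gap-free for at least min_len columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None  # first column of the current identical stretch, if any
    for i, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        match = a == b and a != "-"
        if match and start is None:
            start = i
        elif not match and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a stretch that extends to the end of the alignment.
    if start is not None and len(aligned_a) - start >= min_len:
        runs.append((start, len(aligned_a)))
    return runs


# Toy alignment with a lowered threshold so the behaviour is visible.
print(identical_runs("ACGTACGT", "ACGTTCGT", min_len=3))
```

With the default `min_len=200`, only stretches meeting the abstract's length criterion are reported; the toy call lowers the threshold solely to keep the example short.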
Affiliation(s)
- Esther Richler
- Department of Psychiatry, Mount Sinai School of Medicine, New York, USA
|
26
|
Stone EA, Cooper GM, Sidow A. Trade-offs in detecting evolutionarily constrained sequence by comparative genomics. Annu Rev Genomics Hum Genet 2005; 6:143-64. [PMID: 16124857 DOI: 10.1146/annurev.genom.6.080604.162146] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
As whole-genome sequencing efforts extend beyond more traditional model organisms to include a deep diversity of species, comparative genomic analyses will be further empowered to reveal insights into the human genome and its evolution. The discovery and annotation of functional genomic elements is a necessary step toward a detailed understanding of our biology, and sequence comparisons have proven to be an integral tool for that task. This review is structured to broadly reflect the statistical challenges in discriminating these functional elements from the bulk of the genome that has evolved neutrally. Specifically, we review the comparative genomics literature in terms of specificity, sensitivity, and phylogenetic scope, as well as the trade-offs that relate these factors in standard analyses. We consider the impact of an expanding diversity of orthologous sequences on our ability to resolve functional elements. This impact is assessed through both recent comparative analyses of deep alignments and mathematical modeling.
Affiliation(s)
- Eric A Stone
- Department of Statistics, Stanford University, Stanford, California 94305, USA
|
27
|
Shin JT, Priest JR, Ovcharenko I, Ronco A, Moore RK, Burns CG, MacRae CA. Human-zebrafish non-coding conserved elements act in vivo to regulate transcription. Nucleic Acids Res 2005; 33:5437-45. [PMID: 16179648 PMCID: PMC1236720 DOI: 10.1093/nar/gki853] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.7] [Indexed: 11/14/2022] Open
Abstract
Whole genome comparisons of distantly related species effectively predict biologically important sequences--core genes and cis-acting regulatory elements (REs)--but require experimentation to verify biological activity. To examine the efficacy of comparative genomics in identification of active REs from anonymous, non-coding (NC) sequences, we generated a novel alignment of the human and draft zebrafish genomes, and contrasted this set to existing human and fugu datasets. We tested the transcriptional regulatory potential of candidate sequences using two in vivo assays. Strict selection of non-genic elements which are deeply conserved in vertebrate evolution identifies 1744 core vertebrate REs in human and two fish genomes. We tested 16 elements in vivo for cis-acting gene regulatory properties using zebrafish transient transgenesis and found that 10 (63%) strongly modulate tissue-specific expression of a green fluorescent protein reporter vector. We also report a novel quantitative enhancer assay with potential for increased throughput based on normalized luciferase activity in vivo. This complementary system identified 11 elements (69%; including 9 of 10 GFP-confirmed elements) with cis-acting function. Together, these data support the utility of comparative genomics of distantly related vertebrate species to identify REs and provide a scalable, in vivo quantitative assay to define functional activity of candidate REs.
Affiliation(s)
- Jordan T Shin
- Cardiovascular Research Center and Cardiology Division, Massachusetts General Hospital and Harvard Medical School, Charlestown, MA 02129, USA.
|
28
|
Bejerano G, Siepel AC, Kent WJ, Haussler D. Computational screening of conserved genomic DNA in search of functional noncoding elements. Nat Methods 2005; 2:535-45. [PMID: 16170870 DOI: 10.1038/nmeth0705-535] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 01/29/2023]
Affiliation(s)
- Gill Bejerano
- Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California 95064, USA.
|
29
|
Vavouri T, Elgar G. Prediction of cis-regulatory elements using binding site matrices--the successes, the failures and the reasons for both. Curr Opin Genet Dev 2005; 15:395-402. [PMID: 15950456 DOI: 10.1016/j.gde.2005.05.002] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Received: 03/07/2005] [Accepted: 05/23/2005] [Indexed: 01/02/2023]
Abstract
Protein-DNA interactions control many aspects of animal development and cellular responses to the environment. Although profiling of individual transcription factor binding sites is not a reliable guide for predicting the position of cis-regulatory elements in large genomes, modelling the evolution and the organization of regulatory elements has provided enough information to make some successful predictions. For vertebrate genomes, the field is limited by the lack of sufficient experimental data upon which to build reliable models. Nonetheless, a combination of experimental, computational and comparative data is likely to reveal aspects of complex regulatory networks in vertebrates, just as it has already done for simple eukaryotic genomes.
Affiliation(s)
- Tanya Vavouri
- Comparative Genomics Group, MRC Rosalind Franklin Centre for Genomics Research, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
|
30
|
Abstract
Synonymous gene regulation, defined by regulatory elements driving shared temporal and/or spatial aspects of gene expression, is most probably predicated on genomic elements that contain similar modules of certain transcription factor binding sites (TFBS). We have developed a method to scan vertebrate genomes for evolutionary conserved modules of TFBS in a predefined configuration, and created a tool, named SynoR that identifies synonymous regulatory elements (SREs) in vertebrate genomes. SynoR performs de novo identification of SREs utilizing known patterns of TFBS in active regulatory elements (REs) as seeds for genome scans. Layers of multiple-species conservation allow the use of differential phylogenetic sequence conservation filters in search of SREs and the results are displayed such as to provide an extensive annotation of the genes containing the detected REs. Gene Ontology categories are utilized to further functionally classify the identified genes, and integrated GNF Expression Atlas 2 data allow the cataloging of tissue-specificities of the predicted SREs. SynoR is publicly available at .
Affiliation(s)
- Ivan Ovcharenko
- Energy, Environment, Biology, and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA.
|
31
|
Abstract
Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.
Affiliation(s)
- Ivan Ovcharenko
- Energy, Environment, Biology, and Institutional Computing Division, Lawrence Livermore National Laboratory, 7000 East Avenue, L-441, Livermore, CA 94550, USA
- To whom correspondence should be addressed: Tel: +1 925 422 4723; Fax: +1 925 422 2099;
|
32
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2005. [PMCID: PMC2447482 DOI: 10.1002/cfg.421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
33
|
Ovcharenko I, Loots GG, Nobrega MA, Hardison RC, Miller W, Stubbs L. Evolution and functional classification of vertebrate gene deserts. Genome Res 2004; 15:137-45. [PMID: 15590943 PMCID: PMC540279 DOI: 10.1101/gr.3015505] [Citation(s) in RCA: 192] [Impact Index Per Article: 9.6] [Indexed: 11/24/2022]
Abstract
Large tracts of the human genome, known as gene deserts, are devoid of protein-coding genes. Dichotomy in their level of conservation with chicken separates these regions into two distinct categories, stable and variable. The separation is not caused by differences in rates of neutral evolution but instead appears to be related to different biological functions of stable and variable gene deserts in the human genome. Gene Ontology categories of the adjacent genes are strongly biased toward transcriptional regulation and development for the stable gene deserts, and toward distinctively different functions for the variable gene deserts. Stable gene deserts resist chromosomal rearrangements and appear to harbor multiple distant regulatory elements physically linked to their neighboring genes, with the linearity of conservation invariant throughout vertebrate evolution.
Affiliation(s)
- Ivan Ovcharenko
- Energy, Environment, Biology, and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
|
34
|
Nóbrega MA, Zhu Y, Plajzer-Frick I, Afzal V, Rubin EM. Megabase deletions of gene deserts result in viable mice. Nature 2004; 431:988-93. [PMID: 15496924 DOI: 10.1038/nature03022] [Citation(s) in RCA: 129] [Impact Index Per Article: 6.5] [Received: 06/02/2004] [Accepted: 09/08/2004] [Indexed: 12/24/2022]
Abstract
The functional importance of the roughly 98% of mammalian genomes not corresponding to protein coding sequences remains largely undetermined. Here we show that some large-scale deletions of the non-coding DNA referred to as gene deserts can be well tolerated by an organism. We deleted two large non-coding intervals, 1,511 kilobases and 845 kilobases in length, from the mouse genome. Viable mice homozygous for the deletions were generated and were indistinguishable from wild-type littermates with regard to morphology, reproductive fitness, growth, longevity and a variety of parameters assaying general homeostasis. Further detailed analysis of the expression of multiple genes bracketing the deletions revealed only minor expression differences in homozygous deletion and wild-type mice. Together, the two deleted segments harbour 1,243 non-coding sequences conserved between humans and rodents (more than 100 base pairs, 70% identity). Some of the deleted sequences might encode for functions unidentified in our screen; nonetheless, these studies further support the existence of potentially 'disposable DNA' in the genomes of mammals.
|