1
Dogan N, Wu W, Morrissey CS, Chen KB, Stonestrom A, Long M, Keller CA, Cheng Y, Jain D, Visel A, Pennacchio LA, Weiss MJ, Blobel GA, Hardison RC. Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility. Epigenetics Chromatin 2015; 8:16. [PMID: 25984238 PMCID: PMC4432502 DOI: 10.1186/s13072-015-0009-5] [Citation(s) in RCA: 89] [Impact Index Per Article: 9.9] [Received: 01/23/2015] [Accepted: 04/02/2015] [Indexed: 12/12/2022]
Abstract
Background Regulated gene expression controls organismal development, and variation in regulatory patterns has been implicated in complex traits. Thus accurate prediction of enhancers is important for further understanding of these processes. Genome-wide measurement of epigenetic features, such as histone modifications and occupancy by transcription factors, is improving enhancer predictions, but the contribution of these features to prediction accuracy is not known. Given the importance of the hematopoietic transcription factor TAL1 for erythroid gene activation, we predicted candidate enhancers based on genomic occupancy by TAL1 and measured their activity. Contributions of multiple features to enhancer prediction were evaluated based on the results of these and other studies. Results TAL1-bound DNA segments were active enhancers at a high rate both in transient transfections of cultured cells (39 of 79, or 56%) and transgenic mice (43 of 66, or 65%). The level of binding signal for TAL1 or GATA1 did not help distinguish TAL1-bound DNA segments as active versus inactive enhancers, nor did the density of regulation-related histone modifications. A meta-analysis of results from this and other studies (273 tested predicted enhancers) showed that the presence of TAL1, GATA1, EP300, SMAD1, H3K4 methylation, H3K27ac, and CAGE tags at DNase hypersensitive sites gave the most accurate predictors of enhancer activity, with a success rate over 80% and a median threefold increase in activity. Chromatin accessibility assays and the histone modifications H3K4me1 and H3K27ac were sensitive for finding enhancers, but they have high false positive rates unless transcription factor occupancy is also included. Conclusions Occupancy by key transcription factors such as TAL1, GATA1, SMAD1, and EP300, along with evidence of transcription, improves the accuracy of enhancer predictions based on epigenetic features. 
Electronic supplementary material The online version of this article (doi:10.1186/s13072-015-0009-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Nergiz Dogan
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA ; Bioinformatics Core, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218 USA
- Christapher S Morrissey
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Kuan-Bei Chen
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Aaron Stonestrom
- Division of Hematology, The Children's Hospital of Philadelphia, 3401 Civic Center Boulevard, Philadelphia, PA 19104 USA ; Perelman School of Medicine at the University of Pennsylvania, 415 Curie Boulevard, Philadelphia, PA 19104 USA
- Maria Long
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Yong Cheng
- Department of Genetics, Mail Stop-5120, Stanford University, Stanford, CA 94305 USA
- Deepti Jain
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
- Axel Visel
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Mailstop 84-171, Berkeley, CA 94720 USA ; DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598 USA
- Len A Pennacchio
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Mailstop 84-171, Berkeley, CA 94720 USA ; DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598 USA
- Mitchell J Weiss
- Department of Hematology, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105 USA
- Gerd A Blobel
- Division of Hematology, The Children's Hospital of Philadelphia, 3401 Civic Center Boulevard, Philadelphia, PA 19104 USA ; Perelman School of Medicine at the University of Pennsylvania, 415 Curie Boulevard, Philadelphia, PA 19104 USA
- Ross C Hardison
- Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, 304 Wartik Laboratory, University Park, PA 16802 USA
2
Suryamohan K, Halfon MS. Identifying transcriptional cis-regulatory modules in animal genomes. Wiley Interdisciplinary Reviews. Developmental Biology 2015; 4:59-84. [PMID: 25704908 PMCID: PMC4339228 DOI: 10.1002/wdev.168] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 09/24/2014] [Revised: 11/04/2014] [Accepted: 11/16/2014] [Indexed: 11/08/2022]
Abstract
Gene expression is regulated through the activity of transcription factors (TFs) and chromatin-modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods have led to an explosion of both computational and empirical methods for CRM discovery in model and nonmodel organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against TFs or histone post-translational modifications, identification of nucleosome-depleted 'open' chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted TF-binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. For further resources related to this article, please visit the WIREs website.
CONFLICT OF INTEREST The authors have declared no conflicts of interest for this article.
Affiliation(s)
- Kushal Suryamohan
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY 14203, USA
- Marc S. Halfon
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY 14203, USA
- Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
3
Burgess D, Freeling M. The most deeply conserved noncoding sequences in plants serve similar functions to those in vertebrates despite large differences in evolutionary rates. The Plant Cell 2014; 26:946-61. [PMID: 24681619 PMCID: PMC4001403 DOI: 10.1105/tpc.113.121905] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 05/12/2023]
Abstract
In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing-associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates.
4
Abstract
In its first production phase, The ENCODE Project Consortium (ENCODE) has generated thousands of genome-scale data sets, resulting in a genomic “parts list” that encompasses transcripts, sites of transcription factor binding, and other functional features that now number in the millions of distinct elements. These data are reshaping many long-held beliefs concerning the information content of the human and other complex genomes, including the very definition of the gene. Here I discuss and place in context many of the leading findings of ENCODE, as well as trends that are shaping the generation and interpretation of ENCODE data. Finally, I consider prospects for the future, including maximizing the accuracy, completeness, and utility of ENCODE data for the community.
Affiliation(s)
- John A Stamatoyannopoulos
- Departments of Genome Sciences and Medicine, University of Washington School of Medicine, Seattle, Washington 98195, USA.
5
Abstract
Differential gene expression is the fundamental mechanism underlying animal development and cell differentiation. However, it is a challenge to identify comprehensively and accurately the DNA sequences that are required to regulate gene expression: namely, cis-regulatory modules (CRMs). Three major features, either singly or in combination, are used to predict CRMs: clusters of transcription factor binding site motifs, non-coding DNA that is under evolutionary constraint and biochemical marks associated with CRMs, such as histone modifications and protein occupancy. The validation rates for predictions indicate that identifying diagnostic biochemical marks is the most reliable method, and understanding is enhanced by the analysis of motifs and conservation patterns within those predicted CRMs.
6
Abstract
The number of known mutations in human nuclear genes, underlying or associated with human inherited disease, has now exceeded 100,000 in more than 3700 different genes (Human Gene Mutation Database). However, for a variety of reasons, this figure is likely to represent only a small proportion of the clinically relevant genetic variants that remain to be identified in the human genome (the 'mutome'). With the advent of next-generation sequencing, we are currently witnessing a revolution in medical genetics. In particular, whole-genome sequencing (WGS) has the potential to identify all disease-causing or disease-associated DNA variants in a given individual. Here, we use examples of recent advances in our understanding of mutational/pathogenic mechanisms to guide our thinking about possible locations outwith gene-coding sequences for those disease-causing or disease-associated variants that are likely so often to have been overlooked because of the inadequacy of current mutation screening protocols. Such considerations are important not only for improving mutation-screening strategies but also for enhancing the interpretation of findings derived from genome-wide association studies, whole-exome sequencing and WGS. An improved understanding of the human mutome will not only lead to the development of improved diagnostic testing procedures but should also improve our understanding of human genome biology.
Affiliation(s)
- J M Chen
- Etablissement Français du Sang (EFS) - Bretagne, Brest, France.
7
Robyr D, Friedli M, Gehrig C, Arcangeli M, Marin M, Guipponi M, Farinelli L, Barde I, Verp S, Trono D, Antonarakis SE. Chromosome conformation capture uncovers potential genome-wide interactions between human conserved non-coding sequences. PLoS One 2011; 6:e17634. [PMID: 21408183 PMCID: PMC3049788 DOI: 10.1371/journal.pone.0017634] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 12/03/2010] [Accepted: 02/04/2011] [Indexed: 02/02/2023]
Abstract
Comparative analyses of various mammalian genomes have identified numerous conserved non-coding (CNC) DNA elements that display striking conservation among species, suggesting that they have maintained specific functions throughout evolution. CNC function remains poorly understood, although recent studies have identified a role in gene regulation. We hypothesized that the identification of genomic loci that interact physically with CNCs would provide information on their functions. We have used circular chromosome conformation capture (4C) to characterize interactions of 10 CNCs from human chromosome 21 in K562 cells. The data provide evidence that CNCs are capable of interacting with loci that are enriched for CNCs. The number of trans interactions varies among CNCs; some show interactions with many loci, while others interact with few. Some of the tested CNCs are capable of driving the expression of a reporter gene in the mouse embryo, and associate with the oligodendrocyte genes OLIG1 and OLIG2. Our results underscore the power of chromosome conformation capture for the identification of targets of functional DNA elements and raise the possibility that CNCs exert their functions by physical association with defined genomic regions enriched in CNCs. These CNC-CNC interactions may in part explain their stringent conservation as a group of regulatory sequences.
Affiliation(s)
- Daniel Robyr
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- * E-mail: (SEA); (DR)
- Marc Friedli
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- Corinne Gehrig
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- Mélanie Arcangeli
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- Marilyn Marin
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- Michel Guipponi
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- Isabelle Barde
- Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Sonia Verp
- Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Didier Trono
- Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva, Switzerland
- * E-mail: (SEA); (DR)
8
A systematic enhancer screen using lentivector transgenesis identifies conserved and non-conserved functional elements at the Olig1 and Olig2 locus. PLoS One 2010; 5:e15741. [PMID: 21206754 PMCID: PMC3012086 DOI: 10.1371/journal.pone.0015741] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 08/22/2010] [Accepted: 11/23/2010] [Indexed: 01/22/2023]
Abstract
Finding sequences that control expression of genes is central to understanding genome function. Previous studies have used evolutionary conservation as an indicator of regulatory potential. Here, we present a method for the unbiased in vivo screen of putative enhancers in large DNA regions, using the mouse as a model. We cloned a library of 142 overlapping fragments from a 200 kb-long murine BAC in a lentiviral vector expressing LacZ from a minimal promoter, and used the resulting vectors to infect fertilized murine oocytes. LacZ staining of E11 embryos obtained by first using the vectors in pools and then testing individual candidates led to the identification of 3 enhancers, only one of which shows significant evolutionary conservation. In situ hybridization and 3C/4C experiments suggest that this enhancer, which is active in the neural tube and posterior diencephalon, influences the expression of the Olig1 and/or Olig2 genes. This work provides a new approach for the large-scale in vivo screening of transcriptional regulatory sequences, and further demonstrates that evolutionary conservation alone seems too limiting a criterion for the identification of enhancers.
9
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
10
Abstract
The plasma concentration of fibrinogen varies in the healthy human population between 1.5 and 3.5 g/L. Understanding the basis of this variability has clinical importance because elevated fibrinogen levels are associated with increased cardiovascular disease risk. To identify novel regulatory elements involved in the control of fibrinogen expression, we used sequence conservation and in silico-predicted regulatory potential to select 14 conserved noncoding sequences (CNCs) within the conserved block of synteny containing the fibrinogen locus. The regulatory potential of each CNC was tested in vitro using a luciferase reporter gene assay in fibrinogen-expressing hepatoma cell lines (HuH7 and HepG2). Four potential enhancers were tested for their ability to direct enhanced green fluorescent protein expression in zebrafish embryos. CNC12, a sequence equidistant from the human fibrinogen alpha and beta chain genes, activates strong liver enhanced green fluorescent protein expression in injected embryos and their transgenic progeny. A transgenic assay in embryonic day 14.5 mouse embryos confirmed the ability of CNC12 to activate transcription in the liver. While additional experiments are necessary to prove the role of CNC12 in the regulation of fibrinogen, our study reveals a novel regulatory element in the fibrinogen locus that is active in the liver and may contribute to variable fibrinogen expression in humans.
11
Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD, Chuzhanova N, Krawczak M, Kehrer-Sawatzki H, Stenson PD. Genes, mutations, and human inherited disease at the dawn of the age of personalized genomics. Hum Mutat 2010; 31:631-55. [PMID: 20506564 DOI: 10.1002/humu.21260] [Citation(s) in RCA: 117] [Impact Index Per Article: 8.4] [Indexed: 12/12/2022]
Abstract
The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located? To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
12
Abstract
Determining the timing and molecular repertoire responsible for gene expression is fundamental to understanding a gene's function. Heritable differences in this character are increasingly regarded as explanatory for complex and common traits. For many known trait-predisposing genes, studies have sought to elucidate the associated logic behind gene regulation. However, there exist many challenges in deciphering these mechanisms. Among them, it is recognized that we have limited understanding of regulatory complexity, the current models of gene regulation have low specificity and any gene's regulatory logic is dependent on biological context. Addressing these limitations and defining the regulatory genome is an ongoing challenge for molecular biology. We discuss current efforts to define and annotate the regulatory genome by focusing on curation and text-mining activities. We further highlight the type of information and curation process for describing regulatory elements within the ORegAnno database (www.oreganno.org) and how the general standards for such information are changing.
13
Maia AT, Spiteri I, Lee AJX, O'Reilly M, Jones L, Caldas C, Ponder BAJ. Extent of differential allelic expression of candidate breast cancer genes is similar in blood and breast. Breast Cancer Res 2009; 11:R88. [PMID: 20003265 PMCID: PMC2815552 DOI: 10.1186/bcr2458] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 06/27/2009] [Revised: 11/10/2009] [Accepted: 12/10/2009] [Indexed: 12/31/2022]
Abstract
Introduction Normal gene expression variation is thought to play a central role in inter-individual variation and susceptibility to disease. Regulatory polymorphisms in cis-acting elements result in the unequal expression of alleles. Differential allelic expression (DAE) in heterozygote individuals could be used to develop a new approach to discover regulatory breast cancer susceptibility loci. As access to large numbers of fresh breast tissue to perform such studies is difficult, a suitable surrogate test tissue must be identified for future studies. Methods We measured differential allelic expression of 12 candidate genes possibly related to breast cancer susceptibility (BRCA1, BRCA2, C1qA, CCND3, EMSY, GPX1, GPX4, MLH3, MTHFR, NBS1, TP53 and TRXR2) in breast tissue (n = 40) and fresh blood (n = 170) of healthy individuals and EBV-transformed lymphoblastoid cells (n = 19). Differential allelic expression ratios were determined by Taqman assay. Ratio distributions were compared using t-test and Wilcoxon rank sum test, for mean ratios and variances respectively. Results We show that differential allelic expression is common among these 12 candidate genes and is comparable between breast and blood (fresh and transformed lymphoblasts) in a significant proportion of them. We found that eight out of nine genes with DAE in breast and fresh blood were comparable, as were 10 out of 11 genes between breast and transformed lymphoblasts. Conclusions Our findings support the use of differential allelic expression in blood as a surrogate for breast tissue in future studies on predisposition to breast cancer.
Affiliation(s)
- Ana-Teresa Maia
- Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre and Department of Oncology, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK.
14
D'haene B, Attanasio C, Beysen D, Dostie J, Lemire E, Bouchard P, Field M, Jones K, Lorenz B, Menten B, Buysse K, Pattyn F, Friedli M, Ucla C, Rossier C, Wyss C, Speleman F, De Paepe A, Dekker J, Antonarakis SE, De Baere E. Disease-causing 7.4 kb cis-regulatory deletion disrupting conserved non-coding sequences and their interaction with the FOXL2 promotor: implications for mutation screening. PLoS Genet 2009; 5:e1000522. [PMID: 19543368 PMCID: PMC2689649 DOI: 10.1371/journal.pgen.1000522] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 03/09/2009] [Accepted: 05/18/2009] [Indexed: 11/23/2022]
Abstract
To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular.
Author Summary
Long-range genetic control is an inherent feature of genes harbouring a highly complex spatiotemporal expression pattern, requiring a combined action of multiple cis-regulatory elements such as promoters, enhancers, and silencers. Consequently, disruption of the long-range genetic control of a target gene by genomic rearrangements of regulatory elements may lead to aberrant gene transcription and disease. To date, the contribution of mutated regulatory elements to human disease has not been studied frequently. Here, we explored the contribution of genetic changes in potentially cis-regulatory elements of the FOXL2 gene in blepharophimosis syndrome (BPES), a developmental monogenic condition of the eyelids and ovaries. We identified a de novo very subtle deletion of 7.4 kb causing BPES. Moreover, we studied the functional capacities and chromosome conformation of the deleted region in FOXL2 expressing cellular systems. Interestingly, the chromosome conformation analysis demonstrated the close proximity of the 7.4 kb deleted fragment and two other conserved regions with the FOXL2 core promoter, and the necessity of their integrity for correct FOXL2 expression. Finally, our study revealed the smallest distant deletion causing monogenic disease and emphasized the importance of mutation screening of cis-regulatory elements in human genetic disease.
Affiliation(s)
- Barbara D'haene
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Catia Attanasio
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Diane Beysen
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Josée Dostie
- Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Edmond Lemire
- Division of Medical Genetics, Royal University Hospital, Saskatoon, Saskatchewan, Canada
- Kristie Jones
- Department of Clinical Genetics, The Children's Hospital at Westmead, Westmead, Australia
- Birgit Lorenz
- Department of Ophthalmology, Justus-Liebig-University Giessen, Universitaetsklinikum Giessen und Marburg GmbH Giessen Campus, Giessen, Germany
- Björn Menten
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Karen Buysse
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Filip Pattyn
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Marc Friedli
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Catherine Ucla
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Colette Rossier
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Carine Wyss
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Frank Speleman
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Anne De Paepe
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Job Dekker
- Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Elfride De Baere
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
|
15
|
Schmidl C, Klug M, Boeld TJ, Andreesen R, Hoffmann P, Edinger M, Rehli M. Lineage-specific DNA methylation in T cells correlates with histone methylation and enhancer activity. Genome Res 2009; 19:1165-74. [PMID: 19494038 DOI: 10.1101/gr.091470.109] [Citation(s) in RCA: 180] [Impact Index Per Article: 12.0] [Indexed: 12/13/2022]
Abstract
DNA methylation participates in establishing and maintaining chromatin structures and regulates gene transcription during mammalian development and cellular differentiation. With few exceptions, research thus far has focused on gene promoters, and little is known about the extent, functional relevance, and regulation of cell type-specific DNA methylation at promoter-distal sites. Here, we present a comprehensive analysis of differential DNA methylation in human conventional CD4(+) T cells (Tconv) and CD4(+)CD25(+) regulatory T cells (Treg), cell types whose differentiation and function are known to be controlled by epigenetic mechanisms. Using a novel approach that is based on the separation of a genome into methylated and unmethylated fractions, we examined the extent of lineage-specific DNA methylation across whole gene loci. More than 100 differentially methylated regions (DMRs) were identified that are present mainly in cell type-specific genes (e.g., FOXP3, IL2RA, CTLA4, CD40LG, and IFNG) and show differential patterns of histone H3 lysine 4 methylation. Interestingly, the majority of DMRs were located at promoter-distal sites, and many of these areas harbor DNA methylation-dependent enhancer activity in reporter gene assays. Thus, our study provides a comprehensive, locus-wide analysis of lineage-specific methylation patterns in Treg and Tconv cells, links cell type-specific DNA methylation with histone methylation and regulatory function, and identifies a number of cell type-specific, CpG methylation-sensitive enhancers in immunologically relevant genes.
Affiliation(s)
- Christian Schmidl
- Department of Hematology, University Hospital Regensburg, 93042 Regensburg, Germany
|
16
|
Balanced translocations in mental retardation. Hum Genet 2009; 126:133-47. [PMID: 19347365 DOI: 10.1007/s00439-009-0661-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 01/17/2009] [Accepted: 03/23/2009] [Indexed: 12/13/2022]
Abstract
Over the past few decades, knowledge of the genetic defects causing mental retardation has increased dramatically. In this review, we discuss the importance of balanced chromosomal translocations in the identification of genes responsible for mental retardation. We present a database-search guided overview of balanced translocations identified in patients with mental retardation. We divide these into four categories: (1) balanced translocations that helped to identify a causative gene within a contiguous gene syndrome, (2) balanced translocations that led to the identification of a mental retardation gene confirmed by independent methods, (3) balanced translocations disrupting candidate genes that have not been confirmed by independent methods and (4) balanced translocations not reported to disrupt protein coding sequences. It can safely be concluded that balanced translocations have been instrumental in the identification of multiple genes that are involved in mental retardation. In addition, many more candidate genes were identified with a suspected but (as yet?) unconfirmed role in mental retardation. Some balanced translocations do not disrupt a protein coding gene, and in the light of recent findings concerning ncRNAs and ultra-conserved regions, such cases are worth further investigation, as they may lead to the discovery of novel disease mechanisms.
|