1
Butler LM, Hallström BM, Fagerberg L, Pontén F, Uhlén M, Renné T, Odeberg J. Analysis of Body-wide Unfractionated Tissue Data to Identify a Core Human Endothelial Transcriptome. Cell Syst 2016; 3:287-301.e3. [PMID: 27641958 DOI: 10.1016/j.cels.2016.08.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/01/2015] [Revised: 05/23/2016] [Accepted: 08/03/2016] [Indexed: 12/11/2022]
Abstract
Endothelial cells line blood vessels and regulate hemostasis, inflammation, and blood pressure. Proteins critical for these specialized functions tend to be predominantly expressed in endothelial cells across vascular beds. Here, we present a systems approach to identify a panel of human endothelial-enriched genes using global, body-wide transcriptomics data from 124 tissue samples from 32 organs. We identified known and unknown endothelial-enriched gene transcripts and used antibody-based profiling to confirm expression across vascular beds. The majority of identified transcripts could be detected in cultured endothelial cells from various vascular beds, and we observed maintenance of relative expression in early passage cells. In summary, we describe a widely applicable method to determine cell-type-specific transcriptome profiles in a whole-organism context, based on differential abundance across tissues. We identify potential vascular drug targets or endothelial biomarkers and highlight candidates for functional studies to increase understanding of the endothelium in health and disease.
Affiliation(s)
- Lynn Marie Butler
- Institute for Clinical Chemistry and Laboratory Medicine, University Medical Centre Hamburg-Eppendorf, 20246 Hamburg, Germany; Clinical Chemistry and Blood Coagulation, Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76 Stockholm, Sweden.
- Björn Mikael Hallström
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), 171 21 Stockholm, Sweden
- Linn Fagerberg
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), 171 21 Stockholm, Sweden
- Fredrik Pontén
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Mathias Uhlén
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), 171 21 Stockholm, Sweden
- Thomas Renné
- Institute for Clinical Chemistry and Laboratory Medicine, University Medical Centre Hamburg-Eppendorf, 20246 Hamburg, Germany; Clinical Chemistry and Blood Coagulation, Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76 Stockholm, Sweden
- Jacob Odeberg
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), 171 21 Stockholm, Sweden; Coagulation Unit, Centre for Hematology, Karolinska University Hospital, 171 76 Stockholm, Sweden
2
Xu T, Li B, Zhao M, Szulwach KE, Street RC, Lin L, Yao B, Zhang F, Jin P, Wu H, Qin ZS. Base-resolution methylation patterns accurately predict transcription factor bindings in vivo. Nucleic Acids Res 2015; 43:2757-66. [PMID: 25722376 PMCID: PMC4357735 DOI: 10.1093/nar/gkv151] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 12/27/2022] Open
Abstract
Detecting in vivo transcription factor (TF) binding is important for understanding gene regulatory circuitries. ChIP-seq is a powerful technique to empirically define TF binding in vivo. However, the multitude of distinct TFs makes genome-wide profiling for them all labor-intensive and costly. Algorithms for in silico prediction of TF binding have been developed, based mostly on histone modification or DNase I hypersensitivity data in conjunction with DNA motif and other genomic features. However, technical limitations of these methods prevent them from being applied broadly, especially in clinical settings. We conducted a comprehensive survey involving multiple cell lines, TFs, and methylation types and found that there are intimate relationships between TF binding and methylation level changes around the binding sites. Exploiting the connection between DNA methylation and TF binding, we proposed a novel supervised learning approach to predict TF-DNA interaction using data from base-resolution whole-genome methylation sequencing experiments. We devised beta-binomial models to characterize methylation data around TF binding sites and the background. Along with other static genomic features, we adopted a random forest framework to predict TF-DNA interaction. After conducting comprehensive tests, we saw that the proposed method accurately predicts TF binding and performs favorably versus competing methods.
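The abstract above mentions beta-binomial models for methylation data around TF binding sites and in the background. A minimal sketch of that idea follows; it is not the authors' implementation. The function names, the `(alpha, beta)` parameter values, and the read counts are all hypothetical, chosen only to show how a beta-binomial likelihood for methylated-read counts could be compared between a "site" model and a "background" model.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, alpha, beta):
    """Log-probability of k methylated reads out of n under BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def site_vs_background_llr(counts, site_params, background_params):
    """Sum of per-CpG log-likelihood ratios (site model vs background model).

    counts: list of (methylated_reads, total_reads) pairs, one per CpG.
    site_params, background_params: hypothetical (alpha, beta) tuples.
    """
    llr = 0.0
    for k, n in counts:
        llr += beta_binomial_logpmf(k, n, *site_params)
        llr -= beta_binomial_logpmf(k, n, *background_params)
    return llr


# Hypothetical example: two CpGs with few methylated reads, scored against a
# site model skewed toward low methylation and a background skewed toward high.
example_counts = [(0, 10), (1, 12)]
example_llr = site_vs_background_llr(example_counts, (1.0, 5.0), (5.0, 1.0))
```

In a pipeline like the one the abstract describes, a score of this kind would be one feature among several passed to a classifier; the direction of the effect in the example is illustrative only.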
Affiliation(s)
- Tianlei Xu
- Department of Mathematics and Computer Science, Emory University, 400 Dowman Drive, Atlanta, GA 30322, USA; Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322, USA
- Ben Li
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322, USA
- Meng Zhao
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322, USA
- Keith E Szulwach
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- R Craig Street
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- Li Lin
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- Bing Yao
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- Feiran Zhang
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- Peng Jin
- Department of Human Genetics, Emory University, School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
- Hao Wu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322, USA
- Zhaohui S Qin
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322, USA; Department of Biomedical Informatics, Emory University, 36 Eagle Row, Atlanta, GA 30322, USA
3
Zhang W, Spector TD, Deloukas P, Bell JT, Engelhardt BE. Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biol 2015; 16:14. [PMID: 25616342 PMCID: PMC4389802 DOI: 10.1186/s13059-015-0581-9] [Citation(s) in RCA: 125] [Impact Index Per Article: 13.9] [Received: 08/07/2013] [Accepted: 01/02/2015] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Recent assays for individual-specific genome-wide DNA methylation profiles have enabled epigenome-wide association studies to identify specific CpG sites associated with a phenotype. Computational prediction of CpG site-specific methylation levels is critical to enable genome-wide analyses, but current approaches tackle average methylation within a locus and are often limited to specific genomic regions. RESULTS: We characterize genome-wide DNA methylation patterns, and show that correlation among CpG sites decays rapidly, making predictions solely based on neighboring sites challenging. We built a random forest classifier to predict methylation levels at CpG site resolution using features including neighboring CpG site methylation levels and genomic distance, co-localization with coding regions, CpG islands (CGIs), and regulatory elements from the ENCODE project. Our approach achieves 92% prediction accuracy of genome-wide methylation levels at single-CpG-site precision. The accuracy increases to 98% when restricted to CpG sites within CGIs and is robust across platform and cell-type heterogeneity. Our classifier outperforms other types of classifiers and identifies features that contribute to prediction accuracy: neighboring CpG site methylation, CGIs, co-localized DNase I hypersensitive sites, transcription factor binding sites, and histone modifications were found to be most predictive of methylation levels. CONCLUSIONS: Our observations of DNA methylation patterns led us to develop a classifier to predict DNA methylation levels at CpG site resolution with high accuracy. Furthermore, our method identified genomic features that interact with DNA methylation, suggesting mechanisms involved in DNA methylation modification and regulation, and linking diverse epigenetic processes.
Affiliation(s)
- Weiwei Zhang
- Department of Molecular Genetics and Microbiology, Duke University, Durham, NC, USA.
- Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
- Panos Deloukas
- William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK.
- Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, 21589, Saudi Arabia.
- Jordana T Bell
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
4
Yamamizu K, Matsunaga T, Katayama S, Kataoka H, Takayama N, Eto K, Nishikawa SI, Yamashita JK. PKA/CREB signaling triggers initiation of endothelial and hematopoietic cell differentiation via Etv2 induction. Stem Cells 2012; 30:687-96. [PMID: 22267325 DOI: 10.1002/stem.1041] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 01/21/2023]
Abstract
Ets family protein Etv2 (also called ER71 or Etsrp) is a key factor for initiation of vascular and blood development from mesodermal cells. However, regulatory mechanisms and inducing signals for Etv2 expression have been largely unknown. Previously, we revealed that cyclic adenosine monophosphate (cAMP)/protein kinase A (PKA) signaling enhanced differentiation of vascular progenitors into endothelial cells (ECs) and hematopoietic cells (HPCs) using an embryonic stem cell (ESC) differentiation system. Here, we show that PKA activation in an earlier differentiation stage can trigger EC/HPC differentiation through Etv2 induction. We found Etv2 was markedly upregulated by PKA activation preceding EC and HPC differentiation. We identified two cAMP response element (CRE) sequences in the Etv2 promoter and 5'-untranslated region and confirmed that CRE-binding protein (CREB) directly binds to the CRE sites and activates Etv2 transcription. Expression of a dominant negative form of CREB completely inhibited PKA-elicited Etv2 expression and induction of EC/HPCs from ESCs. Furthermore, blockade of PKA significantly inhibited Etv2 expression in ex vivo whole-embryo culture using Etv2-Venus knockin mice. These data indicated that the PKA/CREB pathway is a critical regulator for the initiation of EC/HPC differentiation via Etv2 transcription. This early-stage molecular linkage between a triggering signal and transcriptional cascades for differentiation would provide novel insights into vascular and blood development and cell fate determination.
Affiliation(s)
- Kohei Yamamizu
- Laboratory of Stem Cell Differentiation, Stem Cell Research Center, Institute for Frontier Medical Sciences, Kyoto University, Kyoto, Japan; Department of Cell Growth and Differentiation, Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan
5
Heinke J, Patterson C, Moser M. Life is a pattern: vascular assembly within the embryo. Front Biosci (Elite Ed) 2012; 4:2269-88. [PMID: 22202036 DOI: 10.2741/541] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
The formation of the vascular system is one of the earliest and most important events during organogenesis in the developing embryo because the growing organism needs a transportation system to supply oxygen and nutrients and to remove waste products. Two distinct processes termed vasculogenesis and angiogenesis lead to a complex vasculature covering the entire body. Several cellular mechanisms including migration, proliferation, differentiation and maturation are involved in generating this hierarchical vascular tree. To achieve this aim, a multitude of signaling pathways need to be activated and coordinated in spatio-temporal patterns. Understanding embryonic molecular mechanisms in angiogenesis further provides insight for therapeutic approaches in pathological conditions like cancer or ischemic diseases in the adult. In this review, we describe the current understanding of major signaling pathways that are necessary and active during vascular development.
Affiliation(s)
- Jennifer Heinke
- Department of Internal Medicine III, University of Freiburg, Germany
6
Cuellar-Partida G, Buske FA, McLeay RC, Whitington T, Noble WS, Bailey TL. Epigenetic priors for identifying active transcription factor binding sites. Bioinformatics 2011; 28:56-62. [PMID: 22072382 DOI: 10.1093/bioinformatics/btr614] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Accurate knowledge of the genome-wide binding of transcription factors in a particular cell type or under a particular condition is necessary for understanding transcriptional regulation. Using epigenetic data such as histone modification and DNase I accessibility data has been shown to improve motif-based in silico methods for predicting such binding, but this approach has not yet been fully explored. RESULTS: We describe a probabilistic method for combining one or more tracks of epigenetic data with a standard DNA sequence motif model to improve our ability to identify active transcription factor binding sites (TFBSs). We convert each data type into a position-specific probabilistic prior and combine these priors with a traditional probabilistic motif model to compute a log-posterior odds score. Our experiments, using histone modifications H3K4me1, H3K4me3, H3K9ac and H3K27ac, as well as DNase I sensitivity, show conclusively that the log-posterior odds score consistently outperforms a simple binary filter based on the same data. We also show that our approach performs competitively with a more complex method, CENTIPEDE, and suggest that the relative simplicity of the log-posterior odds scoring method makes it an appealing and very general method for identifying functional TFBSs on the basis of DNA and epigenetic evidence. AVAILABILITY AND IMPLEMENTATION: FIMO, part of the MEME Suite software toolkit, now supports log-posterior odds scoring using position-specific priors for motif search. A web server and source code are available at http://meme.nbcr.net. Utilities for creating priors are at http://research.imb.uq.edu.au/t.bailey/SD/Cuellar2011. CONTACT: t.bailey@uq.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
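The log-posterior odds score described in this abstract can be illustrated with a small sketch. This is not FIMO's code or interface: the function names, the toy two-position motif, and the uniform background are hypothetical. It assumes the usual Bayesian decomposition, in which the log-posterior odds equal the motif log-likelihood ratio plus the log odds of the position-specific prior.

```python
import math


def log_likelihood_ratio(site, pwm, background):
    """Sum over positions of log2(P(base | motif) / P(base | background))."""
    return sum(math.log2(pwm[i][base] / background[base])
               for i, base in enumerate(site))


def log_posterior_odds(site, pwm, background, prior):
    """Motif log-likelihood ratio plus the log odds of the positional prior.

    prior: probability (strictly between 0 and 1) that a binding site starts
    at this position, e.g. derived from an epigenetic data track.
    """
    return (log_likelihood_ratio(site, pwm, background)
            + math.log2(prior / (1.0 - prior)))


# Hypothetical two-position motif preferring "AG" and a uniform background.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
uniform_bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

score_open = log_posterior_odds("AG", toy_pwm, uniform_bg, prior=0.9)
score_closed = log_posterior_odds("AG", toy_pwm, uniform_bg, prior=0.1)
```

Under these assumptions the same sequence match scores higher where the prior is high than where it is low, whereas a binary filter would simply keep or discard the match.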
Affiliation(s)
- Gabriel Cuellar-Partida
- Institute for Molecular Bioscience, The University of Queensland, Brisbane QLD 4072, Australia
7
Iacobas I, Vats A, Hirschi KK. Vascular potential of human pluripotent stem cells. Arterioscler Thromb Vasc Biol 2010; 30:1110-7. [PMID: 20453170 DOI: 10.1161/atvbaha.109.191601] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Cardiovascular disease is the number one cause of death and disability in the US. Understanding the biological activity of stem and progenitor cells, and their ability to contribute to the repair, regeneration and remodeling of the heart and blood vessels affected by pathological processes is an essential part of the paradigm in enabling us to achieve a reduction in related deaths. Both human embryonic stem (ES) cells and induced pluripotent stem (iPS) cells are promising sources of cells for clinical cardiovascular therapies. Additional in vitro studies are needed, however, to understand their relative phenotypes and molecular regulation toward cardiovascular cell fates. Further studies in translational animal models are also needed to gain insights into the potential and function of both human ES- and iPS-derived cardiovascular cells, and enable translation from experimental and preclinical studies to human trials.
Affiliation(s)
- Ionela Iacobas
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
8
Modifiers of von Willebrand factor identified by natural variation in inbred strains of mice. Blood 2009; 114:5368-74. [PMID: 19789385 DOI: 10.1182/blood-2009-07-233213] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023] Open
Abstract
Type 1 von Willebrand disease (VWD) is the most common inherited human bleeding disorder. However, diagnosis is complicated by incomplete penetrance and variable expressivity, as well as wide variation in von Willebrand factor (VWF) levels among the normal population. Previous work has exploited the highly variable plasma VWF levels among inbred strains of mice to identify 2 major regulators, Mvwf1 and Mvwf2 (modifier of VWF). Mvwf1 is a glycosyltransferase and Mvwf2 is a natural variant in Vwf that alters biosynthesis. We report the identification of an additional alteration at the Vwf locus (Mvwf5), as well as 2 loci unlinked to Vwf (Mvwf6-7) using a backcross approach with the inbred mouse strains WSB/EiJ and C57BL/6J. Through positional cloning, we show that Mvwf5 is a cis-regulatory variant that alters Vwf mRNA expression. A similar mechanism could potentially explain a significant percentage of human VWD cases, especially those with no detectable mutation in the VWF coding sequence. Mvwf6 displays conservation of synteny with potential VWF modifier loci identified in human pedigrees, suggesting that its ortholog may modify VWF in human populations.
9
De Val S, Black BL. Transcriptional control of endothelial cell development. Dev Cell 2009; 16:180-95. [PMID: 19217421 DOI: 10.1016/j.devcel.2009.01.014] [Citation(s) in RCA: 256] [Impact Index Per Article: 17.1] [Received: 12/14/2008] [Revised: 01/26/2009] [Accepted: 01/26/2009] [Indexed: 12/14/2022]
Abstract
The transcription factors that regulate endothelial cell development have been a focus of active research for several years, and many players in the endothelial transcriptional program have been identified. This review discusses the function of several major regulators of endothelial transcription, including members of the Sox, Ets, Forkhead, GATA, and Kruppel-like families. This review also highlights recent developments aimed at unraveling the combinatorial mechanisms and transcription factor interactions that regulate endothelial cell specification and differentiation during vasculogenesis and angiogenesis.
Affiliation(s)
- Sarah De Val
- Cardiovascular Research Institute and Department of Biochemistry and Biophysics, University of California, San Francisco, 94158, USA
10
Attanasio C, Reymond A, Humbert R, Lyle R, Kuehn MS, Neph S, Sabo PJ, Goldy J, Weaver M, Haydock A, Lee K, Dorschner M, Dermitzakis ET, Antonarakis SE, Stamatoyannopoulos JA. Assaying the regulatory potential of mammalian conserved non-coding sequences in human cells. Genome Biol 2008; 9:R168. [PMID: 19055709 PMCID: PMC2646272 DOI: 10.1186/gb-2008-9-12-r168] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 06/09/2008] [Revised: 09/24/2008] [Accepted: 12/02/2008] [Indexed: 01/26/2023] Open
Abstract
The fraction of experimentally active conserved non-coding sequences within any given cell type is low, so classical assays are unlikely to expose their potential. Background: Conserved non-coding sequences in the human genome are approximately tenfold more abundant than known genes, and have been hypothesized to mark the locations of cis-regulatory elements. However, the global contribution of conserved non-coding sequences to the transcriptional regulation of human genes is currently unknown. Deeply conserved elements shared between humans and teleost fish predominantly flank genes active during morphogenesis and are enriched for positive transcriptional regulatory elements. However, such deeply conserved elements account for <1% of the conserved non-coding sequences in the human genome, which are predominantly mammalian. Results: We explored the regulatory potential of a large sample of these 'common' conserved non-coding sequences using a variety of classic assays, including chromatin remodeling, and enhancer/repressor and promoter activity. When tested across diverse human model cell types, we find that the fraction of experimentally active conserved non-coding sequences within any given cell type is low (approximately 5%), and that this proportion increases only modestly when considered collectively across cell types. Conclusions: The results suggest that classic assays of cis-regulatory potential are unlikely to expose the functional potential of the substantial majority of mammalian conserved non-coding sequences in the human genome.
Affiliation(s)
- Catia Attanasio
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1 rue Michel Servet, 1211, Geneva 4, Switzerland.
11
Ogurtsov AY, Mariño-Ramírez L, Johnson GR, Landsman D, Shabalina SA, Spiridonov NA. Expression patterns of protein kinases correlate with gene architecture and evolutionary rates. PLoS One 2008; 3:e3599. [PMID: 18974838 PMCID: PMC2572838 DOI: 10.1371/journal.pone.0003599] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 08/07/2008] [Accepted: 10/09/2008] [Indexed: 12/20/2022] Open
Abstract
Background: Protein kinase (PK) genes comprise the third largest superfamily, which occupies ∼2% of the human genome. They encode regulatory enzymes that control a vast variety of cellular processes through phosphorylation of their protein substrates. Expression of PK genes is subject to complex transcriptional regulation which is not fully understood. Principal Findings: Our comparative analysis demonstrates that genomic organization of regulatory PK genes differs from organization of other protein coding genes. PK genes occupy larger genomic loci, have longer introns, spacer regions, and encode larger proteins. The primary transcript length of PK genes, similar to other protein coding genes, inversely correlates with gene expression level and expression breadth, which is likely due to the necessity to reduce metabolic costs of transcription for abundant messages. On average, PK genes evolve slower than other protein coding genes. Breadth of PK expression negatively correlates with rate of non-synonymous substitutions in protein coding regions. This rate is lower for high expression and ubiquitous PKs, relative to low expression PKs, and correlates with divergence in untranslated regions. Conversely, rate of silent mutations is uniform in different PK groups, indicating that differing rates of non-synonymous substitutions reflect variations in selective pressure. Brain and testis employ a considerable number of tissue-specific PKs, indicating high complexity of phosphorylation-dependent regulatory network in these organs. There are considerable differences in genomic organization between PKs up-regulated in the testis and brain. PK genes up-regulated in the highly proliferative testicular tissue are fast evolving and small, with short introns and transcribed regions. In contrast, genes up-regulated in the minimally proliferative nervous tissue carry long introns, extended transcribed regions, and evolve slowly.
Conclusions/Significance: PK genomic architecture, the size of gene functional domains and evolutionary rates correlate with the pattern of gene expression. Structure and evolutionary divergence of tissue-specific PK genes is related to the proliferative activity of the tissue where these genes are predominantly expressed. Our data provide evidence that physiological requirements for transcription intensity, ubiquitous expression, and tissue-specific regulation shape gene structure and affect rates of evolution.
Affiliation(s)
- Aleksey Y. Ogurtsov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Gibbes R. Johnson
- Division of Therapeutic Proteins, Center for Drug Evaluation and Research, U. S. Food and Drug Administration, Bethesda, Maryland, United States of America
- David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Svetlana A. Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Nikolay A. Spiridonov
- Division of Therapeutic Proteins, Center for Drug Evaluation and Research, U. S. Food and Drug Administration, Bethesda, Maryland, United States of America
12
Failure of terminal erythroid differentiation in EKLF-deficient mice is associated with cell cycle perturbation and reduced expression of E2F2. Mol Cell Biol 2008; 28:7394-401. [PMID: 18852285 DOI: 10.1128/mcb.01087-08] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.1] [Indexed: 12/17/2022] Open
Abstract
Erythroid Krüppel-like factor (EKLF) is a Krüppel-like transcription factor identified as a transcriptional activator and chromatin modifier in erythroid cells. EKLF-deficient (Eklf(-/-)) mice die at day 14.5 of gestation from severe anemia. In this study, we demonstrate that early progenitor cells fail to undergo terminal erythroid differentiation in Eklf(-/-) embryos. To discover potential EKLF target genes responsible for the failure of erythropoiesis, transcriptional profiling was performed with RNA from wild-type and Eklf(-/-) early erythroid progenitor cells. These analyses identified significant perturbation of a network of genes involved in cell cycle regulation, with the critical regulator of the cell cycle, E2f2, at a hub. E2f2 mRNA and protein levels were markedly decreased in Eklf(-/-) early erythroid progenitor cells, which showed a delay in the G(1)-to-S-phase transition. Chromatin immunoprecipitation analysis demonstrated EKLF occupancy at the proximal E2f2 promoter in vivo. Consistent with the role of EKLF as a chromatin modifier, EKLF binding sites in the E2f2 promoter were located in a region of EKLF-dependent DNase I sensitivity in early erythroid progenitor cells. We propose a model in which EKLF-dependent activation and modification of the E2f2 locus is required for cell cycle progression preceding terminal erythroid differentiation.
13
Gupta S, Dennis J, Thurman RE, Kingston R, Stamatoyannopoulos JA, Noble WS. Predicting human nucleosome occupancy from primary sequence. PLoS Comput Biol 2008; 4:e1000134. [PMID: 18725940 PMCID: PMC2515632 DOI: 10.1371/journal.pcbi.1000134] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.2] [Received: 01/25/2008] [Accepted: 06/19/2008] [Indexed: 11/30/2022] Open
Abstract
Nucleosomes are the fundamental repeating unit of chromatin and comprise the structural building blocks of the living eukaryotic genome. Micrococcal nuclease (MNase) has long been used to delineate nucleosomal organization. Microarray-based nucleosome mapping experiments in yeast chromatin have revealed regularly-spaced translational phasing of nucleosomes. These data have been used to train computational models of sequence-directed nucleosome positioning, which have identified ubiquitous strong intrinsic nucleosome positioning signals. Here, we successfully apply this approach to nucleosome positioning experiments from human chromatin. The predictions made by the human-trained and yeast-trained models are strongly correlated, suggesting a shared mechanism for sequence-based determination of nucleosome occupancy. In addition, we observed striking complementarity between classifiers trained on experimental data from weakly versus heavily digested MNase samples. In the former case, the resulting model accurately identifies nucleosome-forming sequences; in the latter, the classifier excels at identifying nucleosome-free regions. Using this model we are able to identify several characteristics of nucleosome-forming and nucleosome-disfavoring sequences. First, by combining results from each classifier applied de novo across the human ENCODE regions, the classifier reveals distinct sequence composition and periodicity features of nucleosome-forming and nucleosome-disfavoring sequences. Short runs of dinucleotide repeats appear as a hallmark of nucleosome-disfavoring sequences, while nucleosome-forming sequences contain short periodic runs of GC base pairs. Second, we show that nucleosome phasing is most frequently predicted flanking nucleosome-free regions. The results suggest that the major mechanism of nucleosome positioning in vivo is boundary-event-driven and affirm the classical statistical positioning theory of nucleosome organization.
Inside the nucleus, DNA is wrapped into a complex molecular structure called chromatin, whose fundamental unit is ∼150 bp of DNA organized around the eight-histone protein complex known as the nucleosome. Understanding the local organization of nucleosomes is critical for understanding how chromatin impacts gene regulation. Here, we describe a computational model that predicts nucleosome placement from DNA sequence. We train the model using data derived from human cell lines, and we apply the model systematically to 1% of the human genome. We show that previously described models trained from yeast data correlate strongly with the human-trained model, suggesting a common mechanism for sequence-based determination of nucleosome occupancy. In addition, we observe a striking complementarity between models trained using data from weakly and strongly digested samples: one type of model recognizes nucleosome-free regions, whereas the other identifies well-positioned nucleosomes. Finally, our analysis of predicted nucleosome positions in the human genome allows us to identify common features of nucleosome-forming and inhibitory sequences. Overall, our results are consistent with the classical statistical positioning theory of nucleosome organization.
Affiliation(s)
- Shobhit Gupta
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Jonathan Dennis
- Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Robert E. Thurman
- Division of Medical Genetics, University of Washington, Seattle, Washington, United States of America
- Robert Kingston
- Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- John A. Stamatoyannopoulos
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
|
14
|
Interferon regulatory factors are transcriptional regulators of adipogenesis. Cell Metab 2008; 7:86-94. [PMID: 18177728 PMCID: PMC2278019 DOI: 10.1016/j.cmet.2007.11.002] [Citation(s) in RCA: 107] [Impact Index Per Article: 6.7] [Received: 02/14/2007] [Revised: 08/03/2007] [Accepted: 11/05/2007] [Indexed: 12/31/2022]
Abstract
We have sought to identify transcriptional pathways in adipogenesis using an integrated experimental and computational approach. Here, we employ high-throughput DNase hypersensitivity analysis to find regions of altered chromatin structure surrounding key adipocyte genes. Regions that display differentiation-dependent changes in hypersensitivity were used to predict binding sites for proteins involved in adipogenesis. A high-scoring example was a binding motif for interferon regulatory factor (IRF) family members. Expression of all nine mammalian IRF mRNAs is regulated during adipogenesis, and several bind to the identified motifs in a differentiation-dependent manner. Furthermore, several IRF proteins repress differentiation. This analysis suggests an important role for IRF proteins in adipocyte biology and demonstrates the utility of this approach in identifying cis- and trans-acting factors not previously suspected to participate in adipogenesis.
|
15
|
Minovitsky S, Stegmaier P, Kel A, Kondrashov AS, Dubchak I. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences. BMC Genomics 2007; 8:378. [PMID: 17945028 PMCID: PMC2176071 DOI: 10.1186/1471-2164-8-378] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 02/21/2007] [Accepted: 10/18/2007] [Indexed: 12/22/2022]
Abstract
Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, the abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of the short motifs overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that the selection which causes their conservation is not always very strong.
Affiliation(s)
- Simon Minovitsky
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
|
16
|
Nagel S, Scherr M, Kel A, Hornischer K, Crawford GE, Kaufmann M, Meyer C, Drexler HG, MacLeod RAF. Activation of TLX3 and NKX2-5 in t(5;14)(q35;q32) T-cell acute lymphoblastic leukemia by remote 3'-BCL11B enhancers and coregulation by PU.1 and HMGA1. Cancer Res 2007; 67:1461-71. [PMID: 17308084 DOI: 10.1158/0008-5472.can-06-2615] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Indexed: 11/16/2022]
Abstract
In T-cell acute lymphoblastic leukemia, alternative t(5;14)(q35;q32.2) forms effect dysregulation of either TLX3 or NKX2-5 homeobox genes at 5q35 by juxtaposition with 14q32.2 breakpoints dispersed across the BCL11B downstream genomic desert. Leukemic gene dysregulation by t(5;14) was investigated by DNA inhibitory treatments with 26-mer double-stranded DNA oligonucleotides directed against candidate enhancers at, or near, orphan T-cell DNase I hypersensitive sites located between 3'-BCL11B and VRK1. NKX2-5 down-regulation in t(5;14) PEER cells was almost entirely restricted to DNA inhibitory treatment targeting enhancers within the distal breakpoint cluster region and was dose and sequence dependent, whereas enhancers near 3'-BCL11B regulated that gene only. Chromatin immunoprecipitation assays showed that the four most effectual NKX2-5 ectopic enhancers were hyperacetylated. These enhancers clustered approximately 1 Mbp downstream of BCL11B, within a region displaying multiple regulatory stigmata, including a TCRA enhancer motif, deep sequence conservation, and tight nuclear matrix attachment relaxed by trichostatin A treatment. Intriguingly, although TLX3/NKX2-5 promoter/exon 1 regions were hypoacetylated, their expression was trichostatin A sensitive, implying extrinsic regulation by factor(s) under acetylation control. Knockdown of PU.1, known to be trichostatin A responsive and which potentially binds TLX3/NKX2-5 promoters, effected down-regulation of both homeobox genes. Moreover, genomic analysis showed preferential enrichment near ectopic enhancers of binding sites for the PU.1 cofactor HMGA1, the knockdown of which also inhibited NKX2-5. We suggest that HMGA1 and PU.1 coregulate ectopic homeobox gene expression in t(5;14) T-cell acute lymphoblastic leukemia by interactions mediated at the nuclear matrix. 
Our data document homeobox gene dysregulation by a novel regulatory region at 3'-BCL11B responsive to histone deacetylase inhibition and highlight a novel class of potential therapeutic target amid noncoding DNA.
MESH Headings
- Acetylation
- Chromosome Breakage
- Chromosomes, Human, Pair 14
- Chromosomes, Human, Pair 5
- DNA-Binding Proteins/genetics
- Deoxyribonuclease I/metabolism
- Enhancer Elements, Genetic
- Gene Expression Regulation, Leukemic
- HMGA Proteins/genetics
- Histones/metabolism
- Homeobox Protein Nkx-2.5
- Homeodomain Proteins/genetics
- Humans
- Leukemia-Lymphoma, Adult T-Cell/genetics
- Leukemia-Lymphoma, Adult T-Cell/metabolism
- Multigene Family
- Nuclear Matrix/metabolism
- Oligonucleotides/genetics
- Oncogene Proteins/genetics
- Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
- Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism
- Proto-Oncogene Proteins/genetics
- RNA, Small Interfering/genetics
- Repressor Proteins/genetics
- Trans-Activators/genetics
- Transcription Factors/genetics
- Translocation, Genetic
- Tumor Suppressor Proteins/genetics
Affiliation(s)
- Stefan Nagel
- German Collection of Microorganisms and Cell Cultures, Department of Cell Cultures, Inhoffenstrasse 7B, 38124 Braunschweig, Germany.
|