1
Zhang Y, Zhang P, Wu H. Enhancer-MDLF: a novel deep learning framework for identifying cell-specific enhancers. Brief Bioinform 2024; 25:bbae083. [PMID: 38485768 PMCID: PMC10938904 DOI: 10.1093/bib/bbae083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 01/27/2024] [Accepted: 02/07/2024] [Indexed: 03/18/2024]
Abstract
Enhancers, noncoding DNA fragments, play a pivotal role in gene regulation, facilitating gene transcription. Identifying enhancers is crucial for understanding genomic regulatory mechanisms, pinpointing key elements and investigating networks governing gene expression and disease-related mechanisms. Existing enhancer identification methods exhibit limitations, prompting the development of our novel multi-input deep learning framework, termed Enhancer-MDLF. Experimental results illustrate that Enhancer-MDLF outperforms the previous method, Enhancer-IF, across eight distinct human cell lines and exhibits superior performance on generic enhancer datasets and enhancer-promoter datasets, affirming the robustness of Enhancer-MDLF. Additionally, we introduce transfer learning to provide an effective and potential solution to address the prediction challenges posed by enhancer specificity. Furthermore, we utilize model interpretation to identify transcription factor binding site motifs that may be associated with enhancer regions, with important implications for facilitating the study of enhancer regulatory mechanisms. The source code is openly accessible at https://github.com/HaoWuLab-Bioinformatics/Enhancer-MDLF.
Affiliation(s)
- Yao Zhang
- School of Software, Shandong University, Jinan, 250100, Shandong, China
- Pengyu Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Hao Wu
- School of Software, Shandong University, Jinan, 250100, Shandong, China
2
Marinov GK, Shipony Z, Kundaje A, Greenleaf WJ. Genome-Wide Mapping of Active Regulatory Elements Using ATAC-seq. Methods Mol Biol 2023; 2611:3-19. [PMID: 36807060 DOI: 10.1007/978-1-0716-2899-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Active cis-regulatory elements (cREs) in eukaryotes are characterized by nucleosomal depletion and, accordingly, higher accessibility. This property has turned out to be immensely useful for identifying cREs genome-wide and tracking their dynamics across different cellular states and is the basis of numerous methods taking advantage of the preferential enzymatic cleavage/labeling of accessible DNA. ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) has emerged as the most versatile and widely adaptable method and has been widely adopted as the standard tool for mapping open chromatin regions. Here, we discuss the current optimal practices and important considerations for carrying out ATAC-seq experiments, primarily in the context of mammalian systems.
Affiliation(s)
- Zohar Shipony
- Department of Genetics, Stanford University, Stanford, CA, USA
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
3
Li Y, Kong F, Cui H, Wang F, Li C, Ma J. SENIES: DNA Shape Enhanced Two-Layer Deep Learning Predictor for the Identification of Enhancers and Their Strength. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:637-645. [PMID: 35015646 DOI: 10.1109/tcbb.2022.3142019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Identifying enhancers is a critical task in bioinformatics due to their primary role in regulating gene expression. For this reason, various computational algorithms devoted to enhancer identification have been put forward over the years. More features are extracted from the single DNA sequences to boost the performance. Nevertheless, DNA structural information is neglected, which is an essential factor affecting the binding preferences of transcription factors to regulatory elements like enhancers. Here, we propose SENIES, a DNA shape enhanced deep learning predictor, to identify enhancers and their strength. The predictor consists of two layers where the first layer is for enhancer and non-enhancer identification, and the second layer is for predicting the strength of enhancers. Apart from two common sequence-derived features (i.e., one-hot and k-mer), DNA shape is introduced to describe the 3D structures of DNA sequences. Performance comparison with state-of-the-art methods conducted on public datasets demonstrates the effectiveness and robustness of our predictor. The code implementation of SENIES is publicly available at https://github.com/hlju-liye/SENIES.
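The two sequence-derived features named in this abstract, one-hot and k-mer, can be illustrated with a minimal sketch. This is not the SENIES implementation (see the authors' repository for that). The function names `one_hot` and `kmer_frequencies` are hypothetical, normalized overlapping k-mer counts are an assumed convention, and the DNA shape features are omitted.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed base ordering for both encodings


def one_hot(seq):
    """Encode each base as a 4-element indicator vector in A, C, G, T order.

    Bases outside ACGT (for example N) become an all-zero vector.
    """
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def kmer_frequencies(seq, k=2):
    """Return normalized counts of overlapping k-mers in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguity codes
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    print(one_hot("ACGT"))
    print(kmer_frequencies("ACGTAC", k=2))
```

A one-hot matrix preserves base order and suits convolutional layers, whereas a k-mer frequency vector has fixed length 4^k regardless of sequence length.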
4
Liao M, Zhao JP, Tian J, Zheng CH. iEnhancer-DCLA: using the original sequence to identify enhancers and their strength based on a deep learning framework. BMC Bioinformatics 2022; 23:480. [PMCID: PMC9664816 DOI: 10.1186/s12859-022-05033-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/29/2022] [Accepted: 11/02/2022] [Indexed: 11/16/2022]
Abstract
Enhancers are small regions of DNA that bind to proteins, which enhance the transcription of genes. An enhancer may be located upstream or downstream of the gene. It is not necessarily close to the gene it acts on, because the folded structure of chromatin allows positions far apart in the sequence to come into contact with each other. Therefore, identifying enhancers and their strength is a complex and challenging task. In this article, a new prediction method based on deep learning, called iEnhancer-DCLA, is proposed to identify enhancers and enhancer strength. Firstly, we use word2vec to convert k-mers into numerical vectors to construct an input matrix. Secondly, we use a convolutional neural network and a bidirectional long short-term memory network to extract sequence features, and finally use the attention mechanism to extract relatively important features. In the task of predicting enhancers and their strengths, this method improves on most evaluation indexes to a certain extent. In summary, we believe that this method provides new ideas in the analysis of enhancers.
5
Li Z, Zhao B, Qin C, Wang Y, Li T, Wang W. Chromatin Dynamics in Digestive System Cancer: Commander and Regulator. Front Oncol 2022; 12:935877. [PMID: 35965507 PMCID: PMC9372441 DOI: 10.3389/fonc.2022.935877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 06/23/2022] [Indexed: 11/30/2022]
Abstract
Digestive system tumors have a poor prognosis due to complex anatomy, insidious onset, challenges in early diagnosis, and chemoresistance. Epidemiological statistics have verified that digestive system tumors rank first in tumor-related death. Although a great number of studies are devoted to the molecular biological mechanism, early diagnostic markers, and application of new targeted drugs in digestive system tumors, the therapeutic effect is still not satisfactory. Epigenomic alterations including histone modification and chromatin remodeling are present in human cancers and are now known to cooperate with genetic changes to drive the cancer phenotype. Chromatin is the carrier of genetic information and consists of DNA, histones, non-histone proteins, and a small amount of RNA. Chromatin and nucleosomes control the stability of the eukaryotic genome and regulate DNA processes such as transcription, replication, and repair. The dynamic structure of chromatin plays a key role in this regulatory function. Structural fluctuations expose internal DNA and thus provide access to the nuclear machinery. The dynamic changes are affected by various complexes and epigenetic modifications. Variation of chromatin dynamics produces early and superior regulation of the expression of related genes and downstream pathways, thereby controlling tumor development. Intervention at the chromatin level can change the process of cancer earlier and is a feasible option for future tumor diagnosis and treatment. In this review, we introduce chromatin dynamics, including chromatin remodeling, histone modifications, and chromatin accessibility, and summarize current research on chromatin regulation in digestive system tumors.
6
Marinov GK, Shipony Z, Kundaje A, Greenleaf WJ. Single-Molecule Multikilobase-Scale Profiling of Chromatin Accessibility Using m6A-SMAC-Seq and m6A-CpG-GpC-SMAC-Seq. Methods Mol Biol 2022; 2458:269-298. [PMID: 35103973 PMCID: PMC9531602 DOI: 10.1007/978-1-0716-2140-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
A hallmark feature of active cis-regulatory elements (CREs) in eukaryotes is their nucleosomal depletion and, accordingly, higher accessibility to enzymatic treatment. This property has been the basis of a number of sequencing-based assays for genome-wide identification and tracking of the activity of CREs across different biological conditions, such as DNase-seq, ATAC-seq, NOMe-seq, and others. However, the fragmentation of DNA inherent to many of these assays and the limited read length of short-read sequencing platforms have so far not allowed the simultaneous measurement of the chromatin accessibility state of CREs located distally from each other. The combination of labeling accessible DNA with DNA modifications and nanopore sequencing has made it possible to develop such assays. Here, we provide a detailed protocol for carrying out the SMAC-seq assay (Single-Molecule long-read Accessible Chromatin mapping sequencing), in its m6A-SMAC-seq and m6A-CpG-GpC-SMAC-seq variants, together with methods for data processing and analysis, and discuss key experimental and analytical considerations for working with SMAC-seq datasets.
Affiliation(s)
- Zohar Shipony
- Department of Genetics, Stanford University, Stanford, CA, USA
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
7
St George-Hyslop F, Kivisild T, Livesey FJ. The role of contactin-associated protein-like 2 in neurodevelopmental disease and human cerebral cortex evolution. Front Mol Neurosci 2022; 15:1017144. [PMID: 36340692 PMCID: PMC9630569 DOI: 10.3389/fnmol.2022.1017144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 09/20/2022] [Indexed: 12/04/2022]
Abstract
The contactin-associated protein-like 2 (CNTNAP2) gene is associated with multiple neurodevelopmental disorders, including autism spectrum disorder (ASD), intellectual disability (ID), and specific language impairment (SLI). Experimental work has shown that CNTNAP2 is important for neuronal development and synapse formation. There is also accumulating evidence for the differential use of CNTNAP2 in the human cerebral cortex compared with other primates. Here, we review the current literature on CNTNAP2, including what is known about its expression, disease associations, and molecular/cellular functions. We also review the evidence for its role in human brain evolution, such as the presence of eight human accelerated regions (HARs) within the introns of the gene. While progress has been made in understanding the function(s) of CNTNAP2, more work is needed to clarify the precise mechanisms through which CNTNAP2 acts. Such information will be crucial for developing effective treatments for CNTNAP2 patients. It may also shed light on the longstanding question of what makes us human.
Affiliation(s)
- Frances St George-Hyslop
- Zayed Centre for Research Into Rare Disease in Children, UCL Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
- Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Toomas Kivisild
- Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Department of Human Genetics, KU Leuven, Leuven, Belgium
- Frederick J Livesey
- Zayed Centre for Research Into Rare Disease in Children, UCL Great Ormond Street Institute of Child Health, University College London, London, United Kingdom
8
Chawla A, Nagy C, Turecki G. Chromatin Profiling Techniques: Exploring the Chromatin Environment and Its Contributions to Complex Traits. Int J Mol Sci 2021; 22:7612. [PMID: 34299232 PMCID: PMC8305586 DOI: 10.3390/ijms22147612] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/22/2021] [Revised: 07/09/2021] [Accepted: 07/13/2021] [Indexed: 01/04/2023]
Abstract
The genetic architecture of complex traits is multifactorial. Genome-wide association studies (GWASs) have identified risk loci for complex traits and diseases that are disproportionately located at the non-coding regions of the genome. On the other hand, we have just begun to understand the regulatory roles of the non-coding genome, making it challenging to precisely interpret the functions of non-coding variants associated with complex diseases. Additionally, the epigenome plays an active role in mediating cellular responses to fluctuations of sensory or environmental stimuli. However, it remains unclear how exactly non-coding elements associate with epigenetic modifications to regulate gene expression changes and mediate phenotypic outcomes. Therefore, finer interrogations of the human epigenomic landscape in associating with non-coding variants are warranted. Recently, chromatin-profiling techniques have vastly improved our understanding of the numerous functions mediated by the epigenome and DNA structure. Here, we review various chromatin-profiling techniques, such as assays of chromatin accessibility, nucleosome distribution, histone modifications, and chromatin topology, and discuss their applications in unraveling the brain epigenome and etiology of complex traits at tissue homogenate and single-cell resolution. These techniques have elucidated compositional and structural organizing principles of the chromatin environment. Taken together, we believe that high-resolution epigenomic and DNA structure profiling will be one of the best ways to elucidate how non-coding genetic variations impact complex diseases, ultimately allowing us to pinpoint cell-type targets with therapeutic potential.
Affiliation(s)
- Anjali Chawla
- Integrated Program in Neuroscience, McGill University, 845 Sherbrooke St W, Montreal, QC H3A 0G4, Canada
- McGill Group for Suicide Studies, Department of Psychiatry, Douglas Mental Health University Institute, McGill University, 6875 LaSalle Blvd, Verdun, QC H4H 1R3, Canada
- Corina Nagy
- McGill Group for Suicide Studies, Department of Psychiatry, Douglas Mental Health University Institute, McGill University, 6875 LaSalle Blvd, Verdun, QC H4H 1R3, Canada
- Genome Quebec Innovation Centre, Department of Human Genetics, McGill University, 845 Sherbrooke St W, Montreal, QC H3A 0G4, Canada
- Gustavo Turecki
- Integrated Program in Neuroscience, McGill University, 845 Sherbrooke St W, Montreal, QC H3A 0G4, Canada
- McGill Group for Suicide Studies, Department of Psychiatry, Douglas Mental Health University Institute, McGill University, 6875 LaSalle Blvd, Verdun, QC H4H 1R3, Canada
- Genome Quebec Innovation Centre, Department of Human Genetics, McGill University, 845 Sherbrooke St W, Montreal, QC H3A 0G4, Canada
9
Henikoff S, Henikoff JG, Ahmad K. Simplified Epigenome Profiling Using Antibody-tethered Tagmentation. Bio Protoc 2021; 11:e4043. [PMID: 34250209 PMCID: PMC8250384 DOI: 10.21769/bioprotoc.4043] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/02/2021] [Revised: 03/24/2021] [Accepted: 04/29/2021] [Indexed: 12/02/2022]
Abstract
We previously introduced Cleavage Under Targets & Tagmentation (CUT&Tag), an epigenomic profiling method in which antibody tethering of the Tn5 transposase to a chromatin epitope of interest maps specific chromatin features in small samples and single cells. With CUT&Tag, intact cells or nuclei are permeabilized, followed by successive addition of a primary antibody, a secondary antibody, and a chimeric Protein A-Transposase fusion protein that binds to the antibody. Addition of Mg++ activates the transposase and inserts sequencing adapters into adjacent DNA in situ. We have since adapted CUT&Tag to also map chromatin accessibility by simply modifying the transposase activation conditions when using histone H3K4me2, H3K4me3, or Serine-5-phosphorylated RNA Polymerase II antibodies. Using these antibodies, we redirect the tagmentation of accessible DNA sites to produce chromatin accessibility maps with exceptionally high signal-to-noise and resolution. All steps from nuclei to amplified sequencing-ready libraries are performed in single PCR tubes using non-toxic reagents and inexpensive equipment, making our simplified strategy for simultaneous chromatin profiling and accessibility mapping suitable for the lab, home workbench, or classroom.
Affiliation(s)
- Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington 98109, USA
- Howard Hughes Medical Institute, Seattle WA, USA
- Jorja G Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington 98109, USA
- Kami Ahmad
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington 98109, USA
10
Federation AJ, Nandakumar V, Searle BC, Stergachis A, Wang H, Pino LK, Merrihew G, Ting YS, Howard N, Kutyavin T, MacCoss MJ, Stamatoyannopoulos JA. Highly Parallel Quantification and Compartment Localization of Transcription Factors and Nuclear Proteins. Cell Rep 2020; 30:2463-2471.e5. [PMID: 32101728 DOI: 10.1016/j.celrep.2020.01.096] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/04/2019] [Revised: 04/15/2019] [Accepted: 01/28/2020] [Indexed: 01/12/2023]
Abstract
Transcription factors and other chromatin-associated proteins are difficult to quantify comprehensively. Here, we combine facile nuclear sub-fractionation with data-independent acquisition mass spectrometry to achieve rapid, sensitive, and highly parallel quantification of the nuclear proteome in human cells. We apply this approach to quantify the response to acute degradation of BET bromodomains, revealing unexpected chromatin regulatory dynamics. The method is simple and enables system-level study of previously inaccessible chromatin and genome regulators.
Affiliation(s)
- Vivek Nandakumar
- Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Brian C Searle
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Andrew Stergachis
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Hao Wang
- Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Lindsay K Pino
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Gennifer Merrihew
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Ying S Ting
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- Nicholas Howard
- Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Tanya Kutyavin
- Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Michael J MacCoss
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
- John A Stamatoyannopoulos
- Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- University of Washington, Department of Genome Sciences, Seattle, WA 98195, USA
11
Abstract
The ATAC-seq assay has emerged as the most useful, versatile, and widely adaptable method for profiling accessible chromatin regions and tracking the activity of cis-regulatory elements (cREs) in eukaryotes. Thanks to its great utility, it is now being applied to map active chromatin in a very wide diversity of biological systems and questions. In the course of these studies, considerable experience working with ATAC-seq data has accumulated, and a standard set of computational tasks that need to be carried out for most ATAC-seq analyses has emerged. Here, we review and provide examples of such common analytical procedures (including data processing, quality control, peak calling, identifying differentially accessible open chromatin regions, and variable transcription factor (TF) motif accessibility) and discuss recommended optimal practices.
12
Henikoff S, Henikoff JG, Kaya-Okur HS, Ahmad K. Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentation. eLife 2020; 9:e63274. [PMID: 33191916 PMCID: PMC7721439 DOI: 10.7554/elife.63274] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 09/19/2020] [Accepted: 11/13/2020] [Indexed: 12/27/2022]
Abstract
Chromatin accessibility mapping is a powerful approach to identify potential regulatory elements. A popular example is ATAC-seq, whereby Tn5 transposase inserts sequencing adapters into accessible DNA ('tagmentation'). CUT&Tag is a tagmentation-based epigenomic profiling method in which antibody tethering of Tn5 to a chromatin epitope of interest profiles specific chromatin features in small samples and single cells. Here, we show that by simply modifying the tagmentation conditions for histone H3K4me2 or H3K4me3 CUT&Tag, antibody-tethered tagmentation of accessible DNA sites is redirected to produce chromatin accessibility maps that are indistinguishable from the best ATAC-seq maps. Thus, chromatin accessibility maps can be produced in parallel with CUT&Tag maps of other epitopes with all steps from nuclei to amplified sequencing-ready libraries performed in single PCR tubes in the laboratory or on a home workbench. As H3K4 methylation is produced by transcription at promoters and enhancers, our method identifies transcription-coupled accessible regulatory sites.
Affiliation(s)
- Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Howard Hughes Medical Institute, Seattle, United States
- Jorja G Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Hatice S Kaya-Okur
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
- Kami Ahmad
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
13
Shipony Z, Marinov GK, Swaffer MP, Sinnott-Armstrong NA, Skotheim JM, Kundaje A, Greenleaf WJ. Long-range single-molecule mapping of chromatin accessibility in eukaryotes. Nat Methods 2020; 17:319-327. [PMID: 32042188 PMCID: PMC7968351 DOI: 10.1038/s41592-019-0730-2] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 02/13/2019] [Accepted: 12/22/2019] [Indexed: 02/06/2023]
Abstract
Mapping open chromatin regions has emerged as a widely used tool for identifying active regulatory elements in eukaryotes. However, existing approaches, limited by reliance on DNA fragmentation and short-read sequencing, cannot provide information about large-scale chromatin states or reveal coordination between the states of distal regulatory elements. We have developed a method for profiling the accessibility of individual chromatin fibers, a single-molecule long-read accessible chromatin mapping sequencing assay (SMAC-seq), enabling the simultaneous, high-resolution, single-molecule assessment of chromatin states at multikilobase length scales. Our strategy is based on combining the preferential methylation of open chromatin regions by DNA methyltransferases with low sequence specificity, in this case EcoGII, an N6-methyladenosine (m6A) methyltransferase, and the ability of nanopore sequencing to directly read DNA modifications. We demonstrate that aggregate SMAC-seq signals match bulk-level accessibility measurements, observe single-molecule nucleosome and transcription factor protection footprints, and quantify the correlation between chromatin states of distal genomic elements.
Affiliation(s)
- Zohar Shipony
- Department of Genetics, Stanford University, Stanford, CA, USA
- Jan M Skotheim
- Department of Biology, Stanford University, Stanford, CA, USA
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
14
Ha SD, Cho W, DeKoter RP, Kim SO. The transcription factor PU.1 mediates enhancer-promoter looping that is required for IL-1β eRNA and mRNA transcription in mouse melanoma and macrophage cell lines. J Biol Chem 2019; 294:17487-17500. [PMID: 31586032 DOI: 10.1074/jbc.ra119.010149] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/12/2019] [Revised: 09/11/2019] [Indexed: 01/08/2023]
Abstract
The DNA-binding protein PU.1 is a myeloid lineage-determining and pioneering transcription factor due to its ability to bind "closed" genomic sites and maintain "open" chromatin state for myeloid lineage-specific genes. The precise mechanism of PU.1 in cell type-specific programming is yet to be elucidated. The melanoma cell line B16BL6, although it is nonmyeloid lineage, expressed Toll-like receptors and activated the transcription factor NF-κB upon stimulation by the bacterial cell wall component lipopolysaccharide. However, it did not produce cytokines, such as IL-1β mRNA. Ectopic PU.1 expression induced remodeling of a novel distal enhancer (located ∼10 kbp upstream of the IL-1β transcription start site), marked by nucleosome depletion, enhancer-promoter looping, and histone H3 lysine 27 acetylation (H3K27ac). PU.1 induced enhancer-promoter looping and H3K27ac through two distinct PU.1 regions. These PU.1-dependent events were independently required for subsequent signal-dependent and co-dependent events: NF-κB recruitment and further H3K27ac, both of which were required for enhancer RNA (eRNA) transcription. In murine macrophage RAW264.7 cells, these PU.1-dependent events were constitutively established and readily expressed eRNA and subsequently IL-1β mRNA by lipopolysaccharide stimulation. In summary, this study showed a sequence of epigenetic events in programming IL-1β transcription by the distal enhancer priming and eRNA production mediated by PU.1 and the signal-dependent transcription factor NF-κB.
Affiliation(s)
- Soon-Duck Ha
- Department of Microbiology and Immunology and Infectious Diseases Research Group, Siebens-Drake Research Institute, University of Western Ontario, London, Ontario N6G 2V4, Canada
- Woohyun Cho
- Department of Microbiology and Immunology and Infectious Diseases Research Group, Siebens-Drake Research Institute, University of Western Ontario, London, Ontario N6G 2V4, Canada
- Rodney P DeKoter
- Department of Microbiology and Immunology and Infectious Diseases Research Group, Siebens-Drake Research Institute, University of Western Ontario, London, Ontario N6G 2V4, Canada
- Sung Ouk Kim
- Department of Microbiology and Immunology and Infectious Diseases Research Group, Siebens-Drake Research Institute, University of Western Ontario, London, Ontario N6G 2V4, Canada
15
Evolution of DNAase I Hypersensitive Sites in MHC Regulatory Regions of Primates. Genetics 2018; 209:579-589. [PMID: 29669733 DOI: 10.1534/genetics.118.301028] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/18/2018] [Accepted: 04/16/2018] [Indexed: 01/08/2023]
Abstract
It has been challenging to determine the disease-causing variant(s) for most major histocompatibility complex (MHC)-associated diseases. However, it is becoming increasingly clear that regulatory variation is pervasive and a fundamentally important mechanism governing phenotypic diversity and disease susceptibility. We gathered DNase I data from 136 human cells to characterize the regulatory landscape of the MHC region, including 4867 DNase I hypersensitive sites (DHSs). We identified thousands of regulatory elements that have been gained or lost in the human or chimpanzee genomes since their evolutionary divergence. We compared alignments of the DHS across six primates and found 149 DHSs with convincing evidence of positive and/or purifying selection. Of these DHSs, compared to neutral sequences, 24 evolved rapidly in the human lineage. We identified 15 instances of transcription-factor-binding motif gains, such as USF, MYC, MAX, MAFK, STAT1, PBX3, etc, and observed 16 GWAS (genome-wide association study) SNPs associated with diseases within these 24 DHSs using FIMO (Find Individual Motif Occurrences) and UCSC (University of California, Santa Cruz) ChIP-seq data. Combining eQTL and Hi-C data, our results indicated that there were five SNPs located in human gains motifs affecting the corresponding gene's expression, two of which closely matched DHS target genes. In addition, a significant SNP, rs7756521, at genome-wide significant level likely affects DDR expression and represents a causal genetic variant for HIV-1 control. These results indicated that species-specific motif gains or losses of rapidly evolving DHSs in the primate genomes might play a role during adaptation evolution and provided some new evidence for a potentially causal role for these GWAS SNPs.
|
16
|
Manavalan B, Shin TH, Lee G. DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest. Oncotarget 2018; 9:1944-1956. [PMID: 29416743 PMCID: PMC5788611 DOI: 10.18632/oncotarget.23099] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 09/06/2017] [Accepted: 11/17/2017] [Indexed: 12/20/2022] Open
Abstract
DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html.
Affiliation(s)
| | - Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
| | - Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
|
17
|
Levchenko A, Kanapin A, Samsonova A, Gainetdinov RR. Human Accelerated Regions and Other Human-Specific Sequence Variations in the Context of Evolution and Their Relevance for Brain Development. Genome Biol Evol 2018; 10:166-188. [PMID: 29149249 PMCID: PMC5767953 DOI: 10.1093/gbe/evx240] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Accepted: 11/14/2017] [Indexed: 12/24/2022] Open
Abstract
The review discusses, in a format of a timeline, the studies of different types of genetic variants, present in Homo sapiens, but absent in all other primate, mammalian, or vertebrate species, tested so far. The main characteristic of these variants is that they are found in regions of high evolutionary conservation. These sequence variations include single nucleotide substitutions (called human accelerated regions), deletions, and segmental duplications. The rationale for finding such variations in the human genome is that they could be responsible for traits, specific to our species, of which the human brain is the most remarkable. As became obvious, the vast majority of human-specific single nucleotide substitutions are found in noncoding, likely regulatory regions. A number of genes, associated with these human-specific alleles, often through novel enhancer activity, were in fact shown to be implicated in human-specific development of certain brain areas, including the prefrontal cortex. Human-specific deletions may remove regulatory sequences, such as enhancers. Segmental duplications, because of their large size, create new coding sequences, like new functional paralogs. Further functional study of these variants will shed light on evolution of our species, as well as on the etiology of neurodevelopmental disorders.
Affiliation(s)
- Anastasia Levchenko
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
| | - Alexander Kanapin
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Department of Oncology, University of Oxford, United Kingdom
| | - Anastasia Samsonova
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Department of Oncology, University of Oxford, United Kingdom
| | - Raul R Gainetdinov
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Moscow, Russia
|
18
|
Breeze CE, Paul DS, van Dongen J, Butcher LM, Ambrose JC, Barrett JE, Lowe R, Rakyan VK, Iotchkova V, Frontini M, Downes K, Ouwehand WH, Laperle J, Jacques PÉ, Bourque G, Bergmann AK, Siebert R, Vellenga E, Saeed S, Matarese F, Martens JHA, Stunnenberg HG, Teschendorff AE, Herrero J, Birney E, Dunham I, Beck S. eFORGE: A Tool for Identifying Cell Type-Specific Signal in Epigenomic Data. Cell Rep 2017; 17:2137-2150. [PMID: 27851974 PMCID: PMC5120369 DOI: 10.1016/j.celrep.2016.10.059] [Citation(s) in RCA: 81] [Impact Index Per Article: 11.6] [Received: 02/18/2016] [Revised: 08/25/2016] [Accepted: 09/30/2016] [Indexed: 12/14/2022] Open
Abstract
Epigenome-wide association studies (EWAS) provide an alternative approach for studying human disease through consideration of non-genetic variants such as altered DNA methylation. To advance the complex interpretation of EWAS, we developed eFORGE (http://eforge.cs.ucl.ac.uk/), a new standalone and web-based tool for the analysis and interpretation of EWAS data. eFORGE determines the cell type-specific regulatory component of a set of EWAS-identified differentially methylated positions. This is achieved by detecting enrichment of overlap with DNase I hypersensitive sites across 454 samples (tissues, primary cell types, and cell lines) from the ENCODE, Roadmap Epigenomics, and BLUEPRINT projects. Application of eFORGE to 20 publicly available EWAS datasets identified disease-relevant cell types for several common diseases, a stem cell-like signature in cancer, and demonstrated the ability to detect cell-composition effects for EWAS performed on heterogeneous tissues. Our approach bridges the gap between large-scale epigenomics data and EWAS-derived target selection to yield insight into disease etiology.
Affiliation(s)
- Charles E Breeze
- UCL Cancer Institute, University College London, London WC1E 6BT, UK.
| | - Dirk S Paul
- UCL Cancer Institute, University College London, London WC1E 6BT, UK
| | - Jenny van Dongen
- Department of Biological Psychology, Vrije Universiteit Amsterdam, 1081BT Amsterdam, the Netherlands
| | - Lee M Butcher
- UCL Cancer Institute, University College London, London WC1E 6BT, UK; Department of Surgery and Cancer, Imperial College London, London W12 0NN, UK
| | - John C Ambrose
- UCL Cancer Institute, University College London, London WC1E 6BT, UK
| | - James E Barrett
- UCL Cancer Institute, University College London, London WC1E 6BT, UK
| | - Robert Lowe
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT London, UK
| | - Vardhman K Rakyan
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT London, UK
| | - Valentina Iotchkova
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Department of Human Genetics, The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1HH, UK
| | - Mattia Frontini
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; National Health Service (NHS) Blood and Transplant, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; British Heart Foundation Centre of Excellence, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0QQ, UK
| | - Kate Downes
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; National Health Service (NHS) Blood and Transplant, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK
| | - Willem H Ouwehand
- Department of Human Genetics, The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1HH, UK; Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; National Health Service (NHS) Blood and Transplant, University of Cambridge, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0PT, UK; British Heart Foundation Centre of Excellence, Cambridge Biomedical Campus, Long Road, Cambridge CB2 0QQ, UK
| | - Jonathan Laperle
- Département d'Informatique, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
| | - Pierre-Étienne Jacques
- Département d'Informatique, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada; Département de Biologie, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada; Centre de recherche du Centre hospitalier universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
| | - Guillaume Bourque
- Department of Human Genetics, McGill University, Montréal, QC H3G 1Y6, Canada; Génome Québec Innovation Center, Montréal, QC H3A 0G1, Canada
| | - Anke K Bergmann
- Institute of Human Genetics, Christian Albrechts University, 24105 Kiel, Germany; Department of Pediatrics, Christian-Albrechts-University Kiel & University Hospital Schleswig-Holstein, 24105 Kiel, Germany
| | - Reiner Siebert
- Institute of Human Genetics, Christian Albrechts University, 24105 Kiel, Germany; Institute of Human Genetics, University of Ulm, Albert-Einstein-Allee 11, 89081 Ulm, Germany
| | - Edo Vellenga
- Department of Hematology, University of Groningen and University Medical Center Groningen, PO Box 30001, 9700 RB Groningen, the Netherlands
| | - Sadia Saeed
- Department of Biochemistry, PMAS Arid Agriculture University Rawalpindi, 46300 Rawalpindi, Pakistan; Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University, 6500 HB Nijmegen, the Netherlands
| | - Filomena Matarese
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University, 6500 HB Nijmegen, the Netherlands
| | - Joost H A Martens
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University, 6500 HB Nijmegen, the Netherlands
| | - Hendrik G Stunnenberg
- Department of Molecular Biology, Faculty of Science, Nijmegen Centre for Molecular Life Sciences, Radboud University, 6500 HB Nijmegen, the Netherlands
| | | | - Javier Herrero
- UCL Cancer Institute, University College London, London WC1E 6BT, UK
| | - Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ian Dunham
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stephan Beck
- UCL Cancer Institute, University College London, London WC1E 6BT, UK.
|
19
|
Selecting optimal combinations of transcription factors to promote axon regeneration: Why mechanisms matter. Neurosci Lett 2016; 652:64-73. [PMID: 28025113 DOI: 10.1016/j.neulet.2016.12.032] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 08/11/2016] [Revised: 12/02/2016] [Accepted: 12/14/2016] [Indexed: 01/17/2023]
Abstract
Recovery from injuries to the central nervous system, including spinal cord injury, is constrained in part by the intrinsically low ability of many CNS neurons to mount an effective regenerative growth response. To improve outcomes, it is essential to understand and ultimately reverse these neuron-intrinsic constraints. Genetic manipulation of key transcription factors (TFs), which act to orchestrate production of multiple regeneration-associated genes, has emerged as a promising strategy. It is likely that no single TF will be sufficient to fully restore neuron-intrinsic growth potential, and that multiple, functionally interacting factors will be needed. An extensive literature, mostly from non-neural cell types, has identified potential mechanisms by which TFs can functionally synergize. Here we examine four potential mechanisms of TF/TF interaction: physical interaction, transcriptional cross-regulation, signaling-based cross-regulation, and co-occupancy of regulatory DNA. For each mechanism, we consider how existing knowledge can be used to guide the discovery and effective use of TF combinations in the context of regenerative neuroscience. This mechanistic insight into TF interactions is needed to accelerate the design of effective TF-based interventions to relieve neuron-intrinsic constraints to regeneration and to foster recovery from CNS injury.
|
20
|
Logie C, Stunnenberg HG. Epigenetic memory: A macrophage perspective. Semin Immunol 2016; 28:359-67. [DOI: 10.1016/j.smim.2016.06.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 02/09/2016] [Revised: 06/16/2016] [Accepted: 06/23/2016] [Indexed: 01/02/2023]
|
21
|
Murakawa Y, Yoshihara M, Kawaji H, Nishikawa M, Zayed H, Suzuki H, FANTOM Consortium, Hayashizaki Y. Enhanced Identification of Transcriptional Enhancers Provides Mechanistic Insights into Diseases. Trends Genet 2016; 32:76-88. [DOI: 10.1016/j.tig.2015.11.004] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Received: 07/31/2015] [Revised: 11/25/2015] [Accepted: 11/30/2015] [Indexed: 12/24/2022]
|
22
|
Labbadia J, Morimoto RI. Repression of the Heat Shock Response Is a Programmed Event at the Onset of Reproduction. Mol Cell 2015. [PMID: 26212459 DOI: 10.1016/j.molcel.2015.06.027] [Citation(s) in RCA: 226] [Impact Index Per Article: 25.1] [Indexed: 01/03/2023]
Abstract
The heat shock response (HSR) is essential for proteostasis and cellular health. In metazoans, aging is associated with a decline in quality control, thus increasing the risk for protein conformational disease. Here, we show that in C. elegans, the HSR declines precipitously over a 4 hr period in early adulthood coincident with the onset of reproductive maturity. Repression of the HSR occurs due to an increase in H3K27me3 marks at stress gene loci, the timing of which is determined by reduced expression of the H3K27 demethylase jmjd-3.1. This results in a repressed chromatin state that interferes with HSF-1 binding and suppresses transcription initiation in response to stress. The removal of germline stem cells preserves jmjd-3.1 expression, suppresses the accumulation of H3K27me3 at stress gene loci, and maintains the HSR. These findings suggest that competing requirements of the germline and soma dictate organismal stress resistance as animals begin reproduction.
Affiliation(s)
- Johnathan Labbadia
- Department of Molecular Biosciences, Rice Institute for Biomedical Research, Northwestern University, Evanston, IL 60208, USA
| | - Richard I Morimoto
- Department of Molecular Biosciences, Rice Institute for Biomedical Research, Northwestern University, Evanston, IL 60208, USA.
|
23
|
Gittelman RM, Hun E, Ay F, Madeoy J, Pennacchio L, Noble WS, Hawkins RD, Akey JM. Comprehensive identification and analysis of human accelerated regulatory DNA. Genome Res 2015; 25:1245-55. [PMID: 26104583 PMCID: PMC4561485 DOI: 10.1101/gr.192591.115] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 03/27/2015] [Accepted: 06/15/2015] [Indexed: 01/19/2023]
Abstract
It has long been hypothesized that changes in gene regulation have played an important role in human evolution, but regulatory DNA has been much more difficult to study compared with protein-coding regions. Recent large-scale studies have created genome-scale catalogs of DNase I hypersensitive sites (DHSs), which demark potentially functional regulatory DNA. To better define regulatory DNA that has been subject to human-specific adaptive evolution, we performed comprehensive evolutionary and population genetics analyses on over 18 million DHSs discovered in 130 cell types. We identified 524 DHSs that are conserved in nonhuman primates but accelerated in the human lineage (haDHS), and estimate that 70% of substitutions in haDHSs are attributable to positive selection. Through extensive computational and experimental analyses, we demonstrate that haDHSs are often active in brain or neuronal cell types; play an important role in regulating the expression of developmentally important genes, including many transcription factors such as SOX6, POU3F2, and HOX genes; and identify striking examples of adaptive regulatory evolution that may have contributed to human-specific phenotypes. More generally, our results reveal new insights into conserved and adaptive regulatory DNA in humans and refine the set of genomic substrates that distinguish humans from their closest living primate relatives.
Affiliation(s)
- Rachel M Gittelman
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Enna Hun
- Division of Medical Genetics, University of Washington, Seattle, Washington 98195, USA
| | - Ferhat Ay
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Jennifer Madeoy
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Len Pennacchio
- Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, California 94701, USA
| | - William S Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - R David Hawkins
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Division of Medical Genetics, University of Washington, Seattle, Washington 98195, USA
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
|
24
|
Tsompana M, Buck MJ. Chromatin accessibility: a window into the genome. Epigenetics Chromatin 2014; 7:33. [PMID: 25473421 PMCID: PMC4253006 DOI: 10.1186/1756-8935-7-33] [Citation(s) in RCA: 251] [Impact Index Per Article: 25.1] [Received: 09/15/2014] [Accepted: 11/05/2014] [Indexed: 01/09/2023] Open
Abstract
Transcriptional activation throughout the eukaryotic lineage has been tightly linked with disruption of nucleosome organization at promoters, enhancers, silencers, insulators and locus control regions due to transcription factor binding. Regulatory DNA thus coincides with open or accessible genomic sites of remodeled chromatin. Current chromatin accessibility assays are used to separate the genome by enzymatic or chemical means and isolate either the accessible or protected locations. The isolated DNA is then quantified using a next-generation sequencing platform. Wide application of these assays has recently focused on the identification of the instrumental epigenetic changes responsible for differential gene expression, cell proliferation, functional diversification and disease development. Here we discuss the limitations and advantages of current genome-wide chromatin accessibility assays with especial attention on experimental precautions and sequence data analysis. We conclude with our perspective on future improvements necessary for moving the field of chromatin profiling forward.
Affiliation(s)
- Maria Tsompana
- New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, 701 Ellicott St, Buffalo, NY 14203 USA
| | - Michael J Buck
- New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, 701 Ellicott St, Buffalo, NY 14203 USA ; Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY USA
|
25
|
Fattori J, Indolfo NDC, Campos JCLDO, Videira NB, Bridi AV, Doratioto TR, Assis MAD, Figueira ACM. Investigation of Interactions between DNA and Nuclear Receptors: A Review of the Most Used Methods. Nuclear Receptor Research 2014. [DOI: 10.11131/2014/101090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Affiliation(s)
- Juliana Fattori
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | - Nathalia de Carvalho Indolfo
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | | | - Natália Bernardi Videira
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | - Aline Villanova Bridi
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | - Tábata Renée Doratioto
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | - Michelle Alexandrino de Assis
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
| | - Ana Carolina Migliorini Figueira
- Brazilian Biosciences National Laboratory (LNBio), Brazilian Center for Research in Energy and Materials (CNPEM), P.O. Box 6192, Campinas-SP, Brazil
|
26
|
Son EY, Crabtree GR. The role of BAF (mSWI/SNF) complexes in mammalian neural development. American Journal of Medical Genetics Part C: Seminars in Medical Genetics 2014; 166C:333-49. [PMID: 25195934 DOI: 10.1002/ajmg.c.31416] [Citation(s) in RCA: 105] [Impact Index Per Article: 10.5] [Indexed: 01/03/2023]
Abstract
The BAF (mammalian SWI/SNF) complexes are a family of multi-subunit ATP-dependent chromatin remodelers that use ATP hydrolysis to alter chromatin structure. Distinct BAF complex compositions are possible through combinatorial assembly of homologous subunit families and can serve non-redundant functions. In mammalian neural development, developmental stage-specific BAF assemblies are found in embryonic stem cells, neural progenitors and postmitotic neurons. In particular, the neural progenitor-specific BAF complexes are essential for controlling the kinetics and mode of neural progenitor cell division, while neuronal BAF function is necessary for the maturation of postmitotic neuronal phenotypes as well as long-term memory formation. The microRNA-mediated mechanism for transitioning from npBAF to nBAF complexes is instructive for the neuronal fate and can even convert fibroblasts into neurons. The high frequency of BAF subunit mutations in neurological disorders underscores the rate-determining role of BAF complexes in neural development, homeostasis, and plasticity.
|
27
|
Zhang W, Zhang T, Wu Y, Jiang J. Open Chromatin in Plant Genomes. Cytogenet Genome Res 2014; 143:18-27. [DOI: 10.1159/000362827] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022] Open
|
28
|
John S, Sabo PJ, Canfield TK, Lee K, Vong S, Weaver M, Wang H, Vierstra J, Reynolds AP, Thurman RE, Stamatoyannopoulos JA. Genome-scale mapping of DNase I hypersensitivity. Current Protocols in Molecular Biology 2014; Chapter 27:Unit 21.27. [PMID: 23821440 DOI: 10.1002/0471142727.mb2127s103] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 11/11/2022]
Abstract
DNase I-seq is a global and high-resolution method that uses the nonspecific endonuclease DNase I to map chromatin accessibility. These accessible regions, designated as DNase I hypersensitive sites (DHSs), define the regulatory features (e.g., promoters, enhancers, insulators, and locus control regions) of complex genomes. In this unit, methods are described for nuclei isolation, digestion of nuclei with limiting concentrations of DNase I, and the biochemical fractionation of DNase I hypersensitive sites in preparation for high-throughput sequencing. DNase I-seq is an unbiased and robust method that is not predicated on an a priori understanding of regulatory patterns or chromatin features.
Affiliation(s)
- Sam John
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
|
29
|
Vigneault F, Guérin SL. Regulation of gene expression: probing DNA–protein interactions in vivo and in vitro. Expert Rev Proteomics 2014; 2:705-18. [PMID: 16209650 DOI: 10.1586/14789450.2.5.705] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022]
Abstract
Tremendous efforts have been put together over the last several years to complete the entire sequencing of the human genome. As we enter the proteomic era, when the major aim is understanding which gene encodes which protein, the time has also come to identify their precise function inside the astonishing signaling network required to accomplish all cellular functions. Understanding when, why and how a gene is expressed has now become a necessity toward identifying all the regulatory pathways that mediate cellular processes such as differentiation, migration, replication, DNA repair and apoptosis. Regulation of gene transcription is a process that is primarily under the influence of nuclear-located transcription factors. Consequently, identifying which protein activates or represses a specific gene is a prerequisite for understanding cell fate and function. The current state of, and recent advances in, transcriptional regulation approaches are reviewed here, with special emphasis on new technologies required when probing for DNA-protein interactions. This review explores different strategies aimed at identifying both the regulatory sequences of any given gene and the trans-acting regulatory factors that recognize these elements as their target sites in the nucleus. Ongoing developments in the fields of nanotechnology, RNA silencing and protein modeling toward the investigation of DNA-protein interactions and their relevance in the battle against cancer are discussed.
Affiliation(s)
- Francois Vigneault
- Laboratoire d'Endocrinologie Moléculaire et Oncologique, Centre de recherche du CHUL (CHUQ), Sainte-Foy, Québec, G1V 4G2, Canada.
|
30
|
McKay DJ, Lieb JD. A common set of DNA regulatory elements shapes Drosophila appendages. Dev Cell 2014; 27:306-18. [PMID: 24229644 DOI: 10.1016/j.devcel.2013.10.009] [Citation(s) in RCA: 113] [Impact Index Per Article: 11.3] [Received: 06/19/2013] [Revised: 09/19/2013] [Accepted: 10/13/2013] [Indexed: 12/20/2022]
Abstract
Animals have body parts made of similar cell types located at different axial positions, such as limbs. The identity and distinct morphology of each structure is often specified by the activity of different "master regulator" transcription factors. Although similarities in gene expression have been observed between body parts made of similar cell types, how regulatory information in the genome is differentially utilized to create morphologically diverse structures in development is not known. Here, we use genome-wide open chromatin profiling to show that among the Drosophila appendages, the same DNA regulatory modules are accessible throughout the genome at a given stage of development, except at the loci encoding the master regulators themselves. In addition, open chromatin profiles change over developmental time, and these changes are coordinated between different appendages. We propose that master regulators create morphologically distinct structures by differentially influencing the function of the same set of DNA regulatory modules.
Affiliation(s)
- Daniel J McKay
- Department of Biology, Carolina Center for Genome Sciences, and Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3280, USA.
|
31
|
Chen J, Li Q. Enhancing myogenic differentiation of pluripotent stem cells with small molecule inducers. Cell Biosci 2013; 3:40. [PMID: 24172312 PMCID: PMC3953345 DOI: 10.1186/2045-3701-3-40] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/12/2013] [Accepted: 08/16/2013] [Indexed: 03/01/2023] Open
Abstract
Pluripotent stem cells are able to differentiate into many types of cell lineages in response to differentiation cues. However, a pure population of lineage-specific cells is desirable for any potential clinical application. Therefore, induction of the pluripotent stem cells with lineage-specific regulatory signals, or small molecule inducers, is a prerequisite for effectively directing lineage specification for cell-based therapeutics. In this article, we provide in-depth analysis of recent research findings on small molecule inducers of the skeletal muscle lineage. We also provide perspectives on how different signaling pathways and chromatin dynamics converge to direct the differentiation of skeletal myocytes.
Affiliation(s)
| | - Qiao Li
- Department of Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada.
|
32
|
Pennacchio LA, Bickmore W, Dean A, Nobrega MA, Bejerano G. Enhancers: five essential questions. Nat Rev Genet 2013; 14:288-95. [PMID: 23503198 DOI: 10.1038/nrg3458] [Citation(s) in RCA: 350] [Impact Index Per Article: 31.8] [Indexed: 02/07/2023]
Abstract
It is estimated that the human genome contains hundreds of thousands of enhancers, so understanding these gene-regulatory elements is a crucial goal. Several fundamental questions need to be addressed about enhancers, such as how do we identify them all, how do they work, and how do they contribute to disease and evolution? Five prominent researchers in this field look at how much we know already and what needs to be done to answer these questions.
Affiliation(s)
- Len A Pennacchio
- Genomics Division, One Cyclotron Road, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
|
33
|
Probing DNA shape and methylation state on a genomic scale with DNase I. Proc Natl Acad Sci U S A 2013; 110:6376-81. [PMID: 23576721 DOI: 10.1073/pnas.1216822110] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Indexed: 11/18/2022]
Abstract
DNA binding proteins find their cognate sequences within genomic DNA through recognition of specific chemical and structural features. Here we demonstrate that high-resolution DNase I cleavage profiles can provide detailed information about the shape and chemical modification status of genomic DNA. Analyzing millions of DNA backbone hydrolysis events on naked genomic DNA, we show that the intrinsic rate of cleavage by DNase I closely tracks the width of the minor groove. Integration of these DNase I cleavage data with bisulfite sequencing data for the same cell type's genome reveals that cleavage directly adjacent to cytosine-phosphate-guanine (CpG) dinucleotides is enhanced at least eightfold by cytosine methylation. This phenomenon we show to be attributable to methylation-induced narrowing of the minor groove. Furthermore, we demonstrate that it enables simultaneous mapping of DNase I hypersensitivity and regional DNA methylation levels using dense in vivo cleavage data. Taken together, our results suggest a general mechanism by which CpG methylation can modulate protein-DNA interaction strength via the remodeling of DNA shape.
|
34
|
Abstract
Next-generation sequencing technologies need careful design of experiments and evaluation of results to meet field requirements. Here we discuss technical considerations for these high-throughput assays, together with criteria to assess the quality of the results and the necessary validation.
Affiliation(s)
- Weihua Zeng, Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, California, USA and at the Center for Complex Biological Systems, University of California, Irvine, California, USA
|
35
|
Ott CJ, Harris A. Genomic approaches for the discovery of CFTR regulatory elements. Transcription 2012; 2:23-7. [PMID: 21326906 DOI: 10.4161/trns.2.1.13693] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/16/2010] [Revised: 09/19/2010] [Accepted: 09/20/2010] [Indexed: 12/30/2022]
Abstract
Non-coding regions of the human genome contain vast regulatory potential that contributes to the coordination of gene expression. Indeed, regulatory elements can reside large genomic distances from the promoters of genes they control. Here we describe approaches recently used to identify functional elements within the complex CFTR locus.
Affiliation(s)
- Christopher J Ott
- Human Molecular Genetics Program, Children's Memorial Research Center, Chicago, IL, USA
|
36
|
Vazquez BN, Laguna T, Notario L, Lauzurica P. Evidence for an intronic cis-regulatory element within CD69 gene. Genes Immun 2012; 13:356-62. [PMID: 22456278 DOI: 10.1038/gene.2012.4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
CD69 is one of the earliest proteins expressed after leukocyte activation and its engagement is essential in the control of innate and adaptive immune responses. Inducible CD69 expression is strongly controlled at the transcriptional level. The molecular basis for developmental- and stage-specific regulation in T cells is beginning to be elucidated, while it remains largely unknown in the rest of the immune cells. DNase I hypersensitivity experiments in lymphocytes identified a novel hypersensitive region within mouse and human intron I, which was inducible upon stimulation. In silico analysis of the CD69 gene revealed that this open chromatin region was present in different cell types and was associated with positioned nucleosomes. Analysis of histone post-translational modifications of intron I indicated that acetylation and lysine 4 dimethylation of histone H3 were dynamically regulated during thymocyte development and were constitutively high in resting and stimulated mature T lymphocytes. Thus, we provide evidence for the existence of a cis-acting element in intron I that is more accessible to DNase I digestion and that is developmentally regulated at the chromatin level.
Affiliation(s)
- B N Vazquez
- Instituto de Salud Carlos III, Centro Nacional de Microbiología, Majadahonda, Madrid, Spain
|
37
|
Abstract
DNaseI-hypersensitive sites within chromatin are indicative of genomic loci with regulatory function. Several techniques have been described for analyzing these regions, but they are laborious, low-throughput, or expensive. We have developed a new approach based on a modified version of multiplex ligation-dependent probe amplification (MLPA). Using this method, it is possible to analyze up to 50 defined genomic regions for DNaseI-hypersensitivity in a single PCR-based reaction. This chapter outlines the approach and discusses the critical features of each step of the procedure.
Affiliation(s)
- Thomas Ohnesorg
- Molecular Development Laboratory, Murdoch Childrens Research Institute, Royal Children's Hospital, Melbourne, VIC, Australia
|
38
|
Francetic T, Le May M, Hamed M, Mach H, Meyers D, Cole PA, Chen J, Li Q. Regulation of Myf5 Early Enhancer by Histone Acetyltransferase p300 during Stem Cell Differentiation. Mol Biol 2012; 1. [PMID: 25382872 PMCID: PMC4222083 DOI: 10.4172/2168-9547.1000103] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
Skeletal myogenesis is an intricate process coordinated temporally by multiple myogenic regulatory factors (MRF) including Myf5, which is the first MRF expressed and marks the commitment of skeletal muscle lineage. The expression of the Myf5 gene during early embryogenesis is controlled by a set of enhancer elements, and requires the histone acetyltransferase (HAT) activity of transcriptional coactivator p300. However, it is unclear how different regulatory signals converge at enhancer elements to regulate early Myf5 gene expression, and whether p300 is directly involved. We show here that p300 associates with the Myf5 early enhancer at the early stage of stem cell differentiation, and its HAT activity is important for the recruitment of β-catenin to this early enhancer. In addition, histone H3-K27 acetylation, but not H3-K9/14, is intimately connected to the p300 HAT activity. Thus, p300 is directly involved in the regulation of the Myf5 early enhancer, and is important for specific histone acetylation and transcription factor recruitment. This connection of p300 HAT activity with H3-K27 acetylation and β-catenin signalling during myogenic differentiation in vitro offers a molecular insight into the enhancer-element participation observed in embryonic development. In addition, pluripotent stem cell differentiation is a valuable system to dissect the signal-dependent regulation of specific enhancer elements during cell fate determination.
Affiliation(s)
- Tanja Francetic
- Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Melanie Le May
- Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Munerah Hamed
- Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Hymn Mach
- Departments of Pathology and Laboratory Medicine, University of Ottawa, Ottawa, ON, Canada
- David Meyers
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Philip A Cole
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Jihong Chen
- Departments of Pathology and Laboratory Medicine, University of Ottawa, Ottawa, ON, Canada
- Qiao Li
- Departments of Pathology and Laboratory Medicine, University of Ottawa, Ottawa, ON, Canada; Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
|
39
|
Stergachis AB, MacLean B, Lee K, Stamatoyannopoulos JA, MacCoss MJ. Rapid empirical discovery of optimal peptides for targeted proteomics. Nat Methods 2011; 8:1041-3. [PMID: 22056677 PMCID: PMC3227787 DOI: 10.1038/nmeth.1770] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Received: 07/07/2011] [Accepted: 10/11/2011] [Indexed: 11/16/2022]
Abstract
We report a method for high-throughput, cost-efficient empirical discovery of optimal proteotypic peptides and fragment ions for targeted proteomics applications using in vitro-synthesized proteins. We demonstrate the approach using human transcription factors, which are typically difficult, low-abundance targets, with an overall success rate of 98%. We show further that targeted proteomic assays developed using our approach facilitate robust in vivo quantification of human transcription factors.
Affiliation(s)
- Andrew B Stergachis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
|
40
|
Kamath U, Shehu A, De Jong KA. A two-stage evolutionary approach for effective classification of hypersensitive DNA sequences. J Bioinform Comput Biol 2011; 9:399-413. [PMID: 21714132 DOI: 10.1142/s0219720011005586] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/31/2011] [Revised: 04/06/2011] [Accepted: 04/14/2011] [Indexed: 11/18/2022]
Abstract
Hypersensitive (HS) sites in genomic sequences are reliable markers of DNA regulatory regions that control gene expression. Annotation of regulatory regions is important in understanding phenotypical differences among cells and diseases linked to pathologies in protein expression. Several computational techniques are devoted to mapping out regulatory regions in DNA by initially identifying HS sequences. Statistical learning techniques like Support Vector Machines (SVM), for instance, are employed to classify DNA sequences as HS or non-HS. This paper proposes a method to automate the basic steps in designing an SVM that improves the accuracy of such classification. The method proceeds in two stages and makes use of evolutionary algorithms. An evolutionary algorithm first designs optimal sequence motifs to associate explicit discriminating feature vectors with input DNA sequences. A second evolutionary algorithm then designs SVM kernel functions and parameters that optimally separate the HS and non-HS classes. Results show that this two-stage method significantly improves SVM classification accuracy. The method promises to be generally useful in automating the analysis of biological sequences, and we post its source code on our website.
Affiliation(s)
- Uday Kamath
- Department of Computer Science, George Mason University, Fairfax, Virginia 20123, USA.
|
41
|
Assessing the effects of symmetry on motif discovery and modeling. PLoS One 2011; 6:e24908. [PMID: 21949783 PMCID: PMC3176789 DOI: 10.1371/journal.pone.0024908] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/20/2011] [Accepted: 08/19/2011] [Indexed: 11/23/2022]
Abstract
Background: Identifying the DNA binding sites for transcription factors is a key task in modeling the gene regulatory network of a cell. Predicting DNA binding sites computationally suffers from high false positives and false negatives due to various contributing factors, including the inaccurate models for transcription factor specificity. One source of inaccuracy in the specificity models is the assumption of asymmetry for symmetric models. Methodology/Principal Findings: Using simulation studies, so that the correct binding site model is known and various parameters of the process can be systematically controlled, we test different motif finding algorithms on both symmetric and asymmetric binding site data. We show that if the true binding site is asymmetric the results are unambiguous and the asymmetric model is clearly superior to the symmetric model. But if the true binding specificity is symmetric commonly used methods can infer, incorrectly, that the motif is asymmetric. The resulting inaccurate motifs lead to lower sensitivity and specificity than would the correct, symmetric models. We also show how the correct model can be obtained by the use of appropriate measures of statistical significance. Conclusions/Significance: This study demonstrates that the most commonly used motif-finding approaches usually model symmetric motifs incorrectly, which leads to higher than necessary false prediction errors. It also demonstrates how alternative motif-finding methods can correct the problem, providing more accurate motif models and reducing the errors. Furthermore, it provides criteria for determining whether a symmetric or asymmetric model is the most appropriate for any experimental dataset.
|
42
|
Splinter E, de Laat W. The complex transcription regulatory landscape of our genome: control in three dimensions. EMBO J 2011; 30:4345-55. [PMID: 21952046 DOI: 10.1038/emboj.2011.344] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 07/15/2011] [Accepted: 08/29/2011] [Indexed: 11/09/2022]
Abstract
The non-coding part of our genome contains sequence motifs that can control gene transcription over distance. Here, we discuss functional genomics studies that uncover and characterize these sequences across the mammalian genome. The picture emerging is of a genome being a complex regulatory landscape. We explore the principles that underlie the wiring of regulatory DNA sequences and genes. We argue transcriptional control over distance can be understood when considering action in the context of the folded genome. Genome topology is expected to differ between individual cells, and this may cause variegated expression. High-resolution three-dimensional genome topology maps, ultimately of single cells, are required to understand the cis-regulatory networks that underlie cellular transcriptomes.
Affiliation(s)
- Erik Splinter
- Hubrecht Institute-KNAW & University Medical Center Utrecht, Utrecht, The Netherlands
|
43
|
esBAF facilitates pluripotency by conditioning the genome for LIF/STAT3 signalling and by regulating polycomb function. Nat Cell Biol 2011; 13:903-13. [PMID: 21785422 PMCID: PMC3155811 DOI: 10.1038/ncb2285] [Citation(s) in RCA: 211] [Impact Index Per Article: 16.2] [Received: 12/13/2010] [Accepted: 05/31/2011] [Indexed: 12/19/2022]
Abstract
Signaling by the cytokine LIF and its downstream transcription factor, STAT3, prevents differentiation of pluripotent embryonic stem cells (ESCs) by opposing MAP kinase signaling. This contrasts with most cell types, where STAT3 signaling induces differentiation. We find that STAT3 binding across the pluripotent genome is dependent upon Brg, the ATPase subunit of a specialized chromatin remodeling complex (esBAF) found in ESCs. Brg is required to establish chromatin accessibility at STAT3 binding targets, in essence preparing these sites to respond to LIF signaling. Moreover, Brg deletion leads to rapid Polycomb (PcG) binding and H3K27me3-mediated silencing of many Brg-activated targets genome-wide, including the target genes of the LIF signaling pathway. Hence, one crucial role of Brg in ESCs involves its ability to potentiate LIF signaling by opposing PcG. Contrary to expectations, Brg also facilitates PcG function at classical PcG targets, including all four Hox loci, reinforcing their repression in ESCs. These findings reveal that esBAF does not simply antagonize PcG, but rather, the two chromatin regulators act both antagonistically and synergistically with the common goal of supporting pluripotency.
|
44
|
Extensive chromatin remodelling and establishment of transcription factor 'hotspots' during early adipogenesis. EMBO J 2011; 30:1459-72. [PMID: 21427703 DOI: 10.1038/emboj.2011.65] [Citation(s) in RCA: 275] [Impact Index Per Article: 21.2] [Received: 09/30/2010] [Accepted: 02/17/2011] [Indexed: 12/27/2022]
Abstract
Adipogenesis is tightly controlled by a complex network of transcription factors acting at different stages of differentiation. Peroxisome proliferator-activated receptor γ (PPARγ) and CCAAT/enhancer-binding protein (C/EBP) family members are key regulators of this process. We have employed DNase I hypersensitive site analysis to investigate the genome-wide changes in chromatin structure that accompany the binding of adipogenic transcription factors. These analyses revealed a dramatic and dynamic modulation of the chromatin landscape during the first hours of adipocyte differentiation that coincides with cooperative binding of multiple early transcription factors (including glucocorticoid receptor, retinoid X receptor, Stat5a, C/EBPβ and -δ) to transcription factor 'hotspots'. Our results demonstrate that C/EBPβ marks a large number of these transcription factor 'hotspots' before induction of differentiation and chromatin remodelling and is required for their establishment. Furthermore, a subset of early remodelled C/EBP-binding sites persists throughout differentiation and is later occupied by PPARγ, indicating that early C/EBP family members, in addition to their well-established role in activation of PPARγ transcription, may act as pioneering factors for PPARγ binding.
|
45
|
Landry JW, Banerjee S, Taylor B, Aplan PD, Singer A, Wu C. Chromatin remodeling complex NURF regulates thymocyte maturation. Genes Dev 2011; 25:275-86. [PMID: 21289071 DOI: 10.1101/gad.2007311] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 11/25/2022]
Abstract
The maturation of T cells requires signaling from both cytokine and T-cell receptors to gene targets in chromatin, but how chromatin architecture influences this process is largely unknown. Here we show that thymocyte maturation post-positive selection is dependent on the nucleosome remodeling factor (NURF). Depletion of Bptf (bromodomain PHD finger transcription factor), the largest NURF subunit, in conditional mouse mutants results in developmental arrest beyond the CD4(+) CD8(int) stage without affecting cellular proliferation, cellular apoptosis, or coreceptor gene expression. In the Bptf mutant, specific subsets of genes important for thymocyte development show aberrant expression. We also observed defects in DNase I-hypersensitive chromatin structures at Egr1, a prototypical Bptf-dependent gene that is required for efficient thymocyte development. Moreover, chromatin binding of the sequence-specific factor Srf (serum response factor) to Egr1 regulatory sites is dependent on Bptf function. Physical interactions between NURF and Srf suggest a model in which Srf recruits NURF to facilitate transcription factor binding at Bptf-dependent genes. These findings provide evidence for causal connections between NURF, transcription factor occupancy, and gene regulation during thymocyte development.
Affiliation(s)
- Joseph W Landry
- Laboratory of Biochemistry and Molecular Cell Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
46
|
Chen X, Hoffman MM, Bilmes JA, Hesselberth JR, Noble WS. A dynamic Bayesian network for identifying protein-binding footprints from single molecule-based sequencing data. Bioinformatics 2010; 26:i334-42. [PMID: 20529925 PMCID: PMC2881360 DOI: 10.1093/bioinformatics/btq175] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 12/30/2022]
Abstract
Motivation: A global map of transcription factor binding sites (TFBSs) is critical to understanding gene regulation and genome function. DNaseI digestion of chromatin coupled with massively parallel sequencing (digital genomic footprinting) enables the identification of protein-binding footprints with high resolution on a genome-wide scale. However, accurately inferring the locations of these footprints remains a challenging computational problem. Results: We present a dynamic Bayesian network-based approach for the identification and assignment of statistical confidence estimates to protein-binding footprints from digital genomic footprinting data. The method, DBFP, allows footprints to be identified in a probabilistic framework and outperforms our previously described algorithm in terms of precision at a fixed recall. Applied to a digital footprinting data set from Saccharomyces cerevisiae, DBFP identifies 4679 statistically significant footprints within intergenic regions. These footprints are mainly located near transcription start sites and are strongly enriched for known TFBSs. Footprints containing no known motif are preferentially located proximal to other footprints, consistent with cooperative binding of these footprints. DBFP also identifies a set of statistically significant footprints in the yeast coding regions. Many of these footprints coincide with the boundaries of antisense transcripts, and the most significant footprints are enriched for binding sites of the chromatin-associated factors Abf1 and Rap1. Contact: jay.hesselberth@ucdenver.edu; william-noble@u.washington.edu Supplementary information: Supplementary material is available at Bioinformatics online.
Affiliation(s)
- Xiaoyu Chen
- Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
|
47
|
Abstract
Integrating results from diverse experiments is an essential process in our effort to understand the logic of complex systems, such as development, homeostasis and responses to the environment. With the advent of high-throughput methods, including genome-wide association (GWA) studies, chromatin immunoprecipitation followed by sequencing (ChIP-seq) and RNA sequencing (RNA-seq), acquisition of genome-scale data has never been easier. Epigenomics, transcriptomics, proteomics and genomics each provide an insightful, and yet one-dimensional, view of genome function; integrative analysis promises a unified, global view. However, the large amount of information and diverse technology platforms pose multiple challenges for data access and processing. This Review discusses emerging issues and strategies related to data integration in the era of next-generation genomics.
Affiliation(s)
- R. David Hawkins, Gary C. Hon, Bing Ren
- Ludwig Institute for Cancer Research, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, CA 92093-0653
|
48
|
Gheldof N, Smith EM, Tabuchi TM, Koch CM, Dunham I, Stamatoyannopoulos JA, Dekker J. Cell-type-specific long-range looping interactions identify distant regulatory elements of the CFTR gene. Nucleic Acids Res 2010; 38:4325-36. [PMID: 20360044 PMCID: PMC2910055 DOI: 10.1093/nar/gkq175] [Citation(s) in RCA: 86] [Impact Index Per Article: 6.1] [Received: 10/17/2009] [Revised: 03/01/2010] [Accepted: 03/03/2010] [Indexed: 12/20/2022]
Abstract
Identification of regulatory elements and their target genes is complicated by the fact that regulatory elements can act over large genomic distances. Identification of long-range acting elements is particularly important in the case of disease genes as mutations in these elements can result in human disease. It is becoming increasingly clear that long-range control of gene expression is facilitated by chromatin looping interactions. These interactions can be detected by chromosome conformation capture (3C). Here, we employed 3C as a discovery tool for identification of long-range regulatory elements that control the cystic fibrosis transmembrane conductance regulator gene, CFTR. We identified four elements in a 460-kb region around the locus that loop specifically to the CFTR promoter exclusively in CFTR expressing cells. The elements are located 20 and 80 kb upstream; and 109 and 203 kb downstream of the CFTR promoter. These elements contain DNase I hypersensitive sites and histone modification patterns characteristic of enhancers. The elements also interact with each other and the latter two activate the CFTR promoter synergistically in reporter assays. Our results reveal novel long-range acting elements that control expression of CFTR and suggest that 3C-based approaches can be used for discovery of novel regulatory elements.
Affiliation(s)
- Nele Gheldof, Emily M. Smith, Tomoko M. Tabuchi, Christoph M. Koch, Ian Dunham, John A. Stamatoyannopoulos, Job Dekker
- Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605-0103, USA, European Bioinformatics Institute (EBI), The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK and Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
|
49
|
Li LM, Arnosti DN. Fine mapping of chromatin structure in Drosophila melanogaster embryos using micrococcal nuclease. Fly (Austin) 2010; 4:213-5. [PMID: 20519935 DOI: 10.4161/fly.4.3.12200] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
Abstract
The structure of chromatin in eukaryotes exerts significant influences on many DNA related processes, including transcription, replication, recombination and repair. A useful tool for mapping chromatin structure is micrococcal nuclease (MNase), which induces double-strand breaks within nucleosome linker regions, and with more extensive digestion, single-strand nicks within the nucleosome itself. Many studies, carried out largely with microbes and cell cultures, have used MNase to determine the positions of nucleosomes within a region of DNA to identify dynamic changes induced during gene regulation. To measure similar processes in a developmental context, we turned to a tractable model system, the Drosophila embryo. Here we describe a protocol that enables MNase mapping of the enhancer chromatin structure in the embryo, and show how it can be used to identify structural changes on a cis-regulatory element targeted by the Knirps repressor.
Affiliation(s)
- Li M Li
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
|
50
|
Placek K, Gasparian S, Coffre M, Maiella S, Sechet E, Bianchi E, Rogge L. Integration of distinct intracellular signaling pathways at distal regulatory elements directs T-bet expression in human CD4+ T cells. J Immunol 2010; 183:7743-51. [PMID: 19923468 DOI: 10.4049/jimmunol.0803812] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/14/2023]
Abstract
T-bet is a key regulator controlling Th1 cell development. This factor is not expressed in naive CD4(+) T cells, and the mechanisms controlling expression of T-bet are incompletely understood. In this study, we defined regulatory elements at the human T-bet locus and determined how signals originating at the TCR and at cytokine receptors are integrated to induce chromatin modifications and expression of this gene during human Th1 cell differentiation. We found that T cell activation induced two strong DNase I-hypersensitive sites (HS) and rapid histone acetylation at these elements in CD4(+) T cells. Histone acetylation and T-bet expression were strongly inhibited by cyclosporine A, and we detected binding of NF-AT to a HS in vivo. IL-12 and IFN-gamma signaling alone were not sufficient to induce T-bet expression in naive CD4(+) T cells, but enhanced T-bet expression in TCR/CD28-stimulated cells. We detected a third HS 12 kb upstream of the mRNA start site only in developing Th1 cells, which was bound by IL-12-induced STAT4. Our data suggest that T-bet locus remodeling and gene expression are initiated by TCR-induced NF-AT recruitment and amplified by IL-12-mediated STAT4 binding to distinct distal regulatory elements during human Th1 cell differentiation.
Affiliation(s)
- Katarzyna Placek
- Institut Pasteur, Immunoregulation Unit and Centre National de la Recherche Scientifique Unité de Recherche Associée 1961, Department of Immunology, Paris, France
|