1
Madrigal P, Deng S, Feng Y, Militi S, Goh KJ, Nibhani R, Grandy R, Osnato A, Ortmann D, Brown S, Pauklin S. Epigenetic and transcriptional regulations prime cell fate before division during human pluripotent stem cell differentiation. Nat Commun 2023; 14:405. [PMID: 36697417 PMCID: PMC9876972 DOI: 10.1038/s41467-023-36116-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/23/2022] [Accepted: 01/17/2023] [Indexed: 01/26/2023]
Abstract
Stem cells undergo cellular division during their differentiation to produce daughter cells with a new cellular identity. However, the epigenetic events and molecular mechanisms occurring between consecutive cell divisions have been insufficiently studied due to technical limitations. Here, using the FUCCI reporter we developed a cell-cycle synchronised human pluripotent stem cell (hPSC) differentiation system for uncovering epigenome and transcriptome dynamics during the first two divisions leading to definitive endoderm. We observed that transcription of key differentiation markers occurs before cell division, while chromatin accessibility analyses revealed the early inhibition of alternative cell fates. We found that Activator protein-1 members controlled by p38/MAPK signalling are necessary for inducing endoderm while blocking cell fate shifting toward mesoderm, and that enhancers are rapidly established and decommissioned between different cell divisions. Our study has practical biomedical utility for producing hPSC-derived patient-specific cell types since p38/MAPK induction increased the differentiation efficiency of insulin-producing pancreatic beta-cells.
Affiliation(s)
- Pedro Madrigal
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome - MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, CB2 0SZ, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Siwei Deng
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Headington, Oxford, OX3 7LD, UK
- Yuliang Feng
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Headington, Oxford, OX3 7LD, UK
- Stefania Militi
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Headington, Oxford, OX3 7LD, UK
- Kim Jee Goh
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- The Francis Crick Institute, London, NW1 1AT, UK
- Reshma Nibhani
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Headington, Oxford, OX3 7LD, UK
- Rodrigo Grandy
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- Anna Osnato
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- Daniel Ortmann
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- Stephanie Brown
- Department of Surgery, University of Cambridge, Cambridge, CB2 0QQ, UK
- Siim Pauklin
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Headington, Oxford, OX3 7LD, UK.
2
Xu B, Li X, Gao X, Jia Y, Liu J, Li F, Zhang Z. DeNOPA: decoding nucleosome positions sensitively with sparse ATAC-seq data. Brief Bioinform 2021; 23:6454261. [PMID: 34875002 DOI: 10.1093/bib/bbab469] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2021] [Revised: 10/09/2021] [Accepted: 10/13/2021] [Indexed: 12/25/2022]
Abstract
As the basal bricks, the dynamics and arrangement of nucleosomes orchestrate the higher architecture of chromatin in a fundamental way, thereby affecting almost all nuclear biology processes. Thanks to its rather simple protocol, assay for transposase-accessible chromatin using sequencing (ATAC)-seq has been rapidly adopted as a major tool for chromatin-accessible profiling at both bulk and single-cell levels; however, to picture the arrangement of nucleosomes per se remains a challenge with ATAC-seq. In the present work, we introduce a novel ATAC-seq analysis toolkit, named decoding nucleosome organization profile based on ATAC-seq data (deNOPA), to predict nucleosome positions. Assessments showed that deNOPA outperformed state-of-the-art tools with ultra-sparse ATAC-seq data, e.g. no more than 0.5 fragment per base pair. The remarkable performance of deNOPA was fueled by the short fragment reads, which compose nearly half of sequenced reads in the ATAC-seq libraries and are commonly discarded by state-of-the-art nucleosome positioning tools. However, we found that the short fragment reads enrich information on nucleosome positions and that the linker regions were predicted by reads from both short and long fragments using Gaussian smoothing. Last, using deNOPA, we showed that the dynamics of nucleosome organization may not directly couple with chromatin accessibility in the cis-regulatory regions when human cells respond to heat shock stimulation. Our deNOPA provides a powerful tool with which to analyze the dynamics of chromatin at nucleosome position level with ultra-sparse ATAC-seq data.
Affiliation(s)
- Bingxiang Xu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- School of Kinesiology, Shanghai University of Sport, Shanghai, China
- Xiaoli Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- Xiaomeng Gao
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- Yan Jia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- Jing Liu
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, P.R. China
- Feifei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, P.R. China
3
Li H, Guan Y. Fast decoding cell type-specific transcription factor binding landscape at single-nucleotide resolution. Genome Res 2021; 31:721-731. [PMID: 33741685 PMCID: PMC8015851 DOI: 10.1101/gr.269613.120] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/30/2020] [Accepted: 02/17/2021] [Indexed: 01/22/2023]
Abstract
Decoding the cell type-specific transcription factor (TF) binding landscape at single-nucleotide resolution is crucial for understanding the regulatory mechanisms underlying many fundamental biological processes and human diseases. However, limits on time and resources restrict the high-resolution experimental measurements of TF binding profiles of all possible TF-cell type combinations. Previous computational approaches either cannot distinguish the cell context-dependent TF binding profiles across diverse cell types or can only provide a relatively low-resolution prediction. Here we present a novel deep learning approach, Leopard, for predicting TF binding sites at single-nucleotide resolution, achieving the average area under receiver operating characteristic curve (AUROC) of 0.982 and the average area under precision recall curve (AUPRC) of 0.208. Our method substantially outperformed the state-of-the-art methods Anchor and FactorNet, improving the predictive AUPRC by 19% and 27%, respectively, when evaluated at 200-bp resolution. Meanwhile, by leveraging a many-to-many neural network architecture, Leopard features a hundredfold to thousandfold speedup compared with current many-to-one machine learning methods.
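The abstract reports a very high average AUROC (0.982) alongside a much lower average AUPRC (0.208). The sketch below shows one common way to compute both quantities from labels and scores. It is not the authors' evaluation code: the function names are hypothetical, and average precision is used here as a standard approximation of the area under the precision-recall curve.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative
    (ties count as half a win)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the rank
    of each positive, with items ranked by decreasing score.
    Assumes at least one positive label; tied scores are not treated specially."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example (made-up values): one positive among four positions.
labels = [0, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.1]
print(auroc(labels, scores), average_precision(labels, scores))
```

When positives are rare, as binding sites are at single-nucleotide resolution, a handful of high-scoring negatives lowers precision sharply while barely moving AUROC, which is one reason the two metrics can diverge.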
Affiliation(s)
- Hongyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
4
Youn A, Marquez EJ, Lawlor N, Stitzel ML, Ucar D. BiFET: sequencing Bias-free transcription factor Footprint Enrichment Test. Nucleic Acids Res 2019; 47:e11. [PMID: 30428075 PMCID: PMC6344870 DOI: 10.1093/nar/gky1117] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/19/2018] [Accepted: 10/23/2018] [Indexed: 01/15/2023]
Abstract
Transcription factor (TF) footprinting uncovers putative protein–DNA binding via combined analyses of chromatin accessibility patterns and their underlying TF sequence motifs. TF footprints are frequently used to identify TFs that regulate activities of cell/condition-specific genomic regions (target loci) in comparison to control regions (background loci) using standard enrichment tests. However, there is a strong association between the chromatin accessibility level and the GC content of a locus and the number and types of TF footprints that can be detected at this site. Traditional enrichment tests (e.g. hypergeometric) do not account for this bias and inflate false positive associations. Therefore, we developed a novel post-processing method, Bias-free Footprint Enrichment Test (BiFET), that corrects for the biases arising from the differences in chromatin accessibility levels and GC contents between target and background loci in footprint enrichment analyses. We applied BiFET on TF footprint calls obtained from EndoC-βH1 ATAC-seq samples using three different algorithms (CENTIPEDE, HINT-BC and PIQ) and showed BiFET’s ability to increase power and reduce false positive rate when compared to hypergeometric test. Furthermore, we used BiFET to study TF footprints from human PBMC and pancreatic islet ATAC-seq samples to show its utility to identify putative TFs associated with cell-type-specific loci.
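For context, the traditional hypergeometric enrichment test that the abstract uses as its baseline can be sketched as follows. This is only the uncorrected baseline, not the BiFET method itself, and the function and parameter names are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, n_target, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N        : total number of loci (target + background)
    K        : loci (among all N) that contain a footprint for the TF
    n_target : number of target loci
    k        : target loci that contain the footprint
    """
    upper = min(n_target, K)
    total = comb(N, n_target)
    # Sum the probability mass for k, k+1, ..., upper footprint-bearing targets.
    tail = sum(comb(K, i) * comb(N - K, n_target - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers (made up): 40 of 1000 loci carry the footprint, 12 of 100 targets do.
print(hypergeom_enrichment_pvalue(12, 100, 40, 1000))
```

This test treats every locus as equally likely to carry a footprint. The abstract's point is that this assumption fails when target and background loci differ in accessibility or GC content.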
Affiliation(s)
- Ahrim Youn
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Eladio J Marquez
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Nathan Lawlor
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT 06030, USA
- Department of Genetics & Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
- Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT 06030, USA
- Department of Genetics & Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
5
Frerichs A, Engelhorn J, Altmüller J, Gutierrez-Marcos J, Werr W. Specific chromatin changes mark lateral organ founder cells in the Arabidopsis inflorescence meristem. J Exp Bot 2019; 70:3867-3879. [PMID: 31037302 PMCID: PMC6685650 DOI: 10.1093/jxb/erz181] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/25/2019] [Accepted: 04/18/2019] [Indexed: 05/20/2023]
Abstract
Fluorescence-activated cell sorting (FACS) and assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) were combined to analyse the chromatin state of lateral organ founder cells (LOFCs) in the peripheral zone of the Arabidopsis apetala1-1 cauliflower-1 double mutant inflorescence meristem. On a genome-wide level, we observed a striking correlation between transposase hypersensitive sites (THSs) detected by ATAC-seq and DNase I hypersensitive sites (DHSs). The mostly expanded DHSs were often substructured into several individual THSs, which correlated with phylogenetically conserved DNA sequences or enhancer elements. Comparing chromatin accessibility with available RNA-seq data, THS change configuration was reflected by gene activation or repression and chromatin regions acquired or lost transposase accessibility in direct correlation with gene expression levels in LOFCs. This was most pronounced immediately upstream of the transcription start, where genome-wide THSs were abundant in a complementary pattern to established H3K4me3 activation or H3K27me3 repression marks. At this resolution, the combined application of FACS/ATAC-seq is widely applicable to detect chromatin changes during cell-type specification and facilitates the detection of regulatory elements in plant promoters.
Affiliation(s)
- Anneke Frerichs
- Developmental Biology, Department of Biology, Biocenter, University of Cologne, Cologne, Germany
- Julia Engelhorn
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg, Cologne, Germany
- Institute for Molecular Physiology, Heinrich-Heine-Universität, Düsseldorf, Germany
- Janine Altmüller
- Cologne Center for Genomics (CCG), University of Cologne, Weyertal Cologne, Germany
- Wolfgang Werr
- Developmental Biology, Department of Biology, Biocenter, University of Cologne, Cologne, Germany
6
Karabacak Calviello A, Hirsekorn A, Wurmus R, Yusuf D, Ohler U. Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets using protocol-specific bias modeling. Genome Biol 2019; 20:42. [PMID: 30791920 PMCID: PMC6385462 DOI: 10.1186/s13059-019-1654-y] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 04/18/2018] [Accepted: 02/13/2019] [Indexed: 01/01/2023]
Abstract
BACKGROUND: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq.
RESULTS: Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.
CONCLUSIONS: We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.
Affiliation(s)
- Aslıhan Karabacak Calviello
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
- Department of Biology, Humboldt University, Berlin, Germany
- Antje Hirsekorn
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
- Ricardo Wurmus
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
- Dilmurat Yusuf
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany
- Uwe Ohler
- Max Delbrück Center for Molecular Medicine, Berlin Institute for Medical Systems Biology, Berlin, Germany.
- Department of Biology, Humboldt University, Berlin, Germany.
- Department of Computer Science, Humboldt University, Berlin, Germany.
7
Ou J, Liu H, Yu J, Kelliher MA, Castilla LH, Lawson ND, Zhu LJ. ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data. BMC Genomics 2018; 19:169. [PMID: 29490630 PMCID: PMC5831847 DOI: 10.1186/s12864-018-4559-3] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 10/27/2017] [Accepted: 02/20/2018] [Indexed: 12/31/2022]
Abstract
BACKGROUND: ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets.
RESULTS: We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset.
CONCLUSIONS: This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments. The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html.
Affiliation(s)
- Jianhong Ou
- Department of Cell Biology, Duke University Medical Center, Durham, NC 27710 USA
- Haibo Liu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Jun Yu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Michelle A. Kelliher
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Lucio H. Castilla
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Nathan D. Lawson
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Lihua Julie Zhu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605 USA
- Department of Molecular Medicine, Program in Bioinformatics and Integrative Biology, Worcester, MA 01655 USA
8
Aughey GN, Estacio Gomez A, Thomson J, Yin H, Southall TD. CATaDa reveals global remodelling of chromatin accessibility during stem cell differentiation in vivo. eLife 2018; 7:32341. [PMID: 29481322 PMCID: PMC5826290 DOI: 10.7554/elife.32341] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 09/27/2017] [Accepted: 01/30/2018] [Indexed: 01/09/2023]
Abstract
During development eukaryotic gene expression is coordinated by dynamic changes in chromatin structure. Measurements of accessible chromatin are used extensively to identify genomic regulatory elements. Whilst chromatin landscapes of pluripotent stem cells are well characterised, chromatin accessibility changes in the development of somatic lineages are not well defined. Here we show that cell-specific chromatin accessibility data can be produced via ectopic expression of E. coli Dam methylase in vivo, without the requirement for cell-sorting (CATaDa). We have profiled chromatin accessibility in individual cell-types of Drosophila neural and midgut lineages. Functional cell-type-specific enhancers were identified, as well as novel motifs enriched at different stages of development. Finally, we show global changes in the accessibility of chromatin between stem-cells and their differentiated progeny. Our results demonstrate the dynamic nature of chromatin accessibility in somatic tissues during stem cell differentiation and provide a novel approach to understanding gene regulatory mechanisms underlying development. For an embryo to successfully develop into an adult animal, specific genes must act in different types of cells. Though all the cells have the same genes encoded within their DNA, looking at the way that the DNA is packaged can indicate which parts of the DNA are important for that particular cell type. If regions of DNA are “open” one can infer that those regions are actively involved in gene regulation, whereas “closed” regions are considered less important. It is currently difficult to determine which parts of the DNA are open within an individual cell type in a complex organ, such as the brain. Existing methods require the cells to be physically isolated from the tissue, which is technically challenging. To overcome this issue, Aughey et al. have now developed a method that does not require isolation of the cells. 
The new technique involves using genetic engineering to introduce an enzyme called Dam into specific cell types in living fruit flies. This enzyme adds a chemical label on regions of open DNA, which can then be detected. Aughey et al. tested this technique on various cells of the developing brain and gut, and were able to see differences in the openness of DNA that corresponded to the action of genes that are important in each cell type. The data also contain trends that help to understand the role of open DNA in development. For example, mature cells were shown to overall have less open DNA than the stem cells that divide to generate them. Aughey et al. hope their new technique will be of use to other researchers working with either fruit flies or mammalian tissues. The knowledge that scientists will gain from identifying how open DNA contributes to gene regulation, in both healthy and diseased tissues, will further our understanding of human development and the biology of diseases such as cancer.
Affiliation(s)
- Gabriel N Aughey
- Department of Life Sciences, Imperial College London, London, United Kingdom
- Jamie Thomson
- Department of Life Sciences, Imperial College London, London, United Kingdom
- Hang Yin
- Department of Life Sciences, Imperial College London, London, United Kingdom
- Tony D Southall
- Department of Life Sciences, Imperial College London, London, United Kingdom
9
Oudelaar AM, Davies JOJ, Downes DJ, Higgs DR, Hughes JR. Robust detection of chromosomal interactions from small numbers of cells using low-input Capture-C. Nucleic Acids Res 2018; 45:e184. [PMID: 29186505 PMCID: PMC5728395 DOI: 10.1093/nar/gkx1194] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/28/2017] [Accepted: 11/21/2017] [Indexed: 02/06/2023]
Abstract
Chromosome conformation capture (3C) techniques are crucial to understanding tissue-specific regulation of gene expression, but current methods generally require large numbers of cells. This hampers the investigation of chromatin architecture in rare cell populations. We present a new low-input Capture-C approach that can generate high-quality 3C interaction profiles from 10 000-20 000 cells, depending on the resolution used for analysis. We also present a PCR-free, sequencing-free 3C technique based on NanoString technology called C-String. By comparing C-String and Capture-C interaction profiles we show that the latter are not skewed by PCR amplification. Furthermore, we demonstrate that chromatin interactions detected by Capture-C do not depend on the degree of cross-linking by performing experiments with varying formaldehyde concentrations.
Affiliation(s)
- A Marieke Oudelaar
- Medical Research Council (MRC) Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
- James O J Davies
- Medical Research Council (MRC) Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
- Damien J Downes
- Medical Research Council (MRC) Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
- Douglas R Higgs
- Medical Research Council (MRC) Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
- Jim R Hughes
- Medical Research Council (MRC) Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
10
Grbesa I, Tannenbaum M, Sarusi-Portuguez A, Schwartz M, Hakim O. Mapping Genome-wide Accessible Chromatin in Primary Human T Lymphocytes by ATAC-Seq. J Vis Exp 2017. [PMID: 29155775 DOI: 10.3791/56313] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]
Abstract
Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) is a method used for the identification of open (accessible) regions of chromatin. These regions represent regulatory DNA elements (e.g., promoters, enhancers, locus control regions, insulators) to which transcription factors bind. Mapping the accessible chromatin landscape is a powerful approach for uncovering active regulatory elements across the genome. This information serves as an unbiased approach for discovering the network of relevant transcription factors and mechanisms of chromatin structure that govern gene expression programs. ATAC-seq is a robust and sensitive alternative to DNase I hypersensitivity analysis coupled with next-generation sequencing (DNase-seq) and formaldehyde-assisted isolation of regulatory elements (FAIRE-seq) for genome-wide analysis of chromatin accessibility and to the sequencing of micrococcal nuclease-sensitive sites (MNase-seq) to determine nucleosome positioning. We present a detailed ATAC-seq protocol optimized for human primary immune cells i.e. CD4+ lymphocytes (T helper 1 (Th1) and Th2 cells). This comprehensive protocol begins with cell harvest, then describes the molecular procedure of chromatin tagmentation, sample preparation for next-generation sequencing, and also includes methods and considerations for the computational analyses used to interpret the results. Moreover, to save time and money, we introduced quality control measures to assess the ATAC-seq library prior to sequencing. Importantly, the principles presented in this protocol allow its adaptation to other human immune and non-immune primary cells and cell lines. These guidelines will also be useful for laboratories which are not proficient with next-generation sequencing methods.
Affiliation(s)
- Ivana Grbesa
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University
- Miriam Tannenbaum
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University
- Michal Schwartz
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University
- Ofir Hakim
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University
11
Lee PJ, Choudhary MNK, Wang T. Online resources for studies of genome biology and epigenetics. Curr Opin Toxicol 2017. [DOI: 10.1016/j.cotox.2017.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
12
Schwessinger R, Suciu MC, McGowan SJ, Telenius J, Taylor S, Higgs DR, Hughes JR. Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints. Genome Res 2017; 27:1730-1742. [PMID: 28904015 PMCID: PMC5630036 DOI: 10.1101/gr.220202.117] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 01/04/2017] [Accepted: 08/07/2017] [Indexed: 12/22/2022]
Abstract
In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome.
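The k-mer-based idea described in the abstract can be illustrated with a small sketch that lists every k-mer overlapping a variant position for the reference and alternative alleles, then compares a per-k-mer score. This is an assumption-laden illustration rather than the Sasquatch implementation: the function names are hypothetical, and the `scores` dictionary is a stand-in for whatever footprint-derived value a real analysis would assign to each k-mer.

```python
def kmers_overlapping_variant(seq, pos, alt, k):
    """Return (ref_kmers, alt_kmers): all length-k windows of `seq` that
    cover the 0-based position `pos`, before and after substituting `alt`."""
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    # Window starts range from the leftmost window still covering `pos`
    # to the rightmost one that fits inside the sequence.
    starts = range(max(0, pos - k + 1), min(pos, len(seq) - k) + 1)
    ref_kmers = [seq[s:s + k] for s in starts]
    alt_kmers = [alt_seq[s:s + k] for s in starts]
    return ref_kmers, alt_kmers


def score_change(ref_kmers, alt_kmers, scores):
    """Difference in summed k-mer scores (alt minus ref).
    `scores` is a hypothetical mapping from k-mer to a footprint-derived value;
    k-mers missing from it contribute 0."""
    ref_total = sum(scores.get(km, 0.0) for km in ref_kmers)
    alt_total = sum(scores.get(km, 0.0) for km in alt_kmers)
    return alt_total - ref_total


# Toy example with made-up scores: a T>A substitution at position 3.
ref, alt = kmers_overlapping_variant("ACGTACG", 3, "A", 3)
print(ref, alt, score_change(ref, alt, {"CGT": 1.0, "GTA": 2.0}))
```

Because the comparison runs over k-mers rather than known motifs, it needs no prior list of transcription factor binding sites, which matches the motif-independent approach the abstract describes.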
Affiliation(s)
- Ron Schwessinger
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Maria C Suciu
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Simon J McGowan
- Computational Biology Research Group, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Jelena Telenius
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Stephen Taylor
- Computational Biology Research Group, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Doug R Higgs
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
- Jim R Hughes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Oxford OX3 9DS, United Kingdom
|
13
|
Correcting nucleotide-specific biases in high-throughput sequencing data. BMC Bioinformatics 2017; 18:357. [PMID: 28764645 PMCID: PMC5540620 DOI: 10.1186/s12859-017-1766-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 02/09/2017] [Accepted: 07/19/2017] [Indexed: 01/07/2023] Open
Abstract
Background: High-throughput sequence (HTS) data exhibit position-specific nucleotide biases that obscure the intended signal and reduce the effectiveness of these data for downstream analyses. These biases are particularly evident in HTS assays for identifying regulatory regions in DNA (DNase-seq, ChIP-seq, FAIRE-seq, ATAC-seq). Biases may result from many experiment-specific factors, including selectivity of DNA restriction enzymes and fragmentation method, as well as sequencing technology-specific factors, such as choice of adapters/primers and sample amplification methods. Results: We present a novel method to detect and correct position-specific nucleotide biases in HTS short read data. Our method calculates read-specific weights based on aligned reads to correct the over- or underrepresentation of position-specific nucleotide subsequences, both within and adjacent to the aligned read, relative to a baseline calculated in assay-specific enriched regions. Using HTS data from a variety of ChIP-seq, DNase-seq, FAIRE-seq, and ATAC-seq experiments, we show that our weight-adjusted reads reduce the position-specific nucleotide imbalance across reads and improve the utility of these data for downstream analyses, including identification and characterization of open chromatin peaks and transcription-factor binding sites. Conclusions: A general-purpose method to characterize and correct position-specific nucleotide sequence biases fills the need to recognize and deal with, in a systematic manner, binding-site preference for the growing number of HTS-based epigenetic assays. As the breadth and impact of these biases are better understood, the availability of a standard toolkit to correct them will be important. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1766-x) contains supplementary material, which is available to authorized users.
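The idea of read-specific weights can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published method: it looks only at one k-mer per read (for example, the k-mer at the read's 5' end), and the function names and the ratio-of-frequencies weighting are hypothetical choices. Each read is weighted by how frequent its k-mer is in a baseline set relative to how frequent it is among the observed reads.

```python
from collections import Counter


def kmer_frequencies(kmers):
    """Convert a list of k-mers into relative frequencies."""
    counts = Counter(kmers)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def read_weights(read_start_kmers, baseline_kmers):
    """Weight each read by baseline frequency / observed frequency of its k-mer.

    Over-represented k-mers get weights below 1, under-represented k-mers get
    weights above 1. K-mers absent from the baseline receive weight 0 in this
    sketch, which is an arbitrary simplification.
    """
    observed = kmer_frequencies(read_start_kmers)
    baseline = kmer_frequencies(baseline_kmers)
    return [baseline.get(k, 0.0) / observed[k] for k in read_start_kmers]


# Illustrative input: "AC" is over-represented at read starts relative to baseline.
weights = read_weights(["AC", "AC", "AC", "GT"], ["AC", "GT"])
print(weights)
```

In this toy input the three "AC" reads are down-weighted and the single "GT" read is up-weighted, so the weighted k-mer composition matches the baseline.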
|
14
|
Chen X, Yu B, Carriero N, Silva C, Bonneau R. Mocap: large-scale inference of transcription factor binding sites from chromatin accessibility. Nucleic Acids Res 2017; 45:4315-4329. [PMID: 28334916 PMCID: PMC5416775 DOI: 10.1093/nar/gkx174] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 06/20/2016] [Revised: 02/28/2017] [Accepted: 03/06/2017] [Indexed: 12/21/2022] Open
Abstract
Differential binding of transcription factors (TFs) at cis-regulatory loci drives the differentiation and function of diverse cellular lineages. Understanding the regulatory interactions that underlie cell fate decisions requires characterizing TF binding sites (TFBS) across multiple cell types and conditions. Techniques such as ChIP-seq can reveal genome-wide patterns of TF binding, but typically require laborious and costly experiments for each TF-cell-type (TFCT) condition of interest. Chromosomal accessibility assays can connect accessible chromatin in one cell type to many TFs through sequence motif mapping. Such methods, however, rarely take into account that the genomic context preferred by each factor differs from TF to TF, and from cell type to cell type. To address the differences in TF behaviors, we developed Mocap, a method that integrates chromatin accessibility, motif scores, TF footprints, CpG/GC content, evolutionary conservation and other factors in an ensemble of TFCT-specific classifiers. We show that integration of genomic features, such as CpG islands, improves TFBS prediction in some TFCTs. Further, we describe a method for mapping new TFCTs, for which no ChIP-seq data exist, onto our ensemble of classifiers and show that our cross-sample TFBS prediction method outperforms several previously described methods.
Affiliation(s)
- Xi Chen
- Department of Biology, New York University, New York, NY 10003, USA
- Bowen Yu
- Department of Computer Science, New York University, New York, NY 10003, USA
- Nicholas Carriero
- Center for Computational Biology, Flatiron Foundation, Simons Foundation, New York, NY 10010, USA
- Claudio Silva
- Department of Computer Science, New York University, New York, NY 10003, USA
- Richard Bonneau
- Department of Biology, New York University, New York, NY 10003, USA
- Department of Computer Science, New York University, New York, NY 10003, USA
- Center for Computational Biology, Flatiron Foundation, Simons Foundation, New York, NY 10010, USA
|
15
|
Chaitankar V, Karakülah G, Ratnapriya R, Giuste FO, Brooks MJ, Swaroop A. Next generation sequencing technology and genomewide data analysis: Perspectives for retinal research. Prog Retin Eye Res 2016; 55:1-31. [PMID: 27297499 DOI: 10.1016/j.preteyeres.2016.06.001] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/15/2016] [Revised: 06/06/2016] [Accepted: 06/08/2016] [Indexed: 02/08/2023]
Abstract
The advent of high throughput next generation sequencing (NGS) has accelerated the pace of discovery of disease-associated genetic variants and genomewide profiling of expressed sequences and epigenetic marks, thereby permitting systems-based analyses of ocular development and disease. Rapid evolution of NGS and associated methodologies presents significant challenges in acquisition, management, and analysis of large data sets and for extracting biologically or clinically relevant information. Here we illustrate the basic design of commonly used NGS-based methods, specifically whole exome sequencing, transcriptome, and epigenome profiling, and provide recommendations for data analyses. We briefly discuss systems biology approaches for integrating multiple data sets to elucidate gene regulatory or disease networks. While we provide examples from the retina, the NGS guidelines reviewed here are applicable to other tissues/cell types as well.
Affiliation(s)
- Vijender Chaitankar
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
- Gökhan Karakülah
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
- Rinki Ratnapriya
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
- Felipe O Giuste
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
- Matthew J Brooks
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
- Anand Swaroop
- Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD, 20892-0610, USA
|