351
Tuncbag N, Gosline SJC, Kedaigle A, Soltis AR, Gitter A, Fraenkel E. Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package. PLoS Comput Biol 2016; 12:e1004879. [PMID: 27096930 PMCID: PMC4838263 DOI: 10.1371/journal.pcbi.1004879] [Citation(s) in RCA: 91] [Impact Index Per Article: 11.4] [Received: 07/17/2015] [Accepted: 03/23/2016] [Indexed: 02/07/2023] Open
Abstract
High-throughput, ‘omic’ methods provide sensitive measures of biological responses to perturbations. However, inherent biases in high-throughput assays make it difficult to interpret experiments in which more than one type of data is collected. In this work, we introduce Omics Integrator, a software package that takes a variety of ‘omic’ data as input and identifies putative underlying molecular pathways. The approach applies advanced network optimization algorithms to a network of thousands of molecular interactions to find high-confidence, interpretable subnetworks that best explain the data. These subnetworks connect changes observed in gene expression, protein abundance or other global assays to proteins that may not have been measured in the screens due to inherent bias or noise in measurement. This approach reveals unannotated molecular pathways that would not be detectable by searching pathway databases. Omics Integrator also provides an elegant framework to incorporate not only positive data, but also negative evidence. Incorporating negative evidence allows Omics Integrator to avoid unexpressed genes and avoid being biased toward highly-studied hub proteins, except when they are strongly implicated by the data. The software is comprised of two individual tools, Garnet and Forest, that can be run together or independently to allow a user to perform advanced integration of multiple types of high-throughput data as well as create condition-specific subnetworks of protein interactions that best connect the observed changes in various datasets. It is available at http://fraenkel.mit.edu/omicsintegrator and on GitHub at https://github.com/fraenkel-lab/OmicsIntegrator.
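The way negative evidence can counteract a bias toward hub proteins can be illustrated with a small prize-and-cost calculation. The sketch below is only an illustration under stated assumptions: the abstract does not specify the objective, so the degree penalty, the parameter names `beta` and `mu`, and both function names are hypothetical and are not taken from the Omics Integrator code.

```python
def penalized_prizes(prizes, edges, beta=1.0, mu=0.1):
    """Scale positive evidence by beta and subtract a penalty of mu per interaction.

    prizes: dict mapping node -> positive evidence (e.g. scaled fold change)
    edges:  list of (node_u, node_v, cost) tuples from the interaction network
    Both parameters are assumptions made for this illustration.
    """
    degree = {}
    for u, v, _cost in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    nodes = set(prizes) | set(degree)
    # Highly connected nodes without data support end up with negative prizes.
    return {n: beta * prizes.get(n, 0.0) - mu * degree.get(n, 0) for n in nodes}


def subnetwork_score(selected_edges, node_prizes):
    """Score a candidate subnetwork: collected node prizes minus paid edge costs."""
    nodes = {n for u, v, _cost in selected_edges for n in (u, v)}
    return sum(node_prizes[n] for n in nodes) - sum(c for _u, _v, c in selected_edges)


if __name__ == "__main__":
    prizes = {"A": 2.0, "B": 1.0}
    edges = [("A", "H", 0.5), ("H", "B", 0.5), ("H", "C", 0.5)]
    node_prizes = penalized_prizes(prizes, edges)
    print(node_prizes["H"])  # the unmeasured hub H carries a negative prize
    print(subnetwork_score(edges[:2], node_prizes))
```

In this toy network the hub H is included only because connecting A and B through it still yields a positive score, which mirrors the statement that hubs are avoided unless the data strongly implicate them.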
Affiliation(s)
- Nurcan Tuncbag
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Sara J. C. Gosline
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Amanda Kedaigle
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anthony R. Soltis
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anthony Gitter
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Ernest Fraenkel
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
352
El-Shamayleh Y, Ni AM, Horwitz GD. Strategies for targeting primate neural circuits with viral vectors. J Neurophysiol 2016; 116:122-34. [PMID: 27052579 PMCID: PMC4961743 DOI: 10.1152/jn.00087.2016] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 02/01/2016] [Accepted: 04/05/2016] [Indexed: 11/22/2022] Open
Abstract
Understanding how the brain works requires understanding how different types of neurons contribute to circuit function and organism behavior. Progress on this front has been accelerated by optogenetics and chemogenetics, which provide an unprecedented level of control over distinct neuronal types in small animals. In primates, however, targeting specific types of neurons with these tools remains challenging. In this review, we discuss existing and emerging strategies for directing genetic manipulations to targeted neurons in the adult primate central nervous system. We review the literature on viral vectors for gene delivery to neurons, focusing on adeno-associated viral vectors and lentiviral vectors, their tropism for different cell types, and prospects for new variants with improved efficacy and selectivity. We discuss two projection targeting approaches for probing neural circuits: anterograde projection targeting and retrograde transport of viral vectors. We conclude with an analysis of cell type-specific promoters and other nucleotide sequences that can be used in viral vectors to target neuronal types at the transcriptional level.
Affiliation(s)
- Yasmine El-Shamayleh
  - Department of Physiology and Biophysics and Washington National Primate Research Center, University of Washington, Seattle, Washington
- Amy M Ni
  - Department of Neuroscience and Center for the Neural Basis of Cognition, University of Pittsburgh, Pittsburgh, Pennsylvania
- Gregory D Horwitz
  - Department of Physiology and Biophysics and Washington National Primate Research Center, University of Washington, Seattle, Washington
353
Moison C, Assemat F, Daunay A, Arimondo PB, Tost J. DNA Methylation Analysis of ChIP Products at Single Nucleotide Resolution by Pyrosequencing®. Methods Mol Biol 2016; 1315:315-33. [PMID: 26103908 DOI: 10.1007/978-1-4939-2715-9_22] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Interaction and co-occurrence of protein and DNA-based epigenetic modifications have become a topic of interest for many fundamental and biomedical questions. We describe within this chapter a protocol that combines two techniques in order to determine the methylation status of the DNA specifically associated with a protein of interest. First, DNA that directly interacts with the selected protein (such as a specific histone modification, a transcription factor, or any other DNA-associated protein) is purified by standard chromatin immunoprecipitation (ChIP). Second, the level of DNA methylation of this immunoprecipitated DNA is measured by bisulfite conversion and Pyrosequencing, a quantitative sequencing-by-synthesis method. This procedure allows determining the methylation status of genomic DNA associated to a specific protein at single nucleotide resolution.
Affiliation(s)
- Céline Moison
  - Unité de Service et de Recherche CNRS-Pierre Fabre n°3388, Epigenetic Targeting of Cancer (ETaC), Toulouse, France
354
Epigenetic Profiling of H3K4Me3 Reveals Herbal Medicine Jinfukang-Induced Epigenetic Alteration Is Involved in Anti-Lung Cancer Activity. Evid Based Complement Alternat Med 2016; 2016:7276161. [PMID: 27087825 PMCID: PMC4818803 DOI: 10.1155/2016/7276161] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/02/2015] [Revised: 02/03/2016] [Accepted: 02/07/2016] [Indexed: 11/17/2022]
Abstract
Traditional Chinese medicine Jinfukang (JFK) has been clinically used for treating lung cancer. To examine whether epigenetic modifications are involved in its anticancer activity, we performed a global profiling analysis of H3K4Me3, an epigenomic marker associated with active gene expression, in JFK-treated lung cancer cells. We identified 11,670 genes with significantly altered status of H3K4Me3 modification following JFK treatment (P < 0.05). Gene Ontology analysis indicates that these genes are involved in tumor-related pathways, including pathway in cancer, basal cell carcinoma, apoptosis, induction of programmed cell death, regulation of transcription (DNA-templated), intracellular signal transduction, and regulation of peptidase activity. In particular, we found that the levels of H3K4Me3 at the promoters of SUSD2, CCND2, BCL2A1, and TMEM158 are significantly altered in A549, NCI-H1975, NCI-H1650, and NCI-H2228 cells, when treated with JFK. Collectively, these findings provide the first evidence that the anticancer activity of JFK involves modulation of histone modification at many cancer-related gene loci.
355
Vincent AT, Derome N, Boyle B, Culley AI, Charette SJ. Next-generation sequencing (NGS) in the microbiological world: How to make the most of your money. J Microbiol Methods 2016; 138:60-71. [PMID: 26995332 DOI: 10.1016/j.mimet.2016.02.016] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 09/30/2015] [Revised: 01/26/2016] [Accepted: 02/24/2016] [Indexed: 12/16/2022]
Abstract
The Sanger sequencing method produces relatively long DNA sequences of unmatched quality and has long been considered the gold standard for sequencing DNA. Many improvements of the Sanger method that culminated with fluorescent dyes coupled with automated capillary electrophoresis enabled the sequencing of the first genomes. Nevertheless, using this technology to sequence whole genomes was costly, laborious and time consuming even for genomes that are relatively small in size. A major technological advance was the introduction of next-generation sequencing (NGS) pioneered by 454 Life Sciences in the early part of the 21st century. NGS allowed scientists to sequence thousands to millions of DNA molecules in a single machine run. Since then, new NGS technologies have emerged and existing NGS platforms have been improved, enabling the production of genome sequences at an unprecedented rate as well as broadening the spectrum of NGS applications. The current affordability of generating genomic information, especially with microbial samples, has resulted in a false sense of simplicity that belies the fact that many researchers still consider these technologies a black box. In this review, our objective is to identify and discuss four steps that we consider crucial to the success of any NGS-related project. These steps are: (1) the definition of the research objectives beyond sequencing and appropriate experimental planning, (2) library preparation, (3) sequencing and (4) data analysis. The goal of this review is to give an overview of the process, from sample to analysis, and discuss how to optimize your resources to achieve the most from your NGS-based research. Regardless of the evolution and improvement of the sequencing technologies, these four steps will remain relevant.
Affiliation(s)
- Antony T Vincent
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada; Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Quebec City, QC G1V 0A6, Canada; Centre de recherche de l'Institut universitaire de cardiologie et de pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
- Nicolas Derome
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada; Département de biologie, Faculté des sciences et de génie, Université Laval, Quebec City G1V 0A6, Canada
- Brian Boyle
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada
- Alexander I Culley
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada; Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Quebec City, QC G1V 0A6, Canada; Groupe de Recherche en Écologie Buccale (GREB), Faculté de médecine dentaire, Université Laval, Quebec City, QC G1V 0A6, Canada
- Steve J Charette
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada; Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, Quebec City, QC G1V 0A6, Canada; Centre de recherche de l'Institut universitaire de cardiologie et de pneumologie de Québec, Quebec City, QC G1V 4G5, Canada
356
Yu H, Huang T. Molecular Mechanisms of Floral Boundary Formation in Arabidopsis. Int J Mol Sci 2016; 17:317. [PMID: 26950117 PMCID: PMC4813180 DOI: 10.3390/ijms17030317] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/06/2016] [Revised: 02/21/2016] [Accepted: 02/23/2016] [Indexed: 01/03/2023] Open
Abstract
Boundary formation is a crucial developmental process in plant organogenesis. Boundaries separate cells with distinct identities and act as organizing centers to control the development of adjacent organs. In flower development, initiation of floral primordia requires the formation of the meristem-to-organ (M-O) boundaries and floral organ development depends on the establishment of organ-to-organ (O-O) boundaries. Studies in this field have revealed a suite of genes and regulatory pathways controlling floral boundary formation. Many of these genes are transcription factors that interact with phytohormone pathways. This review will focus on the functions and interactions of the genes that play important roles in the floral boundaries and discuss the molecular mechanisms that integrate these regulatory pathways to control the floral boundary formation.
Affiliation(s)
- Hongyang Yu
  - College of Life Sciences and Oceanography, Shenzhen University, 3688 Nanhai Ave., Shenzhen 518060, China.
  - College of Optoelectronic Engineering, Shenzhen University, 3688 Nanhai Ave., Shenzhen 518060, China.
- Tengbo Huang
  - College of Life Sciences and Oceanography, Shenzhen University, 3688 Nanhai Ave., Shenzhen 518060, China.
357
MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data. Comput Biol Chem 2016; 63:62-72. [PMID: 26971251 DOI: 10.1016/j.compbiolchem.2016.01.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/14/2016] [Accepted: 01/25/2016] [Indexed: 11/21/2022]
Abstract
BACKGROUND: As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. RESULTS: Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. CONCLUSIONS: By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions.
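The idea of examining every k-mer in ChIP-Seq peak sequences can be sketched in a few lines. The example below is a simplified illustration, not the MOCCS algorithm or its scoring: it assumes the input is a list of equal-purpose sequence windows centered on peak summits, and it uses the mean distance of each k-mer from the window center as a stand-in statistic. The function names are hypothetical.

```python
from collections import defaultdict


def kmer_offsets(sequences, k):
    """Map each k-mer to the distances of its occurrences from the window center.

    sequences: DNA strings assumed to be centered on ChIP-Seq peak summits.
    """
    offsets = defaultdict(list)
    for seq in sequences:
        seq = seq.upper()
        center = (len(seq) - 1) / 2
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                kmer_center = i + (k - 1) / 2
                offsets[kmer].append(abs(kmer_center - center))
    return offsets


def mean_distance(offsets, min_count=1):
    """Average distance from the center per k-mer; smaller means more central."""
    return {
        kmer: sum(dists) / len(dists)
        for kmer, dists in offsets.items()
        if len(dists) >= min_count
    }


if __name__ == "__main__":
    peaks = ["AACGTAA", "TTCGTTT"]
    ranked = sorted(mean_distance(kmer_offsets(peaks, 3)).items(), key=lambda kv: kv[1])
    print(ranked[0])  # the k-mer that sits closest to the window centers
```

A k-mer that is bound directly would be expected to cluster near the window center, which is why a centrality summary is used here as the toy statistic.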
358
Günther T, Theiss JM, Fischer N, Grundhoff A. Investigation of Viral and Host Chromatin by ChIP-PCR or ChIP-Seq Analysis. Curr Protoc Microbiol 2016; 40:1E.10.1-1E.10.21. [PMID: 26855283 DOI: 10.1002/9780471729259.mc01e10s40] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]
Abstract
Complex regulation of viral transcription patterns and DNA replication levels is a feature of many DNA viruses. This is especially true for those viruses which establish latent or persistent infections (e.g., herpesviruses, papillomaviruses, polyomaviruses, or adenovirus), as long-term persistence often requires adaptation of gene expression programs and/or replication levels to the cellular milieu. A key factor in the control of such processes is the establishment of a specific chromatin state on promoters or replication origins, which in turn will determine whether or not the underlying DNA is accessible for other factors that mediate downstream processes. Chromatin immunoprecipitation (ChIP) is a powerful technique to investigate viral chromatin, in particular to study binding patterns of modified histones, transcription factors or other DNA-/chromatin-binding proteins that regulate the viral lifecycle. Here, we provide protocols that are suitable for performing ChIP-PCR and ChIP-Seq studies on chromatin of large and small viral genomes.
Affiliation(s)
- Thomas Günther
  - Heinrich-Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany
- Juliane M Theiss
  - Heinrich-Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany; Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Nicole Fischer
  - Institute for Medical Microbiology, Virology and Hygiene, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Adam Grundhoff
  - Heinrich-Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany
359
Sos BC, Fung HL, Gao DR, Osothprarop TF, Kia A, He MM, Zhang K. Characterization of chromatin accessibility with a transposome hypersensitive sites sequencing (THS-seq) assay. Genome Biol 2016; 17:20. [PMID: 26846207 PMCID: PMC4743176 DOI: 10.1186/s13059-016-0882-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 09/07/2015] [Accepted: 01/18/2016] [Indexed: 12/22/2022] Open
Abstract
Chromatin accessibility captures in vivo protein-chromosome binding status, and is considered an informative proxy for protein-DNA interactions. DNase I and Tn5 transposase assays require thousands to millions of fresh cells for comprehensive chromatin mapping. Applying Tn5 tagmentation to hundreds of cells results in sparse chromatin maps. We present a transposome hypersensitive sites sequencing assay for highly sensitive characterization of chromatin accessibility. Linear amplification of accessible DNA ends with in vitro transcription, coupled with an engineered Tn5 super-mutant, demonstrates improved sensitivity on limited input materials, and accessibility of small regions near distal enhancers, compared with ATAC-seq.
Affiliation(s)
- Brandon Chin Sos
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA; Biomedical Sciences Graduate Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA
- Ho-Lim Fung
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA
- Derek Rui Gao
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA
- Amirali Kia
  - Illumina Inc, 5200 Illumina Way, San Diego, CA, USA
- Molly Min He
  - Illumina Inc, 5200 Illumina Way, San Diego, CA, USA
- Kun Zhang
  - Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA; Biomedical Sciences Graduate Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, USA
360
Yan H, Tian S, Slager SL, Sun Z, Ordog T. Genome-Wide Epigenetic Studies in Human Disease: A Primer on -Omic Technologies. Am J Epidemiol 2016; 183:96-109. [PMID: 26721890 DOI: 10.1093/aje/kwv187] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/04/2015] [Accepted: 07/09/2015] [Indexed: 12/12/2022] Open
Abstract
Epigenetic information encoded in covalent modifications of DNA and histone proteins regulates fundamental biological processes through the action of chromatin regulators, transcription factors, and noncoding RNA species. Epigenetic plasticity enables an organism to respond to developmental and environmental signals without genetic changes. However, aberrant epigenetic control plays a key role in pathogenesis of disease. Normal epigenetic states could be disrupted by detrimental mutations and expression alteration of chromatin regulators or by environmental factors. In this primer, we briefly review the epigenetic basis of human disease and discuss how recent discoveries in this field could be translated into clinical diagnosis, prevention, and treatment. We introduce platforms for mapping genome-wide chromatin accessibility, nucleosome occupancy, DNA-binding proteins, and DNA methylation, primarily focusing on the integration of DNA methylation and chromatin immunoprecipitation-sequencing technologies into disease association studies. We highlight practical considerations in applying high-throughput epigenetic assays and formulating analytical strategies. Finally, we summarize current challenges in sample acquisition, experimental procedures, data analysis, and interpretation and make recommendations on further refinement in these areas. Incorporating epigenomic testing into the clinical research arsenal will greatly facilitate our understanding of the epigenetic basis of disease and help identify novel therapeutic targets.
361
Kumar S, Bucher P. Predicting transcription factor site occupancy using DNA sequence intrinsic and cell-type specific chromatin features. BMC Bioinformatics 2016; 17 Suppl 1:4. [PMID: 26818008 PMCID: PMC4895346 DOI: 10.1186/s12859-015-0846-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022] Open
Abstract
Background: Understanding the mechanisms by which transcription factors (TF) are recruited to their physiological target sites is crucial for understanding gene regulation. DNA sequence intrinsic features such as predicted binding affinity are often not very effective in predicting in vivo site occupancy and in any case could not explain cell-type specific binding events. Recent reports show that chromatin accessibility, nucleosome occupancy and specific histone post-translational modifications greatly influence TF site occupancy in vivo. In this work, we use machine-learning methods to build predictive models and assess the relative importance of different sequence-intrinsic and chromatin features in the TF-to-target-site recruitment process. Methods: Our study primarily relies on recent data published by the ENCODE consortium. Five dissimilar TFs assayed in multiple cell-types were selected as examples: CTCF, JunD, REST, GABP and USF2. We used two types of candidate target sites: (a) predicted sites obtained by scanning the whole genome with a position weight matrix, and (b) cell-type specific peak lists provided by ENCODE. Quantitative in vivo occupancy levels in different cell-types were based on ChIP-seq data for the corresponding TFs. In parallel, we computed a number of associated sequence-intrinsic and experimental features (histone modification, DNase I hypersensitivity, etc.) for each site. Machine learning algorithms were then used in a binary classification and regression framework to predict site occupancy and binding strength, for the purpose of assessing the relative importance of different contextual features. Results: We observed striking differences in the feature importance rankings between the five factors tested. PWM-scores were amongst the most important features only for CTCF and REST but of little value for JunD and USF2. Chromatin accessibility and active histone marks are potent predictors for all factors except REST. Structural DNA parameters, repressive and gene body associated histone marks are generally of little or no predictive value. Conclusions: We define a general and extensible computational framework for analyzing the importance of various DNA-intrinsic and chromatin-associated features in determining cell-type specific TF binding to target sites. The application of our methodology to ENCODE data has led to new insights on transcription regulatory processes and may serve as example for future studies encompassing even larger datasets.
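Scanning a sequence with a position weight matrix, the step used above to obtain predicted candidate sites, can be sketched as follows. This is a minimal illustration under stated assumptions: a uniform background of 0.25 per base, a pseudocount of 1, forward-strand scanning only, and hypothetical function names. It is not the authors' pipeline.

```python
import math


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2 odds scores.

    counts: list of dicts (one per motif position) mapping base -> count.
    The uniform background and the pseudocount are assumptions of this sketch.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Return (start, score) for every forward-strand window of motif width."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if all(b in "ACGT" for b in window):
            hits.append((i, sum(pwm[j][b] for j, b in enumerate(window))))
    return hits


def best_site(sequence, pwm):
    """Highest-scoring window; raises ValueError if no window can be scored."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


if __name__ == "__main__":
    toy_counts = [{"A": 10}, {"C": 10}, {"G": 10}]
    pwm = log_odds_pwm(toy_counts)
    print(best_site("TTACGTT", pwm))
```

In a feature-importance analysis of the kind described in the abstract, the score of each candidate site would be one input feature alongside chromatin-derived features.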
Affiliation(s)
- Sunil Kumar
  - Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, EPFL, Station 15, Lausanne, CH-1015, Switzerland; Swiss Institute of Bioinformatics (SIB), EPFL, Station 15, Lausanne, CH-1015, Switzerland
- Philipp Bucher
  - Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, EPFL, Station 15, Lausanne, CH-1015, Switzerland; Swiss Institute of Bioinformatics (SIB), EPFL, Station 15, Lausanne, CH-1015, Switzerland
362
Methods to Study Long Noncoding RNA Biology in Cancer. Adv Exp Med Biol 2016; 927:69-107. [PMID: 27376732 DOI: 10.1007/978-981-10-1498-7_3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
Thousands of long noncoding RNAs (lncRNAs) have been discovered in recent years. The functions of lncRNAs range broadly from regulating chromatin structure and gene expression in the nucleus to controlling messenger RNA (mRNA) processing, mRNA posttranscriptional regulation, cellular signaling, and protein activity in the cytoplasm. Experimental and computational techniques have been developed to characterize lncRNAs in high-throughput scale, to study the lncRNA function in vitro and in vivo, to map lncRNA binding sites on the genome, and to capture lncRNA-protein interactions with the identification of lncRNA-binding partners, binding sites, and interaction determinants. In this chapter, we will discuss these technologies and their applications in decoding the functions of lncRNAs. Understanding these techniques including their advantages and disadvantages and developing them in the future will be essential to elaborate the roles of lncRNAs in cancer and other diseases.
363
Rastegar S, Strähle U. The Zebrafish as Model for Deciphering the Regulatory Architecture of Vertebrate Genomes. Genetics, Genomics and Fish Phenomics 2016; 95:195-216. [DOI: 10.1016/bs.adgen.2016.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/12/2022]
364
Probing DNA interactions with proteins using a single-molecule toolbox: inside the cell, in a test tube and in a computer. Biochem Soc Trans 2016; 43:139-45. [PMID: 26020443 DOI: 10.1042/bst20140253] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022]
Abstract
DNA-interacting proteins have roles in multiple processes, many operating as molecular machines which undergo dynamic meta-stable transitions to bring about their biological function. To fully understand this molecular heterogeneity, DNA and the proteins that bind to it must ideally be interrogated at a single molecule level in their native in vivo environments, in a time-resolved manner, fast enough to sample the molecular transitions across the free-energy landscape. Progress has been made over the past decade in utilizing cutting-edge tools of the physical sciences to address challenging biological questions concerning the function and modes of action of several different proteins which bind to DNA. These physiologically relevant assays are technically challenging but can be complemented by powerful and often more tractable in vitro experiments which confer advantages of the chemical environment with enhanced detection signal-to-noise of molecular signatures and transition events. In the present paper, we discuss a range of techniques we have developed to monitor DNA-protein interactions in vivo, in vitro and in silico. These include bespoke single-molecule fluorescence microscopy techniques to elucidate the architecture and dynamics of the bacterial replisome and the structural maintenance of bacterial chromosomes, as well as new computational tools to extract single-molecule molecular signatures from live cells to monitor stoichiometry, spatial localization and mobility in living cells. We also discuss recent developments from our laboratory made in vitro, complementing these in vivo studies, which combine optical and magnetic tweezers to manipulate and image single molecules of DNA, with and without bound protein, in a new super-resolution fluorescence microscope.
365
Kostka D, Friedrich T, Holloway AK, Pollard KS. motifDiverge: a model for assessing the statistical significance of gene regulatory motif divergence between two DNA sequences. Statistics and Its Interface 2015; 8:463-476. [PMID: 26709360 PMCID: PMC4689439 DOI: 10.4310/sii.2015.v8.n4.a6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
Next-generation sequencing technology enables the identification of thousands of gene regulatory sequences in many cell types and organisms. We consider the problem of testing if two such sequences differ in their number of binding site motifs for a given transcription factor (TF) protein. Binding site motifs impart regulatory function by providing TFs the opportunity to bind to genomic elements and thereby affect the expression of nearby genes. Evolutionary changes to such functional DNA are hypothesized to be major contributors to phenotypic diversity within and between species; but despite the importance of TF motifs for gene expression, no method exists to test for motif loss or gain. Assuming that motif counts are Binomially distributed, and allowing for dependencies between motif instances in evolutionarily related sequences, we derive the probability mass function of the difference in motif counts between two nucleotide sequences. We provide a method to numerically estimate this distribution from genomic data and show through simulations that our estimator is accurate. Finally, we introduce the R package motifDiverge that implements our methodology and illustrate its application to gene regulatory enhancers identified by a mouse developmental time course experiment. While this study was motivated by analysis of regulatory motifs, our results can be applied to any problem involving two correlated Bernoulli trials.
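The distribution of a difference in motif counts can be made concrete with a simplified case. The sketch below assumes the two counts are independent Binomial variables, which is a deliberate simplification: the abstract states that the published model allows for dependencies between evolutionarily related sequences, and that case is not handled here. The function names and the absolute-difference tail rule are illustrative choices, not the motifDiverge API.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero outside the support."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def diff_pmf(n1, p1, n2, p2):
    """PMF of D = X - Y for independent X ~ Bin(n1, p1) and Y ~ Bin(n2, p2).

    Independence is an assumption of this sketch only.
    """
    pmf = {}
    for d in range(-n2, n1 + 1):
        # Sum over all y such that x = y + d is a valid count for X.
        pmf[d] = sum(
            binom_pmf(y + d, n1, p1) * binom_pmf(y, n2, p2)
            for y in range(n2 + 1)
        )
    return pmf


def two_sided_p(pmf, observed):
    """Probability of a difference at least as large in absolute value."""
    return min(1.0, sum(p for d, p in pmf.items() if abs(d) >= abs(observed)))


if __name__ == "__main__":
    # Hypothetical numbers: 200 candidate positions per sequence, 2% motif rate.
    pmf = diff_pmf(200, 0.02, 200, 0.02)
    print(two_sided_p(pmf, observed=6))
```

The absolute-difference rule is appropriate mainly when the two sequences have similar lengths and motif probabilities; for asymmetric cases a one-sided tail would be the more natural choice.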
Affiliation(s)
- Dennis Kostka
- Department of Developmental Biology, Department of Computational & Systems Biology, University of Pittsburgh School of Medicine, 530 45th Street, Pittsburgh, PA 15201, USA
- Tara Friedrich
- Gladstone Institutes, Integrative Program in Quantitative Biology, University of California, 1650 Owens Street, San Francisco, CA 94158, USA
- Alisha K. Holloway
- Gladstone Institutes, Division of Biostatistics, University of California, 1650 Owens Street, San Francisco, CA 94158, USA
- Katherine S. Pollard
- Gladstone Institutes, Institute for Human Genetics, Division of Biostatistics, University of California, 1650 Owens Street, San Francisco, CA 94158, USA
|
366
|
Arrigoni L, Richter AS, Betancourt E, Bruder K, Diehl S, Manke T, Bönisch U. Standardizing chromatin research: a simple and universal method for ChIP-seq. Nucleic Acids Res 2015; 44:e67. [PMID: 26704968 PMCID: PMC4838356 DOI: 10.1093/nar/gkv1495] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 04/30/2015] [Accepted: 12/09/2015] [Indexed: 01/18/2023] Open
Abstract
Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) is a key technique in chromatin research. Although heavily applied, existing ChIP-seq protocols are often highly fine-tuned workflows, optimized for specific experimental requirements. The initial steps of ChIP-seq, particularly chromatin shearing, are deemed to be exceedingly cell-type-specific, thus impeding any protocol standardization efforts. Here we demonstrate that harmonization of ChIP-seq workflows across cell types and conditions is possible when obtaining chromatin from properly isolated nuclei. We established an ultrasound-based nuclei extraction method (NEXSON: Nuclei EXtraction by SONication) that is highly effective across various organisms, cell types and cell numbers. The described method has the potential to replace complex cell-type-specific, but largely ineffective, nuclei isolation protocols. By including NEXSON in ChIP-seq workflows, we completely eliminate the need for extensive optimization and sample-dependent adjustments. Apart from this significant simplification, our approach also provides the basis for a fully standardized ChIP-seq and yields highly reproducible transcription factor and histone modification maps for a wide range of different cell types. Even small cell numbers (∼10 000 cells per ChIP) can be easily processed without application of modified chromatin or library preparation protocols.
Affiliation(s)
- Laura Arrigoni
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
- Andreas S Richter
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
- Emily Betancourt
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
- Kerstin Bruder
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
- Sarah Diehl
- Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, avenue du Swing 6, Belvaux, 4366, Luxembourg
- Thomas Manke
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
- Ulrike Bönisch
- Max Planck Institute of Immunobiology and Epigenetics, Stübeweg 51, Freiburg, 79108, Germany
|
367
|
Valensisi C, Liao JL, Andrus C, Battle SL, Hawkins RD. cChIP-seq: a robust small-scale method for investigation of histone modifications. BMC Genomics 2015; 16:1083. [PMID: 26692029 PMCID: PMC4687106 DOI: 10.1186/s12864-015-2285-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 08/22/2015] [Accepted: 12/10/2015] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: ChIP-seq is highly utilized for mapping histone modifications that are informative about gene regulation and genome annotations. For example, applying ChIP-seq to histone modifications such as H3K4me1 has facilitated generating epigenomic maps of putative enhancers. This powerful technology, however, is limited in its application by the large number of cells required. ChIP-seq involves extensive manipulation of sample material and multiple reactions with limited quality control at each step; therefore, scaling down the number of cells required has proven challenging. Recently, several methods have been proposed to overcome this limit, but most of these methods require extensive optimization to tailor the protocol to the specific antibody used or number of cells being profiled. RESULTS: Here we describe a robust, yet facile method, which we named carrier ChIP-seq (cChIP-seq), for use on limited cell amounts. cChIP-seq employs a DNA-free histone carrier in order to maintain the working ChIP reaction scale, removing the need to tailor reactions to specific amounts of cells or histone modifications to be assayed. We have applied our method to three different histone modifications, H3K4me3, H3K4me1 and H3K27me3 in the K562 cell line, and H3K4me1 in H1 hESCs. We successfully obtained epigenomic maps for these histone modifications starting with as few as 10,000 cells. We compared cChIP-seq data to data generated as part of the ENCODE project. ENCODE data are the reference standard in the field and have been generated starting from tens of millions of cells. Our results show that cChIP-seq successfully recapitulates bulk data. Furthermore, we showed that the differences observed between small-scale ChIP-seq data and ENCODE data are largely due to lab-to-lab variability rather than operating on a reduced scale. CONCLUSIONS: Data generated using cChIP-seq are equivalent to reference epigenomic maps from three orders of magnitude more cells. Our method offers a robust and straightforward approach to scale down ChIP-seq to as low as 10,000 cells. The underlying principle of our strategy makes it suitable for being applied to a vast range of chromatin modifications without requiring expensive optimization. Furthermore, our strategy of a DNA-free carrier can be adapted to most ChIP-seq protocols.
Affiliation(s)
- Cristina Valensisi
- Division of Medical Genetics, Department of Medicine, Department of Genome Sciences, Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- Jo Ling Liao
- Division of Medical Genetics, Department of Medicine, Department of Genome Sciences, Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- Colin Andrus
- Division of Medical Genetics, Department of Medicine, Department of Genome Sciences, Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- Stephanie L Battle
- Division of Medical Genetics, Department of Medicine, Department of Genome Sciences, Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- R David Hawkins
- Division of Medical Genetics, Department of Medicine, Department of Genome Sciences, Institute for Stem Cell and Regenerative Medicine, University of Washington School of Medicine, Seattle, WA, USA.
- Turku Centre for Biotechnology, Turku, Finland.
|
368
|
Shrestha A, Abd-Elfattah A, Freudenschuss B, Hinney B, Palmieri N, Ruttkowski B, Joachim A. Cystoisospora suis - A Model of Mammalian Cystoisosporosis. Front Vet Sci 2015; 2:68. [PMID: 26664994 PMCID: PMC4672278 DOI: 10.3389/fvets.2015.00068] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/23/2015] [Accepted: 11/17/2015] [Indexed: 11/13/2022] Open
Abstract
Cystoisospora suis is a coccidian species that typically affects suckling piglets. Infections occur by oral uptake of oocysts and are characterized by non-hemorrhagic transient diarrhea, resulting in poor weight gain. Apparently, primary immune responses to C. suis cannot readily be mounted by neonates, which contributes to the establishment and rapid development of the parasite, while in older pigs age-resistance prevents disease development. However, the presence of extraintestinal stages, although not unequivocally demonstrated, is suspected to enable parasite persistence together with the induction and maintenance of immune response in older pigs, which in turn may facilitate the transfer of C. suis-specific factors from sow to offspring. It is assumed that neonates are particularly prone to clinical disease because infections with C. suis interfere with the establishment of the gut microbiome. Clostridia have been especially inferred to profit from the altered intestinal environment during parasite infection. New tools, particularly in the area of genomics, might illustrate the interactions between C. suis and its host and pave the way for the development of new control methods not only for porcine cystoisosporosis but also for other mammalian Cystoisospora infections. The first reference genome for C. suis is under way and will be a fertile ground to discover new drugs and vaccines. At the same time, the establishment and refinement of an in vivo model and an in vitro culture system, supporting the complete life cycle of C. suis, will underpin the functional characterization of the parasite and shed light on its biology and control.
Affiliation(s)
- Aruna Shrestha
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Ahmed Abd-Elfattah
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Barbara Freudenschuss
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Barbara Hinney
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Nicola Palmieri
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Bärbel Ruttkowski
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
- Anja Joachim
- Department of Pathobiology, Institute of Parasitology, University of Veterinary Medicine Vienna, Vienna, Austria
|
369
|
Abstract
Nucleotide changes in gene regulatory elements can have a major effect on interindividual differences in drug response. For example, by reviewing all published pharmacogenomic genome-wide association studies, we show here that 96.4% of the associated single nucleotide polymorphisms reside in noncoding regions. We discuss how sequencing technologies are improving our ability to identify drug response-associated regulatory elements genome-wide and to annotate nucleotide variants within them. We highlight specific examples of how nucleotide changes in these elements can affect drug response and illustrate the techniques used to find them and functionally characterize them. Finally, we also discuss challenges in the field of drug-responsive regulatory elements that need to be considered in order to translate these findings into the clinic.
Affiliation(s)
- Marcelo R Luizon
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94158, USA
|
370
|
Kim K, Lee K, Bang H, Kim JY, Choi JK. Intersection of genetics and epigenetics in monozygotic twin genomes. Methods 2015; 102:50-6. [PMID: 26548893 DOI: 10.1016/j.ymeth.2015.10.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/21/2015] [Accepted: 10/18/2015] [Indexed: 02/01/2023] Open
Abstract
As a final function of various epigenetic mechanisms, chromatin regulation is a transcription control process that especially demonstrates active interaction with genetic elements. Thus, chromatin structure has become a principal focus in recent genomics research that strives to characterize regulatory functions of DNA variants related to diseases or other traits. Although researchers have been focusing on DNA methylation when studying monozygotic (MZ) twins, a great model in epigenetics research, interactions between genetics and epigenetics at the chromatin level are expected to be an imperative research trend in the future. In this review, we discuss how the genome, epigenome, and transcriptome of MZ twins can be studied in an integrative manner from this perspective.
Affiliation(s)
- Kwoneel Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Kibaick Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Hyoeun Bang
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Jeong Yeon Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Jung Kyoon Choi
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea.
|
371
|
Harmanci A, Rozowsky J, Gerstein M. MUSIC: identification of enriched regions in ChIP-Seq experiments using a mappability-corrected multiscale signal processing framework. Genome Biol 2015; 15:474. [PMID: 25292436 PMCID: PMC4234855 DOI: 10.1186/s13059-014-0474-3] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Received: 06/25/2014] [Indexed: 12/20/2022] Open
Abstract
We present MUSIC, a signal processing approach for identification of enriched regions in ChIP-Seq data, available at music.gersteinlab.org. MUSIC first filters the ChIP-Seq read-depth signal for systematic noise from non-uniform mappability, which fragments enriched regions. Then it performs a multiscale decomposition, using median filtering, identifying enriched regions at multiple length scales. This is useful given the wide range of scales probed in ChIP-Seq assays. MUSIC performs favorably in terms of accuracy and reproducibility compared with other methods. In particular, analysis of RNA polymerase II data reveals a clear distinction between the stalled and elongating forms of the polymerase.
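The idea of identifying enriched regions at several length scales with median filtering can be sketched as follows. This is a toy illustration under stated assumptions, not MUSIC's actual algorithm: it skips the mappability correction, uses a fixed threshold, and the names `median_filter`, `enriched_regions` and `multiscale_regions` are hypothetical.

```python
from statistics import median


def median_filter(signal, window):
    """Sliding median over `window` positions, truncated at the edges."""
    half = window // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def enriched_regions(signal, threshold):
    """Half-open (start, end) runs where the signal exceeds `threshold`."""
    regions, start = [], None
    for i, value in enumerate(signal):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(signal)))
    return regions


def multiscale_regions(signal, windows, threshold):
    """Call regions on the signal smoothed at each window size."""
    return {w: enriched_regions(median_filter(signal, w), threshold)
            for w in windows}


# A read-depth track with a one-position dip splitting an enriched region.
depth = [0, 0, 5, 5, 0, 5, 5, 0, 0, 0]
calls = multiscale_regions(depth, windows=[1, 3], threshold=1)
```

At window size 1 the dip fragments the region into two calls, while the window of 3 bridges it into one, which is the behaviour the abstract attributes to working at multiple scales.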
Affiliation(s)
- Arif Harmanci
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
|
372
|
Roy S, Thompson D. Evolution of regulatory networks in Candida glabrata: learning to live with the human host. FEMS Yeast Res 2015; 15:fov087. [PMID: 26449820 DOI: 10.1093/femsyr/fov087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 09/17/2015] [Indexed: 12/12/2022] Open
Abstract
The opportunistic human fungal pathogen Candida glabrata is second only to C. albicans as the cause of Candida infections and yet is more closely related to Saccharomyces cerevisiae. Recent advances in functional genomics technologies and computational approaches to decipher regulatory networks, and the comparison of these networks among these and other Ascomycete species, have revealed both unique and shared strategies in adaptation to a human commensal/opportunistic pathogen lifestyle and antifungal drug resistance in C. glabrata. Recently, several C. glabrata sister species in the Nakaseomyces clade representing both human-associated (commensal) and environmental isolates have had their genomes sequenced and analyzed. This has paved the way for comparative functional genomics studies to characterize the regulatory networks in these species to identify informative patterns of conservation and divergence linked to phenotypic evolution in the Nakaseomyces lineage.
Affiliation(s)
- Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin Madison, Madison, WI 53715, USA
- Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI 53715, USA
- Dawn Thompson
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
373
|
Cunha MLR, Meijers JCM, Middeldorp S. Introduction to the analysis of next generation sequencing data and its application to venous thromboembolism. Thromb Haemost 2015; 114:920-32. [PMID: 26446408 DOI: 10.1160/th15-05-0411] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 05/15/2015] [Accepted: 08/26/2015] [Indexed: 12/13/2022]
Abstract
Despite knowledge of various inherited risk factors associated with venous thromboembolism (VTE), no definite cause can be found in about 50% of patients. The application of data-driven searches such as GWAS has not been able to identify genetic variants with implications for clinical care, and unexplained heritability remains. In recent years, the development of several so-called next generation sequencing (NGS) platforms has offered the possibility of generating fast, inexpensive and accurate genomic information. However, so far their application to VTE has been very limited. Here we review basic concepts of NGS data analysis and explore the application of NGS technology to VTE. We provide both computational and biological viewpoints to discuss potentials and challenges of NGS-based studies.
Affiliation(s)
- Marisa L R Cunha
- Marisa L. R. Cunha, Department of Experimental Vascular Medicine, Academic Medical Center, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands, Tel.: +31 20 5662824, Fax: +31 20 6968833, E-mail:
|
374
|
A Glimpse to Background and Characteristics of Major Molecular Biological Networks. BioMed Research International 2015; 2015:540297. [PMID: 26491677 PMCID: PMC4605226 DOI: 10.1155/2015/540297] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/01/2015] [Revised: 07/22/2015] [Accepted: 08/18/2015] [Indexed: 12/11/2022]
Abstract
Recently, biology has become a data intensive science because of huge data sets produced by high throughput molecular biological experiments in diverse areas including the fields of genomics, transcriptomics, proteomics, and metabolomics. These huge datasets have paved the way for system-level analysis of the processes and subprocesses of the cell. For system-level understanding, initially the elements of a system are connected based on their mutual relations and a network is formed. Among omics researchers, construction and analysis of biological networks have become highly popular. In this review, we briefly discuss both the biological background and topological properties of major types of omics networks to facilitate a comprehensive understanding and to conceptualize the foundation of network biology.
|
375
|
Dozmorov MG, Adrianto I, Giles CB, Glass E, Glenn SB, Montgomery C, Sivils KL, Olson LE, Iwayama T, Freeman WM, Lessard CJ, Wren JD. Detrimental effects of duplicate reads and low complexity regions on RNA- and ChIP-seq data. BMC Bioinformatics 2015; 16 Suppl 13:S10. [PMID: 26423047 PMCID: PMC4597324 DOI: 10.1186/1471-2105-16-s13-s10] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Adapter trimming and removal of duplicate reads are common practices in next-generation sequencing pipelines. Sequencing reads ambiguously mapped to repetitive and low complexity regions can also be problematic for accurate assessment of the biological signal, yet their impact on sequencing data has not received much attention. We investigate how trimming the adapters, removing duplicates, and filtering out reads overlapping low complexity regions influence the significance of biological signal in RNA- and ChIP-seq experiments. METHODS: We assessed the effect of data processing steps on the alignment statistics and the functional enrichment analysis results of RNA- and ChIP-seq data. We compared differentially processed RNA-seq data with matching microarray data on the same patient samples to determine whether changes in pre-processing improved correlation between the two. We have developed a simple tool to remove low complexity regions, RepeatSoaker, available at https://github.com/mdozmorov/RepeatSoaker, and tested its effect on the alignment statistics and the results of the enrichment analyses. RESULTS: Both adapter trimming and duplicate removal moderately improved the strength of biological signals in RNA-seq and ChIP-seq data. Aggressive filtering of reads overlapping with low complexity regions, as defined by RepeatMasker, further improved the strength of biological signals, and the correlation between RNA-seq and microarray gene expression data. CONCLUSIONS: Adapter trimming and duplicate removal, coupled with filtering out reads overlapping low complexity regions, are shown to increase the quality and reliability of detecting biological signals in RNA-seq and ChIP-seq data.
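Filtering out reads that overlap low complexity regions can be sketched as an interval-overlap test. This is a minimal illustration, not RepeatSoaker's implementation: the tuple layout `(chrom, start, end)`, the `max_fraction` parameter and the function names are assumptions made for the example, and the masked intervals are assumed not to overlap one another.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def filter_reads(reads, masked, max_fraction=0.0):
    """Keep reads whose masked fraction is at most `max_fraction`.

    reads, masked: lists of (chrom, start, end) half-open intervals.
    Assumes masked intervals are non-overlapping, so overlaps can be summed.
    """
    kept = []
    for chrom, start, end in reads:
        covered = sum(overlap(start, end, m_start, m_end)
                      for m_chrom, m_start, m_end in masked
                      if m_chrom == chrom)
        if covered / (end - start) <= max_fraction:
            kept.append((chrom, start, end))
    return kept


reads = [("chr1", 0, 50), ("chr1", 90, 140), ("chr2", 100, 150)]
low_complexity = [("chr1", 100, 200)]
strict = filter_reads(reads, low_complexity)        # drop any overlap
lenient = filter_reads(reads, low_complexity, 0.9)  # tolerate up to 90%
```

With the default threshold the second read is removed because 40 of its 50 bases fall in the masked interval; a real pipeline would operate on alignment files and use an indexed interval structure rather than a nested scan.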
|
376
|
He X, Cicek AE, Wang Y, Schulz MH, Le HS, Bar-Joseph Z. De novo ChIP-seq analysis. Genome Biol 2015; 16:205. [PMID: 26400819 PMCID: PMC4579611 DOI: 10.1186/s13059-015-0756-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/11/2015] [Accepted: 08/19/2015] [Indexed: 12/21/2022] Open
Abstract
Methods for the analysis of chromatin immunoprecipitation sequencing (ChIP-seq) data start by aligning the short reads to a reference genome. While often successful, they are not appropriate for cases where a reference genome is not available. Here we develop methods for de novo analysis of ChIP-seq data. Our methods combine de novo assembly with statistical tests enabling motif discovery without the use of a reference genome. We validate the performance of our method using human and mouse data. Analysis of fly data indicates that our method outperforms alignment-based methods that utilize closely related species.
Affiliation(s)
- Xin He
- Department of Human Genetics, The University of Chicago, 920 E. 58th Street, CLSC, Chicago, IL, 60637, USA.
- A Ercument Cicek
- Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA.
- Department of Computer Engineering, Bilkent University, Ankara, 06800, Turkey.
- Yuhao Wang
- Computer Science and Artificial Intelligence Laboratory, 32 Vassar Street, MIT, Cambridge, MA, 02139, USA.
- Marcel H Schulz
- Multimodal Computing and Interaction, Saarland University & Max Planck Institute for Informatics, Saarbrücken, 66123, Saarland, Germany.
- Hai-Son Le
- Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA. hple+@cs.cmu.edu
- Ziv Bar-Joseph
- Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA.
|
377
|
Franchini LF, Pollard KS. Genomic approaches to studying human-specific developmental traits. Development 2015; 142:3100-12. [DOI: 10.1242/dev.120048] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 12/22/2022]
Abstract
Changes in developmental regulatory programs drive both disease and phenotypic differences among species. Linking human-specific traits to alterations in development is challenging, because we have lacked the tools to assay and manipulate regulatory networks in human and primate embryonic cells. This field was transformed by the sequencing of hundreds of genomes – human and non-human – that can be compared to discover the regulatory machinery of genes involved in human development. This approach has identified thousands of human-specific genome alterations in developmental genes and their regulatory regions. With recent advances in stem cell techniques, genome engineering, and genomics, we can now test these sequences for effects on developmental gene regulation and downstream phenotypes in human cells and tissues.
Affiliation(s)
- Lucía F. Franchini
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires C1428, Argentina
- Katherine S. Pollard
- Gladstone Institutes, San Francisco, CA 94158, USA
- Institute for Human Genetics, Department of Epidemiology & Biostatistics, University of California, San Francisco, CA 94158, USA
|
378
|
Savic D, Partridge EC, Newberry KM, Smith SB, Meadows SK, Roberts BS, Mackiewicz M, Mendenhall EM, Myers RM. CETCh-seq: CRISPR epitope tagging ChIP-seq of DNA-binding proteins. Genome Res 2015; 25:1581-9. [PMID: 26355004 PMCID: PMC4579343 DOI: 10.1101/gr.193540.115] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 04/24/2015] [Accepted: 08/14/2015] [Indexed: 01/16/2023]
Abstract
Chromatin immunoprecipitation followed by next-generation DNA sequencing (ChIP-seq) is a widely used technique for identifying transcription factor (TF) binding events throughout an entire genome. However, ChIP-seq is limited by the availability of suitable ChIP-seq grade antibodies, and the vast majority of commercially available antibodies fail to generate usable data sets. To ameliorate these technical obstacles, we present a robust methodological approach for performing ChIP-seq through epitope tagging of endogenous TFs. We used clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-based genome editing technology to develop CRISPR epitope tagging ChIP-seq (CETCh-seq) of DNA-binding proteins. We assessed the feasibility of CETCh-seq by tagging several DNA-binding proteins spanning a wide range of endogenous expression levels in the hepatocellular carcinoma cell line HepG2. Our data exhibit strong correlations between both replicate types as well as with standard ChIP-seq approaches that use TF antibodies. Notably, we also observed minimal changes to the cellular transcriptome and to the expression of the tagged TF. To examine the robustness of our technique, we further performed CETCh-seq in the breast adenocarcinoma cell line MCF7 as well as mouse embryonic stem cells and observed similarly high correlations. Collectively, these data highlight the applicability of CETCh-seq to accurately define the genome-wide binding profiles of DNA-binding proteins, allowing for a straightforward methodology to potentially assay the complete repertoire of TFs, including the large fraction for which ChIP-quality antibodies are not available.
Affiliation(s)
- Daniel Savic
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Sophia B Smith
- University of Alabama in Huntsville, Huntsville, Alabama 35899, USA
- Sarah K Meadows
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Brian S Roberts
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
- Eric M Mendenhall
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA; University of Alabama in Huntsville, Huntsville, Alabama 35899, USA
- Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
|
379
|
Buisine N, Ruan X, Bilesimo P, Grimaldi A, Alfama G, Ariyaratne P, Mulawadi F, Chen J, Sung WK, Liu ET, Demeneix BA, Ruan Y, Sachs LM. Xenopus tropicalis Genome Re-Scaffolding and Re-Annotation Reach the Resolution Required for In Vivo ChIA-PET Analysis. PLoS One 2015; 10:e0137526. [PMID: 26348928 PMCID: PMC4562602 DOI: 10.1371/journal.pone.0137526] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/20/2015] [Accepted: 08/19/2015] [Indexed: 12/11/2022] Open
Abstract
Genome-wide functional analyses require high-resolution genome assembly and annotation. We applied ChIA-PET to analyze gene regulatory networks, including 3D chromosome interactions, underlying thyroid hormone (TH) signaling in the frog Xenopus tropicalis. As the available versions of Xenopus tropicalis assembly and annotation lacked the resolution required for ChIA-PET, we improved the genome assembly version 4.1 and annotations using data derived from the paired end tag (PET) sequencing technologies and approaches (e.g., DNA-PET [gPET], RNA-PET etc.). The large insert (~10 kb, ~17 kb) paired end DNA-PET with high throughput NGS sequencing not only significantly improved genome assembly quality, but also strongly reduced genome “fragmentation”, reducing total scaffold numbers by ~60%. Next, RNA-PET technology, designed and developed for the detection of full-length transcripts and fusion mRNA in whole transcriptome studies (ENCODE consortia), was applied to capture the 5' and 3' ends of transcripts. These amendments in assembly and annotation were essential prerequisites for the ChIA-PET analysis of TH transcription regulation. Their application revealed complex regulatory configurations of target genes and the structures of the regulatory networks underlying physiological responses. Our work allowed us to improve the quality of Xenopus tropicalis genomic resources, reaching the standard required for ChIA-PET analysis of transcriptional networks. We consider that the workflow proposed offers useful conceptual and methodological guidance and can readily be applied to other non-conventional models that have low-resolution genome data.
Collapse
Affiliation(s)
- Nicolas Buisine
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
| | - Xiaoan Ruan
- The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
- Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
- Genome Institute of Singapore, Singapore, Singapore
| | - Patrice Bilesimo
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
- Watchfrog S.A.S., Evry, France
| | - Alexis Grimaldi
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
| | - Gladys Alfama
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
| | | | | | - Jieqi Chen
- Genome Institute of Singapore, Singapore, Singapore
| | | | - Edison T. Liu
- The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
- Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
- Genome Institute of Singapore, Singapore, Singapore
| | | | - Yijun Ruan
- The Jackson Laboratory of Genomic Medicine, Farmington, Connecticut, United States of America
- Department of Genetics and Developmental Biology, University of Connecticut, Farmington, Connecticut, United States of America
- Genome Institute of Singapore, Singapore, Singapore
- * E-mail: (YR); (LMS)
| | - Laurent M. Sachs
- UMR CNRS 7221, Muséum National d'Histoire Naturelle, Paris, France
- * E-mail: (YR); (LMS)
|
380
|
Bricker TM, Mummadisetti MP, Frankel LK. Recent advances in the use of mass spectrometry to examine structure/function relationships in photosystem II. J Photochem Photobiol B 2015; 152:227-46. [PMID: 26390944 DOI: 10.1016/j.jphotobiol.2015.08.031] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/28/2015] [Revised: 08/27/2015] [Accepted: 08/31/2015] [Indexed: 01/24/2023]
Abstract
Tandem mass spectrometry, often coupled with chemical modification techniques, is developing into an increasingly important tool in structural biology. These methods can provide important supplementary information concerning the structural organization and subunit make-up of membrane protein complexes, identification of conformational changes occurring during enzymatic reactions, identification of the location of posttranslational modifications, and elucidation of the structure of assembly and repair complexes. In this review, we will present a brief introduction to Photosystem II, tandem mass spectrometry and protein modification techniques that have been used to examine the photosystem. We will then discuss a number of recent case studies that have used these techniques to address open questions concerning PS II. These include the nature of subunit-subunit interactions within the phycobilisome, the interaction of phycobilisomes with Photosystem I and the Orange Carotenoid Protein, the location of CyanoQ, PsbQ and PsbP within Photosystem II, and the identification of phosphorylation and oxidative modification sites within the photosystem. Finally, we will discuss some of the future prospects for the use of these methods in examining other open questions in PS II structural biochemistry.
Affiliation(s)
- Terry M Bricker
- Department of Biological Sciences, Division of Biochemistry and Molecular Biology, Louisiana State University, Baton Rouge, LA 70803, United States.
| | - Manjula P Mummadisetti
- Department of Biological Sciences, Division of Biochemistry and Molecular Biology, Louisiana State University, Baton Rouge, LA 70803, United States
| | - Laurie K Frankel
- Department of Biological Sciences, Division of Biochemistry and Molecular Biology, Louisiana State University, Baton Rouge, LA 70803, United States
|
381
|
Schmidl C, Rendeiro AF, Sheffield NC, Bock C. ChIPmentation: fast, robust, low-input ChIP-seq for histones and transcription factors. Nat Methods 2015; 12:963-965. [PMID: 26280331 PMCID: PMC4589892 DOI: 10.1038/nmeth.3542] [Citation(s) in RCA: 308] [Impact Index Per Article: 34.2] [Received: 05/11/2015] [Accepted: 07/07/2015] [Indexed: 12/31/2022]
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to map histone marks and transcription factor binding throughout the genome. Here we present ChIPmentation, a method that combines chromatin immunoprecipitation with sequencing library preparation by Tn5 transposase (“tagmentation”). ChIPmentation introduces sequencing-compatible adapters in a single-step reaction directly on bead-bound chromatin, which reduces time, cost, and input requirements, thus providing a convenient and broadly useful alternative to existing ChIP-seq protocols.
Affiliation(s)
- Christian Schmidl
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - André F Rendeiro
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Nathan C Sheffield
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Christoph Bock
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria; Department of Laboratory Medicine, Medical University of Vienna, Vienna, Austria; Max Planck Institute for Informatics, Saarbrücken, Germany
|
382
|
Hass MR, Liow HH, Chen X, Sharma A, Inoue YU, Inoue T, Reeb A, Martens A, Fulbright M, Raju S, Stevens M, Boyle S, Park JS, Weirauch MT, Brent MR, Kopan R. SpDamID: Marking DNA Bound by Protein Complexes Identifies Notch-Dimer Responsive Enhancers. Mol Cell 2015; 59:685-97. [PMID: 26257285 DOI: 10.1016/j.molcel.2015.07.008] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 02/18/2015] [Revised: 06/11/2015] [Accepted: 07/02/2015] [Indexed: 12/20/2022]
Abstract
We developed Split DamID (SpDamID), a protein complementation version of DamID, to mark genomic DNA bound in vivo by interacting or juxtapositioned transcription factors. Inactive halves of DAM (DNA adenine methyltransferase) were fused to protein pairs to be queried. Either direct interaction between proteins or proximity enabled DAM reconstitution and methylation of adenine in GATC. Inducible SpDamID was used to analyze Notch-mediated transcriptional activation. We demonstrate that Notch complexes label RBP sites broadly across the genome and show that a subset of these complexes that recruit MAML and p300 undergo changes in chromatin accessibility in response to Notch signaling. SpDamID differentiates between monomeric and dimeric binding, thereby allowing for identification of half-site motifs used by Notch dimers. Motif enrichment of Notch enhancers coupled with SpDamID reveals co-targeting of regulatory sequences by Notch and Runx1. SpDamID represents a sensitive and powerful tool that enables dynamic analysis of combinatorial protein-DNA transactions at a genome-wide level.
Affiliation(s)
- Matthew R Hass
- Division of Developmental Biology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA.
| | - Hien-Haw Liow
- Center for Genome Sciences and Systems Biology, Washington University, Saint Louis, MO 63108, USA
| | - Xiaoting Chen
- School of Electronic and Computing Systems, University of Cincinnati, Cincinnati, OH 45221, USA; Center for Autoimmune Genomics and Etiology (CAGE) and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Ankur Sharma
- Division of Developmental Biology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Yukiko U Inoue
- Department of Biochemistry and Cellular Biology, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Kodaira, Tokyo 187-8502, Japan
| | - Takayoshi Inoue
- Department of Biochemistry and Cellular Biology, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Kodaira, Tokyo 187-8502, Japan
| | - Ashley Reeb
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Andrew Martens
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Mary Fulbright
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Saravanan Raju
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Michael Stevens
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Scott Boyle
- Department of Developmental Biology, Washington University, Saint Louis, MO 63110, USA
| | - Joo-Seop Park
- Division of Developmental Biology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Division of Pediatric Urology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Matthew T Weirauch
- Division of Developmental Biology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Center for Autoimmune Genomics and Etiology (CAGE) and Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Michael R Brent
- Center for Genome Sciences and Systems Biology, Washington University, Saint Louis, MO 63108, USA
| | - Raphael Kopan
- Division of Developmental Biology, Children's Hospital Medical Center, Cincinnati, OH 45229, USA.
|
383
|
Dobigny G, Britton-Davidian J, Robinson TJ. Chromosomal polymorphism in mammals: an evolutionary perspective. Biol Rev Camb Philos Soc 2015; 92:1-21. [PMID: 26234165 DOI: 10.1111/brv.12213] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 09/30/2014] [Revised: 06/23/2015] [Accepted: 07/09/2015] [Indexed: 12/28/2022]
Abstract
Although chromosome rearrangements (CRs) are central to studies of genome evolution, our understanding of the evolutionary consequences of the early stages of karyotypic differentiation (i.e. polymorphism), especially the non-meiotic impacts, is surprisingly limited. We review the available data on chromosomal polymorphisms in mammals so as to identify taxa that hold promise for developing a more comprehensive understanding of chromosomal change. In doing so, we address several key questions: (i) to what extent are mammalian karyotypes polymorphic, and what types of rearrangements are principally involved? (ii) Are some mammalian lineages more prone to chromosomal polymorphism than others? More specifically, do (karyotypically) polymorphic mammalian species belong to lineages that are also characterized by past, extensive karyotype repatterning? (iii) How long can chromosomal polymorphisms persist in mammals? We discuss the evolutionary implications of these questions and propose several research avenues that may shed light on the role of chromosome change in the diversification of mammalian populations and species.
Affiliation(s)
- Gauthier Dobigny
- Institut de Recherche pour le Développement, Centre de Biologie pour la Gestion des Populations (UMR IRD-INRA-Cirad-Montpellier SupAgro), Campus International de Baillarguet, CS30016, 34988, Montferrier-sur-Lez, France
| | - Janice Britton-Davidian
- Institut des Sciences de l'Evolution, Université de Montpellier, CNRS, IRD, EPHE, Cc065, Place Eugène Bataillon, 34095, Montpellier Cedex 5, France
| | - Terence J Robinson
- Evolutionary Genomics Group, Department of Botany and Zoology, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, 7062, South Africa
|
384
|
Ozer A, Tome JM, Friedman RC, Gheba D, Schroth GP, Lis JT. Quantitative assessment of RNA-protein interactions with high-throughput sequencing-RNA affinity profiling. Nat Protoc 2015; 10:1212-33. [PMID: 26182240 PMCID: PMC4714542 DOI: 10.1038/nprot.2015.074] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 01/02/2023]
Abstract
Because RNA-protein interactions have a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay that couples sequencing on an Illumina GAIIx genome analyzer with the quantitative assessment of protein-RNA interactions. This assay is able to analyze interactions between one or possibly several proteins with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of the EGFP and negative elongation factor subunit E (NELF-E) proteins with their corresponding canonical and mutant RNA aptamers. Here we provide a detailed protocol for HiTS-RAP that can be completed in about a month (8 d hands-on time). This includes the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, HiTS and protein binding with a GAIIx instrument, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, quantitative analysis of RNA on a massively parallel array (RNA-MaP) and RNA Bind-n-Seq (RBNS), for quantitative analysis of RNA-protein interactions.
Affiliation(s)
- Abdullah Ozer
- Molecular Biology and Genetics Department, Cornell University, Ithaca, NY 14853, USA. Phone +1 (607) 255-2441, fax +1 (607) 255-6249
| | - Jacob M. Tome
- Molecular Biology and Genetics Department, Cornell University, Ithaca, NY 14853, USA. Phone +1 (607) 255-2441, fax +1 (607) 255-6249
| | - Robin C. Friedman
- Molecular Microbial Pathogenesis Unit, Institut Pasteur, 75724 Paris Cedex 15, FRANCE. +33 (0) 1-4438-9437
| | - Dan Gheba
- Illumina Inc., San Diego, CA 92121, USA. +1 (267) 251-4547, +1 (510) 670-9310
| | - Gary P. Schroth
- Illumina Inc., San Diego, CA 92121, USA. +1 (267) 251-4547, +1 (510) 670-9310
| | - John T. Lis
- Molecular Biology and Genetics Department, Cornell University, Ithaca, NY 14853, USA. Phone +1 (607) 255-2441, fax +1 (607) 255-6249
|
385
|
Erhard F, Zimmer R. Count ratio model reveals bias affecting NGS fold changes. Nucleic Acids Res 2015; 43:e136. [PMID: 26160885 PMCID: PMC4787746 DOI: 10.1093/nar/gkv696] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/23/2014] [Accepted: 06/25/2015] [Indexed: 01/01/2023] Open
Abstract
Various biases affect high-throughput sequencing read counts. Contrary to the general assumption, we show that bias does not always cancel out when fold changes are computed and that bias affects more than 20% of genes that are called differentially regulated in RNA-seq experiments with drastic effects on subsequent biological interpretation. Here, we propose a novel approach to estimate fold changes. Our method is based on a probabilistic model that directly incorporates count ratios instead of read counts. It provides a theoretical foundation for pseudo-counts and can be used to estimate fold change credible intervals as well as normalization factors that outperform currently used normalization methods. We show that fold change estimates are significantly improved by our method by comparing RNA-seq derived fold changes to qPCR data from the MAQC/SEQC project as a reference and analyzing random barcoded sequencing data. Our software implementation is freely available from the project website http://www.bio.ifi.lmu.de/software/lfc.
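The link between pseudo-counts and a probabilistic model of count ratios can be illustrated with a small sketch. It assumes a deliberately simplified setting that is not taken from the paper: the reads of one gene in two equally deep libraries follow a binomial split, and the split proportion has a symmetric Beta prior whose parameter plays the role of the pseudo-count. Under that assumption the posterior mode of the log ratio is the pseudo-count-adjusted log fold change, and a credible interval can be read off posterior samples. The function names and the default `pseudo=1.0` are hypothetical and do not reflect the authors' software.

```python
import math
import random


def log2_fold_change(a, b, pseudo=1.0):
    """Pseudo-count-adjusted log2 fold change for read counts a and b.

    Assumption: a | (a + b) is binomial with proportion p and p has a
    Beta(pseudo, pseudo) prior. The posterior mode of log2(p / (1 - p))
    is then log2((a + pseudo) / (b + pseudo)).
    """
    return math.log2((a + pseudo) / (b + pseudo))


def credible_interval(a, b, pseudo=1.0, level=0.95, draws=20000, seed=0):
    """Monte Carlo credible interval for the log2 fold change."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    samples = []
    for _ in range(draws):
        p = rng.betavariate(a + pseudo, b + pseudo)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        samples.append(math.log2(p / (1.0 - p)))
    samples.sort()
    lower = samples[int((1.0 - level) / 2.0 * draws)]
    upper = samples[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper


if __name__ == "__main__":
    print(log2_fold_change(30, 10))
    print(credible_interval(30, 10))
```

Libraries of unequal depth would need a normalization factor before this comparison; the sketch leaves that step out.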
Affiliation(s)
- Florian Erhard
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
| | - Ralf Zimmer
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstraße 17, 80333 München, Germany
|
386
|
Tretyakova NY, Groehler A, Ji S. DNA-Protein Cross-Links: Formation, Structural Identities, and Biological Outcomes. Acc Chem Res 2015; 48:1631-44. [PMID: 26032357 PMCID: PMC4704791 DOI: 10.1021/acs.accounts.5b00056] [Citation(s) in RCA: 134] [Impact Index Per Article: 14.9] [Indexed: 02/08/2023]
Abstract
Noncovalent DNA-protein interactions are at the heart of normal cell function. In eukaryotic cells, genomic DNA is wrapped around histone octamers to allow for chromosomal packaging in the nucleus. Binding of regulatory protein factors to DNA directs replication, controls transcription, and mediates cellular responses to DNA damage. Because of their fundamental significance in all cellular processes involving DNA, dynamic DNA-protein interactions are required for cell survival, and their disruption is likely to have serious biological consequences. DNA-protein cross-links (DPCs) form when cellular proteins become covalently trapped on DNA strands upon exposure to various endogenous, environmental and chemotherapeutic agents. DPCs progressively accumulate in the brain and heart tissues as a result of endogenous exposure to reactive oxygen species and lipid peroxidation products, as well as normal cellular metabolism. A range of structurally diverse DPCs are found following treatment with chemotherapeutic drugs, transition metal ions, and metabolically activated carcinogens. Because of their considerable size and their helix-distorting nature, DPCs interfere with the progression of replication and transcription machineries and hence hamper the faithful expression of genetic information, potentially contributing to mutagenesis and carcinogenesis. Mass spectrometry-based studies have identified hundreds of proteins that can become cross-linked to nuclear DNA in the presence of reactive oxygen species, carcinogen metabolites, and antitumor drugs. While many of these proteins including histones, transcription factors, and repair proteins are known DNA binding partners, other gene products with no documented affinity for DNA also participate in DPC formation. Furthermore, multiple sites within DNA can be targeted for cross-linking including the N7 of guanine, the C-5 methyl group of thymine, and the exocyclic amino groups of guanine, cytosine, and adenine. 
This structural complexity complicates structural and biological studies of DPC lesions. Two general strategies have been developed for creating DNA strands containing structurally defined, site-specific DPCs. Enzymatic methodologies that trap DNA modifying proteins on their DNA substrate are site specific and efficient, but do not allow for systematic studies of DPC lesion structure on their biological outcomes. Synthetic methodologies for DPC formation are based on solid phase synthesis of oligonucleotide strands containing protein-reactive unnatural DNA bases. The latter approach allows for a wider range of protein substrates to be conjugated to DNA and affords a greater flexibility for the attachment sites within DNA. In this Account, we outline the chemistry of DPC formation in cells, describe our recent efforts to identify the cross-linked proteins by mass spectrometry, and discuss various methodologies for preparing DNA strands containing structurally defined, site specific DPC lesions. Polymerase bypass experiments conducted with model DPCs indicate that the biological outcomes of these bulky lesions are strongly dependent on the peptide/protein size and the exact cross-linking site within DNA. Future studies are needed to elucidate the mechanisms of DPC repair and their biological outcomes in living cells.
Affiliation(s)
- Natalia Y. Tretyakova
- Masonic Cancer Center and the Department of Medicinal Chemistry, University of Minnesota, Minneapolis, MN 55455
| | - Arnold Groehler
- Masonic Cancer Center and the Department of Medicinal Chemistry, University of Minnesota, Minneapolis, MN 55455
| | - Shaofei Ji
- Masonic Cancer Center and the Department of Medicinal Chemistry, University of Minnesota, Minneapolis, MN 55455
|
387
|
Multiplexing of ChIP-Seq Samples in an Optimized Experimental Condition Has Minimal Impact on Peak Detection. PLoS One 2015; 10:e0129350. [PMID: 26066343 PMCID: PMC4466019 DOI: 10.1371/journal.pone.0129350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2014] [Accepted: 05/07/2015] [Indexed: 11/19/2022] Open
Abstract
Multiplexing samples in sequencing experiments is a common approach to maximize information yield while minimizing cost. In most cases the number of samples that are multiplexed is determined by financial considerations or experimental convenience, with limited understanding of the effects on the experimental results. Here we set out to examine the impact of multiplexing ChIP-seq experiments on the ability to identify a specific epigenetic modification. We performed peak detection analyses to determine the effects of multiplexing. These include false discovery rates, size, position and statistical significance of peak detection, and changes in gene annotation. We found that, for the histone mark H3K4me3, one can multiplex up to 8 samples (7 IP + 1 input) at ~21 million single-end reads each and still detect over 90% of all peaks found when using a full lane for a single sample (~181 million reads). Furthermore, no variations are introduced by indexing or lane batch effects and, importantly, there is no significant reduction in the number of genes with neighboring H3K4me3 peaks. We conclude that, for a well-characterized antibody and, therefore, a model IP condition, multiplexing 8 samples per lane is sufficient to capture most of the biological signal.
|
388
|
Harbers M. Shift-Western Blotting: Separate Analysis of Protein and DNA from Protein-DNA Complexes. Methods Mol Biol 2015; 1312:355-73. [PMID: 26044017 DOI: 10.1007/978-1-4939-2694-7_36] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 02/16/2023]
Abstract
The electrophoretic mobility shift assay (EMSA) is the most frequently used experiment for studying protein-DNA interactions and identifying DNA-binding proteins. Protein-DNA complexes formed during EMSA experiments can be further analyzed by shift-western blotting, where the protein and DNA components contained in a polyacrylamide gel are transferred to stacked membranes: first, a nitrocellulose membrane retains the proteins, while double-stranded DNA passes through the nitrocellulose membrane and binds only to a charged membrane placed below. Immobilized proteins can then be stained with specific antibodies, while the DNA can be detected by a radioactive label or a nonradioactive detection system. Shift-western blotting can overcome many limitations of supershift experiments and allows for the analysis of complex protein-DNA complexes containing multiple protein factors. Moreover, proteins and/or DNA may be recovered from membranes after the blotting step for further analysis by other means.
Affiliation(s)
- Matthias Harbers
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
|
389
|
Stavreva DA, Coulon A, Baek S, Sung MH, John S, Stixova L, Tesikova M, Hakim O, Miranda T, Hawkins M, Stamatoyannopoulos JA, Chow CC, Hager GL. Dynamics of chromatin accessibility and long-range interactions in response to glucocorticoid pulsing. Genome Res 2015; 25:845-57. [PMID: 25677181 PMCID: PMC4448681 DOI: 10.1101/gr.184168.114] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 09/09/2014] [Accepted: 02/05/2015] [Indexed: 12/20/2022]
Abstract
Although physiological steroid levels are often pulsatile (ultradian), the genomic effects of this pulsatility are poorly understood. By utilizing glucocorticoid receptor (GR) signaling as a model system, we uncovered striking spatiotemporal relationships between receptor loading, lifetimes of the DNase I hypersensitivity sites (DHSs), long-range interactions, and gene regulation. We found that hormone-induced DHSs were enriched within ± 50 kb of GR-responsive genes and displayed a broad spectrum of lifetimes upon hormone withdrawal. These lifetimes dictate the strength of the DHS interactions with gene targets and contribute to gene regulation from a distance. Our results demonstrate that pulsatile and constant hormone stimulations induce unique, treatment-specific patterns of gene and regulatory element activation. These modes of activation have implications for corticosteroid function in vivo and for steroid therapies in various clinical settings.
Affiliation(s)
- Diana A Stavreva
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | - Antoine Coulon
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, Maryland 20892, USA
| | - Songjoon Baek
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | - Myong-Hee Sung
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | - Sam John
- Laboratory of Genome Integrity, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | - Lenka Stixova
- Department of Molecular Cytology and Cytometry, Institute of Biophysics, Academy of Sciences of the Czech Republic, 612 65 Brno, Czech Republic
| | - Martina Tesikova
- Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Ofir Hakim
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 5290002, Israel
| | - Tina Miranda
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | - Mary Hawkins
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
| | | | - Carson C Chow
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, Maryland 20892, USA
| | - Gordon L Hager
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA
|
390
|
Bailey SD, Virtanen C, Haibe-Kains B, Lupien M. ABC: a tool to identify SNVs causing allele-specific transcription factor binding from ChIP-Seq experiments. Bioinformatics 2015; 31:3057-9. [PMID: 25995231 DOI: 10.1093/bioinformatics/btv321] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 02/13/2015] [Accepted: 05/18/2015] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Detection of allelic imbalances in ChIP-Seq reads is a powerful approach to identify functional non-coding single nucleotide variants (SNVs), either polymorphisms or mutations, which modulate the affinity of transcription factors for chromatin. We present ABC, a computational tool that identifies allele-specific binding of transcription factors from aligned ChIP-Seq reads at heterozygous SNVs. ABC controls for potential false positives resulting from biases introduced by the use of short sequencing reads in ChIP-Seq and can efficiently process a large number of heterozygous SNVs. RESULTS: ABC successfully identifies previously characterized functional SNVs, such as the rs4784227 breast cancer risk associated SNP that modulates the affinity of FOXA1 for the chromatin. AVAILABILITY AND IMPLEMENTATION: The code is open-source under an Artistic-2.0 license and versioned on GitHub (https://github.com/mlupien/ABC/). ABC is written in PERL and can be run on any platform with both PERL (≥5.18.1) and R (≥3.1.1) installed. The script requires the PERL Statistics::R module. CONTACT: mlupien@uhnres.utoronto.ca SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
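The general idea of testing for allelic imbalance at a heterozygous SNV can be sketched with an exact binomial test on the reads carrying each allele. This is a generic illustration only: the abstract does not describe ABC's statistical procedure or its bias controls, and the function names and the expected reference fraction of 0.5 are assumptions made for this example.

```python
from math import exp, lgamma, log


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k. Log-space terms avoid overflow at high depth.
    Requires 0 < p < 1.
    """
    def pmf(i):
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_coeff + i * log(p) + (n - i) * log(1.0 - p))

    probs = [pmf(i) for i in range(n + 1)]
    cutoff = probs[k] * (1.0 + 1e-7)  # tolerance so ties count as equal
    return min(1.0, sum(x for x in probs if x <= cutoff))


def allelic_imbalance(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """P-value for deviation from the expected allelic ratio.

    Returns None when no reads cover the variant.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return binom_two_sided_p(ref_reads, total, expected_ref_fraction)


if __name__ == "__main__":
    print(allelic_imbalance(10, 0))  # strongly imbalanced site
    print(allelic_imbalance(5, 5))   # balanced site
```

A real analysis would also need to correct for reference-allele mapping bias and for multiple testing across SNVs, which this sketch does not attempt.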
Affiliation(s)
- Swneke D Bailey
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada and Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Carl Virtanen
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada and Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada and Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Mathieu Lupien
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada, Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada and Ontario Institute for Cancer Research, Toronto, ON, Canada
|
391
|
Angelini C, Heller R, Volkinshtein R, Yekutieli D. Is this the right normalization? A diagnostic tool for ChIP-seq normalization. BMC Bioinformatics 2015; 16:150. [PMID: 25957089 PMCID: PMC4448883 DOI: 10.1186/s12859-015-0579-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/03/2014] [Accepted: 04/20/2015] [Indexed: 12/21/2022] Open
Abstract
Background: ChIP-seq experiments are becoming a standard approach for genome-wide profiling of protein-DNA interactions, such as detecting transcription factor binding sites, histone modification marks and RNA Polymerase II occupancy. However, when comparing a ChIP sample versus a control sample, such as Input DNA, normalization procedures have to be applied in order to remove experimental sources of bias. Despite the substantial impact that the choice of the normalization method can have on the results of a ChIP-seq data analysis, the assessment of these methods is not fully explored in the literature. In particular, there are no diagnostic tools that show whether the applied normalization is indeed appropriate for the data being analyzed. Results: In this work we propose a novel diagnostic tool to examine the appropriateness of the estimated normalization procedure. By plotting the empirical densities of log relative risks in bins of equal read count, along with the estimated normalization constant, after logarithmic transformation, the researcher is able to assess the appropriateness of the estimated normalization constant. We use the diagnostic plot to evaluate the appropriateness of the estimates obtained by CisGenome, NCIS and CCAT on several real data examples. Moreover, we show the impact that the choice of the normalization constant can have on standard tools for peak calling such as MACS or SICER. Finally, we propose a novel procedure for controlling the FDR using sample swapping. This procedure makes use of the estimated normalization constant in order to gain power over the naive choice of constant (used in MACS and SICER), which is the ratio of the total number of reads in the ChIP and Input samples. Conclusions: Linear normalization approaches aim to estimate a scale factor, r, to adjust for different sequencing depths when comparing ChIP versus Input samples. The estimated scaling factor can easily be incorporated in many peak caller algorithms to improve the accuracy of the peak identification. The diagnostic plot proposed in this paper can be used to assess how adequate ChIP/Input normalization constants are, and thus it allows the user to choose the most adequate estimate for the analysis. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0579-z) contains supplementary material, which is available to authorized users.
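The naive constant mentioned in this abstract (the ratio of total ChIP reads to total Input reads) and the idea of inspecting log ratios within groups of bins that share the same read count can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy counts, the use of log(ChIP/Input) per bin as the "log relative risk", and the rule of skipping bins with a zero count are all assumptions made here.

```python
import math
from collections import defaultdict


def naive_scale_factor(chip_counts, input_counts):
    """Naive constant: total ChIP reads divided by total Input reads."""
    total_input = sum(input_counts)
    if total_input == 0:
        raise ValueError("Input sample has no reads")
    return sum(chip_counts) / total_input


def log_ratios_by_total(chip_counts, input_counts):
    """Group genomic bins by total read count and collect log(ChIP/Input).

    Assumption: bins where either count is zero are skipped, because the
    log ratio is undefined there.
    """
    groups = defaultdict(list)
    for chip, inp in zip(chip_counts, input_counts):
        if chip > 0 and inp > 0:
            groups[chip + inp].append(math.log(chip / inp))
    return dict(groups)


# Hypothetical per-bin read counts for a ChIP sample and its Input control.
chip = [4, 2, 10, 3, 1]
inp = [2, 2, 5, 1, 0]

r = naive_scale_factor(chip, inp)         # 20 reads / 10 reads
groups = log_ratios_by_total(chip, inp)   # keys are total reads per bin
log_r = math.log(r)                       # reference value on the log scale
```

In a diagnostic of the kind the abstract describes, the distribution of log ratios within each group would be compared against the logarithm of an estimated constant; plotting is left out of this sketch.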
Affiliation(s)
- Claudia Angelini
  - Istituto per le Applicazioni del Calcolo "Mauro Picone", Via Pietro Castellino, 111, Naples, 80131, Italy
- Ruth Heller
  - Department of Statistics and Operations Research, Tel Aviv University, Ramat Aviv, Tel Aviv, 69978, Israel
- Rita Volkinshtein
  - Department of Statistics and Operations Research, Tel Aviv University, Ramat Aviv, Tel Aviv, 69978, Israel
- Daniel Yekutieli
  - Department of Statistics and Operations Research, Tel Aviv University, Ramat Aviv, Tel Aviv, 69978, Israel
|
392
|
Yang HJ, Ratnapriya R, Cogliati T, Kim JW, Swaroop A. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease. Prog Retin Eye Res 2015; 46:1-30. [PMID: 25668385 PMCID: PMC4402139 DOI: 10.1016/j.preteyeres.2015.01.005] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 11/11/2014] [Revised: 01/18/2015] [Accepted: 01/21/2015] [Indexed: 01/10/2023]
Abstract
Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases.
Affiliation(s)
- Hyun-Jin Yang
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD 20892-0610, USA
- Rinki Ratnapriya
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD 20892-0610, USA
- Tiziana Cogliati
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD 20892-0610, USA
- Jung-Woong Kim
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD 20892-0610, USA
- Anand Swaroop
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, 6 Center Drive, Bethesda, MD 20892-0610, USA
|
393
|
Shao M, Sun Y, Zhou S. Identifying TF-MiRNA Regulatory Relationships Using Multiple Features. PLoS One 2015; 10:e0125156. [PMID: 25922940 PMCID: PMC4414601 DOI: 10.1371/journal.pone.0125156] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/07/2014] [Accepted: 03/20/2015] [Indexed: 12/31/2022] Open
Abstract
MicroRNAs are known to play important roles in the transcriptional and post-transcriptional regulation of gene expression. While intensive research has been conducted to identify miRNAs and their target genes in various genomes, there is only limited knowledge about how microRNAs are regulated. In this study, we construct a pipeline that can infer the regulatory relationships between transcription factors and microRNAs from ChIP-Seq data with high confidence. In particular, after identifying candidate peaks from ChIP-Seq data, we formulate the inference as a PU learning (learning from only positive and unlabeled examples) problem. Multiple features including the statistical significance of the peaks, the location of the peaks, the transcription factor binding site motifs, and the evolutionary conservation are derived from peaks for training and prediction. To further improve the accuracy of our inference, we also apply a mean reciprocal rank (MRR)-based method to the candidate peaks. We apply our pipeline to infer TF-miRNA regulatory relationships in mouse embryonic stem cells. The experimental results show that our approach provides very specific findings of TF-miRNA regulatory relationships.
Affiliation(s)
- Mingyu Shao
  - School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, 220 Handan Road, Shanghai 200433, China
  - Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, 48824, USA
- Yanni Sun
  - Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, 48824, USA
  - * E-mail: (YS); (SZ)
- Shuigeng Zhou
  - School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, 220 Handan Road, Shanghai 200433, China
  - * E-mail: (YS); (SZ)
|
394
|
Wu DY, Bittencourt D, Stallcup MR, Siegmund KD. Identifying differential transcription factor binding in ChIP-seq. Front Genet 2015; 6:169. [PMID: 25972895 PMCID: PMC4413818 DOI: 10.3389/fgene.2015.00169] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/22/2015] [Accepted: 04/14/2015] [Indexed: 12/19/2022] Open
Abstract
ChIP-seq is a widely used assay to measure genome-wide protein binding. The decrease in costs associated with sequencing has led to a rise in the number of studies that investigate protein binding across treatment conditions or cell lines. In addition to the identification of binding sites, new studies evaluate the variation in protein binding between conditions. A number of approaches to study differential transcription factor binding have recently been developed. Several of these methods build upon established methods from RNA-seq to quantify differences in read counts. We compare how these new approaches perform on different data sets from the ENCODE project to illustrate the impact of data processing pipelines under different study designs. The performance of normalization methods for differential ChIP-seq depends strongly on the variation in total amount of protein bound between conditions, with total read count outperforming effective library size, or variants thereof, when a large variation in binding was studied. Use of input subtraction to correct for non-specific binding showed a relatively modest impact on the number of differential peaks found and the fold change accuracy to biological validation; however, a larger impact might be expected for samples with more extreme copy number variations between them. Still, it did identify a small subset of novel differential regions while excluding some differential peaks in regions with high background signal. These results highlight proper scaling for between-sample data normalization as critical for differential transcription factor binding analysis and suggest bioinformaticians need to know about the variation in level of total protein binding between conditions to select the best analysis method. At the same time, validation using fold-change estimates from qRT-PCR suggests there is still room for further method improvement.
Affiliation(s)
- Dai-Ying Wu
  - Department of Biochemistry and Molecular Biology, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Danielle Bittencourt
  - Department of Biochemistry and Molecular Biology, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Michael R Stallcup
  - Department of Biochemistry and Molecular Biology, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
- Kimberly D Siegmund
  - Department of Preventive Medicine, University of Southern California Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
|
395
|
Danino YM, Even D, Ideses D, Juven-Gershon T. The core promoter: At the heart of gene expression. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2015; 1849:1116-31. [PMID: 25934543 DOI: 10.1016/j.bbagrm.2015.04.003] [Citation(s) in RCA: 102] [Impact Index Per Article: 11.3] [Received: 02/17/2015] [Revised: 04/19/2015] [Accepted: 04/23/2015] [Indexed: 12/17/2022]
Abstract
The identities of different cells and tissues in multicellular organisms are determined by tightly controlled transcriptional programs that enable accurate gene expression. The mechanisms that regulate gene expression comprise diverse multiplayer molecular circuits of multiple dedicated components. The RNA polymerase II (Pol II) core promoter establishes the center of this spatiotemporally orchestrated molecular machine. Here, we discuss transcription initiation, diversity in core promoter composition, interactions of the basal transcription machinery with the core promoter, enhancer-promoter specificity, core promoter-preferential activation, enhancer RNAs, Pol II pausing, transcription termination, Pol II recycling and translation. We further discuss recent findings indicating that promoters and enhancers share similar features and may not substantially differ from each other, as previously assumed. Taken together, we review a broad spectrum of studies that highlight the importance of the core promoter and its pivotal role in the regulation of metazoan gene expression and suggest future research directions and challenges.
Affiliation(s)
- Yehuda M Danino
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 5290002, Israel
- Dan Even
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 5290002, Israel
- Diana Ideses
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 5290002, Israel
- Tamar Juven-Gershon
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 5290002, Israel
|
396
|
Abstract
Over the past decade, techniques based on chromosome conformation capture (3C) have accelerated our understanding of eukaryotic nuclear architecture. Coupled to high-throughput sequencing and bioinformatics, they have unveiled different organizational levels of the genome at an unprecedented scale. Initially performed using large populations of cells, a new variant of these techniques can be applied to single cells. Although it can be shown that chromosome folding varies from one cell to another, the overall organization of chromosomes into topologically associating domains is conserved between cells of the same population. Interestingly, the predicted chromosome structures reveal that regions engaged in trans-chromosomal interactions are preferentially localized at the surface of the chromosome territory. These results confirm and extend previous observations on individual loci, thereby highlighting the power of 3C-based techniques.
Affiliation(s)
- David Umlauf
  - Université de Toulouse, université Paul Sabatier, laboratoire de biologie moléculaire des eucaryotes, CNRS, 118 route de Narbonne, 31000 Toulouse, France
|
397
|
Yadav SS, Li J, Lavery HJ, Yadav KK, Tewari AK. Next-generation sequencing technology in prostate cancer diagnosis, prognosis, and personalized treatment. Urol Oncol 2015; 33:267.e1-13. [PMID: 25791755 DOI: 10.1016/j.urolonc.2015.02.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 11/12/2014] [Revised: 02/11/2015] [Accepted: 02/12/2015] [Indexed: 02/06/2023]
Abstract
Next-generation sequencing (NGS) of the genetic information of cancer cells has revolutionized the field of cancer biology, including prostate cancer (PCa). New recurrent alterations have been identified in PCa (e.g., TMPRSS2-ERG translocation, SPOP and CHD1 mutations, and chromoplexy), and many previous ones in well-established pathways have been validated (e.g., androgen receptor overexpression and mutations; PTEN, RB1, and TP53 loss/mutations). With its highly heterogeneous nature, PCa continues to pose a tremendous challenge in terms of diagnosis and prognosis. Combining the information gained through NGS studies with clinicopathological and radiological data will help diagnose the aggressiveness of the cancer with greater accuracy. Furthermore, understanding tumor heterogeneity through single-cell or single-molecule sequencing technology will also strengthen the prognosis and provide better, patient-specific drug identification. As this research becomes more prominent, it is important that urologic oncologists become familiar with the various NGS technologies and the results generated using them. We highlight the commonly used NGS tools and summarize recent discoveries relevant to PCa.
Affiliation(s)
- Shalini S Yadav
  - Department of Urology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY
- Jinyi Li
  - Department of Urology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY
- Hugh J Lavery
  - Department of Urology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY
- Kamlesh K Yadav
  - Department of Urology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY
- Ashutosh K Tewari
  - Department of Urology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY
|
398
|
Enkhmandakh B, Bayarsaihan D. Genome-wide Chromatin Mapping Defines AP2α in the Etiology of Craniofacial Disorders. Cleft Palate Craniofac J 2015; 52:135-42. [DOI: 10.1597/13-151] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022] Open
Abstract
Objective: The aim of this study is to identify direct AP2α target genes implicated in craniofacial morphogenesis. Design: AP2α, a product of the TFAP2A gene, is a master regulator of neural crest differentiation and development. AP2α is expressed in ectoderm and in migrating cranial neural crest (NC) cells that provide patterning information during orofacial development and generate most of the skull bones and the cranial ganglia. Mutations in TFAP2A cause branchio-oculofacial syndrome characterized by dysmorphic facial features including cleft or pseudocleft lip/palate. We hypothesize that AP2α primes a distinctive group of genes associated with NC development. Human promoter ChIP-chip arrays were used to define chromatin regions bound by AP2α in neural crest progenitors differentiated from human embryonic stem cells. Results: High-confidence AP2α-binding peaks were detected in the regulatory regions of many target genes involved in the development of facial tissues including MSX1, IRF6, TBX22, and MAFB. In addition, we uncovered multiple single-nucleotide polymorphisms (SNPs) disrupting a conserved AP2α consensus sequence. Conclusions: Knowledge of noncoding SNPs in the genomic loci occupied by AP2α provides an insight into the regulatory mechanisms underlying craniofacial development.
Affiliation(s)
- Badam Enkhmandakh
  - Center for Regenerative Medicine and Skeletal Development, Department of Reconstructive Sciences, School of Dentistry, University of Connecticut Health Center, Farmington, Connecticut
- Dashzeveg Bayarsaihan
  - Center for Regenerative Medicine and Skeletal Development, Department of Reconstructive Sciences, School of Dentistry, University of Connecticut Health Center, Farmington, Connecticut
|
399
|
Ikeda M, Matsuzaki T. Regulation of aquaporins by vasopressin in the kidney. Vitamins and Hormones 2015; 98:307-37. [PMID: 25817873 DOI: 10.1016/bs.vh.2014.12.008] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 12/15/2022]
Abstract
Vasopressin is the main hormone that regulates water conservation in mammals and one of its major targets is the principal cells in the renal collecting duct. Vasopressin increases the apical water permeability of principal cells, mediated by apical accumulation of aquaporin-2 (AQP2), a water channel protein, thus facilitating water reabsorption by the kidney. The mechanisms underlying the accumulation of AQP2 in response to vasopressin include vesicular trafficking from intracellular storage vesicles expressing AQP2 within several tens of minutes (short-term regulation) and protein expression of AQP2 over a period of hours to days (long-term regulation). This chapter reviews vasopressin signaling in the kidney, focusing on the molecular mechanisms of short- and long-term regulations of AQP2 expression.
Affiliation(s)
- Masahiro Ikeda
  - Department of Veterinary Pharmacology, University of Miyazaki, Miyazaki, Japan
- Toshiyuki Matsuzaki
  - Department of Anatomy and Cell Biology, Gunma University Graduate School of Medicine, Maebashi, Japan
|
400
|
Cloonan N. Re-thinking miRNA-mRNA interactions: intertwining issues confound target discovery. Bioessays 2015; 37:379-88. [PMID: 25683051 PMCID: PMC4671252 DOI: 10.1002/bies.201400191] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Received: 10/27/2014] [Revised: 12/19/2014] [Accepted: 12/19/2014] [Indexed: 12/20/2022]
Abstract
Despite a library full of literature on miRNA biology, core issues relating to miRNA target detection, biological effect, and mode of action remain controversial. This essay proposes that the predominant mechanism of direct miRNA action is translational inhibition, whereas the bulk of miRNA effects are mRNA based. It explores several issues confounding miRNA target detection, and discusses their impact on the dominance of “miRNA seed” dogma and the exploration of non-canonical binding sites. Finally, it makes comparisons between miRNA target prediction and transcription factor binding prediction, and questions the value of characterizing miRNA binding sites based on which miRNA nucleotides are paired with an mRNA.
Affiliation(s)
- Nicole Cloonan
  - QIMR Berghofer Medical Research Institute, Genomic Biology Lab, Herston, QLD, Australia
|