1
Malonzo MH, Lähdesmäki H. LuxHMM: DNA methylation analysis with genome segmentation via hidden Markov model. BMC Bioinformatics 2023; 24:58. [PMID: 36810075 PMCID: PMC9945676 DOI: 10.1186/s12859-023-05174-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Accepted: 02/06/2023] [Indexed: 02/23/2023] Open
Abstract
BACKGROUND: DNA methylation plays an important role in studying the epigenetics of various biological processes including many diseases. Although differential methylation of individual cytosines can be informative, given that methylation of neighboring CpGs is typically correlated, analysis of differentially methylated regions is often of more interest. RESULTS: We have developed a probabilistic method and software, LuxHMM, that uses a hidden Markov model (HMM) to segment the genome into regions and a Bayesian regression model, which allows handling of multiple covariates, to infer differential methylation of regions. Moreover, our model includes experimental parameters that describe the underlying biochemistry in bisulfite sequencing, and model inference is done using either variational inference for efficient genome-scale analysis or Hamiltonian Monte Carlo (HMC). CONCLUSIONS: Analyses of real and simulated bisulfite sequencing data demonstrate the competitive performance of LuxHMM compared with other published differential methylation analysis methods.
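The segmentation idea in this abstract can be illustrated with a small sketch. This is not the LuxHMM implementation: the two-state layout, the binomial emission probabilities (`p_states`), the self-transition probability (`stay`) and all function names are assumptions chosen only to show how Viterbi decoding can turn per-CpG read counts into contiguous regions.

```python
import math

def binom_loglik(meth, total, p):
    """Binomial log-likelihood of `meth` methylated reads out of `total`.

    The binomial coefficient is dropped because it is identical for every
    state and therefore does not change the Viterbi path.
    """
    return meth * math.log(p) + (total - meth) * math.log(1.0 - p)

def viterbi_segment(counts, p_states=(0.1, 0.9), stay=0.9):
    """Most likely state path for a list of (methylated, total) counts.

    State 0 is "low methylation", state 1 is "high methylation"; both the
    emission probabilities and `stay` are illustrative assumptions.
    """
    n_states = len(p_states)
    log_stay = math.log(stay)
    log_switch = math.log((1.0 - stay) / (n_states - 1))
    # Uniform initial distribution over states.
    score = [math.log(1.0 / n_states) + binom_loglik(*counts[0], p)
             for p in p_states]
    back = []
    for meth, total in counts[1:]:
        new_score, pointers = [], []
        for j, p in enumerate(p_states):
            cands = [score[i] + (log_stay if i == j else log_switch)
                     for i in range(n_states)]
            best = max(range(n_states), key=lambda i: cands[i])
            pointers.append(best)
            new_score.append(cands[best] + binom_loglik(meth, total, p))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def segments(path):
    """Collapse a state path into (start, end_exclusive, state) regions."""
    regions, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            regions.append((start, i, path[start]))
            start = i
    return regions

# Six hypothetical CpGs: three lowly methylated, then three highly methylated.
example_counts = [(1, 10), (0, 10), (2, 10), (9, 10), (10, 10), (8, 10)]
example_path = viterbi_segment(example_counts)
example_regions = segments(example_path)
```

In a setting like the one the abstract describes, regions found this way would then be passed to a separate regression step to test for differential methylation; that step is not sketched here.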
Affiliation(s)
- Maia H. Malonzo
  - Department of Computer Science, Aalto University, 00076 Espoo, Finland
- Harri Lähdesmäki
  - Department of Computer Science, Aalto University, 00076 Espoo, Finland

2
Kyriakopoulos C, Nordström K, Kramer PL, Gottfreund JY, Salhab A, Arand J, Müller F, von Meyenn F, Ficz G, Reik W, Wolf V, Walter J, Giehr P. A comprehensive approach for genome-wide efficiency profiling of DNA modifying enzymes. Cell Reports Methods 2022; 2:100187. [PMID: 35475220 PMCID: PMC9017147 DOI: 10.1016/j.crmeth.2022.100187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2021] [Revised: 01/19/2022] [Accepted: 03/01/2022] [Indexed: 10/25/2022]
Abstract
A precise understanding of DNA methylation dynamics is of great importance for a variety of biological processes including cellular reprogramming and differentiation. To date, complex integration of multiple and distinct genome-wide datasets is required to realize this task. We present GwEEP (genome-wide epigenetic efficiency profiling), a versatile approach to infer dynamic efficiencies of DNA modifying enzymes. GwEEP relies on genome-wide hairpin datasets, which are translated by a hidden Markov model into quantitative enzyme efficiencies with reported confidence around the estimates. GwEEP predicts de novo and maintenance methylation efficiencies of Dnmts and, furthermore, the hydroxylation efficiency of Tets. Its design also allows capturing further oxidation processes given available data. We show that GwEEP accurately predicts the epigenetic changes of ESCs following a Serum-to-2i shift and, applied to Tet TKO cells, confirms the hypothesized mutual interference between Dnmts and Tets.
Affiliation(s)
- Karl Nordström
  - Department of Genetics and Epigenetics, Saarland University, Campus A2.4, 66123 Saarbrücken, Germany
- Paula Linh Kramer
  - Computer Science Department, Saarland University, Campus E1.3, 66123 Saarbrücken, Germany
- Judith Yumiko Gottfreund
  - Department of Genetics and Epigenetics, Saarland University, Campus A2.4, 66123 Saarbrücken, Germany
- Abdulrahman Salhab
  - Department of Genetics and Epigenetics, Saarland University, Campus A2.4, 66123 Saarbrücken, Germany
- Julia Arand
  - Division of Cell and Developmental Biology, Medical University of Vienna, 1090 Vienna, Austria
- Fabian Müller
  - Department of Integrative Cellular Biology and Bioinformatics, Campus A2.4, 66123 Saarbrücken, Germany
- Ferdinand von Meyenn
  - Department of Health Sciences and Technology, ETH Zürich, Schorenstrasse 16, Schwerzenbach, 8603 Zürich, Switzerland
- Gabriella Ficz
  - Haemato-Oncology, Queen Mary University of London, London EC1M 6BQ, UK
- Wolf Reik
  - Epigenetics Department, Babraham Institute, Cambridge CB22 3AT, UK
- Verena Wolf
  - Computer Science Department, Saarland University, Campus E1.3, 66123 Saarbrücken, Germany
- Jörn Walter
  - Department of Genetics and Epigenetics, Saarland University, Campus A2.4, 66123 Saarbrücken, Germany
- Pascal Giehr
  - Department of Genetics and Epigenetics, Saarland University, Campus A2.4, 66123 Saarbrücken, Germany
  - Department of Health Sciences and Technology, ETH Zürich, Schorenstrasse 16, Schwerzenbach, 8603 Zürich, Switzerland

3
Malonzo MH, Halla-Aho V, Konki M, Lund RJ, Lähdesmäki H. LuxRep: a technical replicate-aware method for bisulfite sequencing data analysis. BMC Bioinformatics 2022; 23:41. [PMID: 35030989 PMCID: PMC8760685 DOI: 10.1186/s12859-021-04546-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Accepted: 12/20/2021] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND: DNA methylation is commonly measured using bisulfite sequencing (BS-seq). The quality of a BS-seq library is measured by its bisulfite conversion efficiency. Libraries with low conversion rates are typically excluded from analysis, resulting in reduced coverage and increased costs. RESULTS: We have developed a probabilistic method and software, LuxRep, that implements a general linear model and simultaneously accounts for technical replicates (libraries from the same biological sample) from different bisulfite-converted DNA libraries. Using simulations and actual DNA methylation data, we show that including technical replicates with low bisulfite conversion rates generates more accurate estimates of methylation levels and differentially methylated sites. Moreover, using variational inference reduces the computation time necessary for whole-genome analysis. CONCLUSIONS: In this work we show that taking into account technical replicates (i.e. libraries) of BS-seq data of varying bisulfite conversion rates, with their corresponding experimental parameters, improves methylation level estimation and differential methylation detection.
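To make the role of bisulfite conversion efficiency concrete, here is a deliberately simplified sketch. It is not the LuxRep model, which is a Bayesian model with further experimental parameters. It assumes only that unmethylated cytosines are converted with probability `bs_eff`, that methylated cytosines are never converted, and that there is no sequencing error; the function names are hypothetical.

```python
def expected_c_fraction(theta, bs_eff):
    """Expected fraction of reads showing "C" at a cytosine.

    theta  : true methylation level in [0, 1]
    bs_eff : bisulfite conversion efficiency in (0, 1]
    Methylated cytosines always read as C; unmethylated ones read as C only
    when conversion fails (simplifying assumption).
    """
    return theta + (1.0 - theta) * (1.0 - bs_eff)

def corrected_methylation(c_reads, total_reads, bs_eff):
    """Invert the relation above to get a point estimate of theta."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0.0 < bs_eff <= 1.0:
        raise ValueError("bs_eff must lie in (0, 1]")
    observed = c_reads / total_reads
    theta = (observed - (1.0 - bs_eff)) / bs_eff
    # Sampling noise can push the estimate outside [0, 1]; clip it.
    return min(1.0, max(0.0, theta))

# A library with 90% conversion efficiency and a true methylation level of 0.5
# is expected to show C in 55% of reads; the correction maps 55/100 back to 0.5.
naive_estimate = 55 / 100
corrected_estimate = corrected_methylation(55, 100, bs_eff=0.9)
```

Under these assumptions, the lower the conversion efficiency, the more the naive C fraction overstates methylation, which is one way to see why a library's conversion rate matters for the estimate instead of serving only as an exclusion criterion.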
Affiliation(s)
- Maia H Malonzo
  - Department of Computer Science, Aalto University, 00076, Espoo, Finland
- Viivi Halla-Aho
  - Department of Computer Science, Aalto University, 00076, Espoo, Finland
- Mikko Konki
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520, Turku, Finland
- Riikka J Lund
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520, Turku, Finland
- Harri Lähdesmäki
  - Department of Computer Science, Aalto University, 00076, Espoo, Finland
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520, Turku, Finland

4
Halla-Aho V, Lähdesmäki H. LuxUS: DNA methylation analysis using generalized linear mixed model with spatial correlation. Bioinformatics 2020; 36:4535-4543. [PMID: 32484876 PMCID: PMC7750928 DOI: 10.1093/bioinformatics/btaa539] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/20/2019] [Revised: 05/05/2020] [Accepted: 05/27/2020] [Indexed: 11/19/2022] Open
Abstract
Motivation: DNA methylation is an important epigenetic modification, which has multiple functions. DNA methylation and its connections to diseases have been extensively studied in recent years. It is known that DNA methylation levels of neighboring cytosines are correlated and that differential DNA methylation typically occurs across regions rather than at the level of individual cytosines. Results: We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in underlying biochemistry. LuxUS can model both binary and continuous covariates, and the mixed model formulation enables including replicate and cytosine random effects. Spatial correlation is included in the model through a cytosine random effect correlation structure. We show with simulation experiments that using the spatial correlation increases the statistical power of testing for differential DNA methylation. Results with a real bisulfite sequencing dataset show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability and implementation: The tool is available at https://github.com/hallav/LuxUS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Viivi Halla-Aho
  - Department of Computer Science, Aalto University, FI-00076 Aalto, Finland
- Harri Lähdesmäki
  - Department of Computer Science, Aalto University, FI-00076 Aalto, Finland

5
Chung RH, Kang CY. pWGBSSimla: a profile-based whole-genome bisulfite sequencing data simulator incorporating methylation QTLs, allele-specific methylations and differentially methylated regions. Bioinformatics 2020; 36:660-665. [PMID: 31397839 DOI: 10.1093/bioinformatics/btz635] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/24/2018] [Revised: 08/05/2019] [Accepted: 08/08/2019] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION: DNA methylation plays an important role in regulating gene expression. DNA methylation is commonly analyzed using bisulfite sequencing (BS-seq)-based designs, such as whole-genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS) and oxidative bisulfite sequencing (oxBS-seq). Furthermore, there has been growing interest in investigating the roles that genetic variants play in changing the methylation levels (i.e. methylation quantitative trait loci or meQTLs), how methylation regulates the imprinting of gene expression (i.e. allele-specific methylation or ASM) and the differentially methylated regions (DMRs) among different cell types. However, none of the current simulation tools can generate different BS-seq data types (e.g. WGBS, RRBS and oxBS-seq) while modeling meQTLs, ASM and DMRs. RESULTS: We developed pWGBSSimla, a profile-based whole-genome bisulfite sequencing data simulator, which simulates WGBS, RRBS and oxBS-seq data for different cell types based on real data. meQTLs and ASM are modeled based on the block structures of the methylation status at CpGs, whereas the simulation of DMRs is based on observations of methylation rates in real data. We demonstrated that pWGBSSimla adequately simulates data and allows performance comparisons among different methylation analysis methods. AVAILABILITY AND IMPLEMENTATION: pWGBSSimla is available at https://omicssimla.sourceforge.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ren-Hua Chung
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 350, Taiwan
- Chen-Yu Kang
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 350, Taiwan

6
Lio CWJ, Shukla V, Samaniego-Castruita D, González-Avalos E, Chakraborty A, Yue X, Schatz DG, Ay F, Rao A. TET enzymes augment activation-induced deaminase (AID) expression via 5-hydroxymethylcytosine modifications at the Aicda superenhancer. Sci Immunol 2019; 4:eaau7523. [PMID: 31028100 PMCID: PMC6599614 DOI: 10.1126/sciimmunol.aau7523] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 07/11/2018] [Accepted: 03/19/2019] [Indexed: 12/15/2022]
Abstract
TET enzymes are dioxygenases that promote DNA demethylation by oxidizing the methyl group of 5-methylcytosine to 5-hydroxymethylcytosine (5hmC). Here, we report a close correspondence between 5hmC-marked regions, chromatin accessibility and enhancer activity in B cells, and a strong enrichment for consensus binding motifs for basic region-leucine zipper (bZIP) transcription factors at TET-responsive genomic regions. Functionally, Tet2 and Tet3 regulate class switch recombination (CSR) in murine B cells by enhancing expression of Aicda, which encodes the activation-induced cytidine deaminase (AID) enzyme essential for CSR. TET enzymes deposit 5hmC, facilitate DNA demethylation, and maintain chromatin accessibility at two TET-responsive enhancer elements, TetE1 and TetE2, located within a superenhancer in the Aicda locus. Our data identify the bZIP transcription factor, ATF-like (BATF) as a key transcription factor involved in TET-dependent Aicda expression. 5hmC is not deposited at TetE1 in activated Batf-deficient B cells, indicating that BATF facilitates TET recruitment to this Aicda enhancer. Our study emphasizes the importance of TET enzymes for bolstering AID expression and highlights 5hmC as an epigenetic mark that captures enhancer dynamics during cell activation.
Affiliation(s)
- Chan-Wang J Lio
  - Division of Signaling and Gene Expression, La Jolla Institute, San Diego, CA, USA
- Vipul Shukla
  - Division of Signaling and Gene Expression, La Jolla Institute, San Diego, CA, USA
- Xiaojing Yue
  - Division of Signaling and Gene Expression, La Jolla Institute, San Diego, CA, USA
- David G Schatz
  - Department of Immunobiology, Yale University School of Medicine, New Haven, CT, USA
- Ferhat Ay
  - Division of Vaccine Discovery, La Jolla Institute, San Diego, CA, USA
- Anjana Rao
  - Division of Signaling and Gene Expression, La Jolla Institute, San Diego, CA, USA
  - Sanford Consortium for Regenerative Medicine, San Diego, CA, USA
  - Department of Pharmacology, University of California, San Diego, San Diego, CA, USA
  - Moores Cancer Center, University of California, San Diego, San Diego, CA, USA

7
Abstract
There are multiple chemical modifications of cytosine that are important to the regulation and ultimately the functional expression of the genome. To date no single experiment can capture these separate modifications, and integrative experimental designs are needed to fully characterize cytosine methylation and chemical modification. This chapter describes a generative probabilistic model, Lux, for integrative analysis of cytosine methylation and its oxidized variants. Lux simultaneously analyzes partially orthogonal bisulfite sequencing data sets to estimate proportions of different cytosine methylation modifications and to estimate multiple cytosine modifications for a single sample by integrating across experimental designs composed of multiple parallel destructive genomic measurements. Lux also considers the variation in measurements introduced by different imperfect experimental steps; the experimental variation can be quantified by using appropriate spike-in controls, allowing Lux to deconvolve the measurements and accurately recover the underlying signal.
Affiliation(s)
- Tarmo Äijö
  - Center for Computational Biology, Flatiron Institute, New York, NY, USA
- Richard Bonneau
  - Center for Computational Biology, Flatiron Institute, New York, NY, USA
  - Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY, USA
  - Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Harri Lähdesmäki
  - Department of Computer Science, Aalto University School of Science, Aalto, Finland

8
Tsagaratou A, Lio CWJ, Yue X, Rao A. TET Methylcytosine Oxidases in T Cell and B Cell Development and Function. Front Immunol 2017; 8:220. [PMID: 28408905 PMCID: PMC5374156 DOI: 10.3389/fimmu.2017.00220] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 01/04/2017] [Accepted: 02/16/2017] [Indexed: 11/13/2022] Open
Abstract
DNA methylation is established by DNA methyltransferases and is a key epigenetic mark. Ten-eleven translocation (TET) proteins are enzymes that oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) and further oxidization products (oxi-mCs), which indirectly promote DNA demethylation. Here, we provide an overview of the effect of TET proteins and altered DNA modification status in T and B cell development and function. We summarize current advances in our understanding of the role of TET proteins and 5hmC in T and B cells in both physiological and pathological contexts. We describe how TET proteins and 5hmC regulate DNA modification, chromatin accessibility, gene expression, and transcriptional networks and discuss potential underlying mechanisms and open questions in the field.
Affiliation(s)
- Ageliki Tsagaratou
  - Department of Signaling and Gene Expression, La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- Chan-Wang J Lio
  - Department of Signaling and Gene Expression, La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- Xiaojing Yue
  - Department of Signaling and Gene Expression, La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- Anjana Rao
  - Department of Signaling and Gene Expression, La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
  - Department of Pharmacology and Moores Cancer Center, University of California at San Diego, La Jolla, CA, USA
  - Sanford Consortium for Regenerative Medicine, La Jolla, CA, USA