1. Zhao K, Oualkacha K, Zeng Y, Shen C, Klein K, Lakhal-Chaieb L, Labbe A, Pastinen T, Hudson M, Colmegna I, Bernatsky S, Greenwood CMT. Addressing dispersion in mis-measured multivariate binomial outcomes: A novel statistical approach for detecting differentially methylated regions in bisulfite sequencing data. Stat Med 2024. [PMID: 38932470 DOI: 10.1002/sim.10149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 04/13/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024]
Abstract
Motivated by a DNA methylation application, this article addresses the problem of fitting and inferring a multivariate binomial regression model for outcomes that are contaminated by errors and exhibit extra-parametric variations, also known as dispersion. While dispersion in univariate binomial regression has been extensively studied, addressing dispersion in the context of multivariate outcomes remains a complex and relatively unexplored task. The complexity arises from a noteworthy data characteristic observed in our motivating dataset: non-constant yet correlated dispersion across outcomes. To address this challenge and account for possible measurement error, we propose a novel hierarchical quasi-binomial varying coefficient mixed model, which enables flexible dispersion patterns through a combination of additive and multiplicative dispersion components. To maximize the Laplace-approximated quasi-likelihood of our model, we further develop a specialized two-stage expectation-maximization (EM) algorithm, where a plug-in estimate for the multiplicative scale parameter enhances the speed and stability of the EM iterations. Simulations demonstrated that our approach yields accurate inference for smooth covariate effects and exhibits excellent power in detecting non-zero effects. Additionally, we applied our proposed method to investigate the association between DNA methylation, measured across the genome through targeted custom capture sequencing of whole blood, and levels of anti-citrullinated protein antibodies (ACPA), a preclinical marker for rheumatoid arthritis (RA) risk. Our analysis revealed 23 significant genes that potentially contribute to ACPA-related differential methylation, highlighting the relevance of cell signaling and collagen metabolism in RA. We implemented our method in the R Bioconductor package called "SOMNiBUS."
Affiliation(s)
- Kaiqiong Zhao: Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada
- Karim Oualkacha: Département de Mathématiques, Université du Québec à Montréal, Montreal, Quebec, Canada
- Yixiao Zeng: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
- Cathy Shen: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
- Kathleen Klein: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
- Lajmi Lakhal-Chaieb: Département de Mathématiques et de Statistique, Université Laval, Quebec, Quebec, Canada
- Aurélie Labbe: Département de Sciences de la Décision, HEC Montréal, Montreal, Quebec, Canada
- Tomi Pastinen: Genomic Medicine Center, Children's Mercy, Independence, Missouri, USA
- Marie Hudson: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada; Department of Medicine, McGill University, Montreal, Quebec, Canada
- Inés Colmegna: Department of Medicine, McGill University, Montreal, Quebec, Canada; The Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
- Sasha Bernatsky: Department of Medicine, McGill University, Montreal, Quebec, Canada; The Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada
- Celia M T Greenwood: Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada; Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada; Department of Human Genetics and Gerald Bronfman Department of Oncology, McGill University, Montreal, Quebec, Canada
2. Bell CG. Epigenomic insights into common human disease pathology. Cell Mol Life Sci 2024; 81:178. [PMID: 38602535 PMCID: PMC11008083 DOI: 10.1007/s00018-024-05206-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 03/13/2024] [Indexed: 04/12/2024]
Abstract
The epigenome (the chemical modifications and chromatin-related packaging of the genome) enables the same genetic template to be activated or repressed in different cellular settings. This multi-layered mechanism facilitates cell-type specific function by setting the local sequence and 3D interactive activity level. Gene transcription is further modulated through the interplay with transcription factors and co-regulators. The human body requires this epigenomic apparatus to be precisely installed throughout development and then adequately maintained during the lifespan. The causal role of the epigenome in human pathology, beyond imprinting disorders and specific tumour suppressor genes, was further brought into the spotlight by large-scale sequencing projects identifying that mutations in epigenomic machinery genes could be critical drivers in both cancer and developmental disorders. Abrogation of this cellular mechanism is providing new molecular insights into pathogenesis. However, deciphering the full breadth and implications of these epigenomic changes remains challenging. Knowledge is accruing regarding disease mechanisms and clinical biomarkers, through pathogenically relevant and surrogate tissue analyses, respectively. Advances include consortia generated cell-type specific reference epigenomes, high-throughput DNA methylome association studies, as well as insights into ageing-related diseases from biological 'clocks' constructed by machine learning algorithms. Also, 3rd-generation sequencing is beginning to disentangle the complexity of genetic and DNA modification haplotypes. Cell-free DNA methylation as a cancer biomarker has clear clinical utility and further potential to assess organ damage across many disorders. Finally, molecular understanding of disease aetiology brings with it the opportunity for exact therapeutic alteration of the epigenome through CRISPR-activation or inhibition.
Affiliation(s)
- Christopher G Bell: William Harvey Research Institute, Barts & The London Faculty of Medicine, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ, UK
3. Klemmensen MM, Borrowman SH, Pearce C, Pyles B, Chandra B. Mitochondrial dysfunction in neurodegenerative disorders. Neurotherapeutics 2024; 21:e00292. [PMID: 38241161 PMCID: PMC10903104 DOI: 10.1016/j.neurot.2023.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 10/07/2023] [Indexed: 01/21/2024]
Abstract
Recent advances in understanding the role of mitochondrial dysfunction in neurodegenerative diseases have expanded the opportunities for neurotherapeutics targeting mitochondria to alleviate symptoms and slow disease progression. In this review, we offer a historical account of advances in mitochondrial biology and neurodegenerative disease. Additionally, we summarize current knowledge of the normal physiology of mitochondria and the pathogenesis of mitochondrial dysfunction, the role of mitochondrial dysfunction in neurodegenerative disease, current therapeutics and recent therapeutic advances, as well as future directions for neurotherapeutics targeting mitochondrial function. A focus is placed on reactive oxygen species and their role in the disruption of telomeres and their effects on the epigenome. The effects of mitochondrial dysfunction in the etiology and progression of Alzheimer's disease, amyotrophic lateral sclerosis, Parkinson's disease, and Huntington's disease are discussed in depth. Current clinical trials for mitochondria-targeting neurotherapeutics are discussed.
Affiliation(s)
- Madelyn M Klemmensen: University of Iowa Roy J and Lucille A Carver College of Medicine, Iowa City, IA 52242, USA
- Seth H Borrowman: Division of Medical Genetics and Genomics, Stead Family Department of Pediatrics, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Colin Pearce: Division of Medical Genetics and Genomics, Stead Family Department of Pediatrics, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
- Benjamin Pyles: Aper Funis Research, Union River Innovation Center, Ellsworth, ME 04605, USA
- Bharatendu Chandra: University of Iowa Roy J and Lucille A Carver College of Medicine, Iowa City, IA 52242, USA; Division of Medical Genetics and Genomics, Stead Family Department of Pediatrics, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA
4. Halla-Aho V, Lähdesmäki H. LuxUS: DNA methylation analysis using generalized linear mixed model with spatial correlation. Bioinformatics 2020; 36:4535-4543. [PMID: 32484876 PMCID: PMC7750928 DOI: 10.1093/bioinformatics/btaa539] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/20/2019] [Revised: 05/05/2020] [Accepted: 05/27/2020] [Indexed: 11/19/2022]
Abstract
Motivation: DNA methylation is an important epigenetic modification, which has multiple functions. DNA methylation and its connections to diseases have been extensively studied in recent years. It is known that DNA methylation levels of neighboring cytosines are correlated and that differential DNA methylation typically occurs rather as regions instead of individual cytosine level. Results: We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in underlying biochemistry. LuxUS can model both binary and continuous covariates, and mixed model formulation enables including replicate and cytosine random effects. Spatial correlation is included to the model through a cytosine random effect correlation structure. We show with simulation experiments that using the spatial correlation, we gain more power to the statistical testing of differential DNA methylation. Results with real bisulfite sequencing dataset show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability and implementation: The tool is available at https://github.com/hallav/LuxUS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Viivi Halla-Aho: Department of Computer Science, Aalto University, FI-00076 Aalto, Finland
- Harri Lähdesmäki: Department of Computer Science, Aalto University, FI-00076 Aalto, Finland
5. Chung RH, Kang CY. pWGBSSimla: a profile-based whole-genome bisulfite sequencing data simulator incorporating methylation QTLs, allele-specific methylations and differentially methylated regions. Bioinformatics 2020; 36:660-665. [PMID: 31397839 DOI: 10.1093/bioinformatics/btz635] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/24/2018] [Revised: 08/05/2019] [Accepted: 08/08/2019] [Indexed: 12/19/2022]
Abstract
Motivation: DNA methylation plays an important role in regulating gene expression. DNA methylation is commonly analyzed using bisulfite sequencing (BS-seq)-based designs, such as whole-genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS) and oxidative bisulfite sequencing (oxBS-seq). Furthermore, there has been growing interest in investigating the roles that genetic variants play in changing the methylation levels (i.e. methylation quantitative trait loci or meQTLs), how methylation regulates the imprinting of gene expression (i.e. allele-specific methylation or ASM) and the differentially methylated regions (DMRs) among different cell types. However, none of the current simulation tools can generate different BS-seq data types (e.g. WGBS, RRBS and oxBS-seq) while modeling meQTLs, ASM and DMRs. Results: We developed profile-based whole-genome bisulfite sequencing data simulator (pWGBSSimla), a profile-based bisulfite sequencing data simulator, which simulates WGBS, RRBS and oxBS-seq data for different cell types based on real data. meQTLs and ASM are modeled based on the block structures of the methylation status at CpGs, whereas the simulation of DMRs is based on observations of methylation rates in real data. We demonstrated that pWGBSSimla adequately simulates data and allows performance comparisons among different methylation analysis methods. Availability and implementation: pWGBSSimla is available at https://omicssimla.sourceforge.io. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ren-Hua Chung: Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 350, Taiwan
- Chen-Yu Kang: Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan 350, Taiwan
6. Zhao K, Oualkacha K, Lakhal-Chaieb L, Labbe A, Klein K, Ciampi A, Hudson M, Colmegna I, Pastinen T, Zhang T, Daley D, Greenwood CMT. A novel statistical method for modeling covariate effects in bisulfite sequencing derived measures of DNA methylation. Biometrics 2020; 77:424-438. [PMID: 32438470 PMCID: PMC8359306 DOI: 10.1111/biom.13307] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/15/2019] [Revised: 02/28/2020] [Accepted: 05/08/2020] [Indexed: 01/24/2023]
Abstract
Identifying disease-associated changes in DNA methylation can help us gain a better understanding of disease etiology. Bisulfite sequencing allows the generation of high-throughput methylation profiles at single-base resolution of DNA. However, optimally modeling and analyzing these sparse and discrete sequencing data is still very challenging due to variable read depth, missing data patterns, long-range correlations, data errors, and confounding from cell type mixtures. We propose a regression-based hierarchical model that allows covariate effects to vary smoothly along genomic positions and we have built a specialized EM algorithm, which explicitly allows for experimental errors and cell type mixtures, to make inference about smooth covariate effects in the model. Simulations show that the proposed method provides accurate estimates of covariate effects and captures the major underlying methylation patterns with excellent power. We also apply our method to analyze data from rheumatoid arthritis patients and controls. The method has been implemented in R package SOMNiBUS.
Affiliation(s)
- Kaiqiong Zhao: Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada; Lady Davis Institute for Medical Research, Montreal, QC, Canada
- Karim Oualkacha: Département de Mathématiques, Université du Québec à Montréal, Montreal, QC, Canada
- Lajmi Lakhal-Chaieb: Département de Mathématiques et de Statistique, Université Laval, Quebec City, QC, Canada
- Aurélie Labbe: Département des Sciences de la Décision, HEC Montréal, Montreal, QC, Canada
- Kathleen Klein: Lady Davis Institute for Medical Research, Montreal, QC, Canada
- Antonio Ciampi: Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada; Lady Davis Institute for Medical Research, Montreal, QC, Canada
- Marie Hudson: Lady Davis Institute for Medical Research, Montreal, QC, Canada; Department of Medicine, McGill University, Montreal, QC, Canada
- Inés Colmegna: Department of Medicine, McGill University, Montreal, QC, Canada; The Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Tomi Pastinen: Center for Pediatric Genomic Medicine, Children's Mercy Kansas City, Kansas City, MO, USA
- Tieyuan Zhang: Department of Psychiatry, Douglas Mental Health University Institute, McGill University, Montreal, QC, Canada
- Denise Daley: The Centre for Heart Lung Innovation, and Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Celia M T Greenwood: Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada; Lady Davis Institute for Medical Research, Montreal, QC, Canada; Department of Human Genetics and Gerald Bronfman Department of Oncology, McGill University, Montreal, QC, Canada
7. Lightbody G, Haberland V, Browne F, Taggart L, Zheng H, Parkes E, Blayney JK. Review of applications of high-throughput sequencing in personalized medicine: barriers and facilitators of future progress in research and clinical application. Brief Bioinform 2019; 20:1795-1811. [PMID: 30084865 PMCID: PMC6917217 DOI: 10.1093/bib/bby051] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 01/30/2018] [Revised: 05/01/2018] [Indexed: 12/28/2022]
Abstract
There has been an exponential growth in the performance and output of sequencing technologies (omics data) with full genome sequencing now producing gigabases of reads on a daily basis. These data may hold the promise of personalized medicine, leading to routinely available sequencing tests that can guide patient treatment decisions. In the era of high-throughput sequencing (HTS), computational considerations, data governance and clinical translation are the greatest rate-limiting steps. To ensure that the analysis, management and interpretation of such extensive omics data is exploited to its full potential, key factors, including sample sourcing, technology selection and computational expertise and resources, need to be considered, leading to an integrated set of high-performance tools and systems. This article provides an up-to-date overview of the evolution of HTS and the accompanying tools, infrastructure and data management approaches that are emerging in this space, which, if used within a multidisciplinary context, may ultimately facilitate the development of personalized medicine.
Affiliation(s)
- Gaye Lightbody: School of Computing, Ulster University, Newtownabbey, UK
- Valeriia Haberland: MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Fiona Browne: School of Computing, Ulster University, Newtownabbey, UK
- Huiru Zheng: School of Computing, Ulster University, Newtownabbey, UK
- Eileen Parkes: Centre for Cancer Research & Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University, Belfast, UK
- Jaine K Blayney: Centre for Cancer Research & Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University, Belfast, UK
8. Ochoa E, Zuber V, Fernandez-Jimenez N, Bilbao JR, Clark GR, Maher ER, Bottolo L. MethylCal: Bayesian calibration of methylation levels. Nucleic Acids Res 2019; 47:e81. [PMID: 31049595 PMCID: PMC6698668 DOI: 10.1093/nar/gkz325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/16/2018] [Revised: 03/24/2019] [Accepted: 04/20/2019] [Indexed: 12/16/2022]
Abstract
Bisulfite amplicon sequencing has become the primary choice for single-base methylation quantification of multiple targets in parallel. The main limitation of this technology is a preferential amplification of an allele and strand in the PCR due to methylation state. This effect, known as 'PCR bias', causes inaccurate estimation of the methylation levels and calibration methods based on standard controls have been proposed to correct for it. Here, we present a Bayesian calibration tool, MethylCal, which can analyse jointly all CpGs within a CpG island (CGI) or a Differentially Methylated Region (DMR), avoiding 'one-at-a-time' CpG calibration. This enables more precise modeling of the methylation levels observed in the standard controls. It also provides accurate predictions of the methylation levels not considered in the controlled experiment, a feature that is paramount in the derivation of the corrected methylation degree. We tested the proposed method on eight independent assays (two CpG islands and six imprinting DMRs) and demonstrated its benefits, including the ability to detect outliers. We also evaluated MethylCal's calibration in two practical cases, a clinical diagnostic test on 18 patients potentially affected by Beckwith-Wiedemann syndrome, and 17 individuals with celiac disease. The calibration of the methylation levels obtained by MethylCal allows a clearer identification of patients undergoing loss or gain of methylation in borderline cases and could influence further clinical or treatment decisions.
Affiliation(s)
- Eguzkine Ochoa: Department of Medical Genetics, University of Cambridge, Cambridge CB2 0QQ, UK; Cambridge NIHR Biomedical Research Centre, Cambridge CB2 0QQ, UK
- Verena Zuber: Department of Epidemiology and Biostatistics, Imperial College London, London W2 1PG, UK; MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK
- Nora Fernandez-Jimenez: Department of Genetics, Physical Anthropology and Animal Physiology, Biocruces-Bizkaia Health Research Institute, University of the Basque Country (UPV/EHU), Leioa, Bizkaia, Spain
- Jose Ramon Bilbao: Department of Genetics, Physical Anthropology and Animal Physiology, Biocruces-Bizkaia Health Research Institute, University of the Basque Country (UPV/EHU), Leioa, Bizkaia, Spain; CIBERDEM Diabetes and Associated Metabolic Diseases, Spain
- Graeme R Clark: Department of Medical Genetics, University of Cambridge, Cambridge CB2 0QQ, UK; Cambridge NIHR Biomedical Research Centre, Cambridge CB2 0QQ, UK
- Eamonn R Maher: Department of Medical Genetics, University of Cambridge, Cambridge CB2 0QQ, UK; Cambridge NIHR Biomedical Research Centre, Cambridge CB2 0QQ, UK
- Leonardo Bottolo: Department of Medical Genetics, University of Cambridge, Cambridge CB2 0QQ, UK; MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, UK; The Alan Turing Institute, London NW1 2DB, UK
9. Identification of Ceruloplasmin as a Gene that Affects Susceptibility to Glomerulonephritis Through Macrophage Function. Genetics 2017; 206:1139-1151. [PMID: 28450461 PMCID: PMC5499168 DOI: 10.1534/genetics.116.197376] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/02/2016] [Accepted: 04/05/2017] [Indexed: 12/31/2022]
Abstract
Crescentic glomerulonephritis (Crgn) is a complex disorder where macrophage activity and infiltration are significant effector causes. In previous linkage studies using the uniquely susceptible Wistar Kyoto (WKY) rat strain, we have identified multiple crescentic glomerulonephritis QTL (Crgn) and positionally cloned genes underlying Crgn1 and Crgn2, which accounted for 40% of total variance in glomerular inflammation. Here, we have generated a backcross (BC) population (n = 166) where Crgn1 and Crgn2 were genetically fixed and found significant linkage to glomerular crescents on chromosome 2 (Crgn8, LOD = 3.8). Fine mapping analysis by integration with genome-wide expression QTLs (eQTLs) from the same BC population identified ceruloplasmin (Cp) as a positional eQTL in macrophages but not in serum. Liquid chromatography-tandem mass spectrometry confirmed Cp as a protein QTL in rat macrophages. WKY macrophages overexpress Cp and its downregulation by RNA interference decreases markers of glomerular proinflammatory macrophage activation. Similarly, short incubation with Cp results in a strain-dependent macrophage polarization in the rat. These results suggest that genetically determined Cp levels can alter susceptibility to Crgn through macrophage function and propose a new role for Cp in early macrophage activation.