1. Daca-Roszak P, Jaksik R, Paczkowska J, Witt M, Ziętkiewicz E. Discrimination between human populations using a small number of differentially methylated CpG sites: a preliminary study using lymphoblastoid cell lines and peripheral blood samples of European and Chinese origin. BMC Genomics 2020; 21:706. [PMID: 33045984 PMCID: PMC7549247 DOI: 10.1186/s12864-020-07092-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/31/2020] [Accepted: 09/22/2020] [Indexed: 02/08/2023]
Abstract
Background Epigenetics is one of the factors shaping natural variability observed among human populations. A small proportion of heritable inter-population differences are observed in the context of both the genome-wide methylation level and the methylation status of individual CpG sites. It has been demonstrated that a limited number of carefully selected differentially methylated sites may allow discrimination between main human populations. However, most of the few published studies have been performed exclusively on B-lymphocyte cell lines.
Results The goal of our study was to identify a set of CpG sites sufficient to discriminate between populations of European and Chinese ancestry based on the difference in the DNA methylation profile not only in cell lines but also in primary cell samples. The preliminary selection of CpG sites differentially methylated in these two populations (pop-CpGs) was based on the analysis of two groups of commercially available ethnically specific B-lymphocyte cell lines, performed using the Illumina Infinium Human Methylation 450 BeadChip Array. A subset of 10 pop-CpGs characterized by the best differentiating criteria (|Mdiff| > 1, q < 0.05; lack of confounding genomic features), and 10 additional CpGs in their immediate vicinity, were further tested using pyrosequencing technology in both B-lymphocyte cell lines and primary peripheral blood samples representing the two analyzed populations. To assess the population-discriminating potential of the selected set of CpGs (further referred to as “composite pop (CEU-CHB)-CpG marker”), three classification methods were applied. The predictive ability of the composite 8-site pop (CEU-CHB)-CpG marker was assessed using the 10-fold cross-validation method on two independent sets of samples.
Conclusions Our results showed that fewer than 10 pop-CpG sites may distinguish populations of European and Chinese ancestry; importantly, this small composite pop-CpG marker performs well in both lymphoblastoid cell lines and non-homogeneous blood samples regardless of gender.
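The selection criteria quoted in this abstract (|Mdiff| > 1, q < 0.05) can be illustrated with a minimal Python sketch. It assumes that Mdiff denotes the difference between the two populations' mean M-values, with the commonly used definition M = log2(beta / (1 - beta)); the abstract does not spell this out, so treat it as an assumption. The function names, the site identifiers, and the toy beta and q values are hypothetical, and the q-values are taken as given rather than computed.

```python
import math


def m_value(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value (assumed definition)."""
    beta = min(max(beta, eps), 1 - eps)  # keep the logit finite at 0 and 1
    return math.log2(beta / (1 - beta))


def mean(values):
    return sum(values) / len(values)


def m_diff(betas_a, betas_b):
    """Difference of mean M-values between two groups of samples."""
    return mean([m_value(b) for b in betas_a]) - mean([m_value(b) for b in betas_b])


def select_sites(group_a, group_b, q_values, m_cut=1.0, q_cut=0.05):
    """Return site ids passing both the |Mdiff| and the q-value thresholds."""
    selected = []
    for site in group_a:
        if abs(m_diff(group_a[site], group_b[site])) > m_cut and q_values[site] < q_cut:
            selected.append(site)
    return selected


# Hypothetical toy data: per-site beta values for two groups of samples.
group_a = {"site_1": [0.8, 0.8], "site_2": [0.55, 0.55], "site_3": [0.9, 0.9]}
group_b = {"site_1": [0.5, 0.5], "site_2": [0.5, 0.5], "site_3": [0.5, 0.5]}
q_values = {"site_1": 0.01, "site_2": 0.01, "site_3": 0.2}
print(select_sites(group_a, group_b, q_values))
```

In this toy input only the first site passes both filters: the second has too small an M-value difference and the third fails the q-value cutoff.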
Affiliation(s)
- Patrycja Daca-Roszak
  - Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479, Poznan, Poland
- Roman Jaksik
  - Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
- Julia Paczkowska
  - Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479, Poznan, Poland
- Michał Witt
  - Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479, Poznan, Poland
- Ewa Ziętkiewicz
  - Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, 60-479, Poznan, Poland
2. Liu D, Zhao L, Wang Z, Zhou X, Fan X, Li Y, Xu J, Hu S, Niu M, Song X, Li Y, Zuo L, Lei C, Zhang M, Tang G, Huang M, Zhang N, Duan L, Lv H, Zhang M, Li J, Xu L, Kong F, Feng R, Jiang Y. EWASdb: epigenome-wide association study database. Nucleic Acids Res 2019; 47:D989-D993. [PMID: 30321400 PMCID: PMC6323898 DOI: 10.1093/nar/gky942] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 08/14/2018] [Accepted: 10/04/2018] [Indexed: 12/29/2022]
Abstract
DNA methylation, the most intensively studied epigenetic modification, plays an important role in understanding the molecular basis of diseases. Furthermore, epigenome-wide association study (EWAS) provides a systematic approach to identify epigenetic variants underlying common diseases/phenotypes. However, there is no comprehensive database to archive the results of EWASs. To fill this gap, we developed the EWASdb, which is a part of 'The EWAS Project', to store the epigenetic association results of DNA methylation from EWASs. In its current version (v 1.0, up to July 2018), the EWASdb has curated 1319 EWASs associated with 302 diseases/phenotypes. There are three types of EWAS results curated in this database: (i) EWAS for single marker; (ii) EWAS for KEGG pathway and (iii) EWAS for GO (Gene Ontology) category. As the first comprehensive EWAS database, EWASdb has been searched or downloaded by researchers from 43 countries to date. We believe that EWASdb will become a valuable resource and significantly contribute to the epigenetic research of diseases/phenotypes and have potential clinical applications. EWASdb is freely available at http://www.ewas.org.cn/ewasdb or http://www.bioapp.org/ewasdb.
Affiliation(s)
- Di Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Linna Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Zhaoyang Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Xu Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Xiuzhao Fan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Yong Li
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Jing Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Simeng Hu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Miaomiao Niu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Xiuling Song
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Ying Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Lijiao Zuo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Changgui Lei
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Meng Zhang
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China; Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin, China
- Guoping Tang
  - The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang, China
- Min Huang
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China; Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin, China
- Nan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Lian Duan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Jin Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Liangde Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Fanwu Kong
  - Department of Nephrology, The Second Affiliated Hospital, Harbin Medical University, Harbin, China
- Rennan Feng
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China; Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin, China
- Yongshuai Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
3. Huang J, Bai L, Cui B, Wu L, Wang L, An Z, Ruan S, Yu Y, Zhang X, Chen J. Leveraging biological and statistical covariates improves the detection power in epigenome-wide association testing. Genome Biol 2020; 21:88. [PMID: 32252795 PMCID: PMC7132874 DOI: 10.1186/s13059-020-02001-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/08/2019] [Accepted: 03/17/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND Epigenome-wide association studies (EWAS), which seek the association between epigenetic marks and an outcome or exposure, involve multiple hypothesis testing. False discovery rate (FDR) control has been widely used for multiple testing correction. However, traditional FDR control methods do not use auxiliary covariates, and they could be less powerful if the covariates could inform the likelihood of the null hypothesis. Recently, many covariate-adaptive FDR control methods have been developed, but application of these methods to EWAS data has not yet been explored. It is not clear whether these methods can significantly improve detection power, and if so, which covariates are more relevant for EWAS data.
RESULTS In this study, we evaluate the performance of five covariate-adaptive FDR control methods with EWAS-related covariates using simulated as well as real EWAS datasets. We develop an omnibus test to assess the informativeness of the covariates. We find that statistical covariates are generally more informative than biological covariates, and the covariates of methylation mean and variance are almost universally informative. In contrast, the informativeness of biological covariates depends on specific datasets. We show that the independent hypothesis weighting (IHW) and covariate adaptive multiple testing (CAMT) methods are overall more powerful, especially for sparse signals, and could improve the detection power by a median of 25% and 68%, respectively, on real datasets, compared to the ST procedure. We further validate the findings in various biological contexts.
CONCLUSIONS Covariate-adaptive FDR control methods with informative covariates can significantly increase the detection power for EWAS. For sparse signals, IHW and CAMT are recommended.
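For readers unfamiliar with covariate-free FDR control, the following is a minimal Python sketch of the Benjamini-Hochberg step-up procedure, one familiar example of a traditional method that ignores auxiliary covariates. It is shown only for orientation: it is not the ST procedure used as the baseline in this abstract, nor IHW or CAMT, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example))
```

Every hypothesis is compared against the same rank-based threshold regardless of any side information, which is the limitation that covariate-adaptive methods aim to address.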
Affiliation(s)
- Jinyan Huang
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Ling Bai
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Bowen Cui
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Liang Wu
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Liwen Wang
  - Department of General Surgery, Rui-Jin Hospital, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Zhiyin An
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Shulin Ruan
  - State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai, 200025, China
- Yue Yu
  - Division of Digital Health Sciences, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
- Xianyang Zhang
  - Department of Statistics, Texas A&M University, Blocker 449D, College Station, TX, 77843, USA
- Jun Chen
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
4. Xu J, Zhao L, Liu D, Hu S, Song X, Li J, Lv H, Duan L, Zhang M, Jiang Q, Liu G, Jin S, Liao M, Zhang M, Feng R, Kong F, Xu L, Jiang Y. EWAS: epigenome-wide association study software 2.0. Bioinformatics 2018; 34:2657-2658. [PMID: 29566144 PMCID: PMC6061808 DOI: 10.1093/bioinformatics/bty163] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/05/2017] [Accepted: 03/15/2018] [Indexed: 11/13/2022]
Abstract
Motivation With the development of biotechnology, DNA methylation data have grown exponentially. Epigenome-wide association studies (EWAS) provide a systematic approach to uncovering epigenetic variants underlying common diseases/phenotypes. However, EWAS software has lagged behind that for genome-wide association studies (GWAS). To meet the requirements of users, we developed a convenient and useful software package, EWAS2.0.
Results EWAS2.0 can analyze EWAS data and identify the association between epigenetic variations and disease/phenotype. On the basis of EWAS1.0, we have added more distinctive features. EWAS2.0 software was developed based on our ‘population epigenetic framework’ and can perform: (i) epigenome-wide single marker association study; (ii) epigenome-wide methylation haplotype (meplotype) association study and (iii) epigenome-wide association meta-analysis. Users can use EWAS2.0 to execute the chi-square test, t-test, linear regression analysis and logistic regression analysis; identify the association between epi-alleles; identify the methylation disequilibrium (MD) blocks; calculate the MD coefficient, the frequency of meplotypes and Pearson's correlation coefficients; and carry out meta-analysis. Finally, we expect EWAS2.0 to become a popular software package and be widely used in epigenome-wide association studies in the future.
Availability and implementation The EWAS software is freely available at http://www.ewas.org.cn or http://www.bioapp.org/ewas.
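As a rough illustration of what a single-marker association test involves, the Python sketch below computes a two-sample t statistic for the methylation levels of one CpG site in cases versus controls. It is not EWAS2.0's implementation: the abstract does not say which t-test variant the software uses, so the unequal-variance (Welch) form is assumed here, and the function name and data are hypothetical.

```python
import math
import statistics


def welch_t_statistic(cases, controls):
    """Welch's two-sample t statistic (assumed variant; unequal variances allowed)."""
    mean_diff = statistics.mean(cases) - statistics.mean(controls)
    # Standard error built from the per-group sample variances.
    standard_error = math.sqrt(
        statistics.variance(cases) / len(cases)
        + statistics.variance(controls) / len(controls)
    )
    return mean_diff / standard_error


# Hypothetical methylation levels (percent) at one CpG site.
cases = [62.0, 64.0, 66.0]
controls = [61.0, 62.0, 63.0]
print(welch_t_statistic(cases, controls))
```

An epigenome-wide scan would repeat such a test for every site and then correct the resulting p-values for multiple testing.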
Affiliation(s)
- Jing Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Linna Zhao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Di Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Simeng Hu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Xiuling Song
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Jin Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Lian Duan
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Qinghua Jiang
  - Center for Bioinformatics, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Guiyou Liu
  - Center for Bioinformatics, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
- Shuilin Jin
  - Department of Mathematics, Harbin Institute of Technology, Harbin, China
- Mingzhi Liao
  - College of Life Science, Northwest A&F University, Yangling, Shaanxi, China
- Meng Zhang
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China; Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin, China
- Rennan Feng
  - Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China; Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin, China
- Fanwu Kong
  - Department of Nephrology, The Second Affiliated Hospital, Harbin Medical University, Harbin, China
- Liangde Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
- Yongshuai Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China; Training Center for Students Innovation and Entrepreneurship Education, Harbin Medical University, Harbin, China
5. Hannon E, Gorrie-Stone TJ, Smart MC, Burrage J, Hughes A, Bao Y, Kumari M, Schalkwyk LC, Mill J. Leveraging DNA-Methylation Quantitative-Trait Loci to Characterize the Relationship between Methylomic Variation, Gene Expression, and Complex Traits. Am J Hum Genet 2018; 103:654-665. [PMID: 30401456 PMCID: PMC6217758 DOI: 10.1016/j.ajhg.2018.09.007] [Citation(s) in RCA: 105] [Impact Index Per Article: 15.0] [Received: 06/15/2018] [Accepted: 09/14/2018] [Indexed: 11/23/2022]
Abstract
Characterizing the complex relationship between genetic, epigenetic, and transcriptomic variation has the potential to increase understanding about the mechanisms underpinning health and disease phenotypes. We undertook a comprehensive analysis of the effects of common genetic variation on DNA methylation (DNAm) by using the Illumina EPIC array to profile samples from the UK Household Longitudinal Study. We identified 12,689,548 significant DNA methylation quantitative trait loci (mQTL) associations (p < 6.52 × 10⁻¹⁴) occurring between 2,907,234 genetic variants and 93,268 DNAm sites, including a large number not identified by previous DNAm-profiling methods. We demonstrate the utility of these data for interpreting the functional consequences of common genetic variation associated with > 60 human traits by using summary-data-based Mendelian randomization (SMR) to identify 1,662 pleiotropic associations between 36 complex traits and 1,246 DNAm sites. We also use SMR to characterize the relationship between DNAm and gene expression and thereby identify 6,798 pleiotropic associations between 5,420 DNAm sites and the transcription of 1,702 genes. Our mQTL database and SMR results are available via a searchable online database as a resource to the research community.
Affiliation(s)
- Eilis Hannon
  - University of Exeter Medical School, University of Exeter, Exeter EX2 5DW, United Kingdom
- Tyler J Gorrie-Stone
  - School of Biological Sciences, University of Essex, Colchester, CO4 3SQ, United Kingdom
- Melissa C Smart
  - Institute for Social and Economic Research, University of Essex, Colchester CO3 3LG, United Kingdom
- Joe Burrage
  - University of Exeter Medical School, University of Exeter, Exeter EX2 5DW, United Kingdom
- Amanda Hughes
  - Institute for Social and Economic Research, University of Essex, Colchester CO3 3LG, United Kingdom
- Yanchun Bao
  - Institute for Social and Economic Research, University of Essex, Colchester CO3 3LG, United Kingdom
- Meena Kumari
  - Institute for Social and Economic Research, University of Essex, Colchester CO3 3LG, United Kingdom
- Leonard C Schalkwyk
  - School of Biological Sciences, University of Essex, Colchester, CO4 3SQ, United Kingdom
- Jonathan Mill
  - University of Exeter Medical School, University of Exeter, Exeter EX2 5DW, United Kingdom