1
Huang J, Wang L. Cell-Free DNA Methylation Profiling Analysis-Technologies and Bioinformatics. Cancers (Basel) 2019; 11:cancers11111741. [PMID: 31698791 PMCID: PMC6896050 DOI: 10.3390/cancers11111741] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 10/11/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/24/2022]
Abstract
Analysis of circulating nucleic acids in bodily fluids, referred to as “liquid biopsies”, is rapidly gaining prominence. Studies have shown that cell-free DNA (cfDNA) has great potential in characterizing tumor status and heterogeneity, as well as the response to therapy and tumor recurrence. DNA methylation is an epigenetic modification that plays an important role in a broad range of biological processes and diseases. It is well known that aberrant DNA methylation is generalizable across various samples and occurs early during the pathogenesis of cancer. Methylation patterns of cfDNA are also consistent with those of their cells or tissues of origin. Systemic analysis of cfDNA methylation profiles has emerged as a promising approach for cancer detection and origin determination. In this review, we summarize the technologies for DNA methylation analysis and discuss their feasibility for liquid biopsy applications. We also provide a brief overview of the bioinformatic approaches for analysis of DNA methylation sequencing data. Overall, this review provides guidance for the selection of experimental and computational methods in cfDNA methylation-based studies.
2
Computational Methods for Detection of Differentially Methylated Regions Using Kernel Distance and Scan Statistics. Genes (Basel) 2019; 10:genes10040298. [PMID: 31013791 PMCID: PMC6523914 DOI: 10.3390/genes10040298] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/28/2019] [Revised: 03/29/2019] [Accepted: 04/08/2019] [Indexed: 01/14/2023]
Abstract
MOTIVATION: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases. RESULTS: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic and the second is a likelihood-based method using a binomial spatial scan statistic. The kernel distance method uses a kernel function, while the binomial scan statistic approach uses a mixed-effects model, to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.
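To make the idea of a kernel distance test for a candidate region more concrete, the following is a minimal sketch, not the authors' statistic. It assumes a Gaussian kernel, a biased squared maximum mean discrepancy (MMD) as the distance between two groups of per-sample methylation vectors, and a permutation test for the p-value. The function names, the bandwidth and the toy data are all hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=0.5):
    """Gaussian kernel between two methylation vectors (one value per CpG)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_distance(group_a, group_b, bandwidth=0.5):
    """Biased squared-MMD estimate between two groups of samples."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(group_a, group_a)
            + mean_kernel(group_b, group_b)
            - 2.0 * mean_kernel(group_a, group_b))


def permutation_p_value(group_a, group_b, n_perm=500, seed=0, bandwidth=0.5):
    """Permutation p-value: shuffle group labels and recompute the distance."""
    rng = random.Random(seed)
    observed = kernel_distance(group_a, group_b, bandwidth)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if kernel_distance(pooled[:n_a], pooled[n_a:], bandwidth) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical methylation ratios at three neighbouring CpGs, four samples per group.
cases = [[0.82, 0.85, 0.90], [0.78, 0.88, 0.86], [0.80, 0.81, 0.92], [0.85, 0.79, 0.88]]
controls = [[0.21, 0.25, 0.30], [0.18, 0.28, 0.26], [0.25, 0.22, 0.33], [0.20, 0.19, 0.27]]
p_value = permutation_p_value(cases, controls)
```

Because each sample enters the kernel as a whole vector, correlation among neighbouring CpGs is used implicitly rather than modelled site by site, which is one reason a kernel approach can be computationally light.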
3
Zhang Y, Baheti S, Sun Z. Statistical method evaluation for differentially methylated CpGs in base resolution next-generation DNA sequencing data. Brief Bioinform 2019; 19:374-386. [PMID: 28040747 DOI: 10.1093/bib/bbw133] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/31/2016] [Indexed: 01/05/2023]
Abstract
High-throughput bisulfite methylation sequencing such as reduced representation bisulfite sequencing (RRBS), Agilent SureSelect Human Methyl-Seq (Methyl-seq) or whole-genome bisulfite sequencing is commonly used for base resolution methylome research. These data are represented either by the ratio of methylated cytosines to total coverage at a CpG site or by the numbers of methylated and unmethylated cytosines. Multiple statistical methods can be used to detect differentially methylated CpGs (DMCs) between conditions, and these methods are often the basis for the next step of differentially methylated region identification. The ratio data have the flexibility of fitting many linear models, whereas the raw count data take coverage information into account. There is an array of options for DMC detection with each data type; however, it is not clear which statistical method is optimal. In this study, we systematically evaluated four statistical methods on methylation ratio data and four methods on count-based data and compared their performance with regard to type I error control, sensitivity and specificity of DMC detection and computational resource demands using real RRBS data along with simulation. Our results show that the ratio-based tests are generally more conservative (less sensitive) than the count-based tests. However, some count-based methods have high false-positive rates and should be avoided. The beta-binomial model gives a good balance between sensitivity and specificity and is the preferred method. Selection of methods in different settings, signal versus noise and sample size estimation are also discussed.
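As an illustration of what a count-based beta-binomial test at one CpG can look like, here is a minimal sketch under simplifying assumptions; it is not one of the specific methods evaluated in the study. It assumes a fixed, known dispersion `phi` (real tools estimate it), maximizes the likelihood over a grid of mean methylation levels, and uses a one-degree-of-freedom likelihood ratio test. All names and counts are hypothetical.

```python
import math


def betabinom_logpmf(k, n, mu, phi):
    """Log-probability of k methylated reads out of n under a beta-binomial
    with mean mu and dispersion phi (0 < phi < 1)."""
    alpha = mu * (1.0 - phi) / phi
    beta = (1.0 - mu) * (1.0 - phi) / phi
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_num = (math.lgamma(k + alpha) + math.lgamma(n - k + beta)
               - math.lgamma(n + alpha + beta))
    log_den = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return log_choose + log_num - log_den


def max_loglik(samples, phi, grid_size=999):
    """Maximum log-likelihood over a grid of mean methylation levels."""
    best = -math.inf
    for i in range(1, grid_size + 1):
        mu = i / (grid_size + 1)
        loglik = sum(betabinom_logpmf(k, n, mu, phi) for k, n in samples)
        best = max(best, loglik)
    return best


def lrt_p_value(group_a, group_b, phi=0.05):
    """Likelihood ratio test: one shared mean versus one mean per group."""
    ll_null = max_loglik(group_a + group_b, phi)
    ll_alt = max_loglik(group_a, phi) + max_loglik(group_b, phi)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical (methylated reads, total coverage) per sample at a single CpG.
condition_a = [(18, 20), (27, 30), (22, 25)]
condition_b = [(4, 20), (6, 30), (5, 25)]
stat, p_value = lrt_p_value(condition_a, condition_b)
```

Unlike a test on ratios alone, the likelihood here depends on the coverage `n` of every sample, so a site covered by 5 reads contributes less evidence than one covered by 50, while `phi` absorbs biological variability between replicates.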
Affiliation(s)
- Yun Zhang
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN; Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY
- Saurabh Baheti
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN
- Zhifu Sun
- Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN
4
El Sherbini MA, Mansour AA, Sallam MM, Shaban EA, Shehab ElDin ZA, El-Shalakany AH. KLK10 exon 3 unmethylated PCR product concentration: a new potential early diagnostic marker in ovarian cancer? - A pilot study. J Ovarian Res 2018; 11:32. [PMID: 29690914 PMCID: PMC5913797 DOI: 10.1186/s13048-018-0407-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/18/2018] [Accepted: 04/17/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND: KLK10 exon 3 hypermethylation has been correlated with tumor-specific lack of KLK10 expression in cancer cell lines and primary tumors. In the present study we investigate the possible role of KLK10 exon 3 methylation in ovarian tumor diagnosis and prognosis. RESULTS: Qualitative methylation-specific PCR (MSP) results did not show statistically significant differences in patient group samples (normal and tumor): all samples were positive only for the unmethylated-specific PCR, except for two malignant samples that were either doubly positive (serous carcinoma) or doubly negative (Sertoli-Leydig cell tumor) for the two MSP tests. However, KLK10 exon 3 unmethylated PCR product concentration (ng/μl) showed statistically significant differences between benign and malignant patient group samples; mean ± SD (n): tumor: 0.077 ± 0.035 (14) and 0.047 ± 0.021 (15), respectively, p-value = 0.011; and normal: 0.094 ± 0.039 (7) and 0.046 ± 0.027 (6), respectively, p-value = 0.031. Moreover, ROC curve analysis of KLK10 exon 3 unmethylated PCR product concentration in overall patient group samples showed good diagnostic ability (AUC = 0.778; p-value = 0.002). Patient survival (living and died) showed a statistically significant difference according to preoperative serum CA125 concentration (U/ml); median (n): 101.25 (10) and 1252 (5), respectively, p-value = 0.037, but not according to KLK10 exon 3 unmethylated PCR product concentration (ng/μl) in overall malignant patient samples; mean ± SD (n): 0.042 ± 0.015 (14) and 0.055 ± 0.032 (7), p-value = 0.228. CONCLUSION: To the best of our knowledge, this is the first report on KLK10 exon 3 unmethylated PCR product concentration as a potential early epigenetic diagnostic marker in primary ovarian tumors. Taking into account the limitations of our study (small sample size and semi-quantitative PCR product analysis), further studies are strongly recommended.
Affiliation(s)
- Mustafa A El Sherbini
- Medical Biochemistry Department, Faculty of Medicine, Ain Shams University, Cairo, Egypt
- Amal A Mansour
- Medical Biochemistry Department, Faculty of Medicine, Ain Shams University, Cairo, Egypt
- Maha M Sallam
- Medical Biochemistry Department, Faculty of Medicine, Ain Shams University, Cairo, Egypt
- Emtiaz A Shaban
- Medical Biochemistry Department, Faculty of Medicine, Ain Shams University, Cairo, Egypt
- Amr H El-Shalakany
- Gynecologic Oncology Unit, Ain Shams University Maternity Hospital, Cairo, Egypt
5
Abstract
The number of epigenetic studies is exponentially increasing. There is anticipation that DNA methylation may close gaps in our understanding of disease etiology, and how certain risk factors affect health and disease, but also that it has potential as a biomarker for disease. Human DNA methylation studies require careful considerations for design and analysis including population and tissue selection, population stratification, cell heterogeneity, confounding, temporality, sample size, appropriate statistical analysis, and validation of results. In this chapter, we discuss relevant aspects for the design of DNA methylation studies and delineate essential steps for their analysis. Specifically, we summarize methods used to extricate biologic signals from technical noise, and statistical approaches to capture meaningful variability based on the research hypothesis.
Affiliation(s)
- Karin B Michels
- Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, CA, 90095, USA
- Alexandra M Binder
- Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles, CA, 90095, USA
6
Zhang Q, Zhao Y, Zhang R, Wei Y, Yi H, Shao F, Chen F. A Comparative Study of Five Association Tests Based on CpG Set for Epigenome-Wide Association Studies. PLoS One 2016; 11:e0156895. [PMID: 27258058 PMCID: PMC4892473 DOI: 10.1371/journal.pone.0156895] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/13/2016] [Accepted: 05/20/2016] [Indexed: 11/19/2022]
Abstract
An epigenome-wide association study (EWAS) is a large-scale study of human disease-associated epigenetic variation, specifically variation in DNA methylation. High-throughput technologies enable simultaneous epigenetic profiling of DNA methylation at hundreds of thousands of CpGs across the genome. The clustering of correlated DNA methylation at CpGs is reportedly similar to that of linkage-disequilibrium (LD) correlation in genetic single nucleotide polymorphism (SNP) variation. However, current analysis methods, such as the t-test and rank-sum test, may be underpowered to detect differentially methylated markers. We propose to test the association between the outcome (e.g., case or control) and a set of CpG sites jointly. Here, we compared the performance of five CpG set analysis approaches: principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), sequence kernel association test (SKAT), and sliced inverse regression (SIR) with Hotelling's T² test and the t-test using Bonferroni correction. The simulation results revealed that the first six methods can control the type I error at the significance level, while the t-test is conservative. SPCA and SKAT performed better than other approaches when the correlation among CpG sites was strong. For illustration, these methods were also applied to a real methylation dataset.
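The conservativeness of per-CpG testing with Bonferroni correction can be illustrated with a small null simulation. This is a hedged sketch rather than the study's simulation design: it assumes equicorrelated standard normal test statistics (z-scores standing in for t-statistics), a set-level p-value defined as the Bonferroni-adjusted minimum p-value, and hypothetical function names and parameter values.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_set_p(p_values):
    """Set-level p-value: smallest per-CpG p-value times the number of CpGs."""
    return min(1.0, len(p_values) * min(p_values))


def null_rejection_rate(n_cpgs, rho, alpha=0.05, n_sim=4000, seed=1):
    """Fraction of null CpG sets rejected when the per-CpG statistics share a
    pairwise correlation rho (no true association is simulated)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        shared = rng.gauss(0.0, 1.0)
        z_scores = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_cpgs)
        ]
        if bonferroni_set_p([two_sided_p(z) for z in z_scores]) < alpha:
            rejections += 1
    return rejections / n_sim


rate_independent = null_rejection_rate(n_cpgs=20, rho=0.0)
rate_correlated = null_rejection_rate(n_cpgs=20, rho=0.9)
```

With independent CpGs the rejection rate stays close to the nominal level, whereas with strongly correlated CpGs it falls well below it, because the correction pays for 20 tests while the data carry far fewer independent signals. Set-based tests aim to recover the power lost this way.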
Affiliation(s)
- Qiuyi Zhang
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Yang Zhao
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Ruyang Zhang
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Yongyue Wei
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Honggang Yi
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Fang Shao
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
- Feng Chen
- Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China, 211166
7
Sun Z, Cunningham J, Slager S, Kocher JP. Base resolution methylome profiling: considerations in platform selection, data preprocessing and analysis. Epigenomics 2015; 7:813-28. [PMID: 26366945 PMCID: PMC4790440 DOI: 10.2217/epi.15.21] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Indexed: 02/06/2023]
Abstract
Bisulfite treatment-based methylation microarray (mainly Illumina 450K Infinium array) and next-generation sequencing (reduced representation bisulfite sequencing, Agilent SureSelect Human Methyl-Seq, NimbleGen SeqCap Epi CpGiant or whole-genome bisulfite sequencing) are commonly used for base resolution DNA methylome research. Although multiple tools and methods have been developed and used for data preprocessing and analysis, confusion remains for these platforms, including how and whether the 450K array should be normalized; which platform best fits researchers' needs; and which statistical models are more appropriate for differential methylation analysis. This review presents the commonly used platforms and compares the pros and cons of each in methylome profiling. We then discuss approaches to study design, data normalization, bias correction and model selection for differentially methylated individual CpGs and regions.
Affiliation(s)
- Zhifu Sun
- Division of Biomedical Statistics & Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Susan Slager
- Division of Biomedical Statistics & Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Jean-Pierre Kocher
- Division of Biomedical Statistics & Informatics, Mayo Clinic, Rochester, MN 55905, USA
8
Zhang Y, Zhang J. Identification of functionally methylated regions based on discriminant analysis through integrating methylation and gene expression data. Molecular BioSystems 2015; 11:1786-93. [DOI: 10.1039/c5mb00141b] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023]
Abstract
DNA methylation is essential not only in cellular differentiation but also in diseases.
Affiliation(s)
- Yuanyuan Zhang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Junying Zhang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
9
Feng H, Conneely KN, Wu H. A Bayesian hierarchical model to detect differentially methylated loci from single nucleotide resolution sequencing data. Nucleic Acids Res 2014; 42:e69. [PMID: 24561809 PMCID: PMC4005660 DOI: 10.1093/nar/gku154] [Citation(s) in RCA: 314] [Impact Index Per Article: 31.4] [Indexed: 12/19/2022]
Abstract
DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce accuracy to detect differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS.
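The per-CpG Wald test mentioned above can be sketched in a few lines. This is a simplified illustration and not the DSS implementation: it assumes the dispersion of each group is already known (the paper's contribution is precisely to stabilize that estimate by sharing information across CpG sites through the hierarchical model), uses a pooled proportion as the mean estimate, and refers the statistic to a standard normal distribution. Function names, dispersion values and counts are hypothetical.

```python
import math


def group_estimate(samples, phi):
    """Pooled methylation level and its variance under a beta-binomial model
    with dispersion phi; samples are (methylated reads, total coverage)."""
    total_n = sum(n for _, n in samples)
    mu = sum(k for k, _ in samples) / total_n
    var = sum(n * mu * (1.0 - mu) * (1.0 + (n - 1) * phi)
              for _, n in samples) / total_n ** 2
    return mu, var


def wald_test(group_a, group_b, phi_a, phi_b):
    """Wald statistic and two-sided normal p-value for a difference in mean
    methylation between two groups at one CpG site."""
    mu_a, var_a = group_estimate(group_a, phi_a)
    mu_b, var_b = group_estimate(group_b, phi_b)
    std_err = math.sqrt(var_a + var_b)
    if std_err == 0.0:
        raise ValueError("zero variance: methylation is 0 or 1 in both groups")
    z = (mu_a - mu_b) / std_err
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts with two replicates per group.
tumor = [(45, 50), (38, 40)]
normal = [(10, 50), (9, 40)]
z_low, p_low = wald_test(tumor, normal, phi_a=0.02, phi_b=0.02)
z_high, p_high = wald_test(tumor, normal, phi_a=0.30, phi_b=0.30)
```

Comparing the two calls shows how strongly the result depends on the dispersion value: the same counts give a much smaller statistic when `phi` is large, which is why unstable dispersion estimates from few replicates hurt detection accuracy.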
Affiliation(s)
- Hao Feng
- Department of Biostatistics and Bioinformatics, Emory University Rollins School of Public Health and Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA