1
Cao R, Guan W. Evaluating Reliability of DNA Methylation Measurement. Methods Mol Biol 2022; 2432:15-24. [PMID: 35505204 DOI: 10.1007/978-1-0716-1994-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
DNA methylation is a widely studied epigenetic phenomenon. Alterations in methylation patterns influence human phenotypes and risk of disease. The Illumina Infinium HumanMethylation450 (HM450) and MethylationEPIC (EPIC) BeadChips are widely used microarray-based platforms for epigenome-wide association studies (EWASs). In this chapter, we discuss the use of the intraclass correlation coefficient (ICC) for assessing technical variation induced by methylation arrays at the single-CpG level. The ICC compares the variation of methylation levels within and between replicate measurements and ranges between 0 and 1. We further characterize the distribution of ICCs using a mixture of truncated normal and normal distributions, and cluster CpG sites on the arrays into low- and high-reliability groups. In practice, we recommend extra caution for associations at CpG sites with low ICC values.
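As a rough illustration of the ICC described in this abstract, the sketch below computes a one-way random-effects ICC for a single CpG site from balanced technical replicates. The function name `icc_oneway`, the input layout, the example beta values, and the clipping of negative estimates to 0 are assumptions made for illustration; the chapter itself may use a different estimator.

```python
from statistics import mean


def icc_oneway(replicates):
    """One-way random-effects ICC for one CpG site.

    `replicates` holds one inner list per sample; every inner list must
    contain the same number k >= 2 of replicate methylation values.
    """
    n = len(replicates)
    k = len(replicates[0]) if n else 0
    if n < 2 or k < 2 or any(len(r) != k for r in replicates):
        raise ValueError("need >= 2 samples with the same number (>= 2) of replicates")

    grand_mean = mean([v for r in replicates for v in r])
    sample_means = [mean(r) for r in replicates]

    # Between-sample and within-sample mean squares from a one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in sample_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for r, m in zip(replicates, sample_means) for v in r
    ) / (n * (k - 1))

    denom = msb + (k - 1) * msw
    if denom == 0:
        return 0.0  # no variation at all: treated as unreliable (assumption)
    # Assumption: clip to [0, 1], since the raw estimator can be negative.
    return min(1.0, max(0.0, (msb - msw) / denom))


# Hypothetical beta values: three samples, two technical replicates each.
consistent = [[0.10, 0.12], [0.50, 0.48], [0.90, 0.91]]
noisy = [[0.10, 0.90], [0.90, 0.10], [0.50, 0.50]]
print(icc_oneway(consistent), icc_oneway(noisy))
```

Replicates that agree closely within each sample give a value near 1, while replicates that disagree as much as different samples do give a value near 0, which is the situation the abstract flags for extra caution.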
Affiliation(s)
- Rui Cao
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Weihua Guan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA.
2
Hui Y, Wei PJ, Xia J, Wang YT, Zheng CH. MECoRank: cancer driver genes discovery simultaneously evaluating the impact of SNVs and differential expression on transcriptional networks. BMC Med Genomics 2019; 12:140. [PMID: 31888623 PMCID: PMC6936061 DOI: 10.1186/s12920-019-0582-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/22/2019] [Accepted: 09/10/2019] [Indexed: 01/09/2023] [Open access]
Abstract
Background: Although there are huge volumes of genomic data, deciphering them and identifying driver events is still a challenge. Current network-based methods typically use the relationship between genomic events and consequent changes in gene expression to nominate putative driver genes. However, relationships may also exist within the transcriptional network itself. Methods: We developed MECoRank, a novel method that improves the recognition accuracy of driver genes. MECoRank uses a bipartite graph to propagate scores via an iterative process. After iteration, we obtain a ranked gene list for each patient sample. We then apply the Condorcet voting method to determine the most impactful drivers in a population. Results: We applied MECoRank to three cancer datasets to reveal candidate driver genes that have a greater impact on gene expression. Experimental results show that our method not only identifies more validated driver genes than other methods, but also recognizes some impactful novel genes whose importance is supported by the literature. Conclusions: We propose a novel approach named MECoRank to prioritize driver genes based on their impact on expression in the molecular interaction network. This method assesses not only the effect of mutations on the transcriptional network, but also the effect of differential expression within the transcriptional network. The results demonstrate that MECoRank performs better than competing approaches in identifying driver genes.
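The population-level step in this abstract, combining per-patient ranked gene lists by Condorcet voting, can be sketched as follows. This is only an illustration under stated assumptions: each pair of genes is compared by majority vote across patients and genes are ordered by their number of pairwise wins (a Copeland-style tally), which may differ from the exact procedure in MECoRank. The function name `condorcet_order`, the gene labels, and the rule that a gene missing from a list ranks last are all hypothetical.

```python
from itertools import combinations


def condorcet_order(rankings):
    """Aggregate per-patient rankings (best gene first) by pairwise majority wins."""
    genes = sorted({g for ranking in rankings for g in ranking})
    # Position of each gene per patient; genes absent from a list rank last (assumption).
    positions = [{g: i for i, g in enumerate(ranking)} for ranking in rankings]

    wins = {g: 0 for g in genes}
    for a, b in combinations(genes, 2):
        a_votes = sum(1 for p in positions if p.get(a, len(p)) < p.get(b, len(p)))
        b_votes = sum(1 for p in positions if p.get(b, len(p)) < p.get(a, len(p)))
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1

    # More pairwise wins first; ties are broken alphabetically for determinism.
    return sorted(genes, key=lambda g: (-wins[g], g))


# Hypothetical ranked lists for three patients.
patient_rankings = [
    ["G1", "G2", "G3"],
    ["G1", "G3", "G2"],
    ["G2", "G1", "G3"],
]
print(condorcet_order(patient_rankings))
```

In this toy input, G1 beats both other genes in head-to-head comparisons, so it is placed first regardless of how the remaining genes are ordered.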
Affiliation(s)
- Ying Hui
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
- Pi-Jing Wei
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
- Junfeng Xia
- Institute of Physical Science and Information Technology, Anhui University, Hefei, China
- Yu-Tian Wang
- School of Software Engineering, Qufu Normal University, Qufu, China
- Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China.
3
Kinzy TG, Starr TK, Tseng GC, Ho YY. Meta-analytic framework for modeling genetic coexpression dynamics. Stat Appl Genet Mol Biol 2019; 18. [DOI: 10.1515/sagmb-2017-0052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Methods for exploring genetic interactions have been developed in an attempt to move beyond single gene analyses. Because biological molecules frequently participate in different processes under various cellular conditions, investigating the changes in gene coexpression patterns under various biological conditions could reveal important regulatory mechanisms. One of the methods for capturing gene coexpression dynamics, named liquid association (LA), quantifies the relationship where the coexpression between two genes is modulated by a third “coordinator” gene. This LA measure offers a natural framework for studying gene coexpression changes and has been applied increasingly to study regulatory networks among genes. With a wealth of publicly available gene expression data, there is a need to develop a meta-analytic framework for LA analysis. In this paper, we incorporated mixed effects when modeling correlation to account for between-studies heterogeneity. For statistical inference about LA, we developed a Markov chain Monte Carlo (MCMC) estimation procedure through a Bayesian hierarchical framework. We evaluated the proposed methods in a set of simulations and illustrated their use in two collections of experimental data sets. The first data set combined 10 pancreatic ductal adenocarcinoma gene expression studies to determine the role of possible coordinator gene USP9X in the Hippo pathway. The second experimental data set consisted of 907 gene expression microarray Escherichia coli experiments from multiple studies publicly available through the Many Microbe Microarray Database website (http://m3d.bu.edu/) and examined genes that coexpress with serA in the presence of coordinator gene Lrp.
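To make the idea of liquid association concrete, the sketch below estimates it for a single data set as the mean of the three-way product of standardized expression values, so that a positive value indicates the X-Y coexpression strengthens as the coordinator Z increases. This is a simplification under stated assumptions: plain z-scoring stands in for a normal-score transform, the data are simulated, and the names `standardize` and `liquid_association` are hypothetical. It is not the Bayesian meta-analytic MCMC procedure the paper develops.

```python
import random
from statistics import mean, pstdev


def standardize(values):
    """Center and scale to unit variance (assumes the values are not constant)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def liquid_association(x, y, z):
    """Three-product-moment estimate: mean of x*y*z after standardizing each gene."""
    xs, ys, zs = standardize(x), standardize(y), standardize(z)
    return mean([a * b * c for a, b, c in zip(xs, ys, zs)])


# Simulated expression values (hypothetical): the X-Y relationship flips sign with Z.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [rng.gauss(0, 1) for _ in range(n)]
y_linked = [zi * xi + rng.gauss(0, 0.5) for zi, xi in zip(z, x)]
y_free = [rng.gauss(0, 1) for _ in range(n)]

print(liquid_association(x, y_linked, z))  # clearly positive
print(liquid_association(x, y_free, z))    # close to zero
```

Note that X and `y_linked` are nearly uncorrelated overall in this simulation, which is the kind of condition-dependent coexpression an ordinary correlation would miss.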
4
Meyer KN, Lacey MR. Modeling Methylation Patterns with Long Read Sequencing Data. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:1379-1389. [PMID: 28682263 DOI: 10.1109/tcbb.2017.2721943] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/07/2023]
Abstract
Variation in cytosine methylation at CpG dinucleotides is often observed in genomic regions, and analysis typically focuses on estimating the proportion of methylated sites observed in a given region and comparing these levels across samples to determine association with conditions of interest. While sites are tacitly treated as independent, when observed at the level of individual molecules methylation patterns exhibit strong evidence of local spatial dependence. We previously developed a neighboring sites model to account for correlation and clustering behavior observed in two tandem repeat regions in a collection of ovarian carcinomas. We now introduce extensions of the model that account for the effect of distance between sites as well as asymmetric correlation in de novo methylation and demethylation rates. We apply our models to published data from a whole genome bisulfite sequencing experiment using long reads, estimating model parameters for a selection of CpG-dense regions spanning between 21 and 67 sites. Our methods detect evidence of local spatial correlation as a function of site-to-site distance and demonstrate the added value of employing long read sequencing data in epigenetic research.
5
Topological Characterization of Human and Mouse m5C Epitranscriptome Revealed by Bisulfite Sequencing. Int J Genomics 2018; 2018:1351964. [PMID: 30009162 PMCID: PMC6020461 DOI: 10.1155/2018/1351964] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 01/17/2018] [Revised: 04/14/2018] [Accepted: 04/17/2018] [Indexed: 11/17/2022] [Open access]
Abstract
Background: Compared with the well-studied 5-methylcytosine (m5C) in DNA, the role and topology of epitranscriptome m5C remain insufficiently characterized. Results: Through analyzing transcriptome-wide m5C distribution in human and mouse, we show that the m5C modification is significantly enriched at 5′ untranslated regions (5′UTRs) of mRNA in human and mouse. With a comparative analysis of the mRNA and DNA methylome, we demonstrate that, like DNA methylation, transcriptome m5C methylation exhibits a strong clustering effect. Surprisingly, an inverse correlation between mRNA and DNA m5C methylation is observed at CpG sites. Further analysis reveals that RNA m5C methylation level is positively correlated with both RNA expression and RNA half-life. We also observed that the methylation level of mitochondrial RNAs is significantly higher than RNAs transcribed from the nuclear genome. Conclusions: This study provides an in-depth topological characterization of transcriptome-wide m5C modification by associating RNA m5C methylation patterns with transcriptional expression, DNA methylations, RNA stabilities, and mitochondrial genome.
6
Abstract
This article concerns testing for equality of distribution between groups. We focus on screening variables with shared distributional features such as common support, modes and patterns of skewness. We propose a Bayesian testing method using kernel mixtures, which improves performance by borrowing information across the different variables and groups through shared kernels and a common probability of group differences. The inclusion of shared kernels in a finite mixture, with Dirichlet priors on the weights, leads to a simple framework for testing that scales well for high-dimensional data. We provide closed asymptotic forms for the posterior probability of equivalence in two groups and prove consistency under model misspecification. The method is applied to DNA methylation array data from a breast cancer study, and compares favourably to competitors when Type I error is estimated via permutation.
Affiliation(s)
- Eric F Lock
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota 55455, U.S.A.
- David B Dunson
- Department of Statistical Science, Duke University, Durham, North Carolina 27708, U.S.A.
7
Bose M, Wu C, Pankow JS, Demerath EW, Bressler J, Fornage M, Grove ML, Mosley TH, Hicks C, North K, Kao WH, Zhang Y, Boerwinkle E, Guan W. Evaluation of microarray-based DNA methylation measurement using technical replicates: the Atherosclerosis Risk In Communities (ARIC) Study. BMC Bioinformatics 2014; 15:312. [PMID: 25239148 PMCID: PMC4180315 DOI: 10.1186/1471-2105-15-312] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 01/27/2014] [Accepted: 09/08/2014] [Indexed: 11/15/2022] [Open access]
Abstract
Background: DNA methylation is a widely studied epigenetic phenomenon; alterations in methylation patterns influence human phenotypes and risk of disease. As part of the Atherosclerosis Risk in Communities (ARIC) study, the Illumina Infinium HumanMethylation450 (HM450) BeadChip was used to measure DNA methylation in peripheral blood obtained from ~3000 African American study participants. Over 480,000 cytosine-guanine (CpG) dinucleotide sites were surveyed on the HM450 BeadChip. To evaluate the impact of technical variation, 265 technical replicates from 130 participants were included in the study. Results: For each CpG site, we calculated the intraclass correlation coefficient (ICC), which ranges between 0 and 1, to compare the variation of methylation levels within and between replicate pairs. We modeled the distribution of ICC as a mixture of censored or truncated normal and normal distributions using an EM algorithm. The CpG sites were clustered into low- and high-reliability groups according to the calculated posterior probabilities. We also demonstrated the performance of this clustering when applied to a study of association between methylation levels and smoking status of individuals. Of the CpG sites showing genome-wide significant association with smoking status, most (~96%) were in the high-reliability cluster. Conclusions: We suggest that CpG sites with low ICC may be excluded from subsequent association analyses, or that extra caution be taken for associations at such sites.
Affiliation(s)
- Weihua Guan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA.
8
Green T, Chen X, Ryan S, Asch AS, Ruiz-Echevarría MJ. TMEFF2 and SARDH cooperate to modulate one-carbon metabolism and invasion of prostate cancer cells. Prostate 2013; 73:1561-75. [PMID: 23824605 PMCID: PMC3878307 DOI: 10.1002/pros.22706] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 01/25/2013] [Accepted: 06/11/2013] [Indexed: 12/16/2022]
Abstract
Background: The transmembrane protein with epidermal growth factor and two follistatin motifs, TMEFF2, has been implicated in prostate cancer but its role in this disease is unclear. We recently demonstrated that the tumor suppressor role of TMEFF2 correlates, in part, with its ability to interact with sarcosine dehydrogenase (SARDH) and modulate sarcosine level. TMEFF2 overexpression inhibits sarcosine-induced invasion. Here, we further characterize the functional interaction between TMEFF2 and SARDH and their link with one-carbon (1-C) metabolism and invasion. Methods: RNA interference was used to study the effect of SARDH and/or TMEFF2 knockdown (KD) in invasion, evaluated using Boyden chambers. The dependence of invasion on 1-C metabolism was determined by examining sensitivity to methotrexate. Real-time PCR and Western blot of subcellular fractions were used to study the effect of SARDH KD or TMEFF2 KD on expression of enzymes involved in one-carbon (1-C) metabolism and on TMEFF2 expression and localization. Protein interactions were analyzed by mass spectrometry. Cell viability and proliferation were measured by cell counting and MTT analysis. Results: While knocking down SARDH affects TMEFF2 subcellular localization, this effect is not responsible for the increased invasion observed in SARDH KD cells. Importantly, SARDH and/or TMEFF2 KD promote increased cellular invasion, sensitize the cell to methotrexate, render the cell resistant to invasion induced by sarcosine, a metabolite from the folate-mediated 1-C metabolism pathway, and affect the expression level of enzymes involved in that pathway. Conclusions: Our findings define a role for TMEFF2 and the folate-mediated 1-C metabolism pathway in modulating cellular invasion.
Affiliation(s)
- Thomas Green
- Department of Oncology, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Xiaofei Chen
- Department of Biochemistry and Molecular Biology, Brody School of Medicine at East Carolina University, Greenville, USA
- Stephen Ryan
- Department of Oncology, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Adam S. Asch
- Department of Oncology, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Maria J. Ruiz-Echevarría
- Department of Oncology, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Anatomy and Cell Biology, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Correspondence: Phone: 252-744.2856, Fax: 252-744.3418
9
Liu Y, Ji Y, Qiu P. Identification of thresholds for dichotomizing DNA methylation data. EURASIP J Bioinform Syst Biol 2013; 2013:8. [PMID: 23742247 PMCID: PMC3680080 DOI: 10.1186/1687-4153-2013-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/01/2013] [Accepted: 05/23/2013] [Indexed: 12/31/2022]
Abstract
DNA methylation plays an important role in many biological processes by regulating gene expression. It is commonly accepted that turning on DNA methylation leads to silencing of the expression of the corresponding genes. While methylation is often described as a binary on-off signal, it is typically measured using beta values derived from either microarray or sequencing technologies, which take continuous values between 0 and 1. To interpret methylation in a binary fashion, appropriate thresholds are needed to dichotomize the continuous measurements. In this paper, we use data from The Cancer Genome Atlas project. For a total of 992 samples across five cancer types, both methylation and gene expression data are available. A bivariate extension of the StepMiner algorithm is used to identify thresholds for dichotomizing both methylation and expression data. A hypergeometric test is applied to identify CpG sites whose methylation status is significantly associated with silencing of the expression of their corresponding genes. The test is performed either on all five cancer types together or on individual cancer types separately. We find that the appropriate thresholds vary across CpG sites. In addition, the negative association between methylation and expression is highly tissue specific.
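The dichotomize-then-test step described in this abstract can be sketched as below: beta values and expression values are each cut at a threshold, and a one-sided hypergeometric test asks whether methylated samples overlap with silenced samples more than chance would predict. The thresholds are simply passed in here, whereas the paper derives them with a bivariate StepMiner extension that is not reproduced. The function names, the cut-offs, and the sample values are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n of N samples are drawn and K of the N are 'silenced'."""
    top = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), top + 1))
    return favorable / comb(N, n)


def methylation_silencing_pvalue(beta, expr, beta_cut, expr_cut):
    """Test whether methylated samples (beta > cut) are enriched for silencing (expr < cut)."""
    methylated = [b > beta_cut for b in beta]
    silenced = [e < expr_cut for e in expr]
    both = sum(m and s for m, s in zip(methylated, silenced))
    return hypergeom_upper_tail(both, len(beta), sum(silenced), sum(methylated))


# Hypothetical values for one CpG site and its gene across eight samples.
beta = [0.90, 0.80, 0.85, 0.10, 0.20, 0.15, 0.10, 0.30]
expr = [1.0, 2.0, 1.5, 9.0, 8.0, 10.0, 9.0, 7.0]
print(methylation_silencing_pvalue(beta, expr, beta_cut=0.5, expr_cut=5.0))
```

Because the abstract reports that suitable thresholds differ between CpG sites, a per-site pair of cut-offs would be supplied in practice rather than one global value.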
Affiliation(s)
- Yihua Liu
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
10
Mulware SJ. The mammary gland carcinogens: the role of metal compounds and organic solvents. Int J Breast Cancer 2013; 2013:640851. [PMID: 23762568 PMCID: PMC3671233 DOI: 10.1155/2013/640851] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/09/2013] [Accepted: 04/24/2013] [Indexed: 11/18/2022] [Open access]
Abstract
An increased incidence of breast cancer, especially among postmenopausal women, has been reported in recent decades. Although women who have inherited mutations in the BRCA1 and BRCA2 genes have a high risk of developing breast cancer, studies have also shown that significant exposure to certain metal compounds and organic solvents increases the risk of mammary gland carcinogenesis. While physiological properties govern the uptake, intracellular distribution, and binding of metal compounds, their interaction with proteins seems to be a more relevant process for metal carcinogenicity than binding to DNA. The four most predominant mechanisms for metal carcinogenicity include (1) interference with cellular redox regulation and induction of oxidative stress, (2) inhibition of major DNA repair, (3) deregulation of cell proliferation, and (4) epigenetic inactivation of genes by DNA hypermethylation. On the other hand, most organic solvents are highly lipophilic and are biotransformed mainly in the liver and the kidney through a series of oxidative and reductive reactions, some of which result in bioactivation. The breast physiology, notably the parenchyma, is embedded in a fat depot capable of storing lipophilic xenobiotics. This paper reviews the role of metal compounds and organic solvents in breast cancer development.
Affiliation(s)
- Stephen Juma Mulware
- Ion Beam Modification and Analysis Laboratory, Physics Department, University of North Texas, 1155 Union Circle, #311427, Denton, TX 76203, USA
11
Qiu P, Plevritis SK. TreeVis: a MATLAB-based tool for tree visualization. Comput Methods Programs Biomed 2013; 109:74-6. [PMID: 23036855 PMCID: PMC3508366 DOI: 10.1016/j.cmpb.2012.08.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/17/2011] [Revised: 06/02/2012] [Accepted: 08/15/2012] [Indexed: 05/25/2023]
Abstract
Network-based analyses of high-dimensional biological data often produce results in the form of tree structures. Generating easily interpretable layouts to visualize these tree structures is a non-trivial task. We present a new visualization algorithm to generate two-dimensional layouts for complex tree structures. Implementations in both MATLAB and R are provided.
Affiliation(s)
- Peng Qiu
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, USA