1
Lund JB, Li W, Mohammadnejad A, Li S, Baumbach J, Tan Q. EWASex: an efficient R-package to predict sex in epigenome-wide association studies. Bioinformatics 2020; 37:btaa949. [PMID: 33313760 DOI: 10.1093/bioinformatics/btaa949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2020] [Revised: 10/14/2020] [Accepted: 10/29/2020] [Indexed: 11/14/2022]
Abstract
SUMMARY: The epigenome-wide association study (EWAS) has become a powerful approach for identifying epigenetic variations associated with diseases or health traits. Sex is an important variable to include in EWAS to ensure unbiased data processing and statistical analysis. We introduce the R-package EWASex, which allows for fast and highly accurate sex estimation using DNA methylation data on a small set of CpG sites located on the X-chromosome under stable X-chromosome inactivation in females. RESULTS: Using different EWAS datasets, we demonstrate that EWASex outperforms current state-of-the-art tools. With EWASex, we offer an efficient way to predict and verify sex that can be easily implemented in any EWAS using blood samples or even other tissue types. It comes with pre-trained weights, so it works without prior sex labels and without requiring access to raw data, which all currently available methods need. AVAILABILITY AND IMPLEMENTATION: The EWASex R-package, along with tutorials, documentation and source code, is available at https://github.com/Silver-Hawk/EWASex. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jesper Beltoft Lund
  - Digital Health & Machine Learning Research Group, Hasso Plattner Institut for Digital Engineering, 14467 Potsdam, Germany
  - Epidemiology & Biostatistics, Department of Public Health, University of Southern Denmark, 5000 Odense, Denmark
- Weilong Li
  - Epidemiology & Biostatistics, Department of Public Health, University of Southern Denmark, 5000 Odense, Denmark
- Afsaneh Mohammadnejad
  - Epidemiology & Biostatistics, Department of Public Health, University of Southern Denmark, 5000 Odense, Denmark
- Shuxia Li
  - Epidemiology & Biostatistics, Department of Public Health, University of Southern Denmark, 5000 Odense, Denmark
- Jan Baumbach
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, 80333 Munich, Germany
  - Computational BioMedicine Lab, Department of Mathematics and Computer Science, 5000 Odense, Denmark
- Qihua Tan
  - Epidemiology & Biostatistics, Department of Public Health, University of Southern Denmark, 5000 Odense, Denmark
  - Computational BioMedicine Lab, Department of Mathematics and Computer Science, 5000 Odense, Denmark
  - Unit of Human Genetics, Department of Clinical Research, University of Southern Denmark, 5000 Odense, Denmark
2
Jung CH, Park DJ, Georgeson P, Mahmood K, Milne RL, Southey MC, Pope BJ. sEst: Accurate Sex-Estimation and Abnormality Detection in Methylation Microarray Data. Int J Mol Sci 2018; 19:ijms19103172. [PMID: 30326623 PMCID: PMC6213967 DOI: 10.3390/ijms19103172] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/20/2018] [Revised: 10/08/2018] [Accepted: 10/09/2018] [Indexed: 01/21/2023]
Abstract
DNA methylation influences predisposition, development and prognosis for many diseases, including cancer. However, it is not uncommon to encounter samples with incorrect sex labelling or atypical sex chromosome arrangement. Sex is one of the strongest influencers of the genomic distribution of DNA methylation and, therefore, correct assignment of sex and filtering of abnormal samples are essential for the quality control of study data. Differences in sex chromosome copy numbers between sexes and X-chromosome inactivation in females result in distinctive sex-specific patterns in the distribution of DNA methylation levels. In this study, we present a software tool, sEst, which incorporates clustering analysis to infer sex and to detect sex-chromosome abnormalities from DNA methylation microarray data. Testing with two publicly available datasets demonstrated that sEst not only correctly inferred the sex of the test samples, but also identified mislabelled samples and samples with potential sex-chromosome abnormalities, such as Klinefelter syndrome and Turner syndrome, the latter being a feature not offered by existing methods. Considering that sex and the sex-chromosome abnormalities can have large effects on many phenotypes, including diseases, our method can make a significant contribution to DNA methylation studies that are based on microarray platforms.
Affiliation(s)
- Chol-Hee Jung
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC 3010, Australia.
- Daniel J Park
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC 3010, Australia.
- Peter Georgeson
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC 3010, Australia.
  - Department of Clinical Pathology, The University of Melbourne, Parkville, VIC 3010, Australia.
- Khalid Mahmood
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC 3010, Australia.
- Roger L Milne
  - Cancer Epidemiology & Intelligence Division, Cancer Council Victoria, Melbourne, VIC 3004, Australia.
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, 3010, Australia.
  - Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC 3168, Australia.
- Melissa C Southey
  - Cancer Epidemiology & Intelligence Division, Cancer Council Victoria, Melbourne, VIC 3004, Australia.
  - Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC 3168, Australia.
  - Genetic Epidemiology Laboratory, The University of Melbourne, Parkville, VIC 3010, Australia.
- Bernard J Pope
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC 3010, Australia.
  - Department of Clinical Pathology, The University of Melbourne, Parkville, VIC 3010, Australia.
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, 3010, Australia.
3
Palumbo D, Affinito O, Monticelli A, Cocozza S. DNA Methylation variability among individuals is related to CpGs cluster density and evolutionary signatures. BMC Genomics 2018; 19:229. [PMID: 29606093 PMCID: PMC5880022 DOI: 10.1186/s12864-018-4618-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/07/2017] [Accepted: 03/23/2018] [Indexed: 01/06/2023]
Abstract
BACKGROUND: In recent years, epigenetics has gained a central role in the understanding of the process of natural selection. It is now clear how environmental impacts on the methylome could promote methylation variability with direct effects on disease etiology as well as phenotypic and genotypic variations in evolutionary processes. To identify possible factors influencing inter-individual methylation variability, we studied the standard deviation of methylation values in 166 healthy individuals, searching for possible associations with genomic features and evolutionary signatures. RESULTS: We analyzed methylation variability values in relation to CpG cluster density and found a strong association between them (p-value < 2.2 × 10⁻¹⁶). Furthermore, we found that genes related to CpGs with high methylation variability values were enriched for immunological pathways, whereas those associated with low values were enriched for pathways related to basic cellular functions. Finally, we found an association between methylation variability values and signals of both ancient (p-value < 2.2 × 10⁻¹⁶) and recent selective pressure (p-value < 1 × 10⁻⁴). CONCLUSION: Our results indicate the presence of an intricate interplay between genetics, epigenetic code and evolutionary constraints in humans.
Affiliation(s)
- Domenico Palumbo
  - Department of Molecular Medicine and Medical Biotechnology (DMMBM), University of Naples “Federico II”, Naples, Italy
- Ornella Affinito
  - Department of Molecular Medicine and Medical Biotechnology (DMMBM), University of Naples “Federico II”, Naples, Italy
- Antonella Monticelli
  - Institute for Experimental Endocrinology and Oncology (IEOS) “Gaetano Salvatore”, CNR, Naples, Italy
- Sergio Cocozza
  - Department of Molecular Medicine and Medical Biotechnology (DMMBM), University of Naples “Federico II”, Naples, Italy