1
Eulalio T, Sun MW, Gevaert O, Greicius MD, Montine TJ, Nachun D, Montgomery SB. regionalpcs: improved discovery of DNA methylation associations with complex traits. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.01.590171. [PMID: 38746367 PMCID: PMC11092597 DOI: 10.1101/2024.05.01.590171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
We have developed the regional principal components (rPCs) method, a novel approach for summarizing gene-level methylation. rPCs address the challenge of deciphering complex epigenetic mechanisms in diseases like Alzheimer's disease (AD). In contrast to traditional averaging, rPCs leverage principal components analysis to capture complex methylation patterns across gene regions. Our method demonstrated a 54% improvement in sensitivity over averaging in simulations, offering a robust framework for identifying subtle epigenetic variations. Applying rPCs to the AD brain methylation data in ROSMAP, combined with cell type deconvolution, we uncovered 838 differentially methylated genes associated with neuritic plaque burden, significantly outperforming conventional methods. Integrating methylation quantitative trait loci (meQTL) with genome-wide association studies (GWAS) identified 17 genes with potential causal roles in AD, including MS4A4A and PICALM. Our approach is available in the Bioconductor package regionalpcs, opening avenues for research and facilitating a deeper understanding of the epigenetic landscape in complex diseases.
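The contrast the abstract draws between averaging and a principal-component summary can be illustrated with a minimal Python sketch. This is not the regionalpcs package API (that package is distributed through Bioconductor); the function names `mean_summary` and `first_pc_scores` and the toy data are assumptions made for illustration, and only the first component is computed here, whereas the abstract does not say how many components the method retains.

```python
import math

def mean_summary(region):
    """Average methylation across CpGs for each sample (rows = samples)."""
    return [sum(row) / len(row) for row in region]

def first_pc_scores(region, iters=200):
    """Per-sample scores on the first principal component of a region.

    Hypothetical helper: uses power iteration on the CpG-by-CpG
    covariance matrix, so no external libraries are needed.
    """
    n, p = len(region), len(region[0])
    col_means = [sum(row[j] for row in region) / n for j in range(p)]
    centered = [[row[j] - col_means[j] for j in range(p)] for row in region]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 + 0.1 * j for j in range(p)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in centered]

# Toy region (assumed data): 6 samples x 4 CpGs. In the last three samples
# the first two CpGs gain methylation while the last two lose it.
region = [
    [0.2, 0.2, 0.8, 0.8],
    [0.3, 0.3, 0.7, 0.7],
    [0.2, 0.3, 0.8, 0.7],
    [0.8, 0.8, 0.2, 0.2],
    [0.7, 0.7, 0.3, 0.3],
    [0.8, 0.7, 0.2, 0.3],
]

means = mean_summary(region)
scores = first_pc_scores(region)
print(means)   # opposing changes cancel out in the average
print(scores)  # the two sample groups fall on opposite sides of zero
```

In this toy case the per-sample average is the same for every sample, so an association test on the averaged value would see no signal, while the first-component score separates the two groups. The sign of a principal component is arbitrary, so only the separation, not the direction, is meaningful.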
Affiliation(s)
- Tiffany Eulalio
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Min Woo Sun
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Olivier Gevaert
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Michael D Greicius
  - Department of Neurology & Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
- Thomas J Montine
  - Department of Pathology, Stanford University, Stanford, CA, 94305, USA
- Daniel Nachun
  - Department of Pathology, Stanford University, Stanford, CA, 94305, USA
2
Augustine J, Jereesh AS. Identification of gene-level methylation for disease prediction. Interdiscip Sci 2023; 15:678-695. [PMID: 37603212 DOI: 10.1007/s12539-023-00584-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 07/30/2023] [Accepted: 08/01/2023] [Indexed: 08/22/2023]
Abstract
DNA methylation is an epigenetic alteration that plays a fundamental part in governing gene regulatory processes. The DNA methylation mechanism affixes methyl groups to distinct cytosine residues, influencing chromatin architecture. Multiple studies have demonstrated that DNA methylation's regulatory effect on genes is linked to the onset and progression of several disorders. Researchers have recently uncovered thousands of phenotype-related methylation sites through epigenome-wide association studies (EWAS). However, combining the methylation levels of several sites within a gene to determine gene-level DNA methylation remains challenging. In this study, we propose the supervised UMAP Assisted Gene-level Methylation method (sUAGM) for disease prediction, based on supervised UMAP (Uniform Manifold Approximation and Projection), a manifold learning-based method for reducing dimensionality. The gene-level methylation values generated by the proposed method are evaluated by applying various feature selection and classification algorithms to three distinct DNA methylation datasets derived from blood samples. Performance is assessed using classification accuracy, F1-score, Matthews Correlation Coefficient (MCC), Kappa, Classification Success Index (CSI) and Jaccard Index. The Support Vector Machine with the linear kernel (SVML) classifier with Recursive Feature Elimination (RFE) performs best across all three datasets. In comparative analysis, our method outperformed existing gene-level and site-level approaches by achieving 100% accuracy and F1-score with fewer genes. Functional analysis of the top 28 genes selected from the Parkinson's disease dataset revealed a significant association with the disease.
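Several of the evaluation measures named in this abstract can be computed directly from a binary confusion matrix. The sketch below uses the standard textbook definitions of accuracy, F1-score, MCC, Cohen's kappa and the Jaccard index; it is not the authors' evaluation code, the function name `binary_metrics` and the example counts are assumptions, and the Classification Success Index is left out because its exact definition is not given here.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Jaccard index of the predicted-positive and actual-positive sets.
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected < 1 else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc,
            "kappa": kappa, "jaccard": jaccard}

# Hypothetical counts, chosen only to exercise the formulas.
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With a perfect classifier (no false positives or false negatives) every one of these measures equals 1, which is the situation the abstract reports for accuracy and F1-score.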
Affiliation(s)
- Jisha Augustine
  - Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala, 682022, India
- A S Jereesh
  - Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Cochin, Kerala, 682022, India
3
Xu C, Zhang R, Duan M, Zhou Y, Bao J, Lu H, Wang J, Hu M, Hu Z, Zhou F, Zhu W. A polygenic stacking classifier revealed the complicated platelet transcriptomic landscape of adult immune thrombocytopenia. MOLECULAR THERAPY - NUCLEIC ACIDS 2022; 28:477-487. [PMID: 35505964 PMCID: PMC9046129 DOI: 10.1016/j.omtn.2022.04.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/23/2021] [Accepted: 04/01/2022] [Indexed: 01/19/2023]
Abstract
Immune thrombocytopenia (ITP) is an autoimmune disease whose typical symptom is a low platelet count in the blood. ITP shows age and sex biases in both occurrence and prognosis, and adult ITP is mainly induced by the living environment. The current diagnosis guideline lacks the integration of molecular heterogeneity. This study recruited the largest cohort of platelet transcriptome samples. A comprehensive procedure of feature selection, feature engineering, and stacking classification was carried out to detect ITP biomarkers using RNA sequencing (RNA-seq) transcriptomes. The 40 detected biomarkers were used to train the final ITP detection model, which achieved an overall accuracy of 0.974. The biomarkers suggested that ITP onset may be associated with various transcribed components, including protein-coding genes, long intergenic non-coding RNA (lincRNA) genes, and pseudogenes with apparent transcription. The delivered ITP detection model may also be utilized as a complementary ITP diagnosis tool. The code and the example dataset are freely available at http://www.healthinformaticslab.org/supp/resources.php
Affiliation(s)
- Chengfeng Xu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Ruochi Zhang
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Meiyu Duan
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yongming Zhou
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Jizhang Bao
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Hao Lu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Jie Wang
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Minghui Hu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Zhaoyang Hu
  - Fun-Med Pharmaceutical Technology (Shanghai) Co., Ltd., RM. A310, 115 Xinjunhuan Road, Minhang District, Shanghai 201100, China
  - Corresponding author Zhaoyang Hu, PhD, Fengneng Pharmaceutical Technology (Shanghai) Co., Ltd., RM. A310, 115 Xinjunhuan Road, Minhang District, Shanghai 201100, China.
- Fengfeng Zhou
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
  - Corresponding author Fengfeng Zhou, PhD, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.
- Wenwei Zhu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
  - Corresponding author Wenwei Zhu, PhD, Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China.
4
Wang Q, Duan M, Fan Y, Liu S, Ren Y, Huang L, Zhou F. Transforming OMIC features for classification using Siamese convolutional networks. J Bioinform Comput Biol 2022; 20:2250013. [DOI: 10.1142/s0219720022500135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]