1
Sahoo K, Sundararajan V. Methods in DNA methylation array dataset analysis: A review. Comput Struct Biotechnol J 2024; 23:2304-2325. [PMID: 38845821 PMCID: PMC11153885 DOI: 10.1016/j.csbj.2024.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 04/25/2024] [Accepted: 05/08/2024] [Indexed: 06/09/2024] Open
Abstract
Understanding the intricate relationships between gene expression levels and epigenetic modifications in a genome is crucial to comprehending the pathogenic mechanisms of many diseases. With the advancement of DNA methylome profiling techniques, the identification of Differentially Methylated Regions and Genes (DMRs/DMGs) has become crucial for biomarker discovery, offering new insights into the etiology of illnesses. This review surveys the current state of computational tools and algorithms for the analysis of microarray-based DNA methylation profiling datasets, focusing on key concepts underlying the extraction of diagnostic/prognostic CpG sites. It addresses methodological frameworks, algorithms, and pipelines employed by various authors, serving as a roadmap to address challenges and understand changing trends in the methodologies for analyzing array-based DNA methylation profiling datasets derived from diseased genomes. Additionally, it highlights the importance of integrating gene expression and methylation datasets for accurate biomarker identification, explores prognostic prediction models, and discusses molecular subtyping for disease classification. The review also emphasizes the contributions of machine learning, neural networks, and data mining to enhance diagnostic workflow development, thereby improving accuracy, precision, and robustness.
Affiliation(s)
- Vino Sundararajan
- Correspondence to: Department of Bio Sciences, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore 632 014, Tamil Nadu, India.
2
Shore CJ, Villicaña S, El-Sayed Moustafa JS, Roberts AL, Gunn DA, Bataille V, Deloukas P, Spector TD, Small KS, Bell JT. Genetic effects on the skin methylome in healthy older twins. Am J Hum Genet 2024; 111:1932-1952. [PMID: 39137780 PMCID: PMC11393713 DOI: 10.1016/j.ajhg.2024.07.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 05/22/2024] [Accepted: 07/15/2024] [Indexed: 08/15/2024] Open
Abstract
Whole-skin DNA methylation variation has been implicated in several diseases, including melanoma, but its genetic basis has not yet been fully characterized. Using bulk skin tissue samples from 414 healthy female UK twins, we performed twin-based heritability and methylation quantitative trait loci (meQTL) analyses for >400,000 DNA methylation sites. We find that the human skin DNA methylome is on average less heritable than previously estimated in blood and other tissues (mean heritability: 10.02%). meQTL analysis identified local genetic effects influencing DNA methylation at 18.8% (76,442) of tested CpG sites, as well as 1,775 CpG sites associated with at least one distal genetic variant. As a functional follow-up, we performed skin expression QTL (eQTL) analyses in a partially overlapping sample of 604 female twins. Colocalization analysis identified over 3,500 shared genetic effects affecting thousands of CpG sites (10,067) and genes (4,475). Mediation analysis of putative colocalized gene-CpG pairs identified 114 genes with evidence for eQTL effects being mediated by DNA methylation in skin, including genes implicated in skin disease such as ALOX12 and CSPG4. We further explored the relevance of skin meQTLs to skin disease and found that skin meQTLs and CpGs under genetic influence were enriched for multiple skin-related genome-wide and epigenome-wide association signals, including for melanoma and psoriasis. Our findings give insights into the regulatory landscape of epigenomic variation in skin.
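The abstract does not say which twin model was fitted, so the following is only an illustration of the idea behind twin-based heritability, not the study's method. It applies Falconer's classical approximation, h² = 2(r_MZ − r_DZ), to the methylation correlation within monozygotic and dizygotic pairs at a single CpG site. The function name and the example correlations are hypothetical and are not taken from the paper.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation of heritability from twin-pair correlations.

    r_mz: correlation of methylation between monozygotic co-twins
    r_dz: correlation of methylation between dizygotic co-twins
    """
    h2 = 2.0 * (r_mz - r_dz)
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(max(h2, 0.0), 1.0)


# Hypothetical correlations for one CpG site (illustrative values only).
estimate = falconer_h2(r_mz=0.30, r_dz=0.25)
print(f"h2 ~ {estimate:.2f}")
```

Variance-component models (for example ACE models) are the more usual choice in practice because they also yield confidence intervals; the closed-form estimate above is only meant to show where the signal comes from.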
Affiliation(s)
- Christopher J Shore
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
- Sergio Villicaña
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Amy L Roberts
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Veronique Bataille
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Panos Deloukas
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Kerrin S Small
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Jordana T Bell
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
3
Yusipov I, Kalyakulina A, Trukhanov A, Franceschi C, Ivanchenko M. Map of epigenetic age acceleration: A worldwide analysis. Ageing Res Rev 2024; 100:102418. [PMID: 39002646 DOI: 10.1016/j.arr.2024.102418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 07/03/2024] [Accepted: 07/08/2024] [Indexed: 07/15/2024]
Abstract
We present a systematic analysis of epigenetic age acceleration based on by far the largest collection of publicly available DNA methylation data for healthy samples (93 datasets, 23 K samples), focusing on the geographic (25 countries) and ethnic (31 ethnicities) aspects around the world. We employed the most popular epigenetic tools for assessing age acceleration and examined their quality metrics and their ability to extrapolate to epigenetic data from tissue types and age ranges that differ from the training data of these models. In most cases, the models proved to be inconsistent with each other and showed different signs of age acceleration, with the PhenoAge model tending to systematically underestimate and different versions of the GrimAge model tending to systematically overestimate the age prediction of healthy subjects. Regarding data availability and consistency, most countries and populations are still not represented in GEO; moreover, different datasets use different criteria for determining healthy controls. Because of this, it is difficult to fully isolate the contribution of "geography/environment", "ethnicity" and "healthiness" to epigenetic age acceleration. Among the explored metrics, only DunedinPACE, which measures aging rate, appears to adequately reflect the standard of living and socioeconomic indicators in countries, although its application is limited to blood methylation data. By epigenetic age acceleration, males age faster than females in most of the studied countries and populations.
Affiliation(s)
- Igor Yusipov
- Artificial Intelligence Research Center, Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, Nizhny Novgorod 603022, Russia; Institute of Biogerontology, Lobachevsky State University, Nizhny Novgorod 603022, Russia.
- Alena Kalyakulina
- Artificial Intelligence Research Center, Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, Nizhny Novgorod 603022, Russia; Institute of Biogerontology, Lobachevsky State University, Nizhny Novgorod 603022, Russia.
- Arseniy Trukhanov
- Mriya Life Institute, National Academy of Active Longevity, Moscow 124489, Russia.
- Claudio Franceschi
- Institute of Biogerontology, Lobachevsky State University, Nizhny Novgorod 603022, Russia.
- Mikhail Ivanchenko
- Artificial Intelligence Research Center, Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, Nizhny Novgorod 603022, Russia; Institute of Biogerontology, Lobachevsky State University, Nizhny Novgorod 603022, Russia.
4
Sala C, Di Lena P, Fernandes Durso D, Faria do Valle I, Bacalini MG, Dall’Olio D, Franceschi C, Castellani G, Garagnani P, Nardini C. Where are we in the implementation of tissue-specific epigenetic clocks? Front Bioinform 2024; 4:1306244. [PMID: 38501111 PMCID: PMC10944965 DOI: 10.3389/fbinf.2024.1306244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 02/14/2024] [Indexed: 03/20/2024] Open
Abstract
Introduction: DNA methylation clocks present advantageous characteristics with respect to the ambitious goal of identifying very early markers of disease, based on the concept that accelerated ageing is a reliable predictor in this sense. Methods: Such tools, being epigenomics-based, are expected to be conditioned by sex and tissue specificities, and this work quantifies this dependency as well as the dependency on the regression model and the size of the training set. Results: Our quantitative results indicate that elastic-net penalization is the best performing strategy, and more so when, unsurprisingly, the data set is bigger; sex does not appear to condition clock performance, and tissue-specific clocks appear to perform better than generic blood clocks. Finally, when considering all trained clocks, we identified a subset of genes that, to the best of our knowledge, have not been presented yet and might deserve further investigation: CPT1A, MMP15, SHROOM3, SLIT3, and SYNGR. Conclusion: These factual starting points can be useful for the future medical translation of clocks and in particular in the debate between multi-tissue clocks, generally trained on a large majority of blood samples, and tissue-specific clocks.
Affiliation(s)
- Claudia Sala
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
- Pietro Di Lena
- Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
- Danielle Fernandes Durso
- National Council of Technological and Scientific Development (CNPq), Ministry of Science Technology and Innovation (MCTI), Brasília, Brazil
- Daniele Dall’Olio
- IRCCS Istituto delle Scienze Neurologiche di Bologna, Bologna, Italy
- Claudio Franceschi
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky University, Nizhny Novgorod, Russia
- Gastone Castellani
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
- Paolo Garagnani
- Department of Medical and Surgical Sciences, University of Bologna, Bologna, Italy
- Christine Nardini
- Istituto per le Applicazioni del Calcolo “Mauro Picone”, Consiglio Nazionale delle Ricerche, Roma, Italy
5
Ng JWY, Felix JF, Olson DM. A novel approach to risk exposure and epigenetics-the use of multidimensional context to gain insights into the early origins of cardiometabolic and neurocognitive health. BMC Med 2023; 21:466. [PMID: 38012757 PMCID: PMC10683259 DOI: 10.1186/s12916-023-03168-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 11/09/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: Each mother-child dyad represents a unique combination of genetic and environmental factors. This constellation of variables impacts the expression of countless genes. Numerous studies have uncovered changes in DNA methylation (DNAm), a form of epigenetic regulation, in offspring related to maternal risk factors. How these changes work together to link maternal-child risks to childhood cardiometabolic and neurocognitive traits remains unknown. This question is a key research priority as such traits predispose to future non-communicable diseases (NCDs). We propose viewing risk and the genome through a multidimensional lens to identify common DNAm patterns shared among diverse risk profiles. METHODS: We identified multifactorial Maternal Risk Profiles (MRPs) generated from population-based data (n = 15,454, Avon Longitudinal Study of Parents and Children (ALSPAC)). Using cord blood HumanMethylation450 BeadChip data, we identified genome-wide patterns of DNAm that co-vary with these MRPs. We tested the prospective relation of these DNAm patterns (n = 914) to future outcomes using decision tree analysis. We then tested the reproducibility of these patterns in (1) DNAm data at age 7 and 17 years within the same cohort (n = 973 and 974, respectively) and (2) cord DNAm in an independent cohort, the Generation R Study (n = 686). RESULTS: We identified twenty MRP-related DNAm patterns at birth in ALSPAC. Four were prospectively related to cardiometabolic and/or neurocognitive childhood outcomes. These patterns were replicated in DNAm data from blood collected at later ages. Three of these patterns were externally validated in cord DNAm data in Generation R. Compared to previous literature, DNAm patterns exhibited novel spatial distribution across the genome that intersects with chromatin functional and tissue-specific signatures. CONCLUSIONS: To our knowledge, we are the first to leverage multifactorial population-wide data to detect patterns of variability in DNAm. This context-based approach decreases biases stemming from overreliance on specific samples or variables. We discovered molecular patterns demonstrating prospective and replicable relations to complex traits. Moreover, results suggest that patterns harbour a genome-wide organisation specific to chromatin regulation and target tissues. These preliminary findings warrant further investigation to better reflect the reality of human context in molecular studies of NCDs.
Affiliation(s)
- Jane W Y Ng
- Department of Pediatrics, Cumming School of Medicine, University of Calgary, 28 Oki Drive NW, Calgary, AB, T3B 6A8, Canada
- Janine F Felix
- The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Postbus 2040, 3000 CA, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus MC University Medical Center Rotterdam, Rotterdam, The Netherlands
- David M Olson
- Departments of Obstetrics and Gynecology, Physiology, and Pediatrics, Faculty of Medicine and Dentistry, University of Alberta, 220 HMRC, Edmonton, AB, T6G 2S2, Canada.
6
Khan A, Inkster AM, Peñaherrera MS, King S, Kildea S, Oberlander TF, Olson DM, Vaillancourt C, Brain U, Beraldo EO, Beristain AG, Clifton VL, Del Gobbo GF, Lam WL, Metz GAS, Ng JWY, Price EM, Schuetz JM, Yuan V, Portales-Casamar É, Robinson WP. The application of epiphenotyping approaches to DNA methylation array studies of the human placenta. Epigenetics Chromatin 2023; 16:37. [PMID: 37794499 PMCID: PMC10548571 DOI: 10.1186/s13072-023-00507-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/15/2023] [Indexed: 10/06/2023] Open
Abstract
BACKGROUND: Genome-wide DNA methylation (DNAme) profiling of the placenta with Illumina Infinium Methylation bead arrays is often used to explore the connections between in utero exposures, placental pathology, and fetal development. However, many technical and biological factors can lead to signals of DNAme variation between samples and between cohorts, and understanding and accounting for these factors is essential to ensure meaningful and replicable data analysis. Recently, "epiphenotyping" approaches have been developed whereby DNAme data can be used to impute information about phenotypic variables such as gestational age, sex, cell composition, and ancestry. These epiphenotypes offer avenues to compare phenotypic data across cohorts, and to understand how phenotypic variables relate to DNAme variability. However, the relationships between placental epiphenotyping variables and other technical and biological variables, and their application to downstream epigenome analyses, have not been well studied. RESULTS: Using DNAme data from 204 placentas across three cohorts, we applied the PlaNET R package to estimate epiphenotypes gestational age, ancestry, and cell composition in these samples. PlaNET ancestry estimates were highly correlated with independent polymorphic ancestry-informative markers, and epigenetic gestational age, on average, was estimated within 4 days of reported gestational age, underscoring the accuracy of these tools. Cell composition estimates varied both within and between cohorts, as well as over very long placental processing times. Interestingly, the ratio of cytotrophoblast to syncytiotrophoblast proportion decreased with increasing gestational age, and differed slightly by both maternal ethnicity (lower in white vs. non-white) and genetic ancestry (lower in higher probability European ancestry). The cohort of origin and cytotrophoblast proportion were the largest drivers of DNAme variation in this dataset, based on their associations with the first principal component. CONCLUSIONS: This work confirms that cohort, array (technical) batch, cell type proportion, self-reported ethnicity, genetic ancestry, and biological sex are important variables to consider in any analyses of Illumina DNAme data. We further demonstrate the specific utility of epiphenotyping tools developed for use with placental DNAme data, and show that these variables (i) provide an independent check of clinically obtained data and (ii) provide a robust approach to compare variables across different datasets. Finally, we present a general framework for the processing and analysis of placental DNAme data, integrating the epiphenotype variables discussed here.
Affiliation(s)
- A Khan
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Princess Margaret Cancer Center, Toronto, ON, M5G 2C4, Canada
- A M Inkster
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- M S Peñaherrera
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- S King
- Department of Psychiatry, McGill University, Montreal, QC, H3A 1A1, Canada
- Psychosocial Research Division, Douglas Hospital Research Centre, Montreal, QC, H4H 1R3, Canada
- S Kildea
- Mater Research Institute, University of Queensland, Brisbane, QLD, 4101, Australia
- Molly Wardaguga Research Centre, Charles Darwin University, Brisbane, QLD, 4000, Australia
- T F Oberlander
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- School of Population and Public Health, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, BC, V6H 3V4, Canada
- D M Olson
- Department of Obstetrics and Gynecology, University of Alberta, 220 HMRC, Edmonton, AB, T6G 2S2, Canada
- C Vaillancourt
- Centre Armand Frappier Santé Biotechnologie - INRS and University of Quebec Intersectorial Health Research Network, Laval, QC, H7V 1B7, Canada
- U Brain
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- School of Population and Public Health, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- Department of Pediatrics, University of British Columbia, Vancouver, BC, V6H 3V4, Canada
- E O Beraldo
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- A G Beristain
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Obstetrics & Gynecology, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada
- V L Clifton
- Mater Research Institute, University of Queensland, Brisbane, QLD, 4101, Australia
- Faculty of Medicine, The University of Queensland, Herston, QLD, 4006, Australia
- G F Del Gobbo
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, ON, K1H 5B2, Canada
- W L Lam
- British Columbia Cancer Research Centre, Vancouver, BC, V5Z 1L3, Canada
- G A S Metz
- Canadian Centre for Behavioural Neuroscience, Department of Neuroscience, University of Lethbridge, Lethbridge, AB, T1K 3M4, Canada
- J W Y Ng
- Faculty of Medicine, University of Calgary, Calgary, AB, T2N 4N1, Canada
- E M Price
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, ON, K1H 5B2, Canada
- J M Schuetz
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- V Yuan
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- É Portales-Casamar
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada.
- Centre de Recherche du CHU Sainte-Justine, 3175 Côte-Sainte-Catherine Road, Montréal, QC, H3T 1C5, Canada.
- W P Robinson
- BC Children's Hospital Research Institute (BCCHR), 950 W 28th Ave, Vancouver, BC, V5Z 4H4, Canada.
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada.
7
Kalyakulina A, Yusipov I, Bacalini MG, Franceschi C, Vedunova M, Ivanchenko M. Disease classification for whole-blood DNA methylation: Meta-analysis, missing values imputation, and XAI. Gigascience 2022; 11:giac097. [PMID: 36259657 PMCID: PMC9718659 DOI: 10.1093/gigascience/giac097] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/23/2022] [Revised: 08/01/2022] [Accepted: 09/15/2022] [Indexed: 07/25/2023] Open
Abstract
BACKGROUND: DNA methylation has a significant effect on gene expression and can be associated with various diseases. Meta-analysis of available DNA methylation datasets requires development of a specific workflow for joint data processing. RESULTS: We propose a comprehensive approach that combines DNA methylation datasets to classify controls and patients. The solution includes data harmonization, construction of machine learning classification models, dimensionality reduction of models, imputation of missing values, and explanation of model predictions by explainable artificial intelligence (XAI) algorithms. We show that harmonization can improve classification accuracy by up to 20% when preprocessing methods of the training and test datasets are different. The best accuracy results were obtained with tree ensembles, reaching above 95% for Parkinson's disease. Dimensionality reduction can substantially decrease the number of features, without detriment to the classification accuracy. The best imputation methods achieve almost the same classification accuracy for data with missing values as for the original data. XAI approaches have allowed us to explain model predictions from both population-level and individual perspectives. CONCLUSIONS: We propose a methodologically valid and comprehensive approach to the classification of healthy individuals and patients with various diseases based on whole-blood DNA methylation data using Parkinson's disease and schizophrenia as examples. The proposed algorithm works better for the former pathology, characterized by a complex set of symptoms. It makes it possible to solve data harmonization problems for meta-analysis of many different datasets, impute missing values, and build classification models of small dimensionality.
Affiliation(s)
- Alena Kalyakulina
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Igor Yusipov
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Claudio Franceschi
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Maria Vedunova
- Institute of Biology and Biomedicine, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
- Mikhail Ivanchenko
- Institute of Information Technologies, Mathematics and Mechanics, Lobachevsky State University, 603022 Nizhny Novgorod, Russia
8
Di Lena P, Sala C, Nardini C. Evaluation of different computational methods for DNA methylation-based biological age. Brief Bioinform 2022; 23:6632619. [PMID: 35794713 DOI: 10.1093/bib/bbac274] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/21/2022] [Revised: 05/27/2022] [Accepted: 06/14/2022] [Indexed: 11/13/2022] Open
Abstract
In recent years there has been widespread interest in researching biomarkers of aging that could predict physiological vulnerability better than chronological age. Aging, in fact, is one of the most relevant risk factors for a wide range of maladies, and molecular surrogates of this phenotype could enable better patient stratification. Among the most promising of such biomarkers is DNA methylation-based biological age. Given the potential and variety of computational implementations (epigenetic clocks), we here present a systematic review of such clocks. Furthermore, we provide a large-scale performance comparison across different tissues and diseases in terms of age prediction accuracy and age acceleration, a measure of deviance from physiology. Our analysis offers both a state-of-the-art overview of the computational techniques developed so far and a heterogeneous picture of performances, which can be helpful in orienting future research.
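The abstract does not define how age acceleration is computed. Two definitions are common in the epigenetic clock literature: the plain difference between predicted (epigenetic) age and chronological age, and the residual from a linear regression of predicted age on chronological age. The sketch below implements both in plain Python as an illustration only; the function names and the sample ages are hypothetical.

```python
def age_acceleration_difference(chron_age, epi_age):
    """Age acceleration as predicted epigenetic age minus chronological age."""
    return [e - c for c, e in zip(chron_age, epi_age)]


def age_acceleration_residuals(chron_age, epi_age):
    """Age acceleration as residuals of an ordinary least-squares fit
    of epigenetic age on chronological age."""
    n = len(chron_age)
    mean_c = sum(chron_age) / n
    mean_e = sum(epi_age) / n
    sxx = sum((c - mean_c) ** 2 for c in chron_age)
    sxy = sum((c - mean_c) * (e - mean_e) for c, e in zip(chron_age, epi_age))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_c
    # Positive residual: the sample looks older than expected for its age.
    return [e - (intercept + slope * c) for c, e in zip(chron_age, epi_age)]


# Hypothetical cohort of four samples (ages in years).
chron = [20, 40, 60, 80]
epi = [25, 38, 66, 79]
print(age_acceleration_difference(chron, epi))
print(age_acceleration_residuals(chron, epi))
```

The residual form is uncorrelated with chronological age within the cohort by construction, which is why it is often preferred when a clock is systematically biased at young or old ages.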
Affiliation(s)
- Pietro Di Lena
- Department of Computer Science and Engineering, University of Bologna, Mura Anteo Zamboni 7, 40126 Bologna, Italy
- Claudia Sala
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Via Massarenti 9, 40138, Bologna, Italy
9
Mi S, Shi Y, Dari G, Yu Y. Function of m6A and its regulation of domesticated animals' complex traits. J Anim Sci 2022; 100:6524534. [PMID: 35137116 PMCID: PMC8942107 DOI: 10.1093/jas/skac034] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/04/2021] [Accepted: 02/06/2022] [Indexed: 11/14/2022] Open
Abstract
N6-methyladenosine (m6A) is the most functionally important epigenetic modification in RNA. The m6A modification is widespread in mRNA and noncoding RNA, influences mRNA processing, and regulates the secondary structure and maturation of noncoding RNA. Studies have shown the important regulatory roles of m6A modification in animals' complex traits, such as development, immunity, and reproduction-related traits. As an important intermediate stage from animal genome to phenotype, the function of m6A in the complex trait formation of domestic animals cannot be neglected. This review discusses recent research advances on m6A modification in well-studied organisms, such as human and model organisms, and introduces m6A detection technologies, small-molecule inhibitors of m6A-related enzymes, interaction between m6A and other biological processes, and the regulation mechanisms of m6A in domesticated animals' complex traits.
Affiliation(s)
- Siyuan Mi
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Yuanjun Shi
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Gerile Dari
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Ying Yu
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs and National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China. Corresponding author.
10
Planterose Jiménez B, Kayser M, Vidaki A. Revisiting genetic artifacts on DNA methylation microarrays exposes novel biological implications. Genome Biol 2021; 22:274. [PMID: 34548083 PMCID: PMC8454075 DOI: 10.1186/s13059-021-02484-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/29/2021] [Accepted: 09/01/2021] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Illumina DNA methylation microarrays enable epigenome-wide analysis and are widely used for the discovery of novel DNA methylation variation in health and disease. However, the microarrays' probe design cannot fully consider the vast human genetic diversity, leading to genetic artifacts. Distinguishing genuine from artifactual genetic influence is of particular relevance in the study of DNA methylation heritability and methylation quantitative trait loci. But despite its importance, current strategies to account for genetic artifacts are lagging due to a limited mechanistic understanding of how such artifacts operate. RESULTS: To address this, we develop and benchmark UMtools, an R-package containing novel methods for the quantification and qualification of genetic artifacts based on fluorescence intensity signals. With our approach, we model and validate known SNPs/indels on a genetically controlled dataset of monozygotic twins, and we estimate minor allele frequency from DNA methylation data and empirically detect variants not included in dbSNP. Moreover, we identify examples where genetic artifacts interact with each other or with imprinting, X-inactivation, or tissue-specific regulation. Finally, we propose a novel strategy based on co-methylation that can discern between genetic artifacts and genuine genomic influence. CONCLUSIONS: We provide an atlas to navigate through the huge diversity of genetic artifacts encountered on DNA methylation microarrays. Overall, our study sets the ground for a paradigm shift in the study of the genetic component of epigenetic variation in DNA methylation microarrays.
Affiliation(s)
- Benjamin Planterose Jiménez
- Erasmus MC, University Medical Center Rotterdam, Department of Genetic Identification, Rotterdam, the Netherlands
- Manfred Kayser
- Erasmus MC, University Medical Center Rotterdam, Department of Genetic Identification, Rotterdam, the Netherlands
- Athina Vidaki
- Erasmus MC, University Medical Center Rotterdam, Department of Genetic Identification, Rotterdam, the Netherlands
11
Di Lena P, Sala C, Prodi A, Nardini C. Methylation data imputation performances under different representations and missingness patterns. BMC Bioinformatics 2020; 21:268. [PMID: 32600298 PMCID: PMC7325236 DOI: 10.1186/s12859-020-03592-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/21/2019] [Accepted: 06/09/2020] [Indexed: 02/07/2023] Open
Abstract
Background: High-throughput technologies enable the cost-effective collection and analysis of DNA methylation data throughout the human genome. This naturally entails missing values management that can complicate the analysis of the data. Several general and specific imputation methods are suitable for DNA methylation data. However, there are no detailed studies of their performances under different missing data mechanisms, (completely) at random or not, and different representations of DNA methylation levels (β and M-value). Results: We make an extensive analysis of the imputation performances of seven imputation methods on simulated missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR) methylation data. We further consider imputation performances on the popular β- and M-value representations of methylation levels. Overall, β-values enable better imputation performances than M-values. Imputation accuracy is lower for mid-range β-values, while it is generally higher for values at the extremes of the β-value range. The distribution of MAR values is on average denser in the mid-range than the expected β-value distribution. As a consequence, MAR values are on average harder to impute. Conclusions: The results of the analysis provide guidelines for the most suitable imputation approaches for DNA methylation data under different representations of DNA methylation levels and different missing data mechanisms.
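For readers unfamiliar with the two representations compared above, the widely used relationship between them is M = log2(β / (1 − β)), so that β = 2^M / (2^M + 1). The sketch below shows that conversion in Python. The clipping offset `eps` and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a beta-value in [0, 1] to an M-value (log2 odds of methylation)."""
    # Clip away from 0 and 1 so the logarithm stays finite (eps is an assumed offset).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m: float) -> float:
    """Invert the transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated site maps to M = 0; values near 0 or 1 are stretched outward.
for beta in (0.05, 0.5, 0.95):
    print(beta, round(beta_to_m(beta), 3))
```

Because the transform stretches the extremes and compresses nothing in the mid-range, an imputation error of the same size in β corresponds to a much larger error in M near 0 or 1, which is one reason results can differ between the two scales.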
Affiliation(s)
- Pietro Di Lena
- Department of Computer Science and Engineering, University of Bologna, Mura Anteo Zamboni 7, Bologna, Italy.
- Claudia Sala
- Department of Physics and Astronomy, University of Bologna, Viale Berti Pichat 6/2, Bologna, Italy
- Andrea Prodi
- Smart Cities Living Lab, ISOF CNR, Via P. Gobetti, 101, Bologna, Italy
- Christine Nardini
- Istituto per le Applicazioni del Calcolo Mauro Picone, CNR, Via dei Taurini, 19, Roma, Italy.