1
Park MK, Ahn J, Lim JM, Han M, Lee JW, Lee JC, Hwang SJ, Kim KC. A Transcriptomics-Based Machine Learning Model Discriminating Mild Cognitive Impairment and the Prediction of Conversion to Alzheimer's Disease. Cells 2024; 13:1920. [PMID: 39594668 PMCID: PMC11593234 DOI: 10.3390/cells13221920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Revised: 11/14/2024] [Accepted: 11/15/2024] [Indexed: 11/28/2024] Open
Abstract
The clinical spectrum of Alzheimer's disease (AD) ranges dynamically from the asymptomatic stage and mild cognitive impairment (MCI) to mild, moderate, or severe AD. Although a few disease-modifying treatments, such as lecanemab and donanemab, have been developed, current therapies can only delay disease progression rather than halt it entirely. Therefore, the early detection of MCI and the identification of MCI patients at high risk of progression to AD remain urgent unmet needs in the super-aged era. This study utilized transcriptomics data from cognitively unimpaired (CU) individuals, MCI, and AD patients in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort and leveraged machine learning models to identify biomarkers that differentiate MCI from CU and also distinguish AD from MCI individuals. Furthermore, Cox proportional hazards analysis was conducted to identify biomarkers predictive of the progression from MCI to AD. Our machine learning models identified a unique set of gene expression profiles capable of achieving an area under the curve (AUC) of 0.98 in distinguishing those with MCI from CU individuals. A subset of these biomarkers was also found to be significantly associated with the risk of progression from MCI to AD. A linear mixed model demonstrated that plasma tau phosphorylated at threonine 181 (pTau181) and neurofilament light chain (NFL) have prognostic value in predicting longitudinal cognitive decline. These findings underscore the potential of integrating machine learning (ML) with transcriptomic profiling in the early detection and prognostication of AD. This integrated approach could facilitate the development of novel diagnostic tools and therapeutic strategies aimed at delaying or preventing the onset of AD in at-risk individuals. Future studies should focus on validating these biomarkers in larger, independent cohorts and further investigating their roles in AD pathogenesis.
Affiliation(s)
- Min-Koo Park
  - Department of Biological Sciences, College of Natural Sciences, Kangwon National University, Chuncheon 24341, Republic of Korea
  - Hugenebio Institute, Bio-Innovation Park, Erom, Inc., Chuncheon 24427, Republic of Korea
- Jinhyun Ahn
  - Department of Management Information Systems, College of Economics & Commerce, Jeju National University, Jeju 63243, Republic of Korea
- Jin-Muk Lim
  - Precision Medicine Research Institute, Innowl, Co., Ltd., Seoul 08350, Republic of Korea
- Minsoo Han
  - AI Institute, Alopax-Algo, Co., Ltd., Seoul 06978, Republic of Korea
- Ji-Won Lee
  - Hugenebio Institute, Bio-Innovation Park, Erom, Inc., Chuncheon 24427, Republic of Korea
- Jeong-Chan Lee
  - Hugenebio Institute, Bio-Innovation Park, Erom, Inc., Chuncheon 24427, Republic of Korea
- Sung-Joo Hwang
  - Integrated Medicine Institute, Loving Care Hospital, Seongnam 463400, Republic of Korea
- Keun-Cheol Kim
  - Department of Biological Sciences, College of Natural Sciences, Kangwon National University, Chuncheon 24341, Republic of Korea
2
Bode HF, He L, Hjelmborg JVB, Kaprio J, Ollikainen M. Pre-diagnosis blood DNA methylation profiling of twin pairs discordant for breast cancer points to the importance of environmental risk factors. Clin Epigenetics 2024; 16:160. [PMID: 39558433 PMCID: PMC11574988 DOI: 10.1186/s13148-024-01767-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 10/23/2024] [Indexed: 11/20/2024] Open
Abstract
BACKGROUND: Assessment of breast cancer (BC) risk generally relies on mammography, family history, reproductive history, and genotyping of major mutations. However, assessing the impact of environmental factors, such as lifestyle, health-related behavior, or external exposures, is still challenging. DNA methylation (DNAm), capturing both genetic and environmental effects, presents a promising opportunity. Previous studies have identified associations and predicted the risk of BC using DNAm in blood; however, these studies did not distinguish between genetic and environmental contributions to these DNAm sites. In this study, associations between DNAm and BC are assessed using paired twin models, which control for shared genetic and environmental effects, allowing testing for associations between DNAm and non-shared environmental exposures and behavior.
RESULTS: Pre-diagnosis blood samples of 32 monozygotic (MZ) and 76 dizygotic (DZ) female twin pairs discordant for BC were collected at the mean age of 56.0 years, with the mean age at diagnosis 66.8 years and censoring 75.2 years. We identified 212 CpGs (p < 6.4 × 10⁻⁸) and 15 DMRs associated with BC risk across all pairs using paired Cox proportional hazard models. All but one of the CpGs associated with BC risk were hypomethylated, and 198/212 CpGs had their DNAm associated with BC risk independent of genetic effects. According to previous literature, at least five of the top CpGs were related to estrogen signaling. Following a comprehensive two-sample Mendelian randomization analysis, we found evidence supporting a dual causal impact of DNAm at cg20145695 (gene body of NXN, rs480351) with increased risk for estrogen receptor positive BC and decreased risk for estrogen receptor negative BC.
CONCLUSION: While causal effects of DNAm on BC risk are rare, most of the identified CpGs associated with the risk of BC appear to be independent of genetic effects. This suggests that DNAm could serve as a valuable biomarker for environmental risk factors for BC, and may offer potential benefits as a complementary tool to current risk assessment procedures.
Affiliation(s)
- Hannes Frederik Bode
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, 00290, Helsinki, Finland
  - Minerva Foundation Institute for Medical Research, Tukholmankatu 8, 00290, Helsinki, Finland
- Liang He
  - Research Unit for Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Jacob V B Hjelmborg
  - Research Unit for Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
- Jaakko Kaprio
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, 00290, Helsinki, Finland
- Miina Ollikainen
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Tukholmankatu 8, 00290, Helsinki, Finland
  - Minerva Foundation Institute for Medical Research, Tukholmankatu 8, 00290, Helsinki, Finland
3
Milicic L, Porter T, Vacher M, Laws SM. Utility of DNA Methylation as a Biomarker in Aging and Alzheimer's Disease. J Alzheimers Dis Rep 2023; 7:475-503. [PMID: 37313495 PMCID: PMC10259073 DOI: 10.3233/adr-220109] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/20/2022] [Accepted: 04/23/2023] [Indexed: 06/15/2023] Open
Abstract
Epigenetic mechanisms such as DNA methylation have been implicated in a number of diseases including cancer, heart disease, autoimmune disorders, and neurodegenerative diseases. While it is recognized that DNA methylation is tissue-specific, a limitation for many studies is the ability to sample the tissue of interest, which is why there is a need for a proxy tissue, such as blood, that reflects the methylation state of the target tissue. In the last decade, DNA methylation has been utilized in the design of epigenetic clocks, which aim to predict an individual's biological age based on an algorithmically defined set of CpGs. A number of studies have found associations of disease and/or disease risk with increased biological age, adding weight to the theory of increased biological age being linked with disease processes. Hence, this review takes a closer look at the utility of DNA methylation as a biomarker in aging and disease, with a particular focus on Alzheimer's disease.
Affiliation(s)
- Lidija Milicic
  - Centre for Precision Health, Edith Cowan University, Joondalup, Western Australia, Australia
  - Collaborative Genomics and Translation Group, Edith Cowan University, Joondalup, Western Australia, Australia
  - School of Medical and Health Sciences, Edith Cowan University, Joondalup, Western Australia, Australia
- Tenielle Porter
  - Centre for Precision Health, Edith Cowan University, Joondalup, Western Australia, Australia
  - Collaborative Genomics and Translation Group, Edith Cowan University, Joondalup, Western Australia, Australia
  - School of Medical and Health Sciences, Edith Cowan University, Joondalup, Western Australia, Australia
  - Curtin Medical School, Curtin University, Bentley, Western Australia, Australia
- Michael Vacher
  - Centre for Precision Health, Edith Cowan University, Joondalup, Western Australia, Australia
  - CSIRO Health and Biosecurity, Australian e-Health Research Centre, Floreat, Western Australia
- Simon M. Laws
  - Centre for Precision Health, Edith Cowan University, Joondalup, Western Australia, Australia
  - Collaborative Genomics and Translation Group, Edith Cowan University, Joondalup, Western Australia, Australia
  - School of Medical and Health Sciences, Edith Cowan University, Joondalup, Western Australia, Australia
  - Curtin Medical School, Curtin University, Bentley, Western Australia, Australia
4
Tarazona S, Arzalluz-Luque A, Conesa A. Undisclosed, unmet and neglected challenges in multi-omics studies. Nat Comput Sci 2021; 1:395-402. [PMID: 38217236 DOI: 10.1038/s43588-021-00086-z] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 03/18/2021] [Accepted: 05/17/2021] [Indexed: 01/15/2024]
Abstract
Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies.
Affiliation(s)
- Sonia Tarazona
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Angeles Arzalluz-Luque
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Ana Conesa
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Institute for Integrative Systems Biology, Spanish National Research Council, Valencia, Spain
5
Wörheide MA, Krumsiek J, Kastenmüller G, Arnold M. Multi-omics integration in biomedical research - A metabolomics-centric review. Anal Chim Acta 2021; 1141:144-162. [PMID: 33248648 PMCID: PMC7701361 DOI: 10.1016/j.aca.2020.10.038] [Citation(s) in RCA: 128] [Impact Index Per Article: 32.0] [Received: 07/15/2020] [Revised: 10/09/2020] [Accepted: 10/19/2020] [Indexed: 02/07/2023]
Abstract
Recent advances in high-throughput technologies have enabled the profiling of multiple layers of a biological system, including DNA sequence data (genomics), RNA expression levels (transcriptomics), and metabolite levels (metabolomics). This has led to the generation of vast amounts of biological data that can be integrated in so-called multi-omics studies to examine the complex molecular underpinnings of health and disease. Integrative analysis of such datasets is not straightforward and is particularly complicated by the high dimensionality and heterogeneity of the data and by the lack of universal analysis protocols. Previous reviews have discussed various strategies to address the challenges of data integration, elaborating on specific aspects, such as network inference or feature selection techniques. Thereby, the main focus has been on the integration of two omics layers in their relation to a phenotype of interest. In this review we provide an overview of a typical multi-omics workflow, focusing on integration methods that have the potential to combine metabolomics data with two or more omics. We discuss multiple integration concepts including data-driven, knowledge-based, simultaneous and step-wise approaches. We highlight the application of these methods in recent multi-omics studies, including large-scale integration efforts aiming at a global depiction of the complex relationships within and between different biological layers without focusing on a particular phenotype.
Affiliation(s)
- Maria A Wörheide
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Jan Krumsiek
  - Institute for Computational Biomedicine, Englander Institute for Precision Medicine, Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Gabi Kastenmüller
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
- Matthias Arnold
  - Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
  - Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA