1
Schwarzerova J, Hurta M, Barton V, Lexa M, Walther D, Provaznik V, Weckwerth W. A perspective on genetic and polygenic risk scores-advances and limitations and overview of associated tools. Brief Bioinform 2024; 25:bbae240. [PMID: 38770718 PMCID: PMC11106636 DOI: 10.1093/bib/bbae240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 04/14/2024] [Accepted: 05/03/2024] [Indexed: 05/22/2024]
Abstract
Polygenic Risk Scores (PRS) are used to evaluate an individual's vulnerability to developing specific diseases or conditions based on their genetic composition, by taking into account numerous genetic variations. This article provides an overview of the concept of PRS. We elucidate the historical advancements of PRS, their advantages and shortcomings in comparison with other predictive methods, and discuss their conceptual limitations in light of the complexity of biological systems. Furthermore, we provide a survey of published tools for computing PRS and associated resources. The various tools and software packages are categorized based on their technical utility for users or prospective developers. Understanding the array of available tools and their limitations is crucial for accurately assessing and predicting disease risks, facilitating early interventions, and guiding personalized healthcare decisions. We also identify potential new avenues for future bioinformatic analyses and advancements related to PRS.
Affiliation(s)
- Jana Schwarzerova
  - Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 10, Brno 61600, Czechia
  - Molecular Systems Biology (MOSYS), Department of Functional and Evolutionary Ecology, University of Vienna, Vienna 1010, Austria
- Martin Hurta
  - Department of Computer Systems, Faculty of Information Technology, Brno University of Technology, Brno 612 00, Czechia
- Vojtech Barton
  - Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 10, Brno 61600, Czechia
  - RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno 62500, Czech Republic
- Matej Lexa
  - Faculty of Informatics, Masaryk University, Botanicka 68a, Brno 60200, Czech Republic
- Dirk Walther
  - Max-Planck-Institute of Molecular Plant Physiology, Potsdam 14476, Germany
- Valentine Provaznik
  - Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 10, Brno 61600, Czechia
  - Department of Physiology, Faculty of Medicine, Masaryk University, Brno 62500, Czech Republic
- Wolfram Weckwerth
  - Molecular Systems Biology (MOSYS), Department of Functional and Evolutionary Ecology, University of Vienna, Vienna 1010, Austria
  - Vienna Metabolomics Center (VIME), University of Vienna, Vienna 1010, Austria
2
Gunter NB, Gebre RK, Graff-Radford J, Heckman MG, Jack CR, Lowe VJ, Knopman DS, Petersen RC, Ross OA, Vemuri P, Ramanan VK. Machine Learning Models of Polygenic Risk for Enhanced Prediction of Alzheimer Disease Endophenotypes. Neurol Genet 2024; 10:e200120. [PMID: 38250184 PMCID: PMC10798228 DOI: 10.1212/nxg.0000000000200120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 11/01/2023] [Indexed: 01/23/2024]
Abstract
Background and Objectives: Alzheimer disease (AD) has a polygenic architecture, for which genome-wide association studies (GWAS) have helped elucidate sequence variants (SVs) influencing susceptibility. Polygenic risk score (PRS) approaches show promise for generating summary measures of inherited risk for clinical AD based on the effects of APOE and other GWAS hits. However, existing PRS approaches, based on traditional regression models, explain only modest variation in AD dementia risk and AD-related endophenotypes. We hypothesized that machine learning (ML) models of polygenic risk (ML-PRS) could outperform standard regression-based PRS methods and therefore have the potential for greater clinical utility.
Methods: We analyzed combined data from the Mayo Clinic Study of Aging (n = 1,791) and the Alzheimer's Disease Neuroimaging Initiative (n = 864). An AD PRS was computed for each participant using the top common SVs obtained from a large AD dementia GWAS. In parallel, ML models were trained using those SV genotypes, with amyloid PET burden as the primary outcome. Secondary outcomes included amyloid PET positivity and clinical diagnosis (cognitively unimpaired vs impaired). We compared performance between ML-PRS and standard PRS across 100 training sessions with different data splits. In each session, data were split into 80% training and 20% testing, and then five-fold cross-validation was used within the training set to ensure the best model was produced for testing. We also applied permutation importance techniques to assess which genetic factors contributed most to outcome prediction.
Results: ML-PRS models outperformed the AD PRS (r² = 0.28 vs r² = 0.24 in test set) in explaining variation in amyloid PET burden. Among ML approaches, methods accounting for nonlinear genetic influences were superior to linear methods. ML-PRS models were also more accurate when predicting amyloid PET positivity (area under the curve [AUC] = 0.80 vs AUC = 0.63) and the presence of cognitive impairment (AUC = 0.75 vs AUC = 0.54) compared with the standard PRS.
Discussion: We found that ML-PRS approaches improved upon standard PRS for prediction of AD endophenotypes, partly related to improved accounting for nonlinear effects of genetic susceptibility alleles. Further adaptations of the ML-PRS framework could help to close the gap of remaining unexplained heritability for AD and therefore facilitate more accurate presymptomatic and early-stage risk stratification for clinical decision-making.
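The evaluation protocol in the Methods (100 sessions, each with an 80%/20% train/test split and five-fold cross-validation inside the training set) can be sketched as index bookkeeping. This is a minimal illustration only: the function names, the per-session seeds, and the round-robin fold assignment are assumptions, and the abstract does not say whether the authors stratified their splits.

```python
import random

def split_indices(n, test_fraction=0.2, n_folds=5, seed=0):
    """Return (test, folds): a held-out test set and CV folds of the training set."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    n_test = int(round(n * test_fraction))
    test, train = order[:n_test], order[n_test:]
    # Deal the training indices round-robin into n_folds validation folds
    # (an assumed scheme; any disjoint partition of the training set would do).
    folds = [train[k::n_folds] for k in range(n_folds)]
    return test, folds

def repeated_sessions(n, n_sessions=100):
    """Yield one (test, folds) pair per session, each from a different seed."""
    for session in range(n_sessions):
        yield split_indices(n, seed=session)

# Combined cohort size reported in the abstract: 1,791 + 864 participants.
n_participants = 1791 + 864
sessions = list(repeated_sessions(n_participants))
test, folds = sessions[0]
print(len(sessions), len(test), [len(f) for f in folds])
```

Within each session, a model would be tuned on the folds and scored once on the held-out test indices, so the test set never influences model selection.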
Affiliation(s)
- Nathaniel B Gunter
- Robel K Gebre
- Jonathan Graff-Radford
- Michael G Heckman
- Clifford R Jack
- Val J Lowe
- David S Knopman
- Ronald C Petersen
- Owen A Ross
- Prashanthi Vemuri
- Vijay K Ramanan
From the Departments of Radiology (N.B.G., R.K.G., C.R.J., V.J.L., P.V.), Neurology (J.G.-R., D.S.K., R.C.P., V.K.R.), and Quantitative Health Sciences (R.C.P.), Mayo Clinic Rochester, MN; and Departments of Quantitative Health Sciences (M.G.H.), Neuroscience (O.A.R.), and Clinical Genomics (O.A.R.), Mayo Clinic Florida, Jacksonville
3
Hermes S, Cady J, Armentrout S, O’Connor J, Holdaway SC, Cruchaga C, Wingo T, Greytak EM. Epistatic Features and Machine Learning Improve Alzheimer's Disease Risk Prediction Over Polygenic Risk Scores. J Alzheimers Dis 2024; 99:1425-1440. [PMID: 38788065 PMCID: PMC11284654 DOI: 10.3233/jad-230236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
Background: Polygenic risk scores (PRS) are linear combinations of genetic markers weighted by effect size that are commonly used to predict disease risk. For complex heritable diseases such as late-onset Alzheimer's disease (LOAD), PRS models fail to capture much of the heritability. Additionally, PRS models are highly dependent on the population structure of the data on which effect sizes are assessed and have poor generalizability to new data.
Objective: The goal of this study is to construct a paragenic risk score that, in addition to single genetic marker data used in PRS, incorporates epistatic interaction features and machine learning methods to predict risk for LOAD.
Methods: We construct a new state-of-the-art genetic model for risk of Alzheimer's disease. Our approach innovates over PRS models in two ways: first, by directly incorporating epistatic interactions between SNP loci using an evolutionary algorithm guided by shared pathway information; and second, by estimating risk via an ensemble of non-linear machine learning models rather than a single linear model. We compare the paragenic model to several PRS models from the literature trained on the same dataset.
Results: The paragenic model is significantly more accurate than the PRS models under 10-fold cross-validation, obtaining an AUC of 83% and near-clinically significant matched sensitivity/specificity of 75%. It remains significantly more accurate when evaluated on an independent holdout dataset and maintains accuracy within APOE genotype strata.
Conclusions: Paragenic models show potential for improving disease risk prediction for complex heritable diseases such as LOAD over PRS models.
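The Background's definition of a PRS as a linear combination of markers weighted by effect size, and the idea of adding epistatic interaction features, can be sketched as follows. The marker names, effect sizes, and the product encoding of an interaction are hypothetical choices for illustration. The paper itself selects interactions with an evolutionary algorithm and scores risk with an ensemble of non-linear models, neither of which is reproduced here.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages (0, 1 or 2) over the scored markers."""
    return sum(effect_sizes[snp] * dosages.get(snp, 0) for snp in effect_sizes)

def interaction_features(dosages, snp_pairs):
    """One product feature per SNP pair: an assumed, simple stand-in for epistasis."""
    return {f"{a}x{b}": dosages.get(a, 0) * dosages.get(b, 0) for a, b in snp_pairs}

# Hypothetical marker names and effect sizes, for illustration only.
effect_sizes = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(person, effect_sizes)
features = interaction_features(person, [("snp_a", "snp_b"), ("snp_a", "snp_c")])
print(score, features)
```

A purely additive score cannot represent an effect that appears only when two variants co-occur, which is why interaction terms are passed to a downstream model as separate features rather than folded into the sum.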
Affiliation(s)
- Carlos Cruchaga
  - Department of Psychiatry, Washington University, St. Louis, MO, USA
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University, St. Louis, MO, USA
- Thomas Wingo
  - Goizueta Alzheimer’s Disease Center, Emory University School of Medicine, Atlanta, GA, USA
  - Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA