1. Santiago-Lamelas L, Dos Santos-Sobrín R, Carracedo Á, Castro-Santos P, Díaz-Peña R. Utility of polygenic risk scores to aid in the diagnosis of rheumatic diseases. Best Pract Res Clin Rheumatol 2024:101973. [PMID: 38997822 DOI: 10.1016/j.berh.2024.101973] [Received: 05/07/2024] [Revised: 07/04/2024] [Accepted: 07/05/2024] [Indexed: 07/14/2024]
Abstract
Rheumatic diseases (RDs) are characterized by autoimmunity and autoinflammation and are recognized as complex due to the interplay of multiple genetic, environmental, and lifestyle factors in their pathogenesis. The rapid advancement of genome-wide association studies (GWASs) has enabled the identification of numerous single nucleotide polymorphisms (SNPs) associated with RD susceptibility. Based on these SNPs, polygenic risk scores (PRSs) have emerged as promising tools for quantifying genetic risk in this disease group. This chapter reviews the current status of PRSs in assessing the risk of RDs and discusses their potential to improve the accuracy of the diagnosis of these complex diseases through their ability to discriminate among different RDs. PRSs demonstrate a high discriminatory capacity for various RDs and show potential clinical utility. As GWASs continue to evolve, PRSs are expected to enable more precise risk stratification by integrating genetic, environmental, and lifestyle factors, thereby refining individual risk predictions and advancing disease management strategies.
Affiliation(s)
- Lucía Santiago-Lamelas
  - Fundación Pública Galega de Medicina Xenómica (SERGAS), Centro Nacional de Genotipado, Health Research Institute of Santiago de Compostela (IDIS), Santiago de Compostela, Spain
- Raquel Dos Santos-Sobrín
  - Reumatología, Hospital Clínico Universitario, Health Research Institute of Santiago de Compostela (IDIS), Santiago de Compostela, Spain
- Ángel Carracedo
  - Fundación Pública Galega de Medicina Xenómica (SERGAS), Centro Nacional de Genotipado, Health Research Institute of Santiago de Compostela (IDIS), Santiago de Compostela, Spain; Grupo de Medicina Xenómica, CIMUS, Universidade de Santiago de Compostela, Santiago de Compostela, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
- Patricia Castro-Santos
  - Fundación Pública Galega de Medicina Xenómica (SERGAS), Centro Nacional de Genotipado, Health Research Institute of Santiago de Compostela (IDIS), Santiago de Compostela, Spain; Faculty of Health Sciences, Universidad Autónoma de Chile, Talca, Chile.
- Roberto Díaz-Peña
  - Fundación Pública Galega de Medicina Xenómica (SERGAS), Centro Nacional de Genotipado, Health Research Institute of Santiago de Compostela (IDIS), Santiago de Compostela, Spain; Faculty of Health Sciences, Universidad Autónoma de Chile, Talca, Chile.

2. Yan D, Hu B, Darst BF, Mukherjee S, Kunkle BW, Deming Y, Dumitrescu L, Wang Y, Naj A, Kuzma A, Zhao Y, Kang H, Johnson SC, Carlos C, Hohman TJ, Crane PK, Engelman CD, Lu Q. Biobank-wide association scan identifies risk factors for late-onset Alzheimer's disease and endophenotypes. eLife 2024; 12:RP91360. [PMID: 38787369 PMCID: PMC11126309 DOI: 10.7554/elife.91360] [Indexed: 05/25/2024] [Open Access]
Abstract
Rich data from large biobanks, coupled with increasingly accessible association statistics from genome-wide association studies (GWAS), provide great opportunities to dissect the complex relationships among human traits and diseases. We introduce BADGERS, a powerful method to perform polygenic score-based biobank-wide association scans. Compared to traditional approaches, BADGERS uses GWAS summary statistics as input and does not require multiple traits to be measured in the same cohort. We applied BADGERS to two independent datasets for late-onset Alzheimer's disease (AD; n=61,212). Among 1738 traits in the UK biobank, we identified 48 significant associations for AD. Family history, high cholesterol, and numerous traits related to intelligence and education showed strong and independent associations with AD. Furthermore, we identified 41 significant associations for a variety of AD endophenotypes. While family history and high cholesterol were strongly associated with AD subgroups and pathologies, only intelligence and education-related traits predicted pre-clinical cognitive phenotypes. These results provide novel insights into the distinct biological processes underlying various risk factors for AD.
Affiliation(s)
- Donghui Yan
  - University of Wisconsin-Madison, Madison, United States
- Bowen Hu
  - Department of Statistics, University of Wisconsin-Madison, Madison, United States
- Burcu F Darst
  - Department of Population Health Sciences, University of Wisconsin-Madison, Madison, United States
- Shubhabrata Mukherjee
  - Division of General Internal Medicine, Department of Medicine, University of Washington, Seattle, United States
- Brian W Kunkle
  - University of Miami Miller School of Medicine, Miami, United States
- Yuetiva Deming
  - Department of Population Health Sciences, University of Wisconsin-Madison, Madison, United States
- Logan Dumitrescu
  - Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Vanderbilt University School of Medicine, Nashville, United States
- Yunling Wang
  - University of Wisconsin-Madison, Madison, United States
- Adam Naj
  - School of Medicine, University of Pennsylvania, Philadelphia, United States
- Amanda Kuzma
  - School of Medicine, University of Pennsylvania, Philadelphia, United States
- Yi Zhao
  - School of Medicine, University of Pennsylvania, Philadelphia, United States
- Hyunseung Kang
  - Department of Statistics, University of Wisconsin-Madison, Madison, United States
- Sterling C Johnson
  - Wisconsin Alzheimer’s Institute, University of Wisconsin School of Medicine and Public Health, Madison, United States
  - Geriatric Research Education and Clinical Center, Wm. S. Middleton Memorial VA Hospital, Madison, United States
  - Alzheimer’s Disease Research Center, University of Wisconsin School of Medicine and Public Health, Madison, United States
- Cruchaga Carlos
  - Department of Psychiatry, Washington University in St. Louis, St. Louis, United States
- Timothy J Hohman
  - Vanderbilt Memory and Alzheimer’s Center, Vanderbilt University Medical Center, Vanderbilt University School of Medicine, Nashville, United States
- Paul K Crane
  - Division of General Internal Medicine, Department of Medicine, University of Washington, Seattle, United States
- Corinne D Engelman
  - Department of Population Health Sciences, University of Wisconsin-Madison, Madison, United States
  - Wisconsin Alzheimer’s Institute, University of Wisconsin School of Medicine and Public Health, Madison, United States
  - Alzheimer’s Disease Research Center, University of Wisconsin School of Medicine and Public Health, Madison, United States
- Qiongshi Lu
  - Department of Statistics, University of Wisconsin-Madison, Madison, United States
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, United States

3. Truong B, Ruan Y, Haidermota S, Patel A, Surakka I, Hornsby W, Koyama S, Lee SH, Natarajan P. Modification of coronary artery disease clinical risk factors by coronary artery disease polygenic risk score. MED 2024; 5:459-468.e3. [PMID: 38642556 PMCID: PMC11088498 DOI: 10.1016/j.medj.2024.02.015] [Received: 04/12/2023] [Revised: 10/11/2023] [Accepted: 02/28/2024] [Indexed: 04/22/2024]
Abstract
BACKGROUND The extent to which the relationships between clinical risk factors and coronary artery disease (CAD) are altered by CAD polygenic risk score (PRS) is not well understood. Here, we determine whether the interactions between clinical risk factors and CAD PRS further explain risk for incident CAD. METHODS Participants were of European ancestry from the UK Biobank without prevalent CAD. An externally trained genome-wide CAD PRS was generated and then applied. Clinical risk factors were ascertained at baseline. Cox proportional hazards models were fitted to examine the incident CAD effects of CAD PRS, risk factors, and their interactions. Next, the PRS and risk factors were stratified to investigate the attributable risk of clinical risk factors. FINDINGS A total of 357,144 individuals of European ancestry without prevalent CAD were included. During a median of 11.1 years of follow-up (interquartile range 10.4-14.1 years), CAD PRS was associated with 1.35-fold (95% confidence interval [CI] 1.332-1.368) risk per SD for incident CAD. The prognostic relevance of the following risk factors was relatively diminished for those with high CAD PRS on a continuous scale: type 2 diabetes (hazard ratio for interaction [HR_interaction] 0.91, 95% CI_interaction 0.88-0.94), increased body mass index (HR_interaction 0.97, 95% CI_interaction 0.96-0.98), and increased C-reactive protein (HR_interaction 0.98, 95% CI_interaction 0.96-0.99). However, a high CAD PRS yielded joint risk increases with low-density lipoprotein cholesterol (HR_interaction 1.05, 95% CI_interaction 1.04-1.06) and total cholesterol (HR_interaction 1.05, 95% CI_interaction 1.03-1.06). CONCLUSION The CAD PRS is associated with incident CAD, and its application improves the prognostic relevance of several clinical risk factors. FUNDING P.N. (R01HL127564, R01HL151152, and U01HG011719) is supported by the National Institutes of Health.
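As a reading aid for the interaction hazard ratios quoted in this abstract, the following sketches how such a term is usually interpreted in a Cox model. The notation ($z$ for the PRS in SD units, $x$ for a binary risk factor, and the $\beta$ coefficients) is assumed here for illustration and is not taken from the paper; the sketch also assumes the interaction was modeled per SD of PRS.

```latex
h(t \mid z, x) = h_0(t)\,\exp\!\bigl(\beta_{\mathrm{PRS}}\,z + \beta_{\mathrm{RF}}\,x + \beta_{\mathrm{int}}\,z\,x\bigr)

% Hazard ratio for the risk factor (x = 1 vs. x = 0) at a given PRS level z:
\mathrm{HR}_{\mathrm{RF}}(z) = \exp\!\bigl(\beta_{\mathrm{RF}} + \beta_{\mathrm{int}}\,z\bigr)
                             = \mathrm{HR}_{\mathrm{RF}}(0)\cdot \mathrm{HR}_{\mathrm{int}}^{\,z}

% Using the reported interaction estimates at z = 2 (two SDs above the mean PRS):
% type 2 diabetes:  0.91^{2} \approx 0.83   (risk-factor HR shrinks by about 17%)
% LDL cholesterol:  1.05^{2} \approx 1.10   (risk-factor HR grows by about 10%)
```

Under these assumptions, an interaction HR below 1 means the risk factor's relative effect weakens as the PRS rises, and a value above 1 means it strengthens.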
Affiliation(s)
- Buu Truong
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Yunfeng Ruan
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Sara Haidermota
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Aniruddh Patel
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Ida Surakka
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
- Whitney Hornsby
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Satoshi Koyama
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- S Hong Lee
  - Australian Centre for Precision Health, University of South Australia, Adelaide, SA 5000, Australia; UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA 5000, Australia
- Pradeep Natarajan
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA.

4. Zheng Z, Liu S, Sidorenko J, Wang Y, Lin T, Yengo L, Turley P, Ani A, Wang R, Nolte IM, Snieder H, Yang J, Wray NR, Goddard ME, Visscher PM, Zeng J. Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries. Nat Genet 2024; 56:767-777. [PMID: 38689000 PMCID: PMC11096109 DOI: 10.1038/s41588-024-01704-y] [Received: 10/01/2022] [Accepted: 03/05/2024] [Indexed: 05/02/2024]
Abstract
We develop a method, SBayesRC, that integrates genome-wide association study (GWAS) summary statistics with functional genomic annotations to improve polygenic prediction of complex traits. Our method is scalable to whole-genome variant analysis and refines signals from functional annotations by allowing them to affect both causal variant probability and causal effect distribution. We analyze 50 complex traits and diseases using ∼7 million common single-nucleotide polymorphisms (SNPs) and 96 annotations. SBayesRC improves prediction accuracy by 14% in European ancestry and up to 34% in cross-ancestry prediction compared to the baseline method SBayesR, which does not use annotations, and outperforms other methods, including LDpred2, LDpred-funct, MegaPRS, PolyPred-S and PRS-CSx. Investigation of factors affecting prediction accuracy identifies a significant interaction between SNP density and annotation information, suggesting whole-genome sequence variants with annotations may further improve prediction. Functional partitioning analysis highlights a major contribution of evolutionary constrained regions to prediction accuracy and the largest per-SNP contribution from nonsynonymous SNPs.
Affiliation(s)
- Zhili Zheng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Shouye Liu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Julia Sidorenko
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Ying Wang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Tian Lin
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Loic Yengo
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Patrick Turley
  - Center for Economic and Social Research, University of Southern California, Los Angeles, CA, USA
  - Department of Economics, University of Southern California, Los Angeles, CA, USA
- Alireza Ani
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
  - Department of Bioinformatics, Isfahan University of Medical Sciences, Isfahan, Iran
- Rujia Wang
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Ilja M Nolte
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Harold Snieder
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Jian Yang
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- Naomi R Wray
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
  - Department of Psychiatry, University of Oxford, Oxford, UK
- Michael E Goddard
  - Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
  - Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
- Peter M Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.

5. Zhang T, Zhou G, Klei L, Liu P, Chouldechova A, Zhao H, Roeder K, G'Sell M, Devlin B. Evaluating and improving health equity and fairness of polygenic scores. HGG ADVANCES 2024; 5:100280. [PMID: 38402414 PMCID: PMC10937319 DOI: 10.1016/j.xhgg.2024.100280] [Received: 09/28/2023] [Revised: 02/14/2024] [Accepted: 02/14/2024] [Indexed: 02/26/2024] [Open Access]
Abstract
Polygenic scores (PGSs) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single-nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWASs, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum (JLS). In the simulation settings we explore, JLS provides more accurate PGSs compared to other methods, especially when measured in terms of fairness. In analyses of UK Biobank data, JLS was computationally more efficient but slightly less accurate than a Bayesian comparator, SDPRX. Like all PGS methods, JLS requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how JLS can help mitigate fairness-related harms that might result from the use of PGSs in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWASs for different ancestries, JLS is an effective approach for enhancing portability and reducing predictive bias.
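For readers new to the term, a polygenic score is at its core a weighted sum of effect-allele counts. The sketch below shows only that generic definition; it is not Lassosum, JLS, or SDPRX, and the function name, weights, and genotypes are hypothetical values chosen for illustration.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele count per SNP (0, 1, or 2; imputed values may be fractional).
    weights: per-SNP weight, e.g. an effect estimate derived from GWAS summary statistics.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical weights for four SNPs (not real data). The zero weight mimics
# a SNP that a sparse selection method would leave out of the score.
weights = [0.5, -0.25, 0.0, 1.0]
# Hypothetical effect-allele counts for one individual.
person = [2, 1, 0, 1]

score = polygenic_score(person, weights)
print(score)
```

Methods such as those named in the abstract differ mainly in how the weights are chosen (which SNPs are kept, how strongly effects are shrunk, and which ancestry's LD is used), not in this final summation step.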
Affiliation(s)
- Tianyu Zhang
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
- Geyu Zhou
  - Department of Biostatistics, Yale University, New Haven, CT 06511, USA
- Lambertus Klei
  - Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Peng Liu
  - Merck Research Laboratories, Merck & Co., Inc., Rahway, NJ 07065, USA
- Alexandra Chouldechova
  - Microsoft Research NYC, New York, NY 10012, USA; Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Hongyu Zhao
  - Department of Biostatistics, Yale University, New Haven, CT 06511, USA
- Kathryn Roeder
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Max G'Sell
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Bernie Devlin
  - Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA.

6. Timmins IR, Dudbridge F. Bayesian approach to assessing population differences in genetic risk of disease with application to prostate cancer. PLoS Genet 2024; 20:e1011212. [PMID: 38630784 PMCID: PMC11023298 DOI: 10.1371/journal.pgen.1011212] [Received: 12/12/2022] [Accepted: 03/07/2024] [Indexed: 04/19/2024] [Open Access]
Abstract
Population differences in risk of disease are common, but the potential genetic basis for these differences is not well understood. A standard approach is to compare genetic risk across populations by testing for mean differences in polygenic scores, but existing studies that use this approach do not account for statistical noise in effect estimates (i.e., the GWAS betas) that arise due to the finite sample size of GWAS training data. Here, we show using Bayesian polygenic score methods that the level of uncertainty in estimates of genetic risk differences across populations is highly dependent on the GWAS training sample size, the polygenicity (number of causal variants), and genetic distance (FST) between the populations considered. We derive a Wald test for formally assessing the difference in genetic risk across populations, which we show to have calibrated type 1 error rates under a simplified assumption that all SNPs are independent, which we achieve in practise using linkage disequilibrium (LD) pruning. We further provide closed-form expressions for assessing the uncertainty in estimates of relative genetic risk across populations under the special case of an infinitesimal genetic architecture. We suggest that for many complex traits and diseases, particularly those with more polygenic architectures, current GWAS sample sizes are insufficient to detect moderate differences in genetic risk across populations, though more substantial differences in relative genetic risk (relative risk > 1.5) can be detected. We show that conventional approaches that do not account for sampling error from the training sample, such as using a simple t-test, have very high type 1 error rates. When applying our approach to prostate cancer, we demonstrate a higher genetic risk in African Ancestry men, with lower risk in men of European followed by East Asian ancestry.
Affiliation(s)
- Iain R. Timmins
  - Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom
  - Division of Genetics and Epidemiology, The Institute of Cancer Research, London, United Kingdom
  - Statistical Innovation, AstraZeneca, Cambridge, United Kingdom
- Frank Dudbridge
  - Department of Population Health Sciences, University of Leicester, Leicester, United Kingdom

7. Sun C, Cheng X, Xu J, Chen H, Tao J, Dong Y, Wei S, Chen R, Meng X, Ma Y, Tian H, Guo X, Bi S, Zhang C, Kang J, Zhang M, Lv H, Shang Z, Lv W, Zhang R, Jiang Y. A review of disease risk prediction methods and applications in the omics era. Proteomics 2024:e2300359. [PMID: 38522029 DOI: 10.1002/pmic.202300359] [Received: 09/15/2023] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 03/25/2024]
Abstract
Risk prediction and disease prevention are the innovative care challenges of the 21st century. Apart from freeing the individual from the pain of disease, it will lead to low medical costs for society. Until very recently, risk assessments have ushered in a new era with the emergence of omics technologies, including genomics, transcriptomics, epigenomics, proteomics, and so on, which potentially advance the ability of biomarkers to aid prediction models. While risk prediction has achieved great success, there are still some challenges and limitations. We reviewed the general process of omics-based disease risk model construction and the applications in four typical diseases. Meanwhile, we highlighted the problems in current studies and explored the potential opportunities and challenges for future clinical practice.
Affiliation(s)
- Chen Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Xiangshu Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Jing Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Haiyan Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Junxian Tao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Yu Dong
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Siyu Wei
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Rui Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xin Meng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yingnan Ma
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Hongsheng Tian
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xuying Guo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Shuo Bi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chen Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Jingxuan Kang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Zhenwei Shang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Wenhua Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ruijie Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongshuai Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China

8. Hu J, Ye Y, Zhou G, Zhao H. Using clinical and genetic risk factors for risk prediction of 8 cancers in the UK Biobank. JNCI Cancer Spectr 2024; 8:pkae008. [PMID: 38366150 PMCID: PMC10919929 DOI: 10.1093/jncics/pkae008] [Received: 11/15/2023] [Revised: 01/04/2024] [Accepted: 02/08/2024] [Indexed: 02/18/2024] [Open Access]
Abstract
BACKGROUND Models with polygenic risk scores and clinical factors to predict risk of different cancers have been developed, but these models have been limited by the polygenic risk score-derivation methods and the incomplete selection of clinical variables. METHODS We used UK Biobank to train the best polygenic risk scores for 8 cancers (bladder, breast, colorectal, kidney, lung, ovarian, pancreatic, and prostate cancers) and select relevant clinical variables from 733 baseline traits through extreme gradient boosting (XGBoost). Combining polygenic risk scores and clinical variables, we developed Cox proportional hazards models for risk prediction in these cancers. RESULTS Our models achieved high prediction accuracy for 8 cancers, with areas under the curve ranging from 0.618 (95% confidence interval = 0.581 to 0.655) for ovarian cancer to 0.831 (95% confidence interval = 0.817 to 0.845) for lung cancer. Additionally, our models could identify individuals at a high risk for developing cancer. For example, the risk of breast cancer for individuals in the top 5% score quantile was nearly 13 times greater than for individuals in the lowest 10%. Furthermore, we observed a higher proportion of individuals with high polygenic risk scores in the early-onset group but a higher proportion of individuals at high clinical risk in the late-onset group. CONCLUSION Our models demonstrated the potential to predict cancer risk and identify high-risk individuals with great generalizability to different cancers. Our findings suggested that the polygenic risk score model is more predictive for the cancer risk of early-onset patients than for late-onset patients, while the clinical risk model is more predictive for late-onset patients. Meanwhile, combining polygenic risk scores and clinical risk factors has overall better predictive performance than using polygenic risk scores or clinical risk factors alone.
Affiliation(s)
- Jiaqi Hu
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, USA
- Yixuan Ye
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Geyu Zhou
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Hongyu Zhao
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA

9. Zhuang Y, Kim NY, Fritsche LG, Mukherjee B, Lee S. Incorporating functional annotation with bilevel continuous shrinkage for polygenic risk prediction. BMC Bioinformatics 2024; 25:65. [PMID: 38336614 DOI: 10.1186/s12859-024-05664-2] [Received: 03/31/2023] [Accepted: 01/19/2024] [Indexed: 02/12/2024] [Open Access]
Abstract
BACKGROUND Genetic variants can contribute differently to trait heritability by their functional categories, and recent studies have shown that incorporating functional annotation can improve the predictive performance of polygenic risk scores (PRSs). In addition, when only a small proportion of variants are causal variants, PRS methods that employ a Bayesian framework with shrinkage can account for such sparsity. It is possible that the annotation group level effect is also sparse. However, the number of PRS methods that incorporate both annotation information and shrinkage on effect sizes is limited. We propose a PRS method, PRSbils, which utilizes the functional annotation information with a bilevel continuous shrinkage prior to accommodate the varying genetic architectures both on the variant-specific level and on the functional annotation level. RESULTS We conducted simulation studies and investigated the predictive performance in settings with different genetic architectures. Results indicated that when there was a relatively large variability of group-wise heritability contribution, the gain in prediction performance from the proposed method was on average 8.0% higher AUC compared to the benchmark method PRS-CS. The proposed method also yielded higher predictive performance compared to PRS-CS in settings with different overlapping patterns of annotation groups and obtained on average 6.4% higher AUC. We applied PRSbils to binary and quantitative traits in three real world data sources (the UK Biobank, the Michigan Genomics Initiative (MGI), and the Korean Genome and Epidemiology Study (KoGES)), and two sources of annotations: ANNOVAR, and pathway information from the Kyoto Encyclopedia of Genes and Genomes (KEGG), and demonstrated that the proposed method holds the potential for improving predictive performance by incorporating functional annotations. 
CONCLUSIONS By utilizing a bilevel shrinkage framework, PRSbils enables the incorporation of both overlapping and non-overlapping annotations into PRS construction to improve the performance of genetic risk prediction. The software is available at https://github.com/styvon/PRSbils .
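For orientation, the quantities compared in this abstract can be sketched in a few lines: a polygenic risk score is a weighted sum of allele dosages, and AUC is the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below is illustrative only. It does not implement the bilevel shrinkage prior of PRSbils or PRS-CS; the function names, weights, genotypes, and labels are all made-up assumptions.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of allele dosages (0, 1 or 2) for one person."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data: four people, three variants. The weights are hypothetical
# placeholders, not effect sizes estimated by any published method.
weights = [0.2, -0.1, 0.4]
genotypes = [[0, 1, 2], [2, 0, 1], [1, 1, 0], [0, 0, 0]]
labels = [1, 1, 0, 0]  # 1 = case, 0 = control

scores = [polygenic_score(g, weights) for g in genotypes]
toy_auc = auc(scores, labels)
```

In practice, methods such as those named in the abstract differ mainly in how the per-variant weights are estimated; the scoring and evaluation steps stay close to this form.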
Affiliation(s)
| | - Na Yeon Kim
- Seoul National University, Seoul, Republic of Korea
| | | | | | - Seunggeun Lee
- Seoul National University, Seoul, Republic of Korea.
|
10
|
Feng X, Liu S, Li K, Bu F, Yuan H. NCAD v1.0: a database for non-coding variant annotation and interpretation. J Genet Genomics 2024; 51:230-242. [PMID: 38142743 DOI: 10.1016/j.jgg.2023.12.005] [Received: 08/30/2023] [Revised: 12/15/2023] [Accepted: 12/18/2023] [Indexed: 12/26/2023]
Abstract
The application of whole genome sequencing is expanding in clinical diagnostics across various genetic disorders, and the significance of non-coding variants in penetrant diseases is increasingly being demonstrated. Therefore, it is urgent to improve the diagnostic yield by exploring the pathogenic mechanisms of variants in non-coding regions. However, the interpretation of non-coding variants remains a significant challenge, due to the complex functional regulatory mechanisms of non-coding regions and the current limitations of available databases and tools. Hence, we develop the non-coding variant annotation database (NCAD, http://www.ncawdb.net/), encompassing comprehensive insights into 665,679,194 variants, regulatory elements, and element interaction details. Integrating data from 96 sources, spanning both GRCh37 and GRCh38 versions, NCAD v1.0 provides vital information to support the genetic diagnosis of non-coding variants, including allele frequencies of 12 diverse populations, with a particular focus on the population frequency information for 230,235,698 variants in 20,964 Chinese individuals. Moreover, it offers prediction scores for variant functionality, five categories of regulatory elements, and four types of non-coding RNAs. With its rich data and comprehensive coverage, NCAD serves as a valuable platform, empowering researchers and clinicians with profound insights into non-coding regulatory mechanisms while facilitating the interpretation of non-coding variants.
Affiliation(s)
- Xiaoshu Feng
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Sihan Liu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Ke Li
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China
| | - Fengxiao Bu
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
| | - Huijun Yuan
- Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan 610044, China.
|
11
|
Ye Y, Hu J, Pang F, Cui C, Zhao H. Genomic risk prediction of cardiovascular diseases among type 2 diabetes patients in the UK Biobank. Frontiers in Bioinformatics 2024; 3:1320748. [PMID: 38239805 PMCID: PMC10794561 DOI: 10.3389/fbinf.2023.1320748] [Received: 10/14/2023] [Accepted: 12/11/2023] [Indexed: 01/22/2024] Open
Abstract
Background: Polygenic risk score (PRS) has proved useful in predicting the risk of cardiovascular diseases (CVD) based on the genotypes of an individual, but most analyses have focused on disease onset in the general population. The usefulness of PRS to predict CVD risk among type 2 diabetes (T2D) patients remains unclear. Methods: We built a meta-PRSCVD upon the candidate PRSs developed from state-of-the-art PRS methods for three CVD subtypes of significant importance: coronary artery disease (CAD), ischemic stroke (IS), and heart failure (HF). To evaluate the prediction performance of the meta-PRSCVD, we restricted our analysis to 21,092 white British T2D patients in the UK Biobank, among which 4,015 had CVD events. Results: Results showed that the meta-PRSCVD was significantly associated with CVD risk with a hazard ratio per standard deviation increase of 1.28 (95% CI: 1.23-1.33). The meta-PRSCVD alone predicted the CVD incidence with an area under the receiver operating characteristic curve (AUC) of 0.57 (95% CI: 0.54-0.59). When restricted to the early-onset patients (onset age ≤ 55), the AUC was further increased to 0.61 (95% CI 0.56-0.67). Conclusion: Our results highlight the potential role of genomic screening for secondary preventions of CVD among T2D patients, especially among early-onset patients.
Affiliation(s)
- Yixuan Ye
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
| | - Jiaqi Hu
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, United States
| | - Fuyuan Pang
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States
- Department of Biostatistics, Shanghai Jiao Tong University, Shanghai, China
| | - Can Cui
- Department of Immunobiology, Yale University School of Medicine, New Haven, CT, United States
| | - Hongyu Zhao
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States
|
12
|
Jowell AR, Bhattacharya R, Marnell C, Wong M, Haidermota S, Trinder M, Fahed AC, Peloso GM, Honigberg MC, Natarajan P. Genetic and clinical factors underlying a self-reported family history of heart disease. Eur J Prev Cardiol 2023; 30:1571-1579. [PMID: 37011137 PMCID: PMC10545808 DOI: 10.1093/eurjpc/zwad096] [Received: 01/25/2023] [Revised: 03/16/2023] [Accepted: 03/24/2023] [Indexed: 04/05/2023]
Abstract
AIMS To estimate how much information conveyed by self-reported family history of heart disease (FHHD) is already explained by clinical and genetic risk factors. METHODS AND RESULTS Cross-sectional analysis of UK Biobank participants without pre-existing coronary artery disease using a multivariable model with self-reported FHHD as the outcome. Clinical (diabetes, hypertension, smoking, apolipoprotein B-to-apolipoprotein AI ratio, waist-to-hip ratio, high sensitivity C-reactive protein, lipoprotein(a), triglycerides) and genetic risk factors (polygenic risk score for coronary artery disease [PRSCAD], heterozygous familial hypercholesterolemia [HeFH]) were exposures. Models were adjusted for age, sex, and cholesterol-lowering medication use. Multiple logistic regression models were fitted to associate FHHD with risk factors, with continuous variables treated as quintiles. Population attributable risks (PAR) were subsequently calculated from the resultant odds ratios. Among 166 714 individuals, 72 052 (43.2%) participants reported an FHHD. In a multivariable model, genetic risk factors PRSCAD (OR 1.30, CI 1.27-1.33) and HeFH (OR 1.31, CI 1.11-1.54) were most strongly associated with FHHD. Clinical risk factors followed: hypertension (OR 1.18, CI 1.15-1.21), lipoprotein(a) (OR 1.17, CI 1.14-1.20), apolipoprotein B-to-apolipoprotein AI ratio (OR 1.13, CI 1.10-1.16), and triglycerides (OR 1.07, CI 1.04-1.10). For the PAR analyses: 21.9% (CI 18.19-25.63) of the risk of reporting an FHHD is attributed to clinical factors, 22.2% (CI 20.44-23.88) is attributed to genetic factors, and 36.0% (CI 33.31-38.68) is attributed to genetic and clinical factors combined. CONCLUSIONS A combined model of clinical and genetic risk factors explains only 36% of the likelihood of FHHD, implying additional value in the family history.
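The abstract says population attributable risks were calculated from odds ratios but does not state which formula was used. As a rough illustration only, the sketch below applies Levin's formula, one common choice, and treats the odds ratio as a stand-in for the relative risk. That substitution is an assumption that holds best for rare outcomes, which FHHD (43.2% of participants) is not, so the output should not be read as a reproduction of the paper's figures. The function name and the 20% exposure prevalence are hypothetical.

```python
def attributable_fraction(exposure_prevalence, odds_ratio):
    """Levin's population attributable fraction.

    PAF = p * (RR - 1) / (1 + p * (RR - 1)), with the odds ratio used
    in place of the relative risk (an approximation, see the lead-in).
    """
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must be between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    excess = exposure_prevalence * (odds_ratio - 1.0)
    return excess / (1.0 + excess)


# Hypothetical example: a risk factor carried by 20% of the population
# (for instance, one quintile) with an odds ratio of 1.30.
example_paf = attributable_fraction(0.20, 1.30)
```

An odds ratio of exactly 1 gives a fraction of 0, which is a quick sanity check on any implementation of the formula.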
Affiliation(s)
- Amanda R Jowell
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
| | - Romit Bhattacharya
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
| | - Christopher Marnell
- Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Division of Cardiology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY 10029, USA
| | - Megan Wong
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
| | - Sara Haidermota
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
| | - Mark Trinder
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
- Department of Medicine, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
| | - Akl C Fahed
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
| | - Gina M Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02115, USA
| | - Michael C Honigberg
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
| | - Pradeep Natarajan
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Cardiovascular Research Center, Massachusetts General Hospital, 185 Cambridge Street Suite 320, Boston, MA 02114, USA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Merkin Building, 415 Main Street, Cambridge, MA 02142, USA
|
13
|
Xu C, Ganesh SK, Zhou X. mtPGS: Leverage multiple correlated traits for accurate polygenic score construction. Am J Hum Genet 2023; 110:1673-1689. [PMID: 37716346 PMCID: PMC10577082 DOI: 10.1016/j.ajhg.2023.08.016] [Received: 04/04/2023] [Revised: 08/18/2023] [Accepted: 08/27/2023] [Indexed: 09/18/2023] Open
Abstract
Accurate polygenic scores (PGSs) facilitate the genetic prediction of complex traits and aid in the development of personalized medicine. Here, we develop a statistical method called multi-trait assisted PGS (mtPGS), which can construct accurate PGSs for a target trait of interest by leveraging multiple traits relevant to the target trait. Specifically, mtPGS borrows SNP effect size similarity information between the target trait and its relevant traits to improve the effect size estimation on the target trait, thus achieving accurate PGSs. In the process, mtPGS flexibly models the shared genetic architecture between the target and the relevant traits to achieve robust performance, while explicitly accounting for the environmental covariance among them to accommodate different study designs with various sample overlap patterns. In addition, mtPGS uses only summary statistics as input and relies on a deterministic algorithm with several algebraic techniques for scalable computation. We evaluate the performance of mtPGS through comprehensive simulations and applications to 25 traits in the UK Biobank, where in the real data mtPGS achieves an average of 0.90%-52.91% accuracy gain compared to the state-of-the-art PGS methods. Overall, mtPGS represents an accurate, fast, and robust solution for PGS construction in biobank-scale datasets.
Affiliation(s)
- Chang Xu
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
| | - Santhi K Ganesh
- Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA.
|
14
|
Tao LR, Ye Y, Zhao H. Early breast cancer risk detection: a novel framework leveraging polygenic risk scores and machine learning. J Med Genet 2023; 60:960-964. [PMID: 37055164 DOI: 10.1136/jmg-2022-108582] [Received: 10/01/2022] [Accepted: 03/27/2023] [Indexed: 04/15/2023]
Abstract
BACKGROUND Breast cancer (BC) is the most common cancer and the second leading cause of cancer death in women; an estimated one in eight women in the USA will develop BC during her lifetime. However, current methods of BC screening, including clinical breast exams, mammograms, biopsies and others, are often underused due to limited access, expense and a lack of risk awareness, causing 30% (up to 80% in low-income and middle-income countries) of patients with BC to miss the precious early detection phase. METHODS This study creates a key step to supplement the current BC diagnostic pipeline: a prescreening platform, prior to traditional detection and diagnostic steps. We have developed BREast CAncer Risk Detection Application (BRECARDA), a novel framework that personalises BC risk assessment using artificial intelligence neural networks to incorporate relevant genetic and non-genetic risk factors. A polygenic risk score (PRS) was enhanced by employing AnnoPred and validated by fivefold cross-validation, outperforming three existing state-of-the-art PRS methods. RESULTS We used data from 97 597 female participants of the UK Biobank to train our algorithm. Using the enhanced PRS thus trained together with non-genetic information, BRECARDA was evaluated in a testing dataset with 48 074 UK Biobank female participants and achieved a high accuracy of 94.28% and area under the curve of 0.7861. Our optimised AnnoPred outperformed other state-of-the-art methods on quantifying genetic risk, indicating its potential for supplementing the current BC detection tests, population screening and risk evaluation. CONCLUSION BRECARDA can enhance disease risk prediction, identify high-risk individuals for BC screening, facilitate disease diagnosis and improve population-level screening efficiency. It can serve as a valuable and supplemental platform to assist doctors in BC diagnosis and evaluation.
Affiliation(s)
- Lynn Rose Tao
- Thomas Jefferson High School for Science and Technology, Alexandria, Virginia, USA
| | - Yixuan Ye
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Hongyu Zhao
- Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT, USA
|
15
|
Zhang T, Klei L, Liu P, Chouldechova A, Roeder K, G'Sell M, Devlin B. Evaluating and Improving Health Equity and Fairness of Polygenic Scores. bioRxiv 2023:2023.09.22.559051. [PMID: 37790341 PMCID: PMC10542523 DOI: 10.1101/2023.09.22.559051] [Indexed: 10/05/2023]
Abstract
Polygenic scores (PGS) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWAS, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum. In the simulation settings we explore, Joint-Lassosum provides more accurate PGS compared with other methods, especially when measured in terms of fairness. Like all PGS methods, Joint-Lassosum requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how Joint-Lassosum can help mitigate fairness-related harms that might result from the use of PGS scores in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWAS for different ancestries, Joint-Lassosum is an effective approach for enhancing portability and reducing predictive bias.
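The "model selection advantages of Lasso" mentioned in this abstract come from the way an L1 penalty sets small effects exactly to zero. A minimal sketch of that mechanism is the soft-thresholding operator, which is the lasso solution for a single predictor when predictors are uncorrelated. This is not the Lassosum or Joint-Lassosum algorithm, both of which account for linkage disequilibrium between variants; the effect sizes and penalty below are made-up values.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, returning 0.0 inside [-penalty, penalty]."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


# Hypothetical marginal effect estimates for four variants.
marginal_effects = [0.30, -0.05, 0.12, -0.40]
penalty = 0.10  # tuning parameter; chosen arbitrarily here

shrunk = [soft_threshold(b, penalty) for b in marginal_effects]
# Indices of variants that survive selection (non-zero after shrinkage).
selected = [i for i, b in enumerate(shrunk) if b != 0.0]
```

Raising the penalty removes more variants from the score, which is why the choice of tuning parameters discussed in the abstract matters for the final predictor set.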
|
16
|
Cho SMJ, Koyama S, Honigberg MC, Surakka I, Haidermota S, Ganesh S, Patel AP, Bhattacharya R, Lee H, Kim HC, Natarajan P. Genetic, sociodemographic, lifestyle, and clinical risk factors of recurrent coronary artery disease events: a population-based cohort study. Eur Heart J 2023; 44:3456-3465. [PMID: 37350734 PMCID: PMC10516626 DOI: 10.1093/eurheartj/ehad380] [Received: 11/04/2022] [Revised: 05/07/2023] [Accepted: 05/25/2023] [Indexed: 06/24/2023] Open
Abstract
AIMS Complications of coronary artery disease (CAD) represent the leading cause of death among adults globally. This study examined the associations and clinical utilities of genetic, sociodemographic, lifestyle, and clinical risk factors on CAD recurrence. METHODS AND RESULTS Data were from 7024 UK Biobank middle-aged adults with established CAD at enrolment. Cox proportional hazards regressions modelled associations of age at enrolment, age at first CAD diagnosis, sex, cigarette smoking, physical activity, diet, sleep, Townsend Deprivation Index, body mass index, blood pressure, blood lipids, glucose, lipoprotein(a), C-reactive protein, estimated glomerular filtration rate (eGFR), statin prescription, and CAD polygenic risk score (PRS) with first post-enrolment CAD recurrence. Over a median [interquartile range] follow-up of 11.6 [7.2-12.7] years, 2003 (28.5%) recurrent CAD events occurred. The hazard ratio (95% confidence interval [CI]) for CAD recurrence was the most pronounced with current smoking (1.35, 1.13-1.61) and per standard deviation increase in age at first CAD (0.74, 0.67-0.82). Additionally, age at enrolment, CAD PRS, C-reactive protein, lipoprotein(a), glucose, low-density lipoprotein cholesterol, deprivation, sleep quality, eGFR, and high-density lipoprotein (HDL) cholesterol were significantly associated with recurrence risk. Based on C indices (95% CI), the strongest predictors were CAD PRS (0.58, 0.57-0.59), HDL cholesterol (0.57, 0.57-0.58), and age at initial CAD event (0.57, 0.56-0.57). In addition to traditional risk factors, a comprehensive model improved the C index from 0.644 (0.632-0.654) to 0.676 (0.667-0.686). CONCLUSION Sociodemographic, clinical, and laboratory factors are each associated with CAD recurrence, with genetic risk, age at first CAD event, and HDL cholesterol concentration explaining the most.
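The C indices reported in this abstract measure how often a model ranks pairs of patients in the right order: among comparable pairs, the patient with the earlier event should have the higher predicted risk. A minimal sketch of Harrell's concordance index is given below, assuming right-censored data and a simple quadratic loop. The abstract does not say which estimator or software the authors used, and the follow-up times, event flags, and risk values are invented for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly (risk ties count half).

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: follow-up in years, event flag (1 = recurrence, 0 = censored),
# and a hypothetical predicted risk per person.
times = [2.0, 4.0, 6.0]
events = [1, 1, 0]
risks = [0.9, 0.5, 0.1]
toy_c = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking, which puts the reported single-predictor values of 0.57 to 0.58 in context.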
Affiliation(s)
- So Mi Jemma Cho
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
- Integrative Research Center for Cerebrovascular and Cardiovascular Diseases, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Satoshi Koyama
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
| | - Michael C Honigberg
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, 55 Fruit St., Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck St., Boston, MA 02114, USA
| | - Ida Surakka
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Division of Cardiology, Department of Internal Medicine, University of Michigan, 1500 E Medical Center Dr., Ann Arbor, MI 48109, USA
| | - Sara Haidermota
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
| | - Shriienidhie Ganesh
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
| | - Aniruddh P Patel
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, 55 Fruit St., Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck St., Boston, MA 02114, USA
| | - Romit Bhattacharya
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, 55 Fruit St., Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck St., Boston, MA 02114, USA
| | - Hokyou Lee
- Department of Preventive Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Hyeon Chang Kim
- Integrative Research Center for Cerebrovascular and Cardiovascular Diseases, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
- Department of Preventive Medicine, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
- Institute for Innovation in Digital Healthcare, Yonsei University Health System, 50-1 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
| | - Pradeep Natarajan
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, 415 Main St., Cambridge, MA 02142, USA
- Cardiovascular Research Center and Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge St., Boston, MA 02114, USA
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, 55 Fruit St., Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck St., Boston, MA 02114, USA
|
17
|
He Q, Keding TJ, Zhang Q, Miao J, Russell JD, Herringa RJ, Lu Q, Travers BG, Li JJ. Neurogenetic mechanisms of risk for ADHD: Examining associations of polygenic scores and brain volumes in a population cohort. J Neurodev Disord 2023; 15:30. [PMID: 37653373 PMCID: PMC10469494 DOI: 10.1186/s11689-023-09498-6] [Received: 12/12/2022] [Accepted: 08/21/2023] [Indexed: 09/02/2023] Open
Abstract
BACKGROUND ADHD polygenic scores (PGSs) have been previously shown to predict ADHD outcomes in several studies. However, ADHD PGSs are typically correlated with ADHD but not necessarily reflective of causal mechanisms. More research is needed to elucidate the neurobiological mechanisms underlying ADHD. We leveraged functional annotation information into an ADHD PGS to (1) improve the prediction performance over a non-annotated ADHD PGS and (2) test whether volumetric variation in brain regions putatively associated with ADHD mediates the association between PGSs and ADHD outcomes. METHODS Data were from the Philadelphia Neurodevelopmental Cohort (N = 555). Multiple mediation models were tested to examine the indirect effects of two ADHD PGSs, one using a traditional computation involving clumping and thresholding and another using a functionally annotated approach (i.e., AnnoPred), on ADHD inattention (IA) and hyperactivity-impulsivity (HI) symptoms, via gray matter volumes in the cingulate gyrus, angular gyrus, caudate, dorsolateral prefrontal cortex (DLPFC), and inferior temporal lobe. RESULTS A direct effect was detected between the AnnoPred ADHD PGS and IA symptoms in adolescents. No indirect effects via brain volumes were detected for either IA or HI symptoms. However, both ADHD PGSs were negatively associated with the DLPFC. CONCLUSIONS The AnnoPred ADHD PGS was a more developmentally specific predictor of adolescent IA symptoms compared to the traditional ADHD PGS. However, brain volumes did not mediate the effects of either a traditional or AnnoPred ADHD PGS on ADHD symptoms, suggesting that we may still be underpowered in clarifying brain-based biomarkers for ADHD using genetic measures.
Affiliation(s)
- Quanfa He
- Department of Psychology, University of Wisconsin-Madison, 1202 W. Johnson Street, Madison, WI, 53706, USA
- Waisman Center, University of Wisconsin-Madison, Madison, USA
| | | | - Qi Zhang
- Department of Educational Psychology, University of Wisconsin-Madison, Madison, USA
| | - Jiacheng Miao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, USA
| | - Justin D Russell
- Department of Psychiatry, School of Medicine and Public Health, University of Wisconsin, Madison, USA
| | - Ryan J Herringa
- Department of Psychiatry, School of Medicine and Public Health, University of Wisconsin, Madison, USA
| | - Qiongshi Lu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, USA
- Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, USA
- Department of Statistics, University of Wisconsin-Madison, Madison, USA
| | - Brittany G Travers
- Waisman Center, University of Wisconsin-Madison, Madison, USA
- Department of Kinesiology, University of Wisconsin-Madison, Madison, USA
| | - James J Li
- Department of Psychology, University of Wisconsin-Madison, 1202 W. Johnson Street, Madison, WI, 53706, USA.
- Waisman Center, University of Wisconsin-Madison, Madison, USA.
- Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, USA.
|
18
|
Zhuang Y, Kim NY, Fritsche LG, Mukherjee B, Lee S. Incorporating functional annotation with bilevel continuous shrinkage for polygenic risk prediction. Research Square 2023:rs.3.rs-2759690. [PMID: 37090583 PMCID: PMC10120759 DOI: 10.21203/rs.3.rs-2759690/v1] [Indexed: 04/25/2023]
Abstract
Background: Genetic variants can contribute differently to trait heritability depending on their functional categories, and recent studies have shown that incorporating functional annotation can improve the predictive performance of polygenic risk scores (PRSs). In addition, when only a small proportion of variants are causal, PRS methods that employ a Bayesian framework with shrinkage can account for such sparsity. It is possible that the annotation group-level effect is also sparse. However, few PRS methods incorporate both annotation information and shrinkage on effect sizes. We propose a PRS method, PRSbils, which uses functional annotation information with a bilevel continuous shrinkage prior to accommodate varying genetic architectures at both the variant-specific level and the functional annotation level.
Results: We conducted simulation studies and investigated predictive performance in settings with different genetic architectures. When there was relatively large variability in group-wise heritability contribution, the proposed method achieved on average 8.0% higher AUC than the benchmark method PRS-CS. The proposed method also yielded higher predictive performance than PRS-CS in settings with different overlapping patterns of annotation groups, obtaining on average 6.4% higher AUC. We applied PRSbils to binary and quantitative traits in three real-world data sources (the UK Biobank, the Michigan Genomics Initiative (MGI), and the Korean Genome and Epidemiology Study (KoGES)) with two sources of annotations (ANNOVAR, and pathway information from the Kyoto Encyclopedia of Genes and Genomes (KEGG)), and demonstrated that the proposed method holds the potential to improve predictive performance by incorporating functional annotations.
Conclusions: By using a bilevel shrinkage framework, PRSbils enables the incorporation of both overlapping and non-overlapping annotations into PRS construction to improve the performance of genetic risk prediction. The software is available at https://github.com/styvon/PRSbils.
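The abstract above compares methods by AUC. As a minimal sketch of what that metric measures (this is not code from PRSbils or PRS-CS), the function below computes AUC as the probability that a randomly chosen case has a higher score than a randomly chosen control. The function name, the toy scores, and the labels are all invented for illustration.

```python
def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical data: 1 = case, 0 = control; two invented score vectors.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.35, 0.5, 0.3, 0.2]
candidate_scores = [0.9, 0.6, 0.35, 0.5, 0.3, 0.2]

auc_baseline = auc(baseline_scores, labels)    # 7 of 9 case-control pairs ranked correctly
auc_candidate = auc(candidate_scores, labels)  # 8 of 9 case-control pairs ranked correctly
```

The pairwise loop is quadratic in sample size and is meant only to make the definition explicit; a rank-based formula would be the usual choice for large cohorts.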
|
19
|
Örd T, Lönnberg T, Nurminen V, Ravindran A, Niskanen H, Kiema M, Õunap K, Maria M, Moreau PR, Mishra PP, Palani S, Virta J, Liljenbäck H, Aavik E, Roivainen A, Ylä-Herttuala S, Laakkonen JP, Lehtimäki T, Kaikkonen MU. Dissecting the polygenic basis of atherosclerosis via disease-associated cell state signatures. Am J Hum Genet 2023; 110:722-740. [PMID: 37060905 PMCID: PMC10183377 DOI: 10.1016/j.ajhg.2023.03.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/22/2022] [Accepted: 03/21/2023] [Indexed: 04/17/2023]
Abstract
Coronary artery disease (CAD) is a pandemic disease where up to half of the risk is explained by genetic factors. Advanced insights into the genetic basis of CAD require deeper understanding of the contributions of different cell types, molecular pathways, and genes to disease heritability. Here, we investigate the biological diversity of atherosclerosis-associated cell states and interrogate their contribution to the genetic risk of CAD by using single-cell and bulk RNA sequencing (RNA-seq) of mouse and human lesions. We identified 12 disease-associated cell states that we characterized further by gene set functional profiling, ligand-receptor prediction, and transcription factor inference. Importantly, Vcam1+ smooth muscle cell state genes contributed most to SNP-based heritability of CAD. In line with this, genetic variants near smooth muscle cell state genes and regulatory elements explained the largest fraction of CAD-risk variance between individuals. Using this information for variant prioritization, we derived a hybrid polygenic risk score (PRS) that demonstrated improved performance over a classical PRS. Our results provide insights into the biological mechanisms associated with CAD risk, which could make a promising contribution to precision medicine and tailored therapeutic interventions in the future.
Affiliation(s)
- Tiit Örd
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland.
| | - Tapio Lönnberg
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland; InFLAMES Research Flagship Center, University of Turku
| | - Valtteri Nurminen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Aarthi Ravindran
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Henri Niskanen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Miika Kiema
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Kadri Õunap
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Maleeha Maria
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Pierre R Moreau
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Pashupati P Mishra
- Department of Clinical Chemistry, Fimlab Laboratories and Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33100 Tampere, Finland
| | - Senthil Palani
- Turku PET Centre, University of Turku, Kiinamyllynkatu 4-8, 20520 Turku, Finland
| | - Jenni Virta
- Turku PET Centre, University of Turku, Kiinamyllynkatu 4-8, 20520 Turku, Finland
| | - Heidi Liljenbäck
- Turku PET Centre, University of Turku, Kiinamyllynkatu 4-8, 20520 Turku, Finland; Turku Center for Disease Modeling, University of Turku, 20520 Turku, Finland
| | - Einari Aavik
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Anne Roivainen
- Turku PET Centre, University of Turku, Kiinamyllynkatu 4-8, 20520 Turku, Finland; Turku Center for Disease Modeling, University of Turku, 20520 Turku, Finland; Turku PET Centre, Turku University Hospital, 20520 Turku, Finland
| | - Seppo Ylä-Herttuala
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Johanna P Laakkonen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
| | - Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories and Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33100 Tampere, Finland
| | - Minna U Kaikkonen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland.
|
20
|
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 02/10/2023] [Indexed: 06/06/2023]
Abstract
Polygenic Risk Score (PRS) analysis is a method that predicts the genetic risk of an individual towards targeted traits. Even when there are no significant markers, it gives evidence of a genetic effect beyond the results of Genome-Wide Association Studies (GWAS). Moreover, it selects single nucleotide polymorphisms (SNPs) that contribute to the disease with low effect size, making it more precise at individual-level risk prediction. PRS analysis addresses the shortfall of GWAS by taking into account the SNPs/alleles that have low effect sizes but play an indispensable role in the observed phenotypic/trait variance. PRS analysis has applications that investigate the genetic basis of several traits, including rare diseases. However, the accuracy of PRS analysis depends on the genomic data of the underlying population. For instance, several studies show that obtaining high prediction power from PRS analysis is challenging for non-Europeans. In this manuscript, we review the conventional PRS methods and their application to sub-Saharan African communities. We conclude that the lack of sufficient GWAS data and tools is the limiting factor in applying PRS analysis to sub-Saharan populations. We recommend developing Africa-specific PRS methods and tools for estimating and analyzing African population data for clinical evaluation of PRSs of interest and predicting rare diseases.
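To make concrete how many low-effect SNPs are aggregated into one score, here is a minimal sketch of the conventional weighted-sum PRS: each SNP's effect-allele dosage (0, 1, or 2) is multiplied by its GWAS effect size and the products are summed. The SNP identifiers, weights, and genotypes are hypothetical placeholders, and steps that real pipelines apply first (clumping, thresholding, or shrinkage of weights) are deliberately left out.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages; SNPs without a weight are skipped."""
    return sum(weights[snp] * dose for snp, dose in dosages.items() if snp in weights)


# Hypothetical GWAS effect sizes (e.g. log odds ratios) keyed by made-up SNP IDs.
weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}

# Hypothetical individual: effect-allele counts per SNP; "rs_d" has no weight.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0, "rs_d": 1}

score = polygenic_risk_score(person, weights)  # 2*0.12 + 1*(-0.05) + 0*0.30
```

Because the weights come from a GWAS in some reference population, the same function applied to individuals of a different ancestry may predict less well, which is the limitation the abstract raises for sub-Saharan African cohorts.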
Affiliation(s)
- Yagoub Adam
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
| | - Suraju Sadeeq
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
| | - Judit Kumuthini
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
| | - Olabode Ajayi
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
| | - Gordon Wells
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
| | - Rotimi Solomon
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
| | - Olubanke Ogunlana
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
| | - Emmanuel Adetiba
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Electrical & Information Engineering (EIE), Covenant University, Ota, Ogun State, 112212, Nigeria
- HRA, Institute for Systems Science, Durban University of Technology, Durban, South Africa
| | - Emeka Iweala
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
| | - Benedikt Brors
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Ezekiel Adebiyi
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
|
21
|
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 02/10/2023] [Indexed: 11/23/2023]
|
22
|
Zhou X, Chen Y, Ip FCF, Jiang Y, Cao H, Lv G, Zhong H, Chen J, Ye T, Chen Y, Zhang Y, Ma S, Lo RMN, Tong EPS, Mok VCT, Kwok TCY, Guo Q, Mok KY, Shoai M, Hardy J, Chen L, Fu AKY, Ip NY. Deep learning-based polygenic risk analysis for Alzheimer's disease prediction. Communications Medicine 2023; 3:49. [PMID: 37024668 PMCID: PMC10079691 DOI: 10.1038/s43856-023-00269-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/19/2021] [Accepted: 03/06/2023] [Indexed: 04/08/2023]
Abstract
BACKGROUND The polygenic nature of Alzheimer's disease (AD) suggests that multiple variants jointly contribute to disease susceptibility. As an individual's genetic variants are constant throughout life, evaluating the combined effects of multiple disease-associated genetic risks enables reliable AD risk prediction. Because of the complexity of genomic data, current statistical analyses cannot comprehensively capture the polygenic risk of AD, resulting in unsatisfactory disease risk prediction. However, deep learning methods, which capture nonlinearity within high-dimensional genomic data, may enable more accurate disease risk prediction and improve our understanding of AD etiology. Accordingly, we developed deep learning neural network models for modeling AD polygenic risk. METHODS We constructed neural network models to model AD polygenic risk and compared them with the widely used weighted polygenic risk score and lasso models. We conducted robust linear regression analysis to investigate the relationship between the AD polygenic risk derived from deep learning methods and AD endophenotypes (i.e., plasma biomarkers and individual cognitive performance). We stratified individuals by applying unsupervised clustering to the outputs from the hidden layers of the neural network model. RESULTS The deep learning models outperform other statistical models for modeling AD risk. Moreover, the polygenic risk derived from the deep learning models enables the identification of disease-associated biological pathways and the stratification of individuals according to distinct pathological mechanisms. CONCLUSION Our results suggest that deep learning methods are effective for modeling the genetic risks of AD and other diseases, classifying disease risks, and uncovering disease mechanisms.
Affiliation(s)
- Xiaopu Zhou
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Yu Chen
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
- Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
| | - Fanny C F Ip
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Yuanbing Jiang
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
| | - Han Cao
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Ge Lv
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Huan Zhong
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
| | - Jiahang Chen
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Tao Ye
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
- Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
| | - Yuewen Chen
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
- Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
| | - Yulin Zhang
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Shuangshuang Ma
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Ronnie M N Lo
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Estella P S Tong
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Vincent C T Mok
- Gerald Choa Neuroscience Centre, Lui Che Woo Institute of Innovative Medicine, Therese Pei Fong Chow Research Centre for Prevention of Dementia, Division of Neurology, Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Timothy C Y Kwok
- Therese Pei Fong Chow Research Centre for Prevention of Dementia, Division of Geriatrics, Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Qihao Guo
- Department of Gerontology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
| | - Kin Y Mok
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
| | - Maryam Shoai
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
| | - John Hardy
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
- HKUST Jockey Club Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Lei Chen
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Amy K Y Fu
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Nancy Y Ip
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China.
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China.
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China.
|
23
|
Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics. Nat Commun 2023; 14:832. [PMID: 36788230 PMCID: PMC9929290 DOI: 10.1038/s41467-023-36544-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 06/01/2022] [Accepted: 02/07/2023] [Indexed: 02/16/2023]
Abstract
Polygenic risk scores (PRS) calculated from genome-wide association studies (GWAS) of Europeans are known to have substantially reduced predictive accuracy in non-European populations, limiting their clinical utility and raising concerns about health disparities across ancestral populations. Here, we introduce a statistical framework named X-Wing to improve predictive performance in ancestrally diverse populations. X-Wing quantifies local genetic correlations for complex traits between populations, employs an annotation-dependent estimation procedure to amplify correlated genetic effects between populations, and combines multiple population-specific PRS into a unified score with GWAS summary statistics alone as input. Through extensive benchmarking, we demonstrate that X-Wing pinpoints portable genetic effects and substantially improves PRS performance in non-European populations, showing 14.1%-119.1% relative gain in predictive R² compared to state-of-the-art methods based on GWAS summary statistics. Overall, X-Wing addresses critical limitations in existing approaches and may have broad applications in cross-population polygenic risk prediction.
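The headline figure above is a relative gain in predictive R², which is easy to confuse with an absolute difference. The short sketch below shows the arithmetic under the usual reading (improvement divided by the baseline value); the function name and the two R² values are invented for illustration and are not taken from the X-Wing benchmarks.

```python
def relative_gain(r2_new, r2_baseline):
    """Relative improvement of r2_new over r2_baseline, as a fraction of the baseline."""
    if r2_baseline <= 0:
        raise ValueError("baseline R^2 must be positive")
    return (r2_new - r2_baseline) / r2_baseline


# Hypothetical values: the baseline explains 4% of variance, the new score 5%.
gain = relative_gain(0.05, 0.04)  # about 0.25, i.e. a 25% relative gain
absolute_difference = 0.05 - 0.04  # only about 0.01 in absolute R^2
```

A large relative gain can therefore correspond to a small absolute change when the baseline R² is low, which is common for PRS in under-represented populations.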
|
24
|
Momin MM, Shin J, Lee S, Truong B, Benyamin B, Lee SH. A method for an unbiased estimate of cross-ancestry genetic correlation using individual-level data. Nat Commun 2023; 14:722. [PMID: 36759513 PMCID: PMC9911789 DOI: 10.1038/s41467-023-36281-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/29/2021] [Accepted: 01/24/2023] [Indexed: 02/11/2023]
Abstract
Cross-ancestry genetic correlation is an important parameter for understanding the genetic relationship between two ancestry groups. However, existing methods cannot properly account for ancestry-specific genetic architecture, which is diverse across ancestries, producing biased estimates of cross-ancestry genetic correlation. Here, we present a method to construct a genomic relationship matrix (GRM) that can correctly account for the relationship between ancestry-specific allele frequencies and ancestry-specific allelic effects. Through comprehensive simulations, we show that the proposed method outperforms existing methods in estimating SNP-based heritability and cross-ancestry genetic correlation. The proposed method is further applied to anthropometric and other complex traits from the UK Biobank data across ancestry groups. For obesity, the estimated genetic correlation between African and European ancestry cohorts is significantly different from unity, suggesting that obesity is genetically heterogeneous between these two ancestries.
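For readers unfamiliar with GRMs, the sketch below builds the textbook standardized GRM, with one change in the spirit of the abstract: each individual's genotypes are centred and scaled with the allele frequencies of that individual's own ancestry group. This is an illustrative assumption only. It does not reproduce the authors' estimator, which additionally models how allelic effects relate to allele frequencies, and all genotypes, frequencies, and group labels are made up.

```python
import math


def standardize(genotype, freqs):
    """Centre and scale allele counts (0/1/2) using the supplied allele frequencies."""
    return [(g - 2 * p) / math.sqrt(2 * p * (1 - p)) for g, p in zip(genotype, freqs)]


def grm(genotypes, groups, freqs_by_group):
    """GRM entry (i, j) = mean over SNPs of the product of standardized genotypes."""
    z = [standardize(g, freqs_by_group[grp]) for g, grp in zip(genotypes, groups)]
    n_snps = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n_snps for zj in z] for zi in z]


# Hypothetical data: two SNPs, two individuals from different ancestry groups.
freqs_by_group = {"A": [0.5, 0.5], "B": [0.25, 0.5]}
genotypes = [[2, 0], [1, 1]]
groups = ["A", "B"]

K = grm(genotypes, groups, freqs_by_group)
```

Using a single pooled frequency vector for both groups would change both the centring and the scaling, which is one simple way to see why the choice of allele frequencies matters for cross-ancestry estimates.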
Affiliation(s)
- Md Moksedul Momin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
- Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chattogram Veterinary and Animal Sciences University (CVASU), Khulshi, Chattogram, 4225, Bangladesh
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia
| | - Jisu Shin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22908, USA
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
| | - Soohyun Lee
- Division of Animal Breeding and Genetics, National Institute of Animal Science (NIAS), Cheonan, South Korea
| | - Buu Truong
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
| | - Beben Benyamin
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia
| | - S Hong Lee
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, 5000, Australia.
- UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, 5000, Australia.
- South Australian Health and Medical Research Institute (SAHMRI), University of South Australia, Adelaide, SA, 5000, Australia.
|
25
|
Yu X, Xiao J, Cai M, Jiao Y, Wan X, Liu J, Yang C. PALM: a powerful and adaptive latent model for prioritizing risk variants with functional annotations. Bioinformatics 2023; 39:7028484. [PMID: 36744920 PMCID: PMC9950853 DOI: 10.1093/bioinformatics/btad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2022] [Revised: 01/12/2023] [Accepted: 02/03/2023] [Indexed: 02/07/2023]
Abstract
MOTIVATION The findings from genome-wide association studies (GWASs) have greatly helped us to understand the genetic basis of human complex traits and diseases. Despite the tremendous progress, much effort is still needed to address several major challenges arising in GWAS. First, most GWAS hits are located in the non-coding region of the human genome, and thus their biological functions largely remain unknown. Second, due to the polygenicity of human complex traits and diseases, many genetic risk variants with weak or moderate effects have not been identified yet. RESULTS To address the above challenges, we propose a powerful and adaptive latent model (PALM) to integrate cell-type/tissue-specific functional annotations with GWAS summary statistics. Unlike existing methods, which are mainly based on linear models, PALM leverages a tree ensemble to adaptively characterize the non-linear relationship between functional annotations and the association status of genetic variants. To make PALM scalable to millions of variants and hundreds of functional annotations, we develop a functional gradient-based expectation-maximization algorithm to fit the tree-based non-linear model in a stable manner. Through comprehensive simulation studies, we show that PALM not only controls the false discovery rate well, but also improves the statistical power of identifying risk variants. We also apply PALM to integrate summary statistics of 30 GWASs with 127 cell-type/tissue-specific functional annotations. The results indicate that PALM can identify more risk variants as well as rank the importance of functional annotations, yielding better interpretation of GWAS results. AVAILABILITY AND IMPLEMENTATION The source code is available at https://github.com/YangLabHKUST/PALM. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xinyi Yu
- Shenzhen Research Institute of Big Data, Shenzhen 518172, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Jiashun Xiao
- Shenzhen Research Institute of Big Data, Shenzhen 518172, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Mingxuan Cai
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China; Department of Biostatistics, City University of Hong Kong, Hong Kong SAR, China
| | - Yuling Jiao
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
| | - Xiang Wan
- Shenzhen Research Institute of Big Data, Shenzhen 518172, China
| | - Jin Liu
- Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore; School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen 518172, China
| | - Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
|
26
|
Choi SW, García-González J, Ruan Y, Wu HM, Porras C, Johnson J, Hoggart CJ, O'Reilly PF. PRSet: Pathway-based polygenic risk score analyses and software. PLoS Genet 2023; 19:e1010624. [PMID: 36749789 PMCID: PMC9937466 DOI: 10.1371/journal.pgen.1010624] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 04/06/2022] [Revised: 02/17/2023] [Accepted: 01/19/2023] [Indexed: 02/08/2023] Open
Abstract
Polygenic risk scores (PRSs) have been among the leading advances in biomedicine in recent years. As a proxy of genetic liability, PRSs are utilised across multiple fields and applications. While numerous statistical and machine learning methods have been developed to optimise their predictive accuracy, these typically distil genetic liability to a single number based on aggregation of an individual's genome-wide risk alleles. This results in a key loss of information about an individual's genetic profile, which could be critical given the functional sub-structure of the genome and the heterogeneity of complex disease. In this manuscript, we introduce a 'pathway polygenic' paradigm of disease risk, in which multiple genetic liabilities underlie complex diseases, rather than a single genome-wide liability. We describe a method and accompanying software, PRSet, for computing and analysing pathway-based PRSs, in which polygenic scores are calculated across genomic pathways for each individual. We evaluate the potential of pathway PRSs in two distinct ways, creating two major sections: (1) In the first section, we benchmark PRSet as a pathway enrichment tool, evaluating its capacity to capture GWAS signal in pathways. We find that for target sample sizes of >10,000 individuals, pathway PRSs have similar power for evaluating pathway enrichment as leading methods MAGMA and LD score regression, with the distinct advantage of providing individual-level estimates of genetic liability for each pathway, opening up a range of pathway-based PRS applications. (2) In the second section, we evaluate the performance of pathway PRSs for disease stratification. We show that using a supervised disease stratification approach, pathway PRSs (computed by PRSet) outperform two standard genome-wide PRSs (computed by C+T and lassosum) for classifying disease subtypes in 20 of 21 scenarios tested.
As the definition and functional annotation of pathways becomes increasingly refined, we expect pathway PRSs to offer key insights into the heterogeneity of complex disease and treatment response, to generate biologically tractable therapeutic targets from polygenic signal, and, ultimately, to provide a powerful path to precision medicine.
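The contrast the abstract draws between a single genome-wide score and one score per pathway can be sketched in a few lines. This is a minimal illustration only, not PRSet's implementation: the function name `pathway_scores`, the SNP identifiers, the weights, and the pathway memberships are all hypothetical, and real pipelines add steps (such as clumping and competitive null testing) that are omitted here.

```python
from typing import Dict, List


def pathway_scores(
    dosages: Dict[str, int],
    weights: Dict[str, float],
    pathways: Dict[str, List[str]],
) -> Dict[str, float]:
    """Return one score per pathway: the sum of weight * dosage over its SNPs."""
    return {
        name: sum(weights[snp] * dosages[snp] for snp in snps)
        for name, snps in pathways.items()
    }


# Hypothetical toy data: per-allele effect sizes and one person's allele counts.
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125, "rs4": 0.25}
dosages = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}
pathways = {"pathway_A": ["rs1", "rs2"], "pathway_B": ["rs3", "rs4"]}

scores = pathway_scores(dosages, weights, pathways)

# The genome-wide score collapses the same information into a single number.
genome_wide = sum(weights[snp] * dosages[snp] for snp in weights)

print(scores)
print(genome_wide)
```

In this toy case the two pathway scores differ although they sum to the genome-wide value, which is the kind of per-individual detail a single aggregate score discards.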
Affiliation(s)
- Shing Wan Choi
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Judit García-González
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Yunfeng Ruan
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Hei Man Wu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Christian Porras
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Jessica Johnson
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Clive J Hoggart
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
| | - Paul F O'Reilly
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
|
27
|
Conery M, Grant SFA. Human height: a model common complex trait. Ann Hum Biol 2023; 50:258-266. [PMID: 37343163 PMCID: PMC10368389 DOI: 10.1080/03014460.2023.2215546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 04/10/2023] [Accepted: 05/09/2023] [Indexed: 06/23/2023]
Abstract
CONTEXT Like other complex phenotypes, human height reflects a combination of environmental and genetic factors, but is notable for being exceptionally easy to measure. Height has therefore been commonly used to make observations later generalised to other phenotypes, though the appropriateness of such generalisations is not always considered. OBJECTIVES We aimed to assess height's suitability as a model for other complex phenotypes and review recent advances in height genetics with regard to their implications for complex phenotypes more broadly. METHODS We conducted a comprehensive literature search in PubMed and Google Scholar for articles relevant to the genetics of height and its comparability to other phenotypes. RESULTS Height is broadly similar to other phenotypes apart from its high heritability and ease of measurement. Recent genome-wide association studies (GWAS) have identified over 12,000 independent signals associated with height and saturated height's common single nucleotide polymorphism-based heritability within a subset of the genome in individuals similar to European reference populations. CONCLUSIONS Given the similarity of height to other complex traits, the saturation of GWAS's ability to discover additional height-associated variants signals potential limitations to the omnigenic model of complex-phenotype inheritance, indicating the likely future power of polygenic scores and risk scores, and highlights the increasing need for large-scale variant-to-gene mapping efforts.
Affiliation(s)
- Mitchell Conery
- Division of Human Genetics, Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of PA, Philadelphia, PA, USA
- Department of Pharmacology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Struan F A Grant
- Division of Human Genetics, Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of PA, Philadelphia, PA, USA
- Division of Diabetes and Endocrinology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
|
28
|
Abstract
Polygenic scores quantify inherited risk by integrating information from many common sites of DNA variation into a single number. Rapid increases in the scale of genetic association studies and new statistical algorithms have enabled development of polygenic scores that meaningfully measure, as early as birth, risk of coronary artery disease. These newer-generation polygenic scores identify up to 8% of the population with triple the normal risk based on genetic variation alone, and these individuals cannot be identified on the basis of family history or clinical risk factors alone. For those identified with increased genetic risk, evidence supports risk reduction with at least two interventions, adherence to a healthy lifestyle and cholesterol-lowering therapies, that can substantially reduce risk. Alongside considerable enthusiasm for the potential of polygenic risk estimation to enable a new era of preventive clinical medicine is recognition of a need for ongoing research into how best to ensure equitable performance across diverse ancestries, how and in whom to assess the scores in clinical practice, as well as randomized trials to confirm clinical utility.
Affiliation(s)
- Aniruddh P Patel
- Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA; Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA; Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
| | - Amit V Khera
- Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA; Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA; Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA; Verve Therapeutics, Cambridge, Massachusetts, USA
|
29
|
O'Sullivan JW, Ashley EA, Elliott PM. Polygenic risk scores for the prediction of cardiometabolic disease. Eur Heart J 2023; 44:89-99. [PMID: 36478054 DOI: 10.1093/eurheartj/ehac648] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 12/08/2021] [Revised: 08/28/2022] [Accepted: 10/27/2022] [Indexed: 12/12/2022] Open
Abstract
Cardiometabolic diseases contribute more to global morbidity and mortality than any other group of disorders. Polygenic risk scores (PRSs), the weighted summation of individually small-effect genetic variants, represent an advance in our ability to predict the development and complications of cardiometabolic diseases. This article reviews the evidence supporting the use of PRS in seven common cardiometabolic diseases: coronary artery disease (CAD), stroke, hypertension, heart failure and cardiomyopathies, obesity, atrial fibrillation (AF), and type 2 diabetes mellitus (T2DM). Data suggest that PRS for CAD, AF, and T2DM consistently improves prediction when incorporated into existing clinical risk tools. In other areas such as ischaemic stroke and hypertension, clinical application appears premature but emerging evidence suggests that the study of larger and more diverse populations coupled with more granular phenotyping will propel the translation of PRS into practical clinical prediction tools.
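The "weighted summation" named in this abstract can be written out explicitly. The notation is an assumption introduced here for illustration and is not taken from the article: $x_{ij}$ is the number of effect alleles (0, 1, or 2) that individual $j$ carries at variant $i$, $\hat{\beta}_i$ is the per-allele effect estimate for that variant from a GWAS, and $M$ is the number of variants included in the score.

```latex
\mathrm{PRS}_j = \sum_{i=1}^{M} \hat{\beta}_i \, x_{ij}
```

For a binary disease outcome, $\hat{\beta}_i$ is typically a log odds ratio, and the raw score is usually standardized against a reference sample before it is combined with clinical risk factors.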
Affiliation(s)
- Jack W O'Sullivan
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Division of Cardiology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Euan A Ashley
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Division of Cardiology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Perry M Elliott
- UCL Institute of Cardiovascular Science, Gower Street, London WC1E 6BT, UK
- St. Bartholomew's Hospital, W Smithfield, London EC1A 7BE, UK
|
30
|
Zhou H, Arapoglou T, Li X, Li Z, Zheng X, Moore J, Asok A, Kumar S, Blue E, Buyske S, Cox N, Felsenfeld A, Gerstein M, Kenny E, Li B, Matise T, Philippakis A, Rehm HL, Sofia HJ, Snyder G, Weng Z, Neale B, Sunyaev S, Lin X. FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Nucleic Acids Res 2023; 51:D1300-D1311. [PMID: 36350676 PMCID: PMC9825437 DOI: 10.1093/nar/gkac966] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Received: 08/15/2022] [Revised: 09/25/2022] [Accepted: 10/14/2022] [Indexed: 11/11/2022] Open
Abstract
Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes. Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.
Affiliation(s)
- Hufeng Zhou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Theodore Arapoglou
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Xihao Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Zilin Li
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Xiuwen Zheng
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Jill Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | | | - Sushant Kumar
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Princess Margaret Cancer Centre, Toronto, ON, Canada
| | - Elizabeth E Blue
- Division of Medical Genetics, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - Steven Buyske
- Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Nancy Cox
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Eimear Kenny
- Department of Genetics and Genomic Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Bingshan Li
- Department of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Tara Matise
- Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Anthony Philippakis
- Data Science Platform, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Heidi L Rehm
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Heidi J Sofia
- National Human Genome Research Institute, Bethesda, MD, USA
| | - Grace Snyder
- National Human Genome Research Institute, Bethesda, MD, USA
| | | | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Benjamin Neale
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Shamil R Sunyaev
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Statistics, Harvard University, Cambridge, MA, USA
|
31
|
Zhou G, Chen T, Zhao H. SDPRX: A statistical method for cross-population prediction of complex traits. Am J Hum Genet 2023; 110:13-22. [PMID: 36460009 PMCID: PMC9892700 DOI: 10.1016/j.ajhg.2022.11.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/13/2022] [Accepted: 11/08/2022] [Indexed: 12/03/2022] Open
Abstract
Polygenic risk score (PRS) has demonstrated its great utility in biomedical research through identifying high-risk individuals for different diseases from their genotypes. However, the broader application of PRS to the general population is hindered by the limited transferability of PRS developed in Europeans to non-European populations. To improve PRS prediction accuracy in non-European populations, we develop a statistical method called SDPRX that can effectively integrate genome wide association study summary statistics from different populations. SDPRX automatically adjusts for linkage disequilibrium differences between populations and characterizes the joint distribution of the effect sizes of a variant in two populations to be both null, population specific, or shared with correlation. Through simulations and applications to real traits, we show that SDPRX improves the prediction performance over existing methods in non-European populations.
Affiliation(s)
- Geyu Zhou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Tianqi Chen
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Hongyu Zhao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
|
32
|
Panyard DJ, Deming YK, Darst BF, Van Hulle CA, Zetterberg H, Blennow K, Kollmorgen G, Suridjan I, Carlsson CM, Johnson SC, Asthana S, Engelman CD, Lu Q. Liver-Specific Polygenic Risk Score Is Associated with Alzheimer's Disease Diagnosis. J Alzheimers Dis 2023; 92:395-409. [PMID: 36744333 PMCID: PMC10050104 DOI: 10.3233/jad-220599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
BACKGROUND Our understanding of the pathophysiology underlying Alzheimer's disease (AD) has benefited from genomic analyses, including those that leverage polygenic risk score (PRS) models of disease. The use of functional annotation has been able to improve the power of genomic models. OBJECTIVE We sought to leverage genomic functional annotations to build tissue-specific AD PRS models and study their relationship with AD and its biomarkers. METHODS We built 13 tissue-specific AD PRS and studied the scores' relationships with AD diagnosis, cerebrospinal fluid (CSF) amyloid, CSF tau, and other CSF biomarkers in two longitudinal cohort studies of AD. RESULTS The AD PRS model that was most predictive of AD diagnosis (even without APOE) was the liver AD PRS: n = 1,115; odds ratio = 2.15 (1.67-2.78), p = 3.62×10⁻⁹. The liver AD PRS was also statistically significantly associated with cerebrospinal fluid biomarker evidence of amyloid-β (Aβ42:Aβ40 ratio, p = 3.53×10⁻⁶) and the phosphorylated tau:amyloid-β ratio (p = 1.45×10⁻⁵). CONCLUSION These findings provide further evidence of the role of the liver-functional genome in AD and the benefits of incorporating functional annotation into genomic research.
Affiliation(s)
- Daniel J. Panyard
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI 53726, United States of America
| | - Yuetiva K. Deming
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI 53726, United States of America
- Wisconsin Alzheimer’s Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI 53792, United States of America
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI 53705, United States of America
| | - Burcu F. Darst
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States of America
| | - Carol A. Van Hulle
- Wisconsin Alzheimer’s Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI 53792, United States of America
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI 53705, United States of America
| | - Henrik Zetterberg
- Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, the Sahlgrenska Academy at the University of Gothenburg, Mölndal, Sweden
- Clinical Neurochemistry Laboratory, Sahlgrenska University Hospital, Mölndal, Sweden
- Department of Neurodegenerative Disease, UCL Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong, China
| | - Kaj Blennow
- Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, the Sahlgrenska Academy at the University of Gothenburg, Mölndal, Sweden
- Clinical Neurochemistry Laboratory, Sahlgrenska University Hospital, Mölndal, Sweden
| | | | | | - Cynthia M. Carlsson
- Wisconsin Alzheimer’s Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI 53792, United States of America
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI 53705, United States of America
- Wisconsin Alzheimer’s Institute, University of Wisconsin-Madison, 610 Walnut Street, 9th Floor, Madison, WI 53726, United States of America
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI 53705, United States of America
| | - Sterling C. Johnson
- Wisconsin Alzheimer’s Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI 53792, United States of America
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI 53705, United States of America
- Wisconsin Alzheimer’s Institute, University of Wisconsin-Madison, 610 Walnut Street, 9th Floor, Madison, WI 53726, United States of America
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI 53705, United States of America
| | - Sanjay Asthana
- Wisconsin Alzheimer’s Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI 53792, United States of America
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI 53705, United States of America
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI 53705, United States of America
| | - Corinne D. Engelman
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI 53726, United States of America
| | - Qiongshi Lu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, WARF Room 201, 610 Walnut Street, Madison, WI 53726, United States of America
- Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, WI 53706, United States of America
|
33
|
Bogdan R, Hatoum AS, Johnson EC, Agrawal A. The Genetically Informed Neurobiology of Addiction (GINA) model. Nat Rev Neurosci 2023; 24:40-57. [PMID: 36446900 PMCID: PMC10041646 DOI: 10.1038/s41583-022-00656-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Accepted: 10/19/2022] [Indexed: 11/30/2022]
Abstract
Addictions are heritable and unfold dynamically across the lifespan. One prominent neurobiological theory proposes that substance-induced changes in neural circuitry promote the progression of addiction. Genome-wide association studies have begun to characterize the polygenic architecture undergirding addiction liability and revealed that genetic loci associated with risk can be divided into those associated with a general broad-spectrum liability to addiction and those associated with drug-specific addiction risk. In this Perspective, we integrate these genomic findings with our current understanding of the neurobiology of addiction to propose a new Genetically Informed Neurobiology of Addiction (GINA) model.
Affiliation(s)
- Ryan Bogdan
- Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, MO, USA.
| | - Alexander S Hatoum
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
| | - Emma C Johnson
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
| | - Arpana Agrawal
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA.
|
34
|
Prowse-Wilkins CP, Lopdell TJ, Xiang R, Vander Jagt CJ, Littlejohn MD, Chamberlain AJ, Goddard ME. Genetic variation in histone modifications and gene expression identifies regulatory variants in the mammary gland of cattle. BMC Genomics 2022; 23:815. [PMID: 36482302 PMCID: PMC9733386 DOI: 10.1186/s12864-022-09002-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 11/10/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Causal variants for complex traits, such as eQTL are often found in non-coding regions of the genome, where they are hypothesised to influence phenotypes by regulating gene expression. Many regulatory regions are marked by histone modifications, which can be assayed by chromatin immunoprecipitation followed by sequencing (ChIP-seq). Sequence reads from ChIP-seq form peaks at putative regulatory regions, which may reflect the amount of regulatory activity at this region. Therefore, eQTL which are also associated with differences in histone modifications are excellent candidate causal variants. RESULTS We assayed the histone modifications H3K4Me3, H3K4Me1 and H3K27ac and mRNA in the mammary gland of up to 400 animals. We identified QTL for peak height (histone QTL), exon expression (eeQTL), allele specific expression (aseQTL) and allele specific binding (asbQTL). By intersecting these results, we identify variants which may influence gene expression by altering regulatory regions of the genome, and may be causal variants for other traits. Lastly, we find that these variants are found in putative transcription factor binding sites, identifying a mechanism for the effect of many eQTL. CONCLUSIONS We find that allele specific and traditional QTL analysis often identify the same genetic variants and provide evidence that many eQTL are regulatory variants which alter activity at regulatory regions of the bovine genome. Our work provides methodological and biological updates on how regulatory mechanisms interplay at multi-omics levels.
Affiliation(s)
- Claire P Prowse-Wilkins
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3082, Australia.
- Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, Victoria, 3010, Australia.
| | - Thomas J Lopdell
- Research and Development, Livestock Improvement Corporation, Private Bag 3016, Hamilton, 3240, New Zealand
| | - Ruidong Xiang
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3082, Australia
- Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, Victoria, 3010, Australia
| | - Christy J Vander Jagt
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3082, Australia
| | - Mathew D Littlejohn
- Research and Development, Livestock Improvement Corporation, Private Bag 3016, Hamilton, 3240, New Zealand
| | - Amanda J Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3082, Australia
| | - Michael E Goddard
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3082, Australia
- Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, Victoria, 3010, Australia
|
35
|
Ogbunugafor CB, Edge MD. Gattaca as a lens on contemporary genetics: marking 25 years into the film's "not-too-distant" future. Genetics 2022; 222:iyac142. [PMID: 36218390 PMCID: PMC9713434 DOI: 10.1093/genetics/iyac142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022] Open
Abstract
The 1997 film Gattaca has emerged as a canonical pop culture reference used to discuss modern controversies in genetics and bioethics. It appeared in theaters a few years prior to the announcement of the "completion" of the human genome (2000), as the science of human genetics was developing a renewed sense of its social implications. The story is set in a near-future world in which parents can, with technological assistance, influence the genetic composition of their offspring on the basis of predicted life outcomes. The current moment, 25 years after the film's release, offers an opportunity to reflect on where society currently stands with respect to the ideas explored in Gattaca. Here, we review and discuss several active areas of genetic research (genetic prediction, embryo selection, forensic genetics, and others) that interface directly with scenes and concepts in the film. On its silver anniversary, we argue that Gattaca remains an important reflection of society's expectations and fears with respect to the ways that genetic science has manifested in the real world. In accompanying supplemental material, we offer some thought questions to guide group discussions inside and outside of the classroom.
Affiliation(s)
- C Brandon Ogbunugafor
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
- Vermont Complex Systems Center, Burlington, VT 05401, USA
| | - Michael D Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| |
|
36
|
Construction and evaluation of a polygenic hazard score for prognostic assessment in localized gastric cancer. Fundamental Research 2022. [DOI: 10.1016/j.fmre.2022.09.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|
37
|
Ma Y, Patil S, Zhou X, Mukherjee B, Fritsche LG. ExPRSweb: An online repository with polygenic risk scores for common health-related exposures. Am J Hum Genet 2022; 109:1742-1760. [PMID: 36152628 PMCID: PMC9606385 DOI: 10.1016/j.ajhg.2022.09.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/10/2022] [Accepted: 08/31/2022] [Indexed: 01/25/2023] Open
Abstract
Complex traits are influenced by genetic risk factors, lifestyle, and environmental variables, so-called exposures. Some exposures, e.g., smoking or lipid levels, have common genetic modifiers identified in genome-wide association studies. Because measurements are often unfeasible, exposure polygenic risk scores (ExPRSs) offer an alternative to study the influence of exposures on various phenotypes. Here, we collected publicly available summary statistics for 28 exposures and applied four common PRS methods to generate ExPRSs in two large biobanks: the Michigan Genomics Initiative and the UK Biobank. We established ExPRSs for 27 exposures and demonstrated their applicability in phenome-wide association studies and as predictors for common chronic conditions. Especially the addition of multiple ExPRSs showed, for several chronic conditions, an improvement compared to prediction models that only included traditional, disease-focused PRSs. To facilitate follow-up studies, we share all ExPRS constructs and generated results via an online repository called ExPRSweb.
Affiliation(s)
- Ying Ma
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Snehal Patil
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Bhramar Mukherjee
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA; Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Lars G Fritsche
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
38
|
Allegrini AG, Baldwin JR, Barkhuizen W, Pingault JB. Research Review: A guide to computing and implementing polygenic scores in developmental research. J Child Psychol Psychiatry 2022; 63:1111-1124. [PMID: 35354222 PMCID: PMC10108570 DOI: 10.1111/jcpp.13611] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/26/2021] [Revised: 02/28/2022] [Accepted: 03/04/2022] [Indexed: 12/14/2022]
Abstract
The increasing availability of genotype data in longitudinal population- and family-based samples provides opportunities for using polygenic scores (PGS) to study developmental questions in child and adolescent psychology and psychiatry. Here, we aim to provide a comprehensive overview of how PGS can be generated and implemented in developmental psycho(patho)logy, with a focus on longitudinal designs. As such, the paper is organized into three parts: First, we provide a formal definition of polygenic scores and related concepts, focusing on assumptions and limitations. Second, we give a general overview of the methods used to compute polygenic scores, ranging from the classic approach to more advanced methods. We include recommendations and reference resources available to researchers aiming to conduct PGS analyses. Finally, we focus on the practical applications of PGS in the analysis of longitudinal data. We describe how PGS have been used to research developmental outcomes, and how they can be applied to longitudinal data to address developmental questions.
Affiliation(s)
- Andrea G Allegrini
- Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Jessie R Baldwin
- Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Wikus Barkhuizen
- Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK
| | - Jean-Baptiste Pingault
- Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| |
|
39
|
Gao G, Zhao F, Ahearn TU, Lunetta KL, Troester MA, Du Z, Ogundiran TO, Ojengbede O, Blot W, Nathanson KL, Domchek SM, Nemesure B, Hennis A, Ambs S, McClellan J, Nie M, Bertrand K, Zirpoli G, Yao S, Olshan AF, Bensen JT, Bandera EV, Nyante S, Conti DV, Press MF, Ingles SA, John EM, Bernstein L, Hu JJ, Deming-Halverson SL, Chanock SJ, Ziegler RG, Rodriguez-Gil JL, Sucheston-Campbell LE, Sandler DP, Taylor JA, Kitahara CM, O’Brien KM, Bolla MK, Dennis J, Dunning AM, Easton DF, Michailidou K, Pharoah PDP, Wang Q, Figueroa J, Biritwum R, Adjei E, Wiafe S, Ambrosone CB, Zheng W, Olopade OI, García-Closas M, Palmer JR, Haiman CA, Huo D. Polygenic risk scores for prediction of breast cancer risk in women of African ancestry: a cross-ancestry approach. Hum Mol Genet 2022; 31:3133-3143. [PMID: 35554533 PMCID: PMC9476624 DOI: 10.1093/hmg/ddac102] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/09/2021] [Revised: 03/29/2022] [Accepted: 04/26/2022] [Indexed: 11/13/2022] Open
Abstract
Polygenic risk scores (PRSs) are useful for predicting breast cancer risk, but the prediction accuracy of existing PRSs in women of African ancestry (AA) remains relatively low. We aim to develop optimal PRSs for the prediction of overall and estrogen receptor (ER) subtype-specific breast cancer risk in AA women. The AA dataset comprised 9235 cases and 10 184 controls from four genome-wide association study (GWAS) consortia and a GWAS study in Ghana. We randomly divided samples into training and validation sets. We built PRSs using individual-level AA data by a forward stepwise logistic regression and then developed joint PRSs that combined (1) the PRSs built in the AA training dataset and (2) a 313-variant PRS previously developed in women of European ancestry. PRSs were evaluated in the AA validation set. For overall breast cancer, the odds ratio per standard deviation of the joint PRS in the validation set was 1.34 [95% confidence interval (CI): 1.27-1.42] with the area under receiver operating characteristic curve (AUC) of 0.581. Compared with women with average risk (40th-60th PRS percentile), women in the top decile of the PRS had a 1.98-fold increased risk (95% CI: 1.63-2.39). For PRSs of ER-positive and ER-negative breast cancer, the AUCs were 0.608 and 0.576, respectively. Compared with existing methods, the proposed joint PRSs can improve prediction of breast cancer risk in AA women.
Affiliation(s)
- Guimin Gao
- Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
| | - Fangyuan Zhao
- Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
| | - Thomas U Ahearn
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20850, USA
| | - Kathryn L Lunetta
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Melissa A Troester
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Zhaohui Du
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Temidayo O Ogundiran
- Department of Surgery, College of Medicine, University of Ibadan, Ibadan, Nigeria
| | - Oladosu Ojengbede
- Centre for Population & Reproductive Health, College of Medicine, University of Ibadan, Ibadan, Nigeria
| | - William Blot
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - Katherine L Nathanson
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Susan M Domchek
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Barbara Nemesure
- Department of Family, Population and Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA
| | - Anselm Hennis
- Department of Family, Population and Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA
- University of the West Indies, Bridgetown, Barbados
| | - Stefan Ambs
- Laboratory of Human Carcinogenesis, National Cancer Institute, Bethesda, MD 20892, USA
| | - Julian McClellan
- Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
| | - Mark Nie
- Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
| | | | - Gary Zirpoli
- Slone Epidemiology Center, Boston University, Boston, MA 02215, USA
| | - Song Yao
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, USA
| | - Andrew F Olshan
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jeannette T Bensen
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Elisa V Bandera
- Cancer Prevention and Control Program, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08903, USA
| | - Sarah Nyante
- Department of Radiology, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - David V Conti
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Michael F Press
- Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Sue A Ingles
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Esther M John
- Departments of Epidemiology & Population Health and of Medicine (Oncology) and Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA 94304, USA
| | - Leslie Bernstein
- Biomarkers of Early Detection and Prevention, Department of Population Sciences, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, CA 91010, USA
| | - Jennifer J Hu
- Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Sandra L Deming-Halverson
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20850, USA
| | - Regina G Ziegler
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20850, USA
| | - Jorge L Rodriguez-Gil
- Genomics, Development and Disease Section, Genetic Disease Research Branch, National Human Genome Research Institute, Bethesda, MD 20894, USA
| | - Lara E Sucheston-Campbell
- Department of Veterinary Biosciences, College of Veterinary Medicine, The Ohio State University, Columbus, OH 43210, USA
| | - Dale P Sandler
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Jack A Taylor
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Cari M Kitahara
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Katie M O’Brien
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC 27709, USA
| | - Manjeet K Bolla
- Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Joe Dennis
- Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Alison M Dunning
- Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Douglas F Easton
- Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
- Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Kyriaki Michailidou
- Biostatistics Unit, The Cyprus Institute of Neurology & Genetics, Nicosia 2371, Cyprus
| | - Paul D P Pharoah
- Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
- Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Qin Wang
- Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Jonine Figueroa
- Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh Medical School, Edinburgh EH16 5TJ, UK
- Cancer Research UK Edinburgh Centre, Edinburgh EH4 2XR, UK
| | | | | | - Seth Wiafe
- School of Public Health, Loma Linda University, Loma Linda, CA 92350, USA
| | | | - Christine B Ambrosone
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - Olufunmilayo I Olopade
- Center for Clinical Cancer Genetics & Global Health, The University of Chicago, Chicago, IL 60637, USA
| | - Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20850, USA
| | - Julie R Palmer
- Slone Epidemiology Center, Boston University, Boston, MA 02215, USA
| | - Christopher A Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Dezheng Huo
- Department of Public Health Sciences, The University of Chicago, Chicago, IL 60637, USA
- Center for Clinical Cancer Genetics & Global Health, The University of Chicago, Chicago, IL 60637, USA
| |
|
40
|
O'Sullivan JW, Raghavan S, Marquez-Luna C, Luzum JA, Damrauer SM, Ashley EA, O'Donnell CJ, Willer CJ, Natarajan P. Polygenic Risk Scores for Cardiovascular Disease: A Scientific Statement From the American Heart Association. Circulation 2022; 146:e93-e118. [PMID: 35862132 PMCID: PMC9847481 DOI: 10.1161/cir.0000000000001077] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Indexed: 01/23/2023]
Abstract
Cardiovascular disease is the leading contributor to years lost due to disability or premature death among adults. Current efforts focus on risk prediction and risk factor mitigation, which have been recognized for the past half-century. However, despite advances, risk prediction remains imprecise with persistently high rates of incident cardiovascular disease. Genetic characterization has been proposed as an approach to enable earlier and potentially tailored prevention. Rare mendelian pathogenic variants predisposing to cardiometabolic conditions have long been known to contribute to disease risk in some families. However, twin and familial aggregation studies imply that diverse cardiovascular conditions are heritable in the general population. Significant technological and methodological advances since the Human Genome Project are facilitating population-based comprehensive genetic profiling at decreasing costs. Genome-wide association studies from such endeavors continue to elucidate causal mechanisms for cardiovascular diseases. Systematic cataloging for cardiovascular risk alleles also enabled the development of polygenic risk scores. Genetic profiling is becoming widespread in large-scale research, including in health care-associated biobanks, randomized controlled trials, and direct-to-consumer profiling in tens of millions of people. Thus, individuals and their physicians are increasingly presented with polygenic risk scores for cardiovascular conditions in clinical encounters. In this scientific statement, we review the contemporary science, clinical considerations, and future challenges for polygenic risk scores for cardiovascular diseases. We selected 5 cardiometabolic diseases (coronary artery disease, hypercholesterolemia, type 2 diabetes, atrial fibrillation, and venous thromboembolic disease) and response to drug therapy and offer provisional guidance to health care professionals, researchers, policymakers, and patients.
|
41
|
Wang Y, Tsuo K, Kanai M, Neale BM, Martin AR. Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annu Rev Biomed Data Sci 2022; 5:293-320. [PMID: 35576555 PMCID: PMC9828290 DOI: 10.1146/annurev-biodatasci-111721-074830] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Indexed: 01/11/2023]
Abstract
Polygenic risk scores (PRS) estimate an individual's genetic likelihood of complex traits and diseases by aggregating information across multiple genetic variants identified from genome-wide association studies. PRS can predict a broad spectrum of diseases and have therefore been widely used in research settings. Some work has investigated their potential applications as biomarkers in preventative medicine, but significant work is still needed to definitively establish and communicate absolute risk to patients for genetic and modifiable risk factors across demographic groups. However, the biggest limitation of PRS currently is that they show poor generalizability across diverse ancestries and cohorts. Major efforts are underway through methodological development and data generation initiatives to improve their generalizability. This review aims to comprehensively discuss current progress on the development of PRS, the factors that affect their generalizability, and promising areas for improving their accuracy, portability, and implementation.
Affiliation(s)
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA; Biological and Biomedical Sciences, Harvard Medical School, Boston, Massachusetts, USA
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA; Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Benjamin M. Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| |
|
42
|
Abstract
Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated [Formula: see text]. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average [Formula: see text] value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.
|
43
|
Novel functional genomics approaches bridging neuroscience and psychiatry. Biological Psychiatry Global Open Science 2022. [PMID: 37519472 PMCID: PMC10382709 DOI: 10.1016/j.bpsgos.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Abstract
The possibility of establishing a metric of individual genetic risk for a particular disease or trait has sparked the interest of the clinical and research communities, with many groups developing and validating genomic profiling methodologies for their potential application in clinical care. Current approaches for calculating genetic risk to specific psychiatric conditions consist of aggregating genome-wide association studies-derived estimates into polygenic risk scores, which broadly represent the number of inherited risk alleles for an individual. While the traditional approach for polygenic risk score calculation aggregates estimates of gene-disease associations, novel alternative approaches have started to consider functional molecular phenotypes that are closer to genetic variation and are less penalized by the multiple testing required in genome-wide association studies. Moving the focus from genotype-disease to genotype-gene regulation frameworks, these novel approaches incorporate prior knowledge regarding biological processes involved in disease and aggregate estimates for the association of genotypes and phenotypes using multi-omics data modalities. In this review, we discuss and list different functional genomics tools that can be used and integrated to inform researchers and clinicians for a better understanding and diagnosis of psychopathology. We suggest that these novel approaches can help generate biologically driven hypotheses for polygenic signals that can ultimately serve the clinical community as potential biomarkers of psychiatric disease susceptibility.
|
44
|
Dey KK, Gazal S, van de Geijn B, Kim SS, Nasser J, Engreitz JM, Price AL. SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell Genomics 2022; 2:100145. [PMID: 35873673 PMCID: PMC9306342 DOI: 10.1016/j.xgen.2022.100145] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 12/11/2022]
Abstract
We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.
Affiliation(s)
- Kushal K. Dey
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Corresponding author
| | - Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Genentech, South San Francisco, CA 94080, USA
| | - Samuel Sungil Kim
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Joseph Nasser
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jesse M. Engreitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- BASE Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford University School of Medicine, Stanford, CA 94304, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| |
|
45
|
Xiao J, Cai M, Yu X, Hu X, Chen G, Wan X, Yang C. Leveraging the local genetic structure for trans-ancestry association mapping. Am J Hum Genet 2022; 109:1317-1337. [PMID: 35714612 PMCID: PMC9300880 DOI: 10.1016/j.ajhg.2022.05.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/30/2022] [Accepted: 05/23/2022] [Indexed: 01/09/2023] Open
Abstract
Over the past two decades, genome-wide association studies (GWASs) have successfully advanced our understanding of the genetic basis of complex traits. Despite the fruitful discovery of GWASs, most GWAS samples are collected from European populations, and these GWASs are often criticized for their lack of ancestry diversity. Trans-ancestry association mapping (TRAM) offers an exciting opportunity to fill the gap of disparities in genetic studies between non-Europeans and Europeans. Here, we propose a statistical method, LOG-TRAM, to leverage the local genetic architecture for TRAM. By using biobank-scale datasets, we showed that LOG-TRAM can greatly improve the statistical power of identifying risk variants in under-represented populations while producing well-calibrated p values. We applied LOG-TRAM to the GWAS summary statistics of various complex traits/diseases from BioBank Japan, UK Biobank, and African populations. We obtained substantial gains in power and achieved effective correction of confounding biases in TRAM. Finally, we showed that LOG-TRAM can be successfully applied to identify ancestry-specific loci and the LOG-TRAM output can be further used for construction of more accurate polygenic risk scores in under-represented populations.
Affiliation(s)
- Jiashun Xiao
- Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Mingxuan Cai
- Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Xinyi Yu
- Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Xianghong Hu
- Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| | - Gang Chen
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Xiang Wan
- Shenzhen Research Institute of Big Data, Shenzhen 518172, China; Pazhou Lab, Guangzhou 510330, China
| | - Can Yang
- Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China
| |
|
46
|
Lai D, Schwantes-An TH, Abreu M, Chan G, Hesselbrock V, Kamarajan C, Liu Y, Meyers JL, Nurnberger JI, Plawecki MH, Wetherill L, Schuckit M, Zhang P, Edenberg HJ, Porjesz B, Agrawal A, Foroud T. Gene-based polygenic risk scores analysis of alcohol use disorder in African Americans. Transl Psychiatry 2022; 12:266. [PMID: 35790736 PMCID: PMC9256707 DOI: 10.1038/s41398-022-02029-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/19/2022] [Revised: 06/13/2022] [Accepted: 06/16/2022] [Indexed: 11/09/2022] Open
Abstract
Genome-wide association studies (GWAS) in admixed populations such as African Americans (AA) have limited sample sizes, resulting in poor performance of polygenic risk scores (PRS). Based on the observations that many disease-causing genes are shared between AA and European ancestry (EA) populations, and some disease-causing variants are located within the boundaries of these genes, we proposed a novel gene-based PRS framework (PRSgene) by using variants located within disease-associated genes. Using the AA GWAS of alcohol use disorder (AUD) from the Million Veteran Program and the EA GWAS of problematic alcohol use as the discovery GWAS, we identified 858 variants from 410 genes that were AUD-related in both AA and EA. PRSgene calculated using these variants were significantly associated with AUD in three AA target datasets (P-values ranged from 7.61E-05 to 6.27E-03; Betas ranged from 0.15 to 0.21) and outperformed PRS calculated using all variants (P-values ranged from 7.28E-03 to 0.16; Betas ranged from 0.06 to 0.18). PRSgene were also associated with AUD in an EA target dataset (P-value = 0.02, Beta = 0.11). In AA, individuals in the highest PRSgene decile had an odds ratio of 1.76 (95% CI: 1.32-2.34) to develop AUD compared to those in the lowest decile. The 410 genes were enriched in 54 Gene Ontology biological processes, including ethanol oxidation and processes involving the synaptic system, which are known to be AUD-related. In addition, 26 genes were targets of drugs used to treat AUD or other diseases that might be considered for repurposing to treat AUD. Our study demonstrated that the gene-based PRS had improved performance in evaluating AUD risk in AA and provided new insight into AUD genetics.
Affiliation(s)
- Dongbing Lai
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Tae-Hwi Schwantes-An
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Marco Abreu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Grace Chan
- Department of Psychiatry, University of Connecticut School of Medicine, Farmington, CT, USA
- Department of Psychiatry, University of Iowa, Carver College of Medicine, Iowa City, IA, USA
| | - Victor Hesselbrock
- Department of Psychiatry, University of Connecticut School of Medicine, Farmington, CT, USA
| | - Chella Kamarajan
- Henri Begleiter Neurodynamics Lab, Department of Psychiatry, State University of New York, Downstate Medical Center, Brooklyn, NY, USA
| | - Yunlong Liu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Jacquelyn L Meyers
- Henri Begleiter Neurodynamics Lab, Department of Psychiatry, State University of New York, Downstate Medical Center, Brooklyn, NY, USA
| | - John I Nurnberger
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Martin H Plawecki
- Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Leah Wetherill
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Marc Schuckit
- Department of Psychiatry, University of California, San Diego Medical School, San Diego, CA, USA
| | - Pengyue Zhang
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Howard J Edenberg
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Bernice Porjesz
- Henri Begleiter Neurodynamics Lab, Department of Psychiatry, State University of New York, Downstate Medical Center, Brooklyn, NY, USA
| | - Arpana Agrawal
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
| | - Tatiana Foroud
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| |
|
47
|
Long E, Patel H, Byun J, Amos CI, Choi J. Functional studies of lung cancer GWAS beyond association. Hum Mol Genet 2022; 31:R22-R36. [PMID: 35776125 DOI: 10.1093/hmg/ddac140] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/31/2022] [Revised: 06/01/2022] [Accepted: 06/16/2022] [Indexed: 11/14/2022] Open
Abstract
Fourteen years after the first genome-wide association study (GWAS) of lung cancer was published, approximately forty-five genomic loci have now been significantly associated with lung cancer risk. While functional characterization was performed for several of these loci, a comprehensive summary of current molecular understanding of lung cancer risk has been lacking. Further, many novel computational and experimental tools have now become available to accelerate the functional assessment of disease-associated variants, moving beyond locus-by-locus approaches. In this review, we first highlight the heterogeneity of lung cancer GWAS findings across histological subtypes, ancestries, and smoking status, which poses unique challenges to follow-up studies. We then summarize the published lung cancer post-GWAS studies for each risk-associated locus to assess the current understanding of biological mechanisms beyond the initial statistical association. We further summarize strategies for GWAS functional follow-up studies considering cutting-edge functional genomics tools and providing a catalog of available resources relevant to lung cancer. Overall, we aim to highlight the importance of integrating computational and experimental approaches to draw biological insights from the lung cancer GWAS results beyond association.
Affiliation(s)
- Erping Long
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Harsh Patel
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Jinyoung Byun
- Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, 77030, USA; Section of Epidemiology and Population Sciences, Department of Medicine, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Christopher I Amos
- Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, 77030, USA; Section of Epidemiology and Population Sciences, Department of Medicine, Baylor College of Medicine, Houston, TX, 77030, USA; Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Jiyeon Choi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
48
|
Khunsriraksakul C, Markus H, Olsen NJ, Carrel L, Jiang B, Liu DJ. Construction and Application of Polygenic Risk Scores in Autoimmune Diseases. Front Immunol 2022; 13:889296. [PMID: 35833142 PMCID: PMC9271862 DOI: 10.3389/fimmu.2022.889296] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/04/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with autoimmune diseases and provided unique mechanistic insights and informed novel treatments. These individual genetic variants on their own typically confer a small effect of disease risk with limited predictive power; however, when aggregated (e.g., via polygenic risk score method), they could provide meaningful risk predictions for a myriad of diseases. In this review, we describe the recent advances in GWAS for autoimmune diseases and the practical application of this knowledge to predict an individual’s susceptibility/severity for autoimmune diseases such as systemic lupus erythematosus (SLE) via the polygenic risk score method. We provide an overview of methods for deriving different polygenic risk scores and discuss the strategies to integrate additional information from correlated traits and diverse ancestries. We further advocate for the need to integrate clinical features (e.g., anti-nuclear antibody status) with genetic profiling to better identify patients at high risk of disease susceptibility/severity even before clinical signs or symptoms develop. We conclude by discussing future challenges and opportunities of applying polygenic risk score methods in clinical care.
Affiliation(s)
- Chachrit Khunsriraksakul
- Graduate Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Havell Markus
- Graduate Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Nancy J. Olsen
- Department of Medicine, Division of Rheumatology, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Laura Carrel
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Bibo Jiang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, United States
| | - Dajiang J. Liu
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, United States
- *Correspondence: Dajiang J. Liu
| |
|
49
|
Yang H, Ting X, Geng YH, Xie Y, Nierenberg JL, Huo YF, Zhou YT, Huang Y, Yu YQ, Yu XY, Li XF, Ziv E, Zhang H, Fang WG, Shen Y, Tian XX. The risk variant rs11836367 contributes to breast cancer onset and metastasis by attenuating Wnt signaling via regulating NTN4 expression. SCIENCE ADVANCES 2022; 8:eabn3509. [PMID: 35687692 PMCID: PMC9187238 DOI: 10.1126/sciadv.abn3509] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/23/2021] [Accepted: 04/27/2022] [Indexed: 06/15/2023]
Abstract
Most genome-wide association study (GWAS)-identified breast cancer-associated causal variants remain uncharacterized. To provide a framework for linking GWAS-identified variants to function, we performed a comprehensive study of noncoding regulatory variants at the NTN4 locus (12q22) and of the NTN4 gene in breast cancer etiology. We find that rs11836367 is the more likely causal variant, disrupting enhancer activity in both enhancer reporter assays and endogenous genome editing experiments. The protective T allele of rs11836367 increases the binding of GATA3 to the distal enhancer and up-regulates NTN4 expression. In addition, we demonstrate that loss of the NTN4 gene in mice leads to earlier tumor onset, progression, and metastasis. We discover that NTN4, as a tumor suppressor, can attenuate the Wnt signaling pathway by directly binding to Wnt ligands. Our findings bridge the gaps among breast cancer-associated single-nucleotide polymorphisms, transcriptional regulation of NTN4, and breast cancer biology, which provides previously unidentified insights into breast cancer prediction and prevention.
Affiliation(s)
- Han Yang
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Xia Ting
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yue-Hang Geng
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yuntao Xie
- Breast Center, Peking University School of Oncology, Beijing Cancer Hospital and Institute, Beijing 100142, China
| | - Jovia L. Nierenberg
- Department of Epidemiology and Biostatistics, University of California, San Francisco School of Medicine, San Francisco, CA, USA
| | - Yan-Fei Huo
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yan-Ting Zhou
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yang Huang
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yu-Qing Yu
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Xin-Yao Yu
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Xiao-Fei Li
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Elad Ziv
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Division of General Internal Medicine, Department of Medicine, and Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
| | - Hongquan Zhang
- Department of Anatomy, Histology and Embryology, Peking University Health Science Center, Beijing 100191, China
| | - Wei-Gang Fang
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| | - Yin Shen
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Xin-Xia Tian
- Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), School of Basic Medical Sciences, Peking University Third Hospital, Peking University Health Science Center, Beijing 100191, China
| |
|
50
|
van den Berg FF, Issa Y, Vreijling JP, Lerch MM, Weiss FU, Besselink MG, Baas F, Boermeester MA, van Santvoort HC. Whole-exome Sequencing Identifies SLC52A1 and ZNF106 Variants as Novel Genetic Risk Factors for (Early) Multiple-organ Failure in Acute Pancreatitis. Ann Surg 2022; 275:e781-e788. [PMID: 33427755 DOI: 10.1097/sla.0000000000004312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The aim of this study was to identify genetic variants associated with early multiple organ failure (MOF) in acute pancreatitis. SUMMARY BACKGROUND DATA MOF is a life-threatening complication of acute pancreatitis, and risk factors are largely unknown, especially in early persistent MOF. Genetic risk factors are thought to enhance severity in complex diseases such as acute pancreatitis. METHODS A 2-phase study design was conducted. First, we exome sequenced 9 acute pancreatitis patients with early persistent MOF and 9 case-matched patients with mild edematous pancreatitis (phenotypic extremes) from our initial Dutch cohort of 387 patients. Secondly, 48 candidate variants that were overrepresented in MOF patients and 10 additional variants known from literature were genotyped in a replication cohort of 286 Dutch and German patients. RESULTS Exome sequencing resulted in 161,696 genetic variants, of which the 38,333 non-synonymous variants were selected for downstream analyses. Of these, 153 variants were overrepresented in patients with multiple-organ failure, as compared with patients with mild acute pancreatitis. In total, 58 candidate variants were genotyped in the joined Dutch and German replication cohort. We found the rs12440118 variant of ZNF106 to be overrepresented in patients with MOF (minor allele frequency 20.4% vs 11.6%, Padj=0.026). Additionally, SLC52A1 rs346821 was found to be overrepresented (minor allele frequency 48.0% vs 42.4%, Padj=0.003) in early MOF. None of the variants known from literature were associated. CONCLUSIONS This study indicates that SLC52A1, a riboflavin plasma membrane transporter, and ZNF106, a zinc finger protein, may be involved in disease progression toward (early) MOF in acute pancreatitis.
Affiliation(s)
- Fons F van den Berg
- Department of Surgery, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Yama Issa
- Department of Surgery, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Jeroen P Vreijling
- Department of Medicine A, University Medicine Greifswald, Greifswald, Germany
| | - Markus M Lerch
- Departments of Clinical Chemistry, Genetics and Pediatrics, Amsterdam Gastroenterology & Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Frank Ulrich Weiss
- Departments of Clinical Chemistry, Genetics and Pediatrics, Amsterdam Gastroenterology & Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Marc G Besselink
- Department of Surgery, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Frank Baas
- Department of Medicine A, University Medicine Greifswald, Greifswald, Germany
| | - Marja A Boermeester
- Department of Surgery, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Hjalmar C van Santvoort
- Department of Surgery, University Medical Center, Utrecht, The Netherlands; Department of Surgery, St. Antonius Hospital, Nieuwegein, The Netherlands
| |
|