1
Kember RL, Davis CN, Feuer KL, Kranzler HR. Considerations for the application of polygenic scores to clinical care of individuals with substance use disorders. J Clin Invest 2024; 134:e172882. [PMID: 39403926 PMCID: PMC11473164 DOI: 10.1172/jci172882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024] Open
Abstract
Substance use disorders (SUDs) are highly prevalent and associated with excess morbidity, mortality, and economic costs. Thus, there is considerable interest in the early identification of individuals who may be more susceptible to developing SUDs and in improving personalized treatment decisions for those who have SUDs. SUDs are known to be influenced by both genetic and environmental factors. Polygenic scores (PGSs) provide a single measure of genetic liability that could be used as a biomarker in predicting disease development, progression, and treatment response. Although PGSs are rapidly being integrated into clinical practice, there is little information to guide clinicians in their responsible use and interpretation. In this Review, we discuss the potential benefits and pitfalls of the use of PGSs in the clinical care of SUDs, highlighting current research. We also provide suggestions for important considerations prior to implementing the clinical use of PGSs and recommend future directions for research.
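As a rough illustration of what "a single measure of genetic liability" means in practice, a polygenic score is commonly computed as a weighted sum of an individual's effect-allele counts. The sketch below assumes that standard additive form; the function name, weights, and dosages are hypothetical and are not taken from the review.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
weights = [0.10, -0.05, 0.20]
# Hypothetical effect-allele counts for one individual at the same variants.
dosages = [2, 1, 0]

score = polygenic_score(dosages, weights)
print(round(score, 4))
```

A raw score like this is only interpretable relative to a reference distribution, which is one reason the review stresses careful interpretation before clinical use.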
2
Zhao Z, Gruenloh T, Yan M, Wu Y, Sun Z, Miao J, Wu Y, Song J, Lu Q. Optimizing and benchmarking polygenic risk scores with GWAS summary statistics. Genome Biol 2024; 25:260. [PMID: 39379999 PMCID: PMC11462675 DOI: 10.1186/s13059-024-03400-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/11/2023] [Accepted: 09/23/2024] [Indexed: 10/10/2024] Open
Abstract
BACKGROUND Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks including model fine-tuning, benchmarking, and ensemble learning. RESULTS We introduce an innovative statistical framework to optimize and benchmark PRS models using summary statistics of genome-wide association studies. This framework builds upon our previous work and can fine-tune virtually all existing PRS models while accounting for linkage disequilibrium. In addition, we provide an ensemble learning strategy named PUMAS-ensemble to combine multiple PRS models into an ensemble score without requiring external data for model fitting. Through extensive simulations and analysis of many complex traits in the UK Biobank, we demonstrate that this approach closely approximates gold-standard analytical strategies based on external validation, and substantially outperforms state-of-the-art PRS methods. CONCLUSIONS Our method is a powerful and general modeling technique that can continue to combine the best-performing PRS methods out there through ensemble learning and could become an integral component for all future PRS applications.
Affiliation(s)
- Zijie Zhao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Tim Gruenloh
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Meiyi Yan
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Yixuan Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Zhongxuan Sun
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Jiacheng Miao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Yuchang Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI, USA
- Jie Song
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI, USA
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
  - Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI, USA
3
Thompson DJ, Wells D, Selzam S, Peneva I, Moore R, Sharp K, Tarran WA, Beard EJ, Riveros-Mckay F, Giner-Delgado C, Palmer D, Seth P, Harrison J, Futema M, McVean G, Plagnol V, Donnelly P, Weale ME. A systematic evaluation of the performance and properties of the UK Biobank Polygenic Risk Score (PRS) Release. PLoS One 2024; 19:e0307270. [PMID: 39292644 PMCID: PMC11410272 DOI: 10.1371/journal.pone.0307270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 07/01/2024] [Indexed: 09/20/2024] Open
Abstract
We assess the UK Biobank (UKB) Polygenic Risk Score (PRS) Release, a set of PRSs for 28 diseases and 25 quantitative traits that has been made available on the individuals in UKB, using a unified pipeline for PRS evaluation. We also release a benchmarking software tool to enable like-for-like performance evaluation for different PRSs for the same disease or trait. Extensive benchmarking shows the PRSs in the UKB Release to outperform a broad set of 76 published PRSs. For many of the diseases and traits we also validate the PRS algorithms in a separate cohort (100,000 Genomes Project). The availability of PRSs for 53 traits on the same set of individuals also allows a systematic assessment of their properties, and the increased power of these PRSs increases the evidence for their potential clinical benefit.
Affiliation(s)
- Marta Futema
  - Cardiology Research Centre, Molecular and Clinical Sciences Research Institute, St George's University of London, London, United Kingdom
  - Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London, United Kingdom
4
Vilkaite G, Vogel J, Mattsson-Carlgren N. Integrating amyloid and tau imaging with proteomics and genomics in Alzheimer's disease. Cell Rep Med 2024; 5:101735. [PMID: 39293391 DOI: 10.1016/j.xcrm.2024.101735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 07/28/2024] [Accepted: 08/20/2024] [Indexed: 09/20/2024]
Abstract
Alzheimer's disease (AD) is the most common neurodegenerative disease and is characterized by the aggregation of β-amyloid (Aβ) and tau in the brain. Breakthroughs in disease-modifying treatments targeting Aβ bring new hope for the management of AD. But to effectively modify and someday even prevent AD, a better understanding is needed of the biological mechanisms that underlie and link Aβ and tau in AD. Developments of high-throughput omics, including genomics, proteomics, and transcriptomics, together with molecular imaging of Aβ and tau with positron emission tomography (PET), allow us to discover and understand the biological pathways that regulate the aggregation and spread of Aβ and tau in living humans. The field of integrated omics and PET studies of Aβ and tau in AD is growing rapidly. We here provide an update of this field, both in terms of biological insights and in terms of future clinical implications of integrated omics-molecular imaging studies.
Affiliation(s)
- Gabriele Vilkaite
  - Department of Clinical Sciences Malmö, SciLifeLab, Lund University, Lund, Sweden
- Jacob Vogel
  - Department of Clinical Sciences Malmö, SciLifeLab, Lund University, Lund, Sweden
- Niklas Mattsson-Carlgren
  - Clinical Memory Research Unit, Department of Clinical Sciences Malmö, Lund University, Lund, Sweden; Department of Neurology, Skåne University Hospital, Lund University, Lund, Sweden; Wallenberg Center for Molecular Medicine, Lund University, Lund, Sweden
5
Sun C, Cheng X, Xu J, Chen H, Tao J, Dong Y, Wei S, Chen R, Meng X, Ma Y, Tian H, Guo X, Bi S, Zhang C, Kang J, Zhang M, Lv H, Shang Z, Lv W, Zhang R, Jiang Y. A review of disease risk prediction methods and applications in the omics era. Proteomics 2024; 24:e2300359. [PMID: 38522029 DOI: 10.1002/pmic.202300359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 03/25/2024]
Abstract
Risk prediction and disease prevention are the innovative care challenges of the 21st century. Apart from freeing the individual from the pain of disease, it will lead to low medical costs for society. Until very recently, risk assessments have ushered in a new era with the emergence of omics technologies, including genomics, transcriptomics, epigenomics, proteomics, and so on, which potentially advance the ability of biomarkers to aid prediction models. While risk prediction has achieved great success, there are still some challenges and limitations. We reviewed the general process of omics-based disease risk model construction and the applications in four typical diseases. Meanwhile, we highlighted the problems in current studies and explored the potential opportunities and challenges for future clinical practice.
Affiliation(s)
- Chen Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Xiangshu Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Jing Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Haiyan Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Junxian Tao
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Yu Dong
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Siyu Wei
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Rui Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xin Meng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yingnan Ma
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
- Hongsheng Tian
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xuying Guo
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Shuo Bi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chen Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Jingxuan Kang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Mingming Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Hongchao Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Zhenwei Shang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Wenhua Lv
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ruijie Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yongshuai Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
  - The EWAS Project, Harbin, China
6
Singh S, Stocco G, Theken KN, Dickson A, Feng Q, Karnes JH, Mosley JD, El Rouby N. Pharmacogenomics polygenic risk score: Ready or not for prime time? Clin Transl Sci 2024; 17:e13893. [PMID: 39078255 PMCID: PMC11287822 DOI: 10.1111/cts.13893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2024] [Revised: 06/11/2024] [Accepted: 06/25/2024] [Indexed: 07/31/2024] Open
Abstract
Pharmacogenomic Polygenic Risk Scores (PRS) have emerged as a tool to address the polygenic nature of pharmacogenetic phenotypes, increasing the potential to predict drug response. Most pharmacogenomic PRS have been extrapolated from disease-associated variants identified by genome wide association studies (GWAS), although some have begun to utilize genetic variants from pharmacogenomic GWAS. As pharmacogenomic PRS hold the promise of enabling precision medicine, including stratified treatment approaches, it is important to assess the opportunities and challenges presented by the current data. This assessment will help determine how pharmacogenomic PRS can be advanced and transitioned into clinical use. In this review, we present a summary of recent evidence, evaluate the current status, and identify several challenges that have impeded the progress of pharmacogenomic PRS. These challenges include the reliance on extrapolations from disease genetics and limitations inherent to pharmacogenomics research such as low sample sizes, phenotyping inconsistencies, among others. We finally propose recommendations to overcome the challenges and facilitate the clinical implementation. These recommendations include standardizing methodologies for phenotyping, enhancing collaborative efforts, developing new statistical methods to capitalize on drug-specific genetic associations for PRS construction. Additional recommendations include enhancing the infrastructure that can integrate genomic data with clinical predictors, along with implementing user-friendly clinical decision tools, and patient education. Ethical and regulatory considerations should address issues related to patient privacy, informed consent and safe use of PRS. Despite these challenges, ongoing research and large-scale collaboration is likely to advance the field and realize the potential of pharmacogenomic PRS.
Affiliation(s)
- Sonal Singh
  - Merck & Co., Inc, South San Francisco, California, USA
- Gabriele Stocco
  - Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste, Italy
  - Institute for Maternal and Child Health IRCCS Burlo Garofolo, Trieste, Italy
- Katherine N. Theken
  - Department of Oral and Maxillofacial Surgery and Pharmacology, School of Dental Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Alyson Dickson
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- QiPing Feng
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jason H. Karnes
  - Department of Pharmacy Practice and Science, R. Ken Coit College of Pharmacy, University of Arizona, Tucson, Arizona, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jonathan D. Mosley
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Nihal El Rouby
  - Division of Pharmacy Practice and Administrative Sciences, James L Winkle College of Pharmacy, University of Cincinnati, Cincinnati, Ohio, USA
  - St. Elizabeth Healthcare, Edgewood, Kentucky, USA
7
Tubbs JD, Chen Y, Duan R, Huang H, Ge T. Real-time dynamic polygenic prediction for streaming data. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.07.12.24310357. [PMID: 39040195 PMCID: PMC11261927 DOI: 10.1101/2024.07.12.24310357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]
Abstract
Polygenic risk scores (PRSs) are promising tools for advancing precision medicine. However, existing PRS construction methods rely on static summary statistics derived from genome-wide association studies (GWASs), which are often updated at lengthy intervals. As genetic data and health outcomes are continuously being generated at an ever-increasing pace, the current PRS training and deployment paradigm is suboptimal in maximizing the prediction accuracy of PRSs for incoming patients in healthcare settings. Here, we introduce real-time PRS-CS (rtPRS-CS), which enables online, dynamic refinement and calibration of PRS as each new sample is collected, without the need to perform intermediate GWASs. Through extensive simulation studies, we evaluate the performance of rtPRS-CS across various genetic architectures and training sample sizes. Leveraging quantitative traits from the Mass General Brigham Biobank and UK Biobank, we show that rtPRS-CS can integrate massive streaming data to enhance PRS prediction over time. We further apply rtPRS-CS to 22 schizophrenia cohorts in 7 Asian regions, demonstrating the clinical utility of rtPRS-CS in dynamically predicting and stratifying disease risk across diverse genetic ancestries.
Affiliation(s)
- Justin D. Tubbs
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
- Yu Chen
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA
  - Department of Medicine, Massachusetts General Hospital, Boston, MA
- Rui Duan
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Hailiang Huang
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA
  - Department of Medicine, Massachusetts General Hospital, Boston, MA
- Tian Ge
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
8
Kojima N, Koido M, He Y, Shimmori Y, Hachiya T, Japan B, Debette S, Kamatani Y. Recurrent stroke prediction by applying a stroke polygenic risk score in the Japanese population. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.06.17.24309034. [PMID: 39371120 PMCID: PMC11451717 DOI: 10.1101/2024.06.17.24309034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
Background Recently, various polygenic risk score (PRS)-based methods were developed to improve stroke prediction. However, current PRSs (including cross-ancestry PRS) poorly predict recurrent stroke. Here, we aimed to determine whether the best PRS for Japanese individuals can also predict stroke recurrence in this population by extensively comparing the methods and maximizing the predictive performance for stroke onset. Methods We used data from the BioBank Japan (BBJ) 1st cohort (n=179,938) to derive and optimize the PRSs using a 10-fold cross-validation. We integrated the optimized PRSs for multiple traits, such as vascular risk factors and stroke subtypes to generate a single PRS using the meta-scoring approach (metaGRS). We used an independent BBJ 2nd cohort (n=41,929) as a test sample to evaluate the association of the metaGRS with stroke and recurrent stroke. Results We analyzed recurrent stroke cases (n=174) and non-recurrent stroke controls (n=1,153) among subjects within the BBJ 2nd cohort. After adjusting for known risk factors, metaGRS was associated with stroke recurrence (adjusted OR per SD 1.18 [95% CI: 1.00-1.39, p=0.044]), although no significant correlation was observed with the published PRSs. We administered three distinct tests to consider the potential index event bias; however, the outcomes derived from these examinations did not provide any significant indication of the influence of index event bias. The high metaGRS group without a history of hypertension had a higher risk of stroke recurrence than that of the low metaGRS group (adjusted OR 2.24 [95% CI: 1.07-4.66, p=0.032]). However, this association was weak in the hypertension group (adjusted OR 1.21 [95% CI: 0.69-2.13, p=0.50]). Conclusions The metaGRS developed in a Japanese cohort predicted stroke recurrence in an independent cohort of patients. In particular, it predicted an increased risk of recurrence among stroke patients without hypertension. 
These findings provide clues for additional genetic risk stratification and help in developing personalized strategies for stroke recurrence prevention.
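The abstract describes integrating several trait-specific PRSs into one score (metaGRS). A minimal sketch of that general idea, assuming a weighted sum of standardized scores, is shown below. It is not the authors' pipeline: the trait names, score values, and mixing weights are hypothetical, and in a real analysis the weights would be estimated from training data.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale scores to mean 0 and standard deviation 1 across individuals."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant score")
    return [(v - m) / s for v in values]


def combine_scores(score_table, weights):
    """Weighted sum of standardized per-trait scores for each individual."""
    traits = list(weights)
    z = {t: standardize(score_table[t]) for t in traits}
    n = len(next(iter(z.values())))
    return [sum(weights[t] * z[t][i] for t in traits) for i in range(n)]


# Hypothetical per-trait PRS values for four individuals.
score_table = {
    "stroke": [0.1, 0.4, 0.2, 0.3],
    "blood_pressure": [1.0, 3.0, 2.0, 2.0],
}
# Hypothetical mixing weights, fixed here purely for illustration.
weights = {"stroke": 0.6, "blood_pressure": 0.4}

meta = combine_scores(score_table, weights)
print([round(v, 3) for v in meta])
```

Standardizing first keeps traits measured on different scales from dominating the combined score simply because of their units.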
Affiliation(s)
- Naoki Kojima
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Masaru Koido
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yunye He
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yuka Shimmori
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Tsuyoshi Hachiya
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Stéphanie Debette
  - Bordeaux Population Health Research Center, University of Bordeaux, Inserm, UMR 1219, Bordeaux, France
  - Department of Neurology, Institute for Neurodegenerative Diseases, CHU de Bordeaux, Bordeaux, France
- Yoichiro Kamatani
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
9
Tiezzi F, Goda K, Morgante F. Using lifestyle information in polygenic modeling of blood pressure traits: a simple method to reduce bias. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.05.597631. [PMID: 38895222 PMCID: PMC11185601 DOI: 10.1101/2024.06.05.597631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Complex traits are determined by the effects of multiple genetic variants, multiple environmental factors, and potentially their interaction. Predicting complex trait phenotypes from genotypes is a fundamental task in quantitative genetics that was pioneered in agricultural breeding for selection purposes. However, it has recently become important in human genetics. While prediction accuracy for some human complex traits is appreciable, this remains low for most traits. A promising way to improve prediction accuracy is by including not only genetic information but also environmental information in prediction models. However, environmental factors can, in turn, be genetically determined. This phenomenon gives rise to a correlation between the genetic and environmental components of the phenotype, which violates the assumption of independence between the genetic and environmental components of most statistical methods for polygenic modeling. In this work, we investigated the impact of including 27 lifestyle variables as well as genotype information (and their interaction) for predicting diastolic blood pressure, systolic blood pressure, and pulse pressure in older individuals in UK Biobank. The 27 lifestyle variables were included as either raw variables or adjusted by genetic and other non-genetic factors. The results show that including both lifestyle and genetic data improved prediction accuracy compared to using either piece of information alone. Both prediction accuracy and bias can improve substantially for some traits when the models account for the lifestyle variables after their proper adjustment. Our work confirms the utility of including environmental information in polygenic models of complex traits and highlights the importance of proper handling of the environmental variables.
Affiliation(s)
- Francesco Tiezzi
  - Department of Agriculture, Food, Environment and Forestry (DAGRI), University of Florence, Florence, Italy
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Khushi Goda
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
- Fabio Morgante
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
10
Gao Y, Cui Y. Optimizing clinico-genomic disease prediction across ancestries: a machine learning strategy with Pareto improvement. Genome Med 2024; 16:76. [PMID: 38835075 PMCID: PMC11149372 DOI: 10.1186/s13073-024-01345-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 05/17/2024] [Indexed: 06/06/2024] Open
Abstract
BACKGROUND Accurate prediction of an individual's predisposition to diseases is vital for preventive medicine and early intervention. Various statistical and machine learning models have been developed for disease prediction using clinico-genomic data. However, the accuracy of clinico-genomic prediction of diseases may vary significantly across ancestry groups due to their unequal representation in clinical genomic datasets. METHODS We introduced a deep transfer learning approach to improve the performance of clinico-genomic prediction models for data-disadvantaged ancestry groups. We conducted machine learning experiments on multi-ancestral genomic datasets of lung cancer, prostate cancer, and Alzheimer's disease, as well as on synthetic datasets with built-in data inequality and distribution shifts across ancestry groups. RESULTS Deep transfer learning significantly improved disease prediction accuracy for data-disadvantaged populations in our multi-ancestral machine learning experiments. In contrast, transfer learning based on linear frameworks did not achieve comparable improvements for these data-disadvantaged populations. CONCLUSIONS This study shows that deep transfer learning can enhance fairness in multi-ancestral machine learning by improving prediction accuracy for data-disadvantaged populations without compromising prediction accuracy for other populations, thus providing a Pareto improvement towards equitable clinico-genomic prediction of diseases.
Affiliation(s)
- Yan Gao
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
  - Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
- Yan Cui
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
  - Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
  - Center for Cancer Research, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
11
Arango NK, Morgante F. Comparing statistical learning methods for complex trait prediction from gene expression. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.01.596951. [PMID: 38895364 PMCID: PMC11185554 DOI: 10.1101/2024.06.01.596951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Accurate prediction of complex traits is an important task in quantitative genetics that has become increasingly relevant for personalized medicine. Genotypes have traditionally been used for trait prediction using a variety of methods such as mixed models, Bayesian methods, penalized regressions, dimension reductions, and machine learning methods. Recent studies have shown that gene expression levels can produce higher prediction accuracy than genotypes. However, only a few prediction methods were used in these studies. Thus, a comprehensive assessment of methods is needed to fully evaluate the potential of gene expression as a predictor of complex trait phenotypes. Here, we used data from the Drosophila Genetic Reference Panel (DGRP) to compare the ability of several existing statistical learning methods to predict starvation resistance from gene expression in the two sexes separately. The methods considered differ in assumptions about the distribution of gene effect sizes - ranging from models that assume that every gene affects the trait to more sparse models - and their ability to capture gene-gene interactions. We also used functional annotation (i.e., Gene Ontology (GO)) as an external source of biological information to inform prediction models. The results show that differences in prediction accuracy between methods exist, although they are generally not large. Methods performing variable selection gave higher accuracy in females while methods assuming a more polygenic architecture performed better in males. Incorporating GO annotations further improved prediction accuracy for a few GO terms of biological significance. Biological significance extended to the genes underlying highly predictive GO terms with different genes emerging between sexes. Notably, the Insulin-like Receptor (InR) was prevalent across methods and sexes. 
Our results confirmed the potential of transcriptomic prediction and highlighted the importance of selecting appropriate methods and strategies in order to achieve accurate predictions.
Affiliation(s)
- Noah Klimkowski Arango
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
- Fabio Morgante
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
12
Yang X, Sullivan PF, Li B, Fan Z, Ding D, Shu J, Guo Y, Paschou P, Bao J, Shen L, Ritchie MD, Nave G, Platt ML, Li T, Zhu H, Zhao B. Multi-organ imaging-derived polygenic indexes for brain and body health. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.04.18.23288769. [PMID: 38883759 PMCID: PMC11177904 DOI: 10.1101/2023.04.18.23288769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2024]
Abstract
The UK Biobank (UKB) imaging project is a crucial resource for biomedical research, but is limited to 100,000 participants due to cost and accessibility barriers. Here we used genetic data to predict heritable imaging-derived phenotypes (IDPs) for a larger cohort. We developed and evaluated 4,375 IDP genetic scores (IGS) derived from UKB brain and body images. When applied to UKB participants who were not imaged, IGS revealed links to numerous phenotypes and stratified participants at increased risk for both brain and somatic diseases. For example, IGS identified individuals at higher risk for Alzheimer's disease and multiple sclerosis, offering additional insights beyond traditional polygenic risk scores of these diseases. When applied to independent external cohorts, IGS also stratified those at high disease risk in the All of Us Research Program and the Alzheimer's Disease Neuroimaging Initiative study. Our results demonstrate that, while the UKB imaging cohort is largely healthy and may not be the most enriched for disease risk management, it holds immense potential for stratifying the risk of various brain and body diseases in broader external genetic cohorts.
Affiliation(s)
- Xiaochen Yang
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Patrick F. Sullivan
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Bingxuan Li
- UCLA Samueli School of Engineering, Los Angeles, CA 90095, USA
| | - Zirui Fan
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Dezheng Ding
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Juan Shu
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Yuxin Guo
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
| | - Peristera Paschou
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
| | - Jingxuan Bao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
| | - Li Shen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Marylyn D. Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Gideon Nave
- Marketing Department, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Michael L. Platt
- Marketing Department, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Hongtu Zhu
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Bingxin Zhao
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Applied Mathematics and Computational Science Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
- Center for AI and Data Science for Integrated Diagnostics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Population Aging Research Center, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA 19104, USA
| |
|
13
|
Flint JP, Welstead M, Cox SR, Russ TC, Marshall A, Luciano M. Validation of a polygenic risk score for frailty in the Lothian Birth Cohort 1936 and English longitudinal study of ageing. Sci Rep 2024; 14:12586. [PMID: 38822050 PMCID: PMC11143351 DOI: 10.1038/s41598-024-63229-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 05/24/2024] [Indexed: 06/02/2024] Open
Abstract
Frailty is a complex trait. Twin studies and high-powered Genome Wide Association Studies conducted in the UK Biobank have demonstrated a strong genetic basis of frailty. The present study utilized summary statistics from a Genome Wide Association Study on the Frailty Index (FI) to create and test the predictive power of frailty polygenic risk scores (PRS) in two independent samples, the Lothian Birth Cohort 1936 (LBC1936) and the English Longitudinal Study of Ageing (ELSA), aged 67-84 years. Multiple regression models were built to test the predictive power of frailty PRS at five time points. Frailty PRS significantly predicted frailty, measured via the FI, at all time points in LBC1936 and ELSA, explaining 2.1% (β = 0.15, 95% CI, 0.085-0.21) and 1.8% (β = 0.14, 95% CI, 0.10-0.17) of the variance, respectively, at age ~68/~70 years (p < 0.001). This work demonstrates that frailty PRS can predict frailty in two independent cohorts, particularly at early ages (~68/~70). PRS have the potential to be valuable instruments for identifying those at risk for frailty and could be important for controlling for genetic confounders in epidemiological studies.
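As an illustration of what "variance explained by a PRS" can mean in a multiple regression setting, the sketch below computes an incremental R²: the gain in R² when the PRS is added to a model that already contains covariates. This is a generic approach under stated assumptions, not the authors' analysis code; the function names, the single `age` covariate, and the simulated data are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def incremental_r2(prs, covariates, y):
    """Gain in R^2 from adding the PRS to a covariates-only model."""
    base = r_squared(covariates, y)
    full = r_squared(np.column_stack([covariates, prs]), y)
    return full - base

# Simulated (not real) data. The PRS effect is chosen so that, in expectation,
# it accounts for about 2% of the outcome variance: 0.15**2 / (0.09 + 0.0225 + 1).
rng = np.random.default_rng(0)
n = 5000
age = rng.normal(size=n)          # hypothetical standardized covariate
prs = rng.normal(size=n)          # hypothetical standardized PRS
y = 0.3 * age + 0.15 * prs + rng.normal(size=n)

delta = incremental_r2(prs, age.reshape(-1, 1), y)
print(f"incremental R^2 of the PRS: {delta:.4f}")
```

Because the two models are nested, the incremental R² cannot be negative; how it relates to the percentages reported in the abstract depends on the covariates the study actually adjusted for.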
Affiliation(s)
- J P Flint
- Advanced Care Research Centre, School of Engineering, College of Science and Engineering, The University of Edinburgh, Edinburgh, UK.
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK.
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK.
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK.
| | - M Welstead
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK
| | - S R Cox
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
| | - T C Russ
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK
- Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
| | - A Marshall
- Advanced Care Research Centre, School of Engineering, College of Science and Engineering, The University of Edinburgh, Edinburgh, UK
- School of Social and Political Science, University of Edinburgh, Edinburgh, UK
| | - M Luciano
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
| |
|
14
|
Kunkel D, Sørensen P, Shankar V, Morgante F. Improving polygenic prediction from summary data by learning patterns of effect sharing across multiple phenotypes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.06.592745. [PMID: 38766136 PMCID: PMC11100663 DOI: 10.1101/2024.05.06.592745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Polygenic prediction of complex trait phenotypes has become important in human genetics, especially in the context of precision medicine. Recently, Morgante et al. introduced mr.mash, a flexible and computationally efficient method that models multiple phenotypes jointly and leverages sharing of effects across such phenotypes to improve prediction accuracy. However, a drawback of mr.mash is that it requires individual-level data, which are often not publicly available. In this work, we introduce mr.mash-rss, an extension of the mr.mash model that requires only summary statistics from Genome-Wide Association Studies (GWAS) and linkage disequilibrium (LD) estimates from a reference panel. By using summary data, we achieve the twin goal of increasing the applicability of the mr.mash model to data sets that are not publicly available and making it scalable to biobank-size data. Through simulations, we show that mr.mash-rss is competitive with, and often outperforms, current state-of-the-art methods for single- and multi-phenotype polygenic prediction in a variety of scenarios that differ in the pattern of effect sharing across phenotypes, the number of phenotypes, the number of causal variants, and the genomic heritability. We also present a real data analysis of 16 blood cell phenotypes in UK Biobank, showing that mr.mash-rss achieves higher prediction accuracy than competing methods for the majority of traits, especially when the data has smaller sample size.
Affiliation(s)
- Deborah Kunkel
- School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC, United States of America
| | - Peter Sørensen
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Vijay Shankar
- Center for Human Genetics, Clemson University, Greenwood, SC, United States of America
| | - Fabio Morgante
- Center for Human Genetics, Clemson University, Greenwood, SC, United States of America
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, United States of America
| |
|
15
|
Zhang J, Zhan J, Jin J, Ma C, Zhao R, O'Connell J, Jiang Y, Koelsch BL, Zhang H, Chatterjee N. An ensemble penalized regression method for multi-ancestry polygenic risk prediction. Nat Commun 2024; 15:3238. [PMID: 38622117 PMCID: PMC11271575 DOI: 10.1038/s41467-024-47357-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 03/28/2024] [Indexed: 04/17/2024] Open
Abstract
Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of L1 (lasso) and L2 (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
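To make the idea of combining lasso and ridge penalties concrete, here is a minimal sketch of a generic elastic-net regression fitted by coordinate descent on individual-level data. It is not PROSPER itself, which works from GWAS summary statistics across several populations and adds an ensemble step; the function names, penalty values, and simulated data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso shrinkage operator: move z toward zero by lam, clipping at zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def elastic_net(X, y, lam1, lam2, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2
    by cyclic coordinate descent (no intercept; center the data beforehand)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()          # residual y - X @ beta
    col_ss = (X ** 2).sum(axis=0) / n       # per-column mean square
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (beta_j removed).
            rho = X[:, j] @ resid / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam1) / (col_ss[j] + lam2)
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

# Simulated (not real) data: only the first of ten predictors has an effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = X[:, 0] + 0.1 * rng.normal(size=500)

beta = elastic_net(X, y, lam1=0.1, lam2=0.1)
print(np.round(beta, 3))
```

The L1 term sets small effects exactly to zero, while the L2 term shrinks the remaining effects; in practice the two penalty parameters would be tuned on held-out data.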
Affiliation(s)
- Jingning Zhang
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
| | | | - Jin Jin
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
| | - Cheng Ma
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
| | - Ruzhang Zhao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | | | | | | | - Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Nilanjan Chatterjee
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, USA.
| |
|
16
|
Zhang T, Zhou G, Klei L, Liu P, Chouldechova A, Zhao H, Roeder K, G'Sell M, Devlin B. Evaluating and improving health equity and fairness of polygenic scores. HGG ADVANCES 2024; 5:100280. [PMID: 38402414 PMCID: PMC10937319 DOI: 10.1016/j.xhgg.2024.100280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 02/14/2024] [Accepted: 02/14/2024] [Indexed: 02/26/2024] Open
Abstract
Polygenic scores (PGSs) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single-nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWASs, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum (JLS). In the simulation settings we explore, JLS provides more accurate PGSs compared to other methods, especially when measured in terms of fairness. In analyses of UK Biobank data, JLS was computationally more efficient but slightly less accurate than a Bayesian comparator, SDPRX. Like all PGS methods, JLS requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how JLS can help mitigate fairness-related harms that might result from the use of PGSs in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWASs for different ancestries, JLS is an effective approach for enhancing portability and reducing predictive bias.
Affiliation(s)
- Tianyu Zhang
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
| | - Geyu Zhou
- Department of Biostatistics, Yale University, New Haven, CT 06511, USA
| | - Lambertus Klei
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Peng Liu
- Merck Research Laboratories, Merck & Co., Inc., Rahway, NJ 07065, USA
| | - Alexandra Chouldechova
- Microsoft Research NYC, New York, NY 10012, USA; Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Hongyu Zhao
- Department of Biostatistics, Yale University, New Haven, CT 06511, USA
| | - Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Max G'Sell
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Bernie Devlin
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA.
| |
|
17
|
Zhang J, Zhan J, Jin J, Ma C, Zhao R, O’Connell J, Jiang Y, Koelsch BL, Zhang H, Chatterjee N. An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.03.15.532652. [PMID: 36993331 PMCID: PMC10055041 DOI: 10.1101/2023.03.15.532652] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 04/30/2023]
Abstract
Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of ℒ1 (lasso) and ℒ2 (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
Affiliation(s)
- Jingning Zhang
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | | | - Jin Jin
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
| | - Cheng Ma
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
| | - Ruzhang Zhao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | | | | | | | | | - Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Nilanjan Chatterjee
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
| |
|
18
|
Lind BM, Candido-Ribeiro R, Singh P, Lu M, Obreht Vidakovic D, Booker TR, Whitlock MC, Yeaman S, Isabel N, Aitken SN. How useful are genomic data for predicting maladaptation to future climate? GLOBAL CHANGE BIOLOGY 2024; 30:e17227. [PMID: 38558300 DOI: 10.1111/gcb.17227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 12/18/2023] [Accepted: 12/27/2023] [Indexed: 04/04/2024]
Abstract
Methods using genomic information to forecast potential population maladaptation to climate change or new environments are becoming increasingly common, yet the lack of model validation poses serious hurdles toward their incorporation into management and policy. Here, we compare the validation of maladaptation estimates derived from two methods, Gradient Forests (GFoffset) and the risk of non-adaptedness (RONA), using exome capture pool-seq data from 35 to 39 populations across three conifer taxa: two Douglas-fir varieties and jack pine. We evaluate sensitivity of these algorithms to the source of input loci (markers selected from genotype-environment associations [GEA] or those selected at random). We validate these methods against 2- and 52-year growth and mortality measured in independent transplant experiments. Overall, we find that both methods often better predict transplant performance than climatic or geographic distances. We also find that GFoffset and RONA models are surprisingly not improved using GEA candidates. Even with promising validation results, variation in model projections to future climates makes it difficult to identify the most maladapted populations using either method. Our work advances understanding of the sensitivity and applicability of these approaches, and we discuss recommendations for their future use.
Affiliation(s)
- Brandon M Lind
- Centre for Forest Conservation Genetics and Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
| | - Rafael Candido-Ribeiro
- Centre for Forest Conservation Genetics and Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
| | - Pooja Singh
- Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada
| | - Mengmeng Lu
- Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada
| | - Dragana Obreht Vidakovic
- Centre for Forest Conservation Genetics and Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
| | - Tom R Booker
- Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
| | - Michael C Whitlock
- Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
| | - Sam Yeaman
- Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada
| | - Nathalie Isabel
- Canada Research Chair in Forest Genomics, Centre for Forest Research and Institute for Systems and Integrative Biology, Université Laval, Québec, Quebec, Canada
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, Quebec, Canada
| | - Sally N Aitken
- Centre for Forest Conservation Genetics and Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
| |
|
19
|
Ren J, Pan W. Statistical inference with large-scale trait imputation. Stat Med 2024; 43:625-641. [PMID: 38038193 PMCID: PMC10848238 DOI: 10.1002/sim.9975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 09/26/2023] [Accepted: 11/17/2023] [Indexed: 12/02/2023]
Abstract
Recently, a nonparametric method called LS-imputation has been proposed for large-scale trait imputation based on a GWAS summary dataset and a large set of genotyped individuals. The imputed trait values, along with the genotypes, can be treated as an individual-level dataset for downstream genetic analyses, including those that cannot be done with GWAS summary data. However, since the covariance matrix of the imputed trait values is often too large to calculate, the current method imposes a working assumption that the imputed trait values are independently and identically distributed, which does not actually hold. Here we propose a "divide and conquer/combine" strategy to estimate and account for the covariance matrix of the imputed trait values via batches, thus relaxing the incorrect working assumption. Applications of the methods to the UK Biobank data for marginal association analysis showed some improvement by the new method in some cases, but overall the original method performed well, which was explained by nearly constant variances of, and mostly weak correlations among, imputed trait values.
Affiliation(s)
- Jingchen Ren
- School of Statistics, University of Minnesota, Minneapolis, MN, 55455
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, 55455
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, 55455
| |
|
20
|
Xiang R, Kelemen M, Xu Y, Harris LW, Parkinson H, Inouye M, Lambert SA. Recent advances in polygenic scores: translation, equitability, methods and FAIR tools. Genome Med 2024; 16:33. [PMID: 38373998 PMCID: PMC10875792 DOI: 10.1186/s13073-024-01304-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 02/07/2024] [Indexed: 02/21/2024] Open
Abstract
Polygenic scores (PGS) can be used for risk stratification by quantifying individuals' genetic predisposition to disease, and many potentially clinically useful applications have been proposed. Here, we review the latest potential benefits of PGS in the clinic and challenges to implementation. PGS could augment risk stratification through combined use with traditional risk factors (demographics, disease-specific risk factors, family history, etc.), to support diagnostic pathways, to predict groups with therapeutic benefits, and to increase the efficiency of clinical trials. However, there exist challenges to maximizing the clinical utility of PGS, including FAIR (Findable, Accessible, Interoperable, and Reusable) use and standardized sharing of the genomic data needed to develop and recalculate PGS, the equitable performance of PGS across populations and ancestries, the generation of robust and reproducible PGS calculations, and the responsible communication and interpretation of results. We outline how these challenges may be overcome analytically and with more diverse data as well as highlight sustained community efforts to achieve equitable, impactful, and responsible use of PGS in healthcare.
Affiliation(s)
- Ruidong Xiang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Martin Kelemen
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Laura W Harris
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK.
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK.
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK.
| | - Samuel A Lambert
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| |
|
21
|
Jung H, Jung HU, Baek EJ, Kwon SY, Kang JO, Lim JE, Oh B. Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction. Commun Biol 2024; 7:180. [PMID: 38351177 PMCID: PMC10864389 DOI: 10.1038/s42003-024-05874-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 01/30/2024] [Indexed: 02/16/2024] Open
Abstract
Polygenic risk score (PRS) is useful for capturing an individual's genetic susceptibility. However, previous studies have not fully exploited the potential of the risk factor PRS (RFPRS) for disease prediction. We explored the potential of integrating disease-related RFPRSs with disease PRS to enhance disease prediction performance. We constructed 112 RFPRSs and analyzed the association of RFPRSs with diseases to identify disease-related RFPRSs in 700 diseases, using the UK Biobank dataset. We uncovered 6157 statistically significant associations between 247 diseases and 109 RFPRSs. We estimated the disease PRSs of 70 diseases that exhibited statistically significant heritability, to generate RFDiseasemetaPRS (a combined PRS integrating RFPRSs and disease PRS) and compare the prediction performance metrics between RFDiseasemetaPRS and disease PRS. RFDiseasemetaPRS showed better performance for Nagelkerke's pseudo-R², odds ratio (OR) per 1 SD, net reclassification improvement (NRI) values and difference of R² considered by variance of R² in 31 out of 70 diseases. Additionally, we assessed risk classification between two models by examining OR between the top 10% and remaining 90% individuals for the 31 diseases; RFDiseasemetaPRS exhibited better R², NRI and OR than disease PRS. These findings highlight the importance of utilizing RFDiseasemetaPRS, which can provide personalized healthcare and tailored prevention strategies.
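The comparison of the top 10% of a score distribution against the remaining 90% can be illustrated with a plain 2x2 odds ratio. The sketch below is a generic, unadjusted calculation, not the authors' pipeline (which may adjust for covariates); the function name and the toy data are hypothetical.

```python
import numpy as np

def top_fraction_odds_ratio(score, case, top_fraction=0.10):
    """Unadjusted odds ratio of being a case in the top `top_fraction`
    of the score distribution versus everyone else.
    Undefined (division by zero) if either group has no cases or no controls."""
    score = np.asarray(score, dtype=float)
    case = np.asarray(case, dtype=bool)
    cutoff = np.quantile(score, 1.0 - top_fraction)
    top = score >= cutoff
    a = np.sum(top & case)       # cases in the top group
    b = np.sum(top & ~case)      # controls in the top group
    c = np.sum(~top & case)      # cases in the remaining group
    d = np.sum(~top & ~case)     # controls in the remaining group
    return (a * d) / (b * c)

# Toy data: 100 individuals with scores 0..99, so the top 10% are scores 90..99.
score = np.arange(100)
case = np.zeros(100, dtype=bool)
case[95:] = True                 # 5 of the 10 top-group individuals are cases
case[:9] = True                  # 9 of the 90 remaining individuals are cases

or_value = top_fraction_odds_ratio(score, case)
print(or_value)                  # (5 * 81) / (5 * 9) = 9.0
```

A larger odds ratio for one score than for another, computed on the same individuals, indicates that the first score concentrates more cases in its top group.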
Affiliation(s)
- Hyein Jung
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Hae-Un Jung
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Shin Young Kwon
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Ji-One Kang
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Ji Eun Lim
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Bermseok Oh
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
  - Mendel Inc, Seoul, Republic of Korea
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
|
22
|
Cao C, Zhang S, Wang J, Tian M, Ji X, Huang D, Yang S, Gu N. PGS-Depot: a comprehensive resource for polygenic scores constructed by summary statistics based methods. Nucleic Acids Res 2024; 52:D963-D971. [PMID: 37953384 PMCID: PMC10767792 DOI: 10.1093/nar/gkad1029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 10/04/2023] [Accepted: 10/20/2023] [Indexed: 11/14/2023] Open
Abstract
Polygenic score (PGS) is an important tool for the genetic prediction of complex traits. However, there are currently no resources providing comprehensive PGSs computed from published summary statistics, and it is difficult to implement and run different PGS methods due to the complexity of their pipelines and parameter settings. To address these issues, we introduce a new resource called PGS-Depot containing the most comprehensive set of publicly available disease-related GWAS summary statistics. PGS-Depot includes 5585 high quality summary statistics (1933 quantitative and 3652 binary trait statistics) curated from 1564 traits in European and East Asian populations. A standardized best-practice pipeline is used to implement 11 summary statistics-based PGS methods, each with different model assumptions and estimation procedures. The prediction performance of each method can be compared for both in- and cross-ancestry populations, and users can also submit their own summary statistics to obtain custom PGS with the available methods. Other features include searching for PGSs by trait name, publication, cohort information, population, or the MeSH ontology tree and searching for trait descriptions with the experimental factor ontology (EFO). All scores, SNP effect sizes and summary statistics can be downloaded via FTP. PGS-Depot is freely available at http://www.pgsdepot.net.
Affiliation(s)
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Shuting Zhang
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jianhua Wang
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300203, China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Xiaolong Ji
  - Department of Biostatistics, Centre for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Dandan Huang
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300203, China
- Sheng Yang
  - Department of Biostatistics, Centre for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Ning Gu
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu 211166, China
  - Medical School, Nanjing University, Nanjing, Jiangsu 210093, China
|
23
|
Fritsche LG, Nam K, Du J, Kundu R, Salvatore M, Shi X, Lee S, Burgess S, Mukherjee B. Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks. PLoS Genet 2023; 19:e1010907. [PMID: 38113267 PMCID: PMC10763941 DOI: 10.1371/journal.pgen.1010907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 01/03/2024] [Accepted: 12/05/2023] [Indexed: 12/21/2023] Open
Abstract
OBJECTIVE To overcome the limitations associated with the collection and curation of COVID-19 outcome data in biobanks, this study proposes the use of polygenic risk scores (PRS) as reliable proxies of COVID-19 severity across three large biobanks: the Michigan Genomics Initiative (MGI), UK Biobank (UKB), and NIH All of Us. The goal is to identify associations between pre-existing conditions and COVID-19 severity.
METHODS Drawing on a sample of more than 500,000 individuals from the three biobanks, we conducted a phenome-wide association study (PheWAS) to identify associations between a PRS for COVID-19 severity, derived from a genome-wide association study on COVID-19 hospitalization, and clinical pre-existing, pre-pandemic phenotypes. We performed cohort-specific PRS PheWAS and a subsequent fixed-effects meta-analysis.
RESULTS The current study uncovered 23 pre-existing conditions significantly associated with the COVID-19 severity PRS in cohort-specific analyses, of which 21 were observed in the UKB cohort and two in the MGI cohort. The meta-analysis yielded 27 significant phenotypes predominantly related to obesity, metabolic disorders, and cardiovascular conditions. After adjusting for body mass index, several clinical phenotypes, such as hypercholesterolemia and gastrointestinal disorders, remained associated with an increased risk of hospitalization following COVID-19 infection.
CONCLUSION By employing PRS as a proxy for COVID-19 severity, we corroborated known risk factors and identified novel associations between pre-existing clinical phenotypes and COVID-19 severity. Our study highlights the potential value of using PRS when actual outcome data may be limited or inadequate for robust analyses.
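The fixed-effects meta-analysis step mentioned in this abstract can be sketched with the standard inverse-variance weighting scheme. This is only an illustration under stated assumptions: the abstract does not specify the weighting or software used, and the function name `fixed_effects_meta` and the toy per-cohort estimates are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects pooling of per-cohort
    effect estimates (hypothetical helper for illustration)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("need equally many effect sizes and standard errors")
    weights = [1.0 / se ** 2 for se in ses]       # w_k = 1 / SE_k^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)            # SE of the pooled estimate
    z = pooled_beta / pooled_se                   # Wald z statistic
    return pooled_beta, pooled_se, z


if __name__ == "__main__":
    # Toy log-odds-ratio estimates from three imaginary cohorts.
    beta, se, z = fixed_effects_meta([0.10, 0.20, 0.15], [0.05, 0.08, 0.04])
    print(f"pooled beta={beta:.4f}, SE={se:.4f}, z={z:.2f}")
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the most precisely estimated cohort.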
Affiliation(s)
- Lars G. Fritsche
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
- Kisung Nam
  - Graduate School of Data Science, Seoul National University, Seoul, South Korea
- Jiacong Du
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
- Ritoban Kundu
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
- Maxwell Salvatore
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
- Xu Shi
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
- Seunggeun Lee
  - Graduate School of Data Science, Seoul National University, Seoul, South Korea
- Stephen Burgess
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
  - Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, Michigan, United States of America
  - Michigan Institute for Data Science, University of Michigan, Ann Arbor, Michigan, United States of America
|
24
|
Cho HW, Ban HJ, Jin HS, Cha S, Eom YB. A genome-wide association scan reveals novel loci for facial traits of Koreans. Genomics 2023; 115:110710. [PMID: 37734486 DOI: 10.1016/j.ygeno.2023.110710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 09/07/2023] [Accepted: 09/18/2023] [Indexed: 09/23/2023]
Abstract
DNA-based prediction of externally visible characteristics (EVC) with SNPs is one of the research areas of interest in the forensic field. Based on a previous study performing GWAS on facial traits in a Korean population, herein, we present results stemming from GWA analysis with KoreanChip and novel genetic loci satisfying the genome-wide significance level. We discovered a total of 20 signals, and 12 loci were found to have novel associations with facial traits, including six loci located in intergenic regions and six loci located at UBE2O, HECTD2, CCDC108, TPK1, FCN2, and FRMPD1. Additionally, we performed a polygenic score analysis for 33 distance-related traits in facial phenotyping and determined genetic relationships between facial traits and SNPs using the GCTA program. The results of the current study offer an understanding of how facial morphology is influenced by complex genetic structures and provide insights into forensic investigation and population genetics.
Affiliation(s)
- Hye-Won Cho
  - Department of Medical Sciences, Graduate School, Soonchunhyang University, Asan, Chungnam 31538, Republic of Korea
- Hyo-Jeong Ban
  - Korea Medicine (KM) Data Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
- Hyun-Seok Jin
  - Department of Biomedical Laboratory Science, College of Life and Health Sciences, Hoseo University, Asan, Chungnam 31499, Republic of Korea
- Seongwon Cha
  - Korea Medicine (KM) Data Division, Korea Institute of Oriental Medicine, Daejeon 34054, Republic of Korea
- Yong-Bin Eom
  - Department of Medical Sciences, Graduate School, Soonchunhyang University, Asan, Chungnam 31538, Republic of Korea
  - Department of Biomedical Laboratory Science, College of Medical Sciences, Soonchunhyang University, Asan, Chungnam 31538, Republic of Korea
|
25
|
Therkildsen J, Rohde PD, Nissen L, Thygesen J, Hauge EM, Langdahl BL, Boettcher M, Nyegaard M, Winther S. A genome-wide genomic score added to standard recommended stratification tools does not improve the identification of patients with very low bone mineral density. Osteoporos Int 2023; 34:1893-1906. [PMID: 37495683 PMCID: PMC10579117 DOI: 10.1007/s00198-023-06857-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 07/10/2023] [Indexed: 07/28/2023]
Abstract
The role of integrating genomic scores (GSs) needs to be assessed. Adding a GS to recommended stratification tools does not improve the prediction of very low bone mineral density. However, we noticed that the GS performed equally or above individual risk factors in discrimination.
PURPOSE We aimed to investigate whether adding a genomic score (GS) to recommended stratification tools improves the discrimination of participants with very low bone mineral density (BMD).
METHODS BMD was measured in three thoracic vertebrae using CT. All participants provided information on standard osteoporosis risk factors. GSs and FRAX scores were calculated. Participants were grouped according to mean BMD into very low (<80 mg/cm³), low (80-120 mg/cm³), and normal (>120 mg/cm³) and according to the Bone Health and Osteoporosis Foundation recommendations for BMD testing into an "indication for BMD testing" and "no indication for BMD testing" group. Different models were assessed using the area under the receiver operating characteristics curves (AUC) and reclassification analyses.
RESULTS In the total cohort (n=1421), the AUC for the GS was 0.57 (95% CI 0.52-0.61) corresponding to AUCs for osteoporosis risk factors. In participants without indication for BMD testing, the AUC was 0.60 (95% CI 0.52-0.69) above or equal to AUCs for osteoporosis risk factors. Adding the GS to a clinical risk factor (CRF) model resulted in AUCs not statistically significantly different from those of the CRF model. Using probability cutoff values of 6, 12, and 24%, we found no improved reclassification or risk discrimination using the CRF-GS model compared to the CRF model.
CONCLUSION Our results suggest adding a GS to a CRF model does not improve prediction. However, we noticed that the GS performed equally or above individual risk factors in discrimination. Clinical risk factors combined showed superior discrimination to individual risk factors and the GS, underlining the value of combined CRFs in routine clinics as a stratification tool.
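The AUC used in this abstract to compare models has a simple rank interpretation: it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below illustrates that definition only; the function name `auc` and the toy data are hypothetical, and it does not reproduce the confidence intervals or reclassification analyses reported in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases (label 1)
    and non-cases (label 0). Hypothetical helper; O(n_pos * n_neg) time."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # case ranked above non-case
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy predicted risks; higher means higher predicted risk of the outcome.
    risk = [0.10, 0.40, 0.35, 0.80]
    outcome = [0, 0, 1, 1]
    print(auc(risk, outcome))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is the context for reading values such as 0.57 and 0.60 in the abstract.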
Affiliation(s)
- J Therkildsen
  - Department of Rheumatology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
- P D Rohde
  - Department of Health Science & Technology, Aalborg University, Selma Lagerløfs Vej 24, 9269, Gistrup, Denmark
- L Nissen
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
  - Department of Cardiology, Gødstrup Hospital, Hospitalsparken 15, 7400, Herning, Denmark
- J Thygesen
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
  - Department of Clinical Engineering, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus, Denmark
- E-M Hauge
  - Department of Rheumatology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
- B L Langdahl
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
  - Department of Endocrinology and Internal Medicine, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus, Denmark
- M Boettcher
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
  - Department of Cardiology, Gødstrup Hospital, Hospitalsparken 15, 7400, Herning, Denmark
- M Nyegaard
  - Department of Health Science & Technology, Aalborg University, Selma Lagerløfs Vej 24, 9269, Gistrup, Denmark
  - Department of Biomedicine, Aarhus University, Høegh-Guldbergs Gade 10, 8000, Aarhus, Denmark
- S Winther
  - Department of Clinical Medicine, Aarhus University, Palle Juul-Jensens Boulevard 82, 8200, Aarhus, Denmark
  - Department of Cardiology, Gødstrup Hospital, Hospitalsparken 15, 7400, Herning, Denmark
|
26
|
Xu C, Ganesh SK, Zhou X. mtPGS: Leverage multiple correlated traits for accurate polygenic score construction. Am J Hum Genet 2023; 110:1673-1689. [PMID: 37716346 PMCID: PMC10577082 DOI: 10.1016/j.ajhg.2023.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/04/2023] [Revised: 08/18/2023] [Accepted: 08/27/2023] [Indexed: 09/18/2023] Open
Abstract
Accurate polygenic scores (PGSs) facilitate the genetic prediction of complex traits and aid in the development of personalized medicine. Here, we develop a statistical method called multi-trait assisted PGS (mtPGS), which can construct accurate PGSs for a target trait of interest by leveraging multiple traits relevant to the target trait. Specifically, mtPGS borrows SNP effect size similarity information between the target trait and its relevant traits to improve the effect size estimation on the target trait, thus achieving accurate PGSs. In the process, mtPGS flexibly models the shared genetic architecture between the target and the relevant traits to achieve robust performance, while explicitly accounting for the environmental covariance among them to accommodate different study designs with various sample overlap patterns. In addition, mtPGS uses only summary statistics as input and relies on a deterministic algorithm with several algebraic techniques for scalable computation. We evaluate the performance of mtPGS through comprehensive simulations and applications to 25 traits in the UK Biobank, where in the real data mtPGS achieves an average of 0.90%-52.91% accuracy gain compared to the state-of-the-art PGS methods. Overall, mtPGS represents an accurate, fast, and robust solution for PGS construction in biobank-scale datasets.
Affiliation(s)
- Chang Xu
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
- Santhi K Ganesh
  - Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI, USA
  - Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
|
27
|
Zhang T, Klei L, Liu P, Chouldechova A, Roeder K, G'Sell M, Devlin B. Evaluating and Improving Health Equity and Fairness of Polygenic Scores. bioRxiv 2023:2023.09.22.559051. [PMID: 37790341 PMCID: PMC10542523 DOI: 10.1101/2023.09.22.559051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Polygenic scores (PGS) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWAS, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum. In the simulation settings we explore, Joint-Lassosum provides more accurate PGS compared with other methods, especially when measured in terms of fairness. Like all PGS methods, Joint-Lassosum requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how Joint-Lassosum can help mitigate fairness-related harms that might result from the use of PGS scores in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWAS for different ancestries, Joint-Lassosum is an effective approach for enhancing portability and reducing predictive bias.
|
28
|
Ren J, Lin Z, Pan W. Integrating GWAS summary statistics, individual-level genotypic and omic data to enhance the performance for large-scale trait imputation. Hum Mol Genet 2023; 32:2693-2703. [PMID: 37369060 PMCID: PMC10460491 DOI: 10.1093/hmg/ddad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Revised: 05/23/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] Open
Abstract
Recently, a non-parametric method has been proposed to impute the genetic component of a trait for a large set of genotyped individuals based on a separate genome-wide association study (GWAS) summary dataset of the same trait (from the same population). The imputed trait may contain linear, non-linear and epistatic effects of genetic variants, and thus can be used for downstream linear or non-linear association analyses and machine learning tasks. Here, we propose an extension of the method to impute both genetic and environmental components of a trait using both single nucleotide polymorphism (SNP)-trait and omics-trait association summary data. We illustrate an application to a UK Biobank subset of individuals (n ≈ 80K) with both body mass index (BMI) GWAS data and metabolomic data. We divided the whole dataset into two equally sized and non-overlapping training and test datasets; we used the training data to build SNP- and metabolite-BMI association summary data and impute BMI on the test data. We compared the performance of the original and new imputation methods. As with the original method, the BMI values imputed by the new method largely retained SNP-BMI association information; however, the latter retained more information about BMI-environment associations and were more highly correlated with the original observed BMI values.
Affiliation(s)
- Jingchen Ren
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Zhaotong Lin
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
|
29
|
Pledger SL, Ahmadizar F. Gene-environment interactions and the effect on obesity risk in low and middle-income countries: a scoping review. Front Endocrinol (Lausanne) 2023; 14:1230445. [PMID: 37664850 PMCID: PMC10474324 DOI: 10.3389/fendo.2023.1230445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 07/18/2023] [Indexed: 09/05/2023] Open
Abstract
Background Obesity represents a major and preventable global health challenge as a complex disease and a modifiable risk factor for developing other non-communicable diseases. In recent years, obesity prevalence has risen more rapidly in low- and middle-income countries (LMICs) compared to high-income countries (HICs). Obesity traits are shown to be modulated by an interplay of genetic and environmental factors such as unhealthy diet and physical inactivity in studies from HICs focused on populations of European descent; however, genetic heterogeneity and environmental differences prevent the generalisation of study results to LMICs. Primary research investigating gene-environment interactions (GxE) on obesity in LMICs is limited but expanding. Synthesis of current research would provide an overview of the interactions between genetic variants and environmental factors that underlie the obesity epidemic and identify knowledge gaps for future studies.
Methods Three databases were searched systematically using a combination of keywords such as "genes", "obesity", "LMIC", "diet", and "physical activity" to find all relevant observational studies published before November 2022.
Results Eighteen of the 1,373 articles met the inclusion criteria, of which one was a genome-wide association study (GWAS), thirteen used a candidate gene approach, and five were assigned as genetic risk score studies. Statistically significant findings were reported for 12 individual SNPs; however, most studies were small-scale and without replication.
Conclusion Although the results suggest significant GxE interactions on obesity in LMICs, updated robust statistical techniques with more precise and standardised exposure and outcome measurements are necessary for translatable results. Future research should focus on improved quality replication efforts, emphasising large-scale and long-term longitudinal study designs using multi-ethnic GWAS.
Affiliation(s)
- Sophia L. Pledger
  - Department of Epidemiology and Global Health, Julius Global Health, University Medical Center Utrecht, Utrecht, Netherlands
- Fariba Ahmadizar
  - Department of Data Science and Biostatistics, Julius Global Health, University Medical Center Utrecht, Utrecht, Netherlands
|
30
|
Gao Y, Sharma T, Cui Y. Addressing the Challenge of Biomedical Data Inequality: An Artificial Intelligence Perspective. Annu Rev Biomed Data Sci 2023; 6:153-171. [PMID: 37104653 PMCID: PMC10529864 DOI: 10.1146/annurev-biodatasci-020722-020704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
Artificial intelligence (AI) and other data-driven technologies hold great promise to transform healthcare and confer the predictive power essential to precision medicine. However, the existing biomedical data, which are a vital resource and foundation for developing medical AI models, do not reflect the diversity of the human population. The low representation in biomedical data has become a significant health risk for non-European populations, and the growing application of AI opens a new pathway for this health risk to manifest and amplify. Here we review the current status of biomedical data inequality and present a conceptual framework for understanding its impacts on machine learning. We also discuss the recent advances in algorithmic interventions for mitigating health disparities arising from biomedical data inequality. Finally, we briefly discuss the newly identified disparity in data quality among ethnic groups and its potential impacts on machine learning.
Affiliation(s)
- Yan Gao
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Teena Sharma
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Yan Cui
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
|
31
|
Ren J, Lin Z, He R, Shen X, Pan W. Using GWAS summary data to impute traits for genotyped individuals. HGG ADVANCES 2023; 4:100197. [PMID: 37181332 PMCID: PMC10173780 DOI: 10.1016/j.xhgg.2023.100197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 04/07/2023] [Indexed: 05/16/2023] Open
Abstract
Genome-wide association study (GWAS) summary data have become extremely useful in daily routine data analysis, largely facilitating new methods development and new applications. However, a severe limitation with the current use of GWAS summary data is its exclusive restriction to only linear single nucleotide polymorphism (SNP)-trait association analyses. To further expand the use of GWAS summary data, along with a large sample of individual-level genotypes, we propose a nonparametric method for large-scale imputation of the genetic component of the trait for the given genotypes. The imputed individual-level trait values, along with the individual-level genotypes, make it possible to conduct any analysis as with individual-level GWAS data, including nonlinear SNP-trait associations and predictions. We use the UK Biobank data to highlight the usefulness and effectiveness of the proposed method in three applications that currently cannot be done with only GWAS summary data (for SNP-trait associations): marginal SNP-trait association analysis under non-additive genetic models, detection of SNP-SNP interactions, and genetic prediction of a trait using a nonlinear model of SNPs.
Affiliation(s)
- Jingchen Ren
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Zhaotong Lin
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Ruoyu He
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Xiaotong Shen
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
|
32
|
Gu Y, Yan C, Wang T, Hu B, Zhu M, Jin G. Construction and evaluation of the functional polygenic risk score for gastric cancer in a prospective cohort of the European population. Chin Med J (Engl) 2023:00029330-990000000-00640. [PMID: 37394533 DOI: 10.1097/cm9.0000000000002716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Indexed: 07/04/2023] Open
Abstract
BACKGROUND A polygenic risk score (PRS) derived from 112 single-nucleotide polymorphisms (SNPs) for gastric cancer has been reported in Chinese populations (PRS-112). However, its performance in other populations is unknown. A functional PRS (fPRS) using functional SNPs (fSNPs) may improve the generalizability of the PRS across populations with distinct ethnicities.
METHODS We performed functional annotations on SNPs in strong linkage disequilibrium (LD) with the 112 previously reported SNPs to identify fSNPs that affect protein-coding or transcriptional regulation. Subsequently, we constructed an fPRS based on the fSNPs by using the LDpred2-infinitesimal model and then analyzed the performance of the PRS-112 and fPRS in the risk prediction of gastric cancer in 457,521 European participants of the UK Biobank cohort. Finally, the performance of the fPRS in combination with lifestyle factors was evaluated in predicting the risk of gastric cancer.
RESULTS During 4,582,045 person-years of follow-up with a total of 623 incident gastric cancer cases, we found no significant association between the PRS-112 and gastric cancer risk in the European population (hazard ratio [HR] = 1.00 [95% confidence interval (CI) 0.93-1.09], P = 0.846). We identified 125 fSNPs, including seven deleterious protein-coding SNPs and 118 regulatory non-coding SNPs, and used them to construct the fPRS-125. Our result showed that the fPRS-125 was significantly associated with gastric cancer risk (HR = 1.11 [95% CI, 1.03-1.20], P = 0.009). Compared to participants with a low fPRS-125 (bottom quintile), those with a high fPRS-125 (top quintile) had a higher risk of incident gastric cancer (HR = 1.43 [95% CI, 1.12-1.84], P = 0.005). Moreover, we observed that participants with both an unfavorable lifestyle and a high genetic risk had the highest risk of incident gastric cancer (HR = 4.99 [95% CI, 1.55-16.10], P = 0.007) compared to those with both a favorable lifestyle and a low genetic risk.
CONCLUSION These results indicate that the fPRS-125 derived from fSNPs may act as an indicator to measure the genetic risk of gastric cancer in the European population.
Affiliation(s)
- Yuanliang Gu
- Department of Epidemiology, School of Public Health, Southeast University, Nanjing, Jiangsu 210009, China
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Caiwang Yan
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine and China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Tianpei Wang
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Beiping Hu
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Meng Zhu
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine and China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, Jiangsu 210009, China
| | - Guangfu Jin
- Department of Epidemiology, School of Public Health, Southeast University, Nanjing, Jiangsu 210009, China
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine and China International Cooperation Center for Environment and Human Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Jiangsu Cancer Hospital, Jiangsu Institute of Cancer Research, The Affiliated Cancer Hospital of Nanjing Medical University, Nanjing, Jiangsu 210009, China
| |
|
33
|
Abu-El-Haija A, Reddi HV, Wand H, Rose NC, Mori M, Qian E, Murray MF. The clinical application of polygenic risk scores: A points to consider statement of the American College of Medical Genetics and Genomics (ACMG). Genet Med 2023; 25:100803. [PMID: 36920474 DOI: 10.1016/j.gim.2023.100803] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 03/16/2023] Open
Affiliation(s)
- Aya Abu-El-Haija
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA; Harvard Medical School, Boston, MA
| | - Honey V Reddi
- Department of Pathology & Laboratory Medicine, Medical College of Wisconsin, Milwaukee, WI
| | - Hannah Wand
- Division of Cardiovascular Medicine, Department of Medicine, Stanford Medicine, Stanford, CA
| | - Nancy C Rose
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, School of Medicine, University of Utah Health, Salt Lake City, UT
| | - Mari Mori
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH; Genetic and Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
| | - Emily Qian
- Department of Genetics, Yale University, New Haven, CT
| | | |
|
34
|
Flint JP, Welstead M, Cox SR, Russ TC, Marshall A, Luciano M. Validation of a polygenic risk score for Frailty in the Lothian Birth Cohort and English Longitudinal Study of Ageing. medRxiv 2023:2023.04.03.23288064. [PMID: 37066324 PMCID: PMC10104224 DOI: 10.1101/2023.04.03.23288064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
Frailty is a complex trait. Twin studies and a high-powered Genome Wide Association Study (GWAS) conducted in the UK Biobank have demonstrated a strong genetic basis of frailty. The present study utilized summary statistics from this GWAS to create and test the predictive power of frailty polygenic risk scores (PRS) in two independent samples - the Lothian Birth Cohort 1936 (LBC1936) and the English Longitudinal Study of Ageing (ELSA) aged 67-84 years. Multiple regression models were built to test the predictive power of frailty PRS at five time points. Frailty PRS significantly predicted frailty at all time points in LBC1936 and ELSA, explaining 2.1% (β = 0.15, 95% CI, 0.085-0.21) and 1.6% (β = 0.14, 95% CI, 0.10-0.17) of the variance, respectively, at age ~68/~70 years (p < 0.001). This work demonstrates that frailty PRS can predict frailty in two independent cohorts, particularly at early ages (~68/~70). PRS have the potential to be valuable instruments for identifying those at risk for frailty and could be important for controlling for genetic confounders in epidemiological studies.
Affiliation(s)
- J P Flint
- Advanced Care Research Centre School of Engineering, College of Medicine and Veterinary Medicine, The University of Edinburgh, Edinburgh, UK
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK
| | - M Welstead
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK
| | - S R Cox
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
| | - T C Russ
- Alzheimer Scotland Dementia Research Centre, University of Edinburgh, Edinburgh, UK
- Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
| | - A Marshall
- Advanced Care Research Centre School of Engineering, College of Medicine and Veterinary Medicine, The University of Edinburgh, Edinburgh, UK
- School of Social and Political Science, University of Edinburgh, Edinburgh, UK
| | - M Luciano
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
| |
|
35
|
Miao J, Guo H, Song G, Zhao Z, Hou L, Lu Q. Quantifying portable genetic effects and improving cross-ancestry genetic prediction with GWAS summary statistics. Nat Commun 2023; 14:832. [PMID: 36788230 PMCID: PMC9929290 DOI: 10.1038/s41467-023-36544-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 06/01/2022] [Accepted: 02/07/2023] [Indexed: 02/16/2023] Open
Abstract
Polygenic risk scores (PRS) calculated from genome-wide association studies (GWAS) of Europeans are known to have substantially reduced predictive accuracy in non-European populations, limiting their clinical utility and raising concerns about health disparities across ancestral populations. Here, we introduce a statistical framework named X-Wing to improve predictive performance in ancestrally diverse populations. X-Wing quantifies local genetic correlations for complex traits between populations, employs an annotation-dependent estimation procedure to amplify correlated genetic effects between populations, and combines multiple population-specific PRS into a unified score with GWAS summary statistics alone as input. Through extensive benchmarking, we demonstrate that X-Wing pinpoints portable genetic effects and substantially improves PRS performance in non-European populations, showing 14.1%-119.1% relative gain in predictive R² compared to state-of-the-art methods based on GWAS summary statistics. Overall, X-Wing addresses critical limitations in existing approaches and may have broad applications in cross-population polygenic risk prediction.
Affiliation(s)
- Jiacheng Miao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Hanmin Guo
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China
| | - Gefei Song
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Zijie Zhao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Lin Hou
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China.
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
| | - Qiongshi Lu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53706, USA.
- Department of Statistics, University of Wisconsin-Madison, Madison, WI, 53706, USA.
- Center for Demography of Health and Aging, University of Wisconsin-Madison, Madison, WI, 53706, USA.
| |
|
36
|
Xin J, Du M, Gu D, Jiang K, Wang M, Jin M, Hu Y, Ben S, Chen S, Shao W, Li S, Chu H, Zhu L, Li C, Chen K, Ding K, Zhang Z, Shen H, Wang M. Risk assessment for colorectal cancer via polygenic risk score and lifestyle exposure: a large-scale association study of East Asian and European populations. Genome Med 2023; 15:4. [PMID: 36694225 PMCID: PMC9875451 DOI: 10.1186/s13073-023-01156-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/02/2022] [Accepted: 01/13/2023] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND The genetic architectures of colorectal cancer are distinct across different populations. To date, the majority of polygenic risk scores (PRSs) are derived from European (EUR) populations, which limits their accurate extrapolation to other populations. Here, we aimed to generate a PRS by incorporating East Asian (EAS) and EUR ancestry groups and validate its utility for colorectal cancer risk assessment among different populations. METHODS A large-scale colorectal cancer genome-wide association study (GWAS), comprising 35,145 cases and 288,934 controls from EAS and EUR populations, was used for the EAS-EUR GWAS meta-analysis and the construction of candidate EAS-EUR PRSs via different approaches. The performance of each PRS was then validated in external GWAS datasets of EAS (727 cases and 1452 controls) and EUR (1289 cases and 1284 controls) ancestries, respectively. The optimal PRS was further tested using the UK Biobank longitudinal cohort of 355,543 individuals and ultimately applied to stratify individual risk attached by healthy lifestyle. RESULTS In the meta-analysis across EAS and EUR populations, we identified 48 independent variants beyond genome-wide significance (P < 5 × 10⁻⁸) at previously reported loci. Among 26 candidate EAS-EUR PRSs, the PRS-CSx approach-derived PRS (defined as PRS_CSx) that harbored genome-wide variants achieved the optimal discriminatory ability in both validation datasets, as well as better performance in the EAS population compared to the PRS derived from known variants. Using the UK Biobank cohort, we further validated a significant dose-response effect of PRS_CSx on incident colorectal cancer, in which the risk was 2.11- and 3.88-fold higher in individuals with intermediate and high PRS_CSx than in the low score subgroup (P_trend = 8.15 × 10⁻⁵³). Notably, the detrimental effect of being at a high genetic risk could be largely attenuated by adherence to a favorable lifestyle, with a 0.53% reduction in 5-year absolute risk. CONCLUSIONS In summary, we systematically constructed an EAS-EUR PRS to effectively stratify colorectal cancer risk, which highlighted its clinical implication among diverse ancestries. Importantly, these findings also supported that a healthy lifestyle could reduce the genetic impact on incident colorectal cancer.
Affiliation(s)
- Junyi Xin
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Mulong Du
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Dongying Gu
- Department of Oncology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Kewei Jiang
- Department of Gastroenterological Surgery, Laboratory of Surgical Oncology, Beijing Key Laboratory of Colorectal Cancer Diagnosis and Treatment Research, Peking University People’s Hospital, No. 11 Xizhimen South Street, Xicheng District, Beijing, China
| | - Mengyun Wang
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Mingjuan Jin
- Department of Epidemiology and Biostatistics at School of Public Health, Zhejiang University School of Medicine, Hangzhou, China; Cancer Institute, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yeting Hu
- Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
| | - Shuai Ben
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Silu Chen
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Wei Shao
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Shuwei Li
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Haiyan Chu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Linjun Zhu
- Department of Oncology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Chen Li
- Department of Gastroenterological Surgery, Laboratory of Surgical Oncology, Beijing Key Laboratory of Colorectal Cancer Diagnosis and Treatment Research, Peking University People’s Hospital, No. 11 Xizhimen South Street, Xicheng District, Beijing, China
| | - Kun Chen
- Department of Epidemiology and Biostatistics at School of Public Health, Zhejiang University School of Medicine, Hangzhou, China; Cancer Institute, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Kefeng Ding
- Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China; Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
| | - Zhengdong Zhang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Hongbing Shen
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Meilin Wang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing, 211166, China; Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China; The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, China
| |
|
37
|
Wang C, Zhang J, Veldsman WP, Zhou X, Zhang L. A comprehensive investigation of statistical and machine learning approaches for predicting complex human diseases on genomic variants. Brief Bioinform 2023; 24:6965909. [PMID: 36585786 DOI: 10.1093/bib/bbac552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 11/04/2022] [Accepted: 11/14/2022] [Indexed: 01/01/2023] Open
Abstract
Quantifying an individual's risk for common diseases is an important goal of precision health. The polygenic risk score (PRS), which aggregates multiple risk alleles of candidate diseases, has emerged as a standard approach for identifying high-risk individuals. Although several studies have been performed to benchmark the PRS calculation tools and assess their potential to guide future clinical applications, some issues remain to be further investigated, such as the lack of (i) varied simulated data with different genetic effects; (ii) evaluation of machine learning models; and (iii) evaluation in multi-ancestry studies. In this study, we systematically validated and compared 13 statistical methods, 5 machine learning models and 2 ensemble models using simulated data with additive and genetic interaction models, 22 common diseases with internal training sets, 4 common diseases with external summary statistics and 3 common diseases for trans-ancestry studies in UK Biobank. The statistical methods performed better on simulated data from additive models, whereas machine learning models had an edge on data that include genetic interactions. Ensemble models are generally the best choice by integrating various statistical methods. LDpred2 outperformed the other standalone tools, whereas PRS-CS, lassosum and DBSLMM showed comparable performance. We also identified that disease heritability strongly affected the predictive performance of all methods. Both the number and effect sizes of risk SNPs are important, and sample size strongly influences the performance of all methods. For the trans-ancestry studies, we found that the performance of most methods became worse when training and testing sets were from different populations.
Affiliation(s)
- Chonghao Wang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
| | - Jing Zhang
- Eye Institute and Department of Ophthalmology, NHC Key Laboratory of Myopia (Fudan University), Eye & ENT Hospital, Fudan University, Shanghai, China
| | | | - Xin Zhou
- Department of Biomedical Engineering, Vanderbilt University, Vanderbilt Place Nashville, 37235, TN, USA
| | - Lu Zhang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
- Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China
| |
|
38
|
Wang Y, Namba S, Lopera E, Kerminen S, Tsuo K, Läll K, Kanai M, Zhou W, Wu KH, Favé MJ, Bhatta L, Awadalla P, Brumpton B, Deelen P, Hveem K, Lo Faro V, Mägi R, Murakami Y, Sanna S, Smoller JW, Uzunovic J, Wolford BN, Willer C, Gamazon ER, Cox NJ, Surakka I, Okada Y, Martin AR, Hirbo J. Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts. Cell Genomics 2023; 3:100241. [PMID: 36777179 PMCID: PMC9903818 DOI: 10.1016/j.xgen.2022.100241] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 12/01/2021] [Revised: 08/28/2022] [Accepted: 12/03/2022] [Indexed: 01/06/2023]
Abstract
Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. Here, we utilized data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.
Affiliation(s)
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Shinichi Namba
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
| | - Esteban Lopera
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
| | - Sini Kerminen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
| | - Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kristi Läll
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Wei Zhou
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kuan-Han Wu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48103, USA
| | | | - Laxmi Bhatta
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
| | - Philip Awadalla
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Ben Brumpton
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- HUNT Research Centre, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7600 Levanger, Norway
- Clinic of Medicine, St. Olav’s Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
| | - Patrick Deelen
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
| | - Kristian Hveem
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- HUNT Research Centre, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7600 Levanger, Norway
| | - Valeria Lo Faro
- Department of Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Department of Clinical Genetics, Amsterdam University Medical Center (AMC), Amsterdam, the Netherlands
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Yoshinori Murakami
- Division of Molecular Pathology, Institute of Medical Science, the University of Tokyo, Tokyo, Japan
| | - Serena Sanna
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
- Institute for Genetics and Biomedical Research (IRGB), National Research Council (CNR), 09100 Cagliari, Italy
| | - Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | | | - Brooke N. Wolford
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48103, USA
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
| | - Cristen Willer
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biostatistics and Center for Statistical Genetics, and Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Eric R. Gamazon
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Nancy J. Cox
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Ida Surakka
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC) and Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita 565-0871, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo 113-0033, Japan
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jibril Hirbo
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
|
39
|
McElligott B, Shi Z, Rifkin AS, Wei J, Zheng SL, Helfand BT, Woo JSH, Xu J. Assessing the performance of genetic risk score for stratifying risk of post-sepsis cardiovascular complications. Front Cardiovasc Med 2023; 10:1076745. [PMID: 36926049 PMCID: PMC10011112 DOI: 10.3389/fcvm.2023.1076745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 02/08/2023] [Indexed: 03/04/2023] Open
Abstract
Background: Patients with sepsis are at increased risk for cardiovascular complications, including myocardial infarction (MI), ischemic stroke (IS), and venous thromboembolism (VTE). Our objective is to assess whether genetic risk score (GRS) can differentiate risk for these complications. Methods: A population-based prospective cohort of 483,177 subjects, derived from the UK Biobank, was followed for diagnosis of sepsis and its complications (MI, IS, and VTE) after the study recruitment. GRS for each complication was calculated based on established risk-associated single nucleotide polymorphisms (SNPs). Time to incident MI, IS, and VTE was compared between subjects with or without sepsis and GRS risk groups using Kaplan-Meier log-rank test and Cox-regression analysis. Results: During an average of 12.6 years of follow-up, 10,757 (2.23%) developed sepsis. Patients with sepsis had an overall higher risk than non-sepsis subjects for each complication, but the risk differed by time after a sepsis diagnosis: exceedingly high in the short term (0-30 days), considerably high in the mid term (31 days to 2 years), and reduced in the long term (>2 years). Furthermore, in White subjects, GRS was a significant predictor of complications, independent of sepsis and other risk factors. For example, the GRS for MI (GRS-MI) further differentiated risk in patients with sepsis: 3.49%, 4.73%, and 9.03% in those with low (<0.5), intermediate (0.5-1.99), and high (≥2.0) GRS-MI, respectively (P-trend < 0.001). Conclusion: Risk for post-sepsis cardiovascular complications differed considerably by time after a sepsis diagnosis and GRS. These findings, if confirmed in other ancestry-specific populations, may guide personalized management for preventing post-sepsis cardiovascular complications.
Affiliation(s)
- Brian McElligott
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States
| | - Zhuqing Shi
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States
| | - Andrew S Rifkin
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States
| | - Jun Wei
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States
| | - S Lilly Zheng
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States
| | - Brian T Helfand
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States; Department of Surgery, NorthShore University HealthSystem, Evanston, IL, United States; Pritzker School of Medicine, University of Chicago, Chicago, IL, United States
| | - Jonathan S H Woo
- Department of Medicine, NorthShore University HealthSystem, Evanston, IL, United States
| | - Jianfeng Xu
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, United States; Department of Surgery, NorthShore University HealthSystem, Evanston, IL, United States; Pritzker School of Medicine, University of Chicago, Chicago, IL, United States; Neaman Center for Personalized Medicine, NorthShore University HealthSystem, Evanston, IL, United States
|
40
|
Fang Y, Fritsche LG, Mukherjee B, Sen S, Richmond-Rakerd LS. Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals. Biol Psychiatry 2022; 92:923-931. [PMID: 35965108 PMCID: PMC10712651 DOI: 10.1016/j.biopsych.2022.06.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/17/2021] [Revised: 04/01/2022] [Accepted: 06/02/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Major depressive disorder (MDD) is a leading cause of disease-associated disability, with much of the increased burden due to psychiatric and medical comorbidity. This comorbidity partly reflects common genetic influences across conditions. Integrating molecular-genetic tools with health records enables tests of association with the broad range of physiological and clinical phenotypes. However, standard phenome-wide association studies analyze associations with individual genetic variants. For polygenic traits such as MDD, aggregate measures of genetic risk may yield greater insight into associations across the clinical phenome. METHODS We tested for associations between a genome-wide polygenic risk score for MDD and medical and psychiatric traits in a phenome-wide association study of 46,782 unrelated, European-ancestry participants from the Michigan Genomics Initiative. RESULTS The MDD polygenic risk score was associated with 211 traits from 15 medical and psychiatric disease categories at the phenome-wide significance threshold. After excluding patients with depression, continued associations were observed with respiratory, digestive, neurological, and genitourinary conditions; neoplasms; and mental disorders. Associations with tobacco use disorder, respiratory conditions, and genitourinary conditions persisted after accounting for genetic overlap between depression and other psychiatric traits. Temporal analyses of time-at-first-diagnosis indicated that depression disproportionately preceded chronic pain and substance-related disorders, while asthma disproportionately preceded depression. CONCLUSIONS The present results can inform the biological links between depression and both mental and systemic diseases. 
Although MDD polygenic risk scores cannot currently forecast health outcomes with precision at the individual level, as molecular-genetic discoveries for depression increase, these tools may augment risk prediction for medical and psychiatric conditions.
Affiliation(s)
- Yu Fang
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan.
| | - Lars G Fritsche
- Department of Biostatistics, School of Public Health, University of Michigan Medicine, Ann Arbor, Michigan; Rogel Cancer Center, University of Michigan Medicine, Ann Arbor, Michigan; Center for Statistical Genetics, School of Public Health, University of Michigan Medicine, Ann Arbor, Michigan
| | - Bhramar Mukherjee
- Department of Biostatistics, School of Public Health, University of Michigan Medicine, Ann Arbor, Michigan; Rogel Cancer Center, University of Michigan Medicine, Ann Arbor, Michigan; Center for Statistical Genetics, School of Public Health, University of Michigan Medicine, Ann Arbor, Michigan; Department of Epidemiology, School of Public Health, University of Michigan Medicine, Ann Arbor, Michigan
| | - Srijan Sen
- Michigan Neuroscience Institute, University of Michigan, Ann Arbor, Michigan; Department of Psychiatry, University of Michigan Medicine, Ann Arbor, Michigan
|
41
|
Novembre J, Stein C, Asgari S, Gonzaga-Jauregui C, Landstrom A, Lemke A, Li J, Mighton C, Taylor M, Tishkoff S. Addressing the challenges of polygenic scores in human genetic research. Am J Hum Genet 2022; 109:2095-2100. [PMID: 36459976 PMCID: PMC9808501 DOI: 10.1016/j.ajhg.2022.10.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 12/04/2022] Open
Abstract
The genotyping of millions of human samples has made it possible to evaluate variants across the human genome for their possible association with risks for numerous diseases and other traits by using genome-wide association studies (GWASs). The associations between phenotype and genotype found in GWASs make possible the construction of polygenic scores (PGSs), which aim to predict a trait or disease outcome in an individual on the basis of their genotype (in the disease case, the term polygenic risk score [PRS] is often used). PGSs have shown promise for studying the biology of complex traits and as a tool for evaluating individual disease risks in clinical settings. Although the quantity and quality of data to compute PGSs are increasing, challenges remain in the technical aspects of developing PGSs and in the ethical and social issues that might arise from their use. This ASHG Guidance emphasizes three major themes for researchers working with or interested in the application of PGSs in their own research: (1) developing diverse research cohorts; (2) fostering robustness in the development, application, and interpretation of PGSs; and (3) improving the communication of PGS results and their implications to broad audiences.
Affiliation(s)
- John Novembre
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA; Corresponding author
| | - Catherine Stein
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA; Corresponding author
| | - Samira Asgari
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Claudia Gonzaga-Jauregui
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, México
| | - Andrew Landstrom
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Department of Pediatrics, Division of Cardiology, Duke University School of Medicine, Durham, NC, USA
| | - Amy Lemke
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Norton Children’s Research Institute, affiliated with the University of Louisville School of Medicine, Louisville, KY, USA
| | - Jun Li
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
| | - Chloe Mighton
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Genomics Health Services Research Program, St. Michael’s Hospital, Unity Health Toronto, Toronto, ON, Canada; Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | - Matthew Taylor
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Adult Medical Genetics Program, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Sarah Tishkoff
- Professional Practice and Social Implications Committee Polygenic Scores Guidance Writing Group, American Society of Human Genetics, Rockville, MD, USA; Department of Genetics, Center for Global Genomics and Health Equity, University of Pennsylvania, Philadelphia, PA, USA; Department of Biology, Center for Global Genomics and Health Equity, University of Pennsylvania, Philadelphia, PA, USA
|
42
|
Ogbunugafor CB, Edge MD. Gattaca as a lens on contemporary genetics: marking 25 years into the film's "not-too-distant" future. Genetics 2022; 222:iyac142. [PMID: 36218390 PMCID: PMC9713434 DOI: 10.1093/genetics/iyac142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022] Open
Abstract
The 1997 film Gattaca has emerged as a canonical pop culture reference used to discuss modern controversies in genetics and bioethics. It appeared in theaters a few years prior to the announcement of the "completion" of the human genome (2000), as the science of human genetics was developing a renewed sense of its social implications. The story is set in a near-future world in which parents can, with technological assistance, influence the genetic composition of their offspring on the basis of predicted life outcomes. The current moment, 25 years after the film's release, offers an opportunity to reflect on where society currently stands with respect to the ideas explored in Gattaca. Here, we review and discuss several active areas of genetic research (genetic prediction, embryo selection, forensic genetics, and others) that interface directly with scenes and concepts in the film. On its silver anniversary, we argue that Gattaca remains an important reflection of society's expectations and fears with respect to the ways that genetic science has manifested in the real world. In accompanying supplemental material, we offer some thought questions to guide group discussions inside and outside of the classroom.
Affiliation(s)
- C Brandon Ogbunugafor
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA
- Santa Fe Institute, Santa Fe, NM 87501, USA
- Vermont Complex Systems Center, Burlington, VT 05401, USA
| | - Michael D Edge
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
|
43
|
Ahmed R, Shi Z, Rifkin AS, Wei J, Lilly Zheng S, Helfand BT, Hulick PJ, Qamar A, Davidson DJ, Billings LK, Xu J. Reclassification of coronary artery disease risk using genetic risk score among subjects with borderline or intermediate clinical risk. IJC Heart & Vasculature 2022; 43:101136. [PMID: 36275420 PMCID: PMC9579501 DOI: 10.1016/j.ijcha.2022.101136] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/03/2022] [Accepted: 10/08/2022] [Indexed: 11/07/2022]
Affiliation(s)
- Razina Ahmed
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA
| | - Zhuqing Shi
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA
| | - Andrew S. Rifkin
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA
| | - Jun Wei
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA
| | - S. Lilly Zheng
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA
| | - Brian T. Helfand
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA; Department of Surgery, NorthShore University HealthSystem, Evanston, IL, USA; University of Chicago Pritzker School of Medicine, Chicago, IL, USA
| | - Peter J. Hulick
- Neaman Center for Personalized Medicine, NorthShore University HealthSystem, Evanston, IL, USA; Department of Medicine, NorthShore University HealthSystem, Evanston, IL, USA
| | - Arman Qamar
- Department of Medicine, NorthShore University HealthSystem, Evanston, IL, USA; Cardiovascular Institute, NorthShore University HealthSystem, Evanston, IL, USA
| | - David J. Davidson
- Department of Medicine, NorthShore University HealthSystem, Evanston, IL, USA; Cardiovascular Institute, NorthShore University HealthSystem, Evanston, IL, USA
| | - Liana K. Billings
- Department of Medicine, NorthShore University HealthSystem, Evanston, IL, USA
| | - Jianfeng Xu
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Evanston, IL, USA; Department of Surgery, NorthShore University HealthSystem, Evanston, IL, USA; Neaman Center for Personalized Medicine, NorthShore University HealthSystem, Evanston, IL, USA; University of Chicago Pritzker School of Medicine, Chicago, IL, USA; Corresponding author
|
44
|
Ma Y, Patil S, Zhou X, Mukherjee B, Fritsche LG. ExPRSweb: An online repository with polygenic risk scores for common health-related exposures. Am J Hum Genet 2022; 109:1742-1760. [PMID: 36152628 PMCID: PMC9606385 DOI: 10.1016/j.ajhg.2022.09.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/10/2022] [Accepted: 08/31/2022] [Indexed: 01/25/2023] Open
Abstract
Complex traits are influenced by genetic risk factors, lifestyle, and environmental variables, so-called exposures. Some exposures, e.g., smoking or lipid levels, have common genetic modifiers identified in genome-wide association studies. Because measurements are often unfeasible, exposure polygenic risk scores (ExPRSs) offer an alternative to study the influence of exposures on various phenotypes. Here, we collected publicly available summary statistics for 28 exposures and applied four common PRS methods to generate ExPRSs in two large biobanks: the Michigan Genomics Initiative and the UK Biobank. We established ExPRSs for 27 exposures and demonstrated their applicability in phenome-wide association studies and as predictors for common chronic conditions. Especially the addition of multiple ExPRSs showed, for several chronic conditions, an improvement compared to prediction models that only included traditional, disease-focused PRSs. To facilitate follow-up studies, we share all ExPRS constructs and generated results via an online repository called ExPRSweb.
Affiliation(s)
- Ying Ma
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Snehal Patil
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Bhramar Mukherjee
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA; Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Lars G Fritsche
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA.
|
45
|
Lencz T, Sabatello M, Docherty A, Peterson RE, Soda T, Austin J, Bierut L, Crepaz-Keay D, Curtis D, Degenhardt F, Huckins L, Lazaro-Munoz G, Mattheisen M, Meiser B, Peay H, Rietschel M, Walss-Bass C, Davis LK. Concerns about the use of polygenic embryo screening for psychiatric and cognitive traits. Lancet Psychiatry 2022; 9:838-844. [PMID: 35931093 PMCID: PMC9930635 DOI: 10.1016/s2215-0366(22)00157-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/14/2022] [Revised: 04/01/2022] [Accepted: 04/23/2022] [Indexed: 12/19/2022]
Abstract
Private companies have begun offering services to allow parents undergoing in-vitro fertilisation to screen embryos for genetic risk of complex diseases, including psychiatric disorders. This procedure, called polygenic embryo screening, raises several difficult scientific and ethical issues, as discussed in this Personal View. Polygenic embryo screening depends on the statistical properties of polygenic risk scores, which are complex and not well studied in the context of this proposed clinical application. The clinical, social, and ethical implications of polygenic embryo screening have barely been discussed among relevant stakeholders. To our knowledge, the International Society of Psychiatric Genetics is the first professional biomedical organisation to issue a statement regarding polygenic embryo screening. For the reasons discussed in this Personal View, the Society urges caution and calls for additional research and oversight on the use of polygenic embryo screening.
Affiliation(s)
- Todd Lencz
- Division of Psychiatry Research, Zucker Hillside Hospital, Glen Oaks, NY, USA; Department of Psychiatry, Division of Research, The Zucker Hillside Hospital Division of Northwell Health, Glen Oaks, NY, USA; Institute for Behavioral Science, The Feinstein Institutes for Medical Research, Manhasset, NY, USA
| | - Maya Sabatello
- Division of Ethics, Department of Medical Humanities and Ethics, Columbia University, New York, NY, USA
| | - Anna Docherty
- Department of Psychiatry, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Roseann E Peterson
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
| | - Takahiro Soda
- Department of Psychiatry, University of Florida College of Medicine, Gainesville, FL, USA
| | - Jehannine Austin
- Departments of Psychiatry and Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Laura Bierut
- Department of Psychiatry, Washington University School of Medicine, St Louis, MO, USA
| | | | - David Curtis
- UCL Genetics Institute, University College London, London, United Kingdom
| | - Franziska Degenhardt
- Department of Child and Adolescent Psychiatry, Psychosomatics, and Psychotherapy, University Hospital Essen, University of Duisburg-Essen, Duisburg, Germany
| | - Laura Huckins
- Departments of Psychiatry and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Manuel Mattheisen
- Department of Psychiatry, Dalhousie Medical School, Halifax, NS, Canada
| | - Bettina Meiser
- Prince of Wales Clinical School, University of New South Wales, NSW, Australia
| | - Holly Peay
- Genomics, Bioinformatics, and Translational Research Center, RTI International, Raleigh, NC, USA
| | - Marcella Rietschel
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Consuelo Walss-Bass
- Department of Psychiatry and Behavioral Sciences, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Lea K Davis
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
|
46
|
Wang Y, Tsuo K, Kanai M, Neale BM, Martin AR. Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annu Rev Biomed Data Sci 2022; 5:293-320. [PMID: 35576555 PMCID: PMC9828290 DOI: 10.1146/annurev-biodatasci-111721-074830] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Indexed: 01/11/2023]
Abstract
Polygenic risk scores (PRS) estimate an individual's genetic likelihood of complex traits and diseases by aggregating information across multiple genetic variants identified from genome-wide association studies. PRS can predict a broad spectrum of diseases and have therefore been widely used in research settings. Some work has investigated their potential applications as biomarkers in preventative medicine, but significant work is still needed to definitively establish and communicate absolute risk to patients for genetic and modifiable risk factors across demographic groups. However, the biggest limitation of PRS currently is that they show poor generalizability across diverse ancestries and cohorts. Major efforts are underway through methodological development and data generation initiatives to improve their generalizability. This review aims to comprehensively discuss current progress on the development of PRS, the factors that affect their generalizability, and promising areas for improving their accuracy, portability, and implementation.
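As a rough illustration of the aggregation step this abstract describes, a PRS is commonly computed as a weighted sum of effect-allele dosages, with per-variant weights derived from GWAS effect sizes. The sketch below assumes that simple additive form; the variant IDs, weights, and the helper name `polygenic_score` are hypothetical and are not taken from the review.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of allele dosages over variants present in both inputs.

    weights: dict mapping variant ID -> per-allele effect size (e.g. a log odds ratio)
    dosages: dict mapping variant ID -> effect-allele count or dosage in [0, 2]
    Variants with no dosage are skipped. This is an assumption made for the
    sketch; real pipelines handle missing genotypes in different ways.
    """
    score = 0.0
    for variant, beta in weights.items():
        if variant in dosages:
            score += beta * dosages[variant]
    return score


# Hypothetical effect sizes and one individual's genotype dosages.
example_weights = {"variant_1": 0.20, "variant_2": -0.10, "variant_3": 0.05}
example_dosages = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

print(polygenic_score(example_weights, example_dosages))
```

A raw score like this is only meaningful relative to a reference distribution, which is one reason the generalizability issues raised in the abstract matter: the same weights applied in a cohort of different ancestry may rank individuals quite differently.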
Affiliation(s)
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Biological and Biomedical Sciences, Harvard Medical School, Boston, Massachusetts, USA
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Alicia R Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
|
47
|
Ju D, Hui D, Hammond DA, Wonkam A, Tishkoff SA. Importance of Including Non-European Populations in Large Human Genetic Studies to Enhance Precision Medicine. Annu Rev Biomed Data Sci 2022; 5:321-339. [PMID: 35576557 PMCID: PMC9904154 DOI: 10.1146/annurev-biodatasci-122220-112550] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 01/09/2023]
Abstract
One goal of genomic medicine is to uncover an individual's genetic risk for disease, which generally requires data connecting genotype to phenotype, as done in genome-wide association studies (GWAS). While there may be clinical promise to employing prediction tools such as polygenic risk scores (PRS), it currently stands that individuals of non-European ancestry may not reap the benefits of genomic medicine because of underrepresentation in large-scale genetics studies. Here, we discuss why this inequity poses a problem for genomic medicine and the reasons for the low transferability of PRS across populations. We also survey the ancestry representation of published GWAS and investigate how estimates of ancestry diversity in GWAS participants might be biased. We highlight the importance of expanding genetic research in Africa, one of the most underrepresented regions in human genomics research, and discuss issues of ethics, resources, and technology for equitable advancement of genomic medicine.
Affiliation(s)
- Dan Ju
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Daniel Hui
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Graduate Program in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Dorothy A Hammond
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Penn Center for Global Genomics & Health Equity, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Ambroise Wonkam
- Division of Human Genetics, Department of Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
| | - Sarah A Tishkoff
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
48
|
Schultz LM, Merikangas AK, Ruparel K, Jacquemont S, Glahn DC, Gur RE, Barzilay R, Almasy L. Stability of polygenic scores across discovery genome-wide association studies. HGG Advances 2022; 3:100091. [PMID: 35199043 PMCID: PMC8841810 DOI: 10.1016/j.xhgg.2022.100091] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/23/2021] [Accepted: 01/18/2022] [Indexed: 01/19/2023] Open
Abstract
Polygenic scores (PGS) are commonly evaluated in terms of their predictive accuracy at the population level by the proportion of phenotypic variance they explain. To be useful for precision medicine applications, they also need to be evaluated at the individual level when phenotypes are not necessarily already known. We investigated the stability of PGS in European American (EUR) and African American (AFR)-ancestry individuals from the Philadelphia Neurodevelopmental Cohort and the Adolescent Brain Cognitive Development study using different discovery genome-wide association study (GWAS) results for post-traumatic stress disorder (PTSD), type 2 diabetes (T2D), and height. We found that pairs of EUR-ancestry GWAS for the same trait had genetic correlations >0.92. However, PGS calculated from pairs of same-ancestry and different-ancestry GWAS had correlations that ranged from <0.01 to 0.74. PGS stability was greater for height than for PTSD or T2D. A series of height GWAS in the UK Biobank suggested that correlation between PGS is strongly dependent on the extent of sample overlap between the discovery GWAS. Focusing on the upper end of the PGS distribution, different discovery GWAS do not consistently identify the same individuals in the upper quantiles, with the best case being 60% of individuals above the 80th percentile of PGS overlapping from one height GWAS to another. The degree of overlap decreases sharply as higher quantiles, less heritable traits, and different-ancestry GWAS are considered. PGS computed from different discovery GWAS have only modest correlation at the individual level, underscoring the need to proceed cautiously with integrating PGS into precision medicine applications.
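The two individual-level stability measures mentioned in this abstract, the correlation between PGS built from different discovery GWAS and the overlap of individuals in the upper tail, can be sketched as follows. This is a minimal illustration on simulated scores, not the authors' pipeline; the function names, the simulated data, and the rank-based definition of the upper tail are all assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def top_tail_overlap(scores_a, scores_b, percentile=80):
    """Fraction of individuals above the given percentile of A who are also above it in B.

    The tail is defined by rank: the top (100 - percentile)% of individuals
    in each score list (an assumption made for this sketch).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same individuals")
    n = len(scores_a)
    k = max(1, round(n * (100 - percentile) / 100))
    top_a = set(sorted(range(n), key=lambda i: scores_a[i], reverse=True)[:k])
    top_b = set(sorted(range(n), key=lambda i: scores_b[i], reverse=True)[:k])
    return len(top_a & top_b) / k


# Simulated example: two noisy scores that share a common component,
# standing in for PGS computed from two different discovery GWAS.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(1000)]
pgs_a = [s + rng.gauss(0, 1) for s in shared]
pgs_b = [s + rng.gauss(0, 1) for s in shared]

print("correlation:", pearson(pgs_a, pgs_b))
print("overlap above 80th percentile:", top_tail_overlap(pgs_a, pgs_b, 80))
```

Raising `percentile` shrinks the tail being compared, which is the setting in which the abstract reports that overlap decreases sharply.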
Affiliation(s)
- Laura M. Schultz
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
- Alison K. Merikangas
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kosha Ruparel
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sébastien Jacquemont
  - UHC Sainte-Justine Research Center, Université de Montréal, Montréal, QC H3T 1C5, Canada
  - Department of Pediatrics, Université de Montréal, Montréal, QC H3T 1C5, Canada
- David C. Glahn
  - Tommy Fuss Center for Neuropsychiatric Disease Research, Boston Children's Hospital, Boston, MA, USA
  - Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Raquel E. Gur
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Child Adolescent Psychiatry and Behavioral Sciences, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Ran Barzilay
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Child Adolescent Psychiatry and Behavioral Sciences, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Laura Almasy
  - Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Lifespan Brain Institute, Children's Hospital of Philadelphia and Penn Medicine, Philadelphia, PA 19104, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
49
|
Cerebral Polymorphisms for Lateralisation: Modelling the Genetic and Phenotypic Architectures of Multiple Functional Modules. Symmetry (Basel) 2022. [DOI: 10.3390/sym14040814] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]
Abstract
Recent fMRI and fTCD studies have found that functional modules for aspects of language, praxis, and visuo-spatial functioning, while typically left, left and right hemispheric respectively, frequently show atypical lateralisation. Studies with increasing numbers of modules and participants are finding increasing numbers of module combinations, which here are termed cerebral polymorphisms—qualitatively different lateral organisations of cognitive functions. Polymorphisms are more frequent in left-handers than right-handers, but it is far from the case that right-handers all show the lateral organisation of modules described in introductory textbooks. In computational terms, this paper extends the original, monogenic McManus DC (dextral-chance) model of handedness and language dominance to multiple functional modules, and to a polygenic DC model compatible with the molecular genetics of handedness, and with the biology of visceral asymmetries found in primary ciliary dyskinesia. Distributions of cerebral polymorphisms are calculated for families and twins, and consequences and implications of cerebral polymorphisms are explored for explaining aphasia due to cerebral damage, as well as possible talents and deficits arising from atypical inter- and intra-hemispheric modular connections. The model is set in the broader context of the testing of psychological theories, of issues of laterality measurement, of mutation-selection balance, and the evolution of brain and visceral asymmetries.
|
50
|
Yang S, Zhou X. PGS-server: accuracy, robustness and transferability of polygenic score methods for biobank scale studies. Brief Bioinform 2022; 23:6534383. [PMID: 35193147 DOI: 10.1093/bib/bbac039] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/02/2021] [Revised: 12/29/2021] [Accepted: 01/26/2022] [Indexed: 01/02/2023]
Abstract
Polygenic scores (PGS) are important tools for carrying out genetic prediction of common diseases and disease related complex traits, facilitating the development of precision medicine. Unfortunately, despite the critical importance of PGS and the vast number of PGS methods recently developed, few comprehensive comparison studies have been performed to evaluate the effectiveness of PGS methods. To fill this critical knowledge gap, we performed a comprehensive comparison study on 12 different PGS methods through internal evaluations on 25 quantitative and 25 binary traits within the UK Biobank with sample sizes ranging from 147 408 to 336 573, and through external evaluations via 25 cross-study and 112 cross-ancestry analyses on summary statistics from multiple genome-wide association studies with sample sizes ranging from 1415 to 329 345. We evaluate the prediction accuracy, computational scalability, as well as robustness and transferability of different PGS methods across datasets and/or genetic ancestries, providing important guidelines for practitioners in choosing PGS methods. Besides method comparison, we present a simple aggregation strategy that combines multiple PGS from different methods to take advantage of their distinct benefits to achieve stable and superior prediction performance. To facilitate future applications of PGS, we also develop a PGS webserver (http://www.pgs-server.com/) that allows users to upload summary statistics and choose different PGS methods to fit the data directly. We hope that our results, method and webserver will facilitate the routine application of PGS across different research areas.
Affiliation(s)
- Sheng Yang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Xiang Zhou
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
|