1
Lee YC, Jung SH, Shivakumar M, Cha S, Park WY, Won HH, Eun YG, Penn Medicine Biobank, Kim D. Polygenic risk score-based phenome-wide association study of head and neck cancer across two large biobanks. BMC Med 2024; 22:120. [PMID: 38486201 PMCID: PMC10941505 DOI: 10.1186/s12916-024-03305-2] [Received: 10/28/2023] [Accepted: 02/15/2024] [Indexed: 03/17/2024]
Abstract
BACKGROUND Numerous observational studies have highlighted associations of genetic predisposition of head and neck squamous cell carcinoma (HNSCC) with diverse risk factors, but these findings are constrained by design limitations of observational studies. In this study, we utilized a phenome-wide association study (PheWAS) approach, incorporating a polygenic risk score (PRS) derived from a wide array of genomic variants, to systematically investigate phenotypes associated with genetic predisposition to HNSCC. Furthermore, we validated our findings across heterogeneous cohorts, enhancing the robustness and generalizability of our results. METHODS We derived PRSs for HNSCC and its subgroups, oropharyngeal cancer and oral cancer, using large-scale genome-wide association study summary statistics from the Genetic Associations and Mechanisms in Oncology Network. We conducted a comprehensive investigation, leveraging genotyping data and electronic health records from 308,492 individuals in the UK Biobank and 38,401 individuals in the Penn Medicine Biobank (PMBB), and subsequently performed PheWAS to elucidate the associations between PRS and a wide spectrum of phenotypes. RESULTS We revealed the HNSCC PRS showed significant association with phenotypes related to tobacco use disorder (OR, 1.06; 95% CI, 1.05-1.08; P = 3.50 × 10⁻¹⁵), alcoholism (OR, 1.06; 95% CI, 1.04-1.09; P = 6.14 × 10⁻⁹), alcohol-related disorders (OR, 1.08; 95% CI, 1.05-1.11; P = 1.09 × 10⁻⁸), emphysema (OR, 1.11; 95% CI, 1.06-1.16; P = 5.48 × 10⁻⁶), chronic airway obstruction (OR, 1.05; 95% CI, 1.03-1.07; P = 2.64 × 10⁻⁵), and cancer of bronchus (OR, 1.08; 95% CI, 1.04-1.13; P = 4.68 × 10⁻⁵). These findings were replicated in the PMBB cohort, and sensitivity analyses, including the exclusion of HNSCC cases and the major histocompatibility complex locus, confirmed the robustness of these associations.
Additionally, we identified significant associations between HNSCC PRS and lifestyle factors related to smoking and alcohol consumption. CONCLUSIONS The study demonstrated the potential of PRS-based PheWAS in revealing associations between genetic risk factors for HNSCC and various phenotypic traits. The findings emphasized the importance of considering genetic susceptibility in understanding HNSCC and highlighted shared genetic bases between HNSCC and other health conditions and lifestyles.
Affiliation(s)
- Young Chan Lee
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Sang-Hyuk Jung
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Manu Shivakumar
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Soojin Cha
  - Hanyang University Institute for Rheumatology Research, Seoul, Republic of Korea
- Woong-Yang Park
  - Samsung Genome Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Hong-Hee Won
  - Samsung Genome Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
  - Samsung Medical Center, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Young-Gyu Eun
  - Department of Otolaryngology-Head and Neck Surgery, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Penn Medicine Biobank
  - Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
2
Im C, Sharafeldin N, Yuan Y, Wang Z, Sapkota Y, Lu Z, Spector LG, Howell RM, Arnold MA, Hudson MM, Ness KK, Robison LL, Bhatia S, Armstrong GT, Neglia JP, Yasui Y, Turcotte LM. Polygenic Risk and Chemotherapy-Related Subsequent Malignancies in Childhood Cancer Survivors: A Childhood Cancer Survivor Study and St Jude Lifetime Cohort Study Report. J Clin Oncol 2023; 41:4381-4393. [PMID: 37459583 PMCID: PMC10522108 DOI: 10.1200/jco.23.00428] [Received: 02/23/2023] [Revised: 04/26/2023] [Accepted: 05/18/2023] [Indexed: 08/07/2023]
Abstract
PURPOSE Chemotherapeutic exposures are associated with subsequent malignant neoplasm (SMN) risk. The role of genetic susceptibility in chemotherapy-related SMNs should be defined as use of radiation therapy (RT) decreases. PATIENTS AND METHODS SMNs among long-term childhood cancer survivors of European (EUR; N = 9,895) and African (AFR; N = 718) genetic ancestry from the Childhood Cancer Survivor Study and St Jude Lifetime Cohort Study were evaluated. An externally validated 179-variant polygenic risk score (PRS) associated with pleiotropic adult cancer risk from the UK Biobank Study (N > 400,000) was computed for each survivor. SMN cumulative incidence comparing top and bottom PRS quintiles was estimated, along with hazard ratios (HRs) from proportional hazards models. RESULTS A total of 1,594 survivors developed SMNs, with basal cell carcinomas (n = 822), breast cancers (n = 235), and thyroid cancers (n = 221) being the most frequent. Although SMN risk associations with the PRS were extremely modest in RT-exposed EUR survivors (HR, 1.22; P = .048; n = 4,630), the increase in 30-year SMN cumulative incidence and HRs comparing top and bottom PRS quintiles was statistically significant among nonirradiated EUR survivors (n = 4,322) treated with alkylating agents (17% v 6%; HR, 2.46; P < .01), anthracyclines (20% v 8%; HR, 2.86; P < .001), epipodophyllotoxins (23% v 1%; HR, 12.20; P < .001), or platinums (46% v 7%; HR, 8.58; P < .01). This PRS also significantly modified epipodophyllotoxin-related SMN risk among nonirradiated AFR survivors (n = 414; P < .01). Improvements in prediction attributable to the PRS were greatest for epipodophyllotoxin-exposed (AUC, 0.71 v 0.63) and platinum-exposed (AUC, 0.68 v 0.58) survivors. CONCLUSION A pleiotropic cancer PRS has strong potential for improving SMN clinical risk stratification among nonirradiated survivors treated with specific chemotherapies.
A polygenic risk screening approach may be a valuable complement to an early screening strategy on the basis of treatments and rare cancer-susceptibility mutations.
Affiliation(s)
- Cindy Im
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Noha Sharafeldin
  - Hematology Oncology Division, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham, AL
- Yan Yuan
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Zhaoming Wang
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Computational Biology, St Jude Children's Research Hospital, Memphis, TN
- Yadav Sapkota
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Zhanni Lu
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Logan G. Spector
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Rebecca M. Howell
  - Department of Radiation Physics, University of Texas at MD Anderson Cancer Center, Houston, TX
- Michael A. Arnold
  - Department of Pathology and Laboratory Medicine, University of Colorado and Children's Hospital Colorado, Anschutz Medical Campus, Aurora, CO
- Melissa M. Hudson
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Oncology, St Jude Children's Research Hospital, Memphis, TN
- Kirsten K. Ness
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Leslie L. Robison
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Smita Bhatia
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham, AL
- Gregory T. Armstrong
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Oncology, St Jude Children's Research Hospital, Memphis, TN
- Joseph P. Neglia
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Yutaka Yasui
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
3
Xiao X, Wu Q. The clinical utility of the BMD-related comprehensive genome-wide polygenic score in identifying individuals with a high risk of osteoporotic fractures. Osteoporos Int 2023; 34:681-692. [PMID: 36622390 PMCID: PMC11225087 DOI: 10.1007/s00198-022-06654-x] [Received: 03/26/2022] [Accepted: 12/20/2022] [Indexed: 01/10/2023]
Abstract
The potential of bone mineral density (BMD)-related genome-wide polygenic score (PGS) in identifying individuals with a high risk of fractures remains unclear. This study suggests that an efficient PGS enables the identification of strata with up to a 1.5-fold difference in fracture incidence. Incorporating PGS into clinical diagnosis is anticipated to increase the population-level screening benefits. PURPOSE This study sought to construct genome-wide polygenic scores for femoral neck and total body BMD and to estimate their potential in identifying individuals with a high risk of osteoporotic fractures. METHODS Genome-wide polygenic scores were developed and validated for femoral neck and total body BMD. We externally tested the PGSs, both by themselves and in combination with available clinical risk factors, in 455,663 European ancestry individuals from the UK Biobank. The predictive accuracy of the developed genome-wide PGS was also compared with previously published restricted PGS employed in fracture risk assessment. RESULTS For each unit decrease in PGSs, the genome-wide PGSs were associated with up to 1.17-fold increased fracture risk. Out of four studied PGSs, [Formula: see text] (HR: 1.03; 95%CI 1.01-1.05, p = 0.001) had the weakest and the [Formula: see text] (HR: 1.17; 95%CI 1.15-1.19, p < 0.0001) had the strongest association with an incident fracture. In the reclassification analysis, compared to the FRAX base model, the models with [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text] improved the reclassification of fracture by 1.2% (95% CI, 1.0 to 1.3%), 0.2% (95% CI, 0.1 to 0.3%), 1.4% (95% CI, 1.3 to 1.5%), and 2.2% (95% CI, 2.1 to 2.4%), respectively. CONCLUSIONS Our findings suggested that an efficient PGS estimate enables the identification of strata with up to a 1.7-fold difference in fracture incidence. 
Incorporating PGS information into clinical diagnosis is anticipated to increase the benefits of screening programs at the population level.
Affiliation(s)
- Xiangxue Xiao
  - Nevada Institute of Personalized Medicine, College of Science, University of Nevada, Las Vegas, NV, USA
  - Department of Epidemiology and Biostatistics, School of Public Health, University of Nevada Las Vegas, Las Vegas, NV, USA
- Qing Wu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
4
Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals. Biol Psychiatry 2022; 92:923-931. [PMID: 35965108 DOI: 10.1016/j.biopsych.2022.06.004] [Received: 10/17/2021] [Revised: 04/01/2022] [Accepted: 06/02/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND Major depressive disorder (MDD) is a leading cause of disease-associated disability, with much of the increased burden due to psychiatric and medical comorbidity. This comorbidity partly reflects common genetic influences across conditions. Integrating molecular-genetic tools with health records enables tests of association with the broad range of physiological and clinical phenotypes. However, standard phenome-wide association studies analyze associations with individual genetic variants. For polygenic traits such as MDD, aggregate measures of genetic risk may yield greater insight into associations across the clinical phenome. METHODS We tested for associations between a genome-wide polygenic risk score for MDD and medical and psychiatric traits in a phenome-wide association study of 46,782 unrelated, European-ancestry participants from the Michigan Genomics Initiative. RESULTS The MDD polygenic risk score was associated with 211 traits from 15 medical and psychiatric disease categories at the phenome-wide significance threshold. After excluding patients with depression, continued associations were observed with respiratory, digestive, neurological, and genitourinary conditions; neoplasms; and mental disorders. Associations with tobacco use disorder, respiratory conditions, and genitourinary conditions persisted after accounting for genetic overlap between depression and other psychiatric traits. Temporal analyses of time-at-first-diagnosis indicated that depression disproportionately preceded chronic pain and substance-related disorders, while asthma disproportionately preceded depression. CONCLUSIONS The present results can inform the biological links between depression and both mental and systemic diseases. 
Although MDD polygenic risk scores cannot currently forecast health outcomes with precision at the individual level, as molecular-genetic discoveries for depression increase, these tools may augment risk prediction for medical and psychiatric conditions.
5
Campos-Staffico AM, Dorsch MP, Barnes GD, Zhu HJ, Limdi NA, Luzum JA. Eight pharmacokinetic genetic variants are not associated with the risk of bleeding from direct oral anticoagulants in non-valvular atrial fibrillation patients. Front Pharmacol 2022; 13:1007113. [PMID: 36506510 PMCID: PMC9730333 DOI: 10.3389/fphar.2022.1007113] [Received: 07/29/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022]
Abstract
Background: Atrial fibrillation (AF) is the leading cause of ischemic stroke and treatment has focused on reducing this risk through anticoagulation. Direct Oral Anticoagulants (DOACs) are the first-line guideline-recommended therapy since they are as effective and overall safer than warfarin in preventing AF-related stroke. Although patients bleed less from DOACs compared to warfarin, bleeding remains the primary safety concern with this therapy. Hypothesis: Genetic variants known to modify the function of metabolic enzymes or transporters involved in the pharmacokinetics (PK) of DOACs could increase the risk of bleeding. Aim: To assess the association of eight, functional PK-related single nucleotide variants (SNVs) in five genes (ABCB1, ABCG2, CYP2J2, CYP3A4, CYP3A5) with the risk of bleeding from DOACs in non-valvular AF patients. Methods: A retrospective cohort study was carried out with 2,364 self-identified white non-valvular AF patients treated with either rivaroxaban or apixaban. Genotyping was performed with Illumina Infinium CoreExome v12.1 bead arrays by the Michigan Genomics Initiative biobank. The primary endpoint was a composite of major and clinically relevant non-major bleeding. Cox proportional hazards regression with time-varying analysis assessed the association of the eight PK-related SNVs with the risk of bleeding from DOACs in unadjusted and covariate-adjusted models. The pre-specified primary analysis was the covariate-adjusted, additive genetic models. Six tests were performed in the primary analysis as three SNVs are in the same haplotype, and thus p-values below the Bonferroni-corrected level of 8.33e-3 were considered statistically significant. Results: In the primary analysis, none of the SNVs met the Bonferroni-corrected level of statistical significance (all p > 0.1). 
In exploratory analyses with other genetic models, the ABCB1 (rs4148732) GG genotype tended to be associated with the risk of bleeding from rivaroxaban [HR: 1.391 (95%CI: 1.019-1.900); p = 0.038] but not from apixaban (p = 0.487). Conclusion: Eight functional PK-related genetic variants were not significantly associated with bleeding from either rivaroxaban or apixaban in more than 2,000 AF self-identified white outpatients.
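The Bonferroni-corrected level quoted in this abstract (8.33e-3 for six tests) can be reproduced with a short sketch. The family-wise alpha of 0.05 is an assumption inferred from the arithmetic, since the abstract does not state it, and the function name is hypothetical. The comparison with the exploratory rivaroxaban p-value is for illustration only; the abstract applies the threshold to the primary analysis.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 (not stated in the abstract).
# Six tests, because three of the eight SNVs share one haplotype.
threshold = bonferroni_threshold(0.05, 6)

# Exploratory ABCB1 (rs4148732) GG result for rivaroxaban, as reported.
p_rivaroxaban = 0.038

# Nominally below 0.05, but not below the corrected level.
is_significant = p_rivaroxaban < threshold
```

Under these assumptions the corrected level is 0.05 / 6, which rounds to the 8.33e-3 given in the abstract, and p = 0.038 does not pass it.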
Affiliation(s)
- Michael P. Dorsch
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, Ann Arbor, MI, United States
- Geoffrey D. Barnes
  - Division of Cardiovascular Medicine, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, United States
- Hao-Jie Zhu
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, Ann Arbor, MI, United States
- Nita A. Limdi
  - Department of Neurology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
- Jasmine A. Luzum
  - Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, Ann Arbor, MI, United States
  - *Correspondence: Jasmine A. Luzum
6
Page ML, Vance EL, Cloward ME, Ringger E, Dayton L, Ebbert MTW, Miller JB, Kauwe JSK. The Polygenic Risk Score Knowledge Base offers a centralized online repository for calculating and contextualizing polygenic risk scores. Commun Biol 2022; 5:899. [PMID: 36056235 PMCID: PMC9438378 DOI: 10.1038/s42003-022-03795-x] [Received: 08/16/2021] [Accepted: 08/03/2022] [Indexed: 11/20/2022]
Abstract
The process of identifying suitable genome-wide association (GWA) studies and formatting the data to calculate multiple polygenic risk scores on a single genome can be laborious. Here, we present a centralized polygenic risk score calculator currently containing over 250,000 genetic variant associations from the NHGRI-EBI GWAS Catalog for users to easily calculate sample-specific polygenic risk scores with comparable results to other available tools. Polygenic risk scores are calculated either online through the Polygenic Risk Score Knowledge Base (PRSKB; https://prs.byu.edu ) or via a command-line interface. We report study-specific polygenic risk scores across the UK Biobank, 1000 Genomes, and the Alzheimer's Disease Neuroimaging Initiative (ADNI), contextualize computed scores, and identify potentially confounding genetic risk factors in ADNI. We introduce a streamlined analysis tool and web interface to calculate and contextualize polygenic risk scores across various studies, which we anticipate will facilitate a wider adaptation of polygenic risk scores in future disease research.
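As a rough illustration of what a polygenic risk score calculation involves, the sketch below computes a score as an effect-size-weighted sum of risk-allele dosages. This is the generic textbook form, not the PRSKB implementation, whose internals the abstract does not describe. The variant IDs, weights, and function name are all hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Sum effect-size-weighted risk-allele dosages over shared variants.

    Variants present in only one of the two mappings are skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[variant] * dosages[variant] for variant in shared)


# Hypothetical per-variant weights (e.g. log odds ratios from a GWAS).
weights = {"rsA": 0.10, "rsB": -0.05, "rsC": 0.20}

# Hypothetical genotype for one sample: risk-allele counts (0, 1, or 2).
# rsC was not genotyped, and rsD has no weight, so both are ignored.
dosages = {"rsA": 2, "rsB": 1, "rsD": 0}

score = polygenic_risk_score(dosages, weights)  # approximately 0.15
```

Skipping unmatched variants is only one possible design choice; real tools may instead impute missing genotypes or report the number of variants used.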
Affiliation(s)
- Madeline L Page
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Elizabeth L Vance
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Ed Ringger
  - Department of Biology, Brigham Young University, Provo, UT, USA
- Louisa Dayton
  - Department of Biology, Brigham Young University, Provo, UT, USA
- Mark T W Ebbert
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
  - Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
  - Department of Neuroscience, University of Kentucky, Lexington, KY, USA
- Justin B Miller
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
  - Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
  - Department of Pathology and Laboratory Medicine, University of Kentucky, Lexington, KY, USA
- John S K Kauwe
  - Department of Biology, Brigham Young University, Provo, UT, USA
7
Siemens A, Anderson SJ, Rassekh SR, Ross CJD, Carleton BC. A Systematic Review of Polygenic Models for Predicting Drug Outcomes. J Pers Med 2022; 12:1394. [PMID: 36143179 PMCID: PMC9505711 DOI: 10.3390/jpm12091394] [Received: 07/29/2022] [Revised: 08/21/2022] [Accepted: 08/25/2022] [Indexed: 11/16/2022]
Abstract
Polygenic models have emerged as promising prediction tools for the prediction of complex traits. Currently, the majority of polygenic models are developed in the context of predicting disease risk, but polygenic models may also prove useful in predicting drug outcomes. This study sought to understand how polygenic models incorporating pharmacogenetic variants are being used in the prediction of drug outcomes. A systematic review was conducted with the aim of gaining insights into the methods used to construct polygenic models, as well as their performance in drug outcome prediction. The search uncovered 89 papers that incorporated pharmacogenetic variants in the development of polygenic models. It was found that the most common polygenic models were constructed for drug dosing predictions in anticoagulant therapies (n = 27). While nearly all studies found a significant association with their polygenic model and the investigated drug outcome (93.3%), less than half (47.2%) compared the performance of the polygenic model against clinical predictors, and even fewer (40.4%) sought to validate model predictions in an independent cohort. Additionally, the heterogeneity of reported performance measures makes the comparison of models across studies challenging. These findings highlight key considerations for future work in developing polygenic models in pharmacogenomic research.
Affiliation(s)
- Angela Siemens
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3N1, Canada
  - BC Children’s Hospital Research Institute, Vancouver, BC V5Z 4H4, Canada
- Spencer J. Anderson
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3N1, Canada
  - BC Children’s Hospital Research Institute, Vancouver, BC V5Z 4H4, Canada
- S. Rod Rassekh
  - Division of Translational Therapeutics, Department of Pediatrics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3V4, Canada
  - Division of Oncology, Hematology and Bone Marrow Transplant, University of British Columbia, Vancouver, BC V6H 3V4, Canada
- Colin J. D. Ross
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3N1, Canada
  - BC Children’s Hospital Research Institute, Vancouver, BC V5Z 4H4, Canada
  - Faculty of Pharmaceutical Sciences, The University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Bruce C. Carleton
  - Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3N1, Canada
  - Division of Translational Therapeutics, Department of Pediatrics, Faculty of Medicine, University of British Columbia, Vancouver, BC V6H 3V4, Canada
  - Pharmaceutical Outcomes Programme, British Columbia Children’s Hospital, Vancouver, BC V5Z 4H4, Canada
8
Tang Y, You D, Yi H, Yang S, Zhao Y. IPRS: Leveraging Gene-Environment Interaction to Reconstruct Polygenic Risk Score. Front Genet 2022; 13:801397. [PMID: 35401709 PMCID: PMC8989431 DOI: 10.3389/fgene.2022.801397] [Received: 10/25/2021] [Accepted: 02/08/2022] [Indexed: 12/30/2022]
Abstract
Background: Polygenic risk score (PRS) is widely regarded as a predictor of genetic susceptibility to disease, applied to individuals to predict the risk of disease occurrence. When the gene-environment (G×E) interaction is considered, the traditional PRS prediction model directly uses PRS to interact with the environment without considering the interactions between each variant and environment, which may lead to prediction performance and risk stratification of complex diseases are not promising. Methods: We developed a method called interaction PRS (iPRS), reconstructing PRS by leveraging G×E interactions. Two extensive simulations evaluated prediction performance, risk stratification, and calibration performance of the iPRS prediction model, and compared it with the traditional PRS prediction model. Real data analysis was performed using existing data from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial study to predict genetic susceptibility, pack-years of smoking history, and G×E interactions in patients with lung cancer. Results: Two extensive simulations indicated iPRS prediction model could improve the prediction performance of disease risk, the accuracy of risk stratification, and clinical calibration performance compared with the traditional PRS prediction model, especially when antagonism accounted for the majority of the interaction. PLCO real data analysis also suggested that the iPRS prediction model was superior to the PRS prediction model in predictive effect (p = 0.0205). Conclusion: IPRS prediction model could have a good application prospect in predicting disease risk, optimizing the screening of high-risk populations, and improving the clinical benefits of preventive interventions among populations.
Affiliation(s)
- Yingdan Tang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
- Dongfang You
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
- Honggang Yi
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
- Sheng Yang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
  - *Correspondence: Sheng Yang; Yang Zhao
- Yang Zhao
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, China
  - Center of Biomedical Big Data and the Laboratory of Biomedical Big Data, Nanjing Medical University, Nanjing, China
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - *Correspondence: Sheng Yang; Yang Zhao
9
Maturation and application of phenome-wide association studies. Trends Genet 2022; 38:353-363. [PMID: 34991903 DOI: 10.1016/j.tig.2021.12.002] [Received: 09/11/2021] [Revised: 11/12/2021] [Accepted: 12/02/2021] [Indexed: 12/12/2022]
Abstract
In the past 10 years since its introduction, phenome-wide association studies (PheWAS) have uncovered novel genotype-phenotype relationships. Along the way, PheWAS have evolved in many aspects as a study design with the expanded availability of large data repositories with genome-wide data linked to detailed phenotypic data. Advancement in methods, including algorithms, software, and publicly available integrated resources, makes it feasible to more fully realize the potential of PheWAS, overcoming the previous computational and analytical limitations. We review here the most recent improvements and notable applications of PheWAS since the second half of the decade from its inception. We also note the challenges that remain embedded along the entire PheWAS analytical pipeline that necessitate further development of tools and resources to further advance the understanding of the complex genetic architecture underlying human diseases and traits.
10
Gusev A, Groha S, Taraszka K, Semenov YR, Zaitlen N. Constructing germline research cohorts from the discarded reads of clinical tumor sequences. Genome Med 2021; 13:179. [PMID: 34749793 PMCID: PMC8576948 DOI: 10.1186/s13073-021-00999-4] [Received: 04/08/2021] [Accepted: 10/28/2021] [Indexed: 12/02/2022]
Abstract
Background Hundreds of thousands of cancer patients have had targeted (panel) tumor sequencing to identify clinically meaningful mutations. In addition to improving patient outcomes, this activity has led to significant discoveries in basic and translational domains. However, the targeted nature of clinical tumor sequencing has a limited scope, especially for germline genetics. In this work, we assess the utility of discarded, off-target reads from tumor-only panel sequencing for the recovery of genome-wide germline genotypes through imputation. Methods We developed a framework for inference of germline variants from tumor panel sequencing, including imputation, quality control, inference of genetic ancestry, germline polygenic risk scores, and HLA alleles. We benchmarked our framework on 833 individuals with tumor sequencing and matched germline SNP array data. We then applied our approach to a prospectively collected panel sequencing cohort of 25,889 tumors. Results We demonstrate high to moderate accuracy of each inferred feature relative to direct germline SNP array genotyping: individual common variants were imputed with a mean accuracy (correlation) of 0.86, genetic ancestry was inferred with a correlation of > 0.98, polygenic risk scores were inferred with a correlation of > 0.90, and individual HLA alleles were inferred with a correlation of > 0.80. We demonstrate a minimal influence on the accuracy of somatic copy number alterations and other tumor features. We showcase the feasibility and utility of our framework by analyzing 25,889 tumors and identifying the relationships between genetic ancestry, polygenic risk, and tumor characteristics that could not be studied with conventional on-target tumor data. Conclusions We conclude that targeted tumor sequencing can be leveraged to build rich germline research cohorts from existing data and make our analysis pipeline publicly available to facilitate this effort. 
Supplementary Information The online version contains supplementary material available at 10.1186/s13073-021-00999-4.
Affiliation(s)
- Alexander Gusev
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA; Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; The Broad Institute of MIT & Harvard, Cambridge, MA, USA.
| | - Stefan Groha
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA; The Broad Institute of MIT & Harvard, Cambridge, MA, USA
| | - Kodi Taraszka
- Departments of Neurology and Computational Medicine, UCLA, Los Angeles, CA, USA
| | - Yevgeniy R Semenov
- Department of Dermatology, Massachusetts General Hospital, Boston, MA, USA
| | - Noah Zaitlen
- Departments of Neurology and Computational Medicine, UCLA, Los Angeles, CA, USA.
|
11
|
Bakshi A, Yan M, Riaz M, Polekhina G, Orchard SG, Tiller J, Wolfe R, Joshi A, Cao Y, McInerney-Leo AM, Yanes T, Janda M, Soyer HP, Cust AE, Law MH, Gibbs P, McLean C, Chan AT, McNeil JJ, Mar VJ, Lacaze P. Genomic Risk Score for Melanoma in a Prospective Study of Older Individuals. J Natl Cancer Inst 2021; 113:1379-1385. [PMID: 33837773 PMCID: PMC8921762 DOI: 10.1093/jnci/djab076] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/26/2020] [Revised: 02/16/2021] [Accepted: 03/30/2021] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Recent genome-wide association meta-analysis for melanoma doubled the number of previously identified variants. We assessed the performance of an updated polygenic risk score (PRS) in a population of older individuals, where melanoma incidence and cumulative ultraviolet radiation exposure are greatest. METHODS We assessed a PRS for cutaneous melanoma comprising 55 variants in a prospective study of 12 712 individuals in the ASPirin in Reducing Events in the Elderly Trial. We evaluated incident melanomas diagnosed during the trial and prevalent melanomas diagnosed preenrolment (self-reported). Multivariable models examined associations between PRS as a continuous variable (per SD) and categorical (low-risk [0%-20%], medium-risk [21%-80%], high-risk [81%-100%] groups) with incident melanoma. Logistic regression examined the association between PRS and prevalent melanoma. RESULTS At baseline, mean participant age was 75 years; 55.0% were female, and 528 (4.2%) had prevalent melanomas. During follow-up (median = 4.7 years), 120 (1.0%) incident cutaneous melanomas occurred, 98 of which were in participants with no history. PRS was associated with incident melanoma (hazard ratio = 1.46 per SD, 95% confidence interval [CI] = 1.20 to 1.77) and prevalent melanoma (odds ratio [OR] = 1.55 per SD, 95% CI = 1.42 to 1.69). Participants in the highest-risk PRS group had increased risk compared with the low-risk group for incident melanoma (OR = 2.51, 95% CI = 1.28 to 4.92) and prevalent melanoma (OR = 3.66, 95% CI = 2.69 to 5.05). When stratifying by sex, only males had an association between the PRS and incident melanoma, whereas both sexes had an association between the PRS and prevalent melanoma. CONCLUSIONS A genomic risk score is associated with melanoma risk in older individuals and may contribute to targeted surveillance.
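The abstract above reports the PRS both per standard deviation and in three percentile bands (0%-20%, 21%-80%, 81%-100%). The following is a minimal sketch of how such a transformation might be done, assuming the percentile cut-offs are computed within the analysed sample; the function names and the toy scores are hypothetical and are not taken from the study.

```python
import numpy as np

def standardize(scores):
    """Rescale raw scores so that one unit equals one sample SD."""
    s = np.asarray(scores, dtype=float)
    return (s - s.mean()) / s.std(ddof=1)

def risk_group(scores, low_pct=20, high_pct=80):
    """Label each score low / medium / high by within-sample percentiles.

    Assumption: cut-offs come from the same sample being classified.
    """
    s = np.asarray(scores, dtype=float)
    low_cut, high_cut = np.percentile(s, [low_pct, high_pct])
    return np.where(s <= low_cut, "low",
                    np.where(s > high_cut, "high", "medium"))

# Toy raw scores for ten hypothetical participants.
raw = np.arange(1, 11)
z = standardize(raw)
groups = risk_group(raw)
```

With per-SD scores, a regression coefficient reads as the change in risk per one-SD increase in the PRS, which is the form in which the hazard and odds ratios above are quoted.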
Affiliation(s)
- Andrew Bakshi
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Mabel Yan
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Moeen Riaz
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Galina Polekhina
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Suzanne G Orchard
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Jane Tiller
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Rory Wolfe
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Amit Joshi
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; MGH Cancer Center, Boston, MA, USA
| | - Yin Cao
- Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, St Louis, MO, USA; Alvin J. Siteman Cancer Center, Washington University School of Medicine, St Louis, MO, USA
| | - Aideen M McInerney-Leo
- The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre, Brisbane, QLD, Australia
| | - Tatiane Yanes
- The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre, Brisbane, QLD, Australia
| | - Monika Janda
- The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre, Brisbane, QLD, Australia
- Centre of Health Services Research, Faculty of Medicine, The University of Queensland, Brisbane, Queensland, Australia
| | - H Peter Soyer
- The University of Queensland Diamantina Institute, The University of Queensland, Dermatology Research Centre, Brisbane, QLD, Australia
| | - Anne E Cust
- Sydney School of Public Health and Melanoma Institute Australia, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
| | - Matthew H Law
- Statistical Genetics Lab, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
- School of Biomedical Sciences, Faculty of Health, and Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, Queensland, Australia, Personalised Oncology Division, Walter and Eliza Hall Institute of Medical Research and Faculty of Medicine University of Melbourne, Australia
| | - Peter Gibbs
- Department of Anatomical Pathology, Alfred Hospital, Melbourne, Victoria, Australia
| | - Catriona McLean
- Department of Anatomical Pathology, Alfred Hospital, Melbourne, Victoria, Australia
| | - Andrew T Chan
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; MGH Cancer Center, Boston, MA, USA
| | - John J McNeil
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
| | - Victoria J Mar
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Victorian Melanoma Service, Alfred Health, Melbourne, Australia
| | - Paul Lacaze
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
|
12
|
Konuma T, Okada Y. Statistical genetics and polygenic risk score for precision medicine. Inflamm Regen 2021; 41:18. [PMID: 34140035 PMCID: PMC8212479 DOI: 10.1186/s41232-021-00172-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/28/2021] [Accepted: 06/09/2021] [Indexed: 12/27/2022] Open
Abstract
The prediction of disease risks is an essential part of personalized medicine, which includes early disease detection, prevention, and intervention. The polygenic risk score (PRS) has become the standard for quantifying genetic liability in predicting disease risks. PRS utilizes single-nucleotide polymorphisms (SNPs) with genetic risks elucidated by genome-wide association studies (GWASs) and is calculated as weighted sum scores of these SNPs with genetic risks using their effect sizes from GWASs as their weights. The utilities of PRS have been explored in many common diseases, such as cancer, coronary artery disease, obesity, and diabetes, and in various non-disease traits, such as clinical biomarkers. These applications demonstrated that PRS could identify a high-risk subgroup of these diseases as a predictive biomarker and provide information on modifiable risk factors driving health outcomes. On the other hand, there are several limitations to implementing PRSs in clinical practice, such as biased sensitivity for the ethnic background of PRS calculation and geographical differences even in the same population groups. Also, it remains unclear which method is the most suitable for the prediction with high accuracy among numerous PRS methods developed so far. Although further improvements of its comprehensiveness and generalizability will be needed for its clinical implementation in the future, PRS will be a powerful tool for therapeutic interventions and lifestyle recommendations in a wide range of diseases. Thus, it may ultimately improve the health of an entire population in the future.
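The weighted-sum definition in this abstract can be written as PRS_i = Σ_j β_j · g_ij, where g_ij is the number of risk alleles individual i carries at SNP j and β_j is the GWAS effect size. The sketch below illustrates that sum only; the function name, the dosage matrix, and the effect sizes are made up for illustration, and real pipelines add steps (variant matching, LD handling, p-value thresholding) that are not shown.

```python
import numpy as np

def polygenic_risk_score(dosages, effect_sizes):
    """Return one weighted-sum score per individual.

    dosages: (n_individuals, n_snps) risk-allele counts (0, 1 or 2).
    effect_sizes: (n_snps,) per-SNP weights, e.g. GWAS log odds ratios.
    """
    g = np.asarray(dosages, dtype=float)
    beta = np.asarray(effect_sizes, dtype=float)
    if g.ndim != 2 or g.shape[1] != beta.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_snps) "
                         "and match the number of effect sizes")
    return g @ beta

# Hypothetical toy data: 3 individuals, 3 SNPs.
dosages = np.array([[0, 1, 2],
                    [2, 0, 1],
                    [1, 1, 0]])
betas = np.array([0.10, -0.05, 0.20])
scores = polygenic_risk_score(dosages, betas)
```

For the first individual the sum is 0 × 0.10 + 1 × (−0.05) + 2 × 0.20 = 0.35.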
Affiliation(s)
- Takahiro Konuma
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan; Central Pharmaceutical Research Institute, Japan Tobacco Inc., Takatsuki, 569-1125, Japan
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita, 565-0871, Japan; Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, 565-0871, Japan; Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, 565-0871, Japan.
|
13
|
Martucci VL, Richmond B, Davis LK, Blackwell TS, Cox NJ, Samuels D, Velez Edwards D, Aldrich MC. Fate or coincidence: do COPD and major depression share genetic risk factors? Hum Mol Genet 2021; 30:619-628. [PMID: 33704461 DOI: 10.1093/hmg/ddab068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/28/2020] [Revised: 02/24/2021] [Accepted: 02/27/2021] [Indexed: 01/12/2023] Open
Abstract
Major depressive disorder (MDD) is a common comorbidity in chronic obstructive pulmonary disease (COPD), affecting up to 57% of patients with COPD. Although the comorbidity of COPD and MDD is well established, the causal relationship between these two diseases is unclear. A large-scale electronic health record clinical biobank and genome-wide association study summary statistics for MDD and lung function traits were used to investigate potential shared underlying genetic susceptibility between COPD and MDD. Linkage disequilibrium score regression was used to estimate genetic correlation between phenotypes. Polygenic risk scores (PRS) for MDD and lung function traits were developed and used to perform a phenome-wide association study (PheWAS). Multi-trait-based conditional and joint analysis identified single-nucleotide polymorphisms (SNPs) influencing both lung function and MDD. We found genetic correlations between MDD and all lung function traits were small and not statistically significant. A PRS-MDD was significantly associated with an increased risk of COPD in a PheWAS [odds ratio (OR) = 1.12, 95% confidence interval (CI): 1.09-1.16] when adjusting for age, sex and genetic ancestry, but this relationship became attenuated when controlling for smoking history (OR = 1.08, 95% CI: 1.04-1.13). No significant associations were found between the lung function PRS and MDD. Multi-trait-based conditional and joint analysis identified three SNPs that may contribute to both traits, two of which were previously associated with mood disorders and COPD. Our findings suggest that the observed relationship between COPD and MDD may not be driven by a strong shared genetic architecture.
Affiliation(s)
- Victoria L Martucci
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Bradley Richmond
- Department of Veterans Affairs Medical Center, Nashville, TN 37212, USA; Division of Allergy, Pulmonary, and Critical Care Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Lea K Davis
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Timothy S Blackwell
- Department of Veterans Affairs Medical Center, Nashville, TN 37212, USA; Division of Allergy, Pulmonary, and Critical Care Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - Nancy J Cox
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - David Samuels
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - Digna Velez Edwards
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Division of Quantitative Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Melinda C Aldrich
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN 37232, USA; Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Thoracic Surgery, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Division of Epidemiology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
|
14
|
Chen Z, Boehnke M, Wen X, Mukherjee B. Revisiting the genome-wide significance threshold for common variant GWAS. G3 (Bethesda) 2021; 11:jkaa056. [PMID: 33585870 PMCID: PMC8022962 DOI: 10.1093/g3journal/jkaa056] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 06/18/2020] [Accepted: 11/05/2020] [Indexed: 11/23/2022]
Abstract
Over the last decade, GWAS meta-analyses have used a strict P-value threshold of 5 × 10⁻⁸ to classify associations as significant. Here, we use our current understanding of frequently studied traits including lipid levels, height, and BMI to revisit this genome-wide significance threshold. We compare the performance of studies using the P = 5 × 10⁻⁸ threshold in terms of true and false positive rate to other multiple testing strategies: (1) less stringent P-value thresholds, (2) controlling the FDR with the Benjamini-Hochberg and Benjamini-Yekutieli procedures, and (3) controlling the Bayesian FDR with posterior probabilities. We applied these procedures to re-analyze results from the Global Lipids and GIANT GWAS meta-analysis consortia and supported them with extensive simulation that mimics the empirical data. We observe in simulated studies with sample sizes ∼20,000 and >120,000 that relaxing the P-value threshold to 5 × 10⁻⁷ increased discovery at the cost of 18% and 8% of additional loci being false positive results, respectively. FDR and Bayesian FDR are well controlled for both sample sizes with a few exceptions that disappear under a less stringent definition of true positives and the two approaches yield similar results. Our work quantifies the value of using a relaxed P-value threshold in large studies to increase their true positive discovery but also shows the excess false positive rates due to such actions in modest-sized studies. These results may guide investigators considering different thresholds in replication studies and downstream work such as gene-set enrichment or pathway analysis. Finally, we demonstrate the viability of FDR-controlling procedures in GWAS.
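To make the contrast between a fixed P-value threshold and FDR control concrete, here is a minimal sketch of the standard Benjamini-Hochberg step-up rule next to a fixed cut-off. The p-values are invented for illustration and the function names are hypothetical; this is not the simulation design or the code used in the paper.

```python
import numpy as np

def fixed_threshold(p_values, alpha=5e-8):
    """Flag tests whose p-value is at or below a fixed cut-off."""
    return np.asarray(p_values, dtype=float) <= alpha

def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Sort the m p-values, find the largest rank k with p_(k) <= q * k / m,
    and reject the k smallest p-values.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    thresholds = q * np.arange(1, m + 1) / m
    passing = np.nonzero(ranked <= thresholds)[0]
    rejected = np.zeros(m, dtype=bool)
    if passing.size:
        k = passing[-1]                 # largest rank meeting its threshold
        rejected[order[:k + 1]] = True  # reject everything up to that rank
    return rejected

# Invented p-values for six hypothetical variants.
pvals = [1e-9, 3e-8, 2e-7, 0.01, 0.04, 0.5]
strict = fixed_threshold(pvals)        # genome-wide threshold 5e-8
bh = benjamini_hochberg(pvals, q=0.05)
```

In this toy set the fixed threshold flags two variants while the step-up rule flags five, which mirrors the trade-off the abstract describes: more discoveries in exchange for tolerating a controlled proportion of false positives.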
Affiliation(s)
- Zhongsheng Chen
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109-2029, USA
| | - Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109-2029, USA
| | - Xiaoquan Wen
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109-2029, USA
| | - Bhramar Mukherjee
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109-2029, USA
|
15
|
Seviiri M, Law MH, Ong JS, Gharahkhani P, Nyholt DR, Olsen CM, Whiteman DC, MacGregor S. Polygenic Risk Scores Allow Risk Stratification for Keratinocyte Cancer in Organ-Transplant Recipients. J Invest Dermatol 2021; 141:325-333.e6. [DOI: 10.1016/j.jid.2020.06.017] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/13/2020] [Revised: 06/11/2020] [Accepted: 06/16/2020] [Indexed: 10/24/2022]
|
16
|
Abstract
We trained and validated risk prediction models for the three major types of skin cancer, namely basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and melanoma, on a cross-sectional and longitudinal dataset of 210,000 consented research participants who responded to an online survey covering personal and family history of skin cancer, skin susceptibility, and UV exposure. We developed a primary disease risk score (DRS) that combined all 32 identified genetic and non-genetic risk factors. Top percentile DRS was associated with an up to 13-fold increase (odds ratio per standard deviation increase >2.5) in the risk of developing skin cancer relative to the middle DRS percentile. To derive lifetime risk trajectories for the three skin cancers, we developed a second, age-independent disease score, called DRSA. Using incident cases, we demonstrated that DRSA could be used in early detection programs for identifying high-risk asymptomatic individuals and predicting when they are likely to develop skin cancer. High DRSA scores were not only associated with earlier disease diagnosis (by up to 14 years), but also with more severe and recurrent forms of skin cancer.
|
17
|
Kachuri L, Graff RE, Smith-Byrne K, Meyers TJ, Rashkin SR, Ziv E, Witte JS, Johansson M. Pan-cancer analysis demonstrates that integrating polygenic risk scores with modifiable risk factors improves risk prediction. Nat Commun 2020; 11:6084. [PMID: 33247094 PMCID: PMC7695829 DOI: 10.1038/s41467-020-19600-4] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 06/11/2020] [Accepted: 10/05/2020] [Indexed: 12/28/2022] Open
Abstract
Cancer risk is determined by a complex interplay of environmental and heritable factors. Polygenic risk scores (PRS) provide a personalized genetic susceptibility profile that may be leveraged for disease prediction. Using data from the UK Biobank (413,753 individuals; 22,755 incident cancer cases), we quantify the added predictive value of integrating cancer-specific PRS with family history and modifiable risk factors for 16 cancers. We show that incorporating PRS measurably improves prediction accuracy for most cancers, but the magnitude of this improvement varies substantially. We also demonstrate that stratifying on levels of PRS identifies significantly divergent 5-year risk trajectories after accounting for family history and modifiable risk factors. At the population level, the top 20% of the PRS distribution accounts for 4.0% to 30.3% of incident cancer cases, exceeding the impact of many lifestyle-related factors. In summary, this study illustrates the potential for improving cancer risk assessment by integrating genetic risk scores.
Affiliation(s)
- Linda Kachuri
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | - Rebecca E Graff
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | - Karl Smith-Byrne
- Genetic Epidemiology Group, Section of Genetics, International Agency for Research on Cancer, Lyon, France
| | - Travis J Meyers
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | - Sara R Rashkin
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | - Elad Ziv
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - John S Witte
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA.
- Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
- Department of Urology, University of California, San Francisco, San Francisco, CA, USA.
| | - Mattias Johansson
- Genetic Epidemiology Group, Section of Genetics, International Agency for Research on Cancer, Lyon, France.
|
18
|
Fritsche LG, Patil S, Beesley LJ, VandeHaar P, Salvatore M, Ma Y, Peng RB, Taliun D, Zhou X, Mukherjee B. Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks. Am J Hum Genet 2020; 107:815-836. [PMID: 32991828 PMCID: PMC7675001 DOI: 10.1016/j.ajhg.2020.08.025] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 02/23/2020] [Accepted: 08/28/2020] [Indexed: 02/06/2023] Open
Abstract
To facilitate scientific collaboration on polygenic risk scores (PRSs) research, we created an extensive PRS online repository for 35 common cancer traits integrating freely available genome-wide association studies (GWASs) summary statistics from three sources: published GWASs, the NHGRI-EBI GWAS Catalog, and UK Biobank-based GWASs. Our framework condenses these summary statistics into PRSs using various approaches such as linkage disequilibrium pruning/p value thresholding (fixed or data-adaptively optimized thresholds) and penalized, genome-wide effect size weighting. We evaluated the PRSs in two biobanks: the Michigan Genomics Initiative (MGI), a longitudinal biorepository effort at Michigan Medicine, and the population-based UK Biobank (UKB). For each PRS construct, we provide measures on predictive performance and discrimination. Besides PRS evaluation, the Cancer-PRSweb platform features construct downloads and phenome-wide PRS association study results (PRS-PheWAS) for predictive PRSs. We expect this integrated platform to accelerate PRS-related cancer research.
Affiliation(s)
- Lars G Fritsche
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA.
| | - Snehal Patil
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Lauren J Beesley
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Peter VandeHaar
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Maxwell Salvatore
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Ying Ma
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Robert B Peng
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Department of Statistics, Northwestern University, Evanston, IL 60208, USA
| | - Daniel Taliun
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Bhramar Mukherjee
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Center for Precision Health Data Science, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI 48109, USA; Department of Epidemiology, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA; University of Michigan Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA.
|
19
|
Babb de Villiers C, Kroese M, Moorthie S. Understanding polygenic models, their development and the potential application of polygenic scores in healthcare. J Med Genet 2020; 57:725-732. [PMID: 32376789 PMCID: PMC7591711 DOI: 10.1136/jmedgenet-2019-106763] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/09/2019] [Revised: 03/09/2020] [Accepted: 03/28/2020] [Indexed: 02/06/2023]
Abstract
The use of genomic information to better understand and prevent common complex diseases has been an ongoing goal of genetic research. Over the past few years, research in this area has proliferated with several proposed methods of generating polygenic scores. This has been driven by the availability of larger data sets, primarily from genome-wide association studies and concomitant developments in statistical methodologies. Here we provide an overview of the methodological aspects of polygenic model construction. In addition, we consider the state of the field and implications for potential applications of polygenic scores for risk estimation within healthcare.
Affiliation(s)
| | - Mark Kroese
- PHG Foundation, University of Cambridge, Cambridge, Cambridgeshire, UK
| | - Sowmiya Moorthie
- PHG Foundation, University of Cambridge, Cambridge, Cambridgeshire, UK
|
20
|
Wendt FR, Carvalho CM, Pathak GA, Gelernter J, Polimanti R. Polygenic risk for autism spectrum disorder associates with anger recognition in a neurodevelopment-focused phenome-wide scan of unaffected youths from a population-based cohort. PLoS Genet 2020; 16:e1009036. [PMID: 32941431 PMCID: PMC7523983 DOI: 10.1371/journal.pgen.1009036] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/15/2020] [Revised: 09/29/2020] [Accepted: 08/08/2020] [Indexed: 12/27/2022] Open
Abstract
The polygenic nature and the contribution of common genetic variation to autism spectrum disorder (ASD) allude to a high degree of pleiotropy between ASD and other psychiatric and behavioral traits. In a pleiotropic system, a single genetic variant contributes small effects to several phenotypes or disorders. While analyzed broadly, there is a paucity of research studies investigating the shared genetic information between specific neurodevelopmental domains and ASD. We performed a phenome-wide association study of ASD polygenic risk score (PRS) against 491 neurodevelopmental subdomains ascertained in 4,309 probands from the Philadelphia Neurodevelopmental Cohort (PNC) who lack an ASD diagnosis. Our main analysis calculated ASD PRS in 4,309 PNC probands using the per-SNP effects reported in a recent genome-wide association study of ASD in a case-control design. In a high-resolution manner, our main analysis regressed ASD PRS against 491 neurodevelopmental phenotypes with age, sex, and ten principal components of ancestry as covariates. Follow-up analyses included in the regression model PRS derived from brain-related traits genetically correlated with ASD. Our main finding demonstrated that 11- to 17-year-old probands with the highest ASD genetic risk were able to identify angry faces (R² = 1.06%, p = 1.38 × 10⁻⁷, Bonferroni-corrected p = 1.9 × 10⁻³). This ability replicated in older probands (>18 years; R² = 0.55%, p = 0.036) and persisted after covarying with other psychiatric disorders, brain imaging traits, and educational attainment (R² = 0.2%, p = 0.019). We also detected several suggestive associations between ASD PRS and emotionality and connectedness with others.
These data (i) indicate how genetic liability to ASD may influence neurodevelopment in the general population, (ii) reinforce epidemiological findings of heightened ability of ASD cases to predict certain social psychological events based on increased systemizing skills, and (iii) recapitulate theories of imbalance between empathizing and systemizing in ASD etiology.

Large-scale genetic studies have identified many regions of the genome associated with autism spectrum disorder that are considered common in the general population. We investigated how the additive effects of these genetic variations associate with neurodevelopment in youths who lack an ASD diagnosis to better understand how genetic risk for ASD may contribute to other aspects of mental health. We uncovered a relationship between greater genetic risk for ASD and more accurate recognition of angry emotions in others, which persists after considering genetic associations with other psychiatric disorders, educational attainment, and brain region volume. This finding is consistent with existing theories of the relationship between ASD genetic liability and a person’s ability to build generalizable and impulse driven models for responding to social phenomena.
Affiliation(s)
- Frank R. Wendt
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare Center, West Haven, United States of America
- Carolina Muniz Carvalho
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare Center, West Haven, United States of America
  - Department of Psychiatry, Universidade Federal de São Paulo (UNIFESP), São Paulo, SP, Brazil
- Gita A. Pathak
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare Center, West Haven, United States of America
- Joel Gelernter
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare Center, West Haven, United States of America
  - Departments of Genetics and Neuroscience, Yale University School of Medicine, New Haven, United States of America
- Renato Polimanti
  - Department of Psychiatry, Yale School of Medicine and VA CT Healthcare Center, West Haven, United States of America
21
Yang S, Zhou X. Accurate and Scalable Construction of Polygenic Scores in Large Biobank Data Sets. Am J Hum Genet 2020; 106:679-693. [PMID: 32330416 PMCID: PMC7212266 DOI: 10.1016/j.ajhg.2020.03.013] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 12/12/2019] [Accepted: 03/30/2020] [Indexed: 01/24/2023] Open
Abstract
Accurate construction of polygenic scores (PGS) can enable early diagnosis of diseases and facilitate the development of personalized medicine. Accurate PGS construction requires prediction models that are both adaptive to different genetic architectures and scalable to biobank scale datasets with millions of individuals and tens of millions of genetic variants. Here, we develop such a method called Deterministic Bayesian Sparse Linear Mixed Model (DBSLMM). DBSLMM relies on a flexible modeling assumption on the effect size distribution to achieve robust and accurate prediction performance across a range of genetic architectures. DBSLMM also relies on a simple deterministic search algorithm to yield an approximate analytic estimation solution using summary statistics only. The deterministic search algorithm, when paired with further algebraic innovations, results in substantial computational savings. With simulations, we show that DBSLMM achieves scalable and accurate prediction performance across a range of realistic genetic architectures. We then apply DBSLMM to analyze 25 traits in UK Biobank. For these traits, compared to existing approaches, DBSLMM achieves an average of 2.03%-101.09% accuracy gain in internal cross-validations. In external validations on two separate datasets, including one from BioBank Japan, DBSLMM achieves an average of 14.74%-522.74% accuracy gain. In these real data applications, DBSLMM is 1.03-28.11 times faster and uses only 7.4%-24.8% of physical memory as compared to other multiple regression-based PGS methods. Overall, DBSLMM represents an accurate and scalable method for constructing PGS in biobank scale datasets.
Affiliation(s)
- Sheng Yang
  - Department of Biostatistics, Nanjing Medical University, Nanjing, Jiangsu 211166, China; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
22
Li R, Chen Y, Ritchie MD, Moore JH. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet 2020; 21:493-502. [PMID: 32235907 DOI: 10.1038/s41576-020-0224-1] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Accepted: 03/02/2020] [Indexed: 01/03/2023]
Abstract
Accurate prediction of disease risk based on the genetic make-up of an individual is essential for effective prevention and personalized treatment. Nevertheless, to date, individual genetic variants from genome-wide association studies have achieved only moderate prediction of disease risk. The aggregation of genetic variants under a polygenic model shows promising improvements in prediction accuracies. Increasingly, electronic health records (EHRs) are being linked to patient genetic data in biobanks, which provides new opportunities for developing and applying polygenic risk scores in the clinic, to systematically examine and evaluate patient susceptibilities to disease. However, the heterogeneous nature of EHR data brings forth many practical challenges along every step of designing and implementing risk prediction strategies. In this Review, we present the unique considerations for using genotype and phenotype data from biobank-linked EHRs for polygenic risk prediction.
Affiliation(s)
- Ruowang Li
  - Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Yong Chen
  - Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Marylyn D Ritchie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Jason H Moore
  - Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA
23
Beesley LJ, Salvatore M, Fritsche LG, Pandit A, Rao A, Brummett C, Willer CJ, Lisabeth LD, Mukherjee B. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Stat Med 2020; 39:773-800. [PMID: 31859414 PMCID: PMC7983809 DOI: 10.1002/sim.8445] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 10/10/2018] [Revised: 09/10/2019] [Accepted: 11/16/2019] [Indexed: 01/03/2023]
Abstract
Biobanks linked to electronic health records provide rich resources for health-related research. With improvements in administrative and informatics infrastructure, the availability and utility of data from biobanks have dramatically increased. In this paper, we first aim to characterize the current landscape of available biobanks and to describe specific biobanks, including their place of origin, size, and data types. The development and accessibility of large-scale biorepositories provide the opportunity to accelerate agnostic searches, expedite discoveries, and conduct hypothesis-generating studies of disease-treatment, disease-exposure, and disease-gene associations. Rather than designing and implementing a single study focused on a few targeted hypotheses, researchers can potentially use biobanks' existing resources to answer an expanded selection of exploratory questions as quickly as they can analyze them. However, there are many obvious and subtle challenges with the design and analysis of biobank-based studies. Our second aim is to discuss statistical issues related to biobank research such as study design, sampling strategy, phenotype identification, and missing data. We focus our discussion on biobanks that are linked to electronic health records. Some of the analytic issues are illustrated using data from the Michigan Genomics Initiative and UK Biobank, two biobanks with two different recruitment mechanisms. We summarize the current body of literature for addressing these challenges and discuss some standing open problems. This work complements and extends recent reviews about biobank-based research and serves as a resource catalog with analytical and practical guidance for statisticians, epidemiologists, and other medical researchers pursuing research using biobanks.
Affiliation(s)
- Anita Pandit
  - University of Michigan, Department of Biostatistics
- Arvind Rao
  - University of Michigan, Department of Computational Medicine and Bioinformatics
- Chad Brummett
  - University of Michigan, Department of Anesthesiology
- Cristen J. Willer
  - University of Michigan, Department of Computational Medicine and Bioinformatics
24
Janssens ACJW. Validity of polygenic risk scores: are we measuring what we think we are? Hum Mol Genet 2019; 28:R143-R150. [PMID: 31504522 PMCID: PMC7013150 DOI: 10.1093/hmg/ddz205] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 08/05/2019] [Revised: 08/14/2019] [Accepted: 08/14/2019] [Indexed: 12/16/2022] Open
Abstract
Polygenic risk scores (PRSs) have become the standard for quantifying genetic liability in the prediction of disease risks. PRSs are generally constructed as weighted sum scores of risk alleles using effect sizes from genome-wide association studies as their weights. The construction of PRSs is being improved with more appropriate selection of independent single-nucleotide polymorphisms (SNPs) and optimized estimation of their weights but is rarely reflected upon from a theoretical perspective, focusing on the validity of the risk score. Borrowing from psychometrics, this paper discusses the validity of PRSs and introduces the three main types of validity that are considered in the evaluation of tests and measurements: construct, content, and criterion validity. This introduction is followed by a discussion of three topics that challenge the validity of PRS, namely, their claimed independence of clinical risk factors, the consequences of relaxing SNP inclusion thresholds and the selection of SNP weights. This discussion of the validity of PRS reminds us that we need to keep questioning if weighted sums of risk alleles are measuring what we think they are in the various scenarios in which PRSs are used and that we need to keep exploring alternative modeling strategies that might better reflect the underlying biological pathways.
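The abstract above describes a PRS as a weighted sum of risk alleles, with GWAS effect sizes as the weights. A minimal sketch of that construction follows, assuming allele dosages coded as 0, 1, or 2 copies per SNP. The function name, SNP count, and weights are all hypothetical illustrations, not values from the paper, and the sketch ignores SNP selection and weight optimization, which the abstract discusses as separate issues.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages for one individual.

    dosages: copies of the risk allele at each SNP (0, 1, or 2).
    weights: per-SNP effect sizes, e.g. log odds ratios from a GWAS.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    # Sum each SNP's dosage multiplied by its effect-size weight.
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: three SNPs with made-up log-odds-ratio weights.
example_weights = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
example_score = polygenic_risk_score(example_dosages, example_weights)
```

In practice, scores are usually standardized within a cohort before being used as predictors, which this sketch leaves out.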
Affiliation(s)
- A Cecile J W Janssens
  - Department of Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA, USA