51
Zhai S, Mehrotra DV, Shen J. Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities. Brief Bioinform 2023; 25:bbad470. [PMID: 38152980 PMCID: PMC10782924 DOI: 10.1093/bib/bbad470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/20/2023] [Accepted: 11/28/2023] [Indexed: 12/29/2023] Open
Abstract
Polygenic risk scores (PRSs) have emerged as promising tools for the prediction of human diseases and complex traits in disease genome-wide association studies (GWAS). Applying PRSs to pharmacogenomics (PGx) studies has begun to show great potential for improving patient stratification and drug response prediction. However, there are unique challenges that arise when applying PRSs to PGx GWAS beyond those typically encountered in disease GWAS (e.g. Eurocentric or trans-ethnic bias). These challenges include: (i) the lack of knowledge about whether PGx or disease GWAS/variants should be used in the base cohort (BC); (ii) the small sample sizes in PGx GWAS with corresponding low power and (iii) the more complex PRS statistical modeling required for handling both prognostic and predictive effects simultaneously. To gain insights in this landscape about the general trends, challenges and possible solutions, we first conduct a systematic review of both PRS applications and PRS method development in PGx GWAS. To further address the challenges, we propose (i) a novel PRS application strategy by leveraging both PGx and disease GWAS summary statistics in the BC for PRS construction and (ii) a new Bayesian method (PRS-PGx-Bayesx) to reduce Eurocentric or cross-population PRS prediction bias. Extensive simulations are conducted to demonstrate their advantages over existing PRS methods applied in PGx GWAS. Our systematic review and methodology research work not only highlights current gaps and key considerations while applying PRS methods to PGx GWAS, but also provides possible solutions for better PGx PRS applications and future research.
Affiliation(s)
- Song Zhai
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
- Devan V Mehrotra
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA
- Judong Shen
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
52
Kim DJ, Kang JH, Kim JW, Cheon MJ, Kim SB, Lee YK, Lee BC. Evaluation of optimal methods and ancestries for calculating polygenic risk scores in East Asian population. Sci Rep 2023; 13:19195. [PMID: 37932343 PMCID: PMC10628155 DOI: 10.1038/s41598-023-45859-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 10/25/2023] [Indexed: 11/08/2023] Open
Abstract
Polygenic risk scores (PRSs) have been studied for predicting human diseases, and various methods for PRS calculation have been developed. Most PRS studies to date have focused on European ancestry, and the performance of PRSs has not been sufficiently assessed in East Asia. Herein, we evaluated the predictive performance of PRSs for East Asian populations under various conditions. Simulation studies using data from the Korean cohort, Health Examinees (HEXA), demonstrated that SBayesRC and PRS-CS outperformed other PRS methods (lassosum, LDpred-funct, and PRSice) at high fixed heritability (0.3 and 0.7). In addition, we generated PRSs using real-world data from HEXA for ten diseases: asthma, breast cancer, cataract, coronary artery disease, gastric cancer, glaucoma, hyperthyroidism, hypothyroidism, osteoporosis, and type 2 diabetes (T2D). We utilized the five previous PRS methods and genome-wide association study (GWAS) data from two biobank-scale datasets [European (UK Biobank) and East Asian (BioBank Japan) ancestry]. Additionally, we employed PRS-CSx, a PRS method that combines GWAS data from both ancestries, to generate a total of 110 PRSs for ten diseases. Similar to the simulation results, SBayesRC showed better predictive performance for disease risk than the other methods. Furthermore, the East Asian GWAS data outperformed those from European ancestry for breast cancer, cataract, gastric cancer, and T2D, but neither of the two GWAS ancestries showed a significant advantage in PRS performance for the remaining six diseases. Based on simulation data and real data studies, it is expected that SBayesRC will offer superior performance for East Asian populations, and PRSs generated using GWAS data from non-East Asian ancestries may also yield good results.
53
Zhu Z, Chen X, Zhang S, Yu R, Qi C, Cheng L, Zhang X. Leveraging molecular quantitative trait loci to comprehend complex diseases/traits from the omics perspective. Hum Genet 2023; 142:1543-1560. [PMID: 37755483 DOI: 10.1007/s00439-023-02602-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/15/2023] [Accepted: 09/14/2023] [Indexed: 09/28/2023]
Abstract
Comprehending the molecular basis of quantitative genetic variation is a principal goal for complex diseases or traits. Molecular quantitative trait loci (molQTLs) have made it possible to investigate the effects of genetic variants hiding behind large-scale omics data. A deeper understanding of molQTL is urgently required in light of the multi-dimensionalization of omics data to more fully elucidate the pertinent biological mechanisms. Herein, we reviewed molQTLs with the corresponding resource from the omics perspective and further discussed the integrative strategy of GWAS-molQTL to infer their causal effects. Subsequently, we described the opportunities and challenges encountered by molQTL. The case studies showed that molQTL is essential for complex diseases and traits, whether single- or multi-omics QTLs. Overall, we highlighted the functional significance of genetic variants to employ the discovery of molQTL in complex diseases and traits.
Affiliation(s)
- Zijun Zhu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Xinyu Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Sainan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Rui Yu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Changlu Qi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
- Xue Zhang
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
  - McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China
54
Jeng XJ, Hu Y, Venkat V, Lu TP, Tzeng JY. Transfer learning with false negative control improves polygenic risk prediction. PLoS Genet 2023; 19:e1010597. [PMID: 38011285 PMCID: PMC10723713 DOI: 10.1371/journal.pgen.1010597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2023] [Revised: 12/15/2023] [Accepted: 11/09/2023] [Indexed: 11/29/2023] Open
Abstract
A polygenic risk score (PRS) is a quantity that aggregates the effects of variants across the genome and estimates an individual's genetic predisposition for a given trait. PRS analysis typically takes two input data sets: base data for effect size estimation and target data for individual-level prediction. Given the availability of large-scale base data, it is increasingly common that the ancestral backgrounds of base and target data do not perfectly match. In this paper, we treat the GWAS summary information obtained from the base data as knowledge learned from a pre-trained model, and adopt a transfer learning framework to effectively leverage this knowledge, whether or not the base data share a similar ancestral background with the target samples, to build prediction models for target individuals. Our proposed transfer learning framework consists of two main steps: (1) conducting false negative control (FNC) marginal screening to extract useful knowledge from the base data; and (2) performing joint model training to integrate the knowledge extracted from base data with the target training data for accurate trans-data prediction. This new approach can significantly enhance the computational and statistical efficiency of joint-model training, alleviate over-fitting, and facilitate more accurate trans-data prediction whether the heterogeneity level between target and base data sets is low or high.
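The base/target split described in this abstract can be illustrated with the basic PRS calculation: effect sizes estimated in the base data are applied as weights to allele dosages in the target data. The following is a minimal sketch of that weighted sum only, not the transfer learning method of the paper; the function name `polygenic_score` and all toy values are hypothetical.

```python
import numpy as np


def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (rows: individuals, columns: variants)."""
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("one effect size is required per variant")
    return dosages @ effect_sizes


# Hypothetical per-variant effect sizes, standing in for base-data GWAS estimates.
base_effects = np.array([0.10, -0.05, 0.20, 0.00])

# Hypothetical target genotypes: 3 individuals, 4 variants, dosages in {0, 1, 2}.
target_dosages = np.array([
    [0, 1, 2, 1],
    [2, 0, 0, 1],
    [1, 1, 1, 0],
])

scores = polygenic_score(target_dosages, base_effects)
print(scores)
```

Each score is one number per target individual; in practice the variants in the base and target data must first be matched on identity and effect allele, which this sketch assumes has already been done.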
Affiliation(s)
- Xinge Jessie Jeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Yifei Hu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Vaishnavi Venkat
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Tzu-Pin Lu
  - Institute of Health Data Analytics and Statistics, National Taiwan University, Taipei, Taiwan
  - Department of Public Health, National Taiwan University, Taipei, Taiwan
- Jung-Ying Tzeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Institute of Health Data Analytics and Statistics, National Taiwan University, Taipei, Taiwan
55
John M, Lencz T. Potential application of elastic nets for shared polygenicity detection with adapted threshold selection. Int J Biostat 2023; 19:417-438. [PMID: 36327464 PMCID: PMC10154439 DOI: 10.1515/ijb-2020-0108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Accepted: 10/05/2022] [Indexed: 11/06/2022]
Abstract
Current research suggests that hundreds to thousands of single nucleotide polymorphisms (SNPs) with small to modest effect sizes contribute to the genetic basis of many disorders, a phenomenon labeled as polygenicity. Additionally, many such disorders demonstrate polygenic overlap, in which risk alleles are shared at associated genetic loci. A simple strategy to detect polygenic overlap between two phenotypes is based on rank-ordering the univariate p-values from two genome-wide association studies (GWASs). Although high-dimensional variable selection strategies such as Lasso and elastic nets have been utilized in other GWAS analysis settings, they have yet to be utilized for detecting shared polygenicity. In this paper, we illustrate how elastic nets, with polygenic scores as the dependent variable and with appropriate adaptation in selecting the penalty parameter, may be utilized for detecting a subset of SNPs involved in shared polygenicity. We provide theory to better understand our approaches, and illustrate their utility using synthetic datasets. Results from extensive simulations are presented comparing the elastic net approaches with the rank-ordering approach in various scenarios. Results from simulation studies show one of the elastic net approaches to be superior when the correlations among the SNPs are high. Finally, we apply the methods to two real datasets to further illustrate the capabilities, limitations and differences among the methods.
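For orientation, the elastic net referred to in this abstract is usually written as a penalized least-squares problem. The form below is the common textbook parameterization, given here as an assumption rather than the paper's exact formulation; $y_i$ would be the polygenic score of individual $i$, $x_i$ the vector of SNP genotypes, $\lambda \ge 0$ the penalty parameter and $\alpha \in [0, 1]$ the mixing weight between the Lasso and ridge penalties. The paper's adapted rule for choosing the penalty parameter is not reproduced.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top}\beta \right)^2
  + \lambda \left[ \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right]
```

SNPs whose fitted coefficients are nonzero form the selected subset; setting $\alpha = 1$ recovers the Lasso and $\alpha = 0$ recovers ridge regression.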
Affiliation(s)
- Majnu John
  - Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY
  - Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health System, Glen Oaks, NY
  - Departments of Psychiatry and of Mathematics, Hofstra University, Hempstead, NY
- Todd Lencz
  - Institute of Behavioral Science, Feinstein Institutes of Medical Research, Manhasset, NY
  - Division of Psychiatry Research, The Zucker Hillside Hospital, Northwell Health System, Glen Oaks, NY
  - Departments of Psychiatry and of Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY
56
Fatumo S, Sathan D, Samtal C, Isewon I, Tamuhla T, Soremekun C, Jafali J, Panji S, Tiffin N, Fakim YJ. Polygenic risk scores for disease risk prediction in Africa: current challenges and future directions. Genome Med 2023; 15:87. [PMID: 37904243 PMCID: PMC10614359 DOI: 10.1186/s13073-023-01245-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Accepted: 10/12/2023] [Indexed: 11/01/2023] Open
Abstract
Early identification of genetic risk factors for complex diseases can enable timely interventions and prevent serious outcomes, including mortality. While the genetics underlying many Mendelian diseases have been elucidated, it is harder to predict risk for complex diseases arising from the combined effects of many genetic variants with smaller individual effects on disease aetiology. Polygenic risk scores (PRS), which combine multiple contributing variants to predict disease risk, have the potential to influence the implementation for precision medicine. However, the majority of existing PRS were developed from European data with limited transferability to African populations. Notably, African populations have diverse genetic backgrounds, and a genomic architecture with smaller haplotype blocks compared to European genomes. Subsequently, growing evidence shows that using large-scale African ancestry cohorts as discovery for PRS development may generate more generalizable findings. Here, we (1) discuss the factors contributing to the poor transferability of PRS in African populations, (2) showcase the novel Africa genomic datasets for PRS development, (3) explore the potential clinical utility of PRS in African populations, and (4) provide insight into the future of PRS in Africa.
Affiliation(s)
- Segun Fatumo
  - The African Computational Genomics (TACG) Research Group, MRC/UVRI and LSHTM, Entebbe, Uganda
  - H3Africa Bioinformatics Network (H3ABioNet) Node, Centre for Genomics Research and Innovation, NABDA/FMST, Abuja, Nigeria
  - Department of Non-Communicable Disease Epidemiology (NCDE), London School of Hygiene and Tropical Medicine, Keppel St, London, WC1E 7HT, UK
- Dassen Sathan
  - H3Africa Bioinformatics Network (H3ABioNet) Node, University of Mauritius, Reduit, Mauritius
- Chaimae Samtal
  - Laboratory of Biotechnology, Environment, Agri-Food and Health, Faculty of Sciences Dhar El Mahraz-Sidi Mohammed Ben Abdellah University, 30000, Fez, Morocco
- Itunuoluwa Isewon
  - Department of Computer and Information Sciences, Covenant University, P. M. B. 1023, Ota, Ogun State, Nigeria
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Km 10 Idiroko Road, P.M.B. 1023, Ota, Ogun State, Nigeria
  - Covenant Applied Informatics and Communication African Centre of Excellence (CApIC-ACE), Covenant University, P.M.B. 1023, Ota, Ogun State, Nigeria
- Tsaone Tamuhla
  - Division of Computational Biology, Integrative Biomedical Sciences Department, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  - South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
- Chisom Soremekun
  - The African Computational Genomics (TACG) Research Group, MRC/UVRI and LSHTM, Entebbe, Uganda
  - H3Africa Bioinformatics Network (H3ABioNet) Node, Centre for Genomics Research and Innovation, NABDA/FMST, Abuja, Nigeria
  - Department of Immunology and Molecular Biology, College of Health Science, Makerere University, Kampala, Uganda
- James Jafali
  - Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Blantyre, Malawi
  - Clinical Infection, Microbiology & Immunology, The University of Liverpool, Liverpool, UK
- Sumir Panji
  - Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, 7925, South Africa
- Nicki Tiffin
  - South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa
57
Zhang X, Gomez L, Below J, Naj A, Martin E, Kunkle B, Bush WS. An X Chromosome Transcriptome Wide Association Study Implicates ARMCX6 in Alzheimer's Disease. bioRxiv 2023:2023.06.06.543877. [PMID: 37333116 PMCID: PMC10274627 DOI: 10.1101/2023.06.06.543877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Background: The X chromosome is often omitted in disease association studies despite containing thousands of genes which may provide insight into well-known sex differences in the risk of Alzheimer's Disease. Objective: To model the expression of X chromosome genes and evaluate their impact on Alzheimer's Disease risk in a sex-stratified manner. Methods: Using elastic net, we evaluated multiple modeling strategies in a set of 175 whole blood samples and 126 brain cortex samples, with whole genome sequencing and RNA-seq data. SNPs (MAF > 0.05) within the cis-regulatory window were used to train tissue-specific models of each gene. We applied the best models in both tissues to sex-stratified summary statistics from a meta-analysis of Alzheimer's Disease Genetics Consortium (ADGC) studies to identify AD-related genes on the X chromosome. Results: Across different model parameters, sample sex, and tissue types, we modeled the expression of 217 genes (95 genes in blood and 135 genes in brain cortex). The average model R² was 0.12 (range from 0.03 to 0.34). We also compared sex-stratified and sex-combined models on the X chromosome. We further investigated genes that escaped X chromosome inactivation (XCI) to determine if their genetic regulation patterns were distinct. We found ten genes associated with AD at p < 0.05, with only ARMCX6 in female brain cortex (p = 0.008) nearing the significance threshold after adjusting for multiple testing (α = 0.002). Conclusions: We optimized the expression prediction of X chromosome genes, applied these models to sex-stratified AD GWAS summary statistics, and identified one putative AD risk gene, ARMCX6.
Affiliation(s)
- Xueyi Zhang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, USA
- Lissette Gomez
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
- Jennifer Below
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, 37235, USA
- Adam Naj
  - Department of Biostatistics, Epidemiology, and Informatics, Penn Neurodegeneration Genomics Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, 19104, USA
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, 19104, USA
- Eden Martin
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, 33176, USA
- Brian Kunkle
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, 33176, USA
- William S Bush
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, USA
58
Chen S, Xin J, Ding Z, Zhao L, Ben S, Zheng R, Li S, Li H, Shao W, Cheng Y, Zhang Z, Du M, Wang M. Construction, evaluation, and AOP framework-based application of the EpPRS as a genetic surrogate for assessing environmental pollutants. Environment International 2023; 180:108202. [PMID: 37734146 DOI: 10.1016/j.envint.2023.108202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 09/01/2023] [Accepted: 09/11/2023] [Indexed: 09/23/2023]
Abstract
BACKGROUND: Environmental pollutant measurement is essential for accurate health risk assessment. However, the detection of humans' internal exposure to pollutants is cost-intensive and consumes time and energy. Polygenic risk scores (PRSs) have been widely applied in genetic studies of complex trait diseases. It is important to construct a genetically relevant environmental surrogate for pollutant exposure and to explore its utility for disease prediction and risk assessment. OBJECTIVES: This study enrolled 714 individuals with complete genomic data and exposomic data on 22 plasma-persistent organic pollutants (POPs). METHODS: We first conducted 22 POP genome-wide association studies (GWAS) and constructed the corresponding environmental pollutant-based PRS (EpPRS) by clumping and P value thresholding (C + T), lassosum, and PRS-CS methods. The best-fit EpPRS was chosen by its regression R². An adverse outcome pathway (AOP) framework was developed to assess the effects of contaminants on candidate diseases. Furthermore, Mendelian randomization (MR) analysis was performed to explore the causal association between POPs and cancer risk. RESULTS: The C + T method produced the best-performing EpPRSs for 7 PCBs and 4 PBDEs. EpPRSs replicated the correlations of environmental exposure measurements based on consistent patterns. The diagnostic performance of type 2 diabetes mellitus (T2DM) PRS was improved by the combined model of T2DM-EpPRS of PCB126/BDE153. Finally, the AKT1-mediated AOP framework illustrated that PCB126 and BDE153 may increase the risk of T2DM by decreasing AKT1 phosphorylation through the cGMP-PKG pathway and promoting abnormal glucose homeostasis. MR analysis showed that digestive system tumors, such as colorectal cancer and biliary tract cancer, are more sensitive to POP exposure. CONCLUSIONS: EpPRSs can serve as a proxy for assessing pollutant internal exposure. The application of the EpPRS to disease risk assessment can reveal the toxic pathway and mode of action linking exposure and disease in detail, providing a basis for the development of environmental pollutant control strategies.
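The thresholding half of C + T and the selection of a best-fit score by regression R² can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's pipeline: the clumping step (LD-based pruning) is omitted, R² is taken as the squared Pearson correlation of a one-predictor linear regression, and every function name and data value is hypothetical.

```python
import numpy as np


def threshold_prs(dosages, betas, pvalues, p_cutoff):
    """Score each individual using only variants with p-value <= p_cutoff."""
    keep = pvalues <= p_cutoff
    return dosages[:, keep] @ betas[keep]


def r_squared(score, phenotype):
    """R^2 of a one-predictor linear regression (squared Pearson correlation)."""
    if np.std(score) == 0:  # no variant passed the cutoff, so the score is constant
        return 0.0
    return float(np.corrcoef(score, phenotype)[0, 1] ** 2)


def best_threshold(dosages, betas, pvalues, phenotype, cutoffs):
    """Return the cutoff with the highest R^2 and the R^2 for every cutoff."""
    results = {
        c: r_squared(threshold_prs(dosages, betas, pvalues, c), phenotype)
        for c in cutoffs
    }
    best = max(results, key=results.get)
    return best, results


# Hypothetical toy data: 200 individuals, 5 variants, two of them truly associated.
rng = np.random.default_rng(0)
dosages = rng.integers(0, 3, size=(200, 5)).astype(float)
true_effects = np.array([0.5, 0.4, 0.0, 0.0, 0.0])
phenotype = dosages @ true_effects + rng.normal(0.0, 1.0, size=200)

# Hypothetical GWAS summary statistics for the same five variants.
gwas_betas = np.array([0.5, 0.4, 0.05, -0.04, 0.03])
pvalues = np.array([1e-8, 1e-6, 0.3, 0.6, 0.9])

cutoffs = [5e-8, 1e-5, 1.0]
best, results = best_threshold(dosages, gwas_betas, pvalues, phenotype, cutoffs)
print(best, results)
```

Choosing the cutoff on the same samples used to report performance tends to overstate R², so a held-out validation set is the usual safeguard.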
Affiliation(s)
- Silu Chen
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Junyi Xin
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
- Zhutao Ding
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Lingyan Zhao
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Shuai Ben
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Rui Zheng
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Shuwei Li
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Huiqin Li
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Wei Shao
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Yifei Cheng
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Zhengdong Zhang
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
- Mulong Du
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
  - Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- Meilin Wang
  - Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, Nanjing, China
  - Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing, China
  - The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, China
59
Huang N, Xiao W, Lv J, Yu C, Guo Y, Pei P, Yang L, Millwood IY, Walters RG, Chen Y, Du H, Avery D, Ou T, Chen J, Chen Z, Huang T, Li L. Genome-wide polygenic risk score, cardiometabolic risk factors, and type 2 diabetes mellitus in the Chinese population. Obesity (Silver Spring) 2023; 31:2615-2626. [PMID: 37661427 DOI: 10.1002/oby.23846] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/04/2023] [Revised: 05/26/2023] [Accepted: 05/28/2023] [Indexed: 09/05/2023]
Abstract
OBJECTIVE: Type 2 diabetes (T2D) is caused by both genetic and cardiometabolic risk factors. However, the magnitude of the genetic predisposition to T2D in the Chinese population remains largely unknown. METHODS: This study included 93,488 participants from the China Kadoorie Biobank, and multiple polygenic risk scores (PRS) were calculated. A common cardiometabolic risk score (CRS) using smoking, alcohol consumption, physical activity, diet, obesity, blood pressure, and blood lipids was constructed to investigate the effects of cardiometabolic risk factors on T2D. Furthermore, an equation based on ideal PRS, CRS, and their interaction was established to explore the combined effects on T2D. RESULTS: An ideally fitting PRS model (variance explained, R² = 7.6%) was reached based on multiple PRS calculation methods. An additive interaction between PRS and CRS (coefficient = 0.28, 95% CI: 0.20-0.36, p < 0.001) was found. The R² of the T2D predictive model could increase to 8.3% when CRS and the interaction terms of PRS × CRS were considered. In the etiological composition of T2D, the ratio of genetic risk effect, cardiometabolic risk effect, and interaction between genetic and cardiometabolic factors was 67:16:17. CONCLUSIONS: This study identified an ideally fitting PRS model for identifying and predicting the risk of T2D suitable for the Chinese population. The quantified proportional structure of genetic risk factors, cardiometabolic risk factors, and their interaction was detected, which elucidated the critical effect of genetic factors.
Affiliation(s)
- Ninghao Huang
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Wendi Xiao
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Jun Lv
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Centre for Public Health and Epidemic Preparedness & Response, Beijing, China
- Canqing Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Centre for Public Health and Epidemic Preparedness & Response, Beijing, China
- Yu Guo
  - Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing, China
  - National Center for Cardiovascular Diseases, Beijing, China
- Pei Pei
  - Peking University Centre for Public Health and Epidemic Preparedness & Response, Beijing, China
- Ling Yang
  - Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Iona Y Millwood
  - Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Robin G Walters
  - Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Yiping Chen
  - Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Huaidong Du
  - Medical Research Council Population Health Research Unit at the University of Oxford, Oxford, UK
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Daniel Avery
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Tingting Ou
  - Noncommunicable Diseases Prevention and Control Department, Hainan Centers for Disease Control and Prevention, Hainan, China
- Junshi Chen
  - China National Centre for Food Safety Risk Assessment, Beijing, China
- Zhengming Chen
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Tao Huang
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Centre for Intelligent Public Health, Academy for Artificial Intelligence, Peking University, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
- Liming Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
  - Peking University Centre for Public Health and Epidemic Preparedness & Response, Beijing, China
60
Moll M, Peljto AL, Kim JS, Xu H, Debban CL, Chen X, Menon A, Putman RK, Ghosh AJ, Saferali A, Nishino M, Hatabu H, Hobbs BD, Hecker J, McDermott G, Sparks JA, Wain LV, Allen RJ, Tobin MD, Raby BA, Chun S, Silverman EK, Zamora AC, Ortega VE, Garcia CK, Barr RG, Bleecker ER, Meyers DA, Kaner RJ, Rich SS, Manichaikul A, Rotter JI, Dupuis J, O’Connor GT, Fingerlin TE, Hunninghake GM, Schwartz DA, Cho MH. A Polygenic Risk Score for Idiopathic Pulmonary Fibrosis and Interstitial Lung Abnormalities. Am J Respir Crit Care Med 2023; 208:791-801. [PMID: 37523715 PMCID: PMC10563194 DOI: 10.1164/rccm.202212-2257oc] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 12/14/2022] [Accepted: 07/31/2023] [Indexed: 08/02/2023] Open
Abstract
Rationale: In addition to rare genetic variants and the MUC5B locus, common genetic variants contribute to idiopathic pulmonary fibrosis (IPF) risk. The predictive power of common variants outside the MUC5B locus for IPF and interstitial lung abnormalities (ILAs) is unknown. Objectives: We tested the predictive value of IPF polygenic risk scores (PRSs) with and without the MUC5B region on IPF, ILA, and ILA progression. Methods: We developed PRSs that included (PRS-M5B) and excluded (PRS-NO-M5B) the MUC5B region (500-kb window around rs35705950-T) using an IPF genome-wide association study. We assessed PRS associations with area under the receiver operating characteristic curve (AUC) metrics for IPF, ILA, and ILA progression. Measurements and Main Results: We included 14,650 participants (1,970 IPF; 1,068 ILA) from six multi-ancestry population-based and case-control cohorts. In cases excluded from genome-wide association study, the PRS-M5B (odds ratio [OR] per SD of the score, 3.1; P = 7.1 × 10⁻⁹⁵) and PRS-NO-M5B (OR per SD, 2.8; P = 2.5 × 10⁻⁸⁷) were associated with IPF. Participants in the top PRS-NO-M5B quintile had approximately sevenfold odds for IPF compared with those in the first quintile. A clinical model predicted IPF (AUC, 0.61); rs35705950-T and PRS-NO-M5B demonstrated higher AUCs (0.73 and 0.7, respectively), and adding both genetic predictors to a clinical model yielded the highest performance (AUC, 0.81). The PRS-NO-M5B was associated with ILA (OR, 1.25) and ILA progression (OR, 1.16) in European ancestry participants. Conclusions: A common genetic variant risk score complements the MUC5B variant to identify individuals at high risk of interstitial lung abnormalities and pulmonary fibrosis.
Affiliation(s)
- Matthew Moll
  - Division of Pulmonary and Critical Care Medicine, and
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Anna L. Peljto
  - Department of Medicine and
  - Department of Immunology, Division of Pulmonary Medicine, University of Colorado, Aurora, Colorado
- John S. Kim
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, University of Virginia, Charlottesville, Virginia
- Hanfei Xu
  - Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
- Catherine L. Debban
  - Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, Virginia
- Xianfeng Chen
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, Mayo Clinic, Phoenix, Arizona
- Aravind Menon
  - Division of Pulmonary and Critical Care Medicine, and
- Auyon J. Ghosh
  - Department of Medicine, Division of Pulmonary and Critical Care Medicine, State University of New York Upstate Medical Center, Syracuse, New York
- Aabida Saferali
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Mizuki Nishino
  - Center for Pulmonary Functional Imaging, Department of Radiology
- Hiroto Hatabu
  - Center for Pulmonary Functional Imaging, Department of Radiology
- Brian D. Hobbs
  - Division of Pulmonary and Critical Care Medicine, and
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Julian Hecker
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Gregory McDermott
  - Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Jeffrey A. Sparks
  - Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Louise V. Wain
  - Department of Health Sciences, University of Leicester, Leicester, United Kingdom
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
- Richard J. Allen
  - Department of Health Sciences, University of Leicester, Leicester, United Kingdom
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
- Martin D. Tobin
  - Department of Health Sciences, University of Leicester, Leicester, United Kingdom
  - National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, United Kingdom
- Benjamin A. Raby
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
  - Department of Pediatrics
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts
- Sung Chun
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, Massachusetts
- Edwin K. Silverman
  - Division of Pulmonary and Critical Care Medicine, and
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Ana C. Zamora
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, Mayo Clinic, Phoenix, Arizona
- Victor E. Ortega
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, Mayo Clinic, Phoenix, Arizona
- Christine K. Garcia
  - Division of Pulmonary and Critical Care Medicine, Department of Medicine, Columbia University Irving Medical Center, New York, New York
- R. Graham Barr
  - Department of Medicine and
  - Division of General Medicine, Department of Epidemiology, Columbia University Medical Center, New York, New York
- Eugene R. Bleecker
  - Division of Genetics, Genomics, and Precision Medicine, Department of Medicine, University of Arizona, Tucson, Arizona
- Deborah A. Meyers
  - Division of Genetics, Genomics, and Precision Medicine, Department of Medicine, University of Arizona, Tucson, Arizona
- Robert J. Kaner
  - Division of Pulmonary Medicine, Weill Cornell School of Medicine, New York, New York
- Stephen S. Rich
  - Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, Virginia
- Ani Manichaikul
  - Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, Virginia
- Jerome I. Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-University of California, Los Angeles Medical Center, Torrance, California
- Josée Dupuis
  - Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
  - Department of Epidemiology, Biostatistics and Occupational Health, School of Population and Global Health, McGill University Faculty of Medicine and Health Sciences, Montreal, Quebec, Canada
- George T. O’Connor
  - Department of Medicine, Pulmonary Center, Boston University School of Medicine, Boston, Massachusetts; and
- Tasha E. Fingerlin
  - The National Jewish Health Cohen Family Asthma Institute, Division of Allergy and Immunology, National Jewish Health, Denver, Colorado
- David A. Schwartz
  - Department of Medicine and
  - Department of Immunology, Division of Pulmonary Medicine, University of Colorado, Aurora, Colorado
- Michael H. Cho
  - Division of Pulmonary and Critical Care Medicine, and
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
61
Koch S, Schmidtke J, Krawczak M, Caliebe A. Clinical utility of polygenic risk scores: a critical 2023 appraisal. J Community Genet 2023; 14:471-487. [PMID: 37133683 PMCID: PMC10576695 DOI: 10.1007/s12687-023-00645-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/16/2022] [Accepted: 03/31/2023] [Indexed: 05/04/2023] Open
Abstract
Since their first appearance in the context of schizophrenia and bipolar disorder in 2009, polygenic risk scores (PRSs) have been described for a large number of common complex diseases. However, the clinical utility of PRSs in disease risk assessment or therapeutic decision making is likely limited because PRSs usually only account for the heritable component of a trait and ignore the etiological role of environment and lifestyle. We surveyed the current state of PRSs for various diseases, including breast cancer, diabetes, prostate cancer, coronary artery disease, and Parkinson disease, with an extra focus upon the potential improvement of clinical scores by their combination with PRSs. We observed that the diagnostic and prognostic performance of PRSs alone is consistently low, as expected. Moreover, combining a PRS with a clinical score at best led to moderate improvement of the power of either risk marker. Despite the large number of PRSs reported in the scientific literature, prospective studies of their clinical utility, particularly of the PRS-associated improvement of standard screening or therapeutic procedures, are still rare. In conclusion, the benefit to individual patients or the health care system in general of PRS-based extensions of existing diagnostic or treatment regimens is still difficult to judge.
Affiliation(s)
- Sebastian Koch
  - Institut für Medizinische Informatik und Statistik, Christian-Albrechts-Universität zu Kiel, Universitätsklinikum Schleswig-Holstein Campus Kiel, Kiel, Germany
- Jörg Schmidtke
  - Amedes MVZ Wagnerstibbe, Hannover, Germany
  - Institut für Humangenetik, Medizinische Hochschule Hannover, Hannover, Germany
- Michael Krawczak
  - Institut für Medizinische Informatik und Statistik, Christian-Albrechts-Universität zu Kiel, Universitätsklinikum Schleswig-Holstein Campus Kiel, Kiel, Germany
- Amke Caliebe
  - Institut für Medizinische Informatik und Statistik, Christian-Albrechts-Universität zu Kiel, Universitätsklinikum Schleswig-Holstein Campus Kiel, Kiel, Germany.
62
Chen T, Zhang H, Mazumder R, Lin X. Ensembled best subset selection using summary statistics for polygenic risk prediction. bioRxiv: the preprint server for biology 2023:2023.09.25.559307. [PMID: 37886515 PMCID: PMC10602024 DOI: 10.1101/2023.09.25.559307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, yet existing methods face a tradeoff between predictive power and computational efficiency. We introduce ALL-Sum, a fast and scalable PRS method that combines an efficient summary statistic-based L0L2 penalized regression algorithm with an ensembling step that aggregates estimates from different tuning parameters for improved prediction performance. In extensive large-scale simulations across a wide range of polygenicity and genome-wide association studies (GWAS) sample sizes, ALL-Sum consistently outperforms popular alternative methods in terms of prediction accuracy, runtime, and memory usage. We analyze 27 published GWAS summary statistics for 11 complex traits from 9 reputable data sources, including the Global Lipids Genetics Consortium, Breast Cancer Association Consortium, and FinnGen, evaluated using individual-level UKBB data. ALL-Sum achieves the highest accuracy for most traits, particularly for GWAS with large sample sizes. We provide ALL-Sum as a user-friendly command-line software with pre-computed reference data for streamlined user-end analysis.
63
Zhang T, Klei L, Liu P, Chouldechova A, Roeder K, G'Sell M, Devlin B. Evaluating and Improving Health Equity and Fairness of Polygenic Scores. bioRxiv: the preprint server for biology 2023:2023.09.22.559051. [PMID: 37790341 PMCID: PMC10542523 DOI: 10.1101/2023.09.22.559051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Polygenic scores (PGS) are quantitative metrics for predicting phenotypic values, such as human height or disease status. Some PGS methods require only summary statistics of a relevant genome-wide association study (GWAS) for their score. One such method is Lassosum, which inherits the model selection advantages of Lasso to select a meaningful subset of the GWAS single nucleotide polymorphisms as predictors from their association statistics. However, even efficient scores like Lassosum, when derived from European-based GWAS, are poor predictors of phenotype for subjects of non-European ancestry; that is, they have limited portability to other ancestries. To increase the portability of Lassosum, when GWAS information and estimates of linkage disequilibrium are available for both ancestries, we propose Joint-Lassosum. In the simulation settings we explore, Joint-Lassosum provides more accurate PGS compared with other methods, especially when measured in terms of fairness. Like all PGS methods, Joint-Lassosum requires selection of predictors, which are determined by data-driven tuning parameters. We describe a new approach to selecting tuning parameters and note its relevance for model selection for any PGS. We also draw connections to the literature on algorithmic fairness and discuss how Joint-Lassosum can help mitigate fairness-related harms that might result from the use of PGS scores in clinical settings. While no PGS method is likely to be universally portable, due to the diversity of human populations and unequal information content of GWAS for different ancestries, Joint-Lassosum is an effective approach for enhancing portability and reducing predictive bias.
64
Jin J, Zhan J, Zhang J, Zhao R, O’Connell J, Jiang Y, Buyske S, Gignoux C, Haiman C, Kenny EE, Kooperberg C, North K, Koelsch BL, Wojcik G, Zhang H, Chatterjee N. MUSSEL: Enhanced Bayesian Polygenic Risk Prediction Leveraging Information across Multiple Ancestry Groups. bioRxiv: the preprint server for biology 2023:2023.04.12.536510. [PMID: 37090648 PMCID: PMC10120638 DOI: 10.1101/2023.04.12.536510] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 04/25/2023]
Abstract
Polygenic risk scores (PRS) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across different populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in the summary statistics from genome-wide association studies (GWAS) across multiple ancestry groups. MUSSEL conducts Bayesian hierarchical modeling under a MUltivariate Spike-and-Slab model for effect-size distribution and incorporates an Ensemble Learning step using super learner to combine information across different tuning parameter settings and ancestry groups. In our simulation studies and data analyses of 16 traits across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. The method, for example, has an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared to PRS-CSx and CT-SLEB, respectively, in the African Ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, underlying trait architecture, and the choice of reference samples for LD estimation, and thus ultimately, a combination of methods may be needed to generate the most robust PRS across diverse populations.
Affiliation(s)
- Jin Jin
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Jingning Zhang
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Ruzhang Zhao
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Steven Buyske
  - Department of Statistics, Rutgers University, New Brunswick, NJ, USA
- Christopher Gignoux
  - Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Christopher Haiman
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Eimear E. Kenny
  - Icahn Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Kari North
  - Department of Epidemiology, University of North Carolina Chapel Hill, Chapel Hill, NC, USA
- Genevieve Wojcik
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Haoyu Zhang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Nilanjan Chatterjee
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
  - Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
65
Im C, Sharafeldin N, Yuan Y, Wang Z, Sapkota Y, Lu Z, Spector LG, Howell RM, Arnold MA, Hudson MM, Ness KK, Robison LL, Bhatia S, Armstrong GT, Neglia JP, Yasui Y, Turcotte LM. Polygenic Risk and Chemotherapy-Related Subsequent Malignancies in Childhood Cancer Survivors: A Childhood Cancer Survivor Study and St Jude Lifetime Cohort Study Report. J Clin Oncol 2023; 41:4381-4393. [PMID: 37459583 PMCID: PMC10522108 DOI: 10.1200/jco.23.00428] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/23/2023] [Revised: 04/26/2023] [Accepted: 05/18/2023] [Indexed: 08/07/2023] Open
Abstract
PURPOSE Chemotherapeutic exposures are associated with subsequent malignant neoplasm (SMN) risk. The role of genetic susceptibility in chemotherapy-related SMNs should be defined as use of radiation therapy (RT) decreases. PATIENTS AND METHODS SMNs among long-term childhood cancer survivors of European (EUR; N = 9,895) and African (AFR; N = 718) genetic ancestry from the Childhood Cancer Survivor Study and St Jude Lifetime Cohort Study were evaluated. An externally validated 179-variant polygenic risk score (PRS) associated with pleiotropic adult cancer risk from the UK Biobank Study (N > 400,000) was computed for each survivor. SMN cumulative incidence comparing top and bottom PRS quintiles was estimated, along with hazard ratios (HRs) from proportional hazards models. RESULTS A total of 1,594 survivors developed SMNs, with basal cell carcinomas (n = 822), breast cancers (n = 235), and thyroid cancers (n = 221) being the most frequent. Although SMN risk associations with the PRS were extremely modest in RT-exposed EUR survivors (HR, 1.22; P = .048; n = 4,630), the increase in 30-year SMN cumulative incidence and HRs comparing top and bottom PRS quintiles was statistically significant among nonirradiated EUR survivors (n = 4,322) treated with alkylating agents (17% v 6%; HR, 2.46; P < .01), anthracyclines (20% v 8%; HR, 2.86; P < .001), epipodophyllotoxins (23% v 1%; HR, 12.20; P < .001), or platinums (46% v 7%; HR, 8.58; P < .01). This PRS also significantly modified epipodophyllotoxin-related SMN risk among nonirradiated AFR survivors (n = 414; P < .01). Improvements in prediction attributable to the PRS were greatest for epipodophyllotoxin-exposed (AUC, 0.71 v 0.63) and platinum-exposed (AUC, 0.68 v 0.58) survivors. CONCLUSION A pleiotropic cancer PRS has strong potential for improving SMN clinical risk stratification among nonirradiated survivors treated with specific chemotherapies. A polygenic risk screening approach may be a valuable complement to an early screening strategy on the basis of treatments and rare cancer-susceptibility mutations.
Affiliation(s)
- Cindy Im
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Noha Sharafeldin
  - Hematology Oncology Division, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham, AL
- Yan Yuan
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
- Zhaoming Wang
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Computational Biology, St Jude Children's Research Hospital, Memphis, TN
- Yadav Sapkota
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Zhanni Lu
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Logan G. Spector
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Rebecca M. Howell
  - Department of Radiation Physics, University of Texas at MD Anderson Cancer Center, Houston, TX
- Michael A. Arnold
  - Department of Pathology and Laboratory Medicine, University of Colorado and Children's Hospital Colorado, Anschutz Medical Campus, Aurora, CO
- Melissa M. Hudson
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Oncology, St Jude Children's Research Hospital, Memphis, TN
- Kirsten K. Ness
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Leslie L. Robison
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
- Smita Bhatia
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham, AL
- Gregory T. Armstrong
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
  - Department of Oncology, St Jude Children's Research Hospital, Memphis, TN
- Joseph P. Neglia
  - Department of Pediatrics, University of Minnesota, Minneapolis, MN
- Yutaka Yasui
  - School of Public Health, University of Alberta, Edmonton, Alberta, Canada
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN
66
Park DK, Chen M, Kim S, Joo YY, Loving RK, Kim HS, Cha J, Yoo S, Kim JH. Overestimated prediction using polygenic prediction derived from summary statistics. BMC Genom Data 2023; 24:52. [PMID: 37710206 PMCID: PMC10500750 DOI: 10.1186/s12863-023-01151-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 08/16/2023] [Indexed: 09/16/2023] Open
Abstract
BACKGROUND When polygenic risk score (PRS) is derived from summary statistics, independence between discovery and test sets cannot be monitored. We compared two types of PRS studies derived from raw genetic data (denoted as rPRS) and the summary statistics for IGAP (sPRS). RESULTS Two variables with the high heritability in UK Biobank, hypertension, and height, are used to derive an exemplary scale effect of PRS. sPRS without APOE is derived from International Genomics of Alzheimer's Project (IGAP), which records ΔAUC and ΔR² of 0.051 ± 0.013 and 0.063 ± 0.015 for Alzheimer's Disease Sequencing Project (ADSP) and 0.060 and 0.086 for Accelerating Medicine Partnership - Alzheimer's Disease (AMP-AD). On UK Biobank, rPRS performances for hypertension assuming a similar size of discovery and test sets are 0.0036 ± 0.0027 (ΔAUC) and 0.0032 ± 0.0028 (ΔR²). For height, ΔR² is 0.029 ± 0.0037. CONCLUSION Considering the high heritability of hypertension and height of UK Biobank and sample size of UK Biobank, sPRS results from AD databases are inflated. Independence between discovery and test sets is a well-known basic requirement for PRS studies. However, a lot of PRS studies cannot follow such requirements because of impossible direct comparisons when using summary statistics. Thus, for sPRS, potential duplications should be carefully considered within the same ethnic group.
Affiliation(s)
- David Keetae Park
  - Department of Biomedical Engineering, Columbia University, New York, USA
- Mingshen Chen
  - Department of Applied Mathematics & Statistics, Stony Brook University, New York, USA
- Seungsoo Kim
  - Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
- Yoonjung Yoonie Joo
  - Samsung Advanced Institute for Health Sciences & Technology (SAHIST), Sungkyunkwan University, Samsung Medical Center, Seoul, South Korea
- Rebekah K Loving
  - Department of Biology, California Institute of Technology, Pasadena, USA
- Hyoung Seop Kim
  - Department of Physical Medicine and Rehabilitation, Dementia Center, National Health Insurance Service Ilsan Hospital, Goyang, South Korea
- Jiook Cha
  - Department of Psychology, Brain and Cognitive Sciences, AI Institute, Seoul National University, Seoul, South Korea
- Shinjae Yoo
  - Computational Science Initiative, Brookhaven National Lab. Computer Science and Math, Building 725, Room 2-189, Upton, NY, 11973, USA.
- Jong Hun Kim
  - Department of Neurology, Dementia Center, National Health Insurance Service Ilsan Hospital, 100 Ilsan-ro Ilsandong-gu, Goyang, Gyeonggi-Do, 10444, South Korea.
67
Zhao B, Li T, Li Y, Fan Z, Xiong D, Wang X, Gao M, Smith SM, Zhu H. An atlas of trait associations with resting-state and task-evoked human brain functional organizations in the UK Biobank. Imaging Neuroscience (Cambridge, Mass.) 2023; 1:1-23. [PMID: 38770197 PMCID: PMC11105703 DOI: 10.1162/imag_a_00015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Functional magnetic resonance imaging (fMRI) has been widely used to identify brain regions linked to critical functions, such as language and vision, and to detect tumors, strokes, brain injuries, and diseases. It is now known that large sample sizes are necessary for fMRI studies to detect small effect sizes and produce reproducible results. Here we report a systematic association analysis of 647 traits with imaging features extracted from resting-state and task-evoked fMRI data of more than 40,000 UK Biobank participants. We used a parcellation-based approach to generate 64,620 functional connectivity measures to reveal fine-grained details about cerebral cortex functional organizations. The difference between functional organizations at rest and during task was examined, and we have prioritized important brain regions and networks associated with a variety of human traits and clinical outcomes. For example, depression was most strongly associated with decreased connectivity in the somatomotor network. We have made our results publicly available and developed a browser framework to facilitate the exploration of brain function-trait association results (http://fmriatlas.org/).
Affiliation(s)
- Bingxin Zhao
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
- These authors contributed equally to this work
| | - Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- These authors contributed equally to this work
| | - Yujue Li
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Zirui Fan
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Di Xiong
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Xifeng Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Mufeng Gao
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Stephen M. Smith
- Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Hongtu Zhu
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
68
|
Tsegaselassie W, Jian Y, Berhanu GG, Tianyuan L, April M, Tali E, Fasil TA, Timothy TA, Jordana C, Marguerite IR, Robert SM, Michael VW, Kristine Y, Myriam F, Donald LJM, Mario S, Daichi S, Yuichiro Y, Paul M, Adam B. Associations of cardiometabolic polygenic risk scores with cardiovascular disease in African Americans. RESEARCH SQUARE 2023:rs.3.rs-3228815. [PMID: 37693576 PMCID: PMC10491340 DOI: 10.21203/rs.3.rs-3228815/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Background: Cardiovascular disease (CVD) is a complex disease, and genetic factors contribute individually or cumulatively to CVD risk. While African American women and men are disproportionately affected by CVD, their lack of representation in genomic investigations may widen disparities in health. We investigated the associations of cardiometabolic polygenic risk scores (PRSs) with CVD risk in African Americans. Methods: We used the Jackson Heart Study, a prospective cohort study of CVD in African American adults, and the predicted 10-year atherosclerotic cardiovascular disease (ASCVD) risk. We included adults aged 40-79 years without a history of coronary heart disease (CHD) or stroke at baseline. We derived genome-wide PRSs for systolic blood pressure (SBP), diastolic blood pressure (DBP), total cholesterol, LDL cholesterol, hemoglobin A1c (HbA1c), triglycerides, and C-reactive protein (CRP) separately for each participant, using genome-wide association summary statistics from African-origin UK Biobank participants. We estimated the associations between PRSs and predicted 10-year ASCVD risk, adjusting for age, sex, study visit date, and genetic ancestry, using linear and logistic regression models. Results: Participants (n=2,077) were 63% female and 66% never-smokers. Their mean (SD) values were: age 56 (10) years, SBP 127.8 (16.3) mmHg, DBP 76.3 (8.7) mmHg, total cholesterol 200.4 (40.2) mg/dL, HDL cholesterol 51.7 (14.7) mg/dL, LDL cholesterol 127.2 (36.7) mg/dL, HbA1c 6.0 (1.3) mmol/mol, triglycerides 108.9 (81.7) mg/dL, and CRP 0.53 (1.1). Their median (interquartile range) predicted 10-year ASCVD risk was 8.0 (4.0-15.0). Participants in the >75th percentile for HbA1c PRS had a 1.42 percentage-point greater predicted 10-year ASCVD risk (1.42 [95% CI: 0.58-2.26]) and higher odds of ≥10% predicted 10-year ASCVD risk (OR: 1.46 [95% CI: 1.03-2.07]) compared with those in the <25th percentile for HbA1c PRS. Participants in the >75th percentile for SBP PRS had higher odds of ≥10% predicted 10-year ASCVD risk (OR: 1.52 [95% CI: 1.07-2.15]) compared with those in the <25th percentile for SBP PRS. Conclusion: Among African Americans aged 40-79 years without CHD or stroke, higher PRSs for HbA1c and SBP were associated with higher predicted CVD risk. PRSs may help stratify individuals based on their clinical risk factors for early CVD prevention and clinical management.
Affiliation(s)
- Lu Tianyuan
- Lady Davis Institute for Medical Research, Jewish General Hospital
| | - Sims Mario
- University of Mississippi Medical Center
|
69
|
Hu X, Jiang X, Li J, Zhao N, Gan H, Hu X, Li L, Liu X, Shan H, Bai Y, Pang P. Identification of potential genetic Loci and polygenic risk model for Budd-Chiari syndrome in Chinese population. iScience 2023; 26:107287. [PMID: 37539039 PMCID: PMC10393737 DOI: 10.1016/j.isci.2023.107287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 05/19/2023] [Accepted: 07/02/2023] [Indexed: 08/05/2023] Open
Abstract
Budd-Chiari syndrome (BCS) is characterized by hepatic venous outflow obstruction, posing life-threatening risks in severe cases. Reported risk factors include inherited and acquired hypercoagulable states or other predisposing factors. However, many patients have no identifiable etiology, and causes of BCS differ between the West and East. This study recruited 500 BCS patients and 696 normal individuals for whole-exome sequencing and developed a polygenic risk scoring (PRS) model using PLINK, LASSOSUM, BLUP, and BayesA methods. Risk factors for venous thromboembolism and vascular malformations were also assessed for BCS risk prediction. Ultimately, we discovered potential BCS risk mutations, such as rs1042331, and the optimal BayesA-generated PRS model presented an AUC >0.9 in the external replication cohort. This model provides particular insights into genetic risk differences between China and the West and suggests shared genetic risks among BCS, venous thromboembolism, and vascular malformations, offering different perspectives on BCS pathogenesis.
Affiliation(s)
- Xiaojun Hu
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xiaosen Jiang
- BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of the Chinese Academy of Sciences, Beijing, China
| | - Jia Li
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Hebei Industrial Technology Research Institute of Genomics in Maternal & Child Health, Shijiazhuang BGI Genomics Co., Ltd, Shijiazhuang, China
| | - Ni Zhao
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Hairun Gan
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xinyan Hu
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Luting Li
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Xingtao Liu
- Changfeng Hospital of Jinjiang District, Chengdu, China
| | - Hong Shan
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
| | - Pengfei Pang
- Center for Interventional Medicine, Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Guangdong Provincial Key Laboratory of Biomedical Imaging, Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Guangdong Provincial Engineering Research Center of Molecular Imaging, Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
|
70
|
Singh RK, Zhao Y, Elze T, Fingert J, Gordon M, Kass MA, Luo Y, Pasquale LR, Scheetz T, Segrè AV, Wiggs JL, Zebardast N. Polygenic Risk Score Improves Prediction of Primary Open Angle Glaucoma Onset in the Ocular Hypertension Treatment Study. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.08.15.23294141. [PMID: 37645858 PMCID: PMC10462203 DOI: 10.1101/2023.08.15.23294141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Objective or Purpose: Primary open-angle glaucoma (POAG) is a highly heritable disease with 127 identified risk loci. A polygenic risk score (PRS) offers a measure of aggregate genetic burden. In this study, we assess whether PRS improves risk stratification in patients with ocular hypertension. Design: A post-hoc analysis of the Ocular Hypertension Treatment Study (OHTS) data. Setting, Participants, and/or Controls: 1636 participants were followed from 1994 to 2020 across 22 sites. The PRS was computed for 1009 OHTS participants using summary statistics from the largest cross-ancestry POAG meta-analysis, with weights trained using 8,813,496 variants from 488,395 participants in the UK Biobank. Methods, Interventions, or Testing: Survival regression analysis, with development of POAG as the endpoint, predicted disease onset from PRS incorporating baseline covariates. Main Outcomes and Measures: Outcome measures were hazard ratios for POAG onset. Concordance index and time-dependent AUC were used to compare the predictive performance of multivariable Cox proportional hazards models. Results: Mean PRS was significantly higher for POAG converters (0.24 ± 0.95) than for non-converters (-0.12 ± 1.00) (p < 0.01). POAG risk increased 1.36% with each higher PRS decile, with conversion ranging from 9.5% in the lowest PRS decile to 21.8% in the highest decile. Comparison of low- and high-risk PRS tertiles showed a 1.8-fold increase in 20-year POAG risk for participants of European and African ancestries (p < 0.01). In the subgroup randomized to delayed treatment, each increase in PRS decile was associated with a 0.52-year decrease in age at diagnosis (p = 0.05). No significant linear relationship between PRS and age at POAG diagnosis was present in the early treatment group. Prediction models significantly improved with the addition of PRS as a covariate (C-index = 0.77) compared to the OHTS baseline model (C-index = 0.75) (p < 0.01). One standard deviation higher PRS conferred a mean hazard ratio of 1.25 (CI = [1.13, 1.44]) for POAG onset. Conclusions: Higher PRS is associated with increased risk for, and earlier development of, POAG in patients with ocular hypertension. Early treatment may mitigate the risk from high genetic burden, delaying clinically detectable disease by up to 5.2 years. The inclusion of a PRS improves the prediction of POAG onset.
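The concordance index that the abstract uses to compare Cox models can be illustrated with a minimal sketch of Harrell's C-index. This is not the study's code: the function name `concordance_index` and the toy inputs are hypothetical, pairs with tied follow-up times are simply skipped, and tied risk scores count as half-concordant.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject who
    experienced the event earlier also has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: subject i had the event strictly before
            # subject j's event or censoring time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up years, event indicator (1 = event), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported change from 0.75 to 0.77 should be read.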
Affiliation(s)
- Rishabh K. Singh
- Department of Ophthalmology, Columbia University Medical Center, New York, NY
- Schepens Eye Research Institute, Harvard Medical School, Boston, MA
| | - Yan Zhao
- Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
| | - Tobias Elze
- Schepens Eye Research Institute, Harvard Medical School, Boston, MA
| | - John Fingert
- Carver College of Medicine, University of Iowa, Iowa City, IA
| | - Mae Gordon
- Washington University School of Medicine, St. Louis, MO
| | - Yuyang Luo
- Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
| | - Todd Scheetz
- Carver College of Medicine, University of Iowa, Iowa City, IA
| | - Ayellet V. Segrè
- Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Ocular Genomics Institute, Massachusetts Eye and Ear, Boston, MA
| | - Janey L. Wiggs
- Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Ocular Genomics Institute, Massachusetts Eye and Ear, Boston, MA
|
71
|
Zhang Y, Jiang X, Mentzer AJ, McVean G, Lunter G. Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank. CELL GENOMICS 2023; 3:100371. [PMID: 37601973 PMCID: PMC10435382 DOI: 10.1016/j.xgen.2023.100371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 05/04/2023] [Accepted: 07/07/2023] [Indexed: 08/22/2023]
Abstract
Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.
Affiliation(s)
- Yidong Zhang
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, UK
- Chinese Academy of Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Department of Radiation Oncology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100006, China
| | - Xilin Jiang
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, UK
- Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge CB2 0SR, UK
- Heart and Lung Research Institute, University of Cambridge, Cambridge CB2 0BB, UK
| | - Alexander J. Mentzer
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
| | - Gil McVean
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, UK
| | - Gerton Lunter
- MRC Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford OX3 9DS, UK
- Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen 9700 RB, the Netherlands
|
72
|
Yoon N, Cho YS. Development of a Polygenic Risk Score for BMI to Assess the Genetic Susceptibility to Obesity and Related Diseases in the Korean Population. Int J Mol Sci 2023; 24:11560. [PMID: 37511320 PMCID: PMC10380444 DOI: 10.3390/ijms241411560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 07/03/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] Open
Abstract
Hundreds of genetic variants for body mass index (BMI) have been identified from numerous genome-wide association studies (GWAS) in different ethnicities. In this study, we aimed to develop a polygenic risk score (PRS) for BMI for predicting susceptibility to obesity and related traits in the Korean population. For this purpose, we obtained base data resulting from a GWAS on BMI using 57,110 HEXA study subjects from the Korean Genome and Epidemiology Study (KoGES). Subsequently, we calculated PRSs in 13,504 target subjects from the KARE and CAVAS studies of KoGES using the PRSice-2 software. The best-fit PRS for BMI (PRS_BMI) comprising 53,341 SNPs was selected at a p-value threshold of 0.064, at which the model fit had the greatest R² score. The PRS_BMI was tested for its association with obesity-related quantitative traits and diseases in the target dataset. Linear regression analyses demonstrated significant associations of PRS_BMI with BMI, blood pressure, and lipid traits. Logistic regression analyses revealed significant associations of PRS_BMI with obesity, hypertension, and hypo-HDL cholesterolemia. We observed about 2-fold, 1.1-fold, and 1.2-fold risk for obesity, hypertension, and hypo-HDL cholesterolemia, respectively, in the highest-risk group in comparison to the lowest-risk group of PRS_BMI in the test population. We further detected approximately 26.0%, 2.8%, and 3.9% differences in prevalence between the highest and lowest risk groups for obesity, hypertension, and hypo-HDL cholesterolemia, respectively. To predict the incidence of obesity and related diseases, we applied PRS_BMI to the 16-year follow-up data of the KARE study. Kaplan-Meier survival analysis showed that the higher the PRS_BMI, the higher the incidence of dyslipidemia and hypo-HDL cholesterolemia. Taken together, this study demonstrated that a PRS developed for BMI may be a valuable indicator to assess the risk of obesity and related diseases in the Korean population.
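The threshold selection described in this abstract can be sketched in a few lines. This is a simplified illustration, not PRSice-2's implementation: it assumes the SNPs have already been pruned for linkage disequilibrium, ignores covariates, and measures fit by the squared Pearson correlation between score and trait. All function names and toy values are hypothetical.

```python
def polygenic_score(dosages, betas, pvalues, threshold):
    """Per-individual sum of beta * allele dosage over SNPs whose
    base-GWAS p-value is at or below the threshold."""
    keep = [j for j, p in enumerate(pvalues) if p <= threshold]
    return [sum(row[j] * betas[j] for j in keep) for row in dosages]


def r_squared(x, y):
    """Squared Pearson correlation; 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def best_threshold(dosages, betas, pvalues, trait, thresholds):
    """Return the (threshold, R^2) pair with the largest R^2 in the target sample."""
    fits = [
        (t, r_squared(polygenic_score(dosages, betas, pvalues, t), trait))
        for t in thresholds
    ]
    return max(fits, key=lambda fit: fit[1])


# Hypothetical toy data: 4 individuals x 3 SNPs (dosages of 0, 1, or 2).
dosages = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 1, 1]]
betas = [0.5, -0.2, 0.1]       # effect sizes from the base GWAS
pvalues = [0.001, 0.04, 0.5]   # p-values from the base GWAS
trait = [21.0, 24.5, 25.0, 23.5]
print(best_threshold(dosages, betas, pvalues, trait, [0.01, 0.05, 1.0]))
```

Because the threshold is tuned on the target sample itself, the resulting R² is optimistic, which is one reason to evaluate a selected score in separate data.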
Affiliation(s)
- Nara Yoon
- Department of Biomedical Science, Hallym University, Chuncheon 24252, Republic of Korea
| | - Yoon Shin Cho
- Department of Biomedical Science, Hallym University, Chuncheon 24252, Republic of Korea
|
73
|
Bahda M, Ricard J, Girard SL, Maziade M, Isabelle M, Bureau A. Multivariate extension of penalized regression on summary statistics to construct polygenic risk scores for correlated traits. HGG ADVANCES 2023; 4:100209. [PMID: 37333772 PMCID: PMC10276147 DOI: 10.1016/j.xhgg.2023.100209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 05/17/2023] [Indexed: 06/20/2023] Open
Abstract
Genetic correlations between human traits and disorders such as schizophrenia (SZ) and bipolar disorder (BD) diagnoses are well established. Improved prediction of individual traits has been obtained by combining predictors of multiple genetically correlated traits derived from summary statistics produced by genome-wide association studies, compared with single trait predictors. We extend this idea to penalized regression on summary statistics in Multivariate Lassosum, expressing regression coefficients for the multiple traits on single nucleotide polymorphisms (SNPs) as correlated random effects, similarly to multi-trait summary statistic best linear unbiased predictors (MT-SBLUPs). We also allow the SNP contributions to genetic covariance and heritability to depend on genomic annotations. We conducted simulations with two dichotomous traits having polygenic architecture similar to SZ and BD, using genotypes from 29,330 subjects from the CARTaGENE cohort. Multivariate Lassosum produced polygenic risk scores (PRSs) more strongly correlated with the true genetic risk predictor and had better discrimination power between affected and non-affected subjects than previously published sparse multi-trait (PANPRS) and univariate (Lassosum, sparse LDpred2, and the standard clumping and thresholding) methods in most simulation settings. Application of Multivariate Lassosum to predict SZ, BD, and related psychiatric traits in the Eastern Quebec SZ and BD kindred study revealed associations with every trait stronger than those obtained with univariate sparse PRSs, particularly when heritability and genetic covariance depended on genomic annotations. Multivariate Lassosum thus appears promising to improve prediction of genetically correlated traits with summary statistics for a selected subset of SNPs.
Affiliation(s)
- Meriem Bahda
- Department of Mathematics and Statistics, Laval University, Québec, QC G1V 0A6, Canada
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
| | - Jasmin Ricard
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
| | - Simon L. Girard
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
- Department of Fundamental Sciences, University of Quebec in Chicoutimi, Chicoutimi, QC G7H 2B1, Canada
| | - Michel Maziade
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
- Department of Psychiatry and Neurosciences, Laval University, Québec, QC G1V 0A6, Canada
| | - Maripier Isabelle
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
- Department of Economics, Laval University, Québec, QC G1V 0A6, Canada
| | - Alexandre Bureau
- CERVO Brain Research Centre, Québec, QC G1E 1T2, Canada
- Department of Social and Preventive Medicine, Laval University, Québec, QC G1V 0A6, Canada
|
74
|
Sigurdsson AI, Louloudis I, Banasik K, Westergaard D, Winther O, Lund O, Ostrowski S, Erikstrup C, Pedersen O, Nyegaard M, Brunak S, Vilhjálmsson B, Rasmussen S. Deep integrative models for large-scale human genomics. Nucleic Acids Res 2023; 51:e67. [PMID: 37224538 PMCID: PMC10325897 DOI: 10.1093/nar/gkad373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 04/18/2023] [Accepted: 04/28/2023] [Indexed: 05/26/2023] Open
Abstract
Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR.
Affiliation(s)
- Arnór I Sigurdsson
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Ioannis Louloudis
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Karina Banasik
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - David Westergaard
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Ole Winther
- Section for Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen N, Denmark
- Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark
| | - Ole Lund
- Danish National Genome Center, Ørestads Boulevard 5, 2300 Copenhagen S, Denmark
- DTU Health Tech, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
| | - Sisse Rye Ostrowski
- Department of Clinical Immunology, Rigshospitalet, University of Copenhagen, 2200 Copenhagen N, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, 8000 Aarhus C, Denmark
- Department of Clinical Medicine, Aarhus University, 8000 Aarhus C, Denmark
| | - Ole Birger Vesterager Pedersen
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- Department of Clinical Immunology, Zealand University Hospital, 4600 Køge, Denmark
| | - Mette Nyegaard
- Department of Health Science and Technology, Aalborg University, DK- 9260 Gistrup, Denmark
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Bjarni J Vilhjálmsson
- National Centre for Register-Based Research (NCRR), Aarhus University, 8000 Aarhus C, Denmark
- Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH), 8210 Aarhus V, Denmark
- Bioinformatics Research Centre (BiRC), Aarhus University, 8000 Aarhus C, Denmark
| | - Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
75
|
Lehmann B, Mackintosh M, McVean G, Holmes C. Optimal strategies for learning multi-ancestry polygenic scores vary across traits. Nat Commun 2023; 14:4023. [PMID: 37419925 PMCID: PMC10328935 DOI: 10.1038/s41467-023-38930-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Accepted: 05/22/2023] [Indexed: 07/09/2023] Open
Abstract
Polygenic scores (PGSs) are individual-level measures that aggregate the genome-wide genetic predisposition to a given trait. As PGS have predominantly been developed using European-ancestry samples, trait prediction using such European ancestry-derived PGS is less accurate in non-European ancestry individuals. Although there has been recent progress in combining multiple PGS trained on distinct populations, the problem of how to maximize performance given a multiple-ancestry cohort is largely unexplored. Here, we investigate the effect of sample size and ancestry composition on PGS performance for fifteen traits in UK Biobank. For some traits, PGS estimated using a relatively small African-ancestry training set outperformed, on an African-ancestry test set, PGS estimated using a much larger European-ancestry only training set. We observe similar, but not identical, results when considering other minority-ancestry groups within UK Biobank. Our results emphasise the importance of targeted data collection from underrepresented groups in order to address existing disparities in PGS performance.
Affiliation(s)
- Brieuc Lehmann
- Department of Statistical Science, University College London, London, UK.
| | - Gil McVean
- Big Data Institute, University of Oxford, Oxford, UK
| | - Chris Holmes
- The Alan Turing Institute, London, UK
- Big Data Institute, University of Oxford, Oxford, UK
- Department of Statistics, University of Oxford, Oxford, UK
|
76
|
Kentistou KA, Kaisinger LR, Stankovic S, Vaudel M, de Oliveira EM, Messina A, Walters RG, Liu X, Busch AS, Helgason H, Thompson DJ, Santon F, Petricek KM, Zouaghi Y, Huang-Doran I, Gudbjartsson DF, Bratland E, Lin K, Gardner EJ, Zhao Y, Jia R, Terao C, Riggan M, Bolla MK, Yazdanpanah M, Yazdanpanah N, Bradfield JP, Broer L, Campbell A, Chasman DI, Cousminer DL, Franceschini N, Franke LH, Girotto G, He C, Järvelin MR, Joshi PK, Kamatani Y, Karlsson R, Luan J, Lunetta KL, Mägi R, Mangino M, Medland SE, Meisinger C, Noordam R, Nutile T, Concas MP, Polašek O, Porcu E, Ring SM, Sala C, Smith AV, Tanaka T, van der Most PJ, Vitart V, Wang CA, Willemsen G, Zygmunt M, Ahearn TU, Andrulis IL, Anton-Culver H, Antoniou AC, Auer PL, Barnes CLK, Beckmann MW, Berrington A, Bogdanova NV, Bojesen SE, Brenner H, Buring JE, Canzian F, Chang-Claude J, Couch FJ, Cox A, Crisponi L, Czene K, Daly MB, Demerath EW, Dennis J, Devilee P, Vivo ID, Dörk T, Dunning AM, Dwek M, Eriksson JG, Fasching PA, Fernandez-Rhodes L, Ferreli L, Fletcher O, Gago-Dominguez M, García-Closas M, García-Sáenz JA, González-Neira A, Grallert H, Guénel P, Haiman CA, Hall P, Hamann U, Hakonarson H, Hart RJ, Hickey M, Hooning MJ, Hoppe R, Hopper JL, Hottenga JJ, Hu FB, Hübner H, Hunter DJ, Jernström H, John EM, Karasik D, Khusnutdinova EK, Kristensen VN, Lacey JV, Lambrechts D, Launer LJ, Lind PA, Lindblom A, Magnusson PKE, Mannermaa A, McCarthy MI, Meitinger T, Menni C, Michailidou K, Millwood IY, Milne RL, Montgomery GW, Nevanlinna H, Nolte IM, Nyholt DR, Obi N, O’Brien KM, Offit K, Oldehinkel AJ, Ostrowski SR, Palotie A, Pedersen OB, Peters A, Pianigiani G, Plaseska-Karanfilska D, Pouta A, Pozarickij A, Radice P, Rennert G, Rosendaal FR, Ruggiero D, Saloustros E, Sandler DP, Schipf S, Schmidt CO, Schmidt MK, Small K, Spedicati B, Stampfer M, Stone J, Tamimi RM, Teras LR, Tikkanen E, Turman C, Vachon CM, Wang Q, Winqvist R, Wolk A, Zemel BS, Zheng W, van Dijk KW, Alizadeh BZ, Bandinelli S, Boerwinkle E, Boomsma DI, 
Ciullo M, Chenevix-Trench G, Cucca F, Esko T, Gieger C, Grant SFA, Gudnason V, Hayward C, Kolčić I, Kraft P, Lawlor DA, Martin NG, Nøhr EA, Pedersen NL, Pennell CE, Ridker PM, Robino A, Snieder H, Sovio U, Spector TD, Stöckl D, Sudlow C, Timpson NJ, Toniolo D, Uitterlinden A, Ulivi S, Völzke H, Wareham NJ, Widen E, Wilson JF, Pharoah PDP, Li L, Easton DF, Njølstad P, Sulem P, Murabito JM, Murray A, Manousaki D, Juul A, Erikstrup C, Stefansson K, Horikoshi M, Chen Z, Farooqi IS, Pitteloud N, Johansson S, Day FR, Perry JRB, Ong KK. Understanding the genetic complexity of puberty timing across the allele frequency spectrum. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.06.14.23291322. [PMID: 37503126 PMCID: PMC10371120 DOI: 10.1101/2023.06.14.23291322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Pubertal timing varies considerably and has been associated with a range of health outcomes in later life. To elucidate the underlying biological mechanisms, we performed multi-ancestry genetic analyses in ~800,000 women, identifying 1,080 independent signals associated with age at menarche. Collectively these loci explained 11% of the trait variance in an independent sample, with women at the top and bottom 1% of polygenic risk exhibiting a ~11 and ~14-fold higher risk of delayed and precocious pubertal development, respectively. These common variant analyses were supported by exome sequence analysis of ~220,000 women, identifying several genes, including rare loss of function variants in ZNF483 which abolished the impact of polygenic risk. Next, we implicated 660 genes in pubertal development using a combination of in silico variant-to-gene mapping approaches and integration with dynamic gene expression data from mouse embryonic GnRH neurons. This included an uncharacterized G-protein coupled receptor GPR83, which we demonstrate amplifies signaling of MC3R, a key sensor of nutritional status. Finally, we identified several genes, including ovary-expressed genes involved in DNA damage response that co-localize with signals associated with menopause timing, leading us to hypothesize that the ovarian reserve might signal centrally to trigger puberty. Collectively these findings extend our understanding of the biological complexity of puberty timing and highlight body size dependent and independent mechanisms that potentially link reproductive timing to later life disease.
Affiliation(s)
- Katherine A Kentistou
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Lena R Kaisinger
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Stasa Stankovic
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Marc Vaudel
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020, Bergen, Norway
- Department of Genetics and Bioinformatics, Health Data and Digitalization, Norwegian Institute of Public Health, NO-0213, Oslo, Norway
| | - Edson M de Oliveira
- University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge, UK
| | - Andrea Messina
- Division of Endocrinology, Diabetology, and Metabolism, Lausanne University Hospital, 1011 Lausanne, Switzerland
| | - Robin G Walters
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
- MRC Population Health Research Unit, University of Oxford, Oxford OX3 7LF, UK
| | - Xiaoxi Liu
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Alexander S Busch
- Department of General Pediatrics, University of Münster, Münster, Germany
- Department of Growth and Reproduction, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
| | - Hannes Helgason
- deCODE Genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
| | - Deborah J Thompson
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
| | - Federico Santon
- Division of Endocrinology, Diabetology, and Metabolism, Lausanne University Hospital, 1011 Lausanne, Switzerland
| | - Konstantin M Petricek
- Charité Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Pharmacology, Berlin, Germany
| | - Yassine Zouaghi
- Division of Endocrinology, Diabetology, and Metabolism, Lausanne University Hospital, 1011 Lausanne, Switzerland
| | - Isabel Huang-Doran
- University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge, UK
| | - Daniel F Gudbjartsson
- deCODE Genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
| | - Eirik Bratland
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020, Bergen, Norway
- Department of Medical Genetics, Haukeland University Hospital, NO-5021, Bergen, Norway
| | - Kuang Lin
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
| | - Eugene J Gardner
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Yajie Zhao
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Raina Jia
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Margie Riggan
- Department of Gynecology, Duke University Medical Center, Durham, North Carolina, USA
| | - Manjeet K Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
| | - Mojgan Yazdanpanah
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
| | - Nahid Yazdanpanah
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
| | - Jonathan P Bradfield
- Quantinuum Research, Wayne, PA, USA
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Linda Broer
- Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Archie Campbell
- Centre for Genomic and Experimental Medicine, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, UK
| | - Daniel I Chasman
- Division of Preventive Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02215, USA
| | - Diana L Cousminer
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Nora Franceschini
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
| | - Lude H Franke
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Giorgia Girotto
- Institute for Maternal and Child Health – IRCCS “Burlo Garofolo”, Trieste, Italy
- Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
| | - Chunyan He
- Department of Epidemiology and Biostatistics, Department of Big Data in Health Science, School of Public Health, Zhejiang University School of Medicine, Hangzhou 310058, China
- Departments of Medical Oncology and Hematology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou 310016, China
| | - Marjo-Riitta Järvelin
- Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, UK
- Institute of Health Sciences, P.O. Box 5000, FI-90014 University of Oulu, Finland
- Biocenter Oulu, P.O. Box 5000, Aapistie 5A, FI-90014 University of Oulu, Finland
- Unit of Primary Care, Oulu University Hospital, Kajaanintie 50, P.O. Box 20, FI-90220 Oulu, 90029 OYS, Finland
- Department of Children and Young People and Families, National Institute for Health and Welfare, Aapistie 1, Box 310, FI-90101 Oulu, Finland
| | - Peter K Joshi
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Teviot Place, Edinburgh EH8 9AG, Scotland
| | - Yoichiro Kamatani
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
| | - Robert Karlsson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Jian’an Luan
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Kathryn L Lunetta
- Boston University School of Public Health, Department of Biostatistics, Boston, Massachusetts 02118, USA
- NHLBI’s and Boston University’s Framingham Heart Study, Framingham, Massachusetts 01702-5827, USA
| | - Reedik Mägi
- Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Massimo Mangino
- Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK
- NIHR Biomedical Research Centre at Guy’s and St. Thomas’ Foundation Trust, London, UK
| | - Sarah E Medland
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- School of Psychology, University of Queensland, Brisbane, Queensland, Australia
- Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
| | - Christa Meisinger
- Epidemiology, Medical Faculty, University of Augsburg, University Hospital of Augsburg, Augsburg, Germany
| | - Raymond Noordam
- Department of Internal Medicine, Section of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, The Netherlands
| | - Teresa Nutile
- Institute of Genetics and Biophysics “A. Buzzati-Traverso”, CNR, Naples, Italy
| | - Maria Pina Concas
- Institute for Maternal and Child Health – IRCCS “Burlo Garofolo”, Trieste, Italy
| | - Ozren Polašek
- University of Split School of Medicine, Split, Croatia
- Algebra University College, Zagreb, Croatia
| | - Eleonora Porcu
- Institute of Genetics and Biomedical Research, National Research Council, Cagliari, Sardinia 09042, Italy
- University of Sassari, Department of Biomedical Sciences, Sassari, Sassari 07100, Italy
| | - Susan M Ring
- MRC Integrative Epidemiology Unit at the University of Bristol, UK
- Population Health Science, Bristol Medical School, University of Bristol, UK
| | - Cinzia Sala
- Division of Genetics and Cell Biology, San Raffaele Hospital, Milano, Italy
| | - Albert V Smith
- Icelandic Heart Association, 201 Kopavogur, Iceland
- Faculty of Medicine, University of Iceland, 101 Reykjavik, Iceland
| | - Toshiko Tanaka
- National Institute on Aging, National Institutes of Health, Baltimore, Maryland, USA
| | - Peter J van der Most
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, The Netherlands
| | - Veronique Vitart
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Carol A Wang
- School of Medicine and Public Health, University of Newcastle, Newcastle, New South Wales 2308, Australia
- Hunter Medical Research Institute, Newcastle, New South Wales 2305, Australia
| | - Gonneke Willemsen
- Dept of Biological Psychology, Vrije Universiteit, Amsterdam; Amsterdam Public Health (APH) research institute, The Netherlands
| | - Marek Zygmunt
- Clinic of Gynaecology and Obstetrics, University Medicine Greifswald, Germany
| | - Thomas U Ahearn
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA
| | - Irene L Andrulis
- Fred A. Litwin Center for Cancer Genetics, Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Hoda Anton-Culver
- Department of Medicine, Genetic Epidemiology Research Institute, University of California Irvine, Irvine, CA, USA
| | - Antonis C Antoniou
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
| | - Paul L Auer
- Division of Biostatistics, Institute for Health and Equity, and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Catriona LK Barnes
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Teviot Place, Edinburgh EH8 9AG, Scotland
| | - Matthias W Beckmann
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
| | - Amy Berrington
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
| | - Natalia V Bogdanova
- Department of Radiation Oncology, Hannover Medical School, Hannover, Germany
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
- N.N. Alexandrov Research Institute of Oncology and Medical Radiology, Minsk, Belarus
| | - Stig E Bojesen
- Copenhagen General Population Study, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark
- Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Julie E Buring
- Division of Preventive Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02215, USA
| | - Federico Canzian
- Genomic Epidemiology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Fergus J Couch
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
| | - Angela Cox
- Sheffield Institute for Nucleic Acids (SInFoNiA), Department of Oncology and Metabolism, University of Sheffield, Sheffield, UK
| | - Laura Crisponi
- Institute of Genetics and Biomedical Research, National Research Council, Cagliari, Sardinia 09042, Italy
| | - Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Mary B Daly
- Department of Clinical Genetics, Fox Chase Cancer Center, Philadelphia, PA, USA
| | - Ellen W Demerath
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, USA
| | - Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
| | - Peter Devilee
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Immaculata De Vivo
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
| | - Alison M Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Miriam Dwek
- School of Life Sciences, University of Westminster, London, UK
| | - Johan G Eriksson
- Department of General Practice and Primary Healthcare, University of Helsinki, Helsinki University Hospital, Helsinki, Finland
| | - Peter A Fasching
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
| | - Liana Ferreli
- Institute of Genetics and Biomedical Research, National Research Council, Cagliari, Sardinia 09042, Italy
| | - Olivia Fletcher
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK
| | - Manuela Gago-Dominguez
- Genomic Medicine Group, International Cancer Genetics and Epidemiology Group, Fundación Pública Galega de Medicina Xenómica, Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Complejo Hospitalario Universitario de Santiago, SERGAS, Santiago de Compostela, Spain
| | - Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA
| | - José A García-Sáenz
- Medical Oncology Department, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), Centro Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
| | - Anna González-Neira
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Harald Grallert
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- Institute of Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
| | - Pascal Guénel
- Team “Exposome and Heredity”, CESP, Gustave Roussy INSERM, University Paris-Saclay, UVSQ Villejuif, France
| | - Christopher A Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Oncology, Södersjukhuset, Stockholm, Sweden
| | - Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Hakon Hakonarson
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Pulmonary Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Roger J Hart
- Division of Obstetrics and Gynaecology, University of Western Australia, Western Australia, Australia
| | - Martha Hickey
- Department of Obstetrics and Gynaecology at the University of Melbourne and The Royal Women’s Hospital, Victoria, Australia
| | - Maartje J Hooning
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
| | - Reiner Hoppe
- Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, Germany
- University of Tübingen, Tübingen, Germany
| | - John L Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia
| | - Jouke-Jan Hottenga
- Dept of Biological Psychology, Vrije Universiteit, Amsterdam; Amsterdam Public Health (APH) research institute, The Netherlands
| | - Frank B Hu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
| | - Hanna Hübner
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany
| | - David J Hunter
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
| | - ABCTB Investigators
- Australian Breast Cancer Tissue Bank, Westmead Institute for Medical Research, University of Sydney, Sydney, New South Wales, Australia
| | - Helena Jernström
- Oncology, Department of Clinical Sciences in Lund, Lund University, Lund, Sweden
| | - Esther M John
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Department of Medicine, Division of Oncology, Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - David Karasik
- Hebrew SeniorLife Institute for Aging Research, Boston, Massachusetts, USA
- Harvard Medical School, Boston, Massachusetts, USA
| | - Elza K Khusnutdinova
- Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia
- Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
| | - Vessela N Kristensen
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, Oslo, Norway
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - James V Lacey
- Department of Computational and Quantitative Medicine, City of Hope, Duarte, CA, USA
- City of Hope Comprehensive Cancer Center, City of Hope, Duarte, CA, USA
| | - Diether Lambrechts
- Laboratory for Translational Genetics, Department of Human Genetics, KU Leuven, Leuven, Belgium
- VIB Center for Cancer Biology, VIB, Leuven, Belgium
| | - Lenore J Launer
- Laboratory of Epidemiology and Population Sciences, National Institute on Aging, Intramural Research Program, National Institutes of Health, Bethesda, Maryland, 20892, USA
| | - Penelope A Lind
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
- School of Biomedical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Annika Lindblom
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| | - Patrik KE Magnusson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Arto Mannermaa
- Translational Cancer Research Area, University of Eastern Finland, Kuopio, Finland
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, Kuopio, Finland
| | - Mark I McCarthy
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Oxford Centre for Diabetes, Endocrinology, & Metabolism, University of Oxford, Churchill Hospital, Oxford OX3 7LJ, UK
- NIHR Oxford Biomedical Research Centre, Churchill Hospital, OX3 7LE Oxford, UK
| | - Thomas Meitinger
- Institute of Human Genetics, Klinikum rechts der Isar, Technical University of Munich, School of Medicine, Munich, Germany
| | - Cristina Menni
- Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK
| | - Kyriaki Michailidou
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
- Biostatistics Unit, The Cyprus Institute of Neurology & Genetics, Nicosia, Cyprus
| | - Iona Y Millwood
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
- MRC Population Health Research Unit, University of Oxford, Oxford OX3 7LF, UK
| | - Roger L Milne
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
| | - Grant W Montgomery
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, Helsinki, Finland
| | - Ilja M Nolte
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Dale R Nyholt
- School of Biomedical Sciences, Faculty of Health, Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Nadia Obi
- Institute for Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
| | - Katie M O’Brien
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, USA
| | - Kenneth Offit
- Clinical Genetics Research Lab, Department of Cancer Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Albertine J Oldehinkel
- Interdisciplinary Center Psychopathology and Emotion Regulation, University Medical Center Groningen, University of Groningen, The Netherlands
| | - Sisse R Ostrowski
- Department of Clinical Immunology, Rigshospitalet - University of Copenhagen, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
| | - Aarno Palotie
- Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Ole B Pedersen
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
| | - Annette Peters
- Institute of Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- Institute for Medical Information Processing, Biometry and Epidemiology - IBE, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Giulia Pianigiani
- Institute for Maternal and Child Health – IRCCS “Burlo Garofolo”, Trieste, Italy
| | - Dijana Plaseska-Karanfilska
- Research Centre for Genetic Engineering and Biotechnology “Georgi D. Efremov”, MASA, Skopje, Republic of North Macedonia
| | - Anneli Pouta
- National Institute for Health and Welfare, Finland
| | - Alfred Pozarickij
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
| | - Paolo Radice
- Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research Fondazione IRCCS, Istituto Nazionale dei Tumori (INT), Milan, Italy
| | - Gad Rennert
- Clalit National Cancer Control Center, Carmel Medical Center and Technion, Faculty of Medicine, Haifa, Israel
| | - Frits R Rosendaal
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Daniela Ruggiero
- Institute of Genetics and Biophysics “A. Buzzati-Traverso”, CNR, Naples, Italy
- IRCCS Neuromed, Pozzilli, Isernia, Italy
| | - Dale P Sandler
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, USA
| | - Sabine Schipf
- Institute for Community Medicine, University Medicine Greifswald, Germany
| | - Carsten O Schmidt
- Institute for Community Medicine, University Medicine Greifswald, Germany
| | - Marjanka K Schmidt
- Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek hospital, Amsterdam, The Netherlands
| | - Kerrin Small
- Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK
| | - Beatrice Spedicati
- Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
| | - Meir Stampfer
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
| | - Jennifer Stone
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia
- Genetic Epidemiology Group, School of Population and Global Health, University of Western Australia, Perth, Western Australia, Australia
| | - Rulla M Tamimi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Lauren R Teras
- Department of Population Science, American Cancer Society, Atlanta, GA, USA
| | - Emmi Tikkanen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Public Health Genomics Unit, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Finland
| | - Constance Turman
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Celine M Vachon
- Department of Quantitative Health Sciences, Division of Epidemiology, Mayo Clinic, Rochester, MN, USA
| | - Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
| | - Robert Winqvist
- Laboratory of Cancer Genetics and Tumor Biology, Translational Medicine Research Unit, Biocenter Oulu, University of Oulu, Oulu, Finland
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, Oulu, Finland
| | - Alicja Wolk
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Babette S Zemel
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Gastroenterology, Hepatology and Nutrition, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Ko W van Dijk
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Department of Internal Medicine, Division of Endocrinology, Leiden University Medical Center, Leiden, The Netherlands
| | - Behrooz Z Alizadeh
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Eric Boerwinkle
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Dorret I Boomsma
- Dept of Biological Psychology, Vrije Universiteit, Amsterdam; Amsterdam Public Health (APH) research institute, The Netherlands
- Amsterdam Reproduction & Development research institute, Amsterdam, The Netherlands
| | - Marina Ciullo
- Institute of Genetics and Biophysics “A. Buzzati-Traverso”, CNR, Naples, Italy
- IRCCS Neuromed, Pozzilli, Isernia, Italy
| | - Francesco Cucca
- Institute of Genetics and Biomedical Research, National Research Council, Cagliari, Sardinia 09042, Italy
- University of Sassari, Department of Biomedical Sciences, Sassari, Sassari 07100, Italy
| | - Tõnu Esko
- Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Christian Gieger
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- Institute of Epidemiology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Struan FA Grant
- Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Endocrinology and Diabetes, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
| | - Vilmundur Gudnason
- Icelandic Heart Association, 201 Kopavogur, Iceland
- Faculty of Medicine, University of Iceland, 101 Reykjavik, Iceland
| | - Caroline Hayward
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Ivana Kolčić
- University of Split School of Medicine, Split, Croatia
- Algebra University College, Zagreb, Croatia
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Deborah A Lawlor
- MRC Integrative Epidemiology Unit at the University of Bristol, UK
- Population Health Science, Bristol Medical School, University of Bristol, UK
| | - Nicholas G Martin
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Ellen A Nøhr
- Institute of Clinical Research, University of Southern Denmark, Department of Obstetrics & Gynecology, Odense University Hospital, Denmark
| | - Nancy L Pedersen
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Craig E Pennell
- School of Medicine and Public Health, University of Newcastle, Newcastle, New South Wales 2308, Australia
- Hunter Medical Research Institute, Newcastle, New South Wales 2305, Australia
- Department of Maternity and Gynaecology, John Hunter Hospital, Newcastle, New South Wales 2305, Australia
| | - Paul M Ridker
- Division of Preventive Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02215, USA
| | - Antonietta Robino
- Institute for Maternal and Child Health – IRCCS “Burlo Garofolo”, Trieste, Italy
| | - Harold Snieder
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Ulla Sovio
- Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, UK
- Department of Obstetrics and Gynaecology, University of Cambridge, Cambridge, UK
| | - Tim D Spector
- Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK
| | - Doris Stöckl
- Gesundheitsamt Fürstenfeldbruck, Regierung von Oberbayern, Fürstenfeldbruck, Germany
| | - Cathie Sudlow
- Centre for Genomic and Experimental Medicine, Institute of Genetics & Cancer, University of Edinburgh, Edinburgh, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh
| | - Nic J Timpson
- MRC Integrative Epidemiology Unit at the University of Bristol, UK
- Population Health Science, Bristol Medical School, University of Bristol, UK
| | - Daniela Toniolo
- Division of Genetics and Cell Biology, San Raffaele Hospital, Milano, Italy
| | - André Uitterlinden
- Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Sheila Ulivi
- Institute for Maternal and Child Health – IRCCS “Burlo Garofolo”, Trieste, Italy
| | - Henry Völzke
- Institute for Community Medicine, University Medicine Greifswald, Germany
| | - Nicholas J Wareham
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - Elisabeth Widen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - James F Wilson
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Teviot Place, Edinburgh EH8 9AG, Scotland
| | | | | | | | | | | | | | - Paul DP Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Liming Li
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
- Center for Public Health and Epidemic Preparedness and Response, Peking University, Beijing, China
| | - Douglas F Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK
| | - Pål Njølstad
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020, Bergen, Norway
- Department of Pediatrics and Adolescents, Haukeland University Hospital, NO-5021, Bergen, Norway
| | | | - Joanne M Murabito
- NHLBI’s and Boston University’s Framingham Heart Study, Framingham, Massachusetts 01702-5827, USA
- Boston University Chobanian & Avedisian School of Medicine, Department of Medicine, Section of General Internal Medicine, Boston, MA 02118, USA
| | - Anna Murray
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, RILD Level 3, Royal Devon & Exeter Hospital, Barrack Road, Exeter, EX2 5DW, UK
| | - Despoina Manousaki
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
- Department of Pediatrics, University of Montreal, Montreal, Canada
- Department of Biochemistry and Molecular Medicine, University of Montreal, Montreal, Canada
| | - Anders Juul
- Department of Growth and Reproduction, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- International Center for Research and Research Training in Endocrine Disruption of Male Reproduction and Child Health (EDMaRC), Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
| | - Kari Stefansson
- deCODE Genetics/Amgen, Inc., Reykjavik, Iceland
- Faculty of Medicine, University of Iceland, 101 Reykjavik, Iceland
| | - Momoko Horikoshi
- Laboratory for Genomics of Diabetes and Metabolism, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Zhengming Chen
- Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
- MRC Population Health Research Unit, University of Oxford, Oxford OX3 7LF, UK
| | - I Sadaf Farooqi
- University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Wellcome-MRC Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge, UK
| | - Nelly Pitteloud
- Division of Endocrinology, Diabetology, and Metabolism, Lausanne University Hospital, 1011 Lausanne, Switzerland
- Faculty of Biology and Medicine, University of Lausanne, Lausanne 1005, Switzerland
| | - Stefan Johansson
- Mohn Center for Diabetes Precision Medicine, Department of Clinical Science, University of Bergen, NO-5020, Bergen, Norway
- Department of Medical Genetics, Haukeland University Hospital, NO-5021, Bergen, Norway
| | - Felix R Day
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
| | - John RB Perry
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- Metabolic Research Laboratory, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge CB2 0QQ, UK
| | - Ken K Ong
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Box 285 Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- Department of Paediatrics, University of Cambridge, Cambridge CB2 0QQ, UK
77
Liu X, Morelli D, Littlejohns TJ, Clifton DA, Clifton L. Combining machine learning with Cox models to identify predictors for incident post-menopausal breast cancer in the UK Biobank. Sci Rep 2023; 13:9221. [PMID: 37286615 DOI: 10.1038/s41598-023-36214-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 05/31/2023] [Indexed: 06/09/2023] Open
Abstract
We aimed to identify potential novel predictors for breast cancer among post-menopausal women, with pre-specified interest in the role of polygenic risk scores (PRS) for risk prediction. We utilised an analysis pipeline where machine learning was used for feature selection, prior to risk prediction by classical statistical models. An "extreme gradient boosting" (XGBoost) machine with Shapley feature-importance measures was used for feature selection among [Formula: see text] 1.7k features in 104,313 post-menopausal women from the UK Biobank. We constructed and compared the "augmented" Cox model (incorporating the two PRS, known and novel predictors) with a "baseline" Cox model (incorporating the two PRS and known predictors) for risk prediction. Both PRS were significant in the augmented Cox model ([Formula: see text]). XGBoost identified 10 novel features, among which five showed significant associations with post-menopausal breast cancer: plasma urea (HR = 0.95, 95% CI 0.92-0.98, [Formula: see text]), plasma phosphate (HR = 0.68, 95% CI 0.53-0.88, [Formula: see text]), basal metabolic rate (HR = 1.17, 95% CI 1.11-1.24, [Formula: see text]), red blood cell count (HR = 1.21, 95% CI 1.08-1.35, [Formula: see text]), and creatinine in urine (HR = 1.05, 95% CI 1.01-1.09, [Formula: see text]). Risk discrimination was maintained in the augmented Cox model, yielding C-index 0.673 vs 0.667 (baseline Cox model) with the training data and 0.665 vs 0.664 with the test data. We identified blood/urine biomarkers as potential novel predictors for post-menopausal breast cancer. Our findings provide new insights into breast cancer risk. Future research should validate the novel predictors and investigate the use of multiple PRS and more precise anthropometric measures for better breast cancer risk prediction.
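The C-index quoted in this abstract measures how well a model ranks individuals by risk. As a minimal sketch, and not the authors' code, Harrell's concordance index for right-censored data could be computed as follows. The function name and the toy inputs are hypothetical, and counting tied risk scores as half-concordant is one common convention assumed here.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    times       -- follow-up time for each individual
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event
            # strictly before j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # assumed convention for ties
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): risk ordering matches event ordering.
c = harrell_c_index([1, 2, 3, 4], [1, 1, 1, 1], [4.0, 3.0, 2.0, 1.0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported values of roughly 0.66 to 0.67 in context.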
Affiliation(s)
- Xiaonan Liu
- Nuffield Department of Population Health, University of Oxford, Oxford, UK.
- Big Data Institute, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK.
| | - Davide Morelli
- Department of Engineering Science, University of Oxford, Oxford, UK
| | - Thomas J Littlejohns
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Big Data Institute, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
| | - David A Clifton
- Department of Engineering Science, University of Oxford, Oxford, UK
| | - Lei Clifton
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Big Data Institute, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
78
Zhao B, Li T, Fan Z, Yang Y, Shu J, Yang X, Wang X, Luo T, Tang J, Xiong D, Wu Z, Li B, Chen J, Shan Y, Tomlinson C, Zhu Z, Li Y, Stein JL, Zhu H. Heart-brain connections: Phenotypic and genetic insights from magnetic resonance images. Science 2023; 380:abn6598. [PMID: 37262162 DOI: 10.1126/science.abn6598] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 12/10/2021] [Accepted: 04/11/2023] [Indexed: 06/03/2023]
Abstract
Cardiovascular health interacts with cognitive and mental health in complex ways, yet little is known about the phenotypic and genetic links of heart-brain systems. We quantified heart-brain connections using multiorgan magnetic resonance imaging (MRI) data from more than 40,000 subjects. Heart MRI traits displayed numerous association patterns with brain gray matter morphometry, white matter microstructure, and functional networks. We identified 80 associated genomic loci (P < 6.09 × 10⁻¹⁰) for heart MRI traits, which shared genetic influences with cardiovascular and brain diseases. Genetic correlations were observed between heart MRI traits and brain-related traits and disorders. Mendelian randomization suggests that heart conditions may causally contribute to brain disorders. Our results advance a multiorgan perspective on human health by revealing heart-brain connections and shared genetic influences.
Affiliation(s)
- Bingxin Zhao
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Zirui Fan
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Yue Yang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Juan Shu
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Xiaochen Yang
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Xifeng Wang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Tianyou Luo
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiarui Tang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Di Xiong
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Zhenyi Wu
- Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
| | - Bingxuan Li
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
| | - Jie Chen
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yue Shan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Chalmer Tomlinson
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Ziliang Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Hongtu Zhu
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
79
Zhao B, Zou F, Zhu H. Cross-trait prediction accuracy of summary statistics in genome-wide association studies. Biometrics 2023; 79:841-853. [PMID: 35278218 PMCID: PMC9464799 DOI: 10.1111/biom.13661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Accepted: 02/25/2022] [Indexed: 11/27/2022]
Abstract
In the era of big data, univariate models have been widely used as a workhorse tool for quickly producing marginal estimators; this is true even in a high-dimensional dense setting, in which many features are "true" but weak signals. Genome-wide association studies (GWAS) epitomize this type of setting. Although the GWAS marginal estimator is popular, it has long been criticized for ignoring the correlation structure of genetic variants (i.e., the linkage disequilibrium [LD] pattern). In this paper, we study the effects of the LD pattern on the GWAS marginal estimator and investigate whether additionally accounting for LD can improve the prediction accuracy of complex traits. We consider a general high-dimensional dense setting for GWAS and study a class of ridge-type estimators, including the popular marginal estimator and the best linear unbiased prediction (BLUP) estimator as two special cases. We show that the performance of the GWAS marginal estimator depends on the LD pattern through the first three moments of its eigenvalue distribution. Furthermore, we find that the relative performance of the GWAS marginal and BLUP estimators depends strongly on the ratio of the GWAS sample size to the number of genetic variants. In particular, our findings reveal that the marginal estimator can easily become near-optimal within this class when the sample size is relatively small, even though it ignores the LD pattern. On the other hand, the BLUP estimator performs substantially better than the marginal estimator as the sample size increases toward the number of genetic variants, which is typically in the millions. Therefore, adjusting for LD (such as in the BLUP) is most needed when the GWAS sample size is large. We illustrate the importance of our results using simulated data and real GWAS.
Affiliation(s)
- Bingxin Zhao
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, U.S.A
| | - Fei Zou
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, U.S.A
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, U.S.A
80
Zhai S, Guo B, Wu B, Mehrotra DV, Shen J. Integrating multiple traits for improving polygenic risk prediction in disease and pharmacogenomics GWAS. Brief Bioinform 2023:7169140. [PMID: 37200155 DOI: 10.1093/bib/bbad181] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Revised: 03/30/2023] [Accepted: 04/21/2023] [Indexed: 05/20/2023] Open
Abstract
The polygenic risk score (PRS) has recently been developed for predicting complex traits and drug responses. It remains unknown whether multi-trait PRS (mtPRS) methods, by integrating information from multiple genetically correlated traits, can improve prediction accuracy and power for PRS analysis compared with single-trait PRS (stPRS) methods. In this paper, we first review commonly used mtPRS methods and find that they do not directly model the underlying genetic correlations among traits, which have been shown in the literature to be useful in guiding multi-trait association analysis. To overcome this limitation, we propose an mtPRS-PCA method to combine PRSs from multiple traits with weights obtained from performing principal component analysis (PCA) on the genetic correlation matrix. To accommodate various genetic architectures covering different effect directions, signal sparseness and across-trait correlation structures, we further propose an omnibus mtPRS method (mtPRS-O) by combining P values from mtPRS-PCA, mtPRS-ML (mtPRS based on machine learning) and stPRSs using the Cauchy Combination Test. Our extensive simulation studies show that mtPRS-PCA outperforms other mtPRS methods in both disease and pharmacogenomics (PGx) genome-wide association studies (GWAS) contexts when traits are similarly correlated, with dense signal effects and in similar effect directions, and mtPRS-O is consistently superior to most other methods due to its robustness under various genetic architectures. We further apply mtPRS-PCA, mtPRS-O and other methods to PGx GWAS data from a randomized clinical trial in the cardiovascular domain and demonstrate performance improvement of mtPRS-PCA in both prediction accuracy and patient stratification as well as the robustness of mtPRS-O in the PRS association test.
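The omnibus step described in this abstract combines P values with the Cauchy Combination Test. As a minimal sketch of that test in its standard form, and not the authors' mtPRS-O implementation, each P value is mapped to a Cauchy variate, the variates are averaged with weights, and the result is mapped back to a P value. The function name is illustrative, and equal weights are an assumption used when none are supplied.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine P values with the Cauchy Combination Test.

    Statistic: T = sum_i w_i * tan((0.5 - p_i) * pi), with sum_i w_i = 1.
    Combined P value: 0.5 - arctan(T) / pi.
    """
    if weights is None:
        # Assumption: equal weights when the caller supplies none.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("each P value must lie strictly between 0 and 1")
    total = sum(weights)
    # Normalise so the weights sum to one.
    t = sum((w / total) * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    return 0.5 - math.atan(t) / math.pi


# Hypothetical P values from three component tests.
combined = cauchy_combination([0.01, 0.20, 0.50])
```

The appeal of this combination rule is that the tail of the statistic is approximately Cauchy even when the component tests are correlated, so no estimate of the correlation between the component PRS tests is needed.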
Affiliation(s)
- Song Zhai
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
| | - Bin Guo
- Data and Genome Science, Merck & Co., Inc., Cambridge, MA 02141, USA
| | - Baolin Wu
- Department of Epidemiology and Biostatistics, University of California Irvine, Irvine, CA 92697, USA
| | - Devan V Mehrotra
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA 19454, USA
| | - Judong Shen
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA
81
Hou X, Ma B, Liu M, Zhao Y, Chai B, Pan J, Wang P, Li D, Liu S, Song F. The transcriptional risk scores for kidney renal clear cell carcinoma using XGBoost and multiple omics data. Math Biosci Eng 2023; 20:11676-11687. [PMID: 37501415 DOI: 10.3934/mbe.2023519] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
Most kidney cancers are kidney renal clear cell carcinoma (KIRC), a main cause of cancer-related deaths. A polygenic risk score (PRS) is a weighted linear combination of phenotype-related alleles across the genome that can be used to assess KIRC risk. However, standalone SNP data as input to the PRS model may not provide satisfactory results. Therefore, transcriptional risk scores (TRS) based on multi-omics data and machine learning models were proposed to assess the risk of KIRC. First, we collected four types of multi-omics data (DNA methylation, miRNA, mRNA and lncRNA) of KIRC patients from the TCGA database. Subsequently, a novel TRS method utilizing multiple omics data and an XGBoost model was developed. Finally, we performed prevalence analysis and prognosis prediction to evaluate the utility of the TRS generated by our method. Our TRS method exhibited better predictive performance than the linear models and other machine learning models. Furthermore, the prediction accuracy of the combined TRS model was higher than that of the single-omics TRS models. The Kaplan-Meier (KM) curves showed that TRS was a valid prognostic indicator for cancer staging. Our proposed method extended the current definition of TRS from standalone SNP data to multi-omics data and was superior to the linear models and other machine learning models, and may provide a useful tool for diagnostic and prognostic prediction of KIRC.
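The abstract defines a PRS as a weighted linear combination of alleles. A minimal sketch of that definition, assuming allele dosages coded 0, 1 or 2 and per-variant effect sizes taken from a GWAS; the function name and the numbers are hypothetical and do not come from the paper.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages: PRS = sum_j beta_j * x_j."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("need one effect size per variant")
    return sum(beta * x for beta, x in zip(effect_sizes, dosages))


# Hypothetical individual with three variants.
# Dosage = number of effect alleles carried (0, 1 or 2).
score = polygenic_score([0, 1, 2], [0.1, -0.2, 0.3])
```

The TRS described in the abstract replaces this fixed linear form with an XGBoost model over multi-omics features, so the sketch illustrates only the PRS baseline.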
Affiliation(s)
- Xiaoyu Hou
- School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
| | - Baoshan Ma
- School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
| | - Ming Liu
- Physical Department of Science and Technology, Dalian University, Dalian 116622, China
| | - Yuxuan Zhao
- School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
| | - Bingjie Chai
- School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
| | - Jianqiao Pan
- School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
| | - Pengcheng Wang
- Department of Mechanical Engineering, University of Houston, Houston 77204, USA
| | - Di Li
- Department of Neuro Intervention, Dalian Medical University affiliated Dalian Municipal Central Hospital, Dalian 116033, China
| | - Shuxin Liu
- Department of Nephrology, Dalian Medical University affiliated Dalian Municipal Central Hospital, Dalian 116033, China
| | - Fengju Song
- Department of Epidemiology and Biostatistics, Key Laboratory of Molecular Cancer Epidemiology, Tianjin, National Clinical Research Center of Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
82
Zabad S, Gravel S, Li Y. Fast and accurate Bayesian polygenic risk modeling with variational inference. Am J Hum Genet 2023; 110:741-761. [PMID: 37030289 PMCID: PMC10183379 DOI: 10.1016/j.ajhg.2023.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 03/13/2023] [Indexed: 04/10/2023] Open
Abstract
The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches perform posterior inference with Markov chain Monte Carlo (MCMC) algorithms, which are computationally inefficient and do not scale favorably to higher dimensions. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to a 1.7-fold increase in R² among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.
Affiliation(s)
- Shadi Zabad
- School of Computer Science, McGill University, Montreal, QC, Canada
| | - Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada.
| | - Yue Li
- School of Computer Science, McGill University, Montreal, QC, Canada.
83
Clark R, Lee SSY, Du R, Wang Y, Kneepkens SCM, Charng J, Huang Y, Hunter ML, Jiang C, Tideman JWL, Melles RB, Klaver CCW, Mackey DA, Williams C, Choquet H, Ohno-Matsui K, Guggenheim JA. A new polygenic score for refractive error improves detection of children at risk of high myopia but not the prediction of those at risk of myopic macular degeneration. EBioMedicine 2023; 91:104551. [PMID: 37055258 PMCID: PMC10203044 DOI: 10.1016/j.ebiom.2023.104551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 03/17/2023] [Accepted: 03/17/2023] [Indexed: 04/15/2023] Open
Abstract
BACKGROUND: High myopia (HM), defined as a spherical equivalent refractive error (SER) ≤ -6.00 diopters (D), is a leading cause of sight impairment, through myopic macular degeneration (MMD). We aimed to derive an improved polygenic score (PGS) for predicting children at risk of HM and to test if a PGS is predictive of MMD after accounting for SER. METHODS: The PGS was derived from genome-wide association studies in participants of UK Biobank, CREAM Consortium, and Genetic Epidemiology Research on Adult Health and Aging. MMD severity was quantified by a deep learning algorithm. Prediction of HM was quantified as the area under the receiver operating characteristic curve (AUROC). Prediction of severe MMD was assessed by logistic regression. FINDINGS: In independent samples of European, African, South Asian and East Asian ancestry, the PGS explained 19% (95% confidence interval 17-21%), 2% (1-3%), 8% (7-10%) and 6% (3-9%) of the variation in SER, respectively. The AUROC for HM in these samples was 0.78 (0.75-0.81), 0.58 (0.53-0.64), 0.71 (0.69-0.74) and 0.67 (0.62-0.72), respectively. The PGS was not associated with the risk of MMD after accounting for SER: OR = 1.07 (0.92-1.24). INTERPRETATION: Performance of the PGS approached the level required for clinical utility in Europeans but not in other ancestries. A PGS for refractive error was not predictive of MMD risk once SER was accounted for. FUNDING: Supported by the Welsh Government and Fight for Sight (24WG201).
Affiliation(s)
- Rosie Clark
- School of Optometry & Vision Sciences, Cardiff University, Maindy Road, Cardiff, CF24 4HQ, UK
| | - Samantha Sze-Yee Lee
- University of Western Australia, Centre for Ophthalmology and Visual Science (incorporating the Lions Eye Institute), Perth, Western Australia, Australia
| | - Ran Du
- Department of Ophthalmology and Visual Science, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, 1138510, Japan; Department of Ophthalmology, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
| | - Yining Wang
- Department of Ophthalmology and Visual Science, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, 1138510, Japan
| | - Sander C M Kneepkens
- Department of Ophthalmology, Erasmus University Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands; Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands
- Jason Charng
- University of Western Australia, Centre for Ophthalmology and Visual Science (incorporating the Lions Eye Institute), Perth, Western Australia, Australia; Department of Optometry, School of Allied Health, University of Western Australia, Perth, Australia
- Yu Huang
- Department of Ophthalmology, Guangdong Eye Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, China
- Michael L Hunter
- Busselton Health Study Centre, Busselton Population Medical Research Institute, Busselton, Western Australia; School of Population and Global Health, University of Western Australia, Perth, Western Australia
- Chen Jiang
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
- J Willem L Tideman
- Department of Ophthalmology, Martini Hospital, Groningen, the Netherlands; Department of Ophthalmology, Erasmus University Medical Center, Rotterdam, the Netherlands
- Ronald B Melles
- Department of Ophthalmology Kaiser Permanente Northern California, Redwood City, CA, USA
- Caroline C W Klaver
- Department of Ophthalmology, Erasmus University Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands; Generation R Study Group, Erasmus University Medical Center, Rotterdam, the Netherlands; Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland; Department of Ophthalmology, Radboud University Medical Center, Nijmegen, the Netherlands
- David A Mackey
- University of Western Australia, Centre for Ophthalmology and Visual Science (incorporating the Lions Eye Institute), Perth, Western Australia, Australia; Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, University of Melbourne, East Melbourne, Victoria, Australia; School of Medicine, Menzies Research Institute Tasmania, University of Tasmania, Hobart, Tasmania, Australia
- Cathy Williams
- Centre for Academic Child Health, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS81NU, UK
- Hélène Choquet
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
- Kyoko Ohno-Matsui
- Department of Ophthalmology and Visual Science, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, 1138510, Japan
- Jeremy A Guggenheim
- School of Optometry & Vision Sciences, Cardiff University, Maindy Road, Cardiff, CF24 4HQ, UK
|
84
|
Deng WQ, Belisario K, Gray JC, Levitt EE, Mohammadi-Shemirani P, Singh D, Pare G, MacKillop J. Leveraging related health phenotypes for polygenic prediction of impulsive choice, impulsive action, and impulsive personality traits in 1534 European ancestry community adults. Genes, Brain, and Behavior 2023:e12848. [PMID: 37060189 DOI: 10.1111/gbb.12848] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/16/2022] [Revised: 03/17/2023] [Accepted: 03/31/2023] [Indexed: 04/16/2023]
Abstract
Impulsivity refers to a number of conceptually related phenotypes reflecting self-regulatory capacity that are considered promising endophenotypes for mental and physical health. Measures of impulsivity can be broadly grouped into three domains, namely, impulsive choice, impulsive action, and impulsive personality traits. In a community-based sample of ancestral Europeans (n = 1534), we conducted genome-wide association studies (GWASs) of impulsive choice (delay discounting), impulsive action (behavioral inhibition), and impulsive personality traits (UPPS-P), and evaluated 11 polygenic risk scores (PRSs) of phenotypes previously linked to self-regulation. Although there were no individual genome-wide significant hits, the neuroticism PRS was positively associated with negative urgency (adjusted R² = 1.61%, p = 3.6 × 10⁻⁷) and the educational attainment PRS was inversely associated with delay discounting (adjusted R² = 1.68%, p = 2.2 × 10⁻⁷). There was also evidence implicating PRSs of attention-deficit/hyperactivity disorder, externalizing, risk-taking, smoking cessation, smoking initiation, and body mass index with one or more impulsivity phenotypes (adjusted R² values: 0.35%-1.07%; FDR-adjusted p values = 0.05-0.0006). These significant associations between PRSs and impulsivity phenotypes are consistent with established genetic correlations. The combined PRS explained 0.91%-2.46% of the phenotypic variance for individual impulsivity measures, corresponding to 8.7%-32.5% of their reported single-nucleotide polymorphism (SNP)-based heritability, suggesting a non-negligible portion of the SNP-based heritability can be recovered by PRSs. These results support the predictive validity and utility of PRSs, even derived from related phenotypes, to inform the genetics of impulsivity phenotypes.
Affiliation(s)
- Wei Q Deng
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
- Kyla Belisario
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
- Joshua C Gray
- Department of Medical and Clinical Psychology, Uniformed Services University, Bethesda, Maryland, USA
- Emily E Levitt
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
- Desmond Singh
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada
- Department of Biology, University of Waterloo, Waterloo, Canada
- Guillaume Pare
- Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
- James MacKillop
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
|
85
|
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 02/10/2023] [Indexed: 06/06/2023] Open
Abstract
Polygenic Risk Score (PRS) analysis is a method that predicts an individual's genetic risk for targeted traits. Even when there are no significant markers, it gives evidence of a genetic effect beyond the results of Genome-Wide Association Studies (GWAS). Moreover, it selects single nucleotide polymorphisms (SNPs) that contribute to the disease with low effect size, making it more precise at individual-level risk prediction. PRS analysis addresses the shortfall of GWAS by taking into account SNPs/alleles that have low effect sizes but play an indispensable role in the observed phenotypic/trait variance. PRS analysis has applications that investigate the genetic basis of several traits, including rare diseases. However, the accuracy of PRS analysis depends on the genomic data of the underlying population. For instance, several studies show that obtaining higher prediction power of PRS analysis is challenging for non-Europeans. In this manuscript, we review the conventional PRS methods and their application to sub-Saharan African communities. We conclude that the lack of sufficient GWAS data and tools is the limiting factor in applying PRS analysis to sub-Saharan populations. We recommend developing Africa-specific PRS methods and tools for estimating and analyzing African population data for clinical evaluation of PRSs of interest and predicting rare diseases.
Affiliation(s)
- Yagoub Adam
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Suraju Sadeeq
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
- Judit Kumuthini
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Olabode Ajayi
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Gordon Wells
- South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
- Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Rotimi Solomon
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Olubanke Ogunlana
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Emmanuel Adetiba
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Electrical & Information Engineering (EIE), Covenant University, Ota, Ogun State, 112212, Nigeria
- HRA, Institute for Systems Science, Durban University of Technology, Durban, South Africa
- Emeka Iweala
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Benedikt Brors
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
- Ezekiel Adebiyi
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
- Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
- Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
|
86
|
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 02/10/2023] [Indexed: 11/23/2023] Open
|
87
|
Dai Q, Zhou G, Zhao H, Võsa U, Franke L, Battle A, Teumer A, Lehtimäki T, Raitakari OT, Esko T, Epstein MP, Yang J. OTTERS: a powerful TWAS framework leveraging summary-level reference data. Nat Commun 2023; 14:1271. [PMID: 36882394 PMCID: PMC9992663 DOI: 10.1038/s41467-023-36862-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/07/2022] [Accepted: 02/20/2023] [Indexed: 03/09/2023] Open
Abstract
Most existing TWAS tools require individual-level eQTL reference data and thus are not applicable to summary-level reference eQTL datasets. The development of TWAS methods that can harness summary-level reference data is valuable to enable TWAS in broader settings and enhance power due to increased reference sample size. Thus, we develop a TWAS framework called OTTERS (Omnibus Transcriptome Test using Expression Reference Summary data) that adapts multiple polygenic risk score (PRS) methods to estimate eQTL weights from summary-level eQTL reference data and conducts an omnibus TWAS. We show that OTTERS is a practical and powerful TWAS tool by both simulations and application studies.
Affiliation(s)
- Qile Dai
- Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Geyu Zhou
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Hongyu Zhao
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06520, USA
- Urmo Võsa
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
- Lude Franke
- Department of Genetics, University of Groningen, University Medical Center Groningen, 9700 RB, Groningen, The Netherlands
- Oncode Institute, 3521 AL, Utrecht, The Netherlands
- Alexis Battle
- Department of Computer Science, and Departments of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, 17489, Greifswald, Germany
- Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories and Finnish Centre for Cardiovascular Disease Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33520, Finland
- Olli T Raitakari
- Centre for Population Health Research, University of Turku and Turku University Hospital, 20520, Turku, Finland
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, 20520, Turku, Finland
- Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, 20521, Turku, Finland
- Tõnu Esko
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
- Michael P Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
|
88
|
Khunsriraksakul C, Li Q, Markus H, Patrick MT, Sauteraud R, McGuire D, Wang X, Wang C, Wang L, Chen S, Shenoy G, Li B, Zhong X, Olsen NJ, Carrel L, Tsoi LC, Jiang B, Liu DJ. Multi-ancestry and multi-trait genome-wide association meta-analyses inform clinical risk prediction for systemic lupus erythematosus. Nat Commun 2023; 14:668. [PMID: 36750564 PMCID: PMC9905560 DOI: 10.1038/s41467-023-36306-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/21/2022] [Accepted: 01/25/2023] [Indexed: 02/09/2023] Open
Abstract
Systemic lupus erythematosus is a heritable autoimmune disease that predominantly affects young women. To improve our understanding of genetic etiology, we conduct multi-ancestry and multi-trait meta-analysis of genome-wide association studies, encompassing 12 systemic lupus erythematosus cohorts from 3 different ancestries and 10 genetically correlated autoimmune diseases, and identify 16 novel loci. We also perform transcriptome-wide association studies, computational drug repurposing analysis, and cell type enrichment analysis. We discover putative drug classes, including a histone deacetylase inhibitor that could be repurposed to treat lupus. We also identify multiple cell types enriched with putative target genes, such as non-classical monocytes and B cells, which may be targeted for future therapeutics. Using this newly assembled result, we further construct polygenic risk score models and demonstrate that integrating polygenic risk score with clinical lab biomarkers improves the diagnostic accuracy of systemic lupus erythematosus using the Vanderbilt BioVU and Michigan Genomics Initiative biobanks.
Affiliation(s)
- Chachrit Khunsriraksakul
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Qinmengge Li
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Havell Markus
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Matthew T Patrick
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Renan Sauteraud
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Daniel McGuire
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Xingyan Wang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Chen Wang
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Lida Wang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Siyuan Chen
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Ganesh Shenoy
- Department of Neurosurgery, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Bingshan Li
- Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, 37235, USA
- Xue Zhong
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Nancy J Olsen
- Department of Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Laura Carrel
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Lam C Tsoi
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Bibo Jiang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Dajiang J Liu
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA; Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
|
89
|
Melton HJ, Zhang Z, Wu C. SUMMIT-FA: A new resource for improved transcriptome imputation using functional annotations. medRxiv 2023:2023.02.02.23285208. [PMID: 36798253 PMCID: PMC9934719 DOI: 10.1101/2023.02.02.23285208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
Transcriptome-wide association studies (TWAS) integrate gene expression prediction models and genome-wide association studies (GWAS) to identify gene-trait associations. The power of TWAS is determined by the sample size of GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), that improves the accuracy of gene expression prediction by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models using SUMMIT-FA with a comprehensive functional database MACIE and the eQTL summary-level data from the eQTLGen consortium. By applying the resulting models to GWASs for 24 complex traits and exploring it through a simulation study, we show that SUMMIT-FA improves the accuracy of gene expression prediction models in whole blood, identifies significantly more gene-trait associations, and improves predictive power for identifying "silver standard" genes compared to several benchmark methods.
Affiliation(s)
- Hunter J. Melton
- Department of Statistics, Florida State University, Tallahassee, FL, USA
- Zichen Zhang
- Department of Statistics, Florida State University, Tallahassee, FL, USA
- Chong Wu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
|
90
|
Hsu L, Kooperberg A, Reiner AP, Kooperberg C. An empirical Bayes approach to improving population-specific genetic association estimation by leveraging cross-population data. Genet Epidemiol 2023; 47:45-60. [PMID: 36116031 PMCID: PMC9892209 DOI: 10.1002/gepi.22501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 07/29/2022] [Accepted: 08/16/2022] [Indexed: 02/04/2023]
Abstract
Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimation in each of the populations, and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates with inverse variance as the weights, which biases the result towards the studies with the largest sample size, typically those of the European ancestry population. In this paper, we propose two empirical Bayes (EB) estimators to borrow the strength of information across populations while accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, with the effect estimates close to the population-specific estimates.
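The weighting argument in this abstract can be illustrated with a minimal sketch of a standard fixed-effect inverse-variance meta-analysis. This is not the authors' empirical Bayes estimator; the function name `inverse_variance_meta` and the two effect estimates and standard errors are hypothetical values chosen only to show how the larger (smaller-variance) study dominates the pooled estimate.

```python
def inverse_variance_meta(estimates, std_errors):
    """Fixed-effect pooled estimate using inverse-variance weights.

    Returns the pooled estimate, its standard error, and each study's
    share of the total weight.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = (1.0 / total) ** 0.5
    shares = [w / total for w in weights]
    return pooled, pooled_se, shares


# Hypothetical inputs: a large study (small standard error) and a
# small study (large standard error) with a different true effect.
betas = [0.10, 0.30]
ses = [0.01, 0.05]

pooled, pooled_se, shares = inverse_variance_meta(betas, ses)
print(f"pooled estimate = {pooled:.4f}, SE = {pooled_se:.4f}")
print(f"weight shares   = {[round(s, 3) for s in shares]}")
```

With these assumed inputs the large study carries roughly 96% of the weight, so the pooled estimate sits close to 0.10 even though the smaller study's own estimate is 0.30. That is the bias towards the largest study that the abstract describes.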
Affiliation(s)
- Li Hsu
- Biostatistics Program Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Anna Kooperberg
- Department of Mathematics, MIT, Cambridge, Massachusetts, USA
- Alexander P Reiner
- Biostatistics Program Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Department of Epidemiology, University of Washington, Seattle, Washington, USA
- Charles Kooperberg
- Biostatistics Program Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
|
91
|
Choi SW, García-González J, Ruan Y, Wu HM, Porras C, Johnson J, Hoggart CJ, O'Reilly PF. PRSet: Pathway-based polygenic risk score analyses and software. PLoS Genet 2023; 19:e1010624. [PMID: 36749789 PMCID: PMC9937466 DOI: 10.1371/journal.pgen.1010624] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 04/06/2022] [Revised: 02/17/2023] [Accepted: 01/19/2023] [Indexed: 02/08/2023] Open
Abstract
Polygenic risk scores (PRSs) have been among the leading advances in biomedicine in recent years. As a proxy of genetic liability, PRSs are utilised across multiple fields and applications. While numerous statistical and machine learning methods have been developed to optimise their predictive accuracy, these typically distil genetic liability to a single number based on aggregation of an individual's genome-wide risk alleles. This results in a key loss of information about an individual's genetic profile, which could be critical given the functional sub-structure of the genome and the heterogeneity of complex disease. In this manuscript, we introduce a 'pathway polygenic' paradigm of disease risk, in which multiple genetic liabilities underlie complex diseases, rather than a single genome-wide liability. We describe a method and accompanying software, PRSet, for computing and analysing pathway-based PRSs, in which polygenic scores are calculated across genomic pathways for each individual. We evaluate the potential of pathway PRSs in two distinct ways, creating two major sections. (1) In the first section, we benchmark PRSet as a pathway enrichment tool, evaluating its capacity to capture GWAS signal in pathways. We find that for target sample sizes of >10,000 individuals, pathway PRSs have similar power for evaluating pathway enrichment as leading methods MAGMA and LD score regression, with the distinct advantage of providing individual-level estimates of genetic liability for each pathway, opening up a range of pathway-based PRS applications. (2) In the second section, we evaluate the performance of pathway PRSs for disease stratification. We show that using a supervised disease stratification approach, pathway PRSs (computed by PRSet) outperform two standard genome-wide PRSs (computed by C+T and lassosum) for classifying disease subtypes in 20 of 21 scenarios tested.
As the definition and functional annotation of pathways becomes increasingly refined, we expect pathway PRSs to offer key insights into the heterogeneity of complex disease and treatment response, to generate biologically tractable therapeutic targets from polygenic signal, and, ultimately, to provide a powerful path to precision medicine.
Affiliation(s)
- Shing Wan Choi
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Judit García-González
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Yunfeng Ruan
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Hei Man Wu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Christian Porras
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Jessica Johnson
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Clive J Hoggart
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
- Paul F O'Reilly
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, New York, United States of America
|
92
|
Abstract
Polygenic scores quantify inherited risk by integrating information from many common sites of DNA variation into a single number. Rapid increases in the scale of genetic association studies and new statistical algorithms have enabled development of polygenic scores that meaningfully measure, as early as birth, the risk of coronary artery disease. These newer-generation polygenic scores identify up to 8% of the population with triple the normal risk based on genetic variation alone, and these individuals cannot be identified on the basis of family history or clinical risk factors alone. For those identified with increased genetic risk, evidence supports risk reduction with at least two interventions, adherence to a healthy lifestyle and cholesterol-lowering therapies, that can substantially reduce risk. Alongside considerable enthusiasm for the potential of polygenic risk estimation to enable a new era of preventive clinical medicine is recognition of a need for ongoing research into how best to ensure equitable performance across diverse ancestries, how and in whom to assess the scores in clinical practice, as well as randomized trials to confirm clinical utility.
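The "single number" in the first sentence is conventionally a weighted sum of risk-allele counts. The sketch below shows that generic additive form only. It is not the specific coronary artery disease score discussed in this abstract, and the function name `polygenic_score`, the allele dosages, and the per-variant weights are all hypothetical.

```python
def polygenic_score(dosages, weights):
    """Additive polygenic score: sum of allele dosage times per-variant weight.

    dosages: number of effect alleles carried at each variant (0, 1, or 2).
    weights: per-variant effect sizes, e.g. taken from GWAS summary statistics.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at four variants.
dosages = [0, 1, 2, 1]
weights = [0.12, -0.05, 0.30, 0.08]

score = polygenic_score(dosages, weights)
print(f"polygenic score = {score:.2f}")
```

In practice such raw scores are usually standardized against a reference population before individuals are placed into risk strata. That step is omitted here.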
Affiliation(s)
- Aniruddh P Patel
- Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
| | - Amit V Khera
- Division of Cardiology and Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Verve Therapeutics, Cambridge, Massachusetts, USA
|
93
|
Fritzsche MC, Akyüz K, Cano Abadía M, McLennan S, Marttinen P, Mayrhofer MT, Buyx AM. Ethical layering in AI-driven polygenic risk scores: New complexities, new challenges. Front Genet 2023; 14:1098439. [PMID: 36816027 PMCID: PMC9933509 DOI: 10.3389/fgene.2023.1098439] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/14/2022] [Accepted: 01/04/2023] [Indexed: 01/27/2023] Open
Abstract
Researchers aim to develop polygenic risk scores as a tool to prevent and more effectively treat serious diseases, disorders and conditions such as breast cancer, type 2 diabetes mellitus and coronary heart disease. Recently, machine learning techniques, in particular deep neural networks, have been increasingly developed to create polygenic risk scores using electronic health records as well as genomic and other health data. While the use of artificial intelligence for polygenic risk scores may enable greater accuracy, performance and prediction, it also presents a range of increasingly complex ethical challenges. The ethical and social issues of many polygenic risk score applications in medicine have been widely discussed. However, in the literature and in practice, the ethical implications of their confluence with the use of artificial intelligence have not yet been sufficiently considered. Based on a comprehensive review of the existing literature, we argue that this stands in need of urgent consideration for research and subsequent translation into the clinical setting. Considering the many ethical layers involved, we will first give a brief overview of the development of artificial intelligence-driven polygenic risk scores, associated ethical and social implications, challenges in artificial intelligence ethics, and finally, explore potential complexities of polygenic risk scores driven by artificial intelligence. We point out emerging complexity regarding fairness, challenges in building trust, explaining and understanding artificial intelligence and polygenic risk scores as well as regulatory uncertainties and further challenges. We strongly advocate taking a proactive approach to embedding ethics in research and implementation processes for polygenic risk scores driven by artificial intelligence.
Affiliation(s)
- Marie-Christine Fritzsche
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
- Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
| | - Kaya Akyüz
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
- Department of Science and Technology Studies, University of Vienna, Vienna, Austria
| | - Mónica Cano Abadía
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
| | - Stuart McLennan
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
- Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
| | - Pekka Marttinen
- Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
| | - Michaela Th. Mayrhofer
- Biobanking and Biomolecular Resources Research Infrastructure Consortium - European Research Infrastructure Consortium (BBMRI-ERIC), Graz, Austria
| | - Alena M. Buyx
- Institute of History and Ethics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany
- Department of Science, Technology and Society (STS), School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
|
94
|
Xin J, Du M, Gu D, Jiang K, Wang M, Jin M, Hu Y, Ben S, Chen S, Shao W, Li S, Chu H, Zhu L, Li C, Chen K, Ding K, Zhang Z, Shen H, Wang M. Risk assessment for colorectal cancer via polygenic risk score and lifestyle exposure: a large-scale association study of East Asian and European populations. Genome Med 2023; 15:4. [PMID: 36694225 PMCID: PMC9875451 DOI: 10.1186/s13073-023-01156-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 10/02/2022] [Accepted: 01/13/2023] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND: The genetic architectures of colorectal cancer are distinct across different populations. To date, the majority of polygenic risk scores (PRSs) are derived from European (EUR) populations, which limits their accurate extrapolation to other populations. Here, we aimed to generate a PRS by incorporating East Asian (EAS) and EUR ancestry groups and validate its utility for colorectal cancer risk assessment among different populations. METHODS: A large-scale colorectal cancer genome-wide association study (GWAS), harboring 35,145 cases and 288,934 controls from EAS and EUR populations, was used for the EAS-EUR GWAS meta-analysis and the construction of candidate EAS-EUR PRSs via different approaches. The performance of each PRS was then validated in external GWAS datasets of EAS (727 cases and 1452 controls) and EUR (1289 cases and 1284 controls) ancestries, respectively. The optimal PRS was further tested using the UK Biobank longitudinal cohort of 355,543 individuals and ultimately applied to stratify individual risk attached by healthy lifestyle. RESULTS: In the meta-analysis across EAS and EUR populations, we identified 48 independent variants beyond genome-wide significance (P < 5 × 10⁻⁸) at previously reported loci. Among 26 candidate EAS-EUR PRSs, the PRS-CSx approach-derived PRS (defined as PRS_CSx) that harbored genome-wide variants achieved the optimal discriminatory ability in both validation datasets, as well as better performance in the EAS population compared to the PRS derived from known variants. Using the UK Biobank cohort, we further validated a significant dose-response effect of PRS_CSx on incident colorectal cancer, in which the risk was 2.11- and 3.88-fold higher in individuals with intermediate and high PRS_CSx than in the low score subgroup (P_trend = 8.15 × 10⁻⁵³). Notably, the detrimental effect of being at a high genetic risk could be largely attenuated by adherence to a favorable lifestyle, with a 0.53% reduction in 5-year absolute risk. CONCLUSIONS: In summary, we systemically constructed an EAS-EUR PRS to effectively stratify colorectal cancer risk, which highlighted its clinical implication among diverse ancestries. Importantly, these findings also supported that a healthy lifestyle could reduce the genetic impact on incident colorectal cancer.
Affiliation(s)
- Junyi Xin
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Mulong Du
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Dongying Gu
- Department of Oncology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Kewei Jiang
- Department of Gastroenterological Surgery, Laboratory of Surgical Oncology, Beijing Key Laboratory of Colorectal Cancer Diagnosis and Treatment Research, Peking University People’s Hospital, No. 11 Xizhimen South Street, Xicheng District, Beijing, China
| | - Mengyun Wang
- Cancer Institute, Fudan University Shanghai Cancer Center, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Mingjuan Jin
- Department of Epidemiology and Biostatistics at School of Public Health, Zhejiang University School of Medicine, Hangzhou, China
- Cancer Institute, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Yeting Hu
- Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
| | - Shuai Ben
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Silu Chen
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Wei Shao
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Shuwei Li
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Haiyan Chu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Linjun Zhu
- Department of Oncology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Chen Li
- Department of Gastroenterological Surgery, Laboratory of Surgical Oncology, Beijing Key Laboratory of Colorectal Cancer Diagnosis and Treatment Research, Peking University People’s Hospital, No. 11 Xizhimen South Street, Xicheng District, Beijing, China
| | - Kun Chen
- Department of Epidemiology and Biostatistics at School of Public Health, Zhejiang University School of Medicine, Hangzhou, China
- Cancer Institute, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Kefeng Ding
- Department of Colorectal Surgery and Oncology, Key Laboratory of Cancer Prevention and Intervention, Ministry of Education, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Cancer Center, Zhejiang University, Hangzhou, Zhejiang, China
| | - Zhengdong Zhang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Hongbing Shen
- Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Meilin Wang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, Nanjing Medical University, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Jiangning District, Nanjing 211166, China
- Department of Genetic Toxicology, The Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, China
- The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Gusu School, Nanjing Medical University, Suzhou, China
|
95
|
Wang C, Zhang J, Veldsman WP, Zhou X, Zhang L. A comprehensive investigation of statistical and machine learning approaches for predicting complex human diseases on genomic variants. Brief Bioinform 2023; 24:6965909. [PMID: 36585786 DOI: 10.1093/bib/bbac552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 11/04/2022] [Accepted: 11/14/2022] [Indexed: 01/01/2023] Open
Abstract
Quantifying an individual's risk for common diseases is an important goal of precision health. The polygenic risk score (PRS), which aggregates multiple risk alleles of candidate diseases, has emerged as a standard approach for identifying high-risk individuals. Although several studies have been performed to benchmark the PRS calculation tools and assess their potential to guide future clinical applications, some issues remain to be further investigated, such as lacking (i) various simulated data with different genetic effects; (ii) evaluation of machine learning models and (iii) evaluation on multiple ancestries studies. In this study, we systematically validated and compared 13 statistical methods, 5 machine learning models and 2 ensemble models using simulated data with additive and genetic interaction models, 22 common diseases with internal training sets, 4 common diseases with external summary statistics and 3 common diseases for trans-ancestry studies in UK Biobank. The statistical methods were better in simulated data from additive models and machine learning models have edges for data that include genetic interactions. Ensemble models are generally the best choice by integrating various statistical methods. LDpred2 outperformed the other standalone tools, whereas PRS-CS, lassosum and DBSLMM showed comparable performance. We also identified that disease heritability strongly affected the predictive performance of all methods. Both the number and effect sizes of risk SNPs are important; and sample size strongly influences the performance of all methods. For the trans-ancestry studies, we found that the performance of most methods became worse when training and testing sets were from different populations.
Affiliation(s)
- Chonghao Wang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
| | - Jing Zhang
- Eye Institute and Department of Ophthalmology, NHC Key Laboratory of Myopia (Fudan University), Eye & ENT Hospital, Fudan University, Shanghai, China
| | | | - Xin Zhou
- Department of Biomedical Engineering, Vanderbilt University, Vanderbilt Place, Nashville, TN 37235, USA
| | - Lu Zhang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
- Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China
|
96
|
Wang Y, Namba S, Lopera E, Kerminen S, Tsuo K, Läll K, Kanai M, Zhou W, Wu KH, Favé MJ, Bhatta L, Awadalla P, Brumpton B, Deelen P, Hveem K, Lo Faro V, Mägi R, Murakami Y, Sanna S, Smoller JW, Uzunovic J, Wolford BN, Willer C, Gamazon ER, Cox NJ, Surakka I, Okada Y, Martin AR, Hirbo J. Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts. Cell Genomics 2023; 3:100241. [PMID: 36777179 PMCID: PMC9903818 DOI: 10.1016/j.xgen.2022.100241] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 12/01/2021] [Revised: 08/28/2022] [Accepted: 12/03/2022] [Indexed: 01/06/2023]
Abstract
Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. We here utilized data from Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.
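As a rough illustration of the pruning and thresholding (P + T) idea named in this abstract, the sketch below greedily keeps the most significant variants and drops any variant in high linkage disequilibrium with one already kept. The function name, the default cut-offs and the use of a full pairwise r² matrix are assumptions made for the example; production tools clump within genomic windows and this is not the GBMI pipeline.

```python
from typing import List, Sequence


def prune_and_threshold(
    p_values: Sequence[float],
    r2: Sequence[Sequence[float]],
    p_max: float = 5e-8,   # hypothetical significance threshold
    r2_max: float = 0.1,   # hypothetical LD cut-off
) -> List[int]:
    """Return indices of variants kept by a simplified P + T procedure."""
    # Visit variants from most to least significant.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    kept: List[int] = []
    for i in order:
        if p_values[i] > p_max:
            break  # thresholding: all remaining variants are less significant
        # Pruning: keep the variant only if it is not in LD with a kept one.
        if all(r2[i][k] < r2_max for k in kept):
            kept.append(i)
    return kept


# Toy input: variants 0 and 1 are in strong LD, variant 2 is not significant.
toy_p = [1e-10, 1e-9, 0.5, 1e-8]
toy_r2 = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
toy_kept = prune_and_threshold(toy_p, toy_r2)
```

In the toy input, variant 1 is dropped because of its LD with the more significant variant 0, and variant 2 fails the p-value threshold, so variants 0 and 3 are retained.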
Affiliation(s)
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Shinichi Namba
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
| | - Esteban Lopera
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
| | - Sini Kerminen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
| | - Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kristi Läll
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Wei Zhou
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kuan-Han Wu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48103, USA
| | | | - Laxmi Bhatta
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
| | - Philip Awadalla
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Ben Brumpton
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- HUNT Research Centre, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7600 Levanger, Norway
- Clinic of Medicine, St. Olav’s Hospital, Trondheim University Hospital, 7030 Trondheim, Norway
| | - Patrick Deelen
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
| | - Kristian Hveem
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- HUNT Research Centre, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7600 Levanger, Norway
| | - Valeria Lo Faro
- Department of Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Department of Clinical Genetics, Amsterdam University Medical Center (AMC), Amsterdam, the Netherlands
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Yoshinori Murakami
- Division of Molecular Pathology, Institute of Medical Science, the University of Tokyo, Tokyo, Japan
| | - Serena Sanna
- Department of Genetics, UMCG, University of Groningen, Groningen, the Netherlands
- Institute for Genetics and Biomedical Research (IRGB), National Research Council (CNR), 09100 Cagliari, Italy
| | - Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | | | - Brooke N. Wolford
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48103, USA
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
| | - Cristen Willer
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology, 7030 Trondheim, Norway
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biostatistics and Center for Statistical Genetics, and Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Eric R. Gamazon
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Nancy J. Cox
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Ida Surakka
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC) and Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita 565-0871, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo 113-0033, Japan
| | - Alicia R. Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jibril Hirbo
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
|
97
|
Klinkhammer H, Staerk C, Maj C, Krawitz PM, Mayr A. A statistical boosting framework for polygenic risk scores based on large-scale genotype data. Front Genet 2023; 13:1076440. [PMID: 36704342 PMCID: PMC9871367 DOI: 10.3389/fgene.2022.1076440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 12/20/2022] [Indexed: 01/12/2023] Open
Abstract
Polygenic risk scores (PRS) evaluate the individual genetic liability to a certain trait and are expected to play an increasingly important role in clinical risk stratification. Most often, PRS are estimated based on summary statistics of univariate effects derived from genome-wide association studies. To improve the predictive performance of PRS, it is desirable to fit multivariable models directly on the genetic data. Due to the large and high-dimensional data, a direct application of existing methods is often not feasible and new efficient algorithms are required to overcome the computational burden regarding efficiency and memory demands. We develop an adapted component-wise L2-boosting algorithm to fit genotype data from large cohort studies to continuous outcomes using linear base-learners for the genetic variants. Similar to the snpnet approach implementing lasso regression, the proposed snpboost approach iteratively works on smaller batches of variants. By restricting the set of possible base-learners in each boosting step to variants most correlated with the residuals from previous iterations, the computational efficiency can be substantially increased without losing prediction accuracy. Furthermore, for large-scale data based on various traits from the UK Biobank we show that our method yields competitive prediction accuracy and computational efficiency compared to the snpnet approach and further commonly used methods. Due to the modular structure of boosting, our framework can be further extended to construct PRS for different outcome data and effect types; we illustrate this for the prediction of binary traits.
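To make the boosting idea in this abstract concrete, here is a minimal sketch of plain component-wise L2-boosting with one linear base-learner per variant. It is not the snpboost implementation: the batching step that restricts candidates to the variants most correlated with the residuals is omitted, and the function name, step count and learning rate are assumptions chosen for illustration.

```python
from typing import List, Tuple


def componentwise_l2_boost(
    columns: List[List[float]],   # one list of allele dosages per variant
    y: List[float],               # continuous outcome
    n_steps: int = 200,           # hypothetical number of boosting steps
    learning_rate: float = 0.1,   # hypothetical shrinkage factor
) -> Tuple[float, List[float]]:
    """Return (intercept, per-variant coefficients)."""
    n = len(y)
    # Centre each variant so every base-learner is a slope-only linear fit.
    means = [sum(col) / n for col in columns]
    centred = [[v - m for v in col] for col, m in zip(columns, means)]
    sum_sq = [sum(v * v for v in col) for col in centred]

    y_mean = sum(y) / n
    residuals = [v - y_mean for v in y]
    coefs = [0.0] * len(columns)

    for _ in range(n_steps):
        best_j, best_gain, best_slope = -1, 0.0, 0.0
        for j, col in enumerate(centred):
            if sum_sq[j] == 0.0:
                continue  # variant without variation carries no signal
            dot = sum(x * r for x, r in zip(col, residuals))
            gain = dot * dot / sum_sq[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_slope = j, gain, dot / sum_sq[j]
        if best_j < 0:
            break  # no base-learner improves the fit
        # Update only the best variant, shrunk by the learning rate.
        step = learning_rate * best_slope
        coefs[best_j] += step
        residuals = [r - step * x for r, x in zip(residuals, centred[best_j])]

    intercept = y_mean - sum(c * m for c, m in zip(coefs, means))
    return intercept, coefs


# Toy data: the outcome depends on the first variant only.
toy_columns = [[0, 1, 2, 1, 0, 2], [1, 1, 0, 2, 0, 1]]
toy_y = [1.0 + 0.5 * g for g in toy_columns[0]]
toy_intercept, toy_coefs = componentwise_l2_boost(toy_columns, toy_y)
```

Because only one coefficient moves per step, stopping early leaves most coefficients at exactly zero, which is what makes the approach usable as a variable-selecting PRS fit.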
Affiliation(s)
- Hannah Klinkhammer
- Institute for Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
- Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
| | - Christian Staerk
- Institute for Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
| | - Carlo Maj
- Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
- Center for Human Genetics, University of Marburg, Marburg, Germany
| | - Peter Michael Krawitz
- Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
| | - Andreas Mayr
- Institute for Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
|
98
|
O'Sullivan JW, Ashley EA, Elliott PM. Polygenic risk scores for the prediction of cardiometabolic disease. Eur Heart J 2023; 44:89-99. [PMID: 36478054 DOI: 10.1093/eurheartj/ehac648] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 12/08/2021] [Revised: 08/28/2022] [Accepted: 10/27/2022] [Indexed: 12/12/2022] Open
Abstract
Cardiometabolic diseases contribute more to global morbidity and mortality than any other group of disorders. Polygenic risk scores (PRSs), the weighted summation of individually small-effect genetic variants, represent an advance in our ability to predict the development and complications of cardiometabolic diseases. This article reviews the evidence supporting the use of PRS in seven common cardiometabolic diseases: coronary artery disease (CAD), stroke, hypertension, heart failure and cardiomyopathies, obesity, atrial fibrillation (AF), and type 2 diabetes mellitus (T2DM). Data suggest that PRS for CAD, AF, and T2DM consistently improves prediction when incorporated into existing clinical risk tools. In other areas such as ischaemic stroke and hypertension, clinical application appears premature but emerging evidence suggests that the study of larger and more diverse populations coupled with more granular phenotyping will propel the translation of PRS into practical clinical prediction tools.
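The "weighted summation" in this abstract can be written in a few lines. The sketch below assumes each variant is coded as an effect-allele dosage (0, 1 or 2) and that per-variant weights, for example log odds ratios from a GWAS, are already available; the function name and the numbers are hypothetical.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * g for w, g in zip(weights, dosages))


# Hypothetical effect sizes for three variants and one person's genotype.
example_weights = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
example_score = polygenic_score(example_dosages, example_weights)
```

In practice the raw score is usually standardised against a reference population before individuals are assigned to risk strata.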
Affiliation(s)
- Jack W O'Sullivan
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Division of Cardiology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Euan A Ashley
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, USA
- Division of Cardiology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Perry M Elliott
- UCL Institute of Cardiovascular Science, Gower Street, London WC1E 6BT, UK
- St. Bartholomew's Hospital, W Smithfield, London EC1A 7BE, UK
|
99
|
Zhou G, Chen T, Zhao H. SDPRX: A statistical method for cross-population prediction of complex traits. Am J Hum Genet 2023; 110:13-22. [PMID: 36460009 PMCID: PMC9892700 DOI: 10.1016/j.ajhg.2022.11.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/13/2022] [Accepted: 11/08/2022] [Indexed: 12/03/2022] Open
Abstract
Polygenic risk score (PRS) has demonstrated its great utility in biomedical research through identifying high-risk individuals for different diseases from their genotypes. However, the broader application of PRS to the general population is hindered by the limited transferability of PRS developed in Europeans to non-European populations. To improve PRS prediction accuracy in non-European populations, we develop a statistical method called SDPRX that can effectively integrate genome wide association study summary statistics from different populations. SDPRX automatically adjusts for linkage disequilibrium differences between populations and characterizes the joint distribution of the effect sizes of a variant in two populations to be both null, population specific, or shared with correlation. Through simulations and applications to real traits, we show that SDPRX improves the prediction performance over existing methods in non-European populations.
Affiliation(s)
- Geyu Zhou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Tianqi Chen
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
| | - Hongyu Zhao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
|
100
|
Namba S, Saito Y, Kogure Y, Masuda T, Bondy ML, Gharahkhani P, Gockel I, Heider D, Hillmer A, Jankowski J, MacGregor S, Maj C, Melin B, Ostrom QT, Palles C, Schumacher J, Tomlinson I, Whiteman DC, Okada Y, Kataoka K. Common Germline Risk Variants Impact Somatic Alterations and Clinical Features across Cancers. Cancer Res 2023; 83:20-27. [PMID: 36286845 PMCID: PMC9811159 DOI: 10.1158/0008-5472.can-22-1492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 08/20/2022] [Accepted: 10/21/2022] [Indexed: 02/03/2023]
Abstract
Aggregation of genome-wide common risk variants, such as polygenic risk score (PRS), can measure genetic susceptibility to cancer. A better understanding of how common germline variants associate with somatic alterations and clinical features could facilitate personalized cancer prevention and early detection. We constructed PRSs from 14 genome-wide association studies (median n = 64,905) for 12 cancer types by multiple methods and calibrated them using the UK Biobank resources (n = 335,048). Meta-analyses across cancer types in The Cancer Genome Atlas (n = 7,965) revealed that higher PRS values were associated with earlier cancer onset and lower burden of somatic alterations, including total mutations, chromosome/arm somatic copy-number alterations (SCNA), and focal SCNAs. This contrasts with rare germline pathogenic variants (e.g., BRCA1/2 variants), showing heterogeneous associations with somatic alterations. Our results suggest that common germline cancer risk variants allow early tumor development before the accumulation of many somatic alterations characteristic of later stages of carcinogenesis. Significance: Meta-analyses across cancers show that common germline risk variants affect not only cancer predisposition but the age of cancer onset and burden of somatic alterations, including total mutations and copy-number alterations.
Affiliation(s)
- Shinichi Namba
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Yuki Saito
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Department of Gastroenterology, Keio University School of Medicine, Tokyo, Japan
| | - Yasunori Kogure
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
| | - Tatsuo Masuda
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Obstetrics and Gynecology, Osaka University Graduate School of Medicine, Osaka, Japan
- StemRIM Institute of Regeneration-Inducing Medicine, Osaka University, Osaka, Japan
| | - Melissa L. Bondy
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California
| | - Puya Gharahkhani
- Statistical Genetics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Ines Gockel
- Department of Visceral, Transplant, Thoracic and Vascular Surgery, University Hospital of Leipzig, Leipzig, Germany
| | - Dominik Heider
- Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
| | - Axel Hillmer
- Institute of Pathology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
| | - Janusz Jankowski
- Office of Vice President Research and Innovation, Laucala Bay Campus, University of South Pacific, Suva, Fiji
- Institute for Clinical Trials, University College London, Holborn, London
| | - Stuart MacGregor
- Statistical Genetics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Carlo Maj
- Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
| | - Beatrice Melin
- Department of Radiation Sciences, Oncology, Umeå University, Umeå, Sweden
| | - Quinn T. Ostrom
- Department of Neurosurgery, Duke University School of Medicine, Durham, North Carolina
- The Preston Robert Tisch Brain Tumor Center, Duke University School of Medicine, Durham, North Carolina
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina
| | - Claire Palles
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, United Kingdom
| | | | - Ian Tomlinson
- Edinburgh Cancer Research Centre, IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
| | - David C. Whiteman
- Cancer Control, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
- Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita, Japan
| | - Keisuke Kataoka
- Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan
- Division of Hematology, Department of Medicine, Keio University School of Medicine, Tokyo, Japan
|