1
Burt CH. Polygenic Indices (a.k.a. Polygenic Scores) in Social Science: A Guide for Interpretation and Evaluation. Sociological Methodology 2024; 54:300-350. [PMID: 39091537 PMCID: PMC11293310 DOI: 10.1177/00811750241236482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Polygenic indices (PGI), the new recommended label for polygenic scores (PGS) in social science, are genetic summary scales often used to represent an individual's liability for a disease, trait, or behavior based on the additive effects of measured genetic variants. Enthusiasm for linking genetic data with social outcomes and the inclusion of premade PGIs in social science datasets have facilitated increased uptake of PGIs in social science research, a trend that will likely continue. Yet, most social scientists lack the expertise to interpret and evaluate PGIs in social science research. Here, we provide a primer on PGIs for social scientists focusing on key concepts, unique statistical genetic considerations, and best practices in calculation, estimation, reporting, and interpretation. We summarize our recommended best practices as a checklist to aid social scientists in evaluating and interpreting studies with PGIs. We conclude by discussing the similarities between PGIs and standard social science scales and unique interpretative considerations.
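As an illustration of the "additive effects of measured genetic variants" described in this abstract, the following minimal Python sketch computes a polygenic index as a weighted sum of allele dosages and then standardizes it within the sample. The function names, weights, and genotypes are hypothetical values chosen for the example; they are not taken from the article, and real pipelines involve further steps (variant selection, linkage disequilibrium adjustment, ancestry controls) that are not shown.

```python
from statistics import mean, pstdev


def polygenic_index(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per variant is required")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Rescale raw scores to mean 0 and standard deviation 1 within the sample."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


# Hypothetical per-variant weights (for example, GWAS effect estimates).
weights = [0.5, -0.25, 1.0]

# Hypothetical genotypes: one row per individual, one column per variant.
genotypes = [
    [0, 1, 2],
    [2, 0, 0],
    [1, 1, 1],
]

raw_scores = [polygenic_index(g, weights) for g in genotypes]
z_scores = standardize(raw_scores)
print(raw_scores)
print(z_scores)
```

Because the standardized score is relative to the sample it was computed in, values from different samples are not directly comparable.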
2
Yang A, Yang YT, Zhao XM. An augmented Mendelian randomization approach provides causality of brain imaging features on complex traits in a single biobank-scale dataset. PLoS Genet 2023; 19:e1011112. [PMID: 38150468 PMCID: PMC10775988 DOI: 10.1371/journal.pgen.1011112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 01/09/2024] [Accepted: 12/12/2023] [Indexed: 12/29/2023]
Abstract
Mendelian randomization (MR) is an effective approach for revealing causal risk factors that underpin complex traits and diseases. While MR has been more widely applied in two-sample settings, it is increasingly promising within a single large cohort, given the rise of biobank-scale datasets that simultaneously contain genotype data, brain imaging data, and matched complex traits from the same individuals. However, most existing multivariable MR methods have been developed for the two-sample setting or for a small number of exposures. In this study, we introduce a one-sample multivariable MR method based on partial least squares and Lasso regression (MR-PL). MR-PL can account for the correlation among exposures (e.g., brain imaging features) when the number of exposures is very large, while also correcting for winner's curse bias. We performed extensive and systematic simulations, and demonstrated the robustness and reliability of our method. Comprehensive simulations confirmed that MR-PL can generate more precise causal estimates with lower false positive rates than alternative approaches. Finally, we applied MR-PL to datasets from UK Biobank to reveal the causal effects of 36 white matter tracts on 180 complex traits, and showed putative white matter tracts that are implicated in smoking, blood vascular function-related traits, and eating behaviors.
Affiliation(s)
- Anyi Yang
  - Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, People’s Republic of China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, People’s Republic of China
- Yucheng T. Yang
  - Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, People’s Republic of China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, People’s Republic of China
- Xing-Ming Zhao
  - Department of Neurology, Zhongshan Hospital and Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, People’s Republic of China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, People’s Republic of China
  - State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, People’s Republic of China
  - International Human Phenome Institutes (Shanghai), Shanghai, People’s Republic of China
3
Wang L, Grimshaw AA, Mezzacappa C, Larki NR, Yang YX, Justice AC. Do Polygenic Risk Scores Add to Clinical Data in Predicting Pancreatic Cancer? A Scoping Review. Cancer Epidemiol Biomarkers Prev 2023; 32:1490-1497. [PMID: 37610426 PMCID: PMC10873036 DOI: 10.1158/1055-9965.epi-23-0468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 07/21/2023] [Accepted: 08/21/2023] [Indexed: 08/24/2023]
Abstract
BACKGROUND Polygenic risk scores (PRS) summarize an individual's germline genetic risk, but it is unclear whether PRS offer independent information for pancreatic cancer risk prediction beyond routine clinical data. METHODS We searched 8 databases from database inception to March 10, 2023 to identify studies evaluating the independent performance of pancreatic cancer-specific PRS for pancreatic cancer beyond clinical risk factors. RESULTS Twenty-one studies examined associations between a pancreatic cancer-specific PRS and pancreatic cancer. Seven studies evaluated risk factors beyond age and sex. Three studies evaluated the change in discrimination associated with the addition of PRS to routine risk factors and reported improvements (AUCs: 0.715 to 0.745; AUC 0.791 to 0.830; AUC from 0.694 to 0.711). Limitations to clinical applicability included using source populations younger/healthier than those at risk for pancreatic cancer (n = 10), exclusively of European ancestry (n = 13), or controls without relevant exposures (n = 1). CONCLUSIONS While most studies of pancreatic cancer-specific PRS did not evaluate the independent discrimination of PRS for pancreatic cancer beyond routine risk factors, three that did showed improvements in discrimination. IMPACT For pancreatic cancer PRS to be clinically useful, they must demonstrate substantial improvements in discrimination beyond established risk factors, apply to diverse ancestral populations representative of those at risk for pancreatic cancer, and use appropriate controls.
Affiliation(s)
- Louise Wang
  - VA Connecticut Healthcare System, West Haven, CT, USA
  - Section of Digestive Diseases, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
  - Division of Gastroenterology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Catherine Mezzacappa
  - Section of Digestive Diseases, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
- Navid Rahimi Larki
  - VA Connecticut Healthcare System, West Haven, CT, USA
  - Section of Digestive Diseases, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
- Yu-Xiao Yang
  - Division of Gastroenterology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
  - Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, USA
- Amy C. Justice
  - VA Connecticut Healthcare System, West Haven, CT, USA
  - Section of General Medicine, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
  - School of Public Health, Yale University, New Haven, CT, USA
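The discrimination figures quoted in this entry's abstract are areas under the ROC curve (AUC). As a hedged illustration, the sketch below computes an AUC from its rank-based definition (the probability that a randomly chosen case scores higher than a randomly chosen control) and then works out the absolute AUC gains implied by the three before/after pairs quoted in the abstract. The case and control scores are made-up values for demonstration only; the `reported` pairs are copied from the abstract, and the helper names are assumptions of this example.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outranks a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def auc_gain(before, after):
    """Absolute change in AUC, rounded to three decimals."""
    return round(after - before, 3)


# Hypothetical risk scores, for illustration only.
example_auc = auc([0.9, 0.6, 0.4], [0.5, 0.3, 0.2, 0.6])

# (clinical factors only, clinical factors + PRS), as quoted in the abstract.
reported = [(0.715, 0.745), (0.791, 0.830), (0.694, 0.711)]
gains = [auc_gain(before, after) for before, after in reported]

print(example_auc)
print(gains)
```

The implied gains are roughly 0.017 to 0.039 AUC units, which is the scale against which the abstract's call for "substantial improvements in discrimination" can be read.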
4
Kulm S, Kaidi AC, Kolin D, Langhans MT, Bostrom MP, Elemento O, Shen TS. Genetic Risk Factors for End-Stage Hip Osteoarthritis Treated With Total Hip Arthroplasty: A Genome-wide Association Study. J Arthroplasty 2023; 38:2149-2153.e1. [PMID: 37179025 DOI: 10.1016/j.arth.2023.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 04/28/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]
Abstract
BACKGROUND Although a genetic component to hip osteoarthritis (OA) has been described, focused evaluation of the genetic components of end-stage disease is limited. We present a genome-wide association study of patients undergoing total hip arthroplasty (THA) to characterize the genetic risk factors associated with end-stage hip osteoarthritis (ESHO), defined as utilization of the procedure. METHODS Patients who underwent primary THA for hip OA were identified in a national patient data repository using administrative codes. A total of 15,355 patients with ESHO and 374,193 control patients were identified. Whole genome regression of genotypic data for patients who underwent primary THA for hip OA, corrected for age, sex, and body mass index (BMI), was performed. Multivariate logistic regression models were used to evaluate the composite genetic risk from the identified genetic variants. RESULTS Thirteen significant genes were identified. Composite genetic factors resulted in an odds ratio (OR) of 1.04 for ESHO (P < .001). The effect of genetics was lower than that of age (OR: 2.38; P < .001) and BMI (OR: 1.81; P < .001). CONCLUSION Multiple genetic variants, including 5 novel loci, were associated with end-stage hip OA treated with primary THA. Age and BMI were associated with greater odds of developing end-stage disease when compared to genetic factors.
Affiliation(s)
- Scott Kulm
  - Weill Cornell Medicine, New York, New York; Englander Institute for Precision Medicine, New York, New York
- Austin C Kaidi
  - Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
- David Kolin
  - Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
- Mark T Langhans
  - Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
- Mathias P Bostrom
  - Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
- Olivier Elemento
  - Weill Cornell Medicine, New York, New York; Englander Institute for Precision Medicine, New York, New York
- Tony S Shen
  - Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
5
Meng XH, Liu Z, Chen XD, Deng AM, Mao ZH. Functional Enrichment Analysis Identifying Regulatory Information Associated with Human Fracture. Calcif Tissue Int 2023; 113:286-294. [PMID: 37477662 DOI: 10.1007/s00223-023-01108-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 06/05/2023] [Indexed: 07/22/2023]
Abstract
Dozens of loci associated with fracture have been identified by genome-wide association studies (GWASs). However, most of these variants are located in noncoding regions, including introns, long terminal repeats, and intergenic regions. Although combining regulatory information helps to identify the causal SNPs and interpret the involvement of these variants in the etiology of human fracture, which regulatory information is truly associated with fracture was unknown. A novel functional enrichment method, GARFIELD (GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction), was applied to identify fracture-associated regulatory information, including transcription factor binding sites, expression quantitative trait loci (eQTLs), chromatin states, enhancers, promoters, dyadic regions, super enhancers, and epigenome marks. Fracture SNPs were significantly enriched in exons (Bonferroni correction, p value < 7.14 × 10⁻³) at two GWAS p value thresholds through GARFIELD. A high level of fold-enrichment was observed in the super enhancer of monocytes and the enhancer of chondrocytes (Bonferroni correction, p value < 4.45 × 10⁻³). eQTLs of 44 tissues/cells and 10 transcription factors (TFs) were identified as associated with human fracture. These results provide new insight into the etiology of human fracture, which might improve the identification of causal SNPs through fine-mapping studies combined with functional annotation, as well as polygenic risk scores.
Affiliation(s)
- Xiang-He Meng
  - Hunan Provincial Key Laboratory of Regional Hereditary Birth Defects Prevention and Control, Changsha Hospital for Maternal & Child Health Care Affiliated to Hunan Normal University, Changsha, 410007, People's Republic of China
- Zhen Liu
  - Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, Changsha, 410081, Hunan, People's Republic of China
- Xiang-Ding Chen
  - Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, Changsha, 410081, Hunan, People's Republic of China
- Ai-Min Deng
  - Hunan Provincial Key Laboratory of Regional Hereditary Birth Defects Prevention and Control, Changsha Hospital for Maternal & Child Health Care Affiliated to Hunan Normal University, Changsha, 410007, People's Republic of China
- Zeng-Hui Mao
  - Hunan Provincial Key Laboratory of Regional Hereditary Birth Defects Prevention and Control, Changsha Hospital for Maternal & Child Health Care Affiliated to Hunan Normal University, Changsha, 410007, People's Republic of China
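The Bonferroni-corrected thresholds mentioned in this entry's abstract follow a simple rule: divide the family-wise significance level by the number of tests. The Python sketch below is a generic illustration of that rule, not the GARFIELD implementation. The significance level of 0.05 and the count of seven tests are assumptions of this example; the abstract does not state them, although 0.05 / 7 does round to the quoted 7.14 × 10⁻³. The p values are made up.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p value that falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Assumed for illustration: alpha = 0.05 and seven annotation categories.
threshold = bonferroni_threshold(0.05, 7)
print(f"{threshold:.2e}")

# Hypothetical enrichment p values for seven annotation categories.
p_values = [0.001, 0.02, 0.0049, 0.3, 0.5, 0.04, 0.9]
print(bonferroni_significant(p_values))
```

With these made-up inputs, only the first and third p values fall below the corrected threshold, even though four of the seven are below the uncorrected 0.05 level.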
6
Brookes KJ, Guetta-Baranes T, Thomas A, Morgan K. An alternative method of SNP inclusion to develop a generalized polygenic risk score analysis across Alzheimer's disease cohorts. Frontiers in Dementia 2023; 2:1120206. [PMID: 39081983 PMCID: PMC11285631 DOI: 10.3389/frdem.2023.1120206] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/09/2022] [Accepted: 06/12/2023] [Indexed: 08/02/2024]
Abstract
Introduction Polygenic risk scores (PRSs) have great clinical potential for detecting late-onset diseases such as Alzheimer's disease (AD), allowing the identification of those most at risk years before symptoms present. Although many studies use various complicated machine learning algorithms to determine the best discriminatory values for PRSs, few studies look at the commonality of the Single Nucleotide Polymorphisms (SNPs) utilized in these models. Methods This investigation focussed on identifying SNPs that tag blocks of linkage disequilibrium across the genome, allowing for a generalized PRS model across cohorts and genotyping panels. PRS modeling was conducted on five AD development cohorts, and the best discriminatory models were explored for commonality of linkage disequilibrium clumps. Clumps that contributed to the discrimination of cases from controls and that occurred in multiple cohorts were used to create a generalized PRS model, which was then tested in the five development cohorts and three further AD cohorts. Results The model developed provided an average discrimination accuracy of over 70% in multiple AD cohorts and included variants of several well-known AD risk genes. Discussion A key element of devising a polygenic risk score that can be used in the clinical setting is consistency in the SNPs that are used to calculate the score; this study demonstrates that using a model based on commonality of association findings rather than meta-analyses may prove useful.
Affiliation(s)
- Keeley J. Brookes
  - Interdisciplinary Biomedical Research Centre, Biosciences, Clifton Campus, Nottingham Trent University, Nottingham, United Kingdom
- Tamar Guetta-Baranes
  - Human Genetics, Life Sciences, University Park, University of Nottingham, Nottingham, United Kingdom
- Alan Thomas
  - Brains for Dementia Research Coordinating Centre, Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom
- Kevin Morgan
  - Human Genetics, Life Sciences, University Park, University of Nottingham, Nottingham, United Kingdom
7
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 02/10/2023] [Indexed: 06/06/2023]
Abstract
Polygenic Risk Score (PRS) analysis is a method that predicts the genetic risk of an individual towards targeted traits. Even when there are no significant markers, it gives evidence of a genetic effect beyond the results of Genome-Wide Association Studies (GWAS). Moreover, it selects single nucleotide polymorphisms (SNPs) that contribute to the disease with low effect size, making it more precise at individual-level risk prediction. PRS analysis addresses the shortfall of GWAS by taking into account SNPs/alleles that have low effect sizes but play an indispensable role in the observed phenotypic/trait variance. PRS analysis has applications that investigate the genetic basis of several traits, including rare diseases. However, the accuracy of PRS analysis depends on the genomic data of the underlying population. For instance, several studies show that obtaining higher prediction power from PRS analysis is challenging for non-Europeans. In this manuscript, we review the conventional PRS methods and their application to sub-Saharan African communities. We conclude that the lack of sufficient GWAS data and tools is the limiting factor in applying PRS analysis to sub-Saharan populations. We recommend developing Africa-specific PRS methods and tools for estimating and analyzing African population data for clinical evaluation of PRSs of interest and predicting rare diseases.
Affiliation(s)
- Yagoub Adam
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
- Suraju Sadeeq
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
- Judit Kumuthini
  - South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
  - Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Olabode Ajayi
  - South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
  - Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Gordon Wells
  - South African National Bioinformatics Institute, Life Sciences Building, University of Western Cape, Cape Town, South Africa
  - Centre for Proteomic and Genomic Research, Cape Town, Western Cape, South Africa
- Rotimi Solomon
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Olubanke Ogunlana
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Emmanuel Adetiba
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept of Electrical & Information Engineering (EIE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - HRA, Institute for Systems Science, Durban University of Technology, Durban, South Africa
- Emeka Iweala
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept of Biochemistry, Covenant University, Ota, Ogun State, 112212, Nigeria
- Benedikt Brors
  - Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
- Ezekiel Adebiyi
  - Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Covenant Applied Informatics and Communication Africa Centre of Excellence (CApIC-ACE), Covenant University, Ota, Ogun State, 112212, Nigeria
  - Dept Computer & Information Sciences, Covenant University, Ota, Ogun State, 112212, Nigeria
  - Applied Bioinformatics Division, German Cancer Research Center (DKFZ), Heidelberg, 69120, Germany
8
Adam Y, Sadeeq S, Kumuthini J, Ajayi O, Wells G, Solomon R, Ogunlana O, Adetiba E, Iweala E, Brors B, Adebiyi E. Polygenic Risk Score in African populations: progress and challenges. F1000Res 2023; 11:175. [PMID: 37273966 PMCID: PMC10233318 DOI: 10.12688/f1000research.76218.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 02/10/2023] [Indexed: 11/23/2023]
9
Zhou X, Chen Y, Ip FCF, Jiang Y, Cao H, Lv G, Zhong H, Chen J, Ye T, Chen Y, Zhang Y, Ma S, Lo RMN, Tong EPS, Mok VCT, Kwok TCY, Guo Q, Mok KY, Shoai M, Hardy J, Chen L, Fu AKY, Ip NY. Deep learning-based polygenic risk analysis for Alzheimer's disease prediction. Communications Medicine 2023; 3:49. [PMID: 37024668 PMCID: PMC10079691 DOI: 10.1038/s43856-023-00269-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/19/2021] [Accepted: 03/06/2023] [Indexed: 04/08/2023]
Abstract
BACKGROUND The polygenic nature of Alzheimer's disease (AD) suggests that multiple variants jointly contribute to disease susceptibility. As an individual's genetic variants are constant throughout life, evaluating the combined effects of multiple disease-associated genetic risks enables reliable AD risk prediction. Because of the complexity of genomic data, current statistical analyses cannot comprehensively capture the polygenic risk of AD, resulting in unsatisfactory disease risk prediction. However, deep learning methods, which capture nonlinearity within high-dimensional genomic data, may enable more accurate disease risk prediction and improve our understanding of AD etiology. Accordingly, we developed deep learning neural network models for modeling AD polygenic risk. METHODS We constructed neural network models to model AD polygenic risk and compared them with the widely used weighted polygenic risk score and lasso models. We conducted robust linear regression analysis to investigate the relationship between the AD polygenic risk derived from deep learning methods and AD endophenotypes (i.e., plasma biomarkers and individual cognitive performance). We stratified individuals by applying unsupervised clustering to the outputs from the hidden layers of the neural network model. RESULTS The deep learning models outperform other statistical models for modeling AD risk. Moreover, the polygenic risk derived from the deep learning models enables the identification of disease-associated biological pathways and the stratification of individuals according to distinct pathological mechanisms. CONCLUSION Our results suggest that deep learning methods are effective for modeling the genetic risks of AD and other diseases, classifying disease risks, and uncovering disease mechanisms.
Affiliation(s)
- Xiaopu Zhou
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
- Yu Chen
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
  - Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
- Fanny C F Ip
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
- Yuanbing Jiang
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Han Cao
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Ge Lv
  - Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Huan Zhong
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Jiahang Chen
  - Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Tao Ye
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
  - Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
- Yuewen Chen
  - Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
  - Chinese Academy of Sciences Key Laboratory of Brain Connectome and Manipulation, Shenzhen Key Laboratory of Translational Research for Brain Diseases, The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen, Guangdong, 518055, China
- Yulin Zhang
  - Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Shuangshuang Ma
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Ronnie M N Lo
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Estella P S Tong
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Vincent C T Mok
- Gerald Choa Neuroscience Centre, Lui Che Woo Institute of Innovative Medicine, Therese Pei Fong Chow Research Centre for Prevention of Dementia, Division of Neurology, Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Timothy C Y Kwok
- Therese Pei Fong Chow Research Centre for Prevention of Dementia, Division of Geriatrics, Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Qihao Guo
- Department of Gerontology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
| | - Kin Y Mok
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
| | - Maryam Shoai
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
| | - John Hardy
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- UK Dementia Research Institute at UCL, London, UK
- HKUST Jockey Club Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Lei Chen
- Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
| | - Amy K Y Fu
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| | - Nancy Y Ip
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Molecular Neuroscience Center, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Hong Kong Center for Neurodegenerative Diseases, Hong Kong Science Park, Hong Kong, China
- Guangdong Provincial Key Laboratory of Brain Science, Disease and Drug Development, HKUST Shenzhen Research Institute, Shenzhen-Hong Kong Institute of Brain Science, Shenzhen, Guangdong, 518057, China
| |
|
10
|
Zhao Y, Sun L. A stable and adaptive polygenic signal detection method based on repeated sample splitting. Can J Stat 2023. [DOI: 10.1002/cjs.11768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
|
11
|
Learning high-order interactions for polygenic risk prediction. PLoS One 2023; 18:e0281618. [PMID: 36763605 PMCID: PMC9916647 DOI: 10.1371/journal.pone.0281618] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Accepted: 01/27/2023] [Indexed: 02/11/2023] Open
Abstract
Within the framework of precision medicine, the stratification of individual genetic susceptibility based on inherited DNA variation has paramount relevance. However, one of the most relevant pitfalls of traditional Polygenic Risk Score (PRS) approaches is their inability to model complex high-order non-linear SNP-SNP interactions and their effect on the phenotype (e.g. epistasis). Indeed, they face a computational challenge, as the number of possible interactions grows exponentially with the number of SNPs considered, which also affects the statistical reliability of the model parameters. In this work, we address this issue by proposing a novel PRS approach, called High-order Interactions-aware Polygenic Risk Score (hiPRS), that incorporates high-order interactions in modeling polygenic risk. hiPRS combines an interaction search routine based on frequent itemset mining and a novel interaction selection algorithm based on Mutual Information to construct a simple and interpretable weighted model of user-specified dimensionality that can predict a given binary phenotype. Compared to traditional PRS methods, hiPRS relies neither on GWAS summary statistics nor on any external information. Moreover, hiPRS differs from Machine Learning-based approaches that can include complex interactions in that it provides a readable and interpretable model and is able to control overfitting, even on small samples. In the present work we demonstrate through a comprehensive simulation study the superior performance of hiPRS with respect to state-of-the-art methods, both in terms of scoring performance and interpretability of the resulting model. We also test hiPRS against small sample size, class imbalance and the presence of noise, showcasing its robustness to extreme experimental settings. Finally, we apply hiPRS to a case study on real data from the DACHS cohort, defining an interaction-aware scoring model to predict mortality of stage II-III Colon-Rectal Cancer patients treated with oxaliplatin.
|
12
|
Ma J, Li J, Jin C, Yang J, Zheng C, Chen K, Xie Y, Yang Y, Bo Z, Wang J, Su Q, Wang J, Chen G, Wang Y. Association of gut microbiome and primary liver cancer: A two-sample Mendelian randomization and case-control study. Liver Int 2023; 43:221-233. [PMID: 36300678 DOI: 10.1111/liv.15466] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/21/2022] [Revised: 09/28/2022] [Accepted: 10/26/2022] [Indexed: 01/04/2023]
Abstract
BACKGROUND AND AIMS: Observational epidemiology studies have suggested a relationship between the gut microbiome and primary liver cancer. However, the causal relationship remains unclear because of confounding factors and reverse causality. We aimed to explore the causal role of the gut microbiome in the development of primary liver cancer, including hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC). METHODS: A Mendelian randomization (MR) study was conducted using summary statistics from genome-wide association studies (GWAS) of the gut microbiome and liver cancer, and sequencing data from a case-control study were used to validate the findings. A 5-cohort GWAS in Germany (N = 8956) served as the exposure, whilst the UK Biobank GWAS (N = 456 348) served as the outcome. The case-control study was conducted at the First Affiliated Hospital of Wenzhou Medical University from December 2018 to October 2020 and included 184 HCC patients, 63 ICC patients and 40 healthy controls. RESULTS: A total of 57 features were available for MR analysis, and protective causal associations were identified for Family_Ruminococcaceae (OR = 0.46 [95% CI, 0.26-0.82]; p = .009) and Genus_Porphyromonadaceae (OR = 0.59 [95% CI, 0.42-0.83]; p = .003) with HCC, and for Family_Porphyromonadaceae (OR = 0.36 [95% CI, 0.14-0.94]; p = .036) and Genus_Bacteroidetes (OR = 0.55 [95% CI, 0.34-0.90]; p = .017) with ICC, respectively. The case-control study results showed that the healthy controls had a higher relative abundance of Family_Ruminococcaceae (p = .00033), Family_Porphyromonadaceae (p = .0055) and Genus_Bacteroidetes (p = .021) than the liver cancer patients. CONCLUSIONS: This study demonstrates that Ruminococcaceae, Porphyromonadaceae and Bacteroidetes are related to a reduced risk of liver cancer (HCC or ICC), suggesting potential significance for the prevention and control of liver cancer.
Affiliation(s)
- Jun Ma
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Jialiang Li
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Chen Jin
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Jinhuan Yang
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Chongming Zheng
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Kaiwen Chen
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Yitong Xie
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Yi Yang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Zhiyuan Bo
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Jingxian Wang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Qing Su
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Juejin Wang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Gang Chen
- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Key Laboratory of Diagnosis and Treatment of Severe Hepato-Pancreatic Diseases of Zhejiang Province, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang, China
| | - Yi Wang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| |
|
13
|
Ma J, Jin C, Yang Y, Li H, Wang Y. Association of daytime napping frequency and schizophrenia: a bidirectional two-sample Mendelian randomization study. BMC Psychiatry 2022; 22:786. [PMID: 36513988 PMCID: PMC9746219 DOI: 10.1186/s12888-022-04431-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 11/25/2022] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: The bidirectional causal association between daytime napping frequency and schizophrenia is unclear. METHODS: A bidirectional two-sample Mendelian randomization (MR) analysis was conducted with summary statistics of top genetic variants associated with daytime napping frequency and schizophrenia from genome-wide association studies (GWAS). The single nucleotide polymorphism (SNP) data for the daytime napping frequency GWAS came from the UK Biobank (n = 452,633) and the 23andMe study cohort (n = 541,333), while the schizophrenia GWAS came from the Psychiatric Genomics Consortium (PGC, 36,989 cases and 113,075 controls). The inverse variance weighted (IVW) analysis was the primary method, with the weighted median, MR-Robust Adjusted Profile Score (RAPS), Radial MR and MR-Pleiotropy Residual Sum Outlier (PRESSO) as sensitivity analyses. RESULTS: The MR analysis showed a bidirectional causal relationship between more frequent daytime napping and the occurrence of schizophrenia. The odds ratio (OR) of schizophrenia per one-unit increase in napping category (never, sometimes, usually) was 3.38 (95% confidence interval [CI]: 2.02-5.65, P = 3.58 × 10^-6), and the beta for the effect of schizophrenia on daytime napping frequency was 0.0112 (95% CI: 0.0060-0.0163, P = 2.04 × 10^-5). The sensitivity analyses reached the same conclusions. CONCLUSION: Our findings support a bidirectional causal association between more frequent daytime napping and schizophrenia, implying that daytime napping frequency is a potential intervention target for the progression and treatment of schizophrenia.
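As an illustration of the inverse variance weighted (IVW) estimator named in the Methods above, the following is a minimal sketch of a fixed-effect IVW estimate computed from per-SNP summary statistics. The function name, argument names, and toy numbers are assumptions made for illustration; they are not taken from the study, and the study's own software and data are not reproduced here.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    beta_exposure: SNP-exposure effect estimates (hypothetical inputs)
    beta_outcome:  SNP-outcome effect estimates (log odds for a binary outcome)
    se_outcome:    standard errors of the SNP-outcome estimates
    Returns (estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one SNP is required")

    # Weighted regression of outcome effects on exposure effects through
    # the origin, with weights 1 / se_outcome^2.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


if __name__ == "__main__":
    # Toy values only, chosen so that every SNP implies the same ratio.
    est, se = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
    low, high = est - 1.96 * se, est + 1.96 * se
    # For a binary outcome the estimate is on the log-odds scale, so
    # exponentiating gives an odds ratio and its 95% confidence interval.
    print(math.exp(est), math.exp(low), math.exp(high))
```

The sensitivity methods listed in the abstract (weighted median, MR-RAPS, Radial MR, MR-PRESSO) relax different assumptions of this estimator and are not covered by the sketch.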
Affiliation(s)
- Jun Ma
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Chen Jin
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Yan Yang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Haoqi Li
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
| | - Yi Wang
- Department of Epidemiology and Biostatistics, School of Public Health and Management, Wenzhou Medical University, Wenzhou, China
- Institute of Aging, Key Laboratory of Alzheimer's Disease of Zhejiang Province, Wenzhou Medical University, Wenzhou, Zhejiang, China
| |
|
14
|
Lee CJ, Chen TH, Lim AMW, Chang CC, Sie JJ, Chen PL, Chang SW, Wu SJ, Hsu CL, Hsieh AR, Yang WS, Fann CSJ. Phenome-wide analysis of Taiwan Biobank reveals novel glycemia-related loci and genetic risks for diabetes. Commun Biol 2022; 5:1175. [PMID: 36329257 PMCID: PMC9633758 DOI: 10.1038/s42003-022-04168-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/20/2022] [Accepted: 10/25/2022] [Indexed: 11/05/2022] Open
Abstract
To explore the complex genetic architecture of common diseases and traits, we conducted a comprehensive PheWAS of ten diseases and 34 quantitative traits in the community-based Taiwan Biobank (TWB). We identified 995 significantly associated loci, with 135 novel loci specific to the Taiwanese population. Further analyses highlighted the genetic pleiotropy of loci related to complex disease and associated quantitative traits. Extensive analysis of glycaemic phenotypes (T2D, fasting glucose and HbA1c) was performed and identified 115 significant loci with four novel genetic variants (HACL1, RAD21, ASH1L and GAK). Transcriptomics data also strengthen the relevance of the findings to metabolic disorders, thus contributing to a better understanding of pathogenesis. In addition, genetic risk scores are constructed and validated for absolute risk prediction of T2D in the Taiwanese population. In conclusion, our data-driven approach without an a priori hypothesis is useful for novel gene discovery and validation, in addition to disease risk prediction, for a unique non-European population.
Affiliation(s)
- Chia-Jung Lee
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan; Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, USA
| | - Ting-Huei Chen
- Department of Mathematics and Statistics, Laval University, Quebec, QC, G1V0A6, Canada; Brain Research Centre (CERVO), Quebec, QC, G1V0A6, Canada
| | - Aylwin Ming Wee Lim
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan; Taiwan International Graduate Program in Molecular Medicine, National Yang Ming Chiao Tung University and Academia Sinica, Taipei, 115, Taiwan
| | - Chien-Ching Chang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan
| | - Jia-Jyun Sie
- Department of Mathematics, National Changhua University of Education, Changhua, Taiwan
| | - Pei-Lung Chen
- Graduate Institute of Medical Genomics and Proteomics, College of Medicine, National Taiwan University, Taipei, 10617, Taiwan; Department of Medical Genetics, National Taiwan University Hospital, Taipei, 100225, Taiwan; Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, 10617, Taiwan
| | - Su-Wei Chang
- Clinical Informatics and Medical Statistics Research Center, Chang Gung University, Taoyuan, 333, Taiwan; Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan, 333, Taiwan
| | - Shang-Jung Wu
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan
| | - Chia-Lin Hsu
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan
| | - Ai-Ru Hsieh
- Department of Statistics, Tamkang University, New Taipei City, 251301, Taiwan
| | - Wei-Shiung Yang
- Graduate Institute of Medical Genomics and Proteomics, College of Medicine, National Taiwan University, Taipei, 10617, Taiwan; Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, 10617, Taiwan; Department of Internal Medicine, National Taiwan University Hospital, Taipei, 100225, Taiwan
| | - Cathy S J Fann
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan
| |
|
15
|
Namjou B, Lape M, Malolepsza E, DeVore SB, Weirauch MT, Dikilitas O, Jarvik GP, Kiryluk K, Kullo IJ, Liu C, Luo Y, Satterfield BA, Smoller JW, Walunas TL, Connolly J, Sleiman P, Mersha TB, Mentch FD, Hakonarson H, Prows CA, Biagini JM, Khurana Hershey GK, Martin LJ, Kottyan L. Multiancestral polygenic risk score for pediatric asthma. J Allergy Clin Immunol 2022; 150:1086-1096. [PMID: 35595084 PMCID: PMC9643615 DOI: 10.1016/j.jaci.2022.03.035] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/14/2021] [Revised: 03/07/2022] [Accepted: 03/29/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND: Asthma is the most common chronic condition in children and the third leading cause of hospitalization in pediatrics. The genome-wide association study catalog reports 140 studies with genome-wide significance. A polygenic risk score (PRS) with predictive value across ancestries has not been evaluated for this important trait. OBJECTIVES: This study aimed to train and validate a PRS relying on genetic determinants for asthma to provide predictions for disease occurrence in pediatric cohorts of diverse ancestries. METHODS: This study applied a Bayesian regression framework method using the Trans-National Asthma Genetic Consortium genome-wide association study summary statistics to derive a multiancestral PRS, used one Electronic Medical Records and Genomics (eMERGE) cohort as a training set, used a second independent eMERGE cohort to validate the score, and used the UK Biobank data to replicate the findings. A phenome-wide association study was performed using the PRS to identify shared genetic etiology with other phenotypes. RESULTS: The multiancestral asthma PRS was associated with asthma in the 2 pediatric validation datasets. Overall, the multiancestral asthma PRS has an area under the curve (AUC) of 0.70 (95% CI, 0.69-0.72) in the pediatric validation 1 and AUC of 0.66 (0.65-0.66) in the pediatric validation 2 datasets. We found significant discrimination across pediatric subcohorts of European (AUC, 95% CI, 0.60 and 0.66), African (AUC, 95% CI, 0.61 and 0.66), admixed American (AUC, 0.64 and 0.70), Southeast Asian (AUC, 0.65), and East Asian (AUC, 0.73) ancestry. Pediatric participants with the top 5% PRS had 2.80- to 5.82-fold increased odds of asthma compared to the bottom 5% across the training, validation 1, and validation 2 cohorts when adjusted for ancestry. Phenome-wide association study analysis confirmed the strong association of the identified PRS with asthma (odds ratio, 2.71, P_FDR = 3.71 × 10^-65) and related phenotypes. CONCLUSIONS: A multiancestral PRS for asthma based on Bayesian posterior genomic effect sizes identifies increased odds of pediatric asthma.
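The AUC values reported above summarize discrimination: the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows. The function name, the 0/1 label coding, and the toy scores are assumptions for illustration only and do not come from the study.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise case-control comparison.

    scores: polygenic risk scores (any real numbers; hypothetical here)
    labels: 1 for a case, 0 for a control
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    case_scores = [s for s, y in zip(scores, labels) if y == 1]
    control_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    # Count the case-control pairs in which the case scores higher;
    # a tie contributes half a pair.
    favourable = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                favourable += 1.0
            elif case == control:
                favourable += 0.5
    return favourable / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Toy data: two controls followed by two cases.
    print(auc_from_scores([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise loop is quadratic in sample size and is meant only to make the definition concrete; cohort-scale analyses would normally use a rank-based implementation from a statistics library.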
Affiliation(s)
- Bahram Namjou
- Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
| | - Michael Lape
- Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Edyta Malolepsza
- Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02142
| | - Stanley B. DeVore
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Asthma Research, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Matthew T. Weirauch
- Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Ozan Dikilitas
- Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota 55905
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota 55905
| | - Gail P. Jarvik
- Departments of Medicine (Division of Medical Genetics) and Genome Sciences, University of Washington Medical Center, Seattle, Washington 98195
| | - Krzysztof Kiryluk
- Department of Medicine, Division of Nephrology, College of Physicians and Surgeons, Columbia University, New York, New York 10032
| | - Iftikhar J. Kullo
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota 55905
| | - Cong Liu
- Department of Biomedical Informatics, Columbia University, New York, New York 10032
| | - Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611
| | | | - Jordan W. Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
- Department of Psychiatry, Harvard Medical School, Boston, Massachusetts 02115
| | - Theresa L. Walunas
- Division of General Internal Medicine and Geriatrics, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611
| | - John Connolly
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Department of Pediatrics, Philadelphia, Pennsylvania 19104
| | - Patrick Sleiman
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Department of Pediatrics, Philadelphia, Pennsylvania 19104
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Tesfaye B. Mersha
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Asthma Research, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Frank D Mentch
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Department of Pediatrics, Philadelphia, Pennsylvania 19104
| | - Hakon Hakonarson
- Center for Applied Genomics, Children’s Hospital of Philadelphia, Department of Pediatrics, Philadelphia, Pennsylvania 19104
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104
| | - Cynthia A. Prows
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Department of Patient Services, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Jocelyn M. Biagini
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Asthma Research, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Gurjit K. Khurana Hershey
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Asthma Research, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Division of Allergy & Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Lisa J. Martin
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - Leah Kottyan
- Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
- Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, Ohio 45229
- Division of Allergy & Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229
| | - The eMERGE Network
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892
| |
|
16
|
Waksmunski AR, Kinzy TG, Cruz LA, Nealon CL, Halladay CW, Simpson P, Canania RL, Anthony SA, Roncone DP, Sawicki Rogers L, Leber JN, Dougherty JM, Greenberg PB, Sullivan JM, Wu WC, Iyengar SK, Crawford DC, Peachey NS, Cooke Bailey JN. Glaucoma Genetic Risk Scores in the Million Veteran Program. Ophthalmology 2022; 129:1263-1274. [PMID: 35718050 PMCID: PMC9997524 DOI: 10.1016/j.ophtha.2022.06.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/03/2021] [Revised: 06/07/2022] [Accepted: 06/09/2022] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: Primary open-angle glaucoma (POAG) is a degenerative eye disease for which early treatment is critical to mitigate visual impairment and irreversible blindness. POAG-associated loci individually confer incremental risk. Genetic risk score(s) (GRS) could enable POAG risk stratification. Despite significantly higher POAG burden among individuals of African ancestry (AFR), GRS are limited in this population. A recent large-scale, multi-ancestry meta-analysis identified 127 POAG-associated loci and calculated cross-ancestry and ancestry-specific effect estimates, including in European ancestry (EUR) and AFR individuals. We assessed the utility of the 127-variant GRS for POAG risk stratification in EUR and AFR Veterans in the Million Veteran Program (MVP). We also explored the association between GRS and documented invasive glaucoma surgery (IGS). DESIGN: Cross-sectional study. PARTICIPANTS: MVP Veterans with imputed genetic data, including 5830 POAG cases (445 with IGS documented in the electronic health record) and 64 476 controls. METHODS: We tested unweighted and weighted GRS of 127 published risk variants in EUR (3382 cases and 58 811 controls) and AFR (2448 cases and 5665 controls) Veterans in the MVP. Weighted GRS were calculated using effect estimates from the most recently published report of cross-ancestry and ancestry-specific meta-analyses. We also evaluated GRS in POAG cases with documented IGS. MAIN OUTCOME MEASURES: Performance of 127-variant GRS in EUR and AFR Veterans for POAG risk stratification and association with documented IGS. RESULTS: GRS were significantly associated with POAG (P < 5 × 10^-5) in both groups; a higher proportion of EUR compared with AFR were consistently categorized in the top GRS decile (21.9%-23.6% and 12.9%-14.5%, respectively). Only GRS weighted by ancestry-specific effect estimates were associated with IGS documentation in AFR cases; all GRS types were associated with IGS in EUR cases. CONCLUSIONS: Varied performance of the GRS for POAG risk stratification and documented IGS association in EUR and AFR Veterans highlights (1) the complex risk architecture of POAG, (2) the importance of diverse representation in genomics studies that inform GRS construction and evaluation, and (3) the necessity of expanding diverse POAG-related genomic data so that GRS can equitably aid in screening individuals at high risk of POAG and who may require more aggressive treatment.
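To make the distinction between unweighted and weighted GRS in the Methods above concrete, the sketch below computes both for a single individual. It assumes the common convention that an unweighted score is the sum of risk-allele counts and a weighted score multiplies each count by the natural log of the variant's published odds ratio; the abstract does not spell out the exact formula used, and the function names and toy values are hypothetical.

```python
import math


def unweighted_grs(risk_allele_counts):
    """Sum of risk-allele counts (0, 1, or 2 per variant, or imputed dosages)."""
    return sum(risk_allele_counts)


def weighted_grs(risk_allele_counts, odds_ratios):
    """Sum of risk-allele counts weighted by log odds ratios.

    odds_ratios: per-variant effect estimates, for example cross-ancestry or
    ancestry-specific values (hypothetical inputs in this sketch).
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per variant")
    if any(o <= 0 for o in odds_ratios):
        raise ValueError("odds ratios must be positive")
    return sum(count * math.log(o)
               for count, o in zip(risk_allele_counts, odds_ratios))


if __name__ == "__main__":
    # Toy example with three variants instead of the 127 used in the study.
    counts = [0, 1, 2]
    cross_ancestry_or = [1.20, 1.10, 1.05]
    ancestry_specific_or = [1.25, 1.02, 1.08]
    print(unweighted_grs(counts))
    print(weighted_grs(counts, cross_ancestry_or))
    print(weighted_grs(counts, ancestry_specific_or))
```

Swapping the weight vector, as in the last two calls, is the only change needed to move between cross-ancestry and ancestry-specific weighting under this convention.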
Affiliation(s)
- Andrea R Waksmunski
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Tyler G Kinzy
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Lauren A Cruz
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Cari L Nealon
  - Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Christopher W Halladay
  - Center of Innovation in Long Term Services and Supports, Providence VA Medical Center, Providence, Rhode Island
- Piana Simpson
  - Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Scott A Anthony
  - Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- David P Roncone
  - Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Lea Sawicki Rogers
  - Ophthalmology Section, VA Western NY Healthcare System, Buffalo, New York
- Jenna N Leber
  - Ophthalmology Section, VA Western NY Healthcare System, Buffalo, New York
- Paul B Greenberg
  - Ophthalmology Section, Providence VA Medical Center, Providence, Rhode Island; Division of Ophthalmology, Alpert Medical School, Brown University, Providence, Rhode Island
- Jack M Sullivan
  - Ophthalmology Section, VA Western NY Healthcare System, Buffalo, New York; Research Service, VA Western NY Healthcare System, Buffalo, New York
- Wen-Chih Wu
  - Cardiology Section, Medical Service, Providence VA Medical Center, Providence, Rhode Island
- Sudha K Iyengar
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Dana C Crawford
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio
- Neal S Peachey
  - Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio; Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio; Department of Ophthalmology, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio
- Jessica N Cooke Bailey
  - Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio; Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio
|
17
|
Allegrini AG, Baldwin JR, Barkhuizen W, Pingault JB. Research Review: A guide to computing and implementing polygenic scores in developmental research. J Child Psychol Psychiatry 2022; 63:1111-1124. [PMID: 35354222 PMCID: PMC10108570 DOI: 10.1111/jcpp.13611] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/26/2021] [Revised: 02/28/2022] [Accepted: 03/04/2022] [Indexed: 12/14/2022]
Abstract
The increasing availability of genotype data in longitudinal population- and family-based samples provides opportunities for using polygenic scores (PGS) to study developmental questions in child and adolescent psychology and psychiatry. Here, we aim to provide a comprehensive overview of how PGS can be generated and implemented in developmental psycho(patho)logy, with a focus on longitudinal designs. As such, the paper is organized into three parts: First, we provide a formal definition of polygenic scores and related concepts, focusing on assumptions and limitations. Second, we give a general overview of the methods used to compute polygenic scores, ranging from the classic approach to more advanced methods. We include recommendations and reference resources available to researchers aiming to conduct PGS analyses. Finally, we focus on the practical applications of PGS in the analysis of longitudinal data. We describe how PGS have been used to research developmental outcomes, and how they can be applied to longitudinal data to address developmental questions.
Affiliation(s)
- Andrea G Allegrini
  - Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Jessie R Baldwin
  - Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Wikus Barkhuizen
  - Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK
- Jean-Baptiste Pingault
  - Division of Psychology and Language Sciences, Department of Clinical, Educational and Health Psychology, University College London, London, UK; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
|
18
|
Tian P, Chan TH, Wang YF, Yang W, Yin G, Zhang YD. Multiethnic polygenic risk prediction in diverse populations through transfer learning. Front Genet 2022; 13:906965. [PMID: 36061179 PMCID: PMC9438789 DOI: 10.3389/fgene.2022.906965] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/29/2022] [Accepted: 06/27/2022] [Indexed: 11/28/2022] Open
Abstract
Polygenic risk scores (PRS) leverage the genetic contribution of an individual’s genotype to a complex trait by estimating disease risk. Traditional PRS prediction methods are predominantly for the European population. The accuracy of PRS prediction in non-European populations is diminished due to much smaller sample size of genome-wide association studies (GWAS). In this article, we introduced a novel method to construct PRS for non-European populations, abbreviated as TL-Multi, by conducting a transfer learning framework to learn useful knowledge from the European population to correct the bias for non-European populations. We considered non-European GWAS data as the target data and European GWAS data as the informative auxiliary data. TL-Multi borrows useful information from the auxiliary data to improve the learning accuracy of the target data while preserving the efficiency and accuracy. To demonstrate the practical applicability of the proposed method, we applied TL-Multi to predict the risk of systemic lupus erythematosus (SLE) in the Asian population and the risk of asthma in the Indian population by borrowing information from the European population. TL-Multi achieved better prediction accuracy than the competing methods, including Lassosum and meta-analysis in both simulations and real applications.
Affiliation(s)
- Peixin Tian
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China
- Tsai Hor Chan
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China
- Yong-Fei Wang
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong SAR, China
- Wanling Yang
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong SAR, China
- Guosheng Yin
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China
- Yan Dora Zhang
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China
  - Centre for PanorOmic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - *Correspondence: Yan Dora Zhang
|
19
|
Ma W, Lau YL, Yang W, Wang YF. Random forests algorithm boosts genetic risk prediction of systemic lupus erythematosus. Front Genet 2022; 13:902793. [PMID: 36046232 PMCID: PMC9421562 DOI: 10.3389/fgene.2022.902793] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/23/2022] [Accepted: 07/19/2022] [Indexed: 11/13/2022] Open
Abstract
Patients with systemic lupus erythematosus (SLE) present varied clinical manifestations, posing a diagnostic challenge for physicians. Genetic factors substantially contribute to SLE development. A polygenic risk scoring (PRS) model has been used to estimate the genetic risk of SLE in individuals. However, this approach assumes independent and additive contribution of genetic variants to disease development. We aimed to improve the accuracy of SLE prediction using machine-learning algorithms. We applied random forest (RF), support vector machine (SVM), and artificial neural network (ANN) to classify SLE cases and controls using the data from our previous genome-wide association studies (GWAS) conducted in either Chinese or European populations, including a total of 19,208 participants. The overall performances of these predictors were assessed by the value of area under the receiver-operator curve (AUC). The analyses in the Chinese GWAS showed that the RF model significantly outperformed other predictors, achieving a mean AUC value of 0.84, a 13% improvement upon the PRS model (AUC = 0.74). At the optimal cut-off, the RF predictor reached a sensitivity of 84% with a specificity of 68% in SLE classification. To validate these results, similar analyses were repeated in the European GWAS, and the RF model consistently outperformed other algorithms. Our study suggests that the RF model could be an additional and powerful predictor for SLE early diagnosis.
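The evaluation metrics named in the abstract above (AUC, and sensitivity and specificity at a cut-off) can be sketched in a few lines. This is a generic illustration on made-up scores, not the authors' pipeline. The function names and values are hypothetical, and the pairwise AUC definition used here is the standard rank-based one, with ties counted as half.

```python
from typing import Sequence, Tuple


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a randomly chosen case scores higher than a
    randomly chosen control; tied pairs contribute one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores: Sequence[float],
                            control_scores: Sequence[float],
                            cutoff: float) -> Tuple[float, float]:
    """Classify score >= cutoff as predicted case and return
    (sensitivity, specificity)."""
    sensitivity = sum(s >= cutoff for s in case_scores) / len(case_scores)
    specificity = sum(s < cutoff for s in control_scores) / len(control_scores)
    return sensitivity, specificity


# Made-up predictor outputs for three cases and three controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]
print(auc(cases, controls))
print(sensitivity_specificity(cases, controls, cutoff=0.5))
```

The same two functions apply to any scalar predictor, so a PRS and a random forest probability can be compared on identical footing.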
Affiliation(s)
- Wen Ma
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong, China
- Yu-Lung Lau
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong, China
- Wanling Yang
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong, China
  - *Correspondence: Wanling Yang; Yong-Fei Wang
- Yong-Fei Wang
  - Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong, China
  - Shenzhen Futian Hospital for Rheumatic Diseases, Shenzhen, China
  - *Correspondence: Wanling Yang; Yong-Fei Wang
|
20
|
Wang Y, Tsuo K, Kanai M, Neale BM, Martin AR. Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. Annu Rev Biomed Data Sci 2022; 5:293-320. [PMID: 35576555 PMCID: PMC9828290 DOI: 10.1146/annurev-biodatasci-111721-074830] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Indexed: 01/11/2023]
Abstract
Polygenic risk scores (PRS) estimate an individual's genetic likelihood of complex traits and diseases by aggregating information across multiple genetic variants identified from genome-wide association studies. PRS can predict a broad spectrum of diseases and have therefore been widely used in research settings. Some work has investigated their potential applications as biomarkers in preventative medicine, but significant work is still needed to definitively establish and communicate absolute risk to patients for genetic and modifiable risk factors across demographic groups. However, the biggest limitation of PRS currently is that they show poor generalizability across diverse ancestries and cohorts. Major efforts are underway through methodological development and data generation initiatives to improve their generalizability. This review aims to comprehensively discuss current progress on the development of PRS, the factors that affect their generalizability, and promising areas for improving their accuracy, portability, and implementation.
Affiliation(s)
- Ying Wang
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Kristin Tsuo
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
  - Biological and Biomedical Sciences, Harvard Medical School, Boston, Massachusetts, USA
- Masahiro Kanai
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Benjamin M Neale
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Alicia R Martin
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
|
21
|
Yue T, Li J, Liang M, Yang J, Ou Z, Wang S, Ma W, Fan D. Identification of the KCNQ1OT1/miR-378a-3p/RBMS1 Axis as a Novel Prognostic Biomarker Associated With Immune Cell Infiltration in Gastric Cancer. Front Genet 2022; 13:928754. [PMID: 35910231 PMCID: PMC9330051 DOI: 10.3389/fgene.2022.928754] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/26/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background: Gastric cancer (GC) is the second leading cause of cancer-related mortality and the fifth most common cancer worldwide. However, the underlying mechanisms of competitive endogenous RNAs (ceRNAs) in GC are unclear. This study aimed to construct a ceRNA regulation network in correlation with prognosis and explore a prognostic model associated with GC. Methods: In this study, 1,040 cases of GC were obtained from TCGA and GEO datasets. To identify potential prognostic signature associated with GC, Cox regression analysis and the least absolute shrinkage and selection operator (LASSO) regression were employed. The prognostic value of the signature was validated in the GEO84437 training set, GEO84437 test set, GEO15459 set, and TCGA-STAD. Based on the public databases, TargetScan and starBase, an mRNA-miRNA-lncRNA regulatory network was constructed, and hub genes were identified using the CytoHubba plugin. Furthermore, the clinical outcomes, immune cell infiltration, genetic variants, methylation, and somatic copy number alteration (sCNA) associated with the ceRNA network were derived using bioinformatics methods. Results: A total of 234 prognostic genes were identified. GO and GSEA revealed that the biological pathways and modules related to immune response and fibroblasts were considerably enriched in GC. A nomogram was generated to provide accurate prognostic outcomes and individualized risk estimates, which were validated in the training, test dataset, and two independent validation datasets. Thereafter, an mRNA-miRNA-lncRNA regulatory network containing 4 mRNAs, 22 miRNAs, 201 lncRNAs was constructed. The KCNQ1OT1/hsa-miR-378a-3p/RBMS1 ceRNA network associated with the prognosis was obtained by hub gene analysis and correlation analysis. Importantly, we found that the KCNQ1OT1/miR-378a-3p/RBMS1 axis may play a vital role in the diagnosis and prognosis of GC patients based on Cox regression analyses. 
Furthermore, our findings demonstrated that mutations and sCNA of the KCNQ1OT1/miR-378a-3p/RBMS1 axis were associated with increased immune infiltration, while the abnormal upregulation of the axis was primarily a result of hypomethylation. Conclusion: Our findings suggest that the KCNQ1OT1/miR-378a-3p/RBMS1 axis may be a potential prognostic biomarker and therapeutic target for GC. Moreover, such findings provide insights into the molecular mechanisms of GC pathogenesis.
Affiliation(s)
- Ting Yue
  - The Fifth Clinical Medical School, Guangzhou University of Chinese Medicine, Guangzhou, China
  - Department of Oncology Rehabilitation, Jincheng People’s Hospital, Jincheng, China
- Jingjing Li
  - Department of Anesthesiology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
  - Department of Anesthesiology, Jincheng People’s Hospital, Jincheng, China
- Manguang Liang
  - The Fifth Clinical Medical School, Guangzhou University of Chinese Medicine, Guangzhou, China
- Jiaman Yang
  - The Fifth Clinical Medical School, Guangzhou University of Chinese Medicine, Guangzhou, China
- Zhiwen Ou
  - The Fifth Clinical Medical School, Guangzhou University of Chinese Medicine, Guangzhou, China
- Shuchen Wang
  - Department of Anesthesiology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Wuhua Ma
  - Department of Anesthesiology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
  - *Correspondence: Wuhua Ma; Dehui Fan
- Dehui Fan
  - The Fifth Clinical Medical School, Guangzhou University of Chinese Medicine, Guangzhou, China
  - Department of Rehabilitation, GuangDong Second Traditional Chinese Medicine Hospital, Guangzhou, China
  - *Correspondence: Wuhua Ma; Dehui Fan
|
22
|
Khunsriraksakul C, Markus H, Olsen NJ, Carrel L, Jiang B, Liu DJ. Construction and Application of Polygenic Risk Scores in Autoimmune Diseases. Front Immunol 2022; 13:889296. [PMID: 35833142 PMCID: PMC9271862 DOI: 10.3389/fimmu.2022.889296] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/04/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with autoimmune diseases and provided unique mechanistic insights and informed novel treatments. These individual genetic variants on their own typically confer a small effect of disease risk with limited predictive power; however, when aggregated (e.g., via polygenic risk score method), they could provide meaningful risk predictions for a myriad of diseases. In this review, we describe the recent advances in GWAS for autoimmune diseases and the practical application of this knowledge to predict an individual’s susceptibility/severity for autoimmune diseases such as systemic lupus erythematosus (SLE) via the polygenic risk score method. We provide an overview of methods for deriving different polygenic risk scores and discuss the strategies to integrate additional information from correlated traits and diverse ancestries. We further advocate for the need to integrate clinical features (e.g., anti-nuclear antibody status) with genetic profiling to better identify patients at high risk of disease susceptibility/severity even before clinical signs or symptoms develop. We conclude by discussing future challenges and opportunities of applying polygenic risk score methods in clinical care.
Affiliation(s)
- Chachrit Khunsriraksakul
  - Graduate Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, United States
  - Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Havell Markus
  - Graduate Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, United States
  - Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Nancy J. Olsen
  - Department of Medicine, Division of Rheumatology, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Laura Carrel
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Bibo Jiang
  - Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, United States
- Dajiang J. Liu
  - Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, United States
  - Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, United States
  - *Correspondence: Dajiang J. Liu
|
23
|
Dattani S, Howard DM, Lewis CM, Sham PC. Clarifying the causes of consistent and inconsistent findings in genetics. Genet Epidemiol 2022; 46:372-389. [PMID: 35652173 PMCID: PMC9544854 DOI: 10.1002/gepi.22459] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/04/2022] [Revised: 04/12/2022] [Accepted: 04/22/2022] [Indexed: 11/29/2022]
Abstract
As research in genetics has advanced, some findings have been unexpected or shown to be inconsistent between studies or datasets. The reasons these inconsistencies arise are complex. Results from genetic studies can be affected by various factors including statistical power, linkage disequilibrium, quality control, confounding and selection bias, as well as real differences from interactions and effect modifiers, which may be informative about the mechanisms of traits and disease. Statistical artefacts can manifest as differences between results but they can also conceal underlying differences, which implies that their critical examination is important for understanding the underpinnings of traits. In this review, we examine these factors and outline how they can be identified and conceptualised with structural causal models. We explain the consequences they have on genetic estimates, such as genetic associations, polygenic scores, family‐ and genome‐wide heritability, and describe methods to address them to aid in the estimation of true effects of genetic variation. Clarifying these factors can help researchers anticipate when results are likely to diverge and aid researchers' understanding of causal relationships between genes and complex traits.
Affiliation(s)
- Saloni Dattani
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK; Department of Psychiatry, Li Ka Shing (LKS) Faculty of Medicine, University of Hong Kong, Hong Kong, China
- David M Howard
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK; Division of Psychiatry, Royal Edinburgh Hospital, University of Edinburgh, Edinburgh, UK
- Cathryn M Lewis
  - Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK; Department of Medical and Molecular Genetics, Faculty of Life Sciences and Medicine, King's College London, London, UK
- Pak C Sham
  - Department of Psychiatry, State Key Laboratory of Brain and Cognitive Sciences, and Centre for Panoromic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
|
24
|
Liang J, Wu Z, Zheng J, Koskela EA, Fan L, Fan G, Gao D, Dong Z, Hou S, Feng Z, Wang F, Hytönen T, Wang H. The GATA factor HANABA TARANU promotes runner formation by regulating axillary bud initiation and outgrowth in cultivated strawberry. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2022; 110:1237-1254. [PMID: 35384101 DOI: 10.1111/tpj.15759] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/05/2022] [Revised: 03/22/2022] [Accepted: 03/27/2022] [Indexed: 06/14/2023]
Abstract
A runner, as an elongated branch, develops from the axillary bud (AXB) in the leaf axil and is crucial for the clonal propagation of cultivated strawberry (Fragaria × ananassa Duch.). Runner formation occurs in at least two steps: AXB initiation and AXB outgrowth. HANABA TARANU (HAN) encodes a GATA transcription factor that affects AXB initiation in Arabidopsis and promotes branching in grass species, but the underlying mechanism is largely unknown. Here, the function of a strawberry HAN homolog FaHAN in runner formation was characterized. FaHAN transcripts can be detected in the leaf axils. Overexpression (OE) of FaHAN increased the number of runners, mainly by enhancing AXB outgrowth, in strawberry. The expression of the strawberry homolog of BRANCHED1, a key inhibitor of AXB outgrowth in many plant species, was significantly downregulated in the AXBs of FaHAN-OE lines, whereas the expression of the strawberry homolog of SHOOT MERISTEMLESS, a marker gene for AXB initiation in Arabidopsis, was upregulated. Moreover, several genes of gibberellin biosynthesis and cytokinin signaling pathways were activated, whereas the auxin response pathway genes were repressed. Further assays indicated that FaHAN could be directly activated by FaNAC2, the overexpression of which in strawberry also increased the number of runners. The silencing of FaNAC2 or FaHAN inhibited AXB initiation and led to a higher proportion of dormant AXBs, confirming their roles in the control of runner formation. Taken together, our results revealed a FaNAC2-FaHAN pathway in the control of runner formation and have provided a means to enhance the vegetative propagation of cultivated strawberry.
Affiliation(s)
- Jiahui Liang
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Latokartanonkaari 7, 00790, Helsinki, Finland
- Ze Wu
  - Key Laboratory of Landscaping Agriculture, Ministry of Agriculture and Rural Affairs, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Jing Zheng
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Elli A Koskela
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Latokartanonkaari 7, 00790, Helsinki, Finland
- Lingjiao Fan
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Guangxun Fan
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Latokartanonkaari 7, 00790, Helsinki, Finland
- Dehang Gao
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Zhenfei Dong
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Shengfan Hou
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Zekun Feng
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
- Feng Wang
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Latokartanonkaari 7, 00790, Helsinki, Finland
- Timo Hytönen
  - Department of Agricultural Sciences, Viikki Plant Science Centre, University of Helsinki, Latokartanonkaari 7, 00790, Helsinki, Finland
  - NIAB EMR, Kent, ME19 6BJ, UK
- Hongqing Wang
  - Department of Fruit Science, College of Horticulture, China Agricultural University, Beijing, 100193, China
|
25
|
Haan E, Sallis HM, Zuccolo L, Labrecque J, Ystrom E, Reichborn-Kjennerud T, Andreassen O, Havdahl A, Munafò MR. Prenatal smoking, alcohol and caffeine exposure and maternal-reported attention deficit hyperactivity disorder symptoms in childhood: triangulation of evidence using negative control and polygenic risk score analyses. Addiction 2022; 117:1458-1471. [PMID: 34791750 PMCID: PMC7613851 DOI: 10.1111/add.15746] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2021] [Accepted: 10/29/2021] [Indexed: 01/29/2023]
Abstract
BACKGROUND AND AIMS Studies have indicated that maternal prenatal substance use may be associated with offspring attention deficit hyperactivity disorder (ADHD) via intrauterine effects. We measured associations between prenatal smoking, alcohol and caffeine consumption with childhood ADHD symptoms accounting for shared familial factors. DESIGN First, we used a negative control design comparing maternal and paternal substance use. Three models were used for negative control analyses: unadjusted (without confounders), adjusted (including confounders) and mutually adjusted (including confounders and partner's substance use). The results were meta-analysed across the cohorts. Secondly, we used polygenic risk scores (PRS) as proxies for exposures. Maternal PRS for smoking, alcohol and coffee consumption were regressed against ADHD symptoms. We triangulated the results across the two approaches to infer causality. SETTING We used data from three longitudinal pregnancy cohorts: Avon Longitudinal Study of Parents and Children (ALSPAC) in the United Kingdom, Generation R study (GenR) in the Netherlands and Norwegian Mother, Father and Child Cohort study (MoBa) in Norway. PARTICIPANTS Phenotype data available for children were: N_ALSPAC = 5455-7751; N_GENR = 1537-3119; N_MOBA = 28 053-42 206. Genotype data available for mothers was: N_ALSPAC = 7074; N_MOBA = 14 583. MEASUREMENTS A measure of offspring ADHD symptoms at age 7-8 years was derived by dichotomizing scores from questionnaires and parental self-reported prenatal substance use was measured at the second pregnancy trimester. FINDINGS The pooled estimate for maternal prenatal substance use showed an association with total ADHD symptoms [odds ratio (OR): OR_SMOKING = 1.11, 95% confidence interval (CI) = 1.00-1.23; OR_ALCOHOL = 1.27, 95% CI = 1.08-1.49; OR_CAFFEINE = 1.05, 95% CI = 1.00-1.11], while not for fathers (OR_SMOKING = 1.03, 95% CI = 0.95-1.13; OR_ALCOHOL = 0.83, 95% CI = 0.47-1.48; OR_CAFFEINE = 1.02, 95% CI = 0.97-1.07).
However, maternal associations did not persist in sensitivity analyses (substance use before pregnancy, adjustment for maternal ADHD symptoms in MoBa). The PRS analyses were inconclusive for an association in ALSPAC or MoBa. CONCLUSIONS There appears to be no causal intrauterine effect of maternal prenatal substance use on offspring attention deficit hyperactivity disorder symptoms.
Affiliation(s)
- Elis Haan
- School of Psychological Science, University of Bristol, Bristol, UK
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
| | - Hannah M. Sallis
- School of Psychological Science, University of Bristol, Bristol, UK
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Department of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
| | - Luisa Zuccolo
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
- Department of Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
| | - Jeremy Labrecque
- Department of Epidemiology, Erasmus MC, Rotterdam, the Netherlands
| | - Eivind Ystrom
- PROMENTA Research Center, Department of Psychology, University of Oslo, Oslo, Norway
- Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway
- School of Pharmacy, University of Oslo, Oslo, Norway
| | - Ted Reichborn-Kjennerud
- Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway
- Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Ole Andreassen
- NORMENT Centre, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Alexandra Havdahl
- PROMENTA Research Center, Department of Psychology, University of Oslo, Oslo, Norway
- Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway
- Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
| | - Marcus R. Munafò
- School of Psychological Science, University of Bristol, Bristol, UK
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
|
26
|
El Hadi C, Ayoub G, Bachir Y, Haykal M, Jalkh N, Kourie HR. Polygenic and Network-Based Studies in Risk Identification and Demystification of Cancer. Expert Rev Mol Diagn 2022; 22:427-438. [PMID: 35400274 DOI: 10.1080/14737159.2022.2065195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION: Diseases were initially thought to be the consequence of a single gene mutation. Advances in DNA sequencing tools and our understanding of gene behavior have revealed that complex diseases, such as cancer, are the product of genes cooperating with each other and with their environment in orchestrated communication networks. Seeing that the function of individual genes is still used to analyze cancer, the shift to using functionally interacting groups of genes as a new unit of study holds promise for demystifying cancer. AREAS COVERED: The literature search focused on three types of cancer, namely breast, lung, and prostate, but arguments from other cancers were also included. The aim was to prove that multigene analyses can accurately predict and prognosticate cancer risk, subtype cancer for more personalized and effective treatments, and discover anti-cancer therapies. Computational intelligence is being harnessed to analyze this type of data and is proving indispensable to scientific progress. EXPERT OPINION: In the future, comprehensive profiling of all kinds of patient data (e.g., serum molecules, environmental exposures) can be used to build universal networks that should help us elucidate the molecular mechanisms underlying diseases and provide appropriate preventive measures, ensuring lifelong health and longevity.
Affiliation(s)
| | - George Ayoub
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Yara Bachir
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Michèle Haykal
- Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Nadine Jalkh
- Medical Genetics Unit, Technology and Health division, Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
| | - Hampig Raphael Kourie
- Department of Hematology-Oncology, Hotel Dieu de France University Hospital, Faculty of Medicine, Saint Joseph University, Beirut, Lebanon
|
27
|
SNP characteristics and validation success in genome wide association studies. Hum Genet 2022; 141:229-238. [PMID: 34981173 PMCID: PMC8855685 DOI: 10.1007/s00439-021-02407-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/06/2021] [Accepted: 11/27/2021] [Indexed: 02/03/2023]
Abstract
Genome wide association studies (GWASs) have identified tens of thousands of single nucleotide polymorphisms (SNPs) associated with human diseases and characteristics. A significant fraction of GWAS findings can be false positives. The gold standard for true positives is an independent validation. The goal of this study was to identify SNP features associated with validation success. Summary statistics from the Catalog of Published GWASs were used in the analysis. Since our goal was an analysis of reproducibility, we focused on the diseases/phenotypes targeted by at least 10 GWASs. GWASs were arranged in discovery-validation pairs based on the time of publication, with the discovery GWAS published before validation. We used four definitions of the validation success that differ by stringency. Associations of SNP features with validation success were consistent across the definitions. The strongest predictor of SNP validation was the level of statistical significance in the discovery GWAS. The magnitude of the effect size was associated with validation success in a non-linear manner. SNPs with risk allele frequencies in the range 30-70% showed a higher validation success rate compared to rarer or more common SNPs. Missense, 5'UTR, stop gained, and SNPs located in transcription factor binding sites had a higher validation success rate compared to intergenic, intronic and synonymous SNPs. There was a positive association between validation success and the level of evolutionary conservation of the sites. In addition, validation success was higher when discovery and validation GWASs targeted the same ethnicity. All predictors of validation success remained significant in a multivariate logistic regression model indicating their independent contribution. To conclude, we identified SNP features predicting validation success of GWAS hits. These features can be used to select SNPs for validation and downstream functional studies.
|
28
|
Privé F, Aschard H, Carmi S, Folkersen L, Hoggart C, O'Reilly PF, Vilhjálmsson BJ. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. Am J Hum Genet 2022; 109:12-23. [PMID: 34995502 PMCID: PMC8764121 DOI: 10.1016/j.ajhg.2021.11.008] [Citation(s) in RCA: 118] [Impact Index Per Article: 59.0] [Received: 05/26/2021] [Accepted: 11/04/2021] [Indexed: 12/25/2022] Open
Abstract
The low portability of polygenic scores (PGSs) across global populations is a major concern that must be addressed before PGSs can be used for everyone in the clinic. Indeed, prediction accuracy has been shown to decay as a function of the genetic distance between the training and test cohorts. However, such cohorts differ not only in their genetic distance but also in their geographical distance and their data collection and assaying, conflating multiple factors. In this study, we examine the extent to which PGSs are transferable between ancestries by deriving polygenic scores for 245 curated traits from the UK Biobank data and applying them in nine ancestry groups from the same cohort. By restricting both training and testing to the UK Biobank data, we reduce the risk of environmental and genotyping confounding from using different cohorts. We define the nine ancestry groups at a sub-continental level, based on a simple, robust, and effective method that we introduce here. We then apply two different predictive methods to derive polygenic scores for all 245 phenotypes and show a systematic and dramatic reduction in portability of PGSs trained using Northwestern European individuals and applied to nine ancestry groups. These analyses demonstrate that prediction already drops off within European ancestries and reduces globally in proportion to genetic distance. Altogether, our study provides unique and robust insights into the PGS portability problem.
Affiliation(s)
- Florian Privé
- National Centre for Register-Based Research, Aarhus University, Aarhus 8210, Denmark.
| | - Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Paris 75015, France; Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Shai Carmi
- Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
| | | | - Clive Hoggart
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Paul F O'Reilly
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Bjarni J Vilhjálmsson
- National Centre for Register-Based Research, Aarhus University, Aarhus 8210, Denmark; Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
|
29
|
Wirtz MK, Sykes R, Samples J, Edmunds B, Choi D, Keene DR, Tufa SF, Sun YY, Keller KE. Identification of Missense Extracellular Matrix Gene Variants in a Large Glaucoma Pedigree and Investigation of the N700S Thrombospondin-1 Variant in Normal and Glaucomatous Trabecular Meshwork Cells. Curr Eye Res 2022; 47:79-90. [PMID: 34143713 PMCID: PMC8733052 DOI: 10.1080/02713683.2021.1945109] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/03/2023]
Abstract
PURPOSE: Primary open-angle glaucoma (POAG) is a complex heterogeneous disease. While several POAG genes have been identified, a high proportion of estimated heritability remains unexplained. Elevated intraocular pressure (IOP) is a leading POAG risk factor and dysfunctional extracellular matrix (ECM) in the trabecular meshwork (TM) contributes to elevated IOP. In this study, we sought to identify missense variants in ECM genes that correlate with ocular hypertensive POAG. METHODS: Whole-genome sequencing was used to identify genetic variants in five members of a large POAG family (n = 68) with elevated IOP. The remaining family members were screened by Sanger sequencing. Unrelated normal (NTM) and glaucomatous (GTM) cells were sequenced for the identified variants. ECM protein levels were determined by Western immunoblotting, and confocal and electron microscopy were used to investigate ECM ultrastructural organization. RESULTS: Three ECM gene variants were significantly associated with POAG or elevated IOP in a large POAG pedigree. These included rs2228262 (N700S; thrombospondin-1 (THBS1, TSP1)), rs112913396 (D563G; collagen type VI, alpha 3 (COL6A3)) and rs34759087 (E987K; laminin subunit beta 2 (LAMB2)). Screening of unrelated TM cells (n = 27) showed higher prevalence of the THBS1 variant, but not the LAMB2 variant, in GTM cells (39%) than NTM cells (11%). The rare COL6A3 variant was not detected. TSP1 protein was upregulated and COL6A3 was down-regulated in TM cells with N700S subject to mechanical stretch, an in vitro method that mimics elevated IOP. Immunofluorescence showed increased TSP1 immunostaining in cell strains with N700S compared to wild-type TM cells. Ultrastructural studies showed ECM disorganization and altered collagen type VI distribution in GTM versus NTM cells. CONCLUSIONS: Our results suggest that missense variants in ECM genes may not cause catastrophic changes to the TM, but over many years, subtle changes in ECM may accumulate and cause structural disorganization of the outflow resistance leading to elevated IOP in POAG patients.
Affiliation(s)
- Mary K. Wirtz
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239
| | - Renee Sykes
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239
| | | | - Beth Edmunds
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239
| | - Dongseok Choi
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239; OHSU-PSU School of Public Health, Oregon Health & Science University, Portland, OR 97239; Graduate School of Dentistry, Kyung Hee University, Seoul, Korea
| | | | - Sara F. Tufa
- Shriners Hospitals for Children, Portland, OR 97239
| | - Ying Ying Sun
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239
| | - Kate E. Keller
- Casey Eye Institute, Oregon Health & Science University, Portland, OR 97239; Department of Chemical Physiology and Biochemistry, Oregon Health & Science University, Portland, OR 97239. To whom correspondence should be addressed: 503 494 2366
|
30
|
Barroso I. The importance of increasing population diversity in genetic studies of type 2 diabetes and related glycaemic traits. Diabetologia 2021; 64:2653-2664. [PMID: 34595549 PMCID: PMC8563561 DOI: 10.1007/s00125-021-05575-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/31/2020] [Accepted: 07/07/2021] [Indexed: 12/11/2022]
Abstract
Type 2 diabetes has a global prevalence, with epidemiological data suggesting that some populations have a higher risk of developing this disease. However, to date, most genetic studies of type 2 diabetes and related glycaemic traits have been performed in individuals of European ancestry. The same is true for most other complex diseases, largely due to use of 'convenience samples'. Rapid genotyping of large population cohorts and case-control studies from existing collections was performed when the genome-wide association study (GWAS) 'revolution' began, back in 2005. Although global representation has increased in the intervening 15 years, further expansion and inclusion of diverse populations in genetic and genomic studies is still needed. In this review, I discuss the progress made in incorporating multi-ancestry participants in genetic analyses of type 2 diabetes and related glycaemic traits, and associated opportunities and challenges. I also discuss how increased representation of global diversity in genetic and genomic studies is required to fulfil the promise of precision medicine for all.
Affiliation(s)
- Inês Barroso
- Exeter Centre of Excellence for Diabetes research (EXCEED), University of Exeter Medical School, Exeter, UK.
|
31
|
Ma Y, Zhou X. Genetic prediction of complex traits with polygenic scores: a statistical review. Trends Genet 2021; 37:995-1011. [PMID: 34243982 PMCID: PMC8511058 DOI: 10.1016/j.tig.2021.06.004] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 03/04/2021] [Revised: 05/31/2021] [Accepted: 06/03/2021] [Indexed: 01/03/2023]
Abstract
Accurate genetic prediction of complex traits can facilitate disease screening, improve early intervention, and aid in the development of personalized medicine. Genetic prediction of complex traits requires the development of statistical methods that can properly model polygenic architecture and construct a polygenic score (PGS). We present a comprehensive review of 46 methods for PGS construction. We connect the majority of these methods through a multiple linear regression framework which can be instrumental for understanding their prediction performance for traits with distinct genetic architectures. We discuss the practical considerations of PGS analysis as well as challenges and future directions of PGS method development. We hope our review serves as a useful reference both for statistical geneticists who develop PGS methods and for data analysts who perform PGS analysis.
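For readers new to the term, the simplest form of a polygenic score is a weighted sum of allele dosages, PGS_i = sum_j w_j * x_ij, where x_ij is the number of effect alleles (0, 1 or 2) carried by individual i at variant j and w_j is a per-variant weight. The Python sketch below illustrates only that basic form with made-up weights and genotypes; it is not any of the 46 methods covered by the review, and the function names and toy values are hypothetical.

```python
import math

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * x for w, x in zip(weights, dosages))

def standardize(scores):
    """Center scores at 0 and scale to unit sample standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [(s - mean) / sd for s in scores]

# Hypothetical per-variant weights (for example, GWAS effect estimates).
weights = [0.10, -0.05, 0.20]

# Hypothetical effect-allele counts: one row per individual, one column per variant.
genotypes = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 1],
]

scores = [polygenic_score(row, weights) for row in genotypes]
z_scores = standardize(scores)
print(scores)
print(z_scores)
```

Scores are commonly standardized within the target sample, as in `standardize`, so that effect estimates can be read per standard deviation of the score.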
Affiliation(s)
- Ying Ma
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
|
32
|
Márquez-Luna C, Gazal S, Loh PR, Kim SS, Furlotte N, Auton A, Price AL. Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. Nat Commun 2021; 12:6052. [PMID: 34663819 PMCID: PMC8523709 DOI: 10.1038/s41467-021-25171-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 07/26/2019] [Accepted: 07/16/2021] [Indexed: 12/23/2022] Open
Abstract
Polygenic risk prediction is a widely investigated topic because of its promising clinical applications. Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, including coding, conserved, regulatory, and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank (avg N = 373K as training data). LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R² = 0.144; highest R² = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (N = 1107K) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.
Affiliation(s)
- Carla Márquez-Luna
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Charles R. Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Steven Gazal
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Charles R. Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Po-Ru Loh
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Samuel S Kim
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | | | | | - Alkes L Price
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
|
33
|
Shan N, Xie Y, Song S, Jiang W, Wang Z, Hou L. A novel transcriptional risk score for risk prediction of complex human diseases. Genet Epidemiol 2021; 45:811-820. [PMID: 34245595 DOI: 10.1002/gepi.22424] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/09/2020] [Revised: 06/08/2021] [Accepted: 06/24/2021] [Indexed: 11/06/2022]
Abstract
Recently, polygenic risk scores (PRS) have been successfully used in the risk prediction of complex human diseases. Many studies incorporated internal information, such as effect size distribution, or external information, such as linkage disequilibrium, functional annotation, and pleiotropy among multiple diseases, to optimize the performance of PRS. To leverage multiomics datasets, we developed a novel flexible transcriptional risk score (TRS), in which messenger RNA expression levels were imputed and weighted for risk prediction. In simulation studies, we demonstrated that single-tissue TRS has greater prediction power than LDpred, especially when there is a large effect of gene expression on the phenotype. Multitissue TRS improves prediction accuracy when there are multiple tissues with independent contributions to disease risk. We applied our method to complex traits, including Crohn's disease, type 2 diabetes, and so on. The single-tissue TRS method outperformed LDpred and AnnoPred across the tested traits. The performance of multitissue TRS is trait-dependent. Moreover, our method can easily incorporate information from epigenomic and proteomic data upon the availability of reference datasets.
Affiliation(s)
- Nayang Shan
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China
| | - Yuhan Xie
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Shuang Song
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China
| | - Wei Jiang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Zuoheng Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Lin Hou
- Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China; MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
|
34
|
Pain O, Glanville KP, Hagenaars S, Selzam S, Fürtjes A, Coleman JRI, Rimfeld K, Breen G, Folkersen L, Lewis CM. Imputed gene expression risk scores: a functionally informed component of polygenic risk. Hum Mol Genet 2021; 30:727-738. [PMID: 33611520 PMCID: PMC8127405 DOI: 10.1093/hmg/ddab053] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/01/2020] [Revised: 02/08/2021] [Accepted: 02/15/2021] [Indexed: 11/12/2022] Open
Abstract
Integration of functional genomic annotations when estimating polygenic risk scores (PRS) can provide insight into aetiology and improve risk prediction. This study explores the predictive utility of gene expression risk scores (GeRS), calculated using imputed gene expression and transcriptome-wide association study (TWAS) results. The predictive utility of GeRS was evaluated using 12 neuropsychiatric and anthropometric outcomes measured in two target samples: UK Biobank and the Twins Early Development Study. GeRS were calculated based on imputed gene expression levels and TWAS results, using 53 gene expression-genotype panels, termed single nucleotide polymorphism (SNP)-weight sets, capturing expression across a range of tissues. We compare the predictive utility of elastic net models containing GeRS within and across SNP-weight sets, and models containing both GeRS and PRS. We estimate the proportion of SNP-based heritability attributable to cis-regulated gene expression. GeRS significantly predicted a range of outcomes, with elastic net models combining GeRS across SNP-weight sets improving prediction. GeRS were less predictive than PRS, but models combining GeRS and PRS improved prediction for several outcomes, with relative improvements ranging from 0.3% for height (P = 0.023) to 4% for rheumatoid arthritis (P = 5.9 × 10⁻⁸). The proportion of SNP-based heritability attributable to cis-regulated expression was modest for most outcomes, even when restricting GeRS to colocalized genes. GeRS represent a component of PRS and could be useful for functional stratification of genetic risk. Only in specific circumstances can GeRS substantially improve prediction over PRS alone. Future research considering functional genomic annotations when estimating genetic risk is warranted.
Affiliation(s)
- Oliver Pain
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
- NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London SE5 8AF, UK
| | - Kylie P Glanville
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Saskia Hagenaars
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Saskia Selzam
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Anna Fürtjes
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Jonathan R I Coleman
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Kaili Rimfeld
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
| | - Gerome Breen
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
- NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London SE5 8AF, UK
| | - Lasse Folkersen
- Institute of Biological Psychiatry, Sankt Hans Hospital, Copenhagen 4000 Roskilde, Denmark
| | - Cathryn M Lewis
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE5 8AF, UK
- NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London SE5 8AF, UK
- Department of Medical and Molecular Genetics, Faculty of Life Sciences and Medicine, King’s College London, London WC2R 2LS, UK
|
35
|
Trinder M, Brunham LR. Polygenic scores for dyslipidemia: the emerging genomic model of plasma lipoprotein trait inheritance. Curr Opin Lipidol 2021; 32:103-111. [PMID: 33395106 DOI: 10.1097/mol.0000000000000737] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
Abstract
PURPOSE OF REVIEW: Contemporary polygenic scores, which summarize the cumulative contribution of millions of common single-nucleotide variants to a phenotypic trait, can have effects comparable to monogenic mutations. This review focuses on the emerging use of 'genome-wide' polygenic scores for plasma lipoproteins to define the etiology of clinical dyslipidemia, modify the severity of monogenic disease, and inform therapeutic options. RECENT FINDINGS: Polygenic scores for low-density lipoprotein cholesterol (LDL-C), triglycerides, and high-density lipoprotein cholesterol are associated with severe hypercholesterolemia, hypertriglyceridemia, or hypoalphalipoproteinemia, respectively. These polygenic scores for LDL-C or triglycerides associate with risk of incident coronary artery disease (CAD) independent of polygenic scores designed specifically for CAD and may identify individuals that benefit most from lipid-lowering medication. Additionally, the severity of hypercholesterolemia and CAD associated with familial hypercholesterolemia, a common monogenic disorder, is modified by these polygenic factors. The current focus of polygenic scores for dyslipidemia is to design predictive polygenic scores for diverse populations and to determine how these polygenic scores could be implemented and standardized for use in the clinic. SUMMARY: Polygenic scores have shown early promise for the management of dyslipidemias, but several challenges need to be addressed before widespread clinical implementation to ensure that potential benefits are robust and reproducible, equitable, and cost-effective.
Affiliation(s)
- Mark Trinder
- Centre for Heart Lung Innovation, University of British Columbia
- Experimental Medicine Program, University of British Columbia
| | - Liam R Brunham
- Centre for Heart Lung Innovation, University of British Columbia
- Experimental Medicine Program, University of British Columbia
- Department of Medicine, University of British Columbia
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
|
36
|
Emami NC, Cavazos TB, Rashkin SR, Cario CL, Graff RE, Tai CG, Mefford JA, Kachuri L, Wan E, Wong S, Aaronson D, Presti J, Habel LA, Shan J, Ranatunga DK, Chao CR, Ghai NR, Jorgenson E, Sakoda LC, Kvale MN, Kwok PY, Schaefer C, Risch N, Hoffmann TJ, Van Den Eeden SK, Witte JS. A Large-Scale Association Study Detects Novel Rare Variants, Risk Genes, Functional Elements, and Polygenic Architecture of Prostate Cancer Susceptibility. Cancer Res 2021; 81:1695-1703. [PMID: 33293427 PMCID: PMC8137514 DOI: 10.1158/0008-5472.can-20-2635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/08/2020] [Revised: 10/27/2020] [Accepted: 12/02/2020] [Indexed: 11/16/2022]
Abstract
To identify rare variants associated with prostate cancer susceptibility and better characterize the mechanisms and cumulative disease risk associated with common risk variants, we conducted an integrated study of prostate cancer genetic etiology in two cohorts using custom genotyping microarrays, large imputation reference panels, and functional annotation approaches. Specifically, 11,984 men (6,196 prostate cancer cases and 5,788 controls) of European ancestry from Northern California Kaiser Permanente were genotyped and meta-analyzed with 196,269 men of European ancestry (7,917 prostate cancer cases and 188,352 controls) from the UK Biobank. Three novel loci, including two rare variants (European ancestry minor allele frequency < 0.01, at 3p21.31 and 8p12), were significant genome wide in a meta-analysis. Gene-based rare variant tests implicated a known prostate cancer gene (HOXB13), as well as a novel candidate gene (ILDR1), which encodes a receptor highly expressed in prostate tissue and is related to the B7/CD28 family of T-cell immune checkpoint markers. Haplotypic patterns of long-range linkage disequilibrium were observed for rare genetic variants at HOXB13 and other loci, reflecting their evolutionary history. In addition, a polygenic risk score (PRS) of 188 prostate cancer variants was strongly associated with risk (90th vs. 40th-60th percentile OR = 2.62, P = 2.55 × 10-191). Many of the 188 variants exhibited functional signatures of gene expression regulation or transcription factor binding, including a 6-fold difference in log-probability of androgen receptor binding at the variant rs2680708 (17q22). Rare variant and PRS associations, with concomitant functional interpretation of risk mechanisms, can help clarify the full genetic architecture of prostate cancer and other complex traits. 
SIGNIFICANCE: This study maps the biological relationships between diverse risk factors for prostate cancer, integrating different functional datasets to interpret and model genome-wide data from over 200,000 men with and without prostate cancer. See related commentary by Lachance, p. 1637.
Affiliation(s)
- Nima C Emami
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Taylor B Cavazos
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Sara R Rashkin
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Clinton L Cario
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Rebecca E Graff
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Caroline G Tai
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Joel A Mefford
  - Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
- Linda Kachuri
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Eunice Wan
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Simon Wong
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
- David Aaronson
  - Department of Urology, Kaiser Oakland Medical Center, Oakland, California
- Joseph Presti
  - Department of Urology, Kaiser Oakland Medical Center, Oakland, California
- Laurel A Habel
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Jun Shan
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Dilrini K Ranatunga
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Chun R Chao
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Nirupa R Ghai
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Eric Jorgenson
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Lori C Sakoda
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Mark N Kvale
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Pui-Yan Kwok
  - Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Catherine Schaefer
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
- Neil Risch
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
  - Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
- Thomas J Hoffmann
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Stephen K Van Den Eeden
  - Division of Research, Kaiser Permanente Northern California, Oakland, California
  - Department of Urology, University of California San Francisco, San Francisco, California
- John S Witte
  - Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
  - Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
  - Institute for Human Genetics, University of California San Francisco, San Francisco, California
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
  - Department of Urology, University of California San Francisco, San Francisco, California
|
37
|
Li J, Chaudhary DP, Khan A, Griessenauer C, Carey DJ, Zand R, Abedi V. Polygenic Risk Scores Augment Stroke Subtyping. NEUROLOGY-GENETICS 2021; 7:e560. [PMID: 33709033 PMCID: PMC7943221 DOI: 10.1212/nxg.0000000000000560] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/07/2020] [Accepted: 12/02/2020] [Indexed: 12/12/2022]
Abstract
Objective: To determine whether the polygenic risk score (PRS) derived from MEGASTROKE is associated with ischemic stroke (IS) and its subtypes in an independent tertiary health care system and to identify the PRS derived from gene sets of known biological pathways associated with IS. Methods: Controls (n = 19,806/7,484, age ≥69/79 years) and cases (n = 1,184/951 for discovery/replication) of acute IS with European ancestry and clinical risk factors were identified by leveraging the Geisinger Electronic Health Record and chart review confirmation. All Geisinger MyCode patients with age ≥69/79 years and without any stroke-related diagnostic codes were included as low-risk controls. Genetic heritability and genetic correlation between Geisinger and MEGASTROKE (EUR) were calculated from genome-wide association study summary statistics by linkage disequilibrium score regression. All PRS for any stroke (AS), any ischemic stroke (AIS), large artery stroke (LAS), cardioembolic stroke (CES), and small vessel stroke (SVS) were constructed by PRSice-2. Results: Moderate heritability (10%–20%) in the Geisinger sample and a genetic correlation between MEGASTROKE and the Geisinger cohort were identified. Variation in all 5 PRS significantly explained some of the phenotypic variation of Geisinger IS, and R² increased as the age cutoff for controls was raised. PRS_LAS, PRS_CES, and PRS_SVS derived from low-frequency common variants provided the best fit for modeling (R² = 0.015 for PRS_LAS). Gene set analyses highlighted the association of PRS with Gene Ontology terms (vascular endothelial growth factor, amyloid precursor protein, and atherosclerosis). PRS_LAS, PRS_CES, and PRS_SVS explained the most variance of the corresponding subtypes of Geisinger IS, suggesting shared etiologies and corroborating Geisinger TOAST subtyping.
Conclusions: We provide the first evidence that PRSs derived from MEGASTROKE have value in identifying shared etiologies and determining stroke subtypes.
Affiliation(s)
- Jiang Li
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- Durgesh P Chaudhary
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- Ayesha Khan
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- Christoph Griessenauer
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- David J Carey
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- Ramin Zand
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
- Vida Abedi
  - Department of Molecular and Functional Genomics (J.L., D.J.C., V.A.), Weis Center for Research, Geisinger Health System; Neuroscience Institute (D.P.C., A.K., C.G., R.Z.), Geisinger Health System, Danville, PA; Biocomplexity Institute (V.A.), Virginia Tech, Blacksburg, VA; and Research Institute of Neurointervention (C.G.), Paracelsus Medical University, Salzburg, Austria
|
38
|
Babb de Villiers C, Kroese M, Moorthie S. Understanding polygenic models, their development and the potential application of polygenic scores in healthcare. J Med Genet 2020; 57:725-732. [PMID: 32376789 PMCID: PMC7591711 DOI: 10.1136/jmedgenet-2019-106763] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/09/2019] [Revised: 03/09/2020] [Accepted: 03/28/2020] [Indexed: 02/06/2023]
Abstract
The use of genomic information to better understand and prevent common complex diseases has been an ongoing goal of genetic research. Over the past few years, research in this area has proliferated with several proposed methods of generating polygenic scores. This has been driven by the availability of larger data sets, primarily from genome-wide association studies and concomitant developments in statistical methodologies. Here we provide an overview of the methodological aspects of polygenic model construction. In addition, we consider the state of the field and implications for potential applications of polygenic scores for risk estimation within healthcare.
Affiliation(s)
- Mark Kroese
  - PHG Foundation, University of Cambridge, Cambridge, Cambridgeshire, UK
- Sowmiya Moorthie
  - PHG Foundation, University of Cambridge, Cambridge, Cambridgeshire, UK
|
39
|
Chen TH, Chatterjee N, Landi MT, Shi J. A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information. J Am Stat Assoc 2020; 116:133-143. [PMID: 34483403 DOI: 10.1080/01621459.2020.1764849] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/07/2023]
Abstract
Large-scale genome-wide association studies (GWAS) provide opportunities for developing genetic risk prediction models that have the potential to improve disease prevention, intervention or treatment. The key step is to develop polygenic risk score (PRS) models with high predictive performance for a given disease, which typically requires a large training data set for selecting truly associated single nucleotide polymorphisms (SNPs) and estimating effect sizes accurately. Here, we develop a comprehensive penalized regression framework for fitting ℓ1-regularized regression models to GWAS summary statistics. We propose incorporating Pleiotropy and ANnotation information into PRS (PANPRS) development through suitable formulation of penalty functions and associated tuning parameters. Extensive simulations show that PANPRS performs as well as or better than existing PRS methods when no functional annotation or pleiotropy is incorporated. When functional annotation data and pleiotropy are informative, PANPRS substantially outperforms existing PRS methods in simulations. Finally, we applied our methods to build PRS for type 2 diabetes and melanoma and found that incorporating relevant functional annotations and GWAS of genetically related traits improved prediction of these two complex diseases.
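For readers unfamiliar with ℓ1 regularization, the basic shrinkage step behind such penalties can be sketched in a few lines. This is a generic illustration and not the PANPRS penalty, which the abstract does not spell out: the function name `soft_threshold` and the example values are assumptions. For a single standardized predictor, minimizing (1/2)(z − b)² + λ|b| over b gives the soft-thresholded value of z, so small marginal effects are set exactly to zero and larger ones are shrunk toward zero.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Minimizer of 0.5 * (z - b)**2 + lam * abs(b) over b.

    z   : unpenalized (marginal) effect estimate for one variant
    lam : non-negative l1 penalty weight (hypothetical tuning parameter)
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if z > lam:
        return z - lam   # shrink positive effects toward zero
    if z < -lam:
        return z + lam   # shrink negative effects toward zero
    return 0.0           # effects within [-lam, lam] are dropped


# Hypothetical marginal effects for three variants, with penalty 0.25
shrunk = [soft_threshold(z, 0.25) for z in (0.5, -0.5, 0.125)]
```

With these made-up inputs the first two effects move 0.25 closer to zero and the third is removed from the model, which is how an ℓ1 penalty performs variant selection and effect-size shrinkage at the same time.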
Affiliation(s)
- Ting-Huei Chen
  - Department of Mathematics and Statistics, Regular member, Cervo Brain Research Centre, University of Laval, 1045, av. of Medicine, Suite 1056, Quebec G1V 0A6, Canada
- Nilanjan Chatterjee
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University Baltimore, Maryland, United States of America, 615 N Wolfe Street Baltimore, MD 21205
- Maria Teresa Landi
  - Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Maryland, United States of America, 9609 Medical Center Drive, RM 7E106, Bethesda, MD, 20892
- Jianxin Shi
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Maryland, United States of America, 9609 Medical Center Drive, RM 7E122, Bethesda, MD, 20892
|
40
|
Lin WY, Huang CC, Liu YL, Tsai SJ, Kuo PH. Polygenic approaches to detect gene-environment interactions when external information is unavailable. Brief Bioinform 2020; 20:2236-2252. [PMID: 30219835 PMCID: PMC6954453 DOI: 10.1093/bib/bby086] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/05/2018] [Revised: 08/14/2018] [Accepted: 08/16/2018] [Indexed: 12/18/2022] Open
Abstract
The exploration of 'gene-environment interactions' (G × E) is important for disease prediction and prevention. The scientific community usually uses external information to construct a genetic risk score (GRS), and then tests the interaction between this GRS and an environmental factor (E). However, external genome-wide association studies (GWAS) are not always available, especially for non-Caucasian ethnicity. Although GRS is an analysis tool to detect G × E in GWAS, its performance remains unclear when there is no external information. Our 'adaptive combination of Bayes factors method' (ADABF) can aggregate G × E signals and test the significance of G × E by a polygenic test. We here explore a powerful polygenic approach for G × E when external information is unavailable, by comparing our ADABF with the GRS based on marginal effects of SNPs (GRS-M) and GRS based on SNP × E interactions (GRS-I). ADABF is the most powerful method in the absence of SNP main effects, whereas GRS-M is generally the best test when SNP main effects exist. GRS-I is the least powerful test due to its data-splitting strategy. Furthermore, we apply these methods to Taiwan Biobank data. ADABF and GRS-M identified gene × alcohol and gene × smoking interactions on blood pressure (BP). BP-increasing alleles raise BP more in drinkers (smokers) than in nondrinkers (nonsmokers). This work provides guidance to choose a polygenic approach to detect G × E when external information is unavailable.
Affiliation(s)
- Wan-Yu Lin
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
  - Department of Public Health, College of Public Health, National Taiwan University, Taipei, Taiwan
- Ching-Chieh Huang
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
- Yu-Li Liu
  - Center for Neuropsychiatric Research, National Health Research Institutes, Miaoli County, Taiwan
- Shih-Jen Tsai
  - Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
  - Division of Psychiatry, National Yang-Ming University, Taipei, Taiwan
- Po-Hsiu Kuo
  - Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
  - Department of Public Health, College of Public Health, National Taiwan University, Taipei, Taiwan
|
41
|
Choi SW, Mak TSH, O'Reilly PF. Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc 2020; 15:2759-2772. [PMID: 32709988 PMCID: PMC7612115 DOI: 10.1038/s41596-020-0353-1] [Citation(s) in RCA: 795] [Impact Index Per Article: 198.8] [Received: 12/17/2018] [Accepted: 05/05/2020] [Indexed: 02/08/2023]
Abstract
A polygenic score (PGS) or polygenic risk score (PRS) is an estimate of an individual's genetic liability to a trait or disease, calculated according to their genotype profile and relevant genome-wide association study (GWAS) data. While present PRSs typically explain only a small fraction of trait variance, their correlation with the single largest contributor to phenotypic variation, genetic liability, has led to the routine application of PRSs across biomedical research. Among a range of applications, PRSs are exploited to assess shared etiology between phenotypes, to evaluate the clinical utility of genetic data for complex disease and as part of experimental studies in which, for example, experiments are performed that compare outcomes (e.g., gene expression and cellular response to treatment) between individuals with low and high PRS values. As GWAS sample sizes increase and PRSs become more powerful, PRSs are set to play a key role in research and stratified medicine. However, despite the importance and growing application of PRSs, there are limited guidelines for performing PRS analyses, which can lead to inconsistency between studies and misinterpretation of results. Here, we provide detailed guidelines for performing and interpreting PRS analyses. We outline standard quality control steps, discuss different methods for the calculation of PRSs, provide an introductory online tutorial, highlight common misconceptions relating to PRS results, offer recommendations for best practice and discuss future challenges.
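As a rough illustration of the calculation the abstract describes, a PRS in its simplest form is a weighted sum of an individual's effect-allele counts, with weights taken from GWAS effect estimates. The sketch below assumes that plain additive form; the function name, weights, and genotypes are all hypothetical, and real analyses add the quality control, clumping or shrinkage, and standardization steps that the tutorial covers.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant weights (e.g. log odds ratios from a GWAS)
weights = [0.5, -0.25, 1.0]

# Hypothetical genotypes: copies of the effect allele at each variant
person_a = [2, 0, 1]
person_b = [0, 2, 0]

score_a = polygenic_score(person_a, weights)  # 2*0.5 + 0*(-0.25) + 1*1.0 = 2.0
score_b = polygenic_score(person_b, weights)  # 0*0.5 + 2*(-0.25) + 0*1.0 = -0.5
```

Raw scores like these are only meaningful relative to a reference distribution, which is why studies typically report them as standardized values or percentiles.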
Affiliation(s)
- Shing Wan Choi
  - MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York, NY, USA
- Paul F O'Reilly
  - MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York, NY, USA
|
42
|
Chun S, Imakaev M, Hui D, Patsopoulos NA, Neale BM, Kathiresan S, Stitziel NO, Sunyaev SR. Non-parametric Polygenic Risk Prediction via Partitioned GWAS Summary Statistics. Am J Hum Genet 2020; 107:46-59. [PMID: 32470373 PMCID: PMC7332650 DOI: 10.1016/j.ajhg.2020.05.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/15/2019] [Accepted: 05/01/2020] [Indexed: 02/07/2023] Open
Abstract
In complex trait genetics, the ability to predict phenotype from genotype is the ultimate measure of our understanding of genetic architecture underlying the heritability of a trait. A complete understanding of the genetic basis of a trait should allow for predictive methods with accuracies approaching the trait's heritability. The highly polygenic nature of quantitative traits and most common phenotypes has motivated the development of statistical strategies focused on combining myriad individually non-significant genetic effects. Now that predictive accuracies are improving, there is a growing interest in the practical utility of such methods for predicting risk of common diseases responsive to early therapeutic intervention. However, existing methods require individual-level genotypes or depend on accurately specifying the genetic architecture underlying each disease to be predicted. Here, we propose a polygenic risk prediction method that does not require explicitly modeling any underlying genetic architecture. We start with summary statistics in the form of SNP effect sizes from a large GWAS cohort. We then remove the correlation structure across summary statistics arising due to linkage disequilibrium and apply a piecewise linear interpolation on conditional mean effects. In both simulated and real datasets, this new non-parametric shrinkage (NPS) method can reliably allow for linkage disequilibrium in summary statistics of 5 million dense genome-wide markers and consistently improves prediction accuracy. We show that NPS improves the identification of groups at high risk for breast cancer, type 2 diabetes, inflammatory bowel disease, and coronary heart disease, all of which have available early intervention or prevention treatments.
Affiliation(s)
- Sung Chun
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Maxim Imakaev
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
- Daniel Hui
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Systems Biology and Computer Science Program, Ann Romney Center for Neurological Diseases, Department of Neurology, Brigham & Women's Hospital, Boston, MA 02115, USA
- Nikolaos A Patsopoulos
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Systems Biology and Computer Science Program, Ann Romney Center for Neurological Diseases, Department of Neurology, Brigham & Women's Hospital, Boston, MA 02115, USA
- Benjamin M Neale
  - Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
- Sekar Kathiresan
  - Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA; Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
- Nathan O Stitziel
  - Cardiovascular Division, Department of Medicine, Washington University School of Medicine, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA; McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Shamil R Sunyaev
  - Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
|
43
|
Song L, Liu A, Shi J. SummaryAUC: a tool for evaluating the performance of polygenic risk prediction models in validation datasets with only summary level statistics. Bioinformatics 2020; 35:4038-4044. [PMID: 30911754 DOI: 10.1093/bioinformatics/btz176] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/06/2018] [Revised: 02/08/2019] [Accepted: 03/22/2019] [Indexed: 12/21/2022] Open
Abstract
MOTIVATION Polygenic risk score (PRS) methods based on genome-wide association studies (GWAS) have a potential for predicting the risk of developing complex diseases and are expected to become more accurate with larger training datasets and innovative statistical methods. The area under the ROC curve (AUC) is often used to evaluate the performance of PRSs, which requires individual genotypic and phenotypic data in an independent GWAS validation dataset. We are motivated to develop methods for approximating AUC of PRSs based on the summary level data of the validation dataset, which will greatly facilitate the development of PRS models for complex diseases. RESULTS We develop statistical methods and an R package SummaryAUC for approximating the AUC and its variance of a PRS when only the summary level data of the validation dataset are available. SummaryAUC can be applied to PRSs with SNPs either genotyped or imputed in the validation dataset. We examined the performance of SummaryAUC using a large-scale GWAS of schizophrenia. SummaryAUC provides accurate approximations to AUCs and their variances. The bias of AUC is typically <0.5% in most analyses. SummaryAUC cannot be applied to PRSs that use all SNPs in the genome because it is computationally prohibitive. AVAILABILITY AND IMPLEMENTATION https://github.com/lsncibb/SummaryAUC. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
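For context, the quantity that SummaryAUC approximates can be written down directly when individual-level data are available: the AUC equals the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes that individual-level AUC by brute force. It is not the SummaryAUC method, which works from summary statistics only, and the scores are made-up values.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 1/2)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical PRS values for 3 cases and 4 controls
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.4, 0.3, 0.8]
value = auc(cases, controls)  # 9.5 favourable pairs out of 12
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so it is suitable only for small illustrations.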
Affiliation(s)
- Lei Song
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
  - Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Aiyi Liu
  - Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, National Institute of Child Health and Human Development, Bethesda, MD, USA
- Jianxin Shi
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
|
44
|
Dai M, Wan X, Peng H, Wang Y, Liu Y, Liu J, Xu Z, Yang C. Joint analysis of individual-level and summary-level GWAS data by leveraging pleiotropy. Bioinformatics 2020; 35:1729-1736. [PMID: 30307540 DOI: 10.1093/bioinformatics/bty870] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/19/2018] [Revised: 09/06/2018] [Accepted: 10/09/2018] [Indexed: 12/18/2022] Open
Abstract
MOTIVATION A large number of recent genome-wide association studies (GWASs) for complex phenotypes confirm the early conjecture of polygenicity, suggesting the presence of a large number of variants with only tiny or moderate effects. However, due to the limited sample size of a single GWAS, many associated genetic variants are too weak to achieve genome-wide significance. These undiscovered variants further limit the prediction capability of GWAS. Restricted access to individual-level data and the increasing availability of published GWAS results motivate the development of methods integrating both individual-level and summary-level data. How the connection between the individual-level and summary-level data is built determines how efficiently the abundant existing summary-level resources can be used alongside limited individual-level data, and this issue motivates further work in this area. RESULTS In this study, we propose a novel statistical approach, LEP, which provides a new way of modeling the connection between the individual-level data and summary-level data. LEP integrates both types of data by LEveraging Pleiotropy to increase the statistical power of risk variant identification and the accuracy of risk prediction. The algorithm for parameter estimation is developed to handle genome-wide-scale data. Through comprehensive simulation studies, we demonstrated the advantages of LEP over the existing methods. We further applied LEP to perform integrative analysis of Crohn's disease from WTCCC and summary statistics from GWAS of some other diseases, such as Type 1 diabetes, Ulcerative colitis and Primary biliary cirrhosis. LEP was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.39% (±0.58%) to 68.33% (±0.32%) using about 195 000 variants. AVAILABILITY AND IMPLEMENTATION The LEP software is available at https://github.com/daviddaigithub/LEP.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mingwei Dai
  - Department of Applied Mathematics, School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
  - Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong, China
- Xiang Wan
  - ShenZhen Research Institute of Big Data, ShenZhen, China
- Hao Peng
  - School of Business Administration, Southwestern University of Finance and Economics, Chengdu, China
- Yao Wang
  - Department of Applied Mathematics, School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
- Yue Liu
  - Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, China
- Jin Liu
  - Centre for Quantitative Medicine, Program in Health Services and Systems Research, Duke-NUS Medical School, Singapore
- Zongben Xu
  - Department of Applied Mathematics, School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
- Can Yang
  - Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong, China
|
45
|
van de Geijn B, Finucane H, Gazal S, Hormozdiari F, Amariuta T, Liu X, Gusev A, Loh PR, Reshef Y, Kichaev G, Raychauduri S, Price AL. Annotations capturing cell type-specific TF binding explain a large fraction of disease heritability. Hum Mol Genet 2020; 29:1057-1067. [PMID: 31595288 PMCID: PMC7206853 DOI: 10.1093/hmg/ddz226] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/31/2019] [Revised: 08/12/2019] [Accepted: 09/10/2019] [Indexed: 12/21/2022] Open
Abstract
Regulatory variation plays a major role in complex disease, and cell type-specific binding of transcription factors (TF) is critical to gene regulation. However, assessing the contribution of genetic variation in TF-binding sites to disease heritability is challenging, as binding is often cell type-specific and annotations from directly measured TF binding are not currently available for most cell type-TF pairs. We investigate approaches to annotate TF binding, including directly measured chromatin data and sequence-based predictions. We find that TF-binding annotations constructed by intersecting sequence-based TF-binding predictions with cell type-specific chromatin data explain a large fraction of heritability across a broad set of diseases and corresponding cell types; this strategy of constructing annotations addresses both the limitation that identical sequences may be bound or unbound depending on surrounding chromatin context and the limitation that sequence-based predictions are generally not cell type-specific. We partitioned the heritability of 49 diseases and complex traits using stratified linkage disequilibrium (LD) score regression with the baseline-LD model (which is not cell type-specific) plus the new annotations. We determined that 100 bp windows around MotifMap sequence-based TF-binding predictions intersected with a union of six cell type-specific chromatin marks (imputed using ChromImpute) performed best, with a 58% increase in heritability enrichment compared to the chromatin marks alone (11.6× vs. 7.3×, P = 9 × 10^-14 for difference) and a 20% increase in cell type-specific signal conditional on annotations from the baseline-LD model (P = 8 × 10^-11 for difference). Our results show that TF-binding annotations explain substantial disease heritability and can help refine genome-wide association signals.
Affiliation(s)
- Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Hilary Finucane
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Farhad Hormozdiari
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Tiffany Amariuta
- Center for Data Sciences, Harvard Medical School, Boston, MA 02215, USA
- Divisions of Genetics, Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02215, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02215, USA
- Graduate School of Arts and Sciences, Harvard University, Boston, MA 02215, USA
- Xuanyao Liu
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Po-Ru Loh
- Brigham and Women’s Hospital, Boston, MA 02215, USA
- Yakir Reshef
- Department of Computer Science, Harvard University, Cambridge, MA 02138, USA
- Harvard/MIT MD/PhD Program, Boston, MA 02215, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
- Gleb Kichaev
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA 90095, USA
- Soumya Raychauduri
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Center for Data Sciences, Harvard Medical School, Boston, MA 02215, USA
- Divisions of Genetics, Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02215, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02215, USA
- Graduate School of Arts and Sciences, Harvard University, Boston, MA 02215, USA
- Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA

46
Li R, Chen Y, Ritchie MD, Moore JH. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet 2020; 21:493-502. [PMID: 32235907 DOI: 10.1038/s41576-020-0224-1] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Accepted: 03/02/2020] [Indexed: 01/03/2023]
Abstract
Accurate prediction of disease risk based on the genetic make-up of an individual is essential for effective prevention and personalized treatment. Nevertheless, to date, individual genetic variants from genome-wide association studies have achieved only moderate prediction of disease risk. The aggregation of genetic variants under a polygenic model shows promising improvements in prediction accuracies. Increasingly, electronic health records (EHRs) are being linked to patient genetic data in biobanks, which provides new opportunities for developing and applying polygenic risk scores in the clinic, to systematically examine and evaluate patient susceptibilities to disease. However, the heterogeneous nature of EHR data brings forth many practical challenges along every step of designing and implementing risk prediction strategies. In this Review, we present the unique considerations for using genotype and phenotype data from biobank-linked EHRs for polygenic risk prediction.
Affiliation(s)
- Ruowang Li
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Yong Chen
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Jason H Moore
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA

47
Hüls A, Czamara D. Methodological challenges in constructing DNA methylation risk scores. Epigenetics 2020; 15:1-11. [PMID: 31318318 PMCID: PMC6961658 DOI: 10.1080/15592294.2019.1644879] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 04/15/2019] [Revised: 06/28/2019] [Accepted: 07/09/2019] [Indexed: 12/23/2022] Open
Abstract
Polygenic approaches often access more variance of complex traits than is possible by single variant approaches. For genotype data, genetic risk scores (GRS) are widely used for risk prediction as well as in association and interaction studies. Recently, interest has been growing in transferring GRS approaches to DNA methylation data (methylation risk scores, MRS), which can be used 1) as biomarkers for environmental exposures, 2) in association analyses in which single CpG sites do not achieve significance, 3) as a dimension reduction approach in interaction and mediation analyses, and 4) to predict individual risks of disease or treatment success. Most GRS approaches can be transferred directly to methylation data. However, since methylation data are more sensitive to confounding, e.g. by age and tissue, it is more complex to find appropriate external weights. In this review, we outline the adaptation of current GRS approaches to methylation data and highlight the challenges that arise.
Affiliation(s)
- Anke Hüls
- Department of Human Genetics, Emory University, Atlanta, GA, USA
- Centre for Molecular Medicine and Therapeutics, BC Children’s Hospital Research Institute, and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Darina Czamara
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany

48
Janssens ACJW. Validity of polygenic risk scores: are we measuring what we think we are? Hum Mol Genet 2019; 28:R143-R150. [PMID: 31504522 PMCID: PMC7013150 DOI: 10.1093/hmg/ddz205] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 08/05/2019] [Revised: 08/14/2019] [Accepted: 08/14/2019] [Indexed: 12/16/2022] Open
Abstract
Polygenic risk scores (PRSs) have become the standard for quantifying genetic liability in the prediction of disease risks. PRSs are generally constructed as weighted sum scores of risk alleles using effect sizes from genome-wide association studies as their weights. The construction of PRSs is being improved with more appropriate selection of independent single-nucleotide polymorphisms (SNPs) and optimized estimation of their weights but is rarely reflected upon from a theoretical perspective, focusing on the validity of the risk score. Borrowing from psychometrics, this paper discusses the validity of PRSs and introduces the three main types of validity that are considered in the evaluation of tests and measurements: construct, content, and criterion validity. This introduction is followed by a discussion of three topics that challenge the validity of PRS, namely, their claimed independence of clinical risk factors, the consequences of relaxing SNP inclusion thresholds and the selection of SNP weights. This discussion of the validity of PRS reminds us that we need to keep questioning if weighted sums of risk alleles are measuring what we think they are in the various scenarios in which PRSs are used and that we need to keep exploring alternative modeling strategies that might better reflect the underlying biological pathways.
Affiliation(s)
- A Cecile J W Janssens
- Department of Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA, USA

49
Newcombe PJ, Nelson CP, Samani NJ, Dudbridge F. A flexible and parallelizable approach to genome-wide polygenic risk scores. Genet Epidemiol 2019; 43:730-741. [PMID: 31328830 PMCID: PMC6764842 DOI: 10.1002/gepi.22245] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 10/25/2018] [Revised: 05/03/2019] [Accepted: 05/30/2019] [Indexed: 01/06/2023]
Abstract
The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome-wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two-step approach to constructing genome-wide polygenic risk scores from meta-GWAS summary statistics. Local linkage disequilibrium (LD) is adjusted for in Step 1, followed by, uniquely, long-range LD in Step 2. Our algorithm is highly parallelizable, since block-wise analyses in Step 1 can be distributed across a high-performance computing cluster, and flexible, since sparsity and heritability are estimated within each block. Inference is obtained through a formal Bayesian variable selection framework, meaning final risk predictions are averaged over competing models. We compared our method to two alternative approaches, LDPred and lassosum, using all seven traits in the Wellcome Trust Case Control Consortium as well as meta-GWAS summaries for type 1 diabetes (T1D), coronary artery disease, and schizophrenia. Performance was generally similar across methods, although our framework provided more accurate predictions for T1D, for which there are multiple heterogeneous signals in regions of both short- and long-range LD. With sufficient compute resources, our method also allows the fastest runtimes.
Affiliation(s)
- Paul J. Newcombe
- MRC Biostatistics Unit, School of Clinical Medicine, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Cambridge, UK
- Christopher P. Nelson
- Department of Cardiovascular Sciences, Cardiovascular Research Centre, Glenfield Hospital, University of Leicester, Leicester, UK
- NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Nilesh J. Samani
- Department of Cardiovascular Sciences, Cardiovascular Research Centre, Glenfield Hospital, University of Leicester, Leicester, UK
- NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Frank Dudbridge
- Department of Health Sciences, Centre for Medicine, University of Leicester, Leicester, UK

50
Hormozdiari F, van de Geijn B, Nasser J, Weissbrod O, Gazal S, Ju CJT, Connor LO, Hujoel MLA, Engreitz J, Hormozdiari F, Price AL. Functional disease architectures reveal unique biological role of transposable elements. Nat Commun 2019; 10:4054. [PMID: 31492842 PMCID: PMC6731302 DOI: 10.1038/s41467-019-11957-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/04/2018] [Accepted: 08/08/2019] [Indexed: 12/19/2022] Open
Abstract
Transposable elements (TE) comprise roughly half of the human genome. Though initially derided as junk DNA, they have been widely hypothesized to contribute to the evolution of gene regulation. However, the contribution of TE to the genetic architecture of diseases remains unknown. Here, we analyze data from 41 independent diseases and complex traits to draw three conclusions. First, TE are uniquely informative for disease heritability. Despite overall depletion for heritability (54% of SNPs, 39 ± 2% of heritability), TE explain substantially more heritability than expected based on their depletion for known functional annotations. This implies that TE acquire function in ways that differ from known functional annotations. Second, older TE contribute more to disease heritability, consistent with acquiring biological function. Third, Short Interspersed Nuclear Elements (SINE) are far more enriched for blood traits than for other traits. Our results can help elucidate the biological roles that TE play in the genetic architecture of diseases.
Affiliation(s)
- Farhad Hormozdiari
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Joseph Nasser
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Chelsea J-T Ju
- Department of Computer Science, University of California, Los Angeles, CA 90095, USA
- Luke O'Connor
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Bioinformatics and Integrative Genomics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA
- Margaux L A Hujoel
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Jesse Engreitz
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Fereydoun Hormozdiari
- Department of Biochemistry and Molecular Medicine, University of California, Davis, CA 95616, USA
- MIND Institute and UC-Davis Genome Center, Davis, CA 95616, USA
- Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA