1. Yang M, Wen Y, Zheng J, Zhang J, Zhao T, Feng J. Improving power of genome-wide association studies via transforming ordinal phenotypes into continuous phenotypes. Front Plant Sci 2023; 14:1247181. [PMID: 38023883 PMCID: PMC10652869 DOI: 10.3389/fpls.2023.1247181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 10/18/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Ordinal traits are important complex traits in crops, and genome-wide association study (GWAS) is a widely used method for mining their genes. Presently, GWAS of continuous quantitative traits (C-GWAS) and single-locus association analysis of ordinal traits are the main methods used for ordinal traits. However, the detection power of these two methods is low. Methods: To address this issue, we proposed a new method, named MTOTC, in which hierarchical data of ordinal traits are transformed into continuous phenotypic data (CPData). Results: Then, FASTmrMLM, one C-GWAS method, was used to conduct GWAS for CPData. The results from the simulation studies showed that MTOTC+FASTmrMLM for ordinal traits was better than the classical methods when there were four or fewer hierarchical levels. In addition, when MTOTC was combined with FASTmrEMMA, mrMLM, ISIS EM-BLASSO, pLARmEB, and pKWmEB, relatively high power and a low false positive rate in QTN detection were observed as well. Subsequently, MTOTC was applied to analyze the hierarchical data of soybean salt-alkali tolerance. More significant QTNs were detected when MTOTC was combined with any of the above six C-GWAS methods. Discussion: Accordingly, the new method increases the choice of GWAS methods for ordinal traits and helps to mine the genes for ordinal traits in resource populations.
Affiliation(s)
- Ming Yang
  - Key Laboratory of Biology and Genetics Improvement of Soybean, Ministry of Agriculture/Zhongshan Biological Breeding Laboratory (ZSBBL)/National Innovation Platform for Soybean Breeding and Industry-Education Integration/State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization/College of Agriculture, Nanjing Agricultural University, Nanjing, China
- Yangjun Wen
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Jinchang Zheng
  - Key Laboratory of Biology and Genetics Improvement of Soybean, Ministry of Agriculture/Zhongshan Biological Breeding Laboratory (ZSBBL)/National Innovation Platform for Soybean Breeding and Industry-Education Integration/State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization/College of Agriculture, Nanjing Agricultural University, Nanjing, China
- Jin Zhang
  - College of Science, Nanjing Agricultural University, Nanjing, China
- Tuanjie Zhao
  - Key Laboratory of Biology and Genetics Improvement of Soybean, Ministry of Agriculture/Zhongshan Biological Breeding Laboratory (ZSBBL)/National Innovation Platform for Soybean Breeding and Industry-Education Integration/State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization/College of Agriculture, Nanjing Agricultural University, Nanjing, China
- Jianying Feng
  - Key Laboratory of Biology and Genetics Improvement of Soybean, Ministry of Agriculture/Zhongshan Biological Breeding Laboratory (ZSBBL)/National Innovation Platform for Soybean Breeding and Industry-Education Integration/State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization/College of Agriculture, Nanjing Agricultural University, Nanjing, China

2. Sadeqi MB, Ballvora A, Léon J. Local and Bayesian Survival FDR Estimations to Identify Reliable Associations in Whole Genome of Bread Wheat. Int J Mol Sci 2023; 24:14011. [PMID: 37762314 PMCID: PMC10531084 DOI: 10.3390/ijms241814011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 09/02/2023] [Accepted: 09/07/2023] [Indexed: 09/29/2023]
Abstract
Estimating the FDR significance threshold in genome-wide association studies remains a major challenge in distinguishing true positive hypotheses from false positive and negative errors. Several comparative methods for multiple testing comparison have been developed to determine the significance threshold; however, these methods may be overly conservative and lead to an increase in false negative results. The local FDR approach is suitable for testing many associations simultaneously based on the empirical Bayes perspective. In the local FDR, the maximum likelihood estimator is sensitive to bias when the GWAS model contains two or more explanatory variables as genetic parameters simultaneously. The main criticism of local FDR is that it focuses only locally on the effects of single nucleotide polymorphism (SNP) in tails of distribution, whereas the signal associations are distributed across the whole genome. The advantage of the Bayesian perspective is that knowledge of prior distribution comes from other genetic parameters included in the GWAS model, such as linkage disequilibrium (LD) analysis, minor allele frequency (MAF) and call rate of significant associations. We also proposed Bayesian survival FDR to solve the multi-collinearity and large-scale problems, respectively, in grain yield (GY) vector in bread wheat with large-scale SNP information. The objective of this study was to obtain a short list of SNPs that are reliably associated with GY under low and high levels of nitrogen (N) in the population. The five top significant SNPs were compared with different Bayesian models. Based on the time to events in the Bayesian survival analysis, the differentiation between minor and major alleles within the association panel can be identified.
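For orientation, the kind of global FDR thresholding that the abstract contrasts with local FDR can be illustrated by the standard Benjamini-Hochberg step-up procedure. This is a minimal sketch and not the local or Bayesian survival FDR estimators proposed in the paper; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    Illustrative helper (hypothetical name), not taken from the cited paper.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


# Made-up p-values for eight hypothetical SNP tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(example_p, alpha=0.05)
```

With these made-up values only the two smallest p-values pass their rank-specific thresholds, which shows how a global rule can discard moderately small p-values that a local or model-based estimate might treat differently.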
Affiliation(s)
- Agim Ballvora
  - INRES-Plant Breeding, Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany; (M.B.S.); (J.L.)

3. Pedersen EM, Agerbo E, Plana-Ripoll O, Steinbach J, Krebs MD, Hougaard DM, Werge T, Nordentoft M, Børglum AD, Musliner KL, Ganna A, Schork AJ, Mortensen PB, McGrath JJ, Privé F, Vilhjálmsson BJ. ADuLT: An efficient and robust time-to-event GWAS. Nat Commun 2023; 14:5553. [PMID: 37689771 PMCID: PMC10492844 DOI: 10.1038/s41467-023-41210-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/31/2022] [Accepted: 08/28/2023] [Indexed: 09/11/2023]
Abstract
Proportional hazards models have been proposed to analyse time-to-event phenotypes in genome-wide association studies (GWAS). However, little is known about the ability of proportional hazards models to identify genetic associations under different generative models and when ascertainment is present. Here we propose the age-dependent liability threshold (ADuLT) model as an alternative to a Cox regression based GWAS, here represented by SPACox. We compare ADuLT, SPACox, and standard case-control GWAS in simulations under two generative models and with varying degrees of ascertainment as well as in the iPSYCH cohort. We find Cox regression GWAS to be underpowered when cases are strongly ascertained (cases are oversampled by a factor 5), regardless of the generative model used. ADuLT is robust to ascertainment in all simulated scenarios. Then, we analyse four psychiatric disorders in iPSYCH, ADHD, Autism, Depression, and Schizophrenia, with a strong case-ascertainment. Across these psychiatric disorders, ADuLT identifies 20 independent genome-wide significant associations, case-control GWAS finds 17, and SPACox finds 8, which is consistent with simulation results. As more genetic data are being linked to electronic health records, robust GWAS methods that can make use of age-of-onset information will help increase power in analyses for common health outcomes.
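The liability threshold idea behind this abstract can be sketched in a few lines. The sketch assumes a standard normal latent liability and assumes that the threshold at a given age is set by the cumulative incidence up to that age; the function names and the cumulative incidence values are hypothetical and are not taken from the ADuLT implementation.

```python
from statistics import NormalDist


def liability_threshold(cumulative_incidence):
    """Threshold on a standard normal liability scale.

    Assumption: a fraction `cumulative_incidence` of the population lies
    above the threshold, so the threshold is the (1 - incidence) quantile.
    """
    return NormalDist().inv_cdf(1.0 - cumulative_incidence)


def is_case(liability, cumulative_incidence):
    """An individual is a case when the latent liability exceeds the threshold."""
    return liability > liability_threshold(cumulative_incidence)


# Hypothetical cumulative incidence proportions by age (illustrative only).
cip_by_age = {10: 0.01, 20: 0.03, 40: 0.05}
thresholds = {age: liability_threshold(cip) for age, cip in cip_by_age.items()}
```

Because cumulative incidence rises with age, the threshold falls with age in this sketch, so an early age of onset implies a higher liability than a late one. That is the extra information a plain case-control coding discards.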
Affiliation(s)
- Emil M Pedersen
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Esben Agerbo
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Centre for Integrated Register-based Research at Aarhus University, Aarhus, Denmark
- Oleguer Plana-Ripoll
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Department of Clinical Epidemiology, Aarhus University and Aarhus University Hospital, Aarhus, Denmark
- Jette Steinbach
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
- Morten D Krebs
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
- David M Hougaard
  - Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Thomas Werge
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
  - Department of Clinical Sciences, Copenhagen University, Copenhagen, Denmark
  - Section for Geogenetics, GLOBE Institute, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
- Merete Nordentoft
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - CORE- Copenhagen Centre for Research in Mental Health, Mental Health Center-Copenhagen, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
- Anders D Børglum
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Department of Biomedicine and iSEQ Centre, Aarhus University, Aarhus, Denmark
  - Center for Genomics and Personalized Medicine, CGPM, Aarhus University, Aarhus, Denmark
- Katherine L Musliner
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Department of Affective Disorders, Aarhus University Hospital-Psychiatry, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Andrea Ganna
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Andrew J Schork
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
  - Section for Geogenetics, GLOBE Institute, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
  - Neurogenomics Division, The Translational Genomics Research Institute (TGEN), Phoenix, AZ, USA
- Preben B Mortensen
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- John J McGrath
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Queensland Brain Institute, University of Queensland, St Lucia, QLD, Australia
  - Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Wacol, QLD, Australia
- Florian Privé
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Bjarni J Vilhjálmsson
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, the Broad Institute of MIT and Harvard, Massachusetts, USA

4. Gusev A. Germline mechanisms of immunotherapy toxicities in the era of genome-wide association studies. Immunol Rev 2023; 318:138-156. [PMID: 37515388 DOI: 10.1111/imr.13253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023]
Abstract
Cancer immunotherapy has revolutionized the treatment of advanced cancers and is quickly becoming an option for early-stage disease. By reactivating the host immune system, immunotherapy harnesses patients' innate defenses to eradicate the tumor. By putatively similar mechanisms, immunotherapy can also substantially increase the risk of toxicities or immune-related adverse events (irAEs). Severe irAEs can lead to hospitalization, treatment discontinuation, lifelong immune complications, or even death. Many irAEs present with similar symptoms to heritable autoimmune diseases, suggesting that germline genetics may contribute to their onset. Recently, genome-wide association studies (GWAS) of irAEs have identified common germline associations and putative mechanisms, lending support to this hypothesis. A wide range of well-established GWAS methods can potentially be harnessed to understand the etiology of irAEs specifically and immunotherapy outcomes broadly. This review summarizes current findings regarding germline effects on immunotherapy outcomes and discusses opportunities and challenges for leveraging germline genetics to understand, predict, and treat irAEs.
Affiliation(s)
- Alexander Gusev
  - Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, USA
  - Division of Genetics, Brigham & Women's Hospital, Boston, Massachusetts, USA
  - The Broad Institute, Cambridge, Massachusetts, USA

5. Arbeev KG, Ukraintseva S, Bagley O, Duan H, Wu D, Akushevich I, Stallard E, Kulminski A, Christensen K, Feitosa MF, O’Connell JR, Parker D, Whitson H, Yashin AI. Interactions between genes involved in physiological dysregulation and axon guidance: role in Alzheimer's disease. Front Genet 2023; 14:1236509. [PMID: 37719713 PMCID: PMC10500346 DOI: 10.3389/fgene.2023.1236509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 08/17/2023] [Indexed: 09/19/2023]
Abstract
Dysregulation of physiological processes may contribute to Alzheimer's disease (AD) development. We previously found that an increase in the level of physiological dysregulation (PD) in the aging body is associated with declining resilience and robustness to major diseases. Also, our genome-wide association study found that genes associated with the age-related increase in PD frequently represented pathways implicated in axon guidance and synaptic function, which in turn were linked to AD and related traits (e.g., amyloid, tau, neurodegeneration) in the literature. Here, we tested the hypothesis that genes involved in PD and axon guidance/synapse function may jointly influence onset of AD. We assessed the impact of interactions between SNPs in such genes on AD onset in the Long Life Family Study and sought to replicate the findings in the Health and Retirement Study. We found significant interactions between SNPs in the UNC5C and CNTN6, and PLXNA4 and EPHB2 genes that influenced AD onset in both datasets. Associations with individual SNPs were not statistically significant. Our findings, thus, support a major role of genetic interactions in the heterogeneity of AD and suggest the joint contribution of genes involved in PD and axon guidance/synapse function (essential for the maintenance of complex neural networks) to AD development.
Affiliation(s)
- Konstantin G. Arbeev
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Svetlana Ukraintseva
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Olivia Bagley
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Hongzhe Duan
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Deqing Wu
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Igor Akushevich
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Eric Stallard
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Alexander Kulminski
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States
- Kaare Christensen
  - Danish Aging Research Center, Department of Public Health, University of Southern Denmark, Odense, Denmark
- Mary F. Feitosa
  - Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, MO, United States
- Jeffrey R. O’Connell
  - Division of Endocrinology, Diabetes and Nutrition and Program for Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, United States
- Daniel Parker
  - Duke Center for the Study of Aging and Human Development, Duke University, Durham, NC, United States
- Heather Whitson
  - Duke Center for the Study of Aging and Human Development, Duke University, Durham, NC, United States
  - Durham VA Geriatrics Research Education and Clinical Center, Durham, NC, United States
- Anatoliy I. Yashin
  - Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, United States

6. Wang X, Khurshid S, Choi SH, Friedman S, Weng LC, Reeder C, Pirruccello JP, Singh P, Lau ES, Venn R, Diamant N, Di Achille P, Philippakis A, Anderson CD, Ho JE, Ellinor PT, Batra P, Lubitz SA. Genetic Susceptibility to Atrial Fibrillation Identified via Deep Learning of 12-Lead Electrocardiograms. Circ Genom Precis Med 2023; 16:340-349. [PMID: 37278238 PMCID: PMC10524395 DOI: 10.1161/circgen.122.003808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 04/11/2023] [Indexed: 06/07/2023]
Abstract
BACKGROUND: Artificial intelligence (AI) models applied to 12-lead ECG waveforms can predict atrial fibrillation (AF), a heritable and morbid arrhythmia. However, the factors forming the basis of risk predictions from AI models are usually not well understood. We hypothesized that there might be a genetic basis for an AI algorithm for predicting the 5-year risk of new-onset AF using 12-lead ECGs (ECG-AI)-based risk estimates. METHODS: We applied a validated ECG-AI model for predicting incident AF to ECGs from 39 986 UK Biobank participants without AF. We then performed a genome-wide association study (GWAS) of the predicted AF risk and compared it with an AF GWAS and a GWAS of risk estimates from a clinical variable model. RESULTS: In the ECG-AI GWAS, we identified 3 signals (P < 5×10⁻⁸) at established AF susceptibility loci marked by the sarcomeric gene TTN and sodium channel genes SCN5A and SCN10A. We also identified 2 novel loci near the genes VGLL2 and EXT1. In contrast, the clinical variable model prediction GWAS indicated a different genetic profile. In genetic correlation analysis, the prediction from the ECG-AI model was estimated to have a higher correlation with AF than that from the clinical variable model. CONCLUSIONS: Predicted AF risk from an ECG-AI model is influenced by genetic variation implicating sarcomeric, ion channel and body height pathways. ECG-AI models may identify individuals at risk for disease via specific biological pathways.
Affiliation(s)
- Xin Wang
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
- Shaan Khurshid
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
  - Division of Cardiology, Massachusetts General Hospital, Boston
- Seung Hoan Choi
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
- Samuel Friedman
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
- Lu-Chen Weng
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
- James P. Pirruccello
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
  - Division of Cardiology, Massachusetts General Hospital, Boston
- Pulkit Singh
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
- Emily S. Lau
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
  - Division of Cardiology, Massachusetts General Hospital, Boston
- Rachael Venn
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Division of Cardiology, Massachusetts General Hospital, Boston
- Nate Diamant
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
- Paolo Di Achille
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
- Anthony Philippakis
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
  - Eric & Wendy Schmidt Ctr, The Broad Institute of MIT & Harvard, Cambridge
- Christopher D. Anderson
  - Dept of Neurology, Brigham and Women’s Hospital
  - Ctr for Genomic Medicine, Massachusetts General Hospital, Boston
  - Henry & Allison McCance Ctr for Brain Health, Massachusetts General Hospital, Boston
- Jennifer E. Ho
  - CardioVascular Institute & Division of Cardiology, Dept of Medicine, Beth Israel Deaconess Medical Ctr, Boston, MA
- Patrick T. Ellinor
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
  - Demoulas Ctr for Cardiac Arrhythmias, Massachusetts General Hospital, Boston
- Puneet Batra
  - Data Sciences Platform, The Broad Institute of MIT & Harvard, Cambridge
- Steven A. Lubitz
  - Cardiovascular Research Ctr, Massachusetts General Hospital, Boston
  - Cardiovascular Disease Initiative, The Broad Institute of MIT & Harvard, Cambridge
  - Demoulas Ctr for Cardiac Arrhythmias, Massachusetts General Hospital, Boston

7. Juodakis J, Ytterberg K, Flatley C, Sole-Navais P, Jacobsson B. Time-varying effects are common in genetic control of gestational duration. Hum Mol Genet 2023; 32:2399-2407. [PMID: 37195282 PMCID: PMC10321382 DOI: 10.1093/hmg/ddad086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 05/10/2023] [Accepted: 05/15/2023] [Indexed: 05/18/2023]
Abstract
Preterm birth is a major burden to neonatal health worldwide, determined in part by genetics. Recently, studies discovered several genes associated with this trait or its continuous equivalent, gestational duration. However, their effect timing, and thus clinical importance, is still unclear. Here, we use genotyping data of 31 000 births from the Norwegian Mother, Father and Child cohort (MoBa) to investigate different models of the genetic pregnancy 'clock'. We conduct genome-wide association studies using gestational duration or preterm birth, replicating known maternal associations and finding one new fetal variant. We illustrate how the interpretation of these results is complicated by the loss of power when dichotomizing. Using flexible survival models, we resolve this complexity and find that many of the known loci have time-varying effects, often stronger early in pregnancy. The overall polygenic control of birth timing appears to be shared in the term and preterm, but not very preterm, periods and exploratory results suggest involvement of the major histocompatibility complex genes in the latter. These findings show that the known gestational duration loci are clinically relevant and should help design further experimental studies.
Affiliation(s)
- Julius Juodakis
  - Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg 416 50, Sweden
- Karin Ytterberg
  - Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg 416 50, Sweden
- Christopher Flatley
  - Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg 416 50, Sweden
- Pol Sole-Navais
  - Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg 416 50, Sweden
- Bo Jacobsson
  - Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg 416 50, Sweden
  - Department of Genetics and Bioinformatics, Division of Health Data and Digitalisation, Norwegian Institute of Public Health, Oslo 0456, Norway

8. Li YJ, Nuytemans K, La JO, Jiang R, Slifer SH, Sun S, Naj A, Gao XR, Martin ER. Identification of novel genes for age-at-onset of Alzheimer's disease by combining quantitative and survival trait analyses. Alzheimers Dement 2023; 19:3148-3157. [PMID: 36738287 DOI: 10.1002/alz.12927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 12/06/2022] [Accepted: 12/19/2022] [Indexed: 02/05/2023]
Abstract
INTRODUCTION: Our understanding of the genetic predisposition for age-at-onset (AAO) of Alzheimer's disease (AD) is limited. Here, we sought to identify genes modifying AAO and examined whether any have sex-specific effects. METHODS: Genome-wide association analyses were performed on imputed genetic data of 9219 AD cases and 10,345 controls from 20 cohorts of the Alzheimer's Disease Genetics Consortium. AAO was modeled from cases directly and as a survival outcome. RESULTS: We identified 11 genome-wide significant loci (P < 5 × 10⁻⁸), including six known AD-risk genes and five novel loci, UMAD1, LUZP2, ARFGEF2, DSCAM, and 4q25, affecting AAO of AD. Additionally, 39 suggestive loci showed strong association. Twelve loci showed sex-specific effects on AAO including CD300LG and MLX/TUBG2 for females and MIR4445 for males. DISCUSSION: Genes that influence AAO of AD are excellent therapeutic targets for delaying onset of AD. Several loci identified include genes with promising functional implications for AD.
Affiliation(s)
- Yi-Ju Li
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, North Carolina, USA
- Karen Nuytemans
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, USA
  - John T. MacDonald Foundation Department of Human Genetics, University of Miami, Miller School of Medicine, Miami, Florida, USA
- Jong Ok La
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, North Carolina, USA
- Rong Jiang
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, North Carolina, USA
  - Department of Psychiatry and Behavior Science, Duke University School of Medicine, Durham, North Carolina, USA
- Susan H Slifer
  - John T. MacDonald Foundation Department of Human Genetics, University of Miami, Miller School of Medicine, Miami, Florida, USA
- Shuming Sun
  - Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, North Carolina, USA
- Adam Naj
  - Department of Biostatistics, Epidemiology, and Informatics, Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Xiaoyi Raymond Gao
  - Department of Ophthalmology and Visual Sciences, Division of Human Genetics, The Ohio State University, Columbus, Ohio, USA
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA
- Eden R Martin
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, USA
  - John T. MacDonald Foundation Department of Human Genetics, University of Miami, Miller School of Medicine, Miami, Florida, USA

9. Hof JP, Vermeulen SH, Coolen ACC, Galesloot TE. Fast and accurate recurrent event analysis for genome-wide association studies. Genet Epidemiol 2023. [PMID: 37060326 DOI: 10.1002/gepi.22525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 03/05/2023] [Accepted: 03/28/2023] [Indexed: 04/16/2023]
Abstract
Many diseases recur after recovery, for example, recurrences in cancer and infections. However, research is often focused on analysing only time-to-first recurrence, thereby ignoring any subsequent recurrences that may occur after the first. Statistical models for the analysis of recurrent events are available, of which the extended Cox proportional hazards frailty model is the current state-of-the-art. However, this model is too statistically complex for computationally efficient application in high-dimensional data sets, including genome-wide association studies (GWAS). Here, we develop an application for fast and accurate recurrent event analysis in GWAS, called SPARE (SaddlePoint Approximation for Recurrent Event analysis). In SPARE, every DNA variant is tested for association with recurrence risk using a modified score statistic. A saddlepoint approximation is implemented to achieve statistical accuracy. SPARE controls the Type I error, and its statistical power is similar to existing recurrent event models, yet SPARE is significantly faster. An application of SPARE in a recurrent event GWAS on bladder cancer for 6.2 million DNA variants in 1,443 individuals required less than 15 min, whereas existing recurrent event methods would require several weeks.
Affiliation(s)
- Jasper P Hof
  - Department for Health Evidence, Radboud University Medical Center, Nijmegen, The Netherlands
- Sita H Vermeulen
  - Department for Health Evidence, Radboud University Medical Center, Nijmegen, The Netherlands
- Anthony C C Coolen
  - Department of Biophysics, Donders Institute, Radboud University, Nijmegen, The Netherlands
- Tessel E Galesloot
  - Department for Health Evidence, Radboud University Medical Center, Nijmegen, The Netherlands

10. Juodakis J, Ytterberg K, Flatley C, Sole-Navais P, Jacobsson B. Time-varying effects are common in genetic control of gestational duration. medRxiv 2023:2023.02.07.23285609. [PMID: 36798334 PMCID: PMC9934791 DOI: 10.1101/2023.02.07.23285609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Preterm birth is a major burden to neonatal health worldwide, determined in part by genetics. Recently, studies discovered several genes associated with this trait or its continuous equivalent - gestational duration. However, their effect timing, and thus clinical importance, is still unclear. Here, we use genotyping data of 31,000 births from the Norwegian Mother, Father and Child cohort (MoBa) to investigate different models of the genetic pregnancy "clock". We conduct genome-wide association studies using gestational duration or preterm birth, replicating known maternal associations and finding one new foetal variant. We illustrate how the interpretation of these results is complicated by the loss of power when dichotomizing. Using flexible survival models, we resolve this complexity and find that many of the known loci have time-varying effects, often stronger early in pregnancy. The overall polygenic control of birth timing appears to be shared in the term and preterm, but not very preterm periods, and exploratory results suggest involvement of the major histocompatibility complex genes in the latter. These findings show that the known gestational duration loci are clinically relevant, and should help design further experimental studies.
Affiliation(s)
- Julius Juodakis
- Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Karin Ytterberg
- Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Christopher Flatley
- Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Pol Sole-Navais
- Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Bo Jacobsson
- Department of Obstetrics and Gynecology, Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
- Department of Genetics and Bioinformatics, Division of Health Data and Digitalisation, Norwegian Institute of Public Health, Oslo, Norway
11
Ramos J, Caywood LJ, Prough MB, Clouse JE, Herington SD, Slifer SH, Fuzzell MD, Fuzzell SL, Hochstetler SD, Miskimen KL, Main LR, Osterman MD, Zaman AF, Whitehead PL, Adams LD, Laux RA, Song YE, Foroud TM, Mayeux RP, George-Hyslop PS, Ogrocki PK, Lerner AJ, Vance JM, Cuccaro ML, Haines JL, Pericak-Vance MA, Scott WK. Genetic variants in the SHISA6 gene are associated with delayed cognitive impairment in two family datasets. Alzheimers Dement 2023; 19:611-620. [PMID: 35490390 PMCID: PMC9622429 DOI: 10.1002/alz.12686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Revised: 03/08/2022] [Accepted: 03/28/2022] [Indexed: 11/12/2022]
Abstract
INTRODUCTION Studies of cognitive impairment (CI) in Amish communities have identified sibships containing CI and cognitively unimpaired (CU) individuals. We hypothesize that CU individuals may carry protective alleles delaying age at onset (AAO) of CI. METHODS A total of 1522 individuals screened for CI were genotyped. The outcome studied was AAO for CI individuals or age at last normal exam for CU individuals. Cox mixed-effects models examined association between age and single nucleotide variants (SNVs). RESULTS Three SNVs were significantly associated (P < 5 × 10⁻⁸) with AAO on chromosomes 6 (rs14538074; hazard ratio [HR] = 3.35), 9 (rs534551495; HR = 2.82), and 17 (rs146729640; HR = 6.38). The chromosome 17 association was replicated in the independent National Institute on Aging Genetics Initiative for Late-Onset Alzheimer's Disease dataset. DISCUSSION The replicated genome-wide significant association with AAO on chromosome 17 is located in the SHISA6 gene, which is involved in post-synaptic transmission in the hippocampus and is a biologically plausible candidate gene for Alzheimer's disease.
Affiliation(s)
- Jairo Ramos
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Laura J. Caywood
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Michael B. Prough
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Jason E. Clouse
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Sharlene D. Herington
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Susan H. Slifer
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - M. Denise Fuzzell
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Sarada L. Fuzzell
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | | | | | - Leighanne R. Main
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Michael D. Osterman
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
| | - Andrew F. Zaman
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Patrice L. Whitehead
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Larry D. Adams
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Renee A. Laux
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Yeunjoo E. Song
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Tatiana M. Foroud
- Indiana Alzheimer’s Disease Center, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Richard P. Mayeux
- Taub Institute on Alzheimer’s Disease and the Aging Brain, Department of Neurology, Columbia University, New York, NY, USA
- Gertrude H. Sergievsky Center, Columbia University, New York, NY, USA
- Department of Neurology, Columbia University, New York, NY, USA
| | | | - Paula K. Ogrocki
- University Hospitals Cleveland Medical Center, Cleveland, OH, USA
| | - Alan J. Lerner
- University Hospitals Cleveland Medical Center, Cleveland, OH, USA
| | - Jeffery M. Vance
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- The Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Michael L. Cuccaro
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- The Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Jonathan L. Haines
- Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, USA
| | - Margaret A. Pericak-Vance
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- The Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL, USA
| | - William K. Scott
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- The Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL, USA
12
Dey R, Zhou W, Kiiskinen T, Havulinna A, Elliott A, Karjalainen J, Kurki M, Qin A, Lee S, Palotie A, Neale B, Daly M, Lin X. Efficient and accurate frailty model approach for genome-wide survival association analysis in large-scale biobanks. Nat Commun 2022; 13:5437. [PMID: 36114182 PMCID: PMC9481565 DOI: 10.1038/s41467-022-32885-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 08/22/2022] [Indexed: 01/11/2023]
Abstract
With decades of electronic health records linked to genetic data, large biobanks provide unprecedented opportunities for systematically understanding the genetics of the natural history of complex diseases. Genome-wide survival association analysis can identify genetic variants associated with ages of onset, disease progression and lifespan. We propose an efficient and accurate frailty model approach for genome-wide survival association analysis of censored time-to-event (TTE) phenotypes by accounting for both population structure and relatedness. Our method utilizes state-of-the-art optimization strategies to reduce the computational cost. The saddlepoint approximation is used to allow for analysis of heavily censored phenotypes (>90%) and low frequency variants (down to minor allele count 20). We demonstrate the performance of our method through extensive simulation studies and analysis of five TTE phenotypes, including lifespan, with heavy censoring rates (90.9% to 99.8%) on ~400,000 UK Biobank participants with white British ancestry and ~180,000 individuals in FinnGen. We further analyzed 871 TTE phenotypes in the UK Biobank and presented the genome-wide scale phenome-wide association results with the PheWeb browser.
Affiliation(s)
- Rounak Dey
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Wei Zhou
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Tuomo Kiiskinen
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Aki Havulinna
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Amanda Elliott
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Juha Karjalainen
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Mitja Kurki
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Ashley Qin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Seunggeun Lee
- Graduate School of Data Science, Seoul National University, Seoul, Korea
| | - Aarno Palotie
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Benjamin Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Mark Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Statistics, Harvard University, Cambridge, MA, USA.
13
Nazarian A, Philipp I, Culminskaya I, He L, Kulminski AM. Inter- and intra-chromosomal modulators of the APOE ε2 and ε4 effects on the Alzheimer's disease risk. GeroScience 2022; 45:233-247. [PMID: 35809216 PMCID: PMC9886755 DOI: 10.1007/s11357-022-00617-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/15/2022] [Accepted: 06/24/2022] [Indexed: 02/03/2023]
Abstract
The mechanisms of incomplete penetrance of risk-modifying impacts of apolipoprotein E (APOE) ε2 and ε4 alleles on Alzheimer's disease (AD) have not been fully understood. We performed genome-wide analysis of differences in linkage disequilibrium (LD) patterns between 6,136 AD-affected and 10,555 AD-unaffected subjects from five independent studies to explore whether the association of the APOE ε2 allele (encoded by rs7412 polymorphism) and ε4 allele (encoded by rs429358 polymorphism) with AD was modulated by autosomal polymorphisms. The LD analysis identified 24 (mostly inter-chromosomal) and 57 (primarily intra-chromosomal) autosomal polymorphisms with significant differences in LD with either rs7412 or rs429358, respectively, between AD-affected and AD-unaffected subjects, indicating their potential modulatory roles. Our Cox regression analysis showed that minor alleles of four inter-chromosomal and ten intra-chromosomal polymorphisms exerted significant modulating effects on the ε2- and ε4-associated AD risks, respectively, and identified ε2-independent (rs2884183 polymorphism, 11q22.3) and ε4-independent (rs483082 polymorphism, 19q13.32) associations with AD. Our functional analysis highlighted ε2- and/or ε4-linked processes affecting the lipid and lipoprotein metabolism and cell junction organization which may contribute to AD pathogenesis. These findings provide insights into the ε2- and ε4-associated mechanisms of AD pathogenesis, underlying their incomplete penetrance.
Affiliation(s)
- Alireza Nazarian
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC, 27705, USA.
| | - Ian Philipp
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC 27705 USA
| | - Irina Culminskaya
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC 27705 USA
| | - Liang He
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC 27705 USA
| | - Alexander M. Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC 27705 USA
14
Pedersen EM, Agerbo E, Plana-Ripoll O, Grove J, Dreier JW, Musliner KL, Bækvad-Hansen M, Athanasiadis G, Schork A, Bybjerg-Grauholm J, Hougaard DM, Werge T, Nordentoft M, Mors O, Dalsgaard S, Christensen J, Børglum AD, Mortensen PB, McGrath JJ, Privé F, Vilhjálmsson BJ. Accounting for age of onset and family history improves power in genome-wide association studies. Am J Hum Genet 2022; 109:417-432. [PMID: 35139346 PMCID: PMC8948165 DOI: 10.1016/j.ajhg.2022.01.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Accepted: 01/07/2022] [Indexed: 11/01/2022]
Abstract
Genome-wide association studies (GWASs) have revolutionized human genetics, allowing researchers to identify thousands of disease-related genes and possible drug targets. However, case-control status does not account for the fact that not all controls may have lived through their period of risk for the disorder of interest. This can be quantified by examining the age-of-onset distribution and the age of the controls or the age of onset for cases. The age-of-onset distribution may also depend on information such as sex and birth year. In addition, family history is not routinely included in the assessment of control status. Here, we present LT-FH++, an extension of the liability threshold model conditioned on family history (LT-FH), which jointly accounts for age of onset and sex as well as family history. Using simulations, we show that, when family history and the age-of-onset distribution are available, the proposed approach yields statistically significant power gains over LT-FH and large power gains over genome-wide association study by proxy (GWAX). We applied our method to four psychiatric disorders available in the iPSYCH data and to mortality in the UK Biobank and found 20 genome-wide significant associations with LT-FH++, compared to ten for LT-FH and eight for a standard case-control GWAS. As more genetic data with linked electronic health records become available to researchers, we expect methods that account for additional health information, such as LT-FH++, to become even more beneficial.
Affiliation(s)
- Emil M Pedersen
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark.
| | - Esben Agerbo
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
| | - Oleguer Plana-Ripoll
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
| | - Jakob Grove
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark; Department of Biomedicine and Center for Integrative Sequencing, Aarhus University, 8000 Aarhus, Denmark; Center for Genomics and Personalized Medicine, Aarhus University, 8000 Aarhus, Denmark
| | - Julie W Dreier
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
| | - Katherine L Musliner
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
| | - Marie Bækvad-Hansen
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
| | - Georgios Athanasiadis
- Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark
| | - Andrew Schork
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark
| | - Jonas Bybjerg-Grauholm
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
| | - David M Hougaard
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
| | - Thomas Werge
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark; Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Merete Nordentoft
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Mental Health Services in the Capital Region of Denmark, Mental Health Center Copenhagen, University of Copenhagen, 2100 Copenhagen, Denmark
| | - Ole Mors
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Psychosis Research Unit, Aarhus University Hospital, 8245 Risskov, Denmark
| | - Søren Dalsgaard
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
| | - Jakob Christensen
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Department of Neurology, Aarhus University Hospital, 8200 Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
| | - Anders D Børglum
- Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Center for Genomics and Personalized Medicine, Aarhus University, 8000 Aarhus, Denmark; Department of Biomedicine - Human Genetics, Aarhus University, 8000 Aarhus, Denmark
| | - Preben B Mortensen
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
| | - John J McGrath
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Queensland Brain Institute, University of Queensland, St Lucia, QLD 4072, Australia; Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Wacol, QLD 4076, Australia
| | - Florian Privé
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
| | - Bjarni J Vilhjálmsson
- National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark; Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark; Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark.
15
Bi W, Lee S. Scalable and Robust Regression Methods for Phenome-Wide Association Analysis on Large-Scale Biobank Data. Front Genet 2021; 12:682638. [PMID: 34211504 PMCID: PMC8239389 DOI: 10.3389/fgene.2021.682638] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/18/2021] [Accepted: 05/17/2021] [Indexed: 02/05/2023]
Abstract
With the advances in genotyping technologies and electronic health records (EHRs), large biobanks have been great resources to identify novel genetic associations and gene-environment interactions on a genome-wide and even a phenome-wide scale. To date, several phenome-wide association studies (PheWAS) have been performed on biobank data, which provides comprehensive insights into many aspects of human genetics and biology. Although inspiring, PheWAS on large-scale biobank data encounters new challenges including computational burden, unbalanced phenotypic distribution, and genetic relationship. In this paper, we first discuss these new challenges and their potential impact on data analysis. Then, we summarize approaches that are scalable and robust in GWAS and PheWAS. This review can serve as a practical guide for geneticists, epidemiologists, and other medical researchers to identify genetic variations associated with health-related phenotypes in large-scale biobank data analysis. Meanwhile, it can also help statisticians to gain a comprehensive and up-to-date understanding of the current technical tool development.
Affiliation(s)
- Wenjian Bi
- Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing, China
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, United States
| | - Seunggeun Lee
- Graduate School of Data Science, Seoul National University, Seoul, South Korea
16
He L, Davila-Velderrain J, Sumida TS, Hafler DA, Kellis M, Kulminski AM. NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data. Commun Biol 2021; 4:629. [PMID: 34040149 PMCID: PMC8155058 DOI: 10.1038/s42003-021-02146-6] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/09/2020] [Accepted: 04/19/2021] [Indexed: 11/18/2022]
Abstract
The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer's disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data.
Affiliation(s)
- Liang He
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA.
| | - Jose Davila-Velderrain
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
| | - Tomokazu S Sumida
- Departments of Neurology and Immunobiology, Yale School of Medicine, New Haven, CT, USA
- Department of Cardiovascular Medicine, University of Tokyo Graduate School of Medicine, Tokyo, Japan
| | - David A Hafler
- Departments of Neurology and Immunobiology, Yale School of Medicine, New Haven, CT, USA
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA.
| | - Alexander M Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA.
17
Ojavee SE, Kousathanas A, Trejo Banos D, Orliac EJ, Patxot M, Läll K, Mägi R, Fischer K, Kutalik Z, Robinson MR. Genomic architecture and prediction of censored time-to-event phenotypes with a Bayesian genome-wide analysis. Nat Commun 2021; 12:2337. [PMID: 33879782 PMCID: PMC8058085 DOI: 10.1038/s41467-021-22538-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/03/2020] [Accepted: 03/17/2021] [Indexed: 01/18/2023]
Abstract
While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches.
Affiliation(s)
- Sven E Ojavee
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
| | | | - Daniel Trejo Banos
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | - Etienne J Orliac
- Scientific Computing and Research Support Unit, University of Lausanne, Lausanne, Switzerland
| | - Marion Patxot
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | - Kristi Läll
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Reedik Mägi
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Krista Fischer
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
- Institute of Mathematics and Statistics, University of Tartu, Tartu, Estonia
| | - Zoltan Kutalik
- University Center for Primary Care and Public Health, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
18
He L, Loika Y, Park Y, Bennett DA, Kellis M, Kulminski AM. Exome-wide age-of-onset analysis reveals exonic variants in ERN1 and SPPL2C associated with Alzheimer's disease. Transl Psychiatry 2021; 11:146. [PMID: 33637690 PMCID: PMC7910483 DOI: 10.1038/s41398-021-01263-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/07/2020] [Revised: 01/07/2021] [Accepted: 02/03/2021] [Indexed: 01/31/2023]
Abstract
Despite recent discoveries in genome-wide association studies (GWAS) of genomic variants associated with Alzheimer's disease (AD), its underlying biological mechanisms are still elusive. The discovery of novel AD-associated genetic variants, particularly in coding regions and from APOE ε4 non-carriers, is critical for understanding the pathology of AD. In this study, we carried out an exome-wide association analysis of age-of-onset of AD with ~20,000 subjects and placed more emphasis on APOE ε4 non-carriers. Using Cox mixed-effects models, we find that age-of-onset shows a stronger genetic signal than AD case-control status, capturing many known variants with stronger significance, and also revealing new variants. We identified two novel variants, rs56201815, a rare synonymous variant in ERN1, and rs12373123, a common missense variant in SPPL2C in the MAPT region in APOE ε4 non-carriers. Besides, a rare missense variant rs144292455 in TACR3 showed the consistent direction of effect sizes across all studies with a suggestive significant level. In an attempt to unravel their regulatory and biological functions, we found that the minor allele of rs56201815 was associated with lower average FDG uptake across five brain regions in ADNI. Our eQTL analyses based on 6198 gene expression samples from ROSMAP and GTEx revealed that the minor allele of rs56201815 was potentially associated with elevated expression of ERN1, a key gene triggering unfolded protein response (UPR), in multiple brain regions, including the posterior cingulate cortex and nucleus accumbens. Our cell-type-specific eQTL analysis using ~80,000 single nuclei in the prefrontal cortex revealed that the protective minor allele of rs12373123 significantly increased the expression of GRN in microglia, and was associated with MAPT expression in astrocytes. 
These findings provide novel evidence supporting the hypothesis of the potential involvement of the UPR to ER stress in the pathological pathway of AD, and also give more insights into underlying regulatory mechanisms behind the pleiotropic effects of rs12373123 in multiple degenerative diseases including AD and Parkinson's disease.
Affiliation(s)
- Liang He
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA.
| | - Yury Loika
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA
| | - Yongjin Park
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA
| | - Manolis Kellis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA.
| | - Alexander M Kulminski
- Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Durham, NC, USA.