1. Veller C, Przeworski M, Coop G. Causal interpretations of family GWAS in the presence of heterogeneous effects. Proc Natl Acad Sci U S A 2024; 121:e2401379121. [PMID: 39269774 PMCID: PMC11420194 DOI: 10.1073/pnas.2401379121] [Received: 01/21/2024] [Accepted: 07/26/2024] [Indexed: 09/15/2024]
Abstract
Family-based genome-wide association studies (GWASs) are often claimed to provide an unbiased estimate of the average causal effects (or average treatment effects; ATEs) of alleles, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. We show that this claim does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. This feature will matter if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in linkage disequilibrium patterns. At a single locus, family-based GWAS can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores (PGSs), however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate of the LATE for any subset or weighted average of families. In practice, the potential biases of a family-based GWAS are likely smaller than those that can arise from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, their causal interpretation is less straightforward than has been widely appreciated.
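The gap between the ATE and the LATE described in this abstract can be illustrated with a small calculation. The following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: a single biallelic locus, parents in Hardy-Weinberg proportions with random mating, and just two hypothetical average effects (`beta_het` for children with at least one heterozygous parent, `beta_hom` for all other children). All function names and numbers are illustrative.

```python
def share_with_het_parent(p):
    """Probability that a child has at least one heterozygous parent.

    Assumes Hardy-Weinberg proportions and random mating (an assumption
    made for this sketch only).
    """
    het = 2 * p * (1 - p)          # frequency of heterozygous parents
    return 1 - (1 - het) ** 2      # at least one of two parents is heterozygous


def ate_and_late(p, beta_het, beta_hom):
    """Return (ATE, LATE) for the toy two-group model.

    beta_het: hypothetical average effect in children of heterozygotes
    beta_hom: hypothetical average effect in children of homozygotes
    """
    w = share_with_het_parent(p)
    ate = w * beta_het + (1 - w) * beta_hom   # average over all children
    late = beta_het                           # only this group is randomized
    return ate, late


ate, late = ate_and_late(p=0.5, beta_het=1.0, beta_hom=0.2)
print(f"ATE = {ate:.2f}, LATE = {late:.2f}, gap = {late - ate:.2f}")
# prints "ATE = 0.80, LATE = 1.00, gap = 0.20"
```

In this toy model the gap vanishes whenever the two groups share the same average effect, which matches the abstract's point that the distinction only matters under heterogeneous effects.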
Affiliation(s)
- Carl Veller
  - Department of Ecology & Evolution, University of Chicago, Chicago, IL 60637
- Molly Przeworski
  - Department of Biological Sciences, Columbia University, New York, NY 10027
  - Department of Systems Biology, Columbia University, New York, NY 10032
- Graham Coop
  - Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, CA 95616
2. Baytar AA, Yanar EG, Frary A, Doğanlar S. Association mapping and candidate gene identification for yield traits in European hazelnut (Corylus avellana L.). Plant Direct 2024; 8:e625. [PMID: 39170862 PMCID: PMC11336203 DOI: 10.1002/pld3.625] [Received: 02/12/2024] [Revised: 07/10/2024] [Accepted: 07/11/2024] [Indexed: 08/23/2024]
Abstract
European hazelnut (Corylus avellana L.) is an important nut crop due to its nutritional benefits, culinary uses, and economic value. Türkiye is the leading producer of hazelnut, followed by Italy and the United States. Quantitative trait locus studies offer promising opportunities for breeders and geneticists to identify genomic regions controlling desirable traits in hazelnut. A genome-wide association analysis was conducted with 5,567 single nucleotide polymorphisms on a Turkish core set of 86 hazelnut accessions, revealing 189 quantitative trait nucleotides (QTNs) associated with 22 of 31 traits (p < 2.9E-07). These QTNs were associated with plant and leaf, phenological, reproductive, nut, and kernel traits. Based on the close physical distance of QTNs associated with the same trait, we identified 23 quantitative trait loci. Furthermore, we identified 23 loci of multiple QTs comprising chromosome locations associated with more than one trait at the same position or in close proximity. A total of 159 candidate genes were identified for 189 QTNs, with 122 of them containing significant conserved protein domains. Some candidate matches to known proteins/domains were highly significant, suggesting that they have similar functions as their matches. This comprehensive study provides valuable insights for the development of breeding strategies and the improvement of hazelnut and enhances the understanding of the genetic architecture of complex traits by proposing candidate genes and potential functions.
Affiliation(s)
- Asena Akköse Baytar
  - Department of Molecular Biology and Genetics, Faculty of Science, Izmir Institute of Technology, Izmir, Türkiye
- Ertuğrul Gazi Yanar
  - Department of Molecular Biology and Genetics, Faculty of Science, Izmir Institute of Technology, Izmir, Türkiye
- Anne Frary
  - Department of Molecular Biology and Genetics, Faculty of Science, Izmir Institute of Technology, Izmir, Türkiye
- Sami Doğanlar
  - Department of Molecular Biology and Genetics, Faculty of Science, Izmir Institute of Technology, Izmir, Türkiye
  - Plant Science and Technology Application and Research Center, Izmir Institute of Technology, Izmir, Türkiye
3. Hillary RF, Gadd DA, Kuncheva Z, Mangelis T, Lin T, Ferber K, McLaughlin H, Runz H, Marioni RE, Foley CN, Sun BB. Systematic discovery of gene-environment interactions underlying the human plasma proteome in UK Biobank. Nat Commun 2024; 15:7346. [PMID: 39187491 PMCID: PMC11347662 DOI: 10.1038/s41467-024-51744-5] [Received: 03/07/2024] [Accepted: 08/14/2024] [Indexed: 08/28/2024]
Abstract
Understanding how gene-environment interactions (GEIs) influence the circulating proteome could aid in biomarker discovery and validation. The presence of GEIs can be inferred from single nucleotide polymorphisms that associate with phenotypic variability - termed variance quantitative trait loci (vQTLs). Here, vQTL association studies are performed on plasma levels of 1463 proteins in 52,363 UK Biobank participants. A set of 677 independent vQTLs are identified across 568 proteins. They include 67 variants that lack conventional additive main effects on protein levels. Over 1100 GEIs are identified between 101 proteins and 153 environmental exposures. GEI analyses uncover possible mechanisms that explain why 13/67 vQTL-only sites lack corresponding main effects. Additional analyses also highlight how age, sex, epistatic interactions and statistical artefacts may underscore associations between genetic variation and variance heterogeneity. This study establishes the most comprehensive database yet of vQTLs and GEIs for the human proteome.
Affiliation(s)
- Robert F Hillary
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Danni A Gadd
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Zhana Kuncheva
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
  - Bayes Centre, The University of Edinburgh, Edinburgh, EH8 9BT, UK
- Tasos Mangelis
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
  - Bayes Centre, The University of Edinburgh, Edinburgh, EH8 9BT, UK
- Tinchi Lin
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Kyle Ferber
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Helen McLaughlin
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Heiko Runz
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Riccardo E Marioni
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
- Christopher N Foley
  - Optima Partners, Edinburgh, EH2 4HQ, UK
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
  - Bayes Centre, The University of Edinburgh, Edinburgh, EH8 9BT, UK
- Benjamin B Sun
  - Translational Sciences, Research and Development, Biogen Inc., Cambridge, MA, USA
  - Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
4. Pazokitoroudi A, Liu Z, Dahl A, Zaitlen N, Rosset S, Sankararaman S. A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits. Am J Hum Genet 2024; 111:1462-1480. [PMID: 38866020 PMCID: PMC11267529 DOI: 10.1016/j.ajhg.2024.05.015] [Received: 09/20/2023] [Revised: 05/15/2024] [Accepted: 05/15/2024] [Indexed: 06/14/2024]
Abstract
Understanding the contribution of gene-environment interactions (GxE) to complex trait variation can provide insights into disease mechanisms, explain sources of heritability, and improve genetic risk prediction. While large biobanks with genetic and deep phenotypic data hold promise for obtaining novel insights into GxE, our understanding of GxE architecture in complex traits remains limited. We introduce a method to estimate the proportion of trait variance explained by GxE (GxE heritability) and additive genetic effects (additive heritability) across the genome and within specific genomic annotations. We show that our method is accurate in simulations and computationally efficient for biobank-scale datasets. We applied our method to common array SNPs (MAF ≥1%), fifty quantitative traits, and four environmental variables (smoking, sex, age, and statin usage) in unrelated white British individuals in the UK Biobank. We found 68 trait-E pairs with significant genome-wide GxE heritability (p<0.05/200) with a ratio of GxE to additive heritability of ≈6.8% on average. Analyzing ≈8 million imputed SNPs (MAF ≥0.1%), we documented an approximate 28% increase in genome-wide GxE heritability compared to array SNPs. We partitioned GxE heritability across minor allele frequency (MAF) and local linkage disequilibrium (LD) values, revealing that, like additive allelic effects, GxE allelic effects tend to increase with decreasing MAF and LD. Analyzing GxE heritability near genes highly expressed in specific tissues, we find significant brain-specific enrichment for body mass index (BMI) and basal metabolic rate in the context of smoking and adipose-specific enrichment for waist-hip ratio (WHR) in the context of sex.
Affiliation(s)
- Ali Pazokitoroudi
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Zhengtong Liu
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Andrew Dahl
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Noah Zaitlen
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Neurology, UCLA, Los Angeles, CA, USA
- Saharon Rosset
  - Department of Statistics, Tel-Aviv University, Tel-Aviv, Israel
- Sriram Sankararaman
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
5. Hou K, Xu Z, Ding Y, Mandla R, Shi Z, Boulier K, Harpak A, Pasaniuc B. Calibrated prediction intervals for polygenic scores across diverse contexts. Nat Genet 2024; 56:1386-1396. [PMID: 38886587 DOI: 10.1038/s41588-024-01792-w] [Received: 08/03/2023] [Accepted: 05/08/2024] [Indexed: 06/20/2024]
Abstract
Polygenic scores (PGS) have emerged as the tool of choice for genomic prediction in a wide range of fields. We show that PGS performance varies broadly across contexts and biobanks. Contexts such as age, sex and income can impact PGS accuracy with similar magnitudes as genetic ancestry. Here we introduce an approach (CalPred) that models all contexts jointly to produce prediction intervals that vary across contexts to achieve calibration (include the trait with 90% probability), whereas existing methods are miscalibrated. In analyses of 72 traits across large and diverse biobanks (All of Us and UK Biobank), we find that prediction intervals required adjustment by up to 80% for quantitative traits. For disease traits, PGS-based predictions were miscalibrated across socioeconomic contexts such as annual household income levels, further highlighting the need of accounting for context information in PGS-based prediction across diverse populations.
Affiliation(s)
- Kangcheng Hou
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Ziqi Xu
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Yi Ding
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Ravi Mandla
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Zhuozheng Shi
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Kristin Boulier
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Arbel Harpak
  - Department of Population Health, The University of Texas at Austin, Austin, TX, USA
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
- Bogdan Pasaniuc
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Institute for Precision Health, University of California Los Angeles, Los Angeles, CA, USA
6. Yin B, Jia J, Sun X, Hu X, Ao M, Liu W, Tian Z, Liu H, Li D, Tian W, Hao Y, Xia X, Sade N, Brotman Y, Fernie AR, Chen J, He Z, Chen W. Dynamic metabolite QTL analyses provide novel biochemical insights into kernel development and nutritional quality improvement in common wheat. Plant Communications 2024; 5:100792. [PMID: 38173227 PMCID: PMC11121174 DOI: 10.1016/j.xplc.2024.100792] [Received: 08/20/2023] [Revised: 12/20/2023] [Accepted: 01/01/2024] [Indexed: 01/05/2024]
Abstract
Despite recent advances in crop metabolomics, the genetic control and molecular basis of the wheat kernel metabolome at different developmental stages remain largely unknown. Here, we performed widely targeted metabolite profiling of kernels from three developmental stages (grain-filling kernels [FKs], mature kernels [MKs], and germinating kernels [GKs]) using a population of 159 recombinant inbred lines. We detected 625 annotated metabolites and mapped 3173, 3143, and 2644 metabolite quantitative trait loci (mQTLs) in FKs, MKs, and GKs, respectively. Only 52 mQTLs were mapped at all three stages, indicating the high stage specificity of the wheat kernel metabolome. Four candidate genes were functionally validated by in vitro enzymatic reactions and/or transgenic approaches in wheat, three of which mediated the tricin metabolic pathway. Metabolite flux efficiencies within the tricin pathway were evaluated, and superior candidate haplotypes were identified, comprehensively delineating the tricin metabolism pathway in wheat. Finally, additional wheat metabolic pathways were re-constructed by updating them to incorporate the 177 candidate genes identified in this study. Our work provides new information on variations in the wheat kernel metabolome and important molecular resources for improvement of wheat nutritional quality.
Affiliation(s)
- Bo Yin
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Jingqi Jia
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Xu Sun
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Xin Hu
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Min Ao
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Wei Liu
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Zhitao Tian
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Hongbo Liu
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
- Dongqin Li
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
- Wenfei Tian
  - National Wheat Improvement Center, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- Yuanfeng Hao
  - National Wheat Improvement Center, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- Xianchun Xia
  - National Wheat Improvement Center, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- Nir Sade
  - School of Plant Sciences and Food Security, The Institute for Cereal Crops Improvement, Tel Aviv University, Tel Aviv 69978, Israel
- Yariv Brotman
  - School of Plant Sciences and Food Security, The Institute for Cereal Crops Improvement, Tel Aviv University, Tel Aviv 69978, Israel
- Alisdair R Fernie
  - Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany
- Jie Chen
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Yazhouwan National Laboratory, Sanya 572025, China
- Zhonghu He
  - National Wheat Improvement Center, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- Wei Chen
  - National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
7. Kemper KE, Sidorenko J, Wang H, Hayes BJ, Wray NR, Yengo L, Keller MC, Goddard M, Visscher PM. Genetic influence on within-person longitudinal change in anthropometric traits in the UK Biobank. Nat Commun 2024; 15:3776. [PMID: 38710707 PMCID: PMC11074304 DOI: 10.1038/s41467-024-47802-7] [Received: 05/11/2023] [Accepted: 04/10/2024] [Indexed: 05/08/2024]
Abstract
The causes of temporal fluctuations in adult traits are poorly understood. Here, we investigate the genetic determinants of within-person trait variability of 8 repeatedly measured anthropometric traits in 50,117 individuals from the UK Biobank. We found that within-person (non-directional) variability had a SNP-based heritability of 2-5% for height, sitting height, body mass index (BMI) and weight (P ≤ 2.4 × 10⁻³). We also analysed longitudinal trait change and show a loss of both average height and weight beyond about 70 years of age. A variant tracking the Alzheimer's risk APOE-ε4 allele (rs429358) was significantly associated with weight loss (β = -0.047 kg per yr, s.e. 0.007, P = 2.2 × 10⁻¹¹), and using 2-sample Mendelian Randomisation we detected a relationship consistent with causality between decreased lumbar spine bone mineral density and height loss (b_xy = 0.011, s.e. 0.003, P = 3.5 × 10⁻⁴). Finally, population-level variance quantitative trait loci (vQTL) were consistent with within-person variability for several traits, indicating an overlap between trait variability assessed at the population or individual level. Our findings help elucidate the genetic influence on trait-change within an individual and highlight disease risks associated with these changes.
Affiliation(s)
- Kathryn E Kemper
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Julia Sidorenko
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Huanwei Wang
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD, Australia
- Naomi R Wray
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
  - Department of Psychiatry, University of Oxford, Oxford, UK
- Loic Yengo
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Matthew C Keller
  - Institute for Behavioral Genetics, University of Colorado, Boulder, CO, USA
- Michael Goddard
  - Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, VIC, Australia
  - Biosciences Research Division, Agriculture Victoria, Bundoora, VIC, Australia
- Peter M Visscher
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK
8. Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. Genome Med 2024; 16:62. [PMID: 38664839 PMCID: PMC11044415 DOI: 10.1186/s13073-024-01329-0] [Received: 11/30/2023] [Accepted: 04/02/2024] [Indexed: 04/28/2024]
Abstract
The "missing" heritability of complex traits may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. We propose a new kernel-based method called Latent Interaction Testing (LIT) to screen for genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Using simulated data, we demonstrate that LIT increases power to detect latent genetic interactions compared to univariate methods. We then apply LIT to obesity-related traits in the UK Biobank and detect variants with interactive effects near known obesity-related genes (URL: https://CRAN.R-project.org/package=lit).
Affiliation(s)
- Andrew J Bass
  - Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Shijia Bian
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
- Aliza P Wingo
  - Department of Psychiatry, Emory University, Atlanta, GA, 30322, USA
- Thomas S Wingo
  - Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
  - Department of Neurology, Emory University, Atlanta, GA, 30322, USA
- David J Cutler
  - Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Michael P Epstein
  - Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
9. Durvasula A, Price AL. Distinct explanations underlie gene-environment interactions in the UK Biobank. medRxiv 2024:2023.09.22.23295969. [PMID: 37790574 PMCID: PMC10543037 DOI: 10.1101/2023.09.22.23295969] [Indexed: 10/05/2023]
Abstract
The role of gene-environment (GxE) interaction in disease and complex trait architectures is widely hypothesized, but currently unknown. Here, we apply three statistical approaches to quantify and distinguish three different types of GxE interaction for a given trait and E variable. First, we detect locus-specific GxE interaction by testing for genetic correlation r_g < 1 across E bins. Second, we detect genome-wide effects of the E variable on genetic variance by leveraging polygenic risk scores (PRS) to test for significant PRSxE in a regression of phenotypes on PRS, E, and PRSxE, together with differences in SNP-heritability across E bins. Third, we detect genome-wide proportional amplification of genetic and environmental effects as a function of the E variable by testing for significant PRSxE with no differences in SNP-heritability across E bins. Simulations show that these approaches achieve high sensitivity and specificity in distinguishing these three GxE scenarios. We applied our framework to 33 UK Biobank traits (25 quantitative traits and 8 diseases; average N = 325K) and 10 E variables spanning lifestyle, diet, and other environmental exposures. First, we identified 19 trait-E pairs with r_g significantly < 1 (FDR < 5%) (average r_g = 0.95); for example, white blood cell count had r_g = 0.95 (s.e. 0.01) between smokers and non-smokers. Second, we identified 28 trait-E pairs with significant PRSxE and significant SNP-heritability differences across E bins; for example, BMI had a significant PRSxE for physical activity (P = 4.6e-5) with 5% larger SNP-heritability in the largest versus smallest quintiles of physical activity (P = 7e-4). Third, we identified 15 trait-E pairs with significant PRSxE with no SNP-heritability differences across E bins; for example, waist-hip ratio adjusted for BMI had a significant PRSxE effect for time spent watching television (P = 5e-3) with no SNP-heritability differences. Across the three scenarios, 8 of the trait-E pairs involved disease traits, whose interpretation is complicated by scale effects. Analyses using biological sex as the E variable produced additional significant findings in each of the three scenarios. Overall, we infer a significant contribution of GxE and GxSex effects to complex trait and disease variance.
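The "regression of phenotypes on PRS, E, and PRSxE" mentioned in this abstract can be sketched as an ordinary least-squares fit with an interaction column. The code below is an illustrative sketch only, not the authors' pipeline: it uses a hand-written normal-equations solver, hypothetical noise-free toy data, and estimates coefficients without the significance testing the abstract describes. The names `ols` and `fit_prs_by_e` are made up for this example.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def fit_prs_by_e(prs, e, y):
    """Fit y ~ intercept + PRS + E + PRS*E and return named coefficients."""
    X = [[1.0, p, x, p * x] for p, x in zip(prs, e)]
    return dict(zip(["intercept", "PRS", "E", "PRSxE"], ols(X, y)))


# Hypothetical toy data: the PRS slope is 2.0 when E = 0 and 2.3 when E = 1.
prs = [-1.5, -0.5, 0.0, 0.5, 1.5, -1.0, 1.0, 0.25]
e = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1.0 + 2.0 * p + 0.5 * x + 0.3 * p * x for p, x in zip(prs, e)]

coef = fit_prs_by_e(prs, e, y)
print({name: round(value, 6) for name, value in coef.items()})
```

A nonzero `PRSxE` coefficient means the slope of the phenotype on the PRS changes with E. In real data the estimate would be accompanied by a standard error and a test, and, as the abstract notes, it would be interpreted alongside SNP-heritability differences across E bins.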
Affiliation(s)
- Arun Durvasula
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Department of Genetics, Harvard Medical School, Cambridge, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alkes L Price
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
10. Willems YE, Raffington L, Ligthart L, Pool R, Hottenga JJ, Finkenauer C, Bartels M. No gene by stressful life events interaction on individual differences in adults' self-control. Front Psychiatry 2024; 15:1388264. [PMID: 38693999 PMCID: PMC11061522 DOI: 10.3389/fpsyt.2024.1388264] [Received: 02/19/2024] [Accepted: 04/03/2024] [Indexed: 05/03/2024]
Abstract
Background: Difficulty with self-control, or the ability to alter impulses and behavior in a goal-directed way, predicts interpersonal conflict, lower socioeconomic attainments, and more adverse health outcomes. Etiological understanding of, and intervention for, low self-control is, therefore, a public health goal. A prominent developmental theory proposes that individuals with high genetic propensity for low self-control who are also exposed to stressful environments may be most at risk of low levels of self-control. Here we examine if polygenic measures associated with behaviors marked by low self-control interact with stressful life events in predicting self-control. Methods: We leveraged molecular data from a large population-based Dutch sample (N = 7,090, M_age = 41.2) to test for effects of genetics (i.e., polygenic scores for ADHD and aggression), stressful life events (e.g., traffic accident, violent assault, financial problems), and a gene-by-stress interaction on self-control (measured with the ASEBA Self-Control Scale). Results: Both genetics (β = .03–.04, p < .001) and stressful life events (β = .11–.14, p < .001) were associated with individual differences in self-control. We find no evidence of a gene-by-stressful life events interaction on individual differences in adults' self-control. Conclusion: Our findings are consistent with the notion that genetic influences and stressful life events exert largely independent effects on adult self-control. However, the small effect sizes of polygenic scores increase the likelihood of null results. Genetically-informed longitudinal research in large samples can further inform the etiology of individual differences in self-control from early childhood into later adulthood and its downstream implications for public health.
Affiliation(s)
- Yayouk Eva Willems
  - Max Planck Institute for Human Development, Max Planck Research Group Biosocial – Biology, Social Disparities, and Development, Berlin, Germany
- Laurel Raffington
  - Max Planck Institute for Human Development, Max Planck Research Group Biosocial – Biology, Social Disparities, and Development, Berlin, Germany
- Lannie Ligthart
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Rene Pool
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Jouke Jan Hottenga
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Catrin Finkenauer
  - Department of Interdisciplinary Social Science, Universiteit Utrecht, Utrecht, Netherlands
- Meike Bartels
  - Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam University Medical Centres, Amsterdam, Netherlands
11
Zhang X, Bell JT. Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (Bethesda) 2024; 14:jkae022. [PMID: 38289865 PMCID: PMC10989912 DOI: 10.1093/g3journal/jkae022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
Genetically associated phenotypic variability has been widely observed across organisms and traits, including in humans. Both gene-gene and gene-environment interactions can lead to an increase in genetically associated phenotypic variability. Therefore, detecting the underlying genetic variants, or variance Quantitative Trait Loci (vQTLs), can provide novel insights into complex traits. Established approaches to detect vQTLs apply different methodologies from variance-only approaches to mean-variance joint tests, but a comprehensive comparison of these methods is lacking. Here, we review available methods to detect vQTLs in humans, carry out a simulation study to assess their performance under different biological scenarios of gene-environment interactions, and apply the optimal approaches for vQTL identification to gene expression data. Overall, with a minor allele frequency (MAF) of less than 0.2, the squared residual value linear model (SVLM) and the deviation regression model (DRM) are optimal when the data follow normal and non-normal distributions, respectively. In addition, the Brown-Forsythe (BF) test is one of the optimal methods when the MAF is 0.2 or larger, irrespective of phenotype distribution. Additionally, a larger sample size and more balanced sample distribution in different exposure categories increase the power of BF, SVLM, and DRM. Our results highlight vQTL detection methods that perform optimally under realistic simulation settings and show that their relative performance depends on the phenotype distribution, allele frequency, sample size, and the type of exposure in the interaction model underlying the vQTL.
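As a rough illustration of the Brown-Forsythe (BF) test named in this abstract, the sketch below computes the BF statistic as a one-way ANOVA F statistic on absolute deviations from each genotype group's median. The function name `brown_forsythe_f` and the toy phenotype values are assumptions made for illustration and are not taken from the paper; turning the statistic into a p-value would additionally require the F distribution with (k - 1, N - k) degrees of freedom.

```python
from statistics import mean, median


def brown_forsythe_f(groups):
    """Brown-Forsythe F statistic for equality of variances.

    groups: list of lists of phenotype values, one list per genotype
            class (e.g. 0, 1, 2 copies of the minor allele).
    """
    # Absolute deviation of every observation from its own group median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n_total
    # One-way ANOVA on the deviations: between- and within-group mean squares.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in z for v in g) / (n_total - k)
    return between / within


# Hypothetical phenotypes for two genotype classes with different spread.
toy_groups = [[9, 10, 11], [5, 10, 15]]
print(brown_forsythe_f(toy_groups))
```

A larger statistic indicates that phenotype spread differs more between genotype classes, which is the signal a vQTL scan looks for.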
Affiliation(s)
- Xiaopu Zhang
  - Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Jordana T Bell
  - Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
12
Schönherr S, Schachtl-Riess JF, Di Maio S, Filosi M, Mark M, Lamina C, Fuchsberger C, Kronenberg F, Forer L. Performing highly parallelized and reproducible GWAS analysis on biobank-scale data. NAR Genom Bioinform 2024; 6:lqae015. [PMID: 38327871 PMCID: PMC10849172 DOI: 10.1093/nargab/lqae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/21/2023] [Accepted: 01/24/2024] [Indexed: 02/09/2024] Open
Abstract
Genome-wide association studies (GWAS) are transforming genetic research and enable the detection of novel genotype-phenotype relationships. In the last two decades, over 60 000 genetic associations across thousands of traits have been discovered using a GWAS approach. Due to increasing sample sizes, researchers are increasingly faced with computational challenges. A reproducible, modular and extensible pipeline with a focus on parallelization is essential to simplify data analysis and to allow researchers to devote their time to other essential tasks. Here we present nf-gwas, a Nextflow pipeline to run biobank-scale GWAS analysis. The pipeline automatically performs numerous pre- and post-processing steps, integrates regression modeling from the REGENIE package and supports single-variant, gene-based and interaction testing. It includes extensive reporting functionality that allows users to inspect thousands of phenotypes and navigate interactive Manhattan plots directly in the web browser. The pipeline is tested using the unit-style testing framework nf-test, a crucial requirement in clinical and pharmaceutical settings. Furthermore, we validated the pipeline against published GWAS datasets and benchmarked the pipeline on high-performance computing and cloud infrastructures to provide cost estimations to end users. nf-gwas is a highly parallelized, scalable and well-tested Nextflow pipeline to perform GWAS analysis in a reproducible manner.
Affiliation(s)
- Sebastian Schönherr
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Johanna F Schachtl-Riess
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Silvia Di Maio
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Michele Filosi
  - Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy. Institutional Address: Via Alessandro Volta, 21, 39100 Bolzano BZ, Italy
- Marvin Mark
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Claudia Lamina
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Christian Fuchsberger
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
  - Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy. Institutional Address: Via Alessandro Volta, 21, 39100 Bolzano BZ, Italy
- Florian Kronenberg
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
- Lukas Forer
  - Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria. Institutional Address: Schoepfstrasse 41, A-6020 Innsbruck, Austria
13
Miao J, Wu Y, Lu Q. Statistical methods for gene-environment interaction analysis. Wiley Interdiscip Rev Comput Stat 2024; 16:e1635. [PMID: 38699459 PMCID: PMC11064894 DOI: 10.1002/wics.1635] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/08/2023] [Accepted: 09/12/2023] [Indexed: 05/05/2024]
Abstract
Most human complex phenotypes result from multiple genetic and environmental factors and their interactions. Understanding the mechanisms by which genetic and environmental factors interact offers valuable insights into the genetic architecture of complex traits and holds great potential for advancing precision medicine. The emergence of large population biobanks has led to the development of numerous statistical methods aiming at identifying gene-environment interactions (G × E). In this review, we present state-of-the-art statistical methodologies for G × E analysis. We will survey a spectrum of approaches for single-variant G × E mapping, followed by various techniques for polygenic G × E analysis. We conclude this review with a discussion on the future directions and challenges in G × E research.
Affiliation(s)
- Jiacheng Miao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin, USA
- Yixuan Wu
  - University of Wisconsin–Madison, Madison, Wisconsin, USA
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin, USA
  - Department of Statistics, University of Wisconsin–Madison, Madison, Wisconsin, USA
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, Wisconsin, USA
14
Veller C, Przeworski M, Coop G. Causal interpretations of family GWAS in the presence of heterogeneous effects. bioRxiv 2023:2023.11.13.566950. [PMID: 38014124 PMCID: PMC10680648 DOI: 10.1101/2023.11.13.566950] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/29/2023]
Abstract
Family-based genome-wide association studies (GWAS) have emerged as a gold standard for assessing causal effects of alleles and polygenic scores. Notably, family studies are often claimed to provide an unbiased estimate of the average causal effect (or average treatment effect; ATE) of an allele, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. Here, we show that this interpretation does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. Consequently, if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in LD patterns, family studies provide a biased estimate of the average effect in the sample. At a single locus, family-based association studies can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores, however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate for any subset or weighted average of families. Instead, family-based studies can be reinterpreted as enabling an unbiased estimate of the extent to which Mendelian segregation at loci in the PGS contributes to the population-level variance in the trait. Because this estimate does not include the between-family variance, however, this interpretation applies to only (roughly) half of the sample PGS variance. 
In practice, the potential biases of a family-based GWAS are likely smaller than those arising from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, the causal interpretation of family-based GWAS estimates is less straightforward than has been widely appreciated.
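The contrast between the ATE and the LATE described in this abstract can be made concrete with a toy calculation. The sketch below assumes a single locus, a hypothetical fraction of children who have a heterozygous parent, and hypothetical average allelic effects in the two groups of children; none of these numbers or names come from the paper, and the model is deliberately simplified.

```python
def average_effects(frac_het, effect_het, effect_hom):
    """Return (ATE, LATE) for a toy single-locus model.

    frac_het   : fraction of children with a heterozygous parent (hypothetical)
    effect_het : average allelic effect among those children (hypothetical)
    effect_hom : average allelic effect among children of homozygotes (hypothetical)
    """
    # Sample-wide average effect: weight each group by its share of children.
    ate = frac_het * effect_het + (1.0 - frac_het) * effect_hom
    # Mendelian segregation randomizes alleles only in children of
    # heterozygotes, so only their average effect is identified.
    late = effect_het
    return ate, late


ate, late = average_effects(frac_het=0.4, effect_het=1.0, effect_hom=0.5)
gap = late - ate  # discrepancy if the family-based estimate is read as the ATE
print(f"ATE={ate:.2f} LATE={late:.2f} gap={gap:.2f}")
```

When the two groups share the same average effect, the gap vanishes and the family-based estimate coincides with the ATE in this toy model.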
Affiliation(s)
- Carl Veller
  - Department of Ecology and Evolution, University of Chicago
- Molly Przeworski
  - Department of Biological Sciences, Columbia University
  - Department of Systems Biology, Columbia University
- Graham Coop
  - Center for Population Biology and Department of Evolution and Ecology, University of California, Davis
15
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. bioRxiv 2023:2023.09.11.557155. [PMID: 37745553 PMCID: PMC10515795 DOI: 10.1101/2023.09.11.557155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Genome-wide association studies of complex traits frequently find that SNP-based estimates of heritability are considerably smaller than estimates from classic family-based studies. This 'missing' heritability may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. To circumvent these challenges, we propose a new method to detect genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Our approach, Latent Interaction Testing (LIT), uses the observation that correlated traits with shared latent genetic interactions have trait variance and covariance patterns that differ by genotype. LIT examines the relationship between trait variance/covariance patterns and genotype using a flexible kernel-based framework that is computationally scalable for biobank-sized datasets with a large number of traits. We first use simulated data to demonstrate that LIT substantially increases power to detect latent genetic interactions compared to a trait-by-trait univariate method. We then apply LIT to four obesity-related traits in the UK Biobank and detect genetic variants with interactive effects near known obesity-related genes. Overall, we show that LIT, implemented in the R package lit, uses shared information across traits to improve detection of latent genetic interactions compared to standard approaches.
Affiliation(s)
- Andrew J. Bass
  - Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Shijia Bian
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Aliza P. Wingo
  - Department of Psychiatry, Emory University, Atlanta, GA 30322, USA
- Thomas S. Wingo
  - Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
  - Department of Neurology, Emory University, Atlanta, GA 30322, USA
- David J. Cutler
  - Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
16
Evans LM, Arehart CH, Grotzinger AD, Mize TJ, Brasher MS, Stitzel JA, Ehringer MA, Hoeffer CA. Transcriptome-wide gene-gene interaction associations elucidate pathways and functional enrichment of complex traits. PLoS Genet 2023; 19:e1010693. [PMID: 37216417 PMCID: PMC10237671 DOI: 10.1371/journal.pgen.1010693] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/19/2022] [Revised: 06/02/2023] [Accepted: 03/06/2023] [Indexed: 05/24/2023] Open
Abstract
It remains unknown to what extent gene-gene interactions contribute to complex traits. Here, we introduce a new approach using predicted gene expression to perform exhaustive transcriptome-wide interaction studies (TWISs) for multiple traits across all pairs of genes expressed in several tissue types. Using imputed transcriptomes, we simultaneously reduce the computational challenge and improve interpretability and statistical power. We discover (in the UK Biobank) and replicate (in independent cohorts) several interaction associations, and find several hub genes with numerous interactions. We also demonstrate that TWIS can identify novel associated genes because genes with many or strong interactions have smaller single-locus model effect sizes. Finally, we develop a method to test gene set enrichment of TWIS associations (E-TWIS), finding numerous pathways and networks enriched in interaction associations. Epistasis may be widespread, and our procedure represents a tractable framework for beginning to explore gene interactions and identify novel genomic targets.
Affiliation(s)
- Luke M. Evans
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Ecology & Evolutionary Biology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Christopher H. Arehart
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Ecology & Evolutionary Biology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Andrew D. Grotzinger
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Psychology & Neuroscience, University of Colorado Boulder, Boulder, Colorado, United States of America
- Travis J. Mize
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Ecology & Evolutionary Biology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Maizy S. Brasher
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Ecology & Evolutionary Biology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Jerry A. Stitzel
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Integrative Physiology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Marissa A. Ehringer
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Integrative Physiology, University of Colorado Boulder, Boulder, Colorado, United States of America
- Charles A. Hoeffer
  - Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America
  - Department of Integrative Physiology, University of Colorado Boulder, Boulder, Colorado, United States of America
17
Veller C, Coop G. Interpreting population and family-based genome-wide association studies in the presence of confounding. bioRxiv 2023:2023.02.26.530052. [PMID: 36909521 PMCID: PMC10002712 DOI: 10.1101/2023.02.26.530052] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 03/02/2023]
Abstract
A central aim of genome-wide association studies (GWASs) is to estimate direct genetic effects: the causal effects on an individual's phenotype of the alleles that they carry. However, estimates of direct effects can be subject to genetic and environmental confounding, and can also absorb the 'indirect' genetic effects of relatives' genotypes. Recently, an important development in controlling for these confounds has been the use of within-family GWASs, which, because of the randomness of Mendelian segregation within pedigrees, are often interpreted as producing unbiased estimates of direct effects. Here, we present a general theoretical analysis of the influence of confounding in standard population-based and within-family GWASs. We show that, contrary to common interpretation, family-based estimates of direct effects can be biased by genetic confounding. In humans, such biases will often be small per-locus, but can be compounded when effect size estimates are used in polygenic scores. We illustrate the influence of genetic confounding on population- and family-based estimates of direct effects using models of assortative mating, population stratification, and stabilizing selection on GWAS traits. We further show how family-based estimates of indirect genetic effects, based on comparisons of parentally transmitted and untransmitted alleles, can suffer substantial genetic confounding. In addition to known biases that can arise in family-based GWASs when interactions between family members are ignored, we show that biases can also arise from gene-by-environment (G×E) interactions when parental genotypes are not distributed identically across interacting environmental and genetic backgrounds. We conclude that, while family-based studies have placed GWAS estimation on a more rigorous footing, they carry subtle issues of interpretation that arise from confounding and interactions.
Affiliation(s)
- Carl Veller
  - Department of Evolution and Ecology, and Center for Population Biology, University of California, Davis, CA 95616
- Graham Coop
  - Department of Evolution and Ecology, and Center for Population Biology, University of California, Davis, CA 95616
18
Gorla A, Sankararaman S, Burchard E, Flint J, Zaitlen N, Rahmani E. Phenotypic subtyping via contrastive learning. bioRxiv 2023:2023.01.05.522921. [PMID: 36711575 PMCID: PMC9881932 DOI: 10.1101/2023.01.05.522921] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
Defining and accounting for subphenotypic structure has the potential to increase statistical power and provide a deeper understanding of the heterogeneity in the molecular basis of complex disease. Existing phenotype subtyping methods primarily rely on clinically observed heterogeneity or metadata clustering. However, they generally tend to capture the dominant sources of variation in the data, which often originate from variation that is not descriptive of the mechanistic heterogeneity of the phenotype of interest; in fact, such dominant sources of variation, such as population structure or technical variation, are, in general, expected to be independent of subphenotypic structure. We instead aim to find a subspace with signal that is unique to a group of samples for which we believe that subphenotypic variation exists (e.g., cases of a disease). To that end, we introduce Phenotype Aware Components Analysis (PACA), a contrastive learning approach leveraging canonical correlation analysis to robustly capture weak sources of subphenotypic variation. In the context of disease, PACA learns a gradient of variation unique to cases in a given dataset, while leveraging control samples to account for variation and imbalances of biological and technical confounders between cases and controls. We evaluated PACA using an extensive simulation study, as well as on various subtyping tasks using genotypes, transcriptomics, and DNA methylation data. Our results provide multiple lines of strong evidence that PACA allows us to robustly capture weak unknown variation of interest while being calibrated and well-powered, far surpassing the performance of alternative methods. This renders PACA a state-of-the-art tool for defining de novo subtypes that are more likely to reflect molecular heterogeneity, especially in challenging cases where the phenotypic heterogeneity may be masked by a myriad of strong unrelated effects in the data.
Affiliation(s)
- Aditya Gorla
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA
- Sriram Sankararaman
  - Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Esteban Burchard
  - Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Jonathan Flint
  - Department of Psychiatry and Behavioral Sciences, Brain Research Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Noah Zaitlen
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Elior Rahmani
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
19
Clark R, Pozarickij A, Hysi PG, Ohno-Matsui K, Williams C, Guggenheim JA. Education interacts with genetic variants near GJD2, RBFOX1, LAMA2, KCNQ5 and LRRC4C to confer susceptibility to myopia. PLoS Genet 2022; 18:e1010478. [PMID: 36395078 PMCID: PMC9671369 DOI: 10.1371/journal.pgen.1010478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022] Open
Abstract
Myopia most often develops during school age, with the highest incidence in countries with intensive education systems. Interactions between genetic variants and educational exposure are hypothesized to confer susceptibility to myopia, but few such interactions have been identified. Here, we aimed to identify genetic variants that interact with education level to confer susceptibility to myopia. Two groups of unrelated participants of European ancestry from UK Biobank were studied: a 'Stage-I' sample of 88,334 participants whose refractive error (avMSE) was measured by autorefraction, and a 'Stage-II' sample of 252,838 participants who self-reported their age-of-onset of spectacle wear (AOSW) but who did not undergo autorefraction. Genetic variants were prioritized via a 2-step screening process in the Stage-I sample: Step 1 was a genome-wide association study for avMSE; Step 2 was a variance heterogeneity analysis for avMSE. Genotype-by-education interaction tests were performed in the Stage-II sample, with University education coded as a binary exposure. On average, participants were 58 years old and left full-time education when they were 18 years old; 35% reported University level education. The 2-step screening strategy in the Stage-I sample prioritized 25 genetic variants (GWAS P < 1e-04; variance heterogeneity P < 5e-05). In the Stage-II sample, 19 of the 25 (76%) genetic variants demonstrated evidence of variance heterogeneity, suggesting the majority were true positives. Five genetic variants located near GJD2, RBFOX1, LAMA2, KCNQ5 and LRRC4C had evidence of a genotype-by-education interaction in the Stage-II sample (P < 0.002) and consistent evidence of a genotype-by-education interaction in the Stage-I sample. For all 5 variants, University-level education was associated with an increased effect of the risk allele.
In this cohort, additional years of education were associated with an enhanced effect of genetic variants that have roles including axon guidance and the development of neuronal synapses and neural circuits.
Affiliation(s)
- Rosie Clark
  - School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
- Alfred Pozarickij
  - School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
- Pirro G. Hysi
  - Section of Ophthalmology, School of Life Course Sciences, King’s College London, London, United Kingdom
  - Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, London, United Kingdom
- Kyoko Ohno-Matsui
  - Department of Ophthalmology and Visual Science, Tokyo Medical and Dental University, Tokyo, Japan
- Cathy Williams
  - Centre for Academic Child Health, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Jeremy A. Guggenheim
  - School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
20
Miao J, Lin Y, Wu Y, Zheng B, Schmitz LL, Fletcher JM, Lu Q. A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc Natl Acad Sci U S A 2022; 119:e2212959119. [PMID: 36122202 PMCID: PMC9522331 DOI: 10.1073/pnas.2212959119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 08/22/2022] [Indexed: 11/18/2022] Open
Abstract
Detecting genetic variants associated with the variance of complex traits, that is, variance quantitative trait loci (vQTLs), can provide crucial insights into the interplay between genes and environments and how they jointly shape human phenotypes in the population. We propose a quantile integral linear model (QUAIL) to estimate genetic effects on trait variability. Through extensive simulations and analyses of real data, we demonstrate that QUAIL provides computationally efficient and statistically powerful vQTL mapping that is robust to non-Gaussian phenotypes and confounding effects on phenotypic variability. Applied to UK Biobank (n = 375,791), QUAIL identified 11 vQTLs for body mass index (BMI) that have not been previously reported. Top vQTL findings showed substantial enrichment for interactions with physical activities and sedentary behavior. Furthermore, variance polygenic scores (vPGSs) based on QUAIL effect estimates showed superior predictive performance on both population-level and within-individual BMI variability compared to existing approaches. Overall, QUAIL is a unified framework to quantify genetic effects on the phenotypic variability at both single-variant and vPGS levels. It addresses critical limitations in existing approaches and may have broad applications in future gene-environment interaction studies.
Affiliation(s)
- Jiacheng Miao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
- Yupei Lin
  - Baylor College of Medicine, Houston, TX 77030
- Yuchang Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
- Boyan Zheng
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
- Lauren L. Schmitz
  - Robert M. La Follette School of Public Affairs, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
- Jason M. Fletcher
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
  - Robert M. La Follette School of Public Affairs, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
  - Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706
21
Shi G. Genome-wide variance quantitative trait locus analysis suggests small interaction effects in blood pressure traits. Sci Rep 2022; 12:12649. [PMID: 35879408 PMCID: PMC9314370 DOI: 10.1038/s41598-022-16908-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 07/18/2022] [Indexed: 11/09/2022] Open
Abstract
Genome-wide variance quantitative trait loci (vQTL) analysis complements genome-wide association study (GWAS) and has the potential to identify novel variants associated with the trait, explain additional trait variance and lead to the identification of factors that modulate the genetic effects. I conducted genome-wide analysis of the UK Biobank data and identified 27 vQTLs associated with systolic blood pressure (SBP), diastolic blood pressure (DBP) and pulse pressure (PP). The top single-nucleotide polymorphisms (SNPs) are enriched for expression QTLs (eQTLs) or splicing QTLs (sQTLs) annotated by GTEx, suggesting their regulatory roles in mediating the associations with blood pressure (BP). Of the 27 vQTLs, 14 are known BP-associated QTLs discovered by GWASs. The heteroscedasticity effects of the 13 novel vQTLs are larger than their genetic main effects, which were not detected by existing GWASs. The total R-squared of the 27 top SNPs due to variance heteroscedasticity is 0.28%, compared with 0.50% owing to their main effects. The overall effect size of the variance heteroscedasticity is small in GWAS SNPs compared with their main effects. For the 411, 384 and 285 GWAS SNPs associated with SBP, DBP and PP, respectively, their heteroscedasticity effects were 0.52%, 0.43%, and 0.16%, and their main effects were 5.13%, 5.61%, and 3.75%, respectively. The number and effects of the vQTLs are small, which suggests that the effects of gene-environment and gene-gene interactions are small. The main effects of the SNPs remain the major source of genetic variance for BP, which would probably be true for other complex traits as well.
Affiliation(s)
- Gang Shi
- School of Telecommunications Engineering, Xidian University, 2 South Taibai Road, Xi'an, 710071, Shaanxi, China.
| |
|
22
|
Westerman KE, Majarian TD, Giulianini F, Jang DK, Miao J, Florez JC, Chen H, Chasman DI, Udler MS, Manning AK, Cole JB. Variance-quantitative trait loci enable systematic discovery of gene-environment interactions for cardiometabolic serum biomarkers. Nat Commun 2022; 13:3993. [PMID: 35810165 PMCID: PMC9271055 DOI: 10.1038/s41467-022-31625-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/15/2021] [Accepted: 06/24/2022] [Indexed: 11/29/2022] Open
Abstract
Gene-environment interactions represent the modification of genetic effects by environmental exposures and are critical for understanding disease and informing personalized medicine. These often induce differential phenotypic variance across genotypes; these variance-quantitative trait loci can be prioritized in a two-stage interaction detection strategy to greatly reduce the computational and statistical burden and enable testing of a broader range of exposures. We perform genome-wide variance-quantitative trait locus analysis for 20 serum cardiometabolic biomarkers by multi-ancestry meta-analysis of 350,016 unrelated participants in the UK Biobank, identifying 182 independent locus-biomarker pairs (p < 4.5 × 10⁻⁹). Most are concentrated in a small subset (4%) of loci with genome-wide significant main effects, and 44% replicate (p < 0.05) in the Women's Genome Health Study (N = 23,294). Next, we test each locus-biomarker pair for interaction across 2380 exposures, identifying 847 significant interactions (p < 2.4 × 10⁻⁷), of which 132 are independent (p < 0.05) after accounting for correlation between exposures. Specific examples demonstrate interaction of triglyceride-associated variants with distinct body mass- versus body fat-related exposures as well as genotype-specific associations between alcohol consumption and liver stress at the ADH1B gene. Our catalog of variance-quantitative trait loci and gene-environment interactions is publicly available in an online portal.
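The abstract's statement that interactions induce differential variance across genotypes can be illustrated with a simple linear model. This is a sketch under assumptions that are not taken from the paper: a single unmeasured exposure E that is independent of genotype G, and the symbols μ, β, γ, δ, and σ² are introduced here purely for illustration.

```latex
\begin{aligned}
Y &= \mu + \beta G + \gamma E + \delta\, G E + \varepsilon,
  \qquad \operatorname{Var}(\varepsilon) = \sigma^2,\quad E \perp G,\quad \varepsilon \perp (G, E) \\
\operatorname{Var}(Y \mid G = g) &= (\gamma + \delta g)^2 \operatorname{Var}(E) + \sigma^2
\end{aligned}
```

With δ = 0 the conditional variance is identical for every genotype. With δ ≠ 0 it generally differs across g = 0, 1, 2, which is the signal a variance-QTL scan can pick up before any specific exposure has been tested.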
Affiliation(s)
- Kenneth E Westerman
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA.
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Medicine, Harvard Medical School, Boston, MA, USA.
| | - Timothy D Majarian
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Franco Giulianini
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Dong-Keun Jang
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jenkai Miao
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
| | - Jose C Florez
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Daniel I Chasman
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Medical and Population Genetics Program, Broad Institute, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Miriam S Udler
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Alisa K Manning
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
| | - Joanne B Cole
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA.
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
| |
|
23
|
Gillett AC, Jermy BS, Lee SH, Pain O, Howard DM, Hagenaars SP, Hanscombe KB, Coleman JRI, Lewis CM. Exploring polygenic-environment and residual-environment interactions for depressive symptoms within the UK Biobank. Genet Epidemiol 2022; 46:219-233. [PMID: 35438196 PMCID: PMC9541465 DOI: 10.1002/gepi.22449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Revised: 02/04/2022] [Accepted: 03/15/2022] [Indexed: 11/10/2022]
Abstract
Substantial advances have been made in identifying genetic contributions to depression, but little is known about how the effect of genes can be modulated by the environment, creating a gene-environment interaction. Using multivariate reaction norm models (MRNMs) within the UK Biobank (N = 61,294–91,644), we investigate whether the polygenic and residual variance components of depressive symptoms are modulated by 17 a priori selected covariate traits: 12 environmental variables and 5 biomarkers. MRNMs, a mixed-effects modelling approach, provide unbiased polygenic-covariate interaction estimates for a quantitative trait by controlling for outcome-covariate correlations and residual-covariate interactions. A continuous depressive symptom variable was the outcome in 17 MRNMs, one for each covariate trait. Each MRNM had a fixed-effects model (fixed effects included the covariate trait, demographic variables, and principal components) and a random effects model (where polygenic-covariate and residual-covariate interactions are modelled). Of the 17 selected covariates, 11 significantly modulate deviations in depressive symptoms through the modelled interactions, but no single interaction explains a large proportion of phenotypic variation. Results are dominated by residual-covariate interactions, suggesting that covariate traits (including neuroticism, childhood trauma, and BMI) typically interact with unmodelled variables, rather than a genome-wide polygenic component, to influence depressive symptoms. Only average sleep duration has a polygenic-covariate interaction explaining a demonstrably nonzero proportion of the variability in depressive symptoms. This effect is small, accounting for only 1.22% (95% confidence interval: [0.54, 1.89]) of variation. The presence of an interaction highlights a specific focus for intervention, but the negative results here indicate a limited contribution from polygenic-environment interactions.
Affiliation(s)
- Alexandra C Gillett
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London, UK
| | - Bradley S Jermy
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London, UK
| | - Sang Hong Lee
- Australian Centre for Precision Health, University of South Australia, Adelaide, SA, Australia.,UniSA Allied Health and Human Performance, University of South Australia, Adelaide, SA, Australia
| | - Oliver Pain
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London, UK
| | - David M Howard
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,Division of Psychiatry, Royal Edinburgh Hospital, University of Edinburgh, Edinburgh, UK
| | - Saskia P Hagenaars
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Ken B Hanscombe
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Jonathan R I Coleman
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London, UK
| | - Cathryn M Lewis
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK.,NIHR Maudsley Biomedical Research Centre, South London and Maudsley NHS Trust, London, UK.,Department of Medical and Molecular Genetics, Faculty of Life Sciences and Medicine, King's College London, London, UK
| |
|
24
|
From Mendel to quantitative genetics in the genome era: the scientific legacy of W. G. Hill. Nat Genet 2022; 54:934-939. [PMID: 35817969 DOI: 10.1038/s41588-022-01103-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 05/18/2022] [Indexed: 11/08/2022]
Abstract
The quantitative geneticist W. G. ('Bill') Hill, awardee of the 2018 Darwin Medal of the Royal Society and the 2019 Mendel Medal of the Genetics Society (United Kingdom), died on 17 December 2021 at the age of 81 years. Here, we pay tribute to his multiple key scientific contributions, which span population and evolutionary genetics, animal and plant breeding and human genetics. We discuss his theoretical research on the role of linkage disequilibrium (LD) and mutational variance in the response to selection, the origin of the widely used LD metric r² in genomic association studies, the genetic architecture of complex traits, the quantification of the variation in realized relationships given a pedigree relationship and much more. We demonstrate that basic theoretical research in quantitative and statistical genetics has led to profound insights into the genetics and evolution of complex traits and made predictions that were subsequently empirically validated, often decades later.
|
25
|
Johnson R, Sotoudeh R, Conley D. Polygenic Scores for Plasticity: A New Tool for Studying Gene-Environment Interplay. Demography 2022; 59:1045-1070. [PMID: 35553650 DOI: 10.1215/00703370-9957418] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Abstract
Fertility, health, education, and other outcomes of interest to demographers are the product of an individual's genetic makeup and their social environment. Yet, gene × environment (G×E) research deploys a limited toolkit on the genetic side to study the gene-environment interplay, relying on polygenic scores (PGSs) that reflect the influence of genetics on levels of an outcome. In this article, we develop a genetic summary measure better suited for G×E research: variance polygenic scores (vPGSs), which are PGSs that reflect genetic contributions to plasticity in outcomes. First, we use the UK Biobank (N ∼ 408,000 in the analytic sample) and the Health and Retirement Study (N ∼ 5,700 in the analytic sample) to compare four approaches to constructing PGSs for plasticity. The results show that widely used methods for discovering which genetic variants affect outcome variability fail to serve as distinctive new tools for G×E. Second, using the PGSs that do capture distinctive genetic contributions to plasticity, we analyze heterogeneous effects of a UK education reform on health and educational attainment. The results show the properties of a useful new tool for population scientists studying the interplay of nature and nurture and for population-based studies that are releasing PGSs to applied researchers.
Affiliation(s)
- Rebecca Johnson
- McCourt School of Public Policy, Georgetown University, Washington, DC, USA
| | | | - Dalton Conley
- Department of Sociology and Office of Population Research, Princeton University, Princeton, NJ, USA
| |
|
26
|
Staley JR, Windmeijer F, Suderman M, Lyon MS, Davey Smith G, Tilling K. A robust mean and variance test with application to high-dimensional phenotypes. Eur J Epidemiol 2022; 37:377-387. [PMID: 34651232 PMCID: PMC9187575 DOI: 10.1007/s10654-021-00805-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/17/2020] [Accepted: 09/06/2021] [Indexed: 12/01/2022]
Abstract
Most studies of continuous health-related outcomes examine differences in mean levels (location) of the outcome by exposure. However, identifying effects on the variability (scale) of an outcome, and combining tests of mean and variability (location-and-scale), could provide additional insights into biological mechanisms. A joint test could improve power for studies of high-dimensional phenotypes, such as epigenome-wide association studies of DNA methylation at CpG sites. One possible cause of heterogeneity of variance is a variable interacting with exposure in its effect on outcome, so a joint test of mean and variability could help in the identification of effect modifiers. Here, we review a scale test, based on the Brown-Forsythe test, for analysing variability of a continuous outcome with respect to both categorical and continuous exposures, and develop a novel joint location-and-scale score (JLSsc) test. These tests were compared to alternatives in simulations and used to test associations of mean and variability of DNA methylation with gender and gestational age using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES). In simulations, the Brown-Forsythe and JLSsc tests retained correct type I error rates when the outcome was not normally distributed in contrast to the other approaches tested which all had inflated type I error rates. These tests also identified > 7500 CpG sites for which either mean or variability in cord blood methylation differed according to gender or gestational age. The Brown-Forsythe test and JLSsc are robust tests that can be used to detect associations not solely driven by a mean effect.
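For readers unfamiliar with the scale test named in the abstract, the classical Brown-Forsythe statistic for a categorical exposure can be sketched in a few lines. This is a minimal illustration of the textbook statistic (a one-way ANOVA F on absolute deviations from group medians), not the authors' implementation, and it does not cover their continuous-exposure extension or the JLSsc test. The function name and the toy data are hypothetical.

```python
from statistics import mean, median

def brown_forsythe_f(groups):
    """Brown-Forsythe F statistic for equal spread across groups.

    groups: list of lists of numeric outcomes, one list per exposure level.
    Compare the result with an F(k - 1, N - k) distribution to get a p-value.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its own group's median
    z = []
    for g in groups:
        centre = median(g)
        z.append([abs(y - centre) for y in g])
    z_bar_group = [mean(zi) for zi in z]
    z_bar = sum(sum(zi) for zi in z) / n_total
    # One-way ANOVA sums of squares computed on the deviations
    between = sum(len(zi) * (m - z_bar) ** 2 for zi, m in zip(z, z_bar_group))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_bar_group) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Toy data (hypothetical): same spread with shifted means, then unequal spread
same_spread = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15]]
diff_spread = [[1, 2, 3, 4, 5], [-10, 0, 10, 20, 30]]
print(brown_forsythe_f(same_spread))
print(brown_forsythe_f(diff_spread))
```

A pure mean shift leaves the statistic at zero, while a difference in spread inflates it. Because deviations are taken from the median rather than the mean, the statistic is less sensitive to skewed outcomes. In practice a library routine such as `scipy.stats.levene` with `center='median'` computes the same variant and also returns a p-value.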
Affiliation(s)
- James R Staley
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK
| | - Frank Windmeijer
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK
- Department of Statistics and Nuffield College, University of Oxford, Oxford, UK
| | - Matthew Suderman
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK
| | - Matthew S Lyon
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK
- National Institute for Health Research Bristol Biomedical Research Centre, University of Bristol, Oakfield House, Bristol, BS8 2BN, UK
| | - George Davey Smith
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK
| | - Kate Tilling
- MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, BS8 2BN, UK.
| |
|
27
|
Abstract
Mendelian randomization (MR) is a method of studying the causal effects of modifiable exposures (i.e., potential risk factors) on health, social, and economic outcomes using genetic variants associated with the specific exposures of interest. MR provides a more robust understanding of the influence of these exposures on outcomes because germline genetic variants are randomly inherited from parents to offspring and, as a result, should not be related to potential confounding factors that influence exposure-outcome associations. The genetic variant can therefore be used as a tool to link the proposed risk factor and outcome, and to estimate this effect with less confounding and bias than conventional epidemiological approaches. We describe the scope of MR, highlighting the range of applications being made possible as genetic data sets and resources become larger and more freely available. We outline the MR approach in detail, covering concepts, assumptions, and estimation methods. We cover some common misconceptions, provide strategies for overcoming violation of assumptions, and discuss future prospects for extending the clinical applicability, methodological innovations, robustness, and generalizability of MR findings.
Affiliation(s)
- Rebecca C Richmond
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, United Kingdom
| | - George Davey Smith
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, United Kingdom
- NIHR Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol BS1 3NU, United Kingdom
| |
|
28
|
Domingue BW, Kanopka K, Mallard TT, Trejo S, Tucker-Drob EM. Modeling Interaction and Dispersion Effects in the Analysis of Gene-by-Environment Interaction. Behav Genet 2022; 52:56-64. [PMID: 34855050 PMCID: PMC8958844 DOI: 10.1007/s10519-021-10090-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/15/2021] [Accepted: 10/28/2021] [Indexed: 11/25/2022]
Abstract
Genotype-by-environment interaction (GxE) studies probe heterogeneity in response to risk factors or interventions. Popular methods for estimation of GxE examine multiplicative interactions between individual genetic and environmental measures. However, risk factors and interventions may modulate the total variance of an epidemiological outcome that itself represents the aggregation of many other etiological components. We expand the traditional GxE model to directly model genetic and environmental moderation of the dispersion of the outcome. We derive a test statistic, [Formula: see text], for inferring whether an interaction identified between individual genetic and environmental measures represents a more general pattern of moderation of the total variance in the phenotype by either the genetic or the environmental measure. We validate our method via extensive simulation, and apply it to investigate genotype-by-birth year interactions for body mass index (BMI) with polygenic scores in the Health and Retirement Study (N = 11,586) and individual genetic variants in the UK Biobank (N = 380,605). We find that changes in the penetrance of a genome-wide polygenic score for BMI across birth year are partly representative of a more general pattern of expanding BMI variation across generations. Three individual variants found to be more strongly associated with BMI among later-born individuals were also associated with the magnitude of variability in BMI itself within any given birth year, suggesting that they may confer general sensitivity of BMI to a range of unmeasured factors beyond those captured by birth year. We introduce an expanded GxE regression model that explicitly models genetic and environmental moderation of the dispersion of the outcome under study. This approach can determine whether GxE interactions identified are specific to the measured predictors or represent a more general pattern of moderation of the total variance in the outcome by the genetic and environmental measures.
Affiliation(s)
- Benjamin W Domingue
- Graduate School of Education, Stanford University and Center for Population Health Sciences, Stanford Medicine, Stanford, USA.
| | - Klint Kanopka
- Graduate School of Education, Stanford University, Stanford, USA
| | - Travis T Mallard
- Department of Psychology, University of Texas at Austin, Austin, USA
| | - Sam Trejo
- Department of Sociology and Office of Population Research, Princeton University, Princeton, USA
| | - Elliot M Tucker-Drob
- Department of Psychology and Population Research Center, University of Texas at Austin, Austin, USA.
| |
|
29
|
Mills HL, Higgins JP, Morris RW, Kessler D, Heron J, Wiles N, Davey Smith G, Tilling K. Detecting Heterogeneity of Intervention Effects Using Analysis and Meta-analysis of Differences in Variance Between Trial Arms. Epidemiology 2021; 32:846-854. [PMID: 34432720 PMCID: PMC8478324 DOI: 10.1097/ede.0000000000001401] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 09/07/2020] [Accepted: 07/12/2021] [Indexed: 11/25/2022]
Abstract
BACKGROUND: Randomized controlled trials (RCTs) with continuous outcomes usually only examine mean differences in response between trial arms. If the intervention has heterogeneous effects, then outcome variances will also differ between arms. Power of an individual trial to assess heterogeneity is lower than the power to detect the same size of main effect. METHODS: We describe several methods for assessing differences in variance in trial arms and apply them to a single trial with individual patient data and to meta-analyses using summary data. Where individual data are available, we use regression-based methods to examine the effects of covariates on variation. We present an additional method to meta-analyze differences in variances with summary data. RESULTS: In the single trial, there was agreement between methods, and the difference in variance was largely due to differences in prevalence of depression at baseline. In two meta-analyses, most individual trials did not show strong evidence of a difference in variance between arms, with wide confidence intervals. However, both meta-analyses showed evidence of greater variance in the control arm, and in one example, this was perhaps because mean outcome in the control arm was higher. CONCLUSIONS: Using meta-analysis, we overcame the low power of individual trials to examine differences in variance. Evidence of differences in variance should be followed up to identify potential effect modifiers and explore other possible causes such as varying compliance.
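One way to see how summary data alone can support a pooled comparison of variability is the log variability ratio (lnVR) with inverse-variance weighting. The sketch below uses that widely used effect size and a fixed-effect pool; it is an assumption for illustration and not necessarily the method the authors present. The function names and the trial numbers are hypothetical.

```python
import math

def log_variability_ratio(sd_trt, n_trt, sd_ctl, n_ctl):
    """Bias-corrected log ratio of arm SDs and its approximate sampling variance."""
    ln_vr = (math.log(sd_trt / sd_ctl)
             + 1.0 / (2 * (n_trt - 1))
             - 1.0 / (2 * (n_ctl - 1)))
    var = 1.0 / (2 * (n_trt - 1)) + 1.0 / (2 * (n_ctl - 1))
    return ln_vr, var

def pool_fixed_effect(estimates):
    """Inverse-variance weighted mean of (estimate, variance) pairs, plus its SE."""
    weights = [1.0 / v for _, v in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# Hypothetical summary data: (SD treated, n treated, SD control, n control)
trials = [(8.0, 51, 10.0, 51), (9.0, 101, 10.0, 101), (7.5, 41, 9.0, 41)]
per_trial = [log_variability_ratio(*t) for t in trials]
pooled, se = pool_fixed_effect(per_trial)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled lnVR = {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A negative pooled value indicates smaller spread in the treated arm than in the control arm. Each trial's sampling variance depends only on its arm sizes, which is why single trials give wide intervals and pooling narrows them. A random-effects pool would be the natural extension when trials are heterogeneous.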
Affiliation(s)
- Harriet L. Mills
- From the Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Julian P.T. Higgins
- From the Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- National Institute for Health Research Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol, United Kingdom
| | - Richard W. Morris
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - David Kessler
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- National Institute for Health Research Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol, United Kingdom
| | - Jon Heron
- From the Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Nicola Wiles
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- National Institute for Health Research Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol, United Kingdom
| | - George Davey Smith
- From the Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
| | - Kate Tilling
- From the Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- National Institute for Health Research Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol, United Kingdom
| |
|
30
|
Adjangba C, Border R, Romero Villela PN, Ehringer MA, Evans LM. Little Evidence of Modified Genetic Effect of rs16969968 on Heavy Smoking Based on Age of Onset of Smoking. Nicotine Tob Res 2021; 23:1055-1063. [PMID: 33165565 DOI: 10.1093/ntr/ntaa229] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/26/2020] [Accepted: 11/03/2020] [Indexed: 12/12/2022]
Abstract
INTRODUCTION: Tobacco smoking is the leading cause of preventable death globally. Smoking quantity, measured in cigarettes per day, is influenced both by the age of onset of regular smoking (AOS) and by genetic factors, including a strong effect of the nonsynonymous single-nucleotide polymorphism rs16969968. A previous study by Hartz et al. reported an interaction between these two factors, whereby rs16969968 risk allele carriers who started smoking earlier showed increased risk for heavy smoking compared with those who started later. This finding has yet to be replicated in a large, independent sample. METHODS: We performed a preregistered, direct replication attempt of the rs16969968 × AOS interaction on smoking quantity in 128 383 unrelated individuals from the UK Biobank, meta-analyzed across ancestry groups. We fit statistical association models mirroring the original publication as well as formal interaction tests on multiple phenotypic and analytical scales. RESULTS: We replicated the main effects of rs16969968 and AOS on cigarettes per day but failed to replicate the interaction using previous methods. Nominal significance of the rs16969968 × AOS interaction term depended strongly on the scale of analysis and the particular phenotype, as did associations stratified by early/late AOS. No interaction tests passed genome-wide correction (α = 5 × 10⁻⁸), and all estimated interaction effect sizes were much smaller in magnitude than previous estimates. CONCLUSIONS: We failed to replicate the strong rs16969968 × AOS interaction effect previously reported. If such gene-moderator interactions influence complex traits, they likely depend on scale of measurement, and current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. IMPLICATIONS: We failed to replicate the strong rs16969968 × AOS interaction effect on smoking quantity previously reported. If such gene-moderator interactions influence complex traits, current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. Furthermore, many potential interaction effects are likely to depend on the scale of measurement employed.
Affiliation(s)
- Christine Adjangba
- Institute for Behavioral Genetics, University of Colorado-Boulder, Boulder, CO
| | - Richard Border
- Institute for Behavioral Genetics, University of Colorado-Boulder, Boulder, CO.,Department of Applied Mathematics, University of Colorado-Boulder, Boulder, CO
| | - Pamela N Romero Villela
- Institute for Behavioral Genetics, University of Colorado-Boulder, Boulder, CO.,Department of Psychology and Neuroscience, University of Colorado-Boulder, Boulder, CO
| | - Marissa A Ehringer
- Institute for Behavioral Genetics, University of Colorado-Boulder, Boulder, CO.,Department of Integrative Physiology, University of Colorado-Boulder, Boulder, CO
| | - Luke M Evans
- Institute for Behavioral Genetics, University of Colorado-Boulder, Boulder, CO.,Department of Ecology and Evolutionary Biology, University of Colorado-Boulder, Boulder, CO
| |
|
31
|
Bailey NW, Desjonquères C, Drago A, Rayner JG, Sturiale SL, Zhang X. A neglected conceptual problem regarding phenotypic plasticity's role in adaptive evolution: The importance of genetic covariance and social drive. Evol Lett 2021; 5:444-457. [PMID: 34621532 PMCID: PMC8484725 DOI: 10.1002/evl3.251] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/26/2021] [Revised: 07/12/2021] [Accepted: 07/19/2021] [Indexed: 01/16/2023] Open
Abstract
There is tantalizing evidence that phenotypic plasticity can buffer novel, adaptive genetic variants long enough to permit their evolutionary spread, and this process is often invoked in explanations for rapid adaptive evolution. However, the strength and generality of evidence for it is controversial. We identify a conceptual problem affecting this debate: recombination, segregation, and independent assortment are expected to quickly sever associations between genes controlling novel adaptations and genes contributing to trait plasticity that facilitates the novel adaptations by reducing their indirect fitness costs. To make clearer predictions about this role of plasticity in facilitating genetic adaptation, we describe a testable genetic mechanism that resolves the problem: genetic covariance between new adaptive variants and trait plasticity that facilitates their persistence within populations. We identify genetic architectures that might lead to such a covariance, including genetic coupling via physical linkage and pleiotropy, and illustrate the consequences for adaptation rates using numerical simulations. Such genetic covariances may also arise from the social environment, and we suggest the indirect genetic effects that result could further accentuate the process of adaptation. We call the latter mechanism of adaptation social drive, and identify methods to test it. We suggest that genetic coupling of plasticity and adaptations could promote unusually rapid ‘runaway’ evolution of novel adaptations. The resultant dynamics could facilitate evolutionary rescue, adaptive radiations, the origin of novelties, and other commonly studied processes.
Affiliation(s)
- Nathan W Bailey
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
- Camille Desjonquères
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
  - Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201
- Ana Drago
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
- Jack G Rayner
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
- Samantha L Sturiale
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
  - Current Address: Department of Biology, Georgetown University, Washington, DC 20057
- Xiao Zhang
  - School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
32
Liu D, Ban HJ, El Sergani AM, Lee MK, Hecht JT, Wehby GL, Moreno LM, Feingold E, Marazita ML, Cha S, Szabo-Rogers HL, Weinberg SM, Shaffer JR. PRICKLE1 × FOCAD Interaction Revealed by Genome-Wide vQTL Analysis of Human Facial Traits. Front Genet 2021; 12:674642. [PMID: 34434215 PMCID: PMC8381734 DOI: 10.3389/fgene.2021.674642] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/01/2021] [Accepted: 06/03/2021] [Indexed: 12/14/2022] Open
Abstract
The human face is a highly complex and variable structure resulting from the intricate coordination of numerous genetic and non-genetic factors. Hundreds of genomic loci impacting quantitative facial features have been identified. While these associations have been shown to influence morphology by altering the mean size and shape of facial measures, their effect on trait variance remains unclear. We conducted a genome-wide association analysis for the variance of 20 quantitative facial measurements in 2,447 European individuals and identified several suggestive variance quantitative trait loci (vQTLs). These vQTLs guided us to conduct an efficient search for gene-by-gene (G × G) interactions, which uncovered an interaction between PRICKLE1 and FOCAD affecting cranial base width. We replicated this G × G interaction signal at the locus level in an additional 5,128 Korean individuals. We used the hypomorphic Prickle1 Beetlejuice (Prickle1^Bj) mouse line to directly test the function of Prickle1 on the cranial base and observed wider cranial bases in Prickle1^Bj/Bj. Importantly, we observed that the Prickle1 and Focadhesin proteins co-localize in murine cranial base chondrocytes, and this co-localization is abnormal in the Prickle1^Bj/Bj mutants. Taken together, our findings uncovered a novel G × G interaction effect in humans with strong support from both epidemiological and molecular studies. These results highlight the potential of studying measures of phenotypic variability in gene mapping studies of facial morphology.
Affiliation(s)
- Dongjing Liu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Hyo-Jeong Ban
  - Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
- Ahmed M. El Sergani
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Myoung Keun Lee
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Jacqueline T. Hecht
  - Department of Pediatrics, McGovern Medical Center, The University of Texas Health Science Center at Houston, Houston, TX, United States
- George L. Wehby
  - Department of Health Management and Policy, The University of Iowa, Iowa City, IA, United States
- Lina M. Moreno
  - Department of Orthodontics, The University of Iowa, Iowa City, IA, United States
- Eleanor Feingold
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- Mary L. Marazita
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Psychiatry, Clinical and Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Seongwon Cha
  - Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
- Heather L. Szabo-Rogers
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Developmental Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Regenerative Medicine at the McGowan Institute, University of Pittsburgh, Pittsburgh, PA, United States
  - Center for Craniofacial Regeneration, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Seth M. Weinberg
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- John R. Shaffer
  - Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
33
Mbatchou J, Barnard L, Backman J, Marcketta A, Kosmicki JA, Ziyatdinov A, Benner C, O'Dushlaine C, Barber M, Boutkov B, Habegger L, Ferreira M, Baras A, Reid J, Abecasis G, Maxwell E, Marchini J. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet 2021; 53:1097-1103. [PMID: 34017140 DOI: 10.1038/s41588-021-00870-7] [Citation(s) in RCA: 431] [Impact Index Per Article: 143.7] [Received: 06/21/2020] [Accepted: 04/13/2021] [Indexed: 11/08/2022]
Abstract
Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case-control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.
Affiliation(s)
- Aris Baras
  - Regeneron Genetics Center, Tarrytown, NY, USA
34
Abstract
Disease classification, or nosology, was historically driven by careful examination of clinical features of patients. As technologies to measure and understand human phenotypes advanced, so too did classifications of disease, and the advent of genetic data has led to a surge in genetic subtyping in the past decades. Although the fundamental process of refining disease definitions and subtypes is shared across diverse fields, each field is driven by its own goals and technological expertise, leading to inconsistent and conflicting definitions of disease subtypes. Here, we review several classical and recent subtypes and subtyping approaches and provide concrete definitions to delineate subtypes. In particular, we focus on subtypes with distinct causal disease biology, which are of primary interest to scientists, and subtypes with pragmatic medical benefits, which are of primary interest to physicians. We propose genetic heterogeneity as a gold standard for establishing biologically distinct subtypes of complex polygenic disease. We focus especially on methods to find and validate genetic subtypes, emphasizing common pitfalls and how to avoid them.
Affiliation(s)
- Andy Dahl
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois 60637, USA
  - Department of Neurology, University of California, Los Angeles, California 90024, USA
  - Department of Computational Medicine, University of California, Los Angeles, California 90095, USA
- Noah Zaitlen
  - Department of Neurology, University of California, Los Angeles, California 90024, USA
  - Department of Computational Medicine, University of California, Los Angeles, California 90095, USA
35
Schmitz LL, Goodwin J, Miao J, Lu Q, Conley D. The impact of late-career job loss and genetic risk on body mass index: Evidence from variance polygenic scores. Sci Rep 2021; 11:7647. [PMID: 33828129 PMCID: PMC8027610 DOI: 10.1038/s41598-021-86716-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/24/2020] [Accepted: 03/16/2021] [Indexed: 02/02/2023] Open
Abstract
Unemployment shocks from the COVID-19 pandemic have reignited concerns over the long-term effects of job loss on population health. Past research has highlighted the corrosive effects of unemployment on health and health behaviors. This study examines whether the effects of job loss on changes in body mass index (BMI) are moderated by genetic predisposition using data from the U.S. Health and Retirement Study (HRS). To improve detection of gene-by-environment (G × E) interplay, we interacted layoffs from business closures (a plausibly exogenous environmental exposure) with whole-genome polygenic scores (PGSs) that capture genetic contributions to both the population mean (mPGS) and variance (vPGS) of BMI. Results show evidence of genetic moderation using a vPGS (as opposed to an mPGS) and indicate genome-wide summary measures of phenotypic plasticity may further our understanding of how environmental stimuli modify the distribution of complex traits in a population.
Affiliation(s)
- Lauren L Schmitz
  - Robert M. La Follette School of Public Affairs, University of Wisconsin-Madison, 1225 Observatory Drive, Madison, WI, 53706, USA
- Julia Goodwin
  - Department of Sociology, University of Wisconsin-Madison, Madison, WI, USA
- Jiacheng Miao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Dalton Conley
  - Department of Sociology, Princeton University & NBER, Princeton, NJ, USA
36
Ziyatdinov A, Kim J, Prokopenko D, Privé F, Laporte F, Loh PR, Kraft P, Aschard H. Estimating the effective sample size in association studies of quantitative traits. G3-Genes Genomes Genetics 2021; 11:6178001. [PMID: 33734375 PMCID: PMC8495748 DOI: 10.1093/g3journal/jkab057] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/09/2020] [Accepted: 01/06/2021] [Indexed: 01/08/2023]
Abstract
The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.
Affiliation(s)
- Andrey Ziyatdinov
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jihye Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dmitry Prokopenko
  - Genetics and Aging Unit and McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Florian Privé
  - National Centre for Register-Based Research, Aarhus University, Aarhus, 8210, Denmark
- Fabien Laporte
  - Department of Computational Biology-USR 3756 CNRS, Institut Pasteur, 75015 Paris, France
- Po-Ru Loh
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Peter Kraft
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Hugues Aschard
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Computational Biology-USR 3756 CNRS, Institut Pasteur, 75015 Paris, France
37
Marderstein AR, Davenport ER, Kulm S, Van Hout CV, Elemento O, Clark AG. Leveraging phenotypic variability to identify genetic interactions in human phenotypes. Am J Hum Genet 2021; 108:49-67. [PMID: 33326753 PMCID: PMC7820920 DOI: 10.1016/j.ajhg.2020.11.016] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 08/07/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022] Open
Abstract
Although thousands of loci have been associated with human phenotypes, the role of gene-environment (GxE) interactions in determining individual risk of human diseases remains unclear. This is partly because of the severe erosion of statistical power resulting from the massive number of statistical tests required to detect such interactions. Here, we focus on improving the power of GxE tests by developing a statistical framework for assessing quantitative trait loci (QTLs) associated with the trait means and/or trait variances. When applying this framework to body mass index (BMI), we find that GxE discovery and replication rates are significantly higher when prioritizing genetic variants associated with the variance of the phenotype (vQTLs) compared to when assessing all genetic variants. Moreover, we find that vQTLs are enriched for associations with other non-BMI phenotypes having strong environmental influences, such as diabetes or ulcerative colitis. We show that GxE effects first identified in quantitative traits such as BMI can be used for GxE discovery in disease phenotypes such as diabetes. A clear conclusion is that strong GxE interactions mediate the genetic contribution to body weight and diabetes risk.
Affiliation(s)
- Andrew R Marderstein
  - Tri-Institutional Program in Computational Biology & Medicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Emily R Davenport
  - Department of Biology, Huck Institutes of the Life Sciences, Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Scott Kulm
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Olivier Elemento
  - Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Andrew G Clark
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
38
Kerin M, Marchini J. Inferring Gene-by-Environment Interactions with a Bayesian Whole-Genome Regression Model. Am J Hum Genet 2020; 107:698-713. [PMID: 32888427 PMCID: PMC7536582 DOI: 10.1016/j.ajhg.2020.08.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/28/2019] [Accepted: 08/11/2020] [Indexed: 01/05/2023] Open
Abstract
The contribution of gene-by-environment (GxE) interactions for many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method is called LEMMA (Linear Environment Mixed Model Analysis) and estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. The ES can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. When applied to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that 9.3%, 3.9%, 1.6%, and 12.5%, respectively, of phenotypic variance is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores (−log10(p) > 7.3).
Affiliation(s)
- Matthew Kerin
  - Wellcome Trust Center for Human Genetics, Oxford, OX3 7BN, UK
39
Domingue BW, Trejo S, Armstrong-Carter E, Tucker-Drob EM. Interactions between Polygenic Scores and Environments: Methodological and Conceptual Challenges. Sociological Science 2020; 7:465-486. [PMID: 36091972 PMCID: PMC9455807 DOI: 10.15195/v7.a19] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 05/10/2023]
Abstract
Interest in the study of gene-environment interaction has recently grown due to the sudden availability of molecular genetic data, in particular polygenic scores, in many long-running longitudinal studies. Identifying and estimating statistical interactions comes with several analytic and inferential challenges; these challenges are heightened when used to integrate observational genomic and social science data. We articulate some of these key challenges, provide new perspectives on the study of gene-environment interactions, and end by offering some practical guidance for conducting research in this area. Given the sudden availability of well-powered polygenic scores, we anticipate a substantial increase in research testing for interaction between such scores and environments. The issues we discuss, if not properly addressed, may impact the enduring scientific value of gene-environment interaction studies.
40
Abstract
Canalization refers to the evolution of populations such that the number of individuals who deviate from the optimum trait, or experience disease, is minimized. In the presence of rapid cultural, environmental, or genetic change, the reverse process of decanalization may contribute to observed increases in disease prevalence. This review starts by defining relevant concepts, drawing distinctions between the canalization of populations and robustness of individuals. It then considers evidence pertaining to three continuous traits and six domains of disease. In each case, existing genetic evidence for genotype-by-environment interactions is insufficient to support a strong inference of decanalization, but we argue that the advent of genome-wide polygenic risk assessment now makes an empirical evaluation of the role of canalization in preventing disease possible. Finally, the contributions of both rare and common variants to congenital abnormality and adult onset disease are considered in light of a new kerplunk model of genetic effects.
Affiliation(s)
- Greg Gibson
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Kristine A Lacek
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
41
Quantification of the overall contribution of gene-environment interaction for obesity-related traits. Nat Commun 2020; 11:1385. [PMID: 32170055 PMCID: PMC7070002 DOI: 10.1038/s41467-020-15107-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/03/2019] [Accepted: 02/11/2020] [Indexed: 12/03/2022] Open
Abstract
The growing sample size of genome-wide association studies has facilitated the discovery of gene-environment interactions (GxE). Here we propose a maximum likelihood method to estimate the contribution of GxE to continuous traits taking into account all interacting environmental variables, without the need to measure any. Extensive simulations demonstrate that our method provides unbiased interaction estimates and excellent coverage. We also offer strategies to distinguish specific GxE from general scale effects. Applying our method to 32 traits in the UK Biobank reveals that while the genetic risk score (GRS) of 376 variants explains 5.2% of body mass index (BMI) variance, GRSxE explains an additional 1.9%. Nevertheless, this interaction holds for any variable with identical correlation to BMI as the GRS, hence may not be GRS-specific. Still, we observe that the global contribution of specific GRSxE to complex traits is substantial for nine obesity-related measures (including leg impedance and trunk fat-free mass). Most gene-by-environment interaction methods rely on the availability of the interacting environment. Here, the authors propose a robust maximum likelihood method for estimating the overall statistical interaction between a genetic risk score for a continuous outcome and all environmental variables.
42
Hussain W, Campbell MT, Jarquin D, Walia H, Morota G. Variance heterogeneity genome-wide mapping for cadmium in bread wheat reveals novel genomic loci and epistatic interactions. The Plant Genome 2020; 13:e20011. [PMID: 33016629 DOI: 10.1002/tpg2.20011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/15/2019] [Accepted: 01/22/2020] [Indexed: 06/11/2023]
Abstract
Genome-wide association mapping identifies quantitative trait loci (QTL) that influence the mean differences between the marker genotypes for a given trait. While most loci influence the mean value of a trait, certain loci, known as variance heterogeneity QTL (vQTL) determine the variability of the trait instead of the mean trait value (mQTL). In the present study, we performed a variance heterogeneity genome-wide association study (vGWAS) for grain cadmium (Cd) concentration in bread wheat. We used double generalized linear model and hierarchical generalized linear model to identify vQTL associated with grain Cd. We identified novel vQTL regions on chromosomes 2A and 2B that contribute to the Cd variation and loci that affect both mean and variance heterogeneity (mvQTL) on chromosome 5A. In addition, our results demonstrated the presence of epistatic interactions between vQTL and mvQTL, which could explain variance heterogeneity. Overall, we provide novel insights into the genetic architecture of grain Cd concentration and report the first application of vGWAS in wheat. Moreover, our findings indicated that epistasis is an important mechanism underlying natural variation for grain Cd concentration.
Affiliation(s)
- Waseem Hussain
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Malachy T Campbell
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Diego Jarquin
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Harkamal Walia
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
43
Tchernichovski O, Conley D. A genetically tailored education for birds. Nature 2019; 575:290-291. [DOI: 10.1038/d41586-019-03416-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
44
Young AI, Benonisdottir S, Przeworski M, Kong A. Deconstructing the sources of genotype-phenotype associations in humans. Science 2019; 365:1396-1400. [PMID: 31604265 PMCID: PMC6894903 DOI: 10.1126/science.aax3710] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Indexed: 12/13/2022]
Abstract
Efforts to link variation in the human genome to phenotypes have progressed at a tremendous pace in recent decades. Most human traits have been shown to be affected by a large number of genetic variants across the genome. To interpret these associations and to use them reliably, in particular for phenotypic prediction, a better understanding of the many sources of genotype-phenotype associations is necessary. We summarize the progress that has been made in this direction in humans, notably in decomposing direct and indirect genetic effects as well as population structure confounding. We discuss the natural next steps in data collection and methodology development, with a focus on what can be gained by analyzing genotype and phenotype data from close relatives.
Affiliation(s)
- Alexander I Young
  - Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Stefania Benonisdottir
  - Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK
- Molly Przeworski
  - Department of Biological Sciences, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Augustine Kong
  - Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK
45
Sella G, Barton NH. Thinking About the Evolution of Complex Traits in the Era of Genome-Wide Association Studies. Annu Rev Genomics Hum Genet 2019; 20:461-493. [DOI: 10.1146/annurev-genom-083115-022316] [Citation(s) in RCA: 123] [Impact Index Per Article: 24.6] [Indexed: 11/09/2022]
Abstract
Many traits of interest are highly heritable and genetically complex, meaning that much of the variation they exhibit arises from differences at numerous loci in the genome. Complex traits and their evolution have been studied for more than a century, but only in the last decade have genome-wide association studies (GWASs) in humans begun to reveal their genetic basis. Here, we bring these threads of research together to ask how findings from GWASs can further our understanding of the processes that give rise to heritable variation in complex traits and of the genetic basis of complex trait evolution in response to changing selection pressures (i.e., of polygenic adaptation). Conversely, we ask how evolutionary thinking helps us to interpret findings from GWASs and informs related efforts of practical importance.
Affiliation(s)
- Guy Sella
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Program for Mathematical Genomics, Columbia University, New York, NY 10032, USA
- Nicholas H. Barton
  - Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria
|
46
|
Wang H, Zhang F, Zeng J, Wu Y, Kemper KE, Xue A, Zhang M, Powell JE, Goddard ME, Wray NR, Visscher PM, McRae AF, Yang J. Genotype-by-environment interactions inferred from genetic effects on phenotypic variability in the UK Biobank. Sci Adv 2019; 5:eaaw3538. [PMID: 31453325 PMCID: PMC6693916 DOI: 10.1126/sciadv.aaw3538] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 12/11/2018] [Accepted: 07/11/2019] [Indexed: 05/17/2023]
Abstract
Genotype-by-environment interaction (GEI) is a fundamental component in understanding complex trait variation. However, it remains challenging to identify genetic variants with GEI effects in humans largely because of the small effect sizes and the difficulty of monitoring environmental fluctuations. Here, we demonstrate that GEI can be inferred from genetic variants associated with phenotypic variability in a large sample without the need of measuring environmental factors. We performed a genome-wide variance quantitative trait locus (vQTL) analysis of ~5.6 million variants on 348,501 unrelated individuals of European ancestry for 13 quantitative traits in the UK Biobank and identified 75 significant vQTLs with P < 2.0 × 10⁻⁹ for 9 traits, especially for those related to obesity. Direct GEI analysis with five environmental factors showed that the vQTLs were strongly enriched with GEI effects. Our results indicate pervasive GEI effects for obesity-related traits and demonstrate the detection of GEI without environmental data.
Affiliation(s)
- Huanwei Wang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Futao Zhang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Yang Wu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Kathryn E. Kemper
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Angli Xue
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Min Zhang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Joseph E. Powell
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
  - Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute for Medical Research, Sydney, New South Wales 2010, Australia
  - Faculty of Medicine, University of New South Wales, Sydney, New South Wales 2052, Australia
- Michael E. Goddard
  - Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
  - Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
- Naomi R. Wray
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, Queensland 4072, Australia
- Peter M. Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, Queensland 4072, Australia
- Allan F. McRae
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- Jian Yang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
  - Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China
|
47
|
Genotype-covariate correlation and interaction disentangled by a whole-genome multivariate reaction norm model. Nat Commun 2019; 10:2239. [PMID: 31110177 PMCID: PMC6527612 DOI: 10.1038/s41467-019-10128-w] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/12/2018] [Accepted: 04/18/2019] [Indexed: 01/05/2023] Open
Abstract
The genomics era has brought useful tools to dissect the genetic architecture of complex traits. Here we propose a multivariate reaction norm model (MRNM) to tackle genotype–covariate (G–C) correlation and interaction problems. We apply MRNM to the UK Biobank data in analysis of body mass index using smoking quantity as a covariate, finding a highly significant G–C correlation, but only weak evidence for G–C interaction. In contrast, G–C interaction estimates are inflated in existing methods. It is also notable that there is significant heterogeneity in the estimated residual variances (i.e., variances not attributable to factors in the model) across different covariate levels, i.e., residual–covariate (R–C) interaction. We also show that the residual variances estimated by standard additive models can be inflated in the presence of G–C and/or R–C interactions. We conclude that it is essential to correctly account for both interaction and correlation in complex trait analyses. Complex traits are often influenced by genetic and non-genetic factors (such as environmental exposures), which are themselves interconnected. Here, the authors develop a method for disentangling genotype–covariate correlation and interaction, and investigate their effects on estimating statistical genetic parameters.
|
48
|
Barroso I, McCarthy MI. The Genetic Basis of Metabolic Disease. Cell 2019; 177:146-161. [PMID: 30901536 PMCID: PMC6432945 DOI: 10.1016/j.cell.2019.02.024] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 01/03/2019] [Revised: 02/11/2019] [Accepted: 02/14/2019] [Indexed: 02/06/2023]
Abstract
Recent developments in genetics and genomics are providing a detailed and systematic characterization of the genetic underpinnings of common metabolic diseases and traits, highlighting the inherent complexity within systems for homeostatic control and the many ways in which that control can fail. The genetic architecture underlying these common metabolic phenotypes is complex, with each trait influenced by hundreds of loci spanning a range of allele frequencies and effect sizes. Here, we review the growing appreciation of this complexity and how this has fostered the implementation of genome-scale approaches that deliver robust mechanistic inference and unveil new strategies for translational exploitation.
Affiliation(s)
- Inês Barroso
  - Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Mark I McCarthy
  - Wellcome Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK; Oxford Centre for Diabetes, Endocrinology, and Metabolism, University of Oxford, Churchill Hospital, Old Road, Headington, Oxford OX3 7LJ, UK; Oxford NIHR Biomedical Research Centre, Churchill Hospital, Old Road, Headington, Oxford OX3 7LJ, UK
|