1
Lin WY. Gene-Environment Interactions and Gene-Gene Interactions on Two Biological Age Measures: Evidence from Taiwan Biobank Participants. Adv Biol (Weinh) 2024; 8:e2400149. [PMID: 38684452 DOI: 10.1002/adbi.202400149] [Received: 03/16/2024] [Revised: 04/14/2024] [Indexed: 05/02/2024]
Abstract
PhenoAge and BioAge are two commonly used biological age (BA) measures. The author here searched for gene-environment interactions (GxE) and gene-gene interactions (GxG) on PhenoAgeAccel (age-adjusted PhenoAge) and BioAgeAccel (age-adjusted BioAge) of 111,996 Taiwan Biobank (TWB) participants, including a discovery set of 86,536 TWB2 individuals and a replication set of 25,460 TWB1 individuals. Searching for variance quantitative trait loci (vQTLs) provides a convenient way to evaluate GxE and GxG. A total of 4 nearly independent (linkage disequilibrium measure r² < 0.01) PhenoAgeAccel-vQTLs are identified from 5,303,039 autosomal TWB2 SNPs (p < 5E-8), whereas no vQTLs are found for BioAgeAccel. These 4 PhenoAgeAccel-vQTLs (rs35276921, rs141927875, rs10903013, and rs76038336) are further replicated in TWB1 (p < 5E-8). They are located in the OR51B5, FAM234A, and AXIN1 genes. All 4 PhenoAgeAccel-vQTLs are significantly associated with PhenoAgeAccel (p < 5E-8). A phylogenetic heat map of the GxE analyses showed that smoking exacerbated the PhenoAgeAccel-vQTLs' aging effects, while higher educational attainment attenuated the PhenoAgeAccel-vQTLs' aging effects. Body mass index, chronological age, alcohol consumption, and sex do not prominently modulate PhenoAgeAccel-vQTLs' aging effects. Based on these vQTL results, rs141927875-rs35276921 interaction (p = 4.7E-61) and rs76038336-rs10903013 interaction (p = 3.3E-116) on PhenoAgeAccel are detected.
Affiliation(s)
- Wan-Yu Lin
- Institute of Health Data Analytics and Statistics, College of Public Health, National Taiwan University, Taipei, 100, Taiwan
- Master of Public Health Degree Program, College of Public Health, National Taiwan University, Taipei, 100, Taiwan
2
Boetto C, Frouin A, Henches L, Auvergne A, Suzuki Y, Patin E, Bredon M, Chiu A, Consortium MI, Sankararaman S, Zaitlen N, Kennedy SP, Quintana-Murci L, Duffy D, Sokol H, Aschard H. MANOCCA: a robust and computationally efficient test of covariance in high-dimension multivariate omics data. Brief Bioinform 2024; 25:bbae272. [PMID: 38856173 PMCID: PMC11163461 DOI: 10.1093/bib/bbae272] [Received: 11/16/2023] [Revised: 04/16/2024] [Accepted: 05/28/2024] [Indexed: 06/11/2024]
Abstract
Multivariate analysis is becoming central in studies investigating high-throughput molecular data, yet, some important features of these data are seldom explored. Here, we present MANOCCA (Multivariate Analysis of Conditional CovAriance), a powerful method to test for the effect of a predictor on the covariance matrix of a multivariate outcome. The proposed test is by construction orthogonal to tests based on the mean and variance and is able to capture effects that are missed by both approaches. We first compare the performances of MANOCCA with existing correlation-based methods and show that MANOCCA is the only test correctly calibrated in simulation mimicking omics data. We then investigate the impact of reducing the dimensionality of the data using principal component analysis when the sample size is smaller than the number of pairwise covariance terms analysed. We show that, in many realistic scenarios, the maximum power can be achieved with a limited number of components. Finally, we apply MANOCCA to 1000 healthy individuals from the Milieu Interieur cohort, to assess the effect of health, lifestyle and genetic factors on the covariance of two sets of phenotypes, blood biomarkers and flow cytometry-based immune phenotypes. Our analyses identify significant associations between multiple factors and the covariance of both omics data.
Affiliation(s)
- Christophe Boetto
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Arthur Frouin
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Léo Henches
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Antoine Auvergne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Yuka Suzuki
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR2000, 25-28 rue Dr Roux, 75015 Paris, France
- Marius Bredon
- Sorbonne Université, INSERM, Centre de recherche Saint-Antoine, CRSA, Microbiota, Gut and Inflammation Laboratory, Hôpital Saint-Antoine (UMR S938) Sorbonne Université, 27 rue Chaligny, 75012 Paris, France
- Alec Chiu
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
- Sriram Sankararaman
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
- Noah Zaitlen
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
- Sean P Kennedy
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR2000, 25-28 rue Dr Roux, 75015 Paris, France
- Chair of Human Genomics and Evolution, Collège de France, 11 Pl. Marcelin Berthelot, 75005 Paris, France
- Darragh Duffy
- Translational Immunology Unit, Institut Pasteur, Université de Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Harry Sokol
- Sorbonne Université, INSERM, Centre de recherche Saint-Antoine, CRSA, Microbiota, Gut and Inflammation Laboratory, Hôpital Saint-Antoine (UMR S938) Sorbonne Université, 27 rue Chaligny, 75012 Paris, France
- Paris Center for Microbiome Medicine, Fédération Hospitalo-Universitaire, 184 rue du Faubourg Saint-Antoine, 75571 PARIS Cedex 12, France
- Gastroenterology Department, AP-HP, Saint Antoine Hospital, 184 rue du faubourg Saint-Antoine, 75012 Paris, France
- INRAE Micalis & AgroParisTech, UMR1319, Micalis & AgroParisTech, 4 avenue Jean Jaurès, 78352 Jouy en Josas, France
- Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Department of Epidemiology, Harvard TH Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, United States
3
Kemper KE, Sidorenko J, Wang H, Hayes BJ, Wray NR, Yengo L, Keller MC, Goddard M, Visscher PM. Genetic influence on within-person longitudinal change in anthropometric traits in the UK Biobank. Nat Commun 2024; 15:3776. [PMID: 38710707 PMCID: PMC11074304 DOI: 10.1038/s41467-024-47802-7] [Received: 05/11/2023] [Accepted: 04/10/2024] [Indexed: 05/08/2024]
Abstract
The causes of temporal fluctuations in adult traits are poorly understood. Here, we investigate the genetic determinants of within-person trait variability of 8 repeatedly measured anthropometric traits in 50,117 individuals from the UK Biobank. We found that within-person (non-directional) variability had a SNP-based heritability of 2-5% for height, sitting height, body mass index (BMI) and weight (P ≤ 2.4 × 10⁻³). We also analysed longitudinal trait change and show a loss of both average height and weight beyond about 70 years of age. A variant tracking the Alzheimer's risk APOE-ε4 allele (rs429358) was significantly associated with weight loss (β = -0.047 kg per yr, s.e. 0.007, P = 2.2 × 10⁻¹¹), and using 2-sample Mendelian Randomisation we detected a relationship consistent with causality between decreased lumbar spine bone mineral density and height loss (b_xy = 0.011, s.e. 0.003, P = 3.5 × 10⁻⁴). Finally, population-level variance quantitative trait loci (vQTL) were consistent with within-person variability for several traits, indicating an overlap between trait variability assessed at the population or individual level. Our findings help elucidate the genetic influence on trait-change within an individual and highlight disease risks associated with these changes.
Affiliation(s)
- Kathryn E Kemper
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia.
- Julia Sidorenko
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Huanwei Wang
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Ben J Hayes
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD, Australia
- Naomi R Wray
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Department of Psychiatry, University of Oxford, Oxford, UK
- Loic Yengo
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
- Matthew C Keller
- Institute for Behavioral Genetics, University of Colorado, Boulder, CO, USA
- Michael Goddard
- Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, VIC, Australia
- Biosciences Research Division, Agriculture Victoria, Bundoora, VIC, Australia
- Peter M Visscher
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK.
4
Assary E, Coleman J, Hemani G, van Der Veijer M, Howe L, Palviainen T, Grasby K, Ahlskog R, Nygaard M, Cheesman R, Lim K, Reynolds C, Ordoñana J, Colodro-Conde L, Gordon S, Madrid-Valero J, Thalamuthu A, Hottenga JJ, Mengel-From J, Armstrong NJ, Sachdev P, Lee T, Brodaty H, Trollor J, Wright M, Ames D, Catts V, Latvala A, Vuoksimaa E, Mallard T, Harden K, Tucker-Drob E, Oskarsson S, Hammond C, Christensen K, Taylor M, Lundström S, Larsson H, Karlsson R, Pedersen N, Mather K, Medland S, Boomsma D, Martin N, Plomin R, Bartels M, Lichtenstein P, Kaprio J, Eley T, Davies N, Munroe P, Keers R. Genetics of environmental sensitivity to psychiatric and neurodevelopmental phenotypes: evidence from GWAS of monozygotic twins. Research Square 2024:rs.3.rs-4333635. [PMID: 38746362 PMCID: PMC11092831 DOI: 10.21203/rs.3.rs-4333635/v1] [Indexed: 05/16/2024]
Abstract
Individual sensitivity to environmental exposures may be genetically influenced. This genotype-by-environment interplay implies differences in phenotypic variance across genotypes. However, environmental sensitivity genetic variants have proven challenging to detect. GWAS of monozygotic twin differences is a family-based variance analysis method, which is more robust to systemic biases that impact population-based methods. We combined data from up to 21,792 monozygotic twins (10,896 pairs) from 11 studies to conduct the largest GWAS meta-analysis of monozygotic phenotypic differences in children and adolescents/adults for seven psychiatric and neurodevelopmental phenotypes: attention deficit hyperactivity disorder (ADHD) symptoms, autistic traits, anxiety and depression symptoms, psychotic-like experiences, neuroticism, and wellbeing. SNP-heritability estimates of variance in these phenotypes ranged from 0% to 18% (h²) but were imprecise. We identified a total of 13 genome-wide significant associations (SNP, gene, and gene-set), including genes related to stress-reactivity for depression, growth factor-related genes for autistic traits and catecholamine uptake-related genes for psychotic-like experiences. Monozygotic twins are an important new source of evidence about the genetics of environmental sensitivity.
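As a rough illustration of the monozygotic-difference design described in this abstract (not the consortium's actual pipeline), the sketch below simulates twin pairs who share a genotype and regresses the absolute within-pair phenotype difference on that genotype. The function names, the simulation parameters, and the use of a plain least-squares slope are all assumptions made for the example.

```python
import numpy as np


def simulate_mz_pairs(n_pairs, maf=0.3, sensitivity=0.3, seed=0):
    """Simulate MZ pairs whose environmental sensitivity depends on genotype.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, maf, size=n_pairs)  # shared by both twins
    shared = rng.normal(size=n_pairs)              # familial part, cancels in the difference
    scale = 1.0 + sensitivity * genotype           # genotype-dependent response to environment
    twin1 = shared + scale * rng.normal(size=n_pairs)
    twin2 = shared + scale * rng.normal(size=n_pairs)
    return genotype, twin1, twin2


def mz_difference_slope(genotype, twin1, twin2):
    """Least-squares slope of |twin1 - twin2| on genotype dosage."""
    abs_diff = np.abs(twin1 - twin2)
    design = np.column_stack([np.ones(len(genotype)), genotype])
    coef, *_ = np.linalg.lstsq(design, abs_diff, rcond=None)
    return coef[1]


genotype, twin1, twin2 = simulate_mz_pairs(n_pairs=10_000)
slope = mz_difference_slope(genotype, twin1, twin2)
print(f"slope of |MZ difference| on genotype: {slope:.3f}")
```

Because both twins carry the same genotype and the shared component cancels in the difference, a nonzero slope in this toy setting reflects genotype-dependent sensitivity to the unshared environment rather than a mean effect.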
Affiliation(s)
- Jonathan Coleman
- Institute of Psychiatry, Psychology, and Neuroscience, King's College London
- Teemu Palviainen
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki
- Karen Mather
- Centre for Healthy Brain Ageing, Psychiatry, University of New South Wales (UNSW)
- D Boomsma
- Vrije Universiteit Amsterdam, The Netherlands
- Robert Plomin
- Institute of Psychiatry, Psychology and Neuroscience, King's College London, London
5
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. Genome Med 2024; 16:62. [PMID: 38664839 PMCID: PMC11044415 DOI: 10.1186/s13073-024-01329-0] [Received: 11/30/2023] [Accepted: 04/02/2024] [Indexed: 04/28/2024]
Abstract
The "missing" heritability of complex traits may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. We propose a new kernel-based method called Latent Interaction Testing (LIT) to screen for genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Using simulated data, we demonstrate that LIT increases power to detect latent genetic interactions compared to univariate methods. We then apply LIT to obesity-related traits in the UK Biobank and detect variants with interactive effects near known obesity-related genes (URL: https://CRAN.R-project.org/package=lit ).
Affiliation(s)
- Andrew J Bass
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA.
- Shijia Bian
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, 30322, USA
- Aliza P Wingo
- Department of Psychiatry, Emory University, Atlanta, GA, 30322, USA
- Thomas S Wingo
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Department of Neurology, Emory University, Atlanta, GA, 30322, USA
- David J Cutler
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA
- Michael P Epstein
- Department of Human Genetics, Emory University, Atlanta, GA, 30322, USA.
6
Xiang R, Liu Y, Ben-Eghan C, Ritchie S, Lambert SA, Xu Y, Takeuchi F, Inouye M. Genome-wide analyses of variance in blood cell phenotypes provide new insights into complex trait biology and prediction. medRxiv 2024:2024.04.15.24305830. [PMID: 38699308 PMCID: PMC11065006 DOI: 10.1101/2024.04.15.24305830] [Indexed: 05/05/2024]
Abstract
Blood cell phenotypes are routinely tested in healthcare to inform clinical decisions. Genetic variants influencing mean blood cell phenotypes have been used to understand disease aetiology and improve prediction; however, additional information may be captured by genetic effects on observed variance. Here, we mapped variance quantitative trait loci (vQTL), i.e. genetic loci associated with trait variance, for 29 blood cell phenotypes from the UK Biobank (N~408,111). We discovered 176 independent blood cell vQTLs, of which 147 were not found by additive QTL mapping. vQTLs displayed on average 1.8-fold stronger negative selection than additive QTL, highlighting that selection acts to reduce extreme blood cell phenotypes. Variance polygenic scores (vPGSs) were constructed to stratify individuals in the INTERVAL cohort (N~40,466), where genetically less variable individuals (low vPGS) had higher conventional PGS accuracy (by ~19%) than genetically more variable individuals. Genetic prediction of blood cell traits improved by ~10% on average when combining PGS with vPGS. Using Mendelian randomisation and vPGS association analyses, we found that alcohol consumption significantly increased blood cell trait variances, highlighting the utility of blood cell vQTLs and vPGSs to provide novel insight into phenotype aetiology as well as improve prediction.
Affiliation(s)
- Ruidong Xiang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, Victoria 3083, Australia
- Baker Department of Cardiovascular Research, Translation and Implementation, La Trobe University, Melbourne, VIC, 3086, Australia
- Baker Department of Cardiometabolic Health, The University of Melbourne, VIC, 3010, Australia
- Yang Liu
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Chief Ben-Eghan
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Scott Ritchie
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Samuel A. Lambert
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Fumihiko Takeuchi
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Gene Diagnostics and Therapeutics, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
7
Westerman KE, Sofer T. Many roads to a gene-environment interaction. Am J Hum Genet 2024; 111:626-635. [PMID: 38579668 PMCID: PMC11023920 DOI: 10.1016/j.ajhg.2024.03.002] [Received: 11/28/2023] [Revised: 02/29/2024] [Accepted: 03/01/2024] [Indexed: 04/07/2024]
Abstract
Despite the importance of gene-environment interactions (GxEs) in improving and operationalizing genetic discovery, interpretation of any GxEs that are discovered can be surprisingly difficult. There are many potential biological and statistical explanations for a statistically significant finding and, likewise, it is not always clear what can be claimed based on a null result. A better understanding of the possible underlying mechanisms leading to a detected GxE can help investigators decide which are and which are not relevant to their hypothesis. Here, we provide a detailed explanation of five "phenomena," or data-generating mechanisms, that can lead to nonzero interaction estimates, as well as a discussion of specific instances in which they might be relevant. We hope that, given this framework, investigators can design more targeted experiments and provide cleaner interpretations of the associated results.
Affiliation(s)
- Kenneth E Westerman
- Department of Medicine, Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA; Metabolism Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA.
- Tamar Sofer
- Cardiovascular Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
8
Zhang X, Bell JT. Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (Bethesda) 2024; 14:jkae022. [PMID: 38289865 PMCID: PMC10989912 DOI: 10.1093/g3journal/jkae022] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/01/2024]
Abstract
Genetically associated phenotypic variability has been widely observed across organisms and traits, including in humans. Both gene-gene and gene-environment interactions can lead to an increase in genetically associated phenotypic variability. Therefore, detecting the underlying genetic variants, or variance Quantitative Trait Loci (vQTLs), can provide novel insights into complex traits. Established approaches to detect vQTLs apply different methodologies from variance-only approaches to mean-variance joint tests, but a comprehensive comparison of these methods is lacking. Here, we review available methods to detect vQTLs in humans, carry out a simulation study to assess their performance under different biological scenarios of gene-environment interactions, and apply the optimal approaches for vQTL identification to gene expression data. Overall, with a minor allele frequency (MAF) of less than 0.2, the squared residual value linear model (SVLM) and the deviation regression model (DRM) are optimal when the data follow normal and non-normal distributions, respectively. In addition, the Brown-Forsythe (BF) test is one of the optimal methods when the MAF is 0.2 or larger, irrespective of phenotype distribution. Additionally, a larger sample size and more balanced sample distribution in different exposure categories increase the power of BF, SVLM, and DRM. Our results highlight vQTL detection methods that perform optimally under realistic simulation settings and show that their relative performance depends on the phenotype distribution, allele frequency, sample size, and the type of exposure in the interaction model underlying the vQTL.
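To make the link between an unobserved gene-environment interaction and a variance signal concrete, the following sketch simulates a phenotype with a hidden exposure and applies the Brown-Forsythe test across genotype groups. It is only an illustration under assumed effect sizes, not the simulation design used in the paper; the helper names are hypothetical, and the test itself is taken from `scipy.stats.levene` with median centring.

```python
import numpy as np
from scipy import stats


def simulate_gxe_phenotype(n, maf=0.3, main=0.2, interaction=0.4, seed=0):
    """Phenotype with a genotype-by-environment term; the exposure stays unobserved.

    Effect sizes are hypothetical values picked for illustration.
    """
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, maf, size=n)
    exposure = rng.normal(size=n)  # hidden environmental exposure
    noise = rng.normal(size=n)
    phenotype = main * genotype + interaction * genotype * exposure + noise
    return genotype, phenotype


def brown_forsythe_by_genotype(genotype, phenotype):
    """Brown-Forsythe test (median-centred Levene test) across genotype groups."""
    groups = [phenotype[genotype == g] for g in np.unique(genotype)]
    return stats.levene(*groups, center="median")


genotype, phenotype = simulate_gxe_phenotype(n=20_000)
result = brown_forsythe_by_genotype(genotype, phenotype)
print(f"BF statistic = {result.statistic:.1f}, p = {result.pvalue:.2e}")
```

In this toy model the phenotypic variance grows with the number of effect alleles, so the test can flag the locus even though the interacting exposure is never supplied to it.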
Affiliation(s)
- Xiaopu Zhang
- Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Jordana T Bell
- Department of Twin Research and Genetic Epidemiology, King's College London, St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK
9
Reay WR, Clarke E, Eslick S, Riveros C, Holliday EG, McEvoy MA, Peel R, Hancock S, Scott RJ, Attia JR, Collins CE, Cairns MJ. Using Genetics to Inform Interventions Related to Sodium and Potassium in Hypertension. Circulation 2024; 149:1019-1032. [PMID: 38131187 PMCID: PMC10962430 DOI: 10.1161/circulationaha.123.065394] [Received: 05/02/2023] [Accepted: 11/28/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND Hypertension is a key risk factor for major adverse cardiovascular events but remains difficult to treat in many individuals. Dietary interventions are an effective approach to lower blood pressure (BP) but are not equally effective across all individuals. BP is heritable, and genetics may be a useful tool to overcome treatment response heterogeneity. We investigated whether the genetics of BP could be used to identify individuals with hypertension who may receive a particular benefit from lowering sodium intake and boosting potassium levels. METHODS In this observational genetic study, we leveraged cross-sectional data from up to 296 475 genotyped individuals drawn from the UK Biobank cohort for whom BP and urinary electrolytes (sodium and potassium), biomarkers of sodium and potassium intake, were measured. Biologically directed genetic scores for BP were constructed specifically among pathways related to sodium and potassium biology (pharmagenic enrichment scores), as well as unannotated genome-wide scores (conventional polygenic scores). We then tested whether there was a gene-by-environment interaction between urinary electrolytes and these genetic scores on BP. RESULTS Genetic risk and urinary electrolytes both independently correlated with BP. However, urinary sodium was associated with a larger BP increase among individuals with higher genetic risk in sodium- and potassium-related pathways than in those with comparatively lower genetic risk. For example, each SD in urinary sodium was associated with a 1.47-mm Hg increase in systolic BP for those in the top 10% of the distribution of genetic risk in sodium and potassium transport pathways versus a 0.97-mm Hg systolic BP increase in the lowest 10% (P=1.95×10⁻³). This interaction with urinary sodium remained when considering estimated glomerular filtration rate and indexing sodium to urinary creatinine. There was no strong evidence of an interaction between urinary sodium and a standard genome-wide polygenic score of BP. CONCLUSIONS The data suggest that genetic risk in sodium and potassium pathways could be used in a precision medicine model to direct interventions more specifically in the management of hypertension. Intervention studies are warranted.
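A score-by-exposure interaction of the kind tested in this abstract can be sketched as a linear model with a product term. The example below is a minimal illustration on simulated data, assuming standardised variables and made-up effect sizes; it does not reproduce the study's covariates, scores, or estimates, and the function and variable names are hypothetical.

```python
import numpy as np


def fit_interaction_model(score, exposure, outcome):
    """OLS fit of outcome ~ 1 + score + exposure + score:exposure.

    Returns coefficients in that order; the last one is the interaction term.
    """
    design = np.column_stack(
        [np.ones_like(score), score, exposure, score * exposure]
    )
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


rng = np.random.default_rng(42)
n = 50_000
genetic_score = rng.normal(size=n)   # standardised genetic score (simulated)
urinary_sodium = rng.normal(size=n)  # standardised exposure (simulated)
# Hypothetical effect sizes, not estimates from the study.
systolic_bp = (
    130.0
    + 2.0 * genetic_score
    + 1.2 * urinary_sodium
    + 0.25 * genetic_score * urinary_sodium
    + rng.normal(scale=5.0, size=n)
)

intercept, b_score, b_sodium, b_interaction = fit_interaction_model(
    genetic_score, urinary_sodium, systolic_bp
)
print(f"interaction estimate: {b_interaction:.3f}")
```

A positive interaction coefficient here means the per-SD sodium effect on blood pressure is larger at higher genetic scores, which is the pattern the abstract reports for the pathway-specific scores.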
Affiliation(s)
- William R. Reay
- Schools of Biomedical Sciences and Pharmacy (W.R.R., R.J.S., M.J.C.), The University of Newcastle, Callaghan, NSW, Australia
- Precision Medicine Research Program (W.R.R., M.J.C.), New Lambton, NSW, Australia
- Erin Clarke
- Health Sciences (E.C., S.E., C.E.C.), The University of Newcastle, Callaghan, NSW, Australia
- Food and Nutrition Research Program (E.C., C.E.C.), New Lambton, NSW, Australia
- Shaun Eslick
- Health Sciences (E.C., S.E., C.E.C.), The University of Newcastle, Callaghan, NSW, Australia
- Carlos Riveros
- Hunter Medical Research Institute (C.R., E.G.H., J.R.A.), New Lambton, NSW, Australia
- Elizabeth G. Holliday
- Medicine and Public Health (E.G.H., R.P., S.H., J.R.A.), The University of Newcastle, Callaghan, NSW, Australia
- Hunter Medical Research Institute (C.R., E.G.H., J.R.A.), New Lambton, NSW, Australia
- Mark A. McEvoy
- Rural Health School, La Trobe University, Bendigo, Victoria, Australia (M.A.M.)
- Roseanne Peel
- Medicine and Public Health (E.G.H., R.P., S.H., J.R.A.), The University of Newcastle, Callaghan, NSW, Australia
- Stephen Hancock
- Medicine and Public Health (E.G.H., R.P., S.H., J.R.A.), The University of Newcastle, Callaghan, NSW, Australia
- Rodney J. Scott
- Schools of Biomedical Sciences and Pharmacy (W.R.R., R.J.S., M.J.C.), The University of Newcastle, Callaghan, NSW, Australia
- Cancer Detection and Therapy Research Program (R.J.S.), New Lambton, NSW, Australia
- John R. Attia
- Medicine and Public Health (E.G.H., R.P., S.H., J.R.A.), The University of Newcastle, Callaghan, NSW, Australia
- Hunter Medical Research Institute (C.R., E.G.H., J.R.A.), New Lambton, NSW, Australia
- Clare E. Collins
- Health Sciences (E.C., S.E., C.E.C.), The University of Newcastle, Callaghan, NSW, Australia
- Food and Nutrition Research Program (E.C., C.E.C.), New Lambton, NSW, Australia
- Murray J. Cairns
- Schools of Biomedical Sciences and Pharmacy (W.R.R., R.J.S., M.J.C.), The University of Newcastle, Callaghan, NSW, Australia
- Precision Medicine Research Program (W.R.R., M.J.C.), New Lambton, NSW, Australia
| |
|
10
|
Lin WY. Searching for gene-gene interactions through variance quantitative trait loci of 29 continuous Taiwan Biobank phenotypes. Front Genet 2024; 15:1357238. [PMID: 38516378 PMCID: PMC10956579 DOI: 10.3389/fgene.2024.1357238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Accepted: 02/27/2024] [Indexed: 03/23/2024] Open
Abstract
Introduction: After the era of genome-wide association studies (GWAS), thousands of genetic variants have been identified to exhibit main effects on human phenotypes. The next critical issue would be to explore the interplay between genes, the so-called "gene-gene interactions" (GxG) or epistasis. An exhaustive search for all single-nucleotide polymorphism (SNP) pairs is not recommended because this will induce a harsh penalty of multiple testing. Limiting the search of epistasis on SNPs reported by previous GWAS may miss essential interactions between SNPs without significant marginal effects. Moreover, most methods are computationally intensive and can be challenging to implement genome-wide. Methods: I here searched for GxG through variance quantitative trait loci (vQTLs) of 29 continuous Taiwan Biobank (TWB) phenotypes. A discovery cohort of 86,536 and a replication cohort of 25,460 TWB individuals were analyzed, respectively. Results: A total of 18 nearly independent vQTLs with linkage disequilibrium measure r² < 0.01 were identified and replicated from nine phenotypes. 15 significant GxG were found with p-values < 1.1E-5 (in the discovery cohort) and false discovery rates < 2% (in the replication cohort). Among these 15 GxG, 11 were detected for blood traits including red blood cells, hemoglobin, and hematocrit; 2 for total bilirubin; 1 for fasting glucose; and 1 for total cholesterol (TCHO). All GxG were observed for gene pairs on the same chromosome, except for the APOA5 (chromosome 11)-TOMM40 (chromosome 19) interaction for TCHO. Discussion: This study provided a computationally feasible way to search for GxG genome-wide and applied this approach to 29 phenotypes.
Affiliation(s)
- Wan-Yu Lin
- Institute of Health Data Analytics and Statistics, College of Public Health, National Taiwan University, Taipei, Taiwan
- Master of Public Health Degree Program, College of Public Health, National Taiwan University, Taipei, Taiwan
| |
|
11
|
Zhou H, McPeek MS. Overcoming the "feast or famine" effect: improved interaction testing in genome-wide association studies. bioRxiv 2024:2024.02.13.580168. [PMID: 38405994 PMCID: PMC10888770 DOI: 10.1101/2024.02.13.580168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
In genetic association analysis of complex traits, detection of interaction (either GxG or GxE) can help to elucidate the genetic architecture and biological mechanisms underlying the trait. Detection of interaction in a genome-wide association study (GWAS) can be methodologically challenging for various reasons, including a high burden of multiple comparisons when testing for epistasis between all possible pairs of a set of genomewide variants, as well as heteroscedasticity effects occurring in the presence of GxG or GxE interaction. In this paper, we address the problem of an even more striking phenomenon that we call the "feast or famine" effect that occurs when testing interaction in a genomewide context. As we verify, even in a simplified setting in which there is no interaction at all (and so no heteroscedasticity), in a GWAS to detect GxG or GxE interaction with a fixed genetic variant or environmental factor, the distribution of the genome-wide p-values under the null hypothesis is not the i.i.d. uniform one that is commonly assumed. Using standard methods, even if all SNPs are independent, some GWASs will have systematically underinflated p-values ("feast"), and others will have systematically overinflated p-values ("famine"), which can lead to false detection of interaction, reduced power, inconsistent results across studies, and failure to replicate true signal. This startling phenomenon is specific to detection of interaction in a GWAS, and it may partly explain why such detection has so far proved challenging and difficult to replicate. We show theoretically that the key cause of this phenomenon is which variables are conditioned on in the analysis, and this suggests an approach to correct the problem by changing the way the conditioning is done. Using this insight, we have developed the TINGA method to adjust the interaction test statistics to make their p-values closer to uniform under the null hypothesis. 
In simulations we show that TINGA both controls type 1 error and improves power. TINGA allows for covariates and population structure through use of a linear mixed model and accounts for heteroscedasticity. We apply TINGA to detection of epistasis in a study of flowering time in Arabidopsis thaliana.
Affiliation(s)
- Huanlin Zhou
- Department of Statistics, The University of Chicago, Chicago, Illinois, U.S.A
| | - Mary Sara McPeek
- Department of Statistics, The University of Chicago, Chicago, Illinois, U.S.A
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, U.S.A
| |
|
12
|
Brown BC, Morris JA, Lappalainen T, Knowles DA. Large-scale causal discovery using interventional data sheds light on the regulatory network architecture of blood traits. bioRxiv 2023:2023.10.13.562293. [PMID: 37905013 PMCID: PMC10614812 DOI: 10.1101/2023.10.13.562293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Inference of directed biological networks is an important but notoriously challenging problem. We introduce inverse sparse regression (inspre), an approach to learning causal networks that leverages large-scale intervention-response data. Applied to 788 genes from the genome-wide perturb-seq dataset, inspre helps elucidate the network architecture of blood traits.
Affiliation(s)
- Brielin C. Brown
- New York Genome Center, New York, NY, USA
- Data Science Institute, Columbia University, New York, NY, USA
| | | | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Systems Biology, Columbia University, New York, NY
| | - David A. Knowles
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY
- Department of Computer Science, Columbia University, New York, NY
| |
|
13
|
Snaebjarnarson AS, Helgadottir A, Arnadottir GA, Ivarsdottir EV, Thorleifsson G, Ferkingstad E, Einarsson G, Sveinbjornsson G, Thorgeirsson TE, Ulfarsson MO, Halldorsson BV, Olafsson I, Erikstrup C, Pedersen OB, Nyegaard M, Bruun MT, Ullum H, Brunak S, Iversen KK, Christensen AH, Olesen MS, Ghouse J, Banasik K, Knowlton KU, Arnar DO, Thorgeirsson G, Nadauld L, Ostrowski SR, Bundgaard H, Holm H, Sulem P, Stefansson K, Gudbjartsson DF. Complex effects of sequence variants on lipid levels and coronary artery disease. Cell 2023; 186:4085-4099.e15. [PMID: 37714134 DOI: 10.1016/j.cell.2023.08.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/20/2023] [Revised: 05/06/2023] [Accepted: 08/10/2023] [Indexed: 09/17/2023]
Abstract
Many sequence variants have additive effects on blood lipid levels and, through that, on the risk of coronary artery disease (CAD). We show that variants also have non-additive effects and interact to affect lipid levels as well as affecting variance and correlations. Variance and correlation effects are often signatures of epistasis or gene-environmental interactions. These complex effects can translate into CAD risk. For example, Trp154Ter in FUT2 protects against CAD among subjects with the A1 blood group, whereas it associates with greater risk of CAD in others. His48Arg in ADH1B interacts with alcohol consumption to affect lipid levels and CAD. The effect of variants in TM6SF2 on blood lipids is greatest among those who never eat oily fish but absent from those who often do. This work demonstrates that variants that affect variance of quantitative traits can allow for the discovery of epistasis and interactions of variants with the environment.
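The link between variance effects and interactions noted in this abstract can be illustrated with a short derivation. This is a sketch under simplifying assumptions that are not taken from the paper: a linear model with a single environmental factor $E$ that is independent of the genotype $G$, and a residual $\varepsilon$ with constant variance $\sigma^2$ that is independent of both.

```latex
\begin{align}
Y &= \mu + \beta_G G + \beta_E E + \beta_{GE}\, G E + \varepsilon, \\
\operatorname{Var}(Y \mid G = g)
  &= (\beta_E + \beta_{GE}\, g)^2 \operatorname{Var}(E) + \sigma^2 .
\end{align}
```

Under these assumptions, with $\operatorname{Var}(E) > 0$, the conditional variance differs across genotypes $g \in \{0, 1, 2\}$ whenever $\beta_{GE} \neq 0$, so a variant that shifts trait variance is a candidate for an interaction even when $E$ is unobserved. The converse does not hold in general: variance heterogeneity can also arise from scale effects or from the choice of trait transformation.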
Affiliation(s)
| | | | | | | | | | | | | | | | | | - Magnus O Ulfarsson
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland; Faculty of Electrical and Computer Engineering, University of Iceland, Reykjavik 102, Iceland
| | | | - Isleifur Olafsson
- Department of Clinical Biochemistry, Landspitali - National University Hospital of Iceland, Hringbraut, Reykjavik 101, Iceland
| | - Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus 8200, Denmark; Department of Clinical Medicine, Health, Aarhus University, Aarhus 8200, Denmark
| | - Ole B Pedersen
- Department of Clinical Immunology, Zealand University Hospital, Køge 4600, Denmark; Department of Clinical Medicine, University of Copenhagen, Copenhagen 1165, Denmark
| | - Mette Nyegaard
- Department of Health Science and Technology, Faculty of Medicine, Aalborg University, Aalborg 9220, Denmark
| | - Mie T Bruun
- Department of Clinical Immunology, Odense University Hospital, Odense 5000, Denmark
| | - Henrik Ullum
- Statens Serum Institut, Copenhagen 2300, Denmark
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark
| | - Kasper Karmark Iversen
- Department of Clinical Medicine, University of Copenhagen, Copenhagen 1165, Denmark; Department of Emergency Medicine, Copenhagen University Hospital Herlev and Gentofte, Herlev 2900, Denmark; Department of Cardiology, Copenhagen University Hospital, Herlev-Gentofte Hospital, Herlev 2900, Denmark
| | - Alex Hoerby Christensen
- Department of Clinical Medicine, University of Copenhagen, Copenhagen 1165, Denmark; Department of Cardiology, Copenhagen University Hospital, Herlev-Gentofte Hospital, Herlev 2900, Denmark
| | - Morten S Olesen
- Laboratory for Molecular Cardiology, Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen 2100, Denmark; Laboratory for Molecular Cardiology, Department of Biomedical Sciences, University of Copenhagen, Copenhagen 1165, Denmark
| | - Jonas Ghouse
- Laboratory for Molecular Cardiology, Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen 2100, Denmark; Laboratory for Molecular Cardiology, Department of Biomedical Sciences, University of Copenhagen, Copenhagen 1165, Denmark
| | - Karina Banasik
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark
| | - Kirk U Knowlton
- Intermountain Medical Center, Intermountain Heart Institute, Salt Lake City, UT 84143, USA
| | - David O Arnar
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland; Faculty of Medicine, University of Iceland, Vatnsmyrarvegur, Reykjavik 101, Iceland; Division of Cardiology, Department of Internal Medicine, Landspitali - National University Hospital of Iceland, Hringbraut, Reykjavik 101, Iceland
| | - Gudmundur Thorgeirsson
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland; Faculty of Medicine, University of Iceland, Vatnsmyrarvegur, Reykjavik 101, Iceland; Division of Cardiology, Department of Internal Medicine, Landspitali - National University Hospital of Iceland, Hringbraut, Reykjavik 101, Iceland
| | - Lincoln Nadauld
- Precision Genomics, Intermountain Healthcare, Saint George, UT 84790, USA
| | - Sisse Rye Ostrowski
- Department of Clinical Medicine, University of Copenhagen, Copenhagen 1165, Denmark; Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen 2100, Denmark
| | - Henning Bundgaard
- Department of Clinical Medicine, University of Copenhagen, Copenhagen 1165, Denmark; Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen 2100, Denmark
| | - Hilma Holm
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland
| | | | - Kari Stefansson
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland; Faculty of Medicine, University of Iceland, Vatnsmyrarvegur, Reykjavik 101, Iceland.
| | - Daniel F Gudbjartsson
- deCODE genetics/Amgen, Inc., Reykjavik 102, Iceland; School of Engineering and Natural Sciences, University of Iceland, Reykjavik 102, Iceland.
| |
|
14
|
Bass AJ, Bian S, Wingo AP, Wingo TS, Cutler DJ, Epstein MP. Identifying latent genetic interactions in genome-wide association studies using multiple traits. bioRxiv 2023:2023.09.11.557155. [PMID: 37745553 PMCID: PMC10515795 DOI: 10.1101/2023.09.11.557155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Genome-wide association studies of complex traits frequently find that SNP-based estimates of heritability are considerably smaller than estimates from classic family-based studies. This 'missing' heritability may be partly explained by genetic variants interacting with other genes or environments that are difficult to specify, observe, and detect. To circumvent these challenges, we propose a new method to detect genetic interactions that leverages pleiotropy from multiple related traits without requiring the interacting variable to be specified or observed. Our approach, Latent Interaction Testing (LIT), uses the observation that correlated traits with shared latent genetic interactions have trait variance and covariance patterns that differ by genotype. LIT examines the relationship between trait variance/covariance patterns and genotype using a flexible kernel-based framework that is computationally scalable for biobank-sized datasets with a large number of traits. We first use simulated data to demonstrate that LIT substantially increases power to detect latent genetic interactions compared to a trait-by-trait univariate method. We then apply LIT to four obesity-related traits in the UK Biobank and detect genetic variants with interactive effects near known obesity-related genes. Overall, we show that LIT, implemented in the R package lit, uses shared information across traits to improve detection of latent genetic interactions compared to standard approaches.
Affiliation(s)
- Andrew J. Bass
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
| | - Shijia Bian
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
| | - Aliza P. Wingo
- Department of Psychiatry, Emory University, Atlanta, GA 30322, USA
| | - Thomas S. Wingo
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Department of Neurology, Emory University, Atlanta, GA 30322, USA
| | - David J. Cutler
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
| | | |
|
15
|
Liu Z, Ye T, Sun B, Schooling M, Tchetgen ET. Mendelian randomization mixed-scale treatment effect robust identification and estimation for causal inference. Biometrics 2023; 79:2208-2219. [PMID: 35950778 DOI: 10.1111/biom.13735] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/25/2021] [Accepted: 08/02/2022] [Indexed: 11/28/2022]
Abstract
Standard Mendelian randomization (MR) analysis can produce biased results if the genetic variant defining an instrumental variable (IV) is confounded and/or has a horizontal pleiotropic effect on the outcome of interest not mediated by the treatment variable. We provide novel identification conditions for the causal effect of a treatment in the presence of unmeasured confounding by leveraging a possibly invalid IV for which both the IV independence and exclusion restriction assumptions may be violated. The proposed Mendelian randomization mixed-scale treatment effect robust identification (MR MiSTERI) approach relies on (i) an assumption that the treatment effect does not vary with the possibly invalid IV on the additive scale; (ii) that the confounding bias does not vary with the possibly invalid IV on the odds ratio scale; and (iii) that the residual variance for the outcome is heteroskedastic with respect to the possibly invalid IV. Although assumptions (i) and (ii) have, respectively, appeared in the IV literature, assumption (iii) has not; we formally establish that their conjunction can identify a causal effect even with an invalid IV. MR MiSTERI is shown to be particularly advantageous in the presence of pervasive heterogeneity of pleiotropic effects on the additive scale. We propose a simple and consistent three-stage estimator that can be used as a preliminary estimator to a carefully constructed efficient one-step-update estimator. In order to incorporate multiple, possibly correlated, and weak invalid IVs, a common challenge in MR studies, we develop a MAny Weak Invalid Instruments (MR MaWII MiSTERI) approach for strengthened identification and improved estimation accuracy. Both simulation studies and UK Biobank data analysis results demonstrate the robustness of the proposed methods.
Affiliation(s)
- Zhonghua Liu
- Department of Biostatistics, Columbia University, New York, New York, USA
| | - Ting Ye
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Baoluo Sun
- Department of Statistics and Data Science, National University of Singapore, Singapore
| | - Mary Schooling
- CUNY Graduate School of Public Health and Health Policy, New York, New York, USA
- School of Public Health, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
| | - Eric Tchetgen Tchetgen
- Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| |
|
16
|
Lewinger JP, Kawaguchi ES, Gauderman WJ. A note on p-value multiple-testing adjustment for two-step genome-wide gene-environment interactions scans. medRxiv 2023:2023.06.27.23291946. [PMID: 37425767 PMCID: PMC10327251 DOI: 10.1101/2023.06.27.23291946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Two-step testing is the state-of-the-art approach for performing genome-wide interaction scans (GWIS). It is computationally efficient and yields higher power than standard single-step-based GWIS for virtually all biologically plausible scenarios. However, while two-step tests control the genome-wide type I error rate at the desired level, the lack of associated valid p-values can make it difficult for users to compare with single-step results. We show how multiple-testing adjusted p-values can be defined for two-step tests based on standard multiple-testing theory, and how they can in turn be scaled to make valid comparisons with single-step tests possible.
Affiliation(s)
- Juan Pablo Lewinger
- Department of Population and Public Health Sciences, University of Southern California
| | - Eric S Kawaguchi
- Department of Population and Public Health Sciences, University of Southern California
| | - W James Gauderman
- Department of Population and Public Health Sciences, University of Southern California
| |
|
17
|
Wang C, Wang T, Wei Y, Aschard H, Ionita-Laza I. Quantile Regression for biomarkers in the UK Biobank. bioRxiv 2023:2023.06.05.543699. [PMID: 37333162 PMCID: PMC10274625 DOI: 10.1101/2023.06.05.543699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. GWAS for quantitative traits are based on simplified regression models modeling the conditional mean of a phenotype as a linear function of genotype. An alternative and easy-to-apply approach is quantile regression that naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest by modeling conditional quantiles within a regression framework. Quantile regression can be applied efficiently at biobank scale using standard statistical packages in much the same way as linear regression, while having some unique advantages such as identifying variants with heterogeneous effects across different quantiles, including non-additive effects and variants involved in gene-environment interactions; accommodating a wide range of phenotype distributions with invariance to trait transformation; and overall providing more detailed information about the underlying genotype-phenotype associations. Here, we demonstrate the value of quantile regression in the context of GWAS by applying it to 39 quantitative traits in the UK Biobank (n > 300,000 individuals). Across these 39 traits we identify 7,297 significant loci, including 259 loci only detected by quantile regression. We show that quantile regression can help uncover replicable but unmodelled gene-environment interactions, and can provide additional key insights into poorly understood genotype-phenotype correlations for clinically relevant biomarkers at minimal additional cost.
Affiliation(s)
- Chen Wang
- Department of Biostatistics, Columbia University, New York, USA
| | - Tianying Wang
- Center for Statistical Science & Department of Industrial Engineering, Tsinghua University, Beijing, China
| | - Ying Wei
- Department of Biostatistics, Columbia University, New York, USA
| | - Hugues Aschard
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, France
| | | |
|
18
|
Jin B, Dunson DB, Rager JE, Reif DM, Engel SM, Herring AH. Bayesian matrix completion for hypothesis testing. J R Stat Soc Ser C Appl Stat 2023; 72:254-270. [PMID: 37197290 PMCID: PMC10184491 DOI: 10.1093/jrsssc/qlac005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2020] [Revised: 09/26/2021] [Accepted: 10/07/2022] [Indexed: 03/17/2023]
Abstract
We aim to infer bioactivity of each chemical by assay endpoint combination, addressing sparsity of toxicology data. We propose a Bayesian hierarchical framework which borrows information across different chemicals and assay endpoints, facilitates out-of-sample prediction of activity for chemicals not yet assayed, quantifies uncertainty of predicted activity, and adjusts for multiplicity in hypothesis testing. Furthermore, this paper makes a novel attempt in toxicology to simultaneously model heteroscedastic errors and a nonparametric mean function, leading to a broader definition of activity whose need has been suggested by toxicologists. Real application identifies chemicals most likely active for neurodevelopmental disorders and obesity.
Affiliation(s)
- Bora Jin
- Duke University, Durham, NC, USA
| | | | - Julia E Rager
- University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - David M Reif
- North Carolina State University, Raleigh, NC, USA
| | | | | |
|
19
|
Head ST, Leslie EJ, Cutler DJ, Epstein MP. POIROT: a powerful test for parent-of-origin effects in unrelated samples leveraging multiple phenotypes. Bioinformatics 2023; 39:btad199. [PMID: 37067493 PMCID: PMC10148680 DOI: 10.1093/bioinformatics/btad199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 04/03/2023] [Accepted: 04/06/2023] [Indexed: 04/18/2023] Open
Abstract
Motivation: There is widespread interest in identifying genetic variants that exhibit parent-of-origin effects (POEs) wherein the effect of an allele on phenotype expression depends on its parental origin. POEs can arise from different phenomena including genomic imprinting and have been documented for many complex traits. Traditional tests for POEs require family data to determine parental origins of transmitted alleles. As most genome-wide association studies (GWAS) sample unrelated individuals (where allelic parental origin is unknown), the study of POEs in such datasets requires sophisticated statistical methods that exploit genetic patterns we anticipate observing when POEs exist. We propose a method to improve discovery of POE variants in large-scale GWAS samples that leverages potential pleiotropy among multiple correlated traits often collected in such studies. Our method compares the phenotypic covariance matrix of heterozygotes to homozygotes based on a Robust Omnibus Test. We refer to our method as the Parent of Origin Inference using Robust Omnibus Test (POIROT) of multiple quantitative traits. Results: Through simulation studies, we compared POIROT to a competing univariate variance-based method which considers separate analysis of each phenotype. We observed POIROT to be well-calibrated with improved power to detect POEs compared to univariate methods. POIROT is robust to non-normality of phenotypes and can adjust for population stratification and other confounders. Finally, we applied POIROT to GWAS data from the UK Biobank using BMI and two cholesterol phenotypes. We identified 338 genome-wide significant loci for follow-up investigation. Availability and implementation: The code for this method is available at https://github.com/staylorhead/POIROT-POE.
Affiliation(s)
- S Taylor Head
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, United States
| | - Elizabeth J Leslie
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States
| | - David J Cutler
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States
| | - Michael P Epstein
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States
| |
|
20
|
Kawaguchi ES, Kim AE, Pablo Lewinger J, Gauderman WJ. Improved two-step testing of genome-wide gene-environment interactions. Genet Epidemiol 2023; 47:152-166. [PMID: 36571162 PMCID: PMC9974838 DOI: 10.1002/gepi.22509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 10/13/2022] [Accepted: 11/11/2022] [Indexed: 12/27/2022]
Abstract
Two-step tests for gene-environment (G × E) interactions exploit marginal single-nucleotide polymorphism (SNP) effects to improve the power of a genome-wide interaction scan. They combine a screening step based on marginal effects used to "bin" SNPs for weighted hypothesis testing in the second step to deliver greater power over single-step tests while preserving the genome-wide Type I error. However, the presence of many SNPs with detectable marginal effects on the trait of interest can reduce power by "displacing" true interactions with weaker marginal effects and by adding to the number of tests that need to be corrected for multiple testing. We introduce a new significance-based allocation into bins for Step-2 G × E testing that overcomes the displacement issue and propose a computationally efficient approach to account for multiple testing within bins. Simulation results demonstrate that these simple improvements can provide substantially greater power than current methods under several scenarios. An application to a multistudy collaboration for understanding colorectal cancer reveals a G × Sex interaction located near the SMAD7 gene.
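The binning idea described in this abstract can be sketched as follows. This is a minimal illustration of conventional size-based weighted hypothesis testing, not the significance-based allocation the paper introduces. The function name `two_step_thresholds`, the initial bin size of 5, the doubling of bin sizes, and the halving of the error budget per bin are all assumptions chosen for illustration.

```python
def two_step_thresholds(screen_pvalues, alpha=0.05, first_bin=5):
    """Map each SNP to its Step-2 significance threshold.

    screen_pvalues: dict of SNP id -> Step-1 (marginal-effect) p-value.
    Assumed scheme: SNPs are ranked by Step-1 p-value; bin b (b = 1, 2, ...)
    holds first_bin * 2**(b - 1) SNPs and receives alpha / 2**b of the
    error budget, split evenly over the bin's nominal size.
    """
    ranked = sorted(screen_pvalues, key=screen_pvalues.get)
    thresholds = {}
    start, size, bin_alpha = 0, first_bin, alpha / 2
    while start < len(ranked):
        for snp in ranked[start:start + size]:
            # Dividing by the nominal bin size is conservative when the
            # last bin is only partly filled.
            thresholds[snp] = bin_alpha / size
        start += size
        size *= 2        # the next bin is twice as large
        bin_alpha /= 2   # and gets half the error budget
    return thresholds


if __name__ == "__main__":
    # Hypothetical Step-1 p-values for seven SNPs.
    toy = {f"snp{i}": i / 100 for i in range(1, 8)}
    for snp, thr in two_step_thresholds(toy).items():
        print(snp, thr)
```

Because the per-bin budgets form a geometric series (alpha/2 + alpha/4 + ...), the thresholds sum to at most alpha, which is what preserves the genome-wide Type I error in this kind of scheme. SNPs with strong marginal signals land in small early bins and face a lenient Step-2 threshold, which is also the source of the "displacement" problem the abstract describes.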
Affiliation(s)
- Eric S. Kawaguchi
- Department of Population and Public Health Sciences, University of Southern California, California, USA
| | - Andre E. Kim
- Department of Population and Public Health Sciences, University of Southern California, California, USA
| | - Juan Pablo Lewinger
- Department of Population and Public Health Sciences, University of Southern California, California, USA
| | - W. James Gauderman
- Department of Population and Public Health Sciences, University of Southern California, California, USA
| |
|
21
|
Environmental neuroscience linking exposome to brain structure and function underlying cognition and behavior. Mol Psychiatry 2023; 28:17-27. [PMID: 35790874 DOI: 10.1038/s41380-022-01669-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/07/2021] [Revised: 06/02/2022] [Accepted: 06/09/2022] [Indexed: 01/07/2023]
Abstract
Individual differences in human brain structure, function, and behavior can be attributed to genetic variations, environmental exposures, and their interactions. Although genome-wide association studies have identified many genetic variants associated with brain imaging phenotypes, environmental exposures associated with these phenotypes remain largely unknown. Here, we propose that environmental neuroscience should pay more attention to exploring the associations between lifetime environmental exposures (exposome) and brain imaging phenotypes and to identifying both cumulative environmental effects and their vulnerable age windows during the life course. Exposome-neuroimaging association studies face several challenges including the accurate measurement of the totality of environmental exposures that vary in space and time, the highly correlated structure of the exposome, and the lack of standardized approaches for exposome-wide association studies. By agnostically scanning the effects of environmental exposures on brain imaging phenotypes and their interactions with genomic variations, exposome-neuroimaging association analyses will improve our understanding of causal factors associated with individual differences in brain structure and function as well as their relations with cognitive abilities and neuropsychiatric disorders.
|
22
|
Lu T, Forgetta V, Richards JB, Greenwood CMT. Genetic determinants of polygenic prediction accuracy within a population. Genetics 2022; 222:6762086. [PMID: 36250789 PMCID: PMC9713421 DOI: 10.1093/genetics/iyac158] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/30/2022] [Accepted: 10/10/2022] [Indexed: 11/15/2022] Open
Abstract
Genomic risk prediction is on the emerging path toward personalized medicine. However, the accuracy of polygenic prediction varies strongly in different individuals. Based on up to 352,277 European ancestry participants in the UK Biobank, we constructed polygenic risk scores for 15 physiological and biochemical quantitative traits. We identified a total of 185 polygenic prediction variability quantitative trait loci for 11 traits by Levene's test among 254,376 unrelated individuals. We validated the effects of prediction variability quantitative trait loci using an independent test set of 58,927 individuals. For instance, a score aggregating 51 prediction variability quantitative trait locus variants for triglycerides had the strongest Spearman correlation of 0.185 (P-value < 1.0 × 10⁻³⁰⁰) with the squared prediction errors. We found a strong enrichment of complex genetic effects conferred by prediction variability quantitative trait loci compared to risk loci identified in genome-wide association studies, including 89 prediction variability quantitative trait loci exhibiting dominance effects. Incorporation of dominance effects into polygenic risk scores significantly improved polygenic prediction for triglycerides, low-density lipoprotein cholesterol, vitamin D, and platelet. In conclusion, we have discovered and profiled genetic determinants of polygenic prediction variability for 11 quantitative biomarkers. These findings may assist interpretation of genomic risk prediction in various contexts and encourage novel approaches for constructing polygenic risk scores with complex genetic effects.
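The variance scan in this abstract relies on Levene's test, which asks whether a quantity has the same spread across genotype groups. Below is a minimal sketch of the test statistic using only the Python standard library. It uses the median-centred (Brown-Forsythe) variant, which is an assumption, since the abstract does not say which centring was used. The function name `brown_forsythe_w` and the toy data are hypothetical.

```python
from statistics import mean, median


def brown_forsythe_w(groups):
    """Median-centred Levene (Brown-Forsythe) W statistic for k groups.

    groups: list of lists of trait values (or prediction errors),
    one inner list per genotype group.
    """
    # Absolute deviation of every observation from its own group median.
    z = []
    for g in groups:
        m = median(g)
        z.append([abs(y - m) for y in g])

    n = sum(len(g) for g in z)          # total sample size
    k = len(z)                          # number of groups
    group_means = [mean(g) for g in z]  # mean deviation per group
    grand_mean = sum(sum(g) for g in z) / n

    between = sum(len(g) * (gm - grand_mean) ** 2
                  for g, gm in zip(z, group_means))
    within = sum((v - gm) ** 2
                 for g, gm in zip(z, group_means) for v in g)
    return (n - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Toy values for genotype groups 0, 1 and 2; spread grows with
    # the number of copies of the hypothetical effect allele.
    toy = [
        [1.0, 1.1, 0.9, 1.0],
        [0.5, 1.5, 1.0, 2.0],
        [0.0, 3.0, 1.0, -1.0],
    ]
    print("W =", brown_forsythe_w(toy))
```

Under the null hypothesis of equal variances, W is referred to an F distribution with k - 1 and n - k degrees of freedom to obtain a p-value. That step is omitted here to keep the sketch dependency-free. In practice a library routine such as `scipy.stats.levene` would normally be used instead.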
Affiliation(s)
- Tianyuan Lu
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada; Quantitative Life Sciences Program, McGill University, Montreal, QC H3A 0G4, Canada
- Vincenzo Forgetta
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada
- John Brent Richards
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada; Department of Human Genetics, McGill University, Montreal, QC H3A 0G4, Canada; Department of Twin Research and Genetic Epidemiology, King's College London, London WC2R 2LS, UK
- Celia M T Greenwood
- Centre for Clinical Epidemiology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC H3T 1E2, Canada; Department of Human Genetics, McGill University, Montreal, QC H3A 0G4, Canada; Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC H3A 0G4, Canada; Gerald Bronfman Department of Oncology, McGill University, Montreal, QC H3A 0G4, Canada
23
Hecker J, Prokopenko D, Moll M, Lee S, Kim W, Qiao D, Voorhies K, Kim W, Vansteelandt S, Hobbs BD, Cho MH, Silverman EK, Lutz SM, DeMeo DL, Weiss ST, Lange C. A robust and adaptive framework for interaction testing in quantitative traits between multiple genetic loci and exposure variables. PLoS Genet 2022; 18:e1010464. [PMID: 36383614 PMCID: PMC9668174 DOI: 10.1371/journal.pgen.1010464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2022] [Accepted: 10/04/2022] [Indexed: 11/17/2022] Open
Abstract
The identification and understanding of gene-environment interactions can provide insights into the pathways and mechanisms underlying complex diseases. However, testing for gene-environment interaction remains a challenge since a.) statistical power is often limited and b.) modeling of environmental effects is nontrivial and such model misspecifications can lead to false positive interaction findings. To address the lack of statistical power, recent methods aim to identify interactions on an aggregated level using, for example, polygenic risk scores. While this strategy can increase the power to detect interactions, identifying contributing genes and pathways is difficult based on these relatively global results. Here, we propose RITSS (Robust Interaction Testing using Sample Splitting), a gene-environment interaction testing framework for quantitative traits that is based on sample splitting and robust test statistics. RITSS can incorporate sets of genetic variants and/or multiple environmental factors. Based on the user's choice of statistical/machine learning approaches, a screening step selects and combines potential interactions into scores with improved interpretability. In the testing step, the application of robust statistics minimizes the susceptibility to main effect misspecifications. Using extensive simulation studies, we demonstrate that RITSS controls the type 1 error rate in a wide range of scenarios, and we show how the screening strategy influences statistical power. In an application to lung function phenotypes and human height in the UK Biobank, RITSS identified highly significant interactions based on subcomponents of genetic risk scores. While the contributing single variant interaction signals are weak, our results indicate interaction patterns that result in strong aggregated effects, providing potential insights into underlying gene-environment interaction mechanisms.
Affiliation(s)
- Julian Hecker
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Dmitry Prokopenko
- Harvard Medical School, Boston, Massachusetts, United States of America
- Genetics and Aging Unit and McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Matthew Moll
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Sanghun Lee
- Department of Medical Consilience, Division of Medicine, Graduate School, Dankook University, Yongin, South Korea
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Wonji Kim
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Dandi Qiao
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Kirsten Voorhies
- Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Population Medicine, PRecisiOn Medicine Translational Research (PROMoTeR) Center, Harvard Pilgrim Health Care, Boston, Massachusetts, United States of America
- Woori Kim
- Harvard Medical School, Boston, Massachusetts, United States of America
- Systems Biology and Computer Science Program, Ann Romney Center for Neurological Diseases, Department of Neurology, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Stijn Vansteelandt
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Brian D. Hobbs
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Michael H. Cho
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Edwin K. Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Sharon M. Lutz
- Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Department of Population Medicine, PRecisiOn Medicine Translational Research (PROMoTeR) Center, Harvard Pilgrim Health Care, Boston, Massachusetts, United States of America
- Dawn L. DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Scott T. Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Christoph Lange
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
24
Clark R, Pozarickij A, Hysi PG, Ohno-Matsui K, Williams C, Guggenheim JA. Education interacts with genetic variants near GJD2, RBFOX1, LAMA2, KCNQ5 and LRRC4C to confer susceptibility to myopia. PLoS Genet 2022; 18:e1010478. [PMID: 36395078 PMCID: PMC9671369 DOI: 10.1371/journal.pgen.1010478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Accepted: 10/14/2022] [Indexed: 11/19/2022] Open
Abstract
Myopia most often develops during school age, with the highest incidence in countries with intensive education systems. Interactions between genetic variants and educational exposure are hypothesized to confer susceptibility to myopia, but few such interactions have been identified. Here, we aimed to identify genetic variants that interact with education level to confer susceptibility to myopia. Two groups of unrelated participants of European ancestry from UK Biobank were studied. A 'Stage-I' sample of 88,334 participants whose refractive error (avMSE) was measured by autorefraction and a 'Stage-II' sample of 252,838 participants who self-reported their age-of-onset of spectacle wear (AOSW) but who did not undergo autorefraction. Genetic variants were prioritized via a 2-step screening process in the Stage-I sample: Step 1 was a genome-wide association study for avMSE; Step 2 was a variance heterogeneity analysis for avMSE. Genotype-by-education interaction tests were performed in the Stage-II sample, with University education coded as a binary exposure. On average, participants were 58 years-old and left full-time education when they were 18 years-old; 35% reported University level education. The 2-step screening strategy in the Stage-I sample prioritized 25 genetic variants (GWAS P < 1e-04; variance heterogeneity P < 5e-05). In the Stage-II sample, 19 of the 25 (76%) genetic variants demonstrated evidence of variance heterogeneity, suggesting the majority were true positives. Five genetic variants located near GJD2, RBFOX1, LAMA2, KCNQ5 and LRRC4C had evidence of a genotype-by-education interaction in the Stage-II sample (P < 0.002) and consistent evidence of a genotype-by-education interaction in the Stage-I sample. For all 5 variants, University-level education was associated with an increased effect of the risk allele. 
In this cohort, additional years of education were associated with an enhanced effect of genetic variants that have roles including axon guidance and the development of neuronal synapses and neural circuits.
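The two-step prioritization described in this abstract can be sketched roughly as below. This is an illustrative reading rather than the authors' code: the function name `two_step_screen`, the dictionary layout, and the toy p-values are hypothetical, and treating the reported interaction threshold of P < 0.002 as 0.05 divided by the 25 prioritized variants is an assumption, not something the abstract states.

```python
def two_step_screen(variants, gwas_alpha=1e-4, vqtl_alpha=5e-5):
    """Return IDs of variants passing both screening steps.

    Step 1: mean-effect (GWAS) p-value below `gwas_alpha`.
    Step 2: variance-heterogeneity p-value below `vqtl_alpha`.
    Default thresholds are the ones quoted in the abstract.
    """
    return [
        v["id"]
        for v in variants
        if v["gwas_p"] < gwas_alpha and v["vqtl_p"] < vqtl_alpha
    ]

# Hypothetical p-values, invented for illustration only.
toy_variants = [
    {"id": "snp_a", "gwas_p": 3e-6, "vqtl_p": 1e-6},  # passes both steps
    {"id": "snp_b", "gwas_p": 2e-5, "vqtl_p": 4e-3},  # fails step 2
    {"id": "snp_c", "gwas_p": 5e-2, "vqtl_p": 1e-8},  # fails step 1
]
prioritized = two_step_screen(toy_variants)

# Assumed Bonferroni-style threshold for the follow-up interaction tests.
interaction_alpha = 0.05 / 25
```

Screening in one sample and testing interactions in a separate sample, as the abstract describes, keeps the interaction test independent of the selection step.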
Affiliation(s)
- Rosie Clark
- School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
- Alfred Pozarickij
- School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
- Pirro G. Hysi
- Section of Ophthalmology, School of Life Course Sciences, King’s College London, London, United Kingdom
- Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, London, United Kingdom
- Kyoko Ohno-Matsui
- Department of Ophthalmology and Visual Science, Tokyo Medical and Dental University, Tokyo, Japan
- Cathy Williams
- Centre for Academic Child Health, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Jeremy A. Guggenheim
- School of Optometry & Vision Sciences, Cardiff University, Cardiff, United Kingdom
25
Wang T, Ionita-Laza I, Wei Y. Integrated Quantile RAnk Test (iQRAT) for gene-level associations. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1548] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Affiliation(s)
- Tianying Wang
- Center for Statistical Science & Department of Industrial Engineering, Tsinghua University
- Ying Wei
- Department of Biostatistics, Columbia University
26
Maxwell TJ, Franks PW, Kahn SE, Knowler WC, Mather KJ, Florez JC, Jablonski KA. Quantitative trait loci, G×E and G×G for glycemic traits: response to metformin and placebo in the Diabetes Prevention Program (DPP). J Hum Genet 2022; 67:465-473. [PMID: 35260800 PMCID: PMC10102970 DOI: 10.1038/s10038-022-01027-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Revised: 02/16/2022] [Accepted: 02/17/2022] [Indexed: 11/09/2022]
Abstract
The complex genetic architecture of type-2-diabetes (T2D) includes gene-by-environment (G×E) and gene-by-gene (G×G) interactions. To identify G×E and G×G, we screened markers for patterns indicative of interactions (relationship loci [rQTL] and variance heterogeneity loci [vQTL]). rQTL exist when the correlation between multiple traits varies by genotype and vQTL occur when the variance of a trait differs by genotype (potentially flagging G×G and G×E). In the metformin and placebo arms of the DPP (n = 1762) we screened 280,965 exomic and intergenic SNPs for rQTL and vQTL patterns in association with year one changes from baseline in glycemia and related traits (insulinogenic index [IGI], insulin sensitivity index [ISI], fasting glucose and fasting insulin). Significant (p < 1.8 × 10^-7) rQTL and vQTL generated a priori hypotheses of individual G×E tests for a SNP × metformin treatment interaction and secondarily for G×G screens. Several rQTL and vQTL identified led to 6 nominally significant (p < 0.05) metformin treatment × SNP interactions (4 for IGI, one insulin, and one glucose) and 12 G×G interactions (all IGI) that exceeded experiment-wide significance (p < 4.1 × 10^-9). Some loci are directly associated with incident diabetes, and others are rQTL and modify a trait's relationship with diabetes (2 diabetes/glucose, 2 diabetes/insulin, 1 diabetes/IGI). rs3197999, an ISI/insulin rQTL and a possible gene-damaging missense mutation in MST1, is associated with ulcerative colitis, sclerosing cholangitis, Crohn's disease, BMI and coronary artery disease. This study demonstrates evidence for context-dependent effects (G×G & G×E) and the complexity of these T2D-related traits.
Affiliation(s)
- Taylor J Maxwell
- Computational Biology Institute, The George Washington University, Ashburn, VA, USA.
- Paul W Franks
- Genetic & Molecular Epidemiology Unit, Lund University Diabetes Center, Lund, Sweden
- Steven E Kahn
- VA Puget Sound Health Care System and University of Washington, Seattle, WA, USA
- William C Knowler
- National Institute of Diabetes and Digestive and Kidney Diseases, Phoenix, AZ, USA
- Kieren J Mather
- Center for Diabetes and Metabolic Diseases & Division of Endocrinology & Metabolism, Indiana University School of Medicine, Indianapolis, IN, USA
- Jose C Florez
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Programs in Metabolism and Medical & Population Genetics, Broad Institute, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Kathleen A Jablonski
- The Biostatistics Center, The Milken Institute of Public Health, The George Washington University, Rockville, MD, USA
27
Shi G. Genome-wide variance quantitative trait locus analysis suggests small interaction effects in blood pressure traits. Sci Rep 2022; 12:12649. [PMID: 35879408 PMCID: PMC9314370 DOI: 10.1038/s41598-022-16908-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 07/18/2022] [Indexed: 11/09/2022] Open
Abstract
Genome-wide variance quantitative trait loci (vQTL) analysis complements genome-wide association study (GWAS) and has the potential to identify novel variants associated with the trait, explain additional trait variance and lead to the identification of factors that modulate the genetic effects. I conducted genome-wide analysis of the UK Biobank data and identified 27 vQTLs associated with systolic blood pressure (SBP), diastolic blood pressure (DBP) and pulse pressure (PP). The top single-nucleotide polymorphisms (SNPs) are enriched for expression QTLs (eQTLs) or splicing QTLs (sQTLs) annotated by GTEx, suggesting their regulatory roles in mediating the associations with blood pressure (BP). Of the 27 vQTLs, 14 are known BP-associated QTLs discovered by GWASs. The heteroscedasticity effects of the 13 novel vQTLs are larger than their genetic main effects, which were not detected by existing GWASs. The total R-squared of the 27 top SNPs due to variance heteroscedasticity is 0.28%, compared with 0.50% owing to their main effects. The overall effect size of the variance heteroscedasticity is small in GWAS SNPs compared with their main effects. For the 411, 384 and 285 GWAS SNPs associated with SBP, DBP and PP, respectively, their heteroscedasticity effects were 0.52%, 0.43%, and 0.16%, and their main effects were 5.13%, 5.61%, and 3.75%, respectively. The number and effects of the vQTLs are small, which suggests that the effects of gene-environment and gene-gene interactions are small. The main effects of the SNPs remain the major source of genetic variance for BP, which would probably be true for other complex traits as well.
Affiliation(s)
- Gang Shi
- School of Telecommunications Engineering, Xidian University, 2 South Taibai Road, Xi'an, 710071, Shaanxi, China.
28
Westerman KE, Majarian TD, Giulianini F, Jang DK, Miao J, Florez JC, Chen H, Chasman DI, Udler MS, Manning AK, Cole JB. Variance-quantitative trait loci enable systematic discovery of gene-environment interactions for cardiometabolic serum biomarkers. Nat Commun 2022; 13:3993. [PMID: 35810165 PMCID: PMC9271055 DOI: 10.1038/s41467-022-31625-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/15/2021] [Accepted: 06/24/2022] [Indexed: 11/29/2022] Open
Abstract
Gene-environment interactions represent the modification of genetic effects by environmental exposures and are critical for understanding disease and informing personalized medicine. These often induce differential phenotypic variance across genotypes; these variance-quantitative trait loci can be prioritized in a two-stage interaction detection strategy to greatly reduce the computational and statistical burden and enable testing of a broader range of exposures. We perform genome-wide variance-quantitative trait locus analysis for 20 serum cardiometabolic biomarkers by multi-ancestry meta-analysis of 350,016 unrelated participants in the UK Biobank, identifying 182 independent locus-biomarker pairs (p < 4.5 × 10^-9). Most are concentrated in a small subset (4%) of loci with genome-wide significant main effects, and 44% replicate (p < 0.05) in the Women's Genome Health Study (N = 23,294). Next, we test each locus-biomarker pair for interaction across 2380 exposures, identifying 847 significant interactions (p < 2.4 × 10^-7), of which 132 are independent (p < 0.05) after accounting for correlation between exposures. Specific examples demonstrate interaction of triglyceride-associated variants with distinct body mass- versus body fat-related exposures as well as genotype-specific associations between alcohol consumption and liver stress at the ADH1B gene. Our catalog of variance-quantitative trait loci and gene-environment interactions is publicly available in an online portal.
Affiliation(s)
- Kenneth E Westerman
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA.
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Department of Medicine, Harvard Medical School, Boston, MA, USA.
- Timothy D Majarian
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Franco Giulianini
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Dong-Keun Jang
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Jenkai Miao
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA
- Jose C Florez
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Daniel I Chasman
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Medical and Population Genetics Program, Broad Institute, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Miriam S Udler
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alisa K Manning
- Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA, USA
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Joanne B Cole
- Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Division of Endocrinology, Boston Children's Hospital, Boston, MA, USA.
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
29
Deng WQ, Sun L. gJLS2: an R package for generalized joint location and scale analysis in X-inclusive genome-wide association studies. G3 Genes|Genomes|Genetics 2022; 12:6535712. [PMID: 35201341 PMCID: PMC8982384 DOI: 10.1093/g3journal/jkac049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/17/2022] [Indexed: 11/12/2022]
Abstract
A joint analysis of location and scale can be a powerful tool in genome-wide association studies to uncover previously overlooked markers that influence a quantitative trait through both mean and variance, as well as to prioritize candidates for gene–environment interactions. This approach has recently been generalized to handle related samples, dosage data, and the analytically challenging X-chromosome. We disseminate the latest advances in methodology through a user-friendly R software package with added functionalities to support genome-wide analysis on individual-level or summary-level data. The implemented R package can be called from PLINK or directly in a scripting environment, to enable a streamlined genome-wide analysis for biobank-scale data. Application results on individual-level and summary-level data highlight the advantage of the joint test to discover more genome-wide signals as compared to a location or scale test alone. We hope the availability of gJLS2 software package will encourage more scale and/or joint analyses in large-scale datasets, and promote the standardized reporting of their P-values to be shared with the scientific community.
Affiliation(s)
- Wei Q Deng
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, ON L8P 3R2, Canada
- Peter Boris Centre for Addictions Research, St. Joseph’s Healthcare Hamilton, McMaster University, Hamilton, ON L8P 3R2, Canada
- Lei Sun
- Department of Statistical Sciences, University of Toronto, Toronto, ON M5G 1Z5, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
30
Kawaguchi ES, Li G, Lewinger JP, Gauderman WJ. Two-step hypothesis testing to detect gene-environment interactions in a genome-wide scan with a survival endpoint. Stat Med 2022; 41:1644-1657. [PMID: 35075649 PMCID: PMC9007892 DOI: 10.1002/sim.9319] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/13/2021] [Revised: 11/10/2021] [Accepted: 12/26/2021] [Indexed: 01/13/2023]
Abstract
Defined by their genetic profile, individuals may exhibit differential clinical outcomes due to an environmental exposure. Identifying subgroups based on specific exposure-modifying genes can lead to targeted interventions and focused studies. Genome-wide interaction scans (GWIS) can be performed to identify such genes, but these scans typically suffer from low power due to the large multiple testing burden. We provide a novel framework for powerful two-step hypothesis tests for GWIS with a time-to-event endpoint under the Cox proportional hazards model. In the Cox regression setting, we develop an approach that prioritizes genes for Step-2 G × E testing based on a carefully constructed Step-1 screening procedure. Simulation results demonstrate this two-step approach can lead to substantially higher power for identifying gene-environment (G × E) interactions compared to the standard GWIS while preserving the family-wise error rate over a range of scenarios. In a taxane-anthracycline chemotherapy study for breast cancer patients, the two-step approach identifies several gene expression by treatment interactions that would not be detected using the standard GWIS.
Affiliation(s)
- Eric S Kawaguchi
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Gang Li
- Department of Biostatistics, University of California, Los Angeles, Los Angeles, California, USA; Department of Computational Medicine, University of California, Los Angeles, Los Angeles, California, USA
- Juan Pablo Lewinger
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- W James Gauderman
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
31
Sell-Kubiak E, Knol EF, Lopes M. Evaluation of the phenotypic and genomic background of variability based on litter size of Large White pigs. Genet Sel Evol 2022; 54:1. [PMID: 34979897 PMCID: PMC8722267 DOI: 10.1186/s12711-021-00692-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/15/2021] [Accepted: 12/15/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The genetic background of trait variability has captured the interest of ecologists and animal breeders because the genes that control it could be involved in buffering various environmental effects. Phenotypic variability of a given trait can be assessed by studying the heterogeneity of the residual variance, and the quantitative trait loci (QTL) that are involved in the control of this variability are described as variance QTL (vQTL). This study focuses on litter size (total number born, TNB) and its variability in a Large White pig population. The variability of TNB was evaluated either using a simple method, i.e. analysis of the log-transformed variance of residuals (LnVar), or the more complex double hierarchical generalized linear model (DHGLM). We also performed a single-SNP (single nucleotide polymorphism) genome-wide association study (GWAS). To our knowledge, this is only the second study that reports vQTL for litter size in pigs and the first one that shows GWAS results when using two methods to evaluate variability of TNB: LnVar and DHGLM. RESULTS Based on LnVar, three candidate vQTL regions were detected, on Sus scrofa chromosomes (SSC) 1, 7, and 18, which comprised 18 SNPs. Based on the DHGLM, three candidate vQTL regions were detected, i.e. two on SSC7 and one on SSC11, which comprised 32 SNPs. Only one candidate vQTL region overlapped between the two methods, on SSC7, which also contained the most significant SNP. Within this vQTL region, two candidate genes were identified, ADGRF1, which is involved in neurodevelopment of the brain, and ADGRF5, which is involved in the function of the respiratory system and in vascularization. The correlation between estimated breeding values based on the two methods was 0.86. Three-fold cross-validation indicated that DHGLM yielded EBV that were much more accurate and had better prediction of missing observations than LnVar. 
CONCLUSIONS The results indicated that the LnVar and DHGLM methods resulted in genetically different traits. Based on their validation, we recommend the use of DHGLM over the simpler method of log-transformed variance of residuals. These conclusions can be useful for future studies on the evaluation of the variability of any trait in any species.
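The simpler of the two variability measures named in this abstract, the log-transformed variance of residuals (LnVar), can be illustrated with a short sketch. The details here are assumptions made for illustration: the function name `ln_var`, the choice to compute one sample variance per animal across its litters, the exclusion of animals with fewer than two residuals, and the toy residuals are all hypothetical, and the abstract does not specify how the residuals were obtained.

```python
import math
from statistics import variance

def ln_var(residuals_by_animal):
    """Natural log of the sample variance of each animal's residuals.

    Animals with fewer than two residuals are skipped because a sample
    variance is undefined for them. A zero variance would make the log
    undefined and raise ValueError.
    """
    return {
        animal: math.log(variance(res))
        for animal, res in residuals_by_animal.items()
        if len(res) >= 2
    }

# Toy residuals of litter size per sow (invented for illustration).
toy_residuals = {
    "sow_1": [-1.0, 0.0, 1.0],  # sample variance 1 -> LnVar 0
    "sow_2": [-2.0, 0.0, 2.0],  # sample variance 4 -> LnVar ln(4)
    "sow_3": [0.5],             # single record, skipped
}
ln_var_values = ln_var(toy_residuals)
```

The resulting per-animal values can then be treated as a phenotype in a standard single-SNP association scan, which is the sense in which LnVar is the "simple method" compared with fitting a DHGLM.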
Affiliation(s)
- Ewa Sell-Kubiak
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Poznań, Poland.
- Egbert F Knol
- Topigs Norsvin Research Centre, Beuningen, The Netherlands
- Marcos Lopes
- Topigs Norsvin Research Centre, Beuningen, The Netherlands; Topigs Norsvin, Curitiba, Brazil
32
Integration of functional genomics data to uncover cell type-specific pathways affected in Parkinson's disease. Biochem Soc Trans 2021; 49:2091-2100. [PMID: 34581766 PMCID: PMC8589426 DOI: 10.1042/bst20210128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/24/2021] [Revised: 08/25/2021] [Accepted: 08/31/2021] [Indexed: 12/22/2022]
Abstract
Parkinson's disease (PD) is the second most prevalent late-onset neurodegenerative disorder worldwide after Alzheimer's disease, and available drugs deliver only temporary symptomatic relief. Loss of dopaminergic neurons (DaNs) in the substantia nigra and intracellular alpha-synuclein inclusions are the main hallmarks of the disease, but the events that cause this degeneration remain uncertain. Although cell types other than DaNs, such as astrocytes, microglia and oligodendrocytes, have recently been associated with the pathogenesis of PD, we still lack an in-depth characterisation of PD-affected brain regions at cell-type resolution that could help our understanding of the disease mechanisms. Nevertheless, publicly available large-scale brain-specific genomic, transcriptomic and epigenomic datasets can be further exploited to extract different layers of cell type-specific biological information for the reconstruction of cell type-specific transcriptional regulatory networks. By intersecting disease risk variants within the networks, it may be possible to study the functional role of these risk variants and their combined effects at the cell-type and pathway levels, which, in turn, can facilitate the identification of key regulators involved in disease progression, which are often potential therapeutic targets.
|
33
|
Salas LA, Peres LC, Thayer ZM, Smith RWA, Guo Y, Chung W, Si J, Liang L. A transdisciplinary approach to understand the epigenetic basis of race/ethnicity health disparities. Epigenomics 2021; 13:1761-1770. [PMID: 33719520 PMCID: PMC8579937 DOI: 10.2217/epi-2020-0080] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 02/25/2020] [Accepted: 04/07/2020] [Indexed: 11/21/2022] Open
Abstract
Health disparities correspond to differences in disease burden and mortality among socially defined population groups. Such disparities may emerge according to race/ethnicity, socioeconomic status and a variety of other social contexts, and are documented for a wide range of diseases. Here, we provide a transdisciplinary perspective on the contribution of epigenetics to the understanding of health disparities, with a special emphasis on disparities across socially defined racial/ethnic groups. Scientists in the fields of biological anthropology, bioinformatics and molecular epidemiology provide a summary of theoretical, statistical and practical considerations for conducting epigenetic health disparities research, and provide examples of successful applications from cancer research using this approach.
Affiliation(s)
- Lucas A Salas
- Department of Epidemiology, Geisel School of Medicine, Dartmouth College, Lebanon, NH 03756, USA
| | - Lauren C Peres
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
| | - Zaneta M Thayer
- Department of Anthropology, Dartmouth College, Hanover, NH 03755, USA
| | - Rick WA Smith
- Department of Anthropology, Dartmouth College, Hanover, NH 03755, USA
- The William H. Neukom Institute for Computational Science, Dartmouth College, Hanover, NH 03755, USA
| | | | - Wonil Chung
- Department of Statistics & Actuarial Science, Soongsil University, Seoul, 06478, Korea
- Program in Genetic Epidemiology & Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Jiahui Si
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics & Epidemiology, Peking University School of Public Health, Beijing, 100191, China
| | - Liming Liang
- Program in Genetic Epidemiology & Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
|
34
|
Liu D, Ban HJ, El Sergani AM, Lee MK, Hecht JT, Wehby GL, Moreno LM, Feingold E, Marazita ML, Cha S, Szabo-Rogers HL, Weinberg SM, Shaffer JR. PRICKLE1 × FOCAD Interaction Revealed by Genome-Wide vQTL Analysis of Human Facial Traits. Front Genet 2021; 12:674642. [PMID: 34434215 PMCID: PMC8381734 DOI: 10.3389/fgene.2021.674642] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/01/2021] [Accepted: 06/03/2021] [Indexed: 12/14/2022] Open
Abstract
The human face is a highly complex and variable structure resulting from the intricate coordination of numerous genetic and non-genetic factors. Hundreds of genomic loci impacting quantitative facial features have been identified. While these associations have been shown to influence morphology by altering the mean size and shape of facial measures, their effect on trait variance remains unclear. We conducted a genome-wide association analysis for the variance of 20 quantitative facial measurements in 2,447 European individuals and identified several suggestive variance quantitative trait loci (vQTLs). These vQTLs guided us to conduct an efficient search for gene-by-gene (G × G) interactions, which uncovered an interaction between PRICKLE1 and FOCAD affecting cranial base width. We replicated this G × G interaction signal at the locus level in an additional 5,128 Korean individuals. We used the hypomorphic Prickle1 Beetlejuice (Prickle1 Bj) mouse line to directly test the function of Prickle1 on the cranial base and observed wider cranial bases in Prickle1 Bj/Bj mice. Importantly, we observed that the Prickle1 and Focadhesin proteins co-localize in murine cranial base chondrocytes, and this co-localization is abnormal in the Prickle1 Bj/Bj mutants. Taken together, our findings uncovered a novel G × G interaction effect in humans with strong support from both epidemiological and molecular studies. These results highlight the potential of studying measures of phenotypic variability in gene mapping studies of facial morphology.
Affiliation(s)
- Dongjing Liu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Hyo-Jeong Ban
- Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
| | - Ahmed M. El Sergani
- Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Myoung Keun Lee
- Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Jacqueline T. Hecht
- Department of Pediatrics, McGovern Medical Center, The University of Texas Health Science Center at Houston, Houston, TX, United States
| | - George L. Wehby
- Department of Health Management and Policy, The University of Iowa, Iowa City, IA, United States
| | - Lina M. Moreno
- Department of Orthodontics, The University of Iowa, Iowa City, IA, United States
| | - Eleanor Feingold
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
| | - Mary L. Marazita
- Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Psychiatry, Clinical and Translational Science Institute, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Seongwon Cha
- Future Medicine Division, Korea Institute of Oriental Medicine, Daejeon, South Korea
| | - Heather L. Szabo-Rogers
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Developmental Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Regenerative Medicine at the McGowan Institute, University of Pittsburgh, Pittsburgh, PA, United States
- Center for Craniofacial Regeneration, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
| | - Seth M. Weinberg
- Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
| | - John R. Shaffer
- Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States
|
35
|
Lin WY, Chan CC, Liu YL, Yang AC, Tsai SJ, Kuo PH. Sex-specific autosomal genetic effects across 26 human complex traits. Hum Mol Genet 2021; 29:1218-1228. [PMID: 32160288 DOI: 10.1093/hmg/ddaa040] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/11/2019] [Revised: 12/26/2019] [Accepted: 03/05/2020] [Indexed: 12/28/2022] Open
Abstract
Previous studies have shown that men and women have different genetic architectures across many traits. However, except for waist-to-hip ratio (WHR) and waist circumference (WC), it remains unknown whether the genetic effects of a certain trait are weaker or stronger on men/women. With ~18 000 Taiwan Biobank subjects, we comprehensively investigate sexual heterogeneity in autosomal genetic effects, for traits regarding cardiovascular health, diabetes, kidney, liver, anthropometric profiles, blood, etc. 'Gene-by-sex interactions' (G$\times$S) were detected in 18 out of 26 traits, each with an interaction P-value ($P_{INT}$) less than $0.05/104 = 0.00048$, where 104 is the number of tests conducted in this study. The most significant evidence of G$\times$S was found in WHR ($P_{INT} = 3.2 \times 10^{-55}$) and WC ($P_{INT} = 2.3 \times 10^{-41}$). As a novel G$\times$S investigation for other traits, we here find that the autosomal genetic effects are weaker on women than on men, for low-density lipoprotein cholesterol (LDL-C), uric acid (UA) and diabetes-related traits such as fasting glucose and glycated hemoglobin. For LDL-C and UA, the evidence of G$\times$S is especially notable in subjects aged less than 50 years, where estrogen can play a role in attenuating the autosomal genetic effects of these two traits. Men and women have systematically distinct environmental contexts caused by hormonal milieu and their specific societal roles, which may trigger diverse gene expressions despite the same DNA materials. As many environmental exposures are difficult to collect and quantify, sex can serve as a good surrogate for these factors.
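The significance threshold quoted in this abstract is a Bonferroni correction: the nominal level 0.05 divided by the 104 tests. The short sketch below reproduces that arithmetic and checks the two reported interaction P-values against it. The function name `bonferroni_threshold` is a hypothetical helper introduced here for illustration; only the numbers 0.05, 104, and the two P-values come from the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance level under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Values taken from the abstract: nominal level 0.05 and 104 tests.
threshold = bonferroni_threshold(0.05, 104)
print(f"Per-test threshold: {threshold:.5f}")

# The two interaction P-values reported in the abstract.
reported = {"WHR": 3.2e-55, "WC": 2.3e-41}
for trait, p_value in reported.items():
    print(trait, "passes" if p_value < threshold else "fails")
```

Both reported P-values sit far below the corrected threshold, so the conclusion does not depend on rounding 0.05/104 to 0.00048.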
Affiliation(s)
- Wan-Yu Lin
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan.,Department of Public Health, College of Public Health, National Taiwan University, Taipei, Taiwan
| | - Chang-Chuan Chan
- Department of Public Health, College of Public Health, National Taiwan University, Taipei, Taiwan.,Institute of Environmental and Occupational Health Sciences, College of Public Health, National Taiwan University, Taipei, Taiwan
| | - Yu-Li Liu
- Center for Neuropsychiatric Research, National Health Research Institutes, Miaoli, Taiwan
| | - Albert C Yang
- Division of Interdisciplinary Medicine and Biotechnology, Beth Israel Deaconess Medical Center/Harvard Medical School, Boston, MA, USA.,Institute of Brain Science, National Yang-Ming University, Taipei, Taiwan
| | - Shih-Jen Tsai
- Institute of Brain Science, National Yang-Ming University, Taipei, Taiwan.,Division of Psychiatry, National Yang-Ming University, Taipei, Taiwan.,Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
| | - Po-Hsiu Kuo
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan.,Department of Public Health, College of Public Health, National Taiwan University, Taipei, Taiwan
|
36
|
Towle-Miller LM, Miecznikowski JC, Zhang F, Tritchler DL. SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis. PLoS One 2021; 16:e0255579. [PMID: 34343218 PMCID: PMC8330944 DOI: 10.1371/journal.pone.0255579] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/08/2020] [Accepted: 07/20/2021] [Indexed: 11/18/2022] Open
Abstract
Multi-omic analyses that integrate many high-dimensional datasets often present significant deficiencies in statistical power and require time-consuming computations to execute the analytical methods. To remedy these issues, we present SuMO-Fil, a pre-processing method for Supervised Multi-Omic Filtering that removes variables or features considered to be irrelevant noise. SuMO-Fil is intended to be performed prior to downstream analyses that detect supervised gene networks in sparse settings. We accomplish this by implementing variable filters based on low similarity across the datasets in conjunction with low similarity with the outcome. This approach can improve accuracy, as well as reduce run times for a variety of computationally expensive downstream analyses. This method has applications in a setting where the downstream analysis may include sparse canonical correlation analysis. Filtering methods specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. The SuMO-Fil method performs favorably by eliminating non-network features while maintaining important biological signal under a variety of different signal settings as compared to popular filtering techniques based on low means or low variances. We show that the speed and accuracy of methods such as supervised sparse canonical correlation are increased after using SuMO-Fil, thus greatly improving the scalability of these approaches.
Affiliation(s)
- Lorin M. Towle-Miller
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
| | | | - Fan Zhang
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
| | - David L. Tritchler
- Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America
- Biostatistics Division, University of Toronto, Toronto, Ontario, Canada
|
37
|
Braz CU, Rowan TN, Schnabel RD, Decker JE. Genome-wide association analyses identify genotype-by-environment interactions of growth traits in Simmental cattle. Sci Rep 2021; 11:13335. [PMID: 34172761 PMCID: PMC8233360 DOI: 10.1038/s41598-021-92455-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/14/2020] [Accepted: 06/07/2021] [Indexed: 02/06/2023] Open
Abstract
Understanding genotype-by-environment interactions (G × E) is crucial to understand environmental adaptation in mammals and improve the sustainability of agricultural production. Here, we present an extensive study investigating the interaction of genome-wide SNP markers with a vast assortment of environmental variables and searching for SNPs controlling phenotypic variance (vQTL) using a large beef cattle dataset. We showed that G × E contribute 10.1%, 3.8%, and 2.8% of the phenotypic variance of birth weight, weaning weight, and yearling weight, respectively. G × E genome-wide association analysis (GWAA) detected a large number of G × E loci affecting growth traits, which the traditional GWAA did not detect, showing that functional loci may have non-additive genetic effects regardless of differences in genotypic means. Further, variance-heterogeneity GWAA detected loci enriched with G × E effects without requiring prior knowledge of the interacting environmental factors. Functional annotation and pathway analysis of G × E genes revealed biological mechanisms by which cattle respond to changes in their environment, such as neurotransmitter activity, hypoxia-induced processes, keratinization, hormone, thermogenic and immune pathways. We unraveled the relevance and complexity of the genetic basis of G × E underlying growth traits, providing new insights into how different environmental conditions interact with specific genes influencing adaptation and productivity in beef cattle and potentially across mammals.
Affiliation(s)
- Camila U Braz
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
| | - Troy N Rowan
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
- Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA
| | - Robert D Schnabel
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA
- Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
| | - Jared E Decker
- Division of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Genetics Area Program, University of Missouri, Columbia, MO, 65211, USA.
- Informatics Institute, University of Missouri, Columbia, MO, 65211, USA.
|
38
|
Abstract
Disease classification, or nosology, was historically driven by careful examination of clinical features of patients. As technologies to measure and understand human phenotypes advanced, so too did classifications of disease, and the advent of genetic data has led to a surge in genetic subtyping in the past decades. Although the fundamental process of refining disease definitions and subtypes is shared across diverse fields, each field is driven by its own goals and technological expertise, leading to inconsistent and conflicting definitions of disease subtypes. Here, we review several classical and recent subtypes and subtyping approaches and provide concrete definitions to delineate subtypes. In particular, we focus on subtypes with distinct causal disease biology, which are of primary interest to scientists, and subtypes with pragmatic medical benefits, which are of primary interest to physicians. We propose genetic heterogeneity as a gold standard for establishing biologically distinct subtypes of complex polygenic disease. We focus especially on methods to find and validate genetic subtypes, emphasizing common pitfalls and how to avoid them.
Affiliation(s)
- Andy Dahl
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois 60637, USA.,Department of Neurology, University of California, Los Angeles, California 90024, USA.,Department of Computational Medicine, University of California, Los Angeles, California 90095, USA
| | - Noah Zaitlen
- Department of Neurology, University of California, Los Angeles, California 90024, USA.,Department of Computational Medicine, University of California, Los Angeles, California 90095, USA
|
39
|
Soave D, Lawless JF, Awadalla P. Score tests for scale effects, with application to genomic analysis. Stat Med 2021; 40:3808-3822. [PMID: 33908071 DOI: 10.1002/sim.9000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Revised: 04/01/2021] [Accepted: 04/07/2021] [Indexed: 11/07/2022]
Abstract
Tests for variance or scale effects due to covariates are used in many areas and, more recently, in genomic and genetic association studies. We study score tests based on location-scale models with arbitrary error distributions that allow incorporation of additional adjustment covariates. Tests based on Gaussian and Laplacian double generalized linear models are examined in some detail. Numerical properties of the tests under Gaussian and other error distributions are examined. Our results show that the use of model-based asymptotic distributions with score tests for scale effects does not control type 1 error well in many settings of practical relevance. We consider simple statistics based on permutation distribution approximations, which correspond to well-known statistics derived by another approach. They are shown to give good type 1 error control under different error distributions and under covariate distribution imbalance. The methods are illustrated through a differential gene expression analysis involving breast cancer tumor samples.
Affiliation(s)
- David Soave
- Department of Mathematics, Wilfrid Laurier University, Waterloo, Ontario, Canada.,Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Jerald F Lawless
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
| | - Philip Awadalla
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
|
40
|
Using Genetic Marginal Effects to Study Gene-Environment Interactions with GWAS Data. Behav Genet 2021; 51:358-373. [PMID: 33899139 DOI: 10.1007/s10519-021-10058-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/10/2020] [Accepted: 04/09/2021] [Indexed: 12/30/2022]
Abstract
Gene-environment interactions (GxE) play a central role in the theoretical relationship between genetic factors and complex traits. While genome-wide GxE studies of human behaviors remain underutilized, in part due to methodological limitations, existing GxE research in model organisms emphasizes the importance of interpreting genetic associations within environmental contexts. In this paper, we present a framework for conducting an analysis of GxE using raw data from genome-wide association studies (GWAS) and apply the techniques to analyze gene-by-age interactions for alcohol use frequency. To illustrate the effectiveness of this procedure, we calculate genetic marginal effects from a GxE GWAS analysis for an ordinal measure of alcohol use frequency from the UK Biobank dataset, treating the respondent's age as the continuous moderating environment. The genetic marginal effects clarify the interpretation of the GxE associations and provide a direct and clear understanding of how the genetic associations vary across age (the environment). To highlight the advantages of our proposed methods for presenting GxE GWAS results, we compare the interpretation of marginal genetic effects with an interpretation that focuses narrowly on the significance of the interaction coefficients. The results imply that the genetic associations with alcohol use frequency vary considerably across ages, a conclusion that may not be obvious from the raw regression or interaction coefficients. GxE GWAS is less powerful than the standard "main effect" GWAS approach, and therefore requires larger samples to detect significant moderated associations. Fortunately, the necessary sample sizes for a successful application of GxE GWAS can rely on the existing and on-going development of consortia and large-scale population-based studies.
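The idea of a genetic marginal effect can be illustrated with a short derivation. The notation below is an assumption introduced for illustration: it uses the simplest linear interaction model, whereas the study itself analyzes an ordinal outcome, so this is a sketch of the concept and not the authors' model. With genotype $G$, moderating environment $E$ (here, age), and outcome $Y$:

$$Y = \beta_0 + \beta_G G + \beta_E E + \beta_{GE}\, G E + \varepsilon$$

Differentiating the conditional mean with respect to $G$ gives the genetic marginal effect at a given value of $E$:

$$\frac{\partial\, \mathbb{E}[Y \mid G, E]}{\partial G} = \beta_G + \beta_{GE}\, E$$

Under this simplified model, the genetic association is a line in $E$ and not a single number, which is why inspecting $\beta_{GE}$ alone can hide how much the association changes across the observed age range.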
|
41
|
Uncovering Evidence for Endocrine-Disrupting Chemicals That Elicit Differential Susceptibility through Gene-Environment Interactions. TOXICS 2021; 9:toxics9040077. [PMID: 33917455 PMCID: PMC8067468 DOI: 10.3390/toxics9040077] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 03/27/2021] [Accepted: 04/02/2021] [Indexed: 12/17/2022]
Abstract
Exposure to endocrine-disrupting chemicals (EDCs) is linked to myriad disorders, characterized by the disruption of the complex endocrine signaling pathways that govern development, physiology, and even behavior across the entire body. The mechanisms of endocrine disruption involve a complex system of pathways that communicate across the body to stimulate specific receptors that bind DNA and regulate the expression of a suite of genes. These mechanisms, including gene regulation, DNA binding, and protein binding, can be tied to differences in individual susceptibility across a genetically diverse population. In this review, we posit that EDCs causing such differential responses may be identified by looking for a signal of population variability after exposure. We begin by summarizing how the biology of EDCs has implications for genetically diverse populations. We then describe how gene-environment interactions (GxE) across the complex pathways of endocrine signaling could lead to differences in susceptibility. We survey examples in the literature of individual susceptibility differences to EDCs, pointing to a need for research in this area, especially regarding the exceedingly complex thyroid pathway. Following a discussion of experimental designs to better identify and study GxE across EDCs, we present a case study of a high-throughput screening signal of putative GxE within known endocrine disruptors. We conclude with a call for further, deeper analysis of the EDCs, particularly the thyroid disruptors, to identify if these chemicals participate in GxE leading to differences in susceptibility.
|
42
|
Majumdar A, Burch KS, Haldar T, Sankararaman S, Pasaniuc B, Gauderman WJ, Witte JS. A two-step approach to testing overall effect of gene-environment interaction for multiple phenotypes. Bioinformatics 2021; 36:5640-5648. [PMID: 33453114 DOI: 10.1093/bioinformatics/btaa1083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2020] [Revised: 12/09/2020] [Accepted: 12/17/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: While gene-environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes. RESULTS: Using simulations we demonstrate that, when more than one phenotype has GxE effect (i.e., GxE pleiotropy), our approach offers substantial gain in power (18%-43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids, LDL, HDL and Triglyceride with the frequency of alcohol consumption as environmental factor in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches. AVAILABILITY: We provide an R package MPGE implementing the proposed approach which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html.
Affiliation(s)
- Arunabha Majumdar
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.,Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
| | - Kathryn S Burch
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA
| | | | - Sriram Sankararaman
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Bogdan Pasaniuc
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.,Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA
| | - W James Gauderman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - John S Witte
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
|
43
|
Marderstein AR, Davenport ER, Kulm S, Van Hout CV, Elemento O, Clark AG. Leveraging phenotypic variability to identify genetic interactions in human phenotypes. Am J Hum Genet 2021; 108:49-67. [PMID: 33326753 DOI: 10.1016/j.ajhg.2020.11.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/07/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022] Open
Abstract
Although thousands of loci have been associated with human phenotypes, the role of gene-environment (GxE) interactions in determining individual risk of human diseases remains unclear. This is partly because of the severe erosion of statistical power resulting from the massive number of statistical tests required to detect such interactions. Here, we focus on improving the power of GxE tests by developing a statistical framework for assessing quantitative trait loci (QTLs) associated with the trait means and/or trait variances. When applying this framework to body mass index (BMI), we find that GxE discovery and replication rates are significantly higher when prioritizing genetic variants associated with the variance of the phenotype (vQTLs) compared to when assessing all genetic variants. Moreover, we find that vQTLs are enriched for associations with other non-BMI phenotypes having strong environmental influences, such as diabetes or ulcerative colitis. We show that GxE effects first identified in quantitative traits such as BMI can be used for GxE discovery in disease phenotypes such as diabetes. A clear conclusion is that strong GxE interactions mediate the genetic contribution to body weight and diabetes risk.
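To make the notion of a variance QTL concrete, the sketch below computes a Brown-Forsythe (median-based Levene) statistic across genotype groups, which is one common way to screen a variant for differences in trait variance. This is an illustrative choice and is not claimed to be the framework used in the study above. The function name `brown_forsythe_statistic` and the toy phenotype values are hypothetical; only the standard library is used, and no P-value is computed.

```python
from statistics import mean, median


def brown_forsythe_statistic(groups):
    """Brown-Forsythe W statistic for equality of spread across groups.

    Each group holds phenotype values for one genotype class (e.g. 0, 1, 2
    copies of an allele). Larger W suggests the variance differs by genotype.
    """
    groups = [list(g) for g in groups if len(g) > 0]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")

    # Absolute deviations of each value from its own group's median.
    deviations = [[abs(y - median(g)) for y in g] for g in groups]
    group_means = [mean(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total

    # One-way ANOVA F ratio computed on the absolute deviations.
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    if within == 0:
        raise ValueError("no within-group variation in the deviations")
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy data: same spread in every genotype group (means differ).
equal_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
# Hypothetical toy data: same centre, but spread grows with allele count.
growing_spread = [[9, 10, 11], [5, 10, 15], [0, 10, 20]]

w_equal = brown_forsythe_statistic(equal_spread)
w_growing = brown_forsythe_statistic(growing_spread)
print(w_equal, w_growing)
```

Because the statistic is built on deviations from group medians, a variant that shifts only the trait mean leaves it near zero, while a variant whose genotype groups differ in spread raises it. In practice W would be compared with an F distribution with (k - 1, N - k) degrees of freedom.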
Affiliation(s)
- Andrew R Marderstein
- Tri-Institutional Program in Computational Biology & Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
| | - Emily R Davenport
- Department of Biology, Huck Institutes of the Life Sciences, Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Scott Kulm
- Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | | | - Olivier Elemento
- Institute of Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA.
| | - Andrew G Clark
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA.
|
44
|
Pizarro Inostroza MG, Landi V, Navas González FJ, León Jurado JM, Delgado Bermejo JV, Fernández Álvarez J, Martínez Martínez MDA. Integrating Casein Complex SNPs Additive, Dominance and Epistatic Effects on Genetic Parameters and Breeding Values Estimation for Murciano-Granadina Goat Milk Yield and Components. Genes (Basel) 2020; 11:E309. [PMID: 32183253 PMCID: PMC7140789 DOI: 10.3390/genes11030309] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/01/2020] [Accepted: 03/12/2020] [Indexed: 11/23/2022] Open
Abstract
Assessing dominance and additive effects of casein complex single-nucleotide polymorphisms (SNPs) (αS1, αS2, β, and κ casein), and their epistatic relationships may maximize our knowledge on the genetic regulation of profitable traits. Contextually, new genomic selection perspectives may translate this higher efficiency into higher accuracies for milk yield and components' genetic parameters and breeding values. A total of 2594 lactation records were collected from 159 Murciano-Granadina goats (2005-2018), genotyped for 48 casein loci-located SNPs. Bonferroni-corrected nonparametric tests, categorical principal component analysis (CATPCA), and nonlinear canonical correlations were performed to quantify additive, dominance, and interSNP epistatic effects and evaluate the outcomes of their inclusion in quantitative and qualitative milk production traits' genetic models (yield, protein, fat, solids, and lactose contents and somatic cells count). Milk yield, lactose, and somatic cell count heritabilities increased considerably when the model including genetic effects was considered (0.46, 0.30, 0.43, respectively). Components standard prediction errors decreased, and accuracies and reliabilities increased when genetic effects were considered. Conclusively, including genetic effects and relationships among these heritable biomarkers may improve model efficiency, genetic parameters, and breeding values for milk yield and composition, optimizing selection practices profitability for components whose technological application may be especially relevant for the cheese-making dairy sector.
Affiliation(s)
- María Gabriela Pizarro Inostroza
- Department of Genetics, Faculty of Veterinary Sciences, University of Córdoba, 14071 Córdoba, Spain
- Animal Breeding Consulting, S.L., Córdoba Science and Technology Park Rabanales 21, 14071 Córdoba, Spain
- Vincenzo Landi
- Department of Veterinary Medicine, University of Bari “Aldo Moro”, 70010 Valenzano, Italy
- Francisco Javier Navas González
- Department of Genetics, Faculty of Veterinary Sciences, University of Córdoba, 14071 Córdoba, Spain
- Jose Manuel León Jurado
- Centro Agropecuario Provincial de Córdoba, Diputación Provincial de Córdoba, Córdoba, 14071 Córdoba, Spain
- Juan Vicente Delgado Bermejo
- Department of Genetics, Faculty of Veterinary Sciences, University of Córdoba, 14071 Córdoba, Spain
- Javier Fernández Álvarez
- National Association of Breeders of Murciano-Granadina Goat Breed, Fuente Vaqueros, 18340 Granada, Spain
- María del Amparo Martínez Martínez
- Department of Genetics, Faculty of Veterinary Sciences, University of Córdoba, 14071 Córdoba, Spain
45
|
Quantification of the overall contribution of gene-environment interaction for obesity-related traits. Nat Commun 2020; 11:1385. [PMID: 32170055 PMCID: PMC7070002 DOI: 10.1038/s41467-020-15107-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/03/2019] [Accepted: 02/11/2020] [Indexed: 12/03/2022]
Abstract
The growing sample size of genome-wide association studies has facilitated the discovery of gene-environment interactions (GxE). Here we propose a maximum likelihood method to estimate the contribution of GxE to continuous traits taking into account all interacting environmental variables, without the need to measure any. Extensive simulations demonstrate that our method provides unbiased interaction estimates and excellent coverage. We also offer strategies to distinguish specific GxE from general scale effects. Applying our method to 32 traits in the UK Biobank reveals that while the genetic risk score (GRS) of 376 variants explains 5.2% of body mass index (BMI) variance, GRSxE explains an additional 1.9%. Nevertheless, this interaction holds for any variable with identical correlation to BMI as the GRS, hence may not be GRS-specific. Still, we observe that the global contribution of specific GRSxE to complex traits is substantial for nine obesity-related measures (including leg impedance and trunk fat-free mass). Most gene-by-environment interaction methods rely on the availability of the interacting environment. Here, the authors propose a robust maximum likelihood method for estimating the overall statistical interaction between a genetic risk score for a continuous outcome and all environmental variables.
46
Chu W, Li R, Liu J, Reimherr M. Feature selection for generalized varying coefficient mixed-effect models with application to obesity GWAS. Ann Appl Stat 2020; 14:276-298. [PMID: 32802245 PMCID: PMC7426018 DOI: 10.1214/19-aoas1310] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2023]
Abstract
Motivated by an empirical analysis of data from a genome-wide association study on obesity, measured by the body mass index (BMI), we propose a two-step gene-detection procedure for generalized varying coefficient mixed-effects models with ultrahigh dimensional covariates. The proposed procedure selects significant single nucleotide polymorphisms (SNPs) impacting the mean BMI trend, some of which have already been biologically proven to be "fat genes." The method also discovers SNPs that significantly influence the age-dependent variability of BMI. The proposed procedure takes into account individual variations of genetic effects and can also be directly applied to longitudinal data with continuous, binary or count responses. We employ Monte Carlo simulation studies to assess the performance of the proposed method and further carry out causal inference for the selected SNPs.
Affiliation(s)
- Runze Li
- Department of Statistics and the Methodology Center, Pennsylvania State University
- Jingyuan Liu
- MOE Key Laboratory of Econometrics, Department of Statistics, School of Economics, Wang Yanan Institute for Studies in Economics, and Fujian Key Lab of Statistics, Xiamen University
47
Hussain W, Campbell MT, Jarquin D, Walia H, Morota G. Variance heterogeneity genome-wide mapping for cadmium in bread wheat reveals novel genomic loci and epistatic interactions. Plant Genome 2020; 13:e20011. [PMID: 33016629 DOI: 10.1002/tpg2.20011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/15/2019] [Accepted: 01/22/2020] [Indexed: 06/11/2023]
Abstract
Genome-wide association mapping identifies quantitative trait loci (QTL) that influence the mean differences between the marker genotypes for a given trait. While most loci influence the mean value of a trait (mQTL), certain loci, known as variance heterogeneity QTL (vQTL), determine the variability of the trait instead of its mean. In the present study, we performed a variance heterogeneity genome-wide association study (vGWAS) for grain cadmium (Cd) concentration in bread wheat. We used a double generalized linear model and a hierarchical generalized linear model to identify vQTL associated with grain Cd. We identified novel vQTL regions on chromosomes 2A and 2B that contribute to the Cd variation, and loci on chromosome 5A that affect both mean and variance heterogeneity (mvQTL). In addition, our results demonstrated the presence of epistatic interactions between vQTL and mvQTL, which could explain variance heterogeneity. Overall, we provide novel insights into the genetic architecture of grain Cd concentration and report the first application of vGWAS in wheat. Moreover, our findings indicated that epistasis is an important mechanism underlying natural variation for grain Cd concentration.
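The core idea behind a double generalized linear model, letting a marker affect both the mean and the dispersion of a trait, can be sketched as a heteroscedastic normal model fitted by maximum likelihood. The code below is a simplified stand-in under stated assumptions, not the software or exact model used in the study: it assumes a normal trait with mean `b0 + b1*g` and log-variance `a0 + a1*g`, and the function names `neg_loglik` and `variance_effect_lrt` are hypothetical.

```python
import numpy as np
from scipy import optimize, stats


def neg_loglik(params, g, y, with_var_effect):
    """Negative log-likelihood (constant dropped) of y ~ N(mu, exp(log_var))."""
    b0, b1, a0 = params[:3]
    a1 = params[3] if with_var_effect else 0.0
    mu = b0 + b1 * g            # mean sub-model
    log_var = a0 + a1 * g       # dispersion sub-model on the log scale
    return 0.5 * np.sum(log_var + (y - mu) ** 2 / np.exp(log_var))


def variance_effect_lrt(g, y):
    """Likelihood-ratio test of the marker's effect on the trait variance.

    Returns the fitted parameters (b0, b1, a0, a1) and the p-value for a1 = 0.
    """
    start = np.array([y.mean(), 0.0, np.log(y.var())])
    null = optimize.minimize(neg_loglik, start, args=(g, y, False), method="BFGS")
    full = optimize.minimize(
        neg_loglik, np.append(null.x, 0.0), args=(g, y, True), method="BFGS"
    )
    lrt = max(2.0 * (null.fun - full.fun), 0.0)   # guard against tiny negatives
    return full.x, stats.chi2.sf(lrt, df=1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    g = rng.binomial(2, 0.3, size=5000).astype(float)
    # Simulated trait: mean shift 0.2 per allele, log-variance shift 0.3 per allele
    y = 0.2 * g + rng.normal(size=5000) * np.exp(0.5 * 0.3 * g)
    estimates, p_value = variance_effect_lrt(g, y)
    print("estimated a1:", estimates[3], "p-value:", p_value)
```

Testing `b1` and `a1` separately is what allows loci to be labelled as affecting the mean only, the variance only, or both.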
Affiliation(s)
- Waseem Hussain
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Malachy T Campbell
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
- Diego Jarquin
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Harkamal Walia
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE, 68583, USA
- Gota Morota
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
48
Wragg D, Liu Q, Lin Z, Riggio V, Pugh CA, Beveridge AJ, Brown H, Hume DA, Harris SE, Deary IJ, Tenesa A, Prendergast JGD. Using regulatory variants to detect gene-gene interactions identifies networks of genes linked to cell immortalisation. Nat Commun 2020; 11:343. [PMID: 31953380 PMCID: PMC6969137 DOI: 10.1038/s41467-019-13762-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/28/2018] [Accepted: 11/19/2019] [Indexed: 12/30/2022]
Abstract
The extent to which the impact of regulatory genetic variants may depend on other factors, such as the expression levels of upstream transcription factors, remains poorly understood. Here we report a framework in which regulatory variants are first aggregated into sets, and using these as estimates of the total cis-genetic effects on a gene we model their non-additive interactions with the expression of other genes in the genome. Using 1220 lymphoblastoid cell lines across platforms and independent datasets we identify 74 genes where the impact of their regulatory variant-set is linked to the expression levels of networks of distal genes. We show that these networks are predominantly associated with tumourigenesis pathways, through which immortalised cells are able to rapidly proliferate. We consequently present an approach to define gene interaction networks underlying important cellular pathways such as cell immortalisation.
Affiliation(s)
- D. Wragg
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- Q. Liu
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- Z. Lin
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- V. Riggio
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- C. A. Pugh
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- A. J. Beveridge
- Glasgow Polyomics, College of Medical, Veterinary and Life Science, University of Glasgow, Glasgow, UK
- H. Brown
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- D. A. Hume
- Mater Research Institute-University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102 Australia
- S. E. Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ UK
- I. J. Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ UK
- A. Tenesa
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
- J. G. D. Prendergast
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK
49
Dumitrascu B, Darnell G, Ayroles J, Engelhardt BE. Statistical tests for detecting variance effects in quantitative trait studies. Bioinformatics 2019; 35:200-210. [PMID: 29982387 PMCID: PMC6330007 DOI: 10.1093/bioinformatics/bty565] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/20/2017] [Accepted: 07/04/2018] [Indexed: 11/17/2022]
Abstract
Motivation: Identifying variants, both discrete and continuous, that are associated with quantitative traits, or QTs, is the primary focus of quantitative genetics. Most current methods are limited to identifying mean effects, or associations between genotype or covariates and the mean value of a quantitative trait. It is possible, however, that a variant may affect the variance of the quantitative trait in lieu of, or in addition to, affecting the trait mean. Here, we develop a general methodology to identify covariates with variance effects on a quantitative trait using a Bayesian heteroskedastic linear regression model (BTH). We compare BTH with existing methods to detect variance effects across a large range of simulations drawn from scenarios common to the analysis of quantitative traits. Results: We find that BTH and a double generalized linear model (dglm) outperform classical tests used for detecting variance effects in recent genomic studies. We show BTH and dglm are less likely to generate spurious discoveries through simulations and application to identifying methylation variance QTs and expression variance QTs. We identify four variance effects of sex in the Cardiovascular and Pharmacogenetics study. Our work is the first to offer a comprehensive view of variance identifying methodology. We identify shortcomings in previously used methodology and provide a more conservative and robust alternative. We extend variance effect analysis to a wide array of covariates that enables a new statistical dimension in the study of sex and age specific quantitative trait effects. Availability and implementation: https://github.com/b2du/bth. Supplementary information: Supplementary data are available at Bioinformatics online.
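One way a classical variance test can produce spurious discoveries is through sensitivity to non-normal trait distributions. The sketch below is an illustration of that general point only; it does not implement BTH or dglm and does not reproduce the paper's simulations. It assumes a heavy-tailed trait (Student's t with 5 degrees of freedom) with no true variance effect and compares the false-positive rates of Bartlett's test and the Brown-Forsythe test; the function name `false_positive_rates` and all settings are hypothetical.

```python
import numpy as np
from scipy import stats


def false_positive_rates(n_sims=500, n=600, alpha=0.05, maf=0.4, seed=1):
    """Fraction of null simulations rejected by Bartlett and Brown-Forsythe.

    The trait is heavy-tailed and independent of genotype, so every
    rejection is a false positive.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, maf, size=n)           # genotypes fixed across simulations
    hits_bartlett = 0
    hits_bf = 0
    for _ in range(n_sims):
        y = rng.standard_t(df=5, size=n)       # no genotype effect at all
        groups = [y[g == k] for k in (0, 1, 2) if np.sum(g == k) > 1]
        hits_bartlett += int(stats.bartlett(*groups).pvalue < alpha)
        hits_bf += int(stats.levene(*groups, center="median").pvalue < alpha)
    return hits_bartlett / n_sims, hits_bf / n_sims


if __name__ == "__main__":
    bartlett_rate, bf_rate = false_positive_rates()
    print("Bartlett false-positive rate:      ", bartlett_rate)
    print("Brown-Forsythe false-positive rate:", bf_rate)
```

A test whose null rejection rate sits well above the nominal `alpha` would flag many variants as variance QTs purely because of the trait's distribution, which is the kind of behaviour a more conservative method aims to avoid.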
Affiliation(s)
- Bianca Dumitrascu
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Gregory Darnell
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Julien Ayroles
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
- Barbara E Engelhardt
- Department of Computer Science, Princeton University, Princeton, NJ, USA; Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
50
Young AI, Benonisdottir S, Przeworski M, Kong A. Deconstructing the sources of genotype-phenotype associations in humans. Science 2019; 365:1396-1400. [PMID: 31604265 PMCID: PMC6894903 DOI: 10.1126/science.aax3710] [Citation(s) in RCA: 120] [Impact Index Per Article: 24.0] [Indexed: 12/13/2022]
Abstract
Efforts to link variation in the human genome to phenotypes have progressed at a tremendous pace in recent decades. Most human traits have been shown to be affected by a large number of genetic variants across the genome. To interpret these associations and to use them reliably, in particular for phenotypic prediction, a better understanding of the many sources of genotype-phenotype associations is necessary. We summarize the progress that has been made in this direction in humans, notably in decomposing direct and indirect genetic effects as well as population structure confounding. We discuss the natural next steps in data collection and methodology development, with a focus on what can be gained by analyzing genotype and phenotype data from close relatives.
Affiliation(s)
- Alexander I Young
- Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Stefania Benonisdottir
- Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Augustine Kong
- Big Data Institute, Li Ka Shing Centre for Health Information Discovery, University of Oxford, Oxford, UK