1
Zhu L, Zhang S, Sha Q. Meta-analysis of set-based multiple phenotype association test based on GWAS summary statistics from different cohorts. Front Genet 2024; 15:1359591. [PMID: 39301532 PMCID: PMC11410627 DOI: 10.3389/fgene.2024.1359591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 08/23/2024] [Indexed: 09/22/2024] Open
Abstract
Genome-wide association studies (GWAS) have emerged as popular tools for identifying genetic variants that are associated with complex diseases. Standard analysis of a GWAS involves assessing the association between each variant and a disease. However, this approach suffers from limited reproducibility and difficulties in detecting multi-variant and pleiotropic effects. Although joint analysis of multiple phenotypes for GWAS can identify and interpret pleiotropic loci which are essential to understand pleiotropy in diseases and complex traits, most of the multiple phenotype association tests are designed for a single variant, resulting in much lower power, especially when their effect sizes are small and only their cumulative effect is associated with multiple phenotypes. To overcome these limitations, set-based multiple phenotype association tests have been developed to enhance statistical power and facilitate the identification and interpretation of pleiotropic regions. In this research, we propose a new method, named Meta-TOW-S, which conducts joint association tests between multiple phenotypes and a set of variants (such as variants in a gene) utilizing GWAS summary statistics from different cohorts. Our approach applies the set-based method that Tests for the effect of an Optimal Weighted combination of variants in a gene (TOW) and accounts for sample size differences across GWAS cohorts by employing the Cauchy combination method. Meta-TOW-S combines the advantages of set-based tests and multi-phenotype association tests, exhibiting computational efficiency and enabling analysis across multiple phenotypes while accommodating overlapping samples from different GWAS cohorts. To assess the performance of Meta-TOW-S, we develop a phenotype simulator package that encompasses a comprehensive simulation scheme capable of modeling multiple phenotypes and multiple variants, including noise structures and diverse correlation patterns among phenotypes. 
Simulation studies confirm that Meta-TOW-S maintains the desired Type I error rate. Further simulations under different scenarios show that Meta-TOW-S can improve power compared with existing meta-analysis methods. When applied to summary data for four psychiatric disorders, Meta-TOW-S detects a greater number of significant genes.
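The Cauchy combination step mentioned in this abstract can be illustrated with a minimal sketch of the general Cauchy combination test, which maps each p-value to a Cauchy variate, takes a weighted sum, and converts the sum back to a p-value. This is not the authors' implementation: the function name `cauchy_combination` is hypothetical, and weighting cohorts in proportion to sample size is an assumption made for illustration, since the abstract does not state how Meta-TOW-S chooses its weights.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with weights summing to 1,
    and the combined p-value is 0.5 - arctan(T) / pi.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        # Equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = sum(weights)
    # Normalise so the weights sum to 1, as the test requires.
    norm = [w / total for w in weights]
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(norm, p_values))
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical example: two cohorts, weighted by sample size (an assumption).
cohort_p = [0.01, 0.20]
cohort_n = [1000, 3000]
combined = cauchy_combination(cohort_p, weights=cohort_n)
print(f"combined p-value: {combined:.4f}")
```

A convenient property of this construction is that identical input p-values are returned unchanged, whatever the weights, which gives a quick sanity check on any implementation.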
Affiliation(s)
- Lirong Zhu
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, United States
- Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, United States
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, United States
2
Van Dijck E, Diels S, Fransen E, Cremers TC, Verrijken A, Dirinck E, Hoischen A, Vandeweyer G, Vanden Berghe W, Van Gaal L, Francque S, Van Hul W. A Case-Control Study Supports Genetic Contribution of the PON Gene Family in Obesity and Metabolic Dysfunction Associated Steatotic Liver Disease. Antioxidants (Basel) 2024; 13:1051. [PMID: 39334710 PMCID: PMC11440101 DOI: 10.3390/antiox13091051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 08/23/2024] [Accepted: 08/27/2024] [Indexed: 09/30/2024] Open
Abstract
The paraoxonase (PON) gene family (including PON1, PON2, and PON3) is known for its anti-oxidative and anti-inflammatory properties, protecting against metabolic diseases such as obesity and metabolic dysfunction-associated steatotic liver disease (MASLD). In this study, the influence of common and rare PON variants on both conditions was investigated. A total of 507 healthy-weight individuals and 744 patients with obesity, including 433 with histological liver assessment, were sequenced with single-molecule molecular inversion probes (smMIPs), allowing the identification of genetic contributions to obesity and MASLD-related liver features. Polymorphisms rs705379 and rs854552 in the PON1 gene displayed significant association with MASLD stage and fibrosis, respectively. Additionally, rare PON1 variants were strongly associated with obesity. This study thereby reinforces the genetic foundation of PON1 in obesity and various MASLD-related liver features by extending previous findings from common variants to include rare variants. Additionally, rare and very rare variants in PON2 were found to be associated with MASLD-related hepatic fibrosis. Notably, we are the first to report an association between naturally occurring rare PON2 variants and MASLD-related liver fibrosis. Considering the critical role of liver fibrosis in MASLD outcome, PON2 emerges as a possible candidate for future research, including exploration of its biomarker potential.
Affiliation(s)
- Evelien Van Dijck
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
- Sara Diels
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
- Erik Fransen
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
- Tycho Canter Cremers
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
- An Verrijken
- Department of Endocrinology, Diabetology and Metabolic Diseases, Antwerp University Hospital, 2650 Edegem, Belgium
- Laboratory for Experimental Medicine and Paediatrics, Translational Sciences in Inflammation and Immunology, University of Antwerp, 2610 Wilrijk, Belgium
- Eveline Dirinck
- Department of Endocrinology, Diabetology and Metabolic Diseases, Antwerp University Hospital, 2650 Edegem, Belgium
- Laboratory for Experimental Medicine and Paediatrics, Translational Sciences in Inflammation and Immunology, University of Antwerp, 2610 Wilrijk, Belgium
- Alexander Hoischen
- Department of Human Genetics and Department of Internal Medicine, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands
- Geert Vandeweyer
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
- Wim Vanden Berghe
- Cell Death Signaling–Epigenetics Lab, Department Biomedical Sciences, University of Antwerp, 2610 Wilrijk, Belgium
- Luc Van Gaal
- Department of Endocrinology, Diabetology and Metabolic Diseases, Antwerp University Hospital, 2650 Edegem, Belgium
- Sven Francque
- Department of Gastroenterology and Hepatology, Antwerp University Hospital, 2650 Edegem, Belgium
- Wim Van Hul
- Centre of Medical Genetics, University of Antwerp and Antwerp University Hospital, 2650 Edegem, Belgium
3
Venkataraman GR, DeBoever C, Tanigawa Y, Aguirre M, Ioannidis AG, Mostafavi H, Spencer CCA, Poterba T, Bustamante CD, Daly MJ, Pirinen M, Rivas MA. Bayesian model comparison for rare-variant association studies. Am J Hum Genet 2021; 108:2354-2367. [PMID: 34822764 PMCID: PMC8715195 DOI: 10.1016/j.ajhg.2021.11.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/25/2021] [Accepted: 11/02/2021] [Indexed: 12/12/2022] Open
Abstract
Whole-genome sequencing studies applied to large populations or biobanks with extensive phenotyping raise new analytic challenges. The need to consider many variants at a locus or group of genes simultaneously and the potential to study many correlated phenotypes with shared genetic architecture provide opportunities for discovery not addressed by the traditional one variant, one phenotype association study. Here, we introduce a Bayesian model comparison approach called MRP (multiple rare variants and phenotypes) for rare-variant association studies that considers correlation, scale, and direction of genetic effects across a group of genetic variants, phenotypes, and studies, requiring only summary statistic data. We apply our method to exome sequencing data (n = 184,698) across 2,019 traits from the UK Biobank, aggregating signals in genes. MRP demonstrates an ability to recover signals such as associations between PCSK9 and LDL cholesterol levels. We additionally find MRP effective in conducting meta-analyses in exome data. Non-biomarker findings include associations between MC1R and red hair color and skin color, IL17RA and monocyte count, and IQGAP2 and mean platelet volume. Finally, we apply MRP in a multi-phenotype setting; after clustering the 35 biomarker phenotypes based on genetic correlation estimates, we find that joint analysis of these phenotypes results in substantial power gains for gene-trait associations, such as in TNFRSF13B in one of the clusters containing diabetes- and lipid-related traits. Overall, we show that the MRP model comparison approach improves upon useful features from widely used meta-analysis approaches for rare-variant association analyses and prioritizes protective modifiers of disease risk.
Affiliation(s)
- Christopher DeBoever
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Yosuke Tanigawa
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Matthew Aguirre
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Timothy Poterba
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Carlos D Bustamante
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Mark J Daly
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Matti Pirinen
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki 00014, Finland; Department of Public Health, University of Helsinki, Helsinki 00014, Finland; Department of Mathematics and Statistics, University of Helsinki, Helsinki 00014, Finland
- Manuel A Rivas
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
4
De Ridder R, Vandeweyer G, Boudin E, Hendrickx G, Huybrechts Y, Cremers TC, Devogelaer JP, Mortier G, Fransen E, Van Hul W. A Panel-Based Sequencing Analysis of Patients with Paget's Disease of Bone Suggests Enrichment of Rare Genetic Variation in regulators of NF-κB Signaling and Supports the Importance of the 7q33 Locus. Calcif Tissue Int 2021; 109:656-665. [PMID: 34173013 DOI: 10.1007/s00223-021-00881-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 06/15/2021] [Indexed: 10/21/2022]
Abstract
Paget's disease of bone (PDB) is a common bone disorder characterized by focal lesions caused by increased bone turnover. Monogenic forms of PDB and PDB-related phenotypes as well as genome-wide association studies strongly support the involvement of genetic variation in components of the NF-κB signaling pathway in the pathogenesis of PDB. In this study, we performed a panel-based mutation screening of 52 genes. Single variant association testing and a series of gene-based association tests were performed. The former revealed a novel association with NFKBIA and further supports an involvement of variation in NR4A1, VCP, TNFRSF11A, and NUP205. The latter indicated a trend for enrichment of rare genetic variation in GAB2 and PRKCI. Both single variant tests and gene-based tests highlighted two genes, NR4A1 and NUP205. In conclusion, our findings support the involvement of genetic variation in modulators of NF-κB signaling in PDB and confirm the association of previously associated genes with the pathogenesis of PDB.
Affiliation(s)
- Raphaël De Ridder
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Geert Vandeweyer
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Eveline Boudin
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Gretl Hendrickx
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Yentl Huybrechts
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Tycho Canter Cremers
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Jean-Pierre Devogelaer
- Department of Rheumatology, Saint-Luc University Hospital, Université Catholique de Louvain, Brussels, Belgium
- Geert Mortier
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Erik Fransen
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
- Wim Van Hul
- Center of Medical Genetics, University of Antwerp & Antwerp University Hospital, Antwerp, Belgium
5
Wolf JM, Westra J, Tintle N. Using Summary Statistics to Model Multiplicative Combinations of Initially Analyzed Phenotypes With a Flexible Choice of Covariates. Front Genet 2021; 12:745901. [PMID: 34712269 PMCID: PMC8546319 DOI: 10.3389/fgene.2021.745901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2021] [Accepted: 09/23/2021] [Indexed: 12/03/2022] Open
Abstract
While the promise of electronic medical record and biobank data is large, major questions remain about patient privacy, computational hurdles, and data access. One promising area of recent development is pre-computing non-individually identifiable summary statistics to be made publicly available for exploration and downstream analysis. In this manuscript we demonstrate how to utilize pre-computed linear association statistics between individual genetic variants and phenotypes to infer genetic relationships between products of phenotypes (e.g., ratios; logical combinations of binary phenotypes using "and" and "or") with customized covariate choices. We propose a method to approximate covariate adjusted linear models for products and logical combinations of phenotypes using only pre-computed summary statistics. We evaluate our method's accuracy through several simulation studies and an application modeling ratios of fatty acids using data from the Framingham Heart Study. These studies show consistent ability to recapitulate analysis results performed on individual level data including maintenance of the Type I error rate, power, and effect size estimates. An implementation of this proposed method is available in the publicly available R package pcsstools.
Affiliation(s)
- Jack M. Wolf
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, United States
- Jason Westra
- Department of Mathematics, Computer Science, and Statistics, Dordt University, Sioux Center, IA, United States
- Nathan Tintle
- Department of Mathematics, Computer Science, and Statistics, Dordt University, Sioux Center, IA, United States
- Department of Population Health Nursing Science, College of Nursing, University of Illinois Chicago, Chicago, IL, United States
6
Sitlani CM, Baldassari AR, Highland HM, Hodonsky CJ, McKnight B, Avery CL. Comparison of adaptive multiple phenotype association tests using summary statistics in genome-wide association studies. Hum Mol Genet 2021; 30:1371-1383. [PMID: 33949650 PMCID: PMC8283209 DOI: 10.1093/hmg/ddab126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/19/2021] [Revised: 04/26/2021] [Accepted: 04/27/2021] [Indexed: 12/15/2022] Open
Abstract
Genome-wide association studies have been successful in mapping loci for individual phenotypes, but few studies have comprehensively interrogated evidence of shared genetic effects across multiple phenotypes simultaneously. Statistical methods have been proposed for analyzing multiple phenotypes using summary statistics, which enables studies of shared genetic effects while avoiding challenges associated with individual-level data sharing. Adaptive tests have been developed to maintain power against multiple alternative hypotheses because the most powerful single-alternative test depends on the underlying structure of the associations between the multiple phenotypes and a single nucleotide polymorphism (SNP). Here we compare the performance of six such adaptive tests: two adaptive sum of powered scores (aSPU) tests, the unified score association test (metaUSAT), the adaptive test in a mixed-models framework (mixAda) and two principal-component-based adaptive tests (PCAQ and PCO). Our simulations highlight practical challenges that arise when multivariate distributions of phenotypes do not satisfy assumptions of multivariate normality. Previous reports in this context focus on low minor allele count (MAC) and omit the aSPU test, which relies less than other methods on asymptotic and distributional assumptions. When these assumptions are not satisfied, particularly when MAC is low and/or phenotype covariance matrices are singular or nearly singular, aSPU better preserves type I error, sometimes at the cost of decreased power. We illustrate this trade-off with multiple phenotype analyses of six quantitative electrocardiogram traits in the Population Architecture using Genomics and Epidemiology (PAGE) study.
Affiliation(s)
- Colleen M Sitlani
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101 USA
- Antoine R Baldassari
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516 USA
- Heather M Highland
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516 USA
- Chani J Hodonsky
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908 USA
- Barbara McKnight
- Department of Biostatistics, University of Washington, Seattle, WA 98195 USA
- Christy L Avery
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516 USA
7
Dutta D, VandeHaar P, Fritsche LG, Zöllner S, Boehnke M, Scott LJ, Lee S. A powerful subset-based method identifies gene set associations and improves interpretation in UK Biobank. Am J Hum Genet 2021; 108:669-681. [PMID: 33730541 DOI: 10.1016/j.ajhg.2021.02.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/23/2020] [Accepted: 02/19/2021] [Indexed: 02/06/2023] Open
Abstract
Tests of association between a phenotype and a set of genes in a biological pathway can provide insights into the genetic architecture of complex phenotypes beyond those obtained from single-variant or single-gene association analysis. However, most existing gene set tests have limited power to detect gene set-phenotype association when a small fraction of the genes are associated with the phenotype and cannot identify the potentially "active" genes that might drive a gene set-based association. To address these issues, we have developed Gene set analysis Association Using Sparse Signals (GAUSS), a method for gene set association analysis that requires only GWAS summary statistics. For each significantly associated gene set, GAUSS identifies the subset of genes that have the maximal evidence of association and can best account for the gene set association. Using pre-computed correlation structure among test statistics from a reference panel, our p value calculation is substantially faster than other permutation- or simulation-based approaches. In simulations with varying proportions of causal genes, we find that GAUSS effectively controls type 1 error rate and has greater power than several existing methods, particularly when a small proportion of genes account for the gene set signal. Using GAUSS, we analyzed UK Biobank GWAS summary statistics for 10,679 gene sets and 1,403 binary phenotypes. We found that GAUSS is scalable and identified 13,466 phenotype and gene set association pairs. Within these gene sets, we identify an average of 17.2 (max = 405) genes that underlie these gene set associations.
Affiliation(s)
- Diptavo Dutta
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
- Peter VandeHaar
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Lars G Fritsche
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Sebastian Zöllner
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Michael Boehnke
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Laura J Scott
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Seunggeun Lee
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Graduate School of Data Science, Seoul National University, Seoul 08826, Republic of Korea
8
Resequencing of candidate genes for Keratoconus reveals a role for Ehlers-Danlos Syndrome genes. Eur J Hum Genet 2021; 29:1745-1755. [PMID: 33737726 DOI: 10.1038/s41431-021-00849-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/06/2019] [Revised: 01/22/2021] [Accepted: 02/26/2021] [Indexed: 02/06/2023] Open
Abstract
The involvement of genetic factors in the pathogenesis of keratoconus (KC) has long been recognized, but the identification of variants affecting the underlying protein functions has been challenging. In this study, we selected 34 candidate genes for KC based on previous whole-exome sequencing (WES) and the literature, and resequenced them in 745 KC patients and 810 ethnically matched controls from Belgium, France and Italy. Data analysis was performed using the single variant association test as well as gene-based mutation burden and variance components tests. We detected enrichment of genetic variation across multiple gene-based tests for the genes COL2A1, COL5A1, TNXB, and ZNF469. The top hit in the single variant association test was obtained for a common variant in the COL12A1 gene. These associations were consistently found across independent subpopulations. Interestingly, COL5A1, TNXB, ZNF469 and COL12A1 are all known Ehlers-Danlos Syndrome (EDS) genes. Though the co-occurrence of KC and EDS has been reported previously, this study is the first to demonstrate a consistent role of genetic variants in EDS genes in the etiology of KC. In conclusion, our data show a shared genetic etiology between KC and EDS, and clearly confirm the currently disputed role of ZNF469 in disease susceptibility for KC.