1
Kim K, Jun TH, Ha BK, Wang S, Sun H. New statistical selection method for pleiotropic variants associated with both quantitative and qualitative traits. BMC Bioinformatics 2023; 24:381. [PMID: 37817069 PMCID: PMC10563219 DOI: 10.1186/s12859-023-05505-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 09/28/2023] [Indexed: 10/12/2023] Open
Abstract
BACKGROUND Identification of pleiotropic variants associated with multiple phenotypic traits has received increasing attention in genetic association studies. Overlapping genetic associations from multiple traits help to detect weak genetic associations missed by single-trait analyses. Many statistical methods have been developed to identify pleiotropic variants, but most of them are limited to quantitative traits, even though pleiotropic effects on both quantitative and qualitative traits have been observed. This is a statistically challenging problem because no appropriate multivariate distribution exists to model quantitative and qualitative data together. Alternatively, meta-analysis methods can be applied; these integrate summary statistics of individual variants associated with either a quantitative or a qualitative trait, without accounting for correlations among genetic variants. RESULTS We propose a new statistical selection method based on a unified selection score quantifying how a genetic variant, i.e., a pleiotropic variant, is associated with both quantitative and qualitative traits. In our extensive simulation studies, where various types of pleiotropic effects on both quantitative and qualitative traits were considered, we demonstrated that the proposed method outperforms the existing meta-analysis methods in terms of true positive selection. We also applied the proposed method to a peanut dataset with 6 quantitative and 2 qualitative traits, and to a cowpea dataset with 2 quantitative and 6 qualitative traits. We were able to detect some potentially pleiotropic variants missed by the existing methods in both analyses. CONCLUSIONS The proposed method is able to locate pleiotropic variants associated with both quantitative and qualitative traits. It has been implemented in an R package, 'UNISS', which can be downloaded from http://github.com/statpng/uniss.
Affiliation(s)
- Kipoong Kim
- Department of Statistics, Pusan National University, 46241, Busan, Korea
- Tae-Hwan Jun
- Department of Plant Bioscience, Pusan National University, 50463, Miryang, Korea
- Bo-Keun Ha
- Department of Applied Plant Science, Chonnam National University, 61186, Gwangju, Korea
- Shuang Wang
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, 10032, USA
- Hokeun Sun
- Department of Statistics, Pusan National University, 46241, Busan, Korea.
2
A clustering linear combination method for multiple phenotype association studies based on GWAS summary statistics. Sci Rep 2023; 13:3389. [PMID: 36854754 PMCID: PMC9975197 DOI: 10.1038/s41598-023-30415-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 02/22/2023] [Indexed: 03/02/2023] Open
Abstract
There is strong evidence that joint analysis of multiple phenotypes in genome-wide association studies (GWAS) can increase statistical power to detect associations between genetic variants and human complex diseases. We previously developed the Clustering Linear Combination (CLC) method and a computationally efficient CLC (ceCLC) method to test the association between multiple phenotypes and a genetic variant; both perform very well. However, both methods require individual-level genotypes and phenotypes, which are often not easily accessible. In this research, we develop a novel method called sCLC for association studies of multiple phenotypes and a genetic variant based on GWAS summary statistics. We use LD score regression to estimate the correlation matrix among phenotypes. The test statistic of sCLC is constructed from GWAS summary statistics and has an approximate Cauchy distribution. We perform a variety of simulation studies and compare sCLC with other commonly used methods for multiple phenotype association studies using GWAS summary statistics. Simulation results show that sCLC controls Type I error rates well and has the highest power in most scenarios. Moreover, we apply the newly developed method to the UK Biobank GWAS summary statistics from the XIII category with 70 related musculoskeletal system and connective tissue phenotypes. The results demonstrate that sCLC detects the largest number of significant SNPs, and most of these identified SNPs can be matched to genes that have been reported in the GWAS catalog to be associated with those phenotypes. Furthermore, sCLC also identifies some novel signals that were missed by standard GWAS, which provide new insight into the potential genetic factors of the musculoskeletal system and connective tissue phenotypes.
3
Jung J, Kim H. Shared genetic etiology and antagonistic relationship of plasma renin activity and systolic blood pressure in a Korean cohorts. Genomics 2022; 114:110334. [PMID: 35278618 DOI: 10.1016/j.ygeno.2022.110334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Revised: 02/11/2022] [Accepted: 03/06/2022] [Indexed: 01/14/2023]
Abstract
Despite extensive studies on blood pressure, its genetic risk factors remain uncertain. Even one of the most researched blood pressure-related traits - renin - is not fully understood genetically. Here, we determine the genetic relationship and associated predisposition between blood pressure and baseline renin. In 8840 Korean individuals, we observed a strong negative genome-wide genetic correlation (rg = -0.484) between systolic blood pressure (SBP) and plasma renin activity (PRA), suggesting that antagonistic genetic signals explain the variance in the two traits. We found 51 significant pleiotropic SNPs affecting the two traits, which could contribute to the Renin-Angiotensin-Aldosterone System (RAAS). Our findings provide insight into studies on RAAS by identifying the genome-wide relationship and susceptibility loci of SBP and PRA.
Affiliation(s)
- Jaehoon Jung
- Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Republic of Korea; eGnome, 26 Beobwon-ro, Songpa-gu, Seoul 05836, Republic of Korea.
- Heebal Kim
- Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 151-742, Republic of Korea; eGnome, 26 Beobwon-ro, Songpa-gu, Seoul 05836, Republic of Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Republic of Korea.
4
Wang M, Zhang S, Sha Q. A computationally efficient clustering linear combination approach to jointly analyze multiple phenotypes for GWAS. PLoS One 2022; 17:e0260911. [PMID: 35482827 PMCID: PMC9049312 DOI: 10.1371/journal.pone.0260911] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/17/2021] [Accepted: 04/13/2022] [Indexed: 11/18/2022] Open
Abstract
There has been increasing interest in joint analysis of multiple phenotypes in genome-wide association studies (GWAS) because jointly analyzing multiple phenotypes may increase statistical power to detect genetic variants associated with complex diseases or traits. Recently, many statistical methods have been developed for joint analysis of multiple phenotypes in genetic association studies, including the Clustering Linear Combination (CLC) method. The CLC method works particularly well with phenotypes that have natural groupings, but because the number of clusters for a given dataset is unknown, the final test statistic of the CLC method is the minimum p-value among all p-values of the CLC test statistics obtained from each possible number of clusters. Therefore, a simulation procedure needs to be used to evaluate the p-value of the final test statistic. This makes the CLC method computationally demanding. We develop a new method called computationally efficient CLC (ceCLC) to test the association between multiple phenotypes and a genetic variant. Instead of using the minimum p-value as the test statistic as in the CLC method, ceCLC uses the Cauchy combination test to combine all p-values of the CLC test statistics obtained from each possible number of clusters. The test statistic of ceCLC approximately follows a standard Cauchy distribution, so the p-value can be obtained from the cumulative distribution function without the need for the simulation procedure. Through extensive simulation studies and application to the COPDGene data, the results demonstrate that the type I error rates of ceCLC are effectively controlled in different simulation settings and that ceCLC either outperforms all other methods or has statistical power that is very close to that of the most powerful method with which it has been compared.
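The Cauchy combination step mentioned in this abstract can be illustrated with a minimal Python sketch. It implements the generic Cauchy combination test (each p-value is mapped to a Cauchy variate, the variates are averaged with weights, and the combined p-value is read from the standard Cauchy distribution function). It is not the ceCLC implementation; the function name `cauchy_combination`, the equal default weights, and the example p-values are assumptions made for illustration.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the generic Cauchy combination test.

    Hypothetical helper for illustration only; not taken from ceCLC.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    n = len(p_values)
    if weights is None:
        weights = [1.0] * n  # assumption: equal weights
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")

    total = sum(weights)
    # Each p-value becomes a standard Cauchy variate under the null.
    statistic = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


# Made-up p-values, e.g. one per candidate number of clusters.
combined = cauchy_combination([0.01, 0.9])
```

Because the combined p-value comes from a closed-form distribution function, no simulation is needed, which is the computational advantage the abstract describes.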
Affiliation(s)
- Meida Wang
- Mathematical Sciences, Michigan Technological University, Houghton, MI, United States of America
- Shuanglin Zhang
- Mathematical Sciences, Michigan Technological University, Houghton, MI, United States of America
- Qiuying Sha
- Mathematical Sciences, Michigan Technological University, Houghton, MI, United States of America
5
Fu L, Wang Y, Li T, Yang S, Hu YQ. A Novel Hierarchical Clustering Approach for Joint Analysis of Multiple Phenotypes Uncovers Obesity Variants Based on ARIC. Front Genet 2022; 13:791920. [PMID: 35391794 PMCID: PMC8981031 DOI: 10.3389/fgene.2022.791920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 01/27/2022] [Indexed: 12/02/2022] Open
Abstract
Genome-wide association studies (GWASs) have successfully discovered numerous variants underlying various diseases. Generally, a one-phenotype, one-variant association study in GWASs is not efficient in identifying variants with weak effects, indicating that many signals have not yet been identified. Jointly analyzing multiple phenotypes is now recognized as an important approach to increase the statistical power for identifying variants with weak effects on complex diseases, shedding new light on potential biological mechanisms. Therefore, hierarchical clustering based on different methods for calculating correlation coefficients (HCDC) is developed to jointly analyze multiple phenotypes in association studies. There are two steps involved in HCDC. First, a clustering approach based on the similarity matrix between two groups of phenotypes is applied to choose a representative phenotype in each cluster. Then, we use existing methods to estimate the genetic associations with the representative phenotypes rather than the individual phenotypes in every cluster. A variety of simulations are conducted to demonstrate the capacity of HCDC for boosting power. In most scenarios, existing methods with HCDC embedded are either more powerful than or comparable to the same methods without HCDC. Additionally, applying existing methods with HCDC to obesity-related phenotypes from the Atherosclerosis Risk in Communities study uncovered several associated variants. Among these, UQCC1-rs1570004 is reported as a significant obesity signal for the first time, and its differential expression in subcutaneous fat, visceral fat, and muscle tissue is worthy of further functional studies.
Affiliation(s)
- Liwan Fu
- Center for Non-communicable Disease Management, National Center for Children's Health, Beijing Children's Hospital, Capital Medical University, Beijing, China; State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Yuquan Wang
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Tingting Li
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Siqian Yang
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China
- Yue-Qing Hu
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China; Shanghai Center for Mathematical Sciences, Fudan University, Shanghai, China
6
Wang T, Lu H, Zeng P. Identifying pleiotropic genes for complex phenotypes with summary statistics from a perspective of composite null hypothesis testing. Brief Bioinform 2021; 23:6375058. [PMID: 34571531 DOI: 10.1093/bib/bbab389] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/05/2021] [Revised: 08/06/2021] [Accepted: 08/28/2021] [Indexed: 12/13/2022] Open
Abstract
Pleiotropy has important implications for the genetic connection among complex phenotypes and facilitates our understanding of disease etiology. Genome-wide association studies provide an unprecedented opportunity to detect pleiotropic associations; however, efficient pleiotropy test methods are still lacking. We here consider pleiotropy identification from a methodological perspective of high-dimensional composite null hypothesis testing and propose a powerful gene-based method called MAIUP. MAIUP is constructed based on the traditional intersection-union test with two sets of independent P-values as input and follows a novel idea that was originally proposed under the high-dimensional mediation analysis framework. The key improvement of MAIUP is that it takes the composite null nature of the pleiotropy test into account by fitting a three-component mixture null distribution, which can ultimately generate well-calibrated P-values for effective control of the family-wise error rate and false discovery rate. Another attractive advantage of MAIUP is its ability to effectively address the issue of overlapping subjects commonly encountered in association studies. Simulation studies demonstrate that, compared with other methods, only MAIUP maintains correct type I error control, and it has higher power across a wide range of scenarios. We apply MAIUP to detect shared associated genes among 14 psychiatric disorders with summary statistics and discover many new pleiotropic genes that would not be identified without accounting for the issue of composite null hypothesis testing. Functional and enrichment analyses offer additional evidence supporting the validity of these identified pleiotropic genes associated with psychiatric disorders. Overall, MAIUP represents an efficient method for pleiotropy identification.
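As a rough illustration of the traditional intersection-union test that this abstract says MAIUP builds on, the Python sketch below declares a gene pleiotropic only when both trait-specific P-values are significant, which is equivalent to thresholding the larger of the two P-values. The three-component mixture null that MAIUP fits to calibrate these P-values is not reproduced here; the function name, gene labels, P-values, and threshold are all hypothetical.

```python
def intersection_union_pvalue(p_trait1, p_trait2):
    """Max-P intersection-union test for two independent P-values.

    Hypothetical helper for illustration; this is the uncalibrated
    classical test, not the MAIUP procedure.
    """
    for p in (p_trait1, p_trait2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")
    # Reject the composite null only if BOTH traits show association.
    return max(p_trait1, p_trait2)


# Made-up gene-level P-values for two traits.
genes = {
    "GENE_A": (1e-6, 3e-5),  # associated with both traits
    "GENE_B": (1e-8, 0.40),  # associated with the first trait only
}
alpha = 1e-4  # assumed significance threshold

pleiotropic = [
    name
    for name, (p1, p2) in genes.items()
    if intersection_union_pvalue(p1, p2) < alpha
]
```

Here `GENE_B` is not selected despite its very small first P-value, which shows why a test of the global null (any association) does not by itself implicate pleiotropy.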
Affiliation(s)
- Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Haojie Lu
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
- Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China; Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China; Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, Jiangsu, 221004, China
7
A powerful method for pleiotropic analysis under composite null hypothesis identifies novel shared loci between Type 2 Diabetes and Prostate Cancer. PLoS Genet 2020; 16:e1009218. [PMID: 33290408 PMCID: PMC7748289 DOI: 10.1371/journal.pgen.1009218] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 04/18/2020] [Revised: 12/18/2020] [Accepted: 10/22/2020] [Indexed: 12/24/2022] Open
Abstract
There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global null hypothesis that there is no association of a genetic variant with any of the traits, the rejection of which does not implicate pleiotropy. In this article, we propose a new statistical approach, PLACO, for specifically detecting pleiotropic loci between two traits by considering an underlying composite null hypothesis that a variant is associated with none or only one of the traits. We propose testing the null hypothesis based on the product of the Z-statistics of the genetic variants across two studies and derive a null distribution of the test statistic in the form of a mixture distribution that allows for fractions of variants to be associated with none or only one of the traits. We borrow approaches from the statistical literature on mediation analysis that allow asymptotic approximation of the null distribution avoiding estimation of nuisance parameters related to mixture proportions and variance components. Simulation studies demonstrate that the proposed method can maintain type I error and can achieve major power gain over alternative simpler methods that are typically used for testing pleiotropy. PLACO allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. Application of PLACO to publicly available summary data from two large case-control GWAS of Type 2 Diabetes and of Prostate Cancer implicated a number of novel shared genetic regions: 3q23 (ZBTB38), 6q25.3 (RGS17), 9p22.1 (HAUS6), 9p13.3 (UBAP2), 11p11.2 (RAPSN), 14q12 (AKAP6), 15q15 (KNL1) and 18q23 (ZNF236). 
We propose a new approach, PLACO, that uses aggregate-level genotype-phenotype association statistics, commonly referred to as GWAS summary statistics, to identify genetic variants that influence risk of two traits or diseases. It allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. We demonstrate that PLACO can achieve major power gain over alternative methods that are typically used. We applied PLACO to Type 2 Diabetes and Prostate Cancer summary data from two large case-control studies. Many previous studies have reported an inverse association of these two chronic diseases, suggesting shared risk factors; however, the shared genetic mechanisms underlying this association are poorly understood. PLACO identified a number of novel shared genetic regions that are not detected by individual trait analysis. Many of the loci implicated by PLACO increase risk for one disease while decreasing risk for the other. PLACO can similarly be used on other traits to shed light on shared genetic risk factors.
8
Nguyen TH, Dobbyn A, Brown RC, Riley BP, Buxbaum JD, Pinto D, Purcell SM, Sullivan PF, He X, Stahl EA. mTADA is a framework for identifying risk genes from de novo mutations in multiple traits. Nat Commun 2020; 11:2929. [PMID: 32522981 PMCID: PMC7287090 DOI: 10.1038/s41467-020-16487-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/02/2018] [Accepted: 05/06/2020] [Indexed: 11/12/2022] Open
Abstract
Joint analysis of multiple traits can result in the identification of associations not found through the analysis of each trait in isolation. Studies of neuropsychiatric disorders and congenital heart disease (CHD) which use de novo mutations (DNMs) from parent-offspring trios have reported multiple putatively causal genes. However, a joint analysis method designed to integrate DNMs from multiple studies has yet to be implemented. We here introduce multiple-trait TADA (mTADA) which jointly analyzes two traits using DNMs from non-overlapping family samples. We first demonstrate that mTADA is able to leverage genetic overlaps to increase the statistical power of risk-gene identification. We then apply mTADA to large datasets of >13,000 trios for five neuropsychiatric disorders and CHD. We report additional risk genes for schizophrenia, epileptic encephalopathies and CHD. We outline some shared and specific biological information of intellectual disability and CHD by conducting systems biology analyses of genes prioritized by mTADA.
Affiliation(s)
- Tan-Hoang Nguyen
- Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA.
- Amanda Dobbyn
- Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ruth C Brown
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Brien P Riley
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
- Joseph D Buxbaum
- Seaver Autism Center, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Dalila Pinto
- Seaver Autism Center, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Mindich Child Health & Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Shaun M Purcell
- Sleep Center, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Patrick F Sullivan
- Departments of Genetics and Psychiatry, University of North Carolina, Chapel Hill, NC, USA
- Xin He
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
- Grossman Institute for Neuroscience, Quantitative Biology and Human Behavior, University of Chicago, Chicago, IL, USA.
- Eli A Stahl
- Division of Psychiatric Genomics, Department of Genetics and Genomic Sciences, Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
9
Ragland MF, Benway CJ, Lutz SM, Bowler RP, Hecker J, Hokanson JE, Crapo JD, Castaldi PJ, DeMeo DL, Hersh CP, Hobbs BD, Lange C, Beaty TH, Cho MH, Silverman EK. Genetic Advances in Chronic Obstructive Pulmonary Disease. Insights from COPDGene. Am J Respir Crit Care Med 2020; 200:677-690. [PMID: 30908940 DOI: 10.1164/rccm.201808-1455so] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Indexed: 01/06/2023] Open
Abstract
Chronic obstructive pulmonary disease (COPD) is a common and progressive disease that is influenced by both genetic and environmental factors. For many years, knowledge of the genetic basis of COPD was limited to Mendelian syndromes, such as alpha-1 antitrypsin deficiency and cutis laxa, caused by rare genetic variants. Over the past decade, the proliferation of genome-wide association studies, the accessibility of whole-genome sequencing, and the development of novel methods for analyzing genetic variation data have led to a substantial increase in the understanding of genetic variants that play a role in COPD susceptibility and COPD-related phenotypes. COPDGene (Genetic Epidemiology of COPD), a multicenter, longitudinal study of over 10,000 current and former cigarette smokers, has been pivotal to these breakthroughs in understanding the genetic basis of COPD. To date, over 20 genetic loci have been convincingly associated with COPD affection status, with additional loci demonstrating association with COPD-related phenotypes such as emphysema, chronic bronchitis, and hypoxemia. In this review, we discuss the contributions of the COPDGene study to the discovery of these genetic associations as well as the ongoing genetic investigations of COPD subtypes, protein biomarkers, and post-genome-wide association study analysis.
Affiliation(s)
- Margaret F Ragland
- Division of Pulmonary Sciences and Critical Care Medicine, School of Medicine, and
- Julian Hecker
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
- John E Hokanson
- Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver, Aurora, Colorado
- Dawn L DeMeo
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Craig P Hersh
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Brian D Hobbs
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Christoph Lange
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
- Terri H Beaty
- Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland
- Michael H Cho
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Edwin K Silverman
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
10
Parker MM, Lutz SM, Hobbs BD, Busch R, McDonald MN, Castaldi PJ, Beaty TH, Hokanson JE, Silverman EK, Cho MH. Assessing pleiotropy and mediation in genetic loci associated with chronic obstructive pulmonary disease. Genet Epidemiol 2019; 43:318-329. [PMID: 30740764 DOI: 10.1002/gepi.22192] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/11/2018] [Revised: 09/10/2018] [Accepted: 10/10/2018] [Indexed: 12/14/2022]
Abstract
Genetic association studies have increasingly recognized variant effects on multiple phenotypes. Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease with environmental and genetic causes. Multiple genetic variants have been associated with COPD, many of which show significant associations to additional phenotypes. However, it is unknown if these associations represent biological pleiotropy or if they exist through correlation of related phenotypes ("mediated pleiotropy"). Using 6,670 subjects from the COPDGene study, we describe the association of known COPD susceptibility loci with other COPD-related phenotypes and distinguish if these act directly on the phenotypes (i.e., biological pleiotropy) or if the association is due to correlation (i.e., mediated pleiotropy). We identified additional associated phenotypes for 13 of 25 known COPD loci. Tests for pleiotropy between genotype and associated outcomes were significant for all loci. In cases of significant pleiotropy, we performed mediation analysis to test if SNPs had a direct association to phenotype. Most loci showed a mediated effect through the hypothesized causal pathway. However, many loci also had direct associations, suggesting causal explanations (i.e., emphysema leading to reduced lung function) are incomplete. Our results highlight the high degree of pleiotropy in complex disease-associated loci and provide novel insights into the mechanisms underlying COPD.
Affiliation(s)
- Margaret M Parker
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Sharon M Lutz
- Department of Biostatistics and Informatics, University of Colorado, Anschutz Medical Campus, Denver, Colorado
- Brian D Hobbs
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Robert Busch
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- MerryLynn N McDonald
- Division of Pulmonary, Allergy, and Critical Care Medicine, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama
- Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, Massachusetts
- Terri H Beaty
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- John E Hokanson
- Department of Epidemiology, University of Colorado, Denver, Aurora, Colorado
- Edwin K Silverman
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
11
Hackinger S, Zeggini E. Statistical methods to detect pleiotropy in human complex traits. Open Biol 2018; 7:rsob.170125. [PMID: 29093210 PMCID: PMC5717338 DOI: 10.1098/rsob.170125] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 05/22/2017] [Accepted: 09/29/2017] [Indexed: 12/13/2022] Open
Abstract
In recent years pleiotropy, the phenomenon of one genetic locus influencing several traits, has become a widely researched field in human genetics. With the increasing availability of genome-wide association study summary statistics, as well as the establishment of deeply phenotyped sample collections, it is now possible to systematically assess the genetic overlap between multiple traits and diseases. In addition to increasing power to detect associated variants, multi-trait methods can also aid our understanding of how different disorders are aetiologically linked by highlighting relevant biological pathways. A plethora of available tools to perform such analyses exists, each with their own advantages and limitations. In this review, we outline some of the currently available methods to conduct multi-trait analyses. First, we briefly introduce the concept of pleiotropy and outline the current landscape of pleiotropy research in human genetics; second, we describe analytical considerations and analysis methods; finally, we discuss future directions for the field.
|
12
|
Liang X, Sha Q, Rho Y, Zhang S. A hierarchical clustering method for dimension reduction in joint analysis of multiple phenotypes. Genet Epidemiol 2018; 42:344-353. [PMID: 29682782 DOI: 10.1002/gepi.22124] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/03/2017] [Revised: 02/01/2018] [Accepted: 02/19/2018] [Indexed: 12/25/2022]
Abstract
Genome-wide association studies (GWAS) have become a very effective research tool to identify genetic variants underlying various complex diseases. In spite of the success of GWAS in identifying thousands of reproducible associations between genetic variants and complex disease, the association between genetic variants and a single phenotype is usually weak. It is increasingly recognized that joint analysis of multiple phenotypes can be more powerful than univariate analysis, and can shed new light on the underlying biological mechanisms of complex diseases. In this paper, we develop a novel variable reduction method using a hierarchical clustering method (HCM) for joint analysis of multiple phenotypes in association studies. The proposed method involves two steps. The first step applies a dimension reduction technique by using a representative phenotype for each cluster of phenotypes. Then, existing methods are used in the second step to test the association between genetic variants and the representative phenotypes rather than the individual phenotypes. We perform extensive simulation studies to compare the powers of multivariate analysis of variance (MANOVA), joint model of multiple phenotypes (MultiPhen), and the trait-based association test that uses the extended Simes procedure (TATES) with HCM against their powers without HCM. Our simulation studies show that these tests are more powerful with HCM than without it in most scenarios. We also illustrate the usefulness of HCM by analyzing whole-genome genotyping data from a lung function study.
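The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how phenotypes are clustered or how a representative phenotype is formed, so the average-linkage clustering on the distance 1 - r, the cutoff `cut`, the cluster-mean representative, and the simple t statistic used in step two are all assumptions chosen for illustration, and the data are simulated.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(x):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in x) / (n - 1))
    return [(a - m) / sd for a in x]


def cluster_phenotypes(phenos, cut=0.5):
    """Step 1a (assumed): average-linkage agglomerative clustering on 1 - r.

    Merging stops once the closest pair of clusters is farther apart than `cut`.
    """
    k = len(phenos)
    dist = [[1 - pearson(phenos[i], phenos[j]) if i != j else 0.0
             for j in range(k)] for i in range(k)]
    clusters = [[i] for i in range(k)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cut:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def representative(phenos, cluster):
    """Step 1b (assumed): mean of the standardized phenotypes in a cluster."""
    z = [standardize(phenos[i]) for i in cluster]
    return [sum(col) / len(col) for col in zip(*z)]


def assoc_t(genotype, trait):
    """Step 2 (assumed): t statistic of a simple linear association test."""
    r = pearson(genotype, trait)
    return r * math.sqrt((len(trait) - 2) / (1 - r * r))


# Simulated toy data: the variant drives latent factor A only.
random.seed(1)
n = 200
genotype = [sum(random.random() < 0.3 for _ in range(2)) for _ in range(n)]
factor_a = [g + random.gauss(0, 1) for g in genotype]
factor_b = [random.gauss(0, 1) for _ in range(n)]
phenos = [[f + random.gauss(0, 0.3) for f in factor]
          for factor in (factor_a, factor_a, factor_b, factor_b)]

clusters = cluster_phenotypes(phenos)
t_stats = [assoc_t(genotype, representative(phenos, c)) for c in clusters]
print(clusters, [round(t, 2) for t in t_stats])
```

In this toy setting the four phenotypes collapse into two representative phenotypes, so the second step runs two tests instead of four. Any of the tests named in the abstract (MANOVA, MultiPhen, TATES) could take the place of the simple t statistic used here.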
Affiliation(s)
- Xiaoyu Liang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
- Qiuying Sha
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
- Yeonwoo Rho
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
- Shuanglin Zhang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
|
13
|
Salinas YD, Wang Z, DeWan AT. Statistical Analysis of Multiple Phenotypes in Genetic Epidemiologic Studies: From Cross-Phenotype Associations to Pleiotropy. Am J Epidemiol 2018; 187:855-863. [PMID: 29020254 DOI: 10.1093/aje/kwx296] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 02/07/2017] [Accepted: 08/03/2017] [Indexed: 12/15/2022] Open
Abstract
In the context of genetics, pleiotropy refers to the phenomenon in which a single genetic locus affects more than 1 trait or disease. Genetic epidemiologic studies have identified loci associated with multiple phenotypes, and these cross-phenotype associations are often incorrectly interpreted as examples of pleiotropy. Pleiotropy is only one possible explanation for cross-phenotype associations. Cross-phenotype associations may also arise due to issues related to study design, confounder bias, or nongenetic causal links between the phenotypes under analysis. Therefore, it is necessary to dissect cross-phenotype associations carefully to uncover true pleiotropic loci. In this review, we describe statistical methods that can be used to identify robust statistical evidence of pleiotropy. First, we provide an overview of univariate and multivariate methods for discovery of cross-phenotype associations and highlight important considerations for choosing among available methods. Then, we describe how to dissect cross-phenotype associations by using mediation analysis. Pleiotropic loci provide insights into the mechanistic underpinnings of disease comorbidity, and they may serve as novel targets for interventions that simultaneously treat multiple diseases. Discerning between different types of cross-phenotype associations is necessary to realize the public health potential of pleiotropic loci.
Affiliation(s)
- Yasmmyn D Salinas
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut
- Zuoheng Wang
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
- Andrew T DeWan
  - Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut
|