1
Huang M, Lyu C, Liu N, Nembhard WN, Witte JS, Hobbs CA, Li M. A gene-based association test of interactions for maternal-fetal genotypes identifies genes associated with nonsyndromic congenital heart defects. Genet Epidemiol 2023; 47:475-495. [PMID: 37341229 DOI: 10.1002/gepi.22533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 04/13/2023] [Accepted: 06/02/2023] [Indexed: 06/22/2023]
Abstract
The risk of congenital heart defects (CHDs) may be influenced by maternal genes, fetal genes, and their interactions. Existing methods commonly test the effects of maternal and fetal variants one at a time and may have reduced statistical power to detect genetic variants with low minor allele frequencies. In this article, we propose a gene-based association test of interactions for maternal-fetal genotypes (GATI-MFG) using a case-mother and control-mother design. GATI-MFG can integrate the effects of multiple variants within a gene or genomic region and evaluate the joint effect of maternal and fetal genotypes while allowing for their interactions. In simulation studies, GATI-MFG had improved statistical power over alternative methods, such as the single-variant test and functional data analysis (FDA), under various disease scenarios. We further applied GATI-MFG to a two-phase genome-wide association study of CHDs, testing both common and rare variants, using 947 CHD case mother-infant pairs and 1306 control mother-infant pairs from the National Birth Defects Prevention Study (NBDPS). After Bonferroni adjustment for 23,035 genes, two genes on chromosome 17, TMEM107 (p = 1.64e-06) and CTC1 (p = 2.0e-06), were significantly associated with CHD in the common-variant analysis. TMEM107 regulates ciliogenesis and ciliary protein composition and was found to be associated with heterotaxy. CTC1 plays an essential role in protecting telomeres from degradation, which was suggested to be associated with cardiogenesis. Overall, GATI-MFG outperformed the single-variant test and FDA in the simulations, and the results of the application to NBDPS samples are consistent with existing literature supporting the association of TMEM107 and CTC1 with CHDs.
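The Bonferroni adjustment reported in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a family-wise significance level of 0.05, which the abstract does not state; the gene count and the two p-values are taken from the abstract, and the function name is hypothetical.

```python
# Bonferroni-adjusted threshold for a gene-based scan.
# ASSUMPTION: family-wise alpha = 0.05 (not stated in the abstract).
ALPHA = 0.05
N_GENES = 23035  # number of genes tested, as reported in the abstract

# p-values reported in the abstract
reported = {"TMEM107": 1.64e-06, "CTC1": 2.0e-06}

# Per-gene significance threshold after Bonferroni adjustment
threshold = ALPHA / N_GENES


def passes_bonferroni(p_value, per_test_threshold):
    """Return True when a p-value falls below the per-test threshold."""
    return p_value < per_test_threshold


results = {gene: passes_bonferroni(p, threshold) for gene, p in reported.items()}

print(f"threshold = {threshold:.3e}")
for gene, significant in results.items():
    print(gene, "significant" if significant else "not significant")
```

Under that assumed level, both reported p-values fall below the per-gene threshold, which agrees with the abstract's statement that the two genes remain significant after adjustment.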
Affiliation(s)
- Manyan Huang
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Chen Lyu
  - Department of Population Health, New York University Grossman School of Medicine, New York City, New York, USA
- Nianjun Liu
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Wendy N Nembhard
  - Department of Epidemiology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- John S Witte
  - Department of Epidemiology and Population Health, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Sciences, Stanford University, Stanford, California, USA
- Charlotte A Hobbs
  - Rady Children's Institute for Genomic Medicine, San Diego, California, USA
- Ming Li
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
2
Lyu C, Huang M, Liu N, Chen Z, Lupo PJ, Tycko B, Witte JS, Hobbs CA, Li M. Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci. Bioinformatics 2022; 38:3853-3862. [PMID: 35781319 PMCID: PMC9364381 DOI: 10.1093/bioinformatics/btac443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: CpG sites within the same genomic region often share similar methylation patterns and tend to be co-regulated by multiple genetic variants that may interact with one another. RESULTS: We propose a multi-trait methylation random field (multi-MRF) method to evaluate the joint association between a set of CpG sites and a set of genetic variants. The proposed method has several advantages. First, it is a multi-trait method that allows flexible correlation structures between neighboring CpG sites (e.g. distance-based correlation). Second, it is also a multi-locus method that integrates the effect of multiple common and rare genetic variants. Third, it models the methylation traits with a beta distribution to characterize their bimodal and interval properties. Through simulations, we demonstrated that the proposed method had improved power over some existing methods under various disease scenarios. We further illustrated the proposed method via an application to a study of congenital heart defects (CHDs) with 83 cardiac tissue samples. Our results suggested that gene BACE2, a methylation quantitative trait locus (QTL) candidate, colocalized with expression QTLs in tibial artery and harbored genetic variants with nominally significant associations in two genome-wide association studies of CHD. AVAILABILITY AND IMPLEMENTATION: https://github.com/chenlyu2656/Multi-MRF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Lyu
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
  - Department of Population Health, New York University Grossman School of Medicine, New York, NY 10016, USA
- Manyan Huang
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Nianjun Liu
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Philip J Lupo
  - Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
- Benjamin Tycko
  - Center for Discovery and Innovation, Nutley, NJ 07110, USA
- John S Witte
  - Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Sciences, Stanford University, Stanford, CA 94305, USA
- Charlotte A Hobbs
  - Rady Children’s Institute for Genomic Medicine, San Diego, CA 92123, USA
- Ming Li
  - To whom correspondence should be addressed.
3
Shen X, Wen Y, Cui Y, Lu Q. A conditional autoregressive model for genetic association analysis accounting for genetic heterogeneity. Stat Med 2022; 41:517-542. [PMID: 34811777 PMCID: PMC8792507 DOI: 10.1002/sim.9257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 10/21/2021] [Accepted: 10/25/2021] [Indexed: 11/07/2022]
Abstract
Converging evidence from genetic studies and population genetics theory suggests that complex diseases are characterized by remarkable genetic heterogeneity, and that individual rare mutations with different effects could collectively play an important role in human diseases. Many existing statistical models for association analysis assume homogeneous effects of genetic variants across all individuals, and could be subject to power loss in the presence of genetic heterogeneity. To consider possible heterogeneous genetic effects among individuals, we propose a conditional autoregressive model. In the proposed method, the genetic effect is considered as a random effect and a score test is developed to test the variance component of the genetic random effect. Through simulations, we compare the type I error and power performance of the proposed method with those of the generalized genetic random field and the sequence kernel association test methods under different disease scenarios. We find that our method outperforms the other two methods when (i) the rare variants make the major contribution to the disease, or (ii) the genetic effects vary across individuals or subgroups of individuals. Finally, we illustrate the new method by applying it to the whole genome sequencing data from the Alzheimer's Disease Neuroimaging Initiative.
Affiliation(s)
- Xiaoxi Shen
  - Department of Mathematics, Texas State University, Texas, USA
  - Department of Biostatistics, University of Florida, Florida, USA
- Yalu Wen
  - Department of Statistics, University of Auckland, Auckland, New Zealand
- Yuehua Cui
  - Department of Statistics and Probability, Michigan State University, Michigan, USA
- Qing Lu
  - Department of Biostatistics, University of Florida, Florida, USA
4
Lyu C, Huang M, Liu N, Chen Z, Lupo PJ, Tycko B, Witte JS, Hobbs CA, Li M. Detecting methylation quantitative trait loci using a methylation random field method. Brief Bioinform 2021; 22:bbab323. [PMID: 34414410 PMCID: PMC8575051 DOI: 10.1093/bib/bbab323] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/09/2021] [Revised: 07/09/2021] [Accepted: 07/24/2021] [Indexed: 11/13/2022] Open
Abstract
DNA methylation may be regulated by genetic variants within a genomic region, referred to as methylation quantitative trait loci (mQTLs). The changes of methylation levels can further lead to alterations of gene expression, and influence the risk of various complex human diseases. Detecting mQTLs may provide insights into the underlying mechanism of how genotypic variations may influence the disease risk. In this article, we propose a methylation random field (MRF) method to detect mQTLs by testing the association between the methylation level of a CpG site and a set of genetic variants within a genomic region. The proposed MRF has two major advantages over existing approaches. First, it uses a beta distribution to characterize the bimodal and interval properties of the methylation trait at a CpG site. Second, it considers multiple common and rare genetic variants within a genomic region to identify mQTLs. Through simulations, we demonstrated that the MRF had improved power over other existing methods in detecting rare variants of relatively large effect, especially when the sample size is small. We further applied our method to a study of congenital heart defects with 83 cardiac tissue samples and identified two mQTL regions, MRPS10 and PSORS1C1, which were colocalized with expression QTL in cardiac tissue. In conclusion, the proposed MRF is a useful tool to identify novel mQTLs, especially for studies with limited sample sizes.
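The abstract's point about the beta distribution capturing the "bimodal and interval properties" of a methylation trait can be illustrated numerically. The sketch below evaluates a beta density at three methylation proportions; the shape parameters are illustrative assumptions chosen to give a U-shaped density, not estimates from the paper, and `beta_pdf` is a hypothetical helper written with the standard library only.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    # log of 1 / B(a, b), computed with log-gamma for numerical stability
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


# ASSUMPTION: illustrative shape parameters, not values from the paper.
A, B = 0.5, 0.5

near_zero = beta_pdf(0.05, A, B)  # mostly unmethylated
middle = beta_pdf(0.50, A, B)     # intermediate methylation
near_one = beta_pdf(0.95, A, B)   # mostly methylated

print(f"{near_zero:.3f} {middle:.3f} {near_one:.3f}")
```

With both shape parameters below 1, the density is higher near 0 and 1 than in the middle, and it is defined only on the unit interval, which is the behavior a normal model cannot reproduce for proportions.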
Affiliation(s)
- Chen Lyu
  - Department of Epidemiology and Biostatistics, Indiana University, Bloomington, IN, USA
- Manyan Huang
  - Department of Epidemiology and Biostatistics, Indiana University, Bloomington, IN, USA
- Nianjun Liu
  - Department of Epidemiology and Biostatistics, Indiana University, Bloomington, IN, USA
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics, Indiana University, Bloomington, IN, USA
- Philip J Lupo
  - Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- John S Witte
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
- Ming Li
  - Department of Epidemiology and Biostatistics, Indiana University, Bloomington, IN, USA
5
Huang M, Lyu C, Li X, Qureshi AA, Han J, Li M. Identifying Susceptibility Loci for Cutaneous Squamous Cell Carcinoma Using a Fast Sequence Kernel Association Test. Front Genet 2021; 12:657499. [PMID: 34040636 PMCID: PMC8141858 DOI: 10.3389/fgene.2021.657499] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/23/2021] [Accepted: 04/09/2021] [Indexed: 11/13/2022] Open
Abstract
Cutaneous squamous cell carcinoma (cSCC) accounts for about 20% of all skin cancers, the most common type of malignancy in the United States. Genome-wide association studies (GWAS) have successfully identified multiple genetic variants associated with the risk of cSCC. Most of these studies were single-locus-based, testing genetic variants one at a time. In this article, we performed gene-based association tests to evaluate the joint effect of multiple variants, especially rare variants, on the risk of cSCC by using a fast sequence kernel association test (fastSKAT). The study included 1,710 cSCC cases and 24,304 cancer-free controls from the Nurses' Health Study, the Nurses' Health Study II and the Health Professionals Follow-up Study. We used the UCSC Genome Browser to define gene units as candidate loci, and further evaluated the association between all variants within each gene unit and disease outcome. Four genes, HP1BP3, DAG1, SEPT7P2, and SLFN12, were identified at the Bonferroni-adjusted significance level. Our study is complementary to existing GWAS, and our findings may provide additional insights into the etiology of cSCC. Further studies are needed to validate these findings.
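To make the idea of a gene-based test that pools "all variants within each gene unit" concrete, the sketch below computes a sequence-kernel-type score statistic: a weighted sum of squared per-variant scores against an intercept-only null model. It is a simplified illustration only. The function name and toy data are hypothetical, covariates are omitted, and neither the p-value calculation nor the fast approximation that the study relied on is reproduced here.

```python
def kernel_score_statistic(genotypes, phenotype, weights):
    """Weighted sum of squared per-variant score statistics.

    genotypes: list of rows, one per subject, each a list of allele counts
    phenotype: list of 0/1 outcomes, one per subject
    weights:   list of non-negative per-variant weights
    """
    n = len(phenotype)
    # ASSUMPTION: intercept-only null model, so the fitted mean is the
    # sample case proportion; a real analysis would adjust for covariates.
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]

    q = 0.0
    for j, w in enumerate(weights):
        # Score for variant j: genotype column times null residuals
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score_j ** 2
    return q


# Toy data, invented for illustration only (4 subjects, 2 variants).
G = [[0, 1], [1, 0], [2, 0], [0, 0]]
y = [1, 1, 0, 0]
w = [1.0, 1.0]

q_value = kernel_score_statistic(G, y, w)
print(q_value)
```

Because each variant's score is squared before summing, variants with opposite directions of effect do not cancel, which is one reason this family of tests is often preferred over simple burden sums for rare variants.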
Affiliation(s)
- Manyan Huang
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
- Chen Lyu
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
- Xin Li
  - Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University - Purdue University Indianapolis, Indianapolis, IN, United States
  - Melvin and Bren Simon Cancer Center, Indianapolis, IN, United States
- Abrar A Qureshi
  - Department of Dermatology, Alpert Medical School, Brown University, Providence, RI, United States
- Jiali Han
  - Department of Epidemiology, Richard M. Fairbanks School of Public Health, Indiana University - Purdue University Indianapolis, Indianapolis, IN, United States
  - Melvin and Bren Simon Cancer Center, Indianapolis, IN, United States
- Ming Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University at Bloomington, Bloomington, IN, United States
6
Detecting Rare Mutations with Heterogeneous Effects Using a Family-Based Genetic Random Field Method. Genetics 2018; 210:463-476. [PMID: 30104420 DOI: 10.1534/genetics.118.301266] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/04/2018] [Accepted: 07/29/2018] [Indexed: 01/19/2023] Open
Abstract
The genetic etiology of many complex diseases is highly heterogeneous. A complex disease can be caused by multiple mutations within the same gene or mutations in multiple genes at various genomic loci. Although these disease-susceptibility mutations can be collectively common in the population, they are often individually rare or even private to certain families. Family-based studies are powerful for detecting rare variants enriched in families, which is an important feature for sequencing studies due to the heterogeneous nature of rare variants. In addition, family designs can provide robust protection against population stratification. Nevertheless, statistical methods for analyzing family-based sequencing data are underdeveloped, especially those accounting for the heterogeneous etiology of complex diseases. In this article, we introduce a random field framework for detecting gene-phenotype associations in family-based sequencing studies, referred to as family-based genetic random field (FGRF). Similar to existing family-based association tests, FGRF can utilize within-family and between-family information separately or jointly to test an association. We demonstrate that FGRF has statistical power comparable to that of existing methods when there is no genetic heterogeneity, but can improve statistical power when there is genetic heterogeneity across families. The proposed method also shares the advantages of conventional family-based association tests (e.g., being robust to population stratification). Finally, we applied the proposed method to sequencing data from the Minnesota Twin Family Study, and revealed several genes, including SAMD14, potentially associated with alcohol dependence.
7
He Z, Lee S, Zhang M, Smith JA, Guo X, Palmas W, Kardia SL, Ionita-Laza I, Mukherjee B. Rare-variant association tests in longitudinal studies, with an application to the Multi-Ethnic Study of Atherosclerosis (MESA). Genet Epidemiol 2017; 41:801-810. [PMID: 29076270 PMCID: PMC5696115 DOI: 10.1002/gepi.22081] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/10/2017] [Revised: 08/24/2017] [Accepted: 08/24/2017] [Indexed: 11/09/2022]
Abstract
Over the past few years, an increasing number of studies have identified rare variants that contribute to trait heritability. Due to the extreme rarity of some individual variants, gene-based association tests have been proposed to aggregate the genetic variants within a gene, pathway, or specific genomic region as opposed to a one-at-a-time single variant analysis. In addition, in longitudinal studies, statistical power to detect disease susceptibility rare variants can be improved through jointly testing repeatedly measured outcomes, which better describes the temporal development of the trait of interest. However, usual sandwich/model-based inference for sequencing studies with longitudinal outcomes and rare variants can produce deflated/inflated type I error rate without further corrections. In this paper, we develop a group of tests for rare-variant association based on outcomes with repeated measures. We propose new perturbation methods such that the type I error rate of the new tests is not only robust to misspecification of within-subject correlation, but also significantly improved for variants with extreme rarity in a study with small or moderate sample size. Through extensive simulation studies, we illustrate that substantially higher power can be achieved by utilizing longitudinal outcomes and our proposed finite sample adjustment. We illustrate our methods using data from the Multi-Ethnic Study of Atherosclerosis for exploring association of repeated measures of blood pressure with rare and common variants based on exome sequencing data on 6,361 individuals.
Affiliation(s)
- Zihuai He
  - Department of Biostatistics, Columbia University, New York, NY 10032
- Seunggeun Lee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109
- Min Zhang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI 48109
- Xiuqing Guo
  - Department of Pediatrics, Harbor-UCLA Medical Center, Torrance, CA 90509
- Walter Palmas
  - Department of Medicine, Columbia University, New York, NY 10032
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109
8
He Z, Xu B, Lee S, Ionita-Laza I. Unified Sequence-Based Association Tests Allowing for Multiple Functional Annotations and Meta-analysis of Noncoding Variation in Metabochip Data. Am J Hum Genet 2017; 101:340-352. [PMID: 28844485 PMCID: PMC5590864 DOI: 10.1016/j.ajhg.2017.07.011] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 03/31/2017] [Accepted: 07/18/2017] [Indexed: 12/14/2022] Open
Abstract
Substantial progress has been made in the functional annotation of genetic variation in the human genome. Integrative analysis that incorporates such functional annotations into sequencing studies can aid the discovery of disease-associated genetic variants, especially those with unknown function and located outside protein-coding regions. Direct incorporation of one functional annotation as weight in existing dispersion and burden tests can suffer substantial loss of power when the functional annotation is not predictive of the risk status of a variant. Here, we have developed unified tests that can utilize multiple functional annotations simultaneously for integrative association analysis with efficient computational techniques. We show that the proposed tests significantly improve power when variant risk status can be predicted by functional annotations. Importantly, when functional annotations are not predictive of risk status, the proposed tests incur only minimal loss of power in relation to existing dispersion and burden tests, and under certain circumstances they can even have improved power by learning a weight that better approximates the underlying disease model in a data-adaptive manner. The tests can be constructed with summary statistics of existing dispersion and burden tests for sequencing data, therefore allowing meta-analysis of multiple studies without sharing individual-level data. We applied the proposed tests to a meta-analysis of noncoding rare variants in Metabochip data on 12,281 individuals from eight studies for lipid traits. By incorporating the Eigen functional score, we detected significant associations between noncoding rare variants in SLC22A3 and low-density lipoprotein and total cholesterol, associations that are missed by standard dispersion and burden tests.
Affiliation(s)
- Zihuai He
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
- Bin Xu
  - Department of Psychiatry, Columbia University, New York, NY 10032, USA
- Seunggeun Lee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
9
Li M, Li J, He Z, Lu Q, Witte JS, Macleod SL, Hobbs CA, Cleves MA. Testing Allele Transmission of an SNP Set Using a Family-Based Generalized Genetic Random Field Method. Genet Epidemiol 2016; 40:341-51. [PMID: 27061818 DOI: 10.1002/gepi.21970] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/13/2015] [Revised: 02/19/2016] [Accepted: 02/22/2016] [Indexed: 12/20/2022]
Abstract
Family-based association studies are commonly used in genetic research because they can be robust to population stratification (PS). Recent advances in high-throughput genotyping technologies have produced a massive amount of genomic data in family-based studies. However, current family-based association tests are mainly focused on evaluating individual variants one at a time. In this article, we introduce a family-based generalized genetic random field (FB-GGRF) method to test the joint association between a set of autosomal SNPs (i.e., single-nucleotide polymorphisms) and disease phenotypes. The proposed method is a natural extension of a recently developed GGRF method for population-based case-control studies. It models offspring genotypes conditional on parental genotypes and is thus robust to PS. Through simulations, we showed that, under various disease scenarios, the FB-GGRF has improved power over a commonly used family-based sequence kernel association test (FB-SKAT). Further, similar to GGRF, the proposed FB-GGRF method is asymptotically well-behaved and does not require empirical adjustment of the type I error rates. We illustrate the proposed method using a study of congenital heart defects with family trios from the National Birth Defects Prevention Study (NBDPS).
Affiliation(s)
- Ming Li
  - Department of Epidemiology and Biostatistics, Indiana University at Bloomington, Bloomington, Indiana, United States of America
- Jingyun Li
  - Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
- Zihuai He
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
- Qing Lu
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, United States of America
- John S Witte
  - Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California, United States of America
- Stewart L Macleod
  - Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
- Charlotte A Hobbs
  - Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
- Mario A Cleves
  - Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America
10
Zeng P, Wang T. Detecting the Genomic Signature of Divergent Selection in Presence of Gene Flow. Curr Genomics 2015; 16:203-12. [PMID: 26069460 PMCID: PMC4460224 DOI: 10.2174/1389202916666150313230943] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/16/2014] [Revised: 02/23/2015] [Accepted: 03/09/2015] [Indexed: 11/22/2022] Open
Abstract
In this paper, the detection of rare-variant associations with continuous phenotypes of interest is investigated via the likelihood-ratio-based variance component test under the framework of linear mixed models. The hypothesis testing is challenging and nonstandard, since under the null the variance component is located on the boundary of its parameter space. In this situation, the usual asymptotic chi-square distribution of the likelihood ratio statistic does not necessarily hold. To circumvent the derivation of the null distribution, we resort to the bootstrap method because it is generically applicable and easy to implement. Both parametric and nonparametric bootstrap likelihood ratio tests are studied. Numerical studies are implemented to evaluate the performance of the proposed bootstrap likelihood ratio test and to compare it with some existing methods for the identification of rare variants. To reduce the computational time of the bootstrap likelihood ratio test, we propose an effective approximation mixture for the bootstrap null distribution. The GAW17 data are used to illustrate the proposed test.
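The boundary problem described in this abstract can be made concrete with the textbook reference distribution for a single variance component tested at zero: a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. The sketch below compares the naive chi-square p-value with that mixture p-value. This is a classical large-sample approximation shown for illustration, not the bootstrap procedure or the approximation proposed in the paper, and the function names and the example statistic are assumptions.

```python
import math


def chi2_1_sf(x):
    """Upper tail P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_mixture_pvalue(lrt_stat):
    """p-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt_stat <= 0.0:
        return 1.0  # every draw from the mixture is at least zero
    return 0.5 * chi2_1_sf(lrt_stat)


# ASSUMPTION: an illustrative likelihood ratio statistic, not from the paper.
lrt = 2.7055

naive_p = chi2_1_sf(lrt)                 # ignores the boundary constraint
mixture_p = boundary_mixture_pvalue(lrt)  # accounts for the boundary

print(f"naive = {naive_p:.4f}, mixture = {mixture_p:.4f}")
```

For any positive statistic the mixture p-value is half the naive one, so ignoring the boundary makes the test conservative. When even the mixture is unreliable in finite samples, resampling the null distribution, as the abstract proposes, is the usual alternative.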
Affiliation(s)
- Ping Zeng
  - Department of Epidemiology and Biostatistics, and Center of Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, 221004, P. R. China
- Ting Wang
  - Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, 221004, P. R. China
11
Zeng P, Zhao Y, Li H, Wang T, Chen F. Permutation-based variance component test in generalized linear mixed model with application to multilocus genetic association study. BMC Med Res Methodol 2015; 15:37. [PMID: 25897803 PMCID: PMC4410500 DOI: 10.1186/s12874-015-0030-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/04/2014] [Accepted: 04/07/2015] [Indexed: 11/29/2022] Open
Abstract
Background: In many medical studies, the likelihood ratio test (LRT) has been widely applied to examine whether the random effects variance component is zero within the mixed effects models framework, whereas little work on likelihood-ratio-based variance component tests has been done for generalized linear mixed models (GLMM), where the response is discrete and the log-likelihood cannot be computed exactly. Before applying the LRT for a variance component in GLMM, several difficulties need to be overcome, including the computation of the log-likelihood, the parameter estimation and the derivation of the null distribution for the LRT statistic. Methods: To overcome these problems, in this paper we make use of the penalized quasi-likelihood algorithm and calculate the LRT statistic based on the resulting working response and the quasi-likelihood. The permutation procedure is used to obtain the null distribution of the LRT statistic. We evaluate the permutation-based LRT via simulations and compare it with the score-based variance component test and the tests based on the mixture of chi-square distributions. Finally, we apply the permutation-based LRT to multilocus association analysis in the case–control study, where the problem can be investigated under the framework of the logistic mixed effects model. Results: The simulations show that the permutation-based LRT can effectively control the type I error rate, while the score test is sometimes slightly conservative and the tests based on mixtures cannot maintain the type I error rate. Our studies also show that the permutation-based LRT has higher power than these existing tests and still maintains reasonably high power even when the random effects do not follow a normal distribution. The application to GAW17 data also demonstrates that the proposed LRT has a higher probability of identifying the association signals than the score test and the tests based on mixtures.
Conclusions: In the present paper, the permutation-based LRT was developed for the variance component in GLMM. The LRT outperforms existing tests and has reasonably higher power under various scenarios; additionally, it is conceptually simple and easy to implement.
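The permutation procedure used to obtain a null distribution can be sketched in general form. The example below permutes outcome labels and recomputes a test statistic each time. As a loudly stated assumption, it uses a simple difference in group means as a stand-in statistic rather than the penalized quasi-likelihood LRT of the paper, and the data, function names, number of permutations, and seed are all invented for illustration.

```python
import random


def mean_difference(values, labels):
    """Difference in mean value between label-1 and label-0 subjects."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def permutation_pvalue(values, labels, statistic, n_perm=2000, seed=0):
    """Two-sided permutation p-value for an arbitrary statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistic(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and values
        if abs(statistic(values, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero
    return (hits + 1) / (n_perm + 1)


# Toy data, invented for illustration only.
scores = [2.1, 2.4, 2.2, 2.6, 1.0, 1.1, 0.9, 1.2]
status = [1, 1, 1, 1, 0, 0, 0, 0]

p_value = permutation_pvalue(scores, status, mean_difference)
print(p_value)
```

The same loop works for any statistic passed in as `statistic`, which is what makes the approach attractive when the analytic null distribution is unknown or unreliable; the cost is that the statistic must be recomputed for every permutation.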
Affiliation(s)
- Ping Zeng
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China
  - Department of Epidemiology and Biostatistics, Center of Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical College, Xuzhou, 221004, Jiangsu, People's Republic of China
- Yang Zhao
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China
- Hongliang Li
  - Center for Disease Control and Prevention of Pudong New Area, Pudong New Area, Shanghai, 200136, People's Republic of China
- Ting Wang
  - Department of Epidemiology and Biostatistics, Center of Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical College, Xuzhou, 221004, Jiangsu, People's Republic of China
- Feng Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China
12
He Z, Zhang M, Lee S, Smith JA, Guo X, Palmas W, Kardia SLR, Diez Roux AV, Mukherjee B. Set-based tests for genetic association in longitudinal studies. Biometrics 2015; 71:606-15. [PMID: 25854837 DOI: 10.1111/biom.12310] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 03/01/2014] [Revised: 01/01/2015] [Accepted: 02/01/2015] [Indexed: 11/30/2022]
Abstract
Genetic association studies with longitudinal markers of chronic diseases (e.g., blood pressure, body mass index) provide a valuable opportunity to explore how genetic variants affect traits over time by utilizing the full trajectory of longitudinal outcomes. Since these traits are likely influenced by the joint effect of multiple variants in a gene, a joint analysis of these variants considering linkage disequilibrium (LD) may help to explain additional phenotypic variation. In this article, we propose a longitudinal genetic random field model (LGRF), to test the association between a phenotype measured repeatedly during the course of an observational study and a set of genetic variants. Generalized score type tests are developed, which we show are robust to misspecification of within-subject correlation, a feature that is desirable for longitudinal analysis. In addition, a joint test incorporating gene-time interaction is further proposed. Computational advancement is made for scalable implementation of the proposed methods in large-scale genome-wide association studies (GWAS). The proposed methods are evaluated through extensive simulation studies and illustrated using data from the Multi-Ethnic Study of Atherosclerosis (MESA). Our simulation results indicate substantial gain in power using LGRF when compared with two commonly used existing alternatives: (i) single marker tests using longitudinal outcome and (ii) existing gene-based tests using the average value of repeated measurements as the outcome.
Affiliation(s)
- Zihuai He
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A
- Min Zhang
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A
- Seunggeun Lee
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A
- Jennifer A Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, Michigan, U.S.A
- Xiuqing Guo
  - Department of Pediatrics, Harbor-UCLA Medical Center, Torrance, California, U.S.A
- Walter Palmas
  - Department of Medicine, Columbia University, New York, New York, U.S.A
- Sharon L R Kardia
  - Department of Epidemiology, University of Michigan, Ann Arbor, Michigan, U.S.A
- Ana V Diez Roux
  - Department of Epidemiology, Drexel University, Philadelphia, U.S.A
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A