1
|
Noh M, Ha ID, Lee Y, Lim J, Lee J, Oh H, Shin D, Lee S, Seo J, Park Y, Cho S, Park J, Kim Y, You K. SRC-Stat Package for Fitting Double Hierarchical Generalized Linear Models. Korean Journal of Applied Statistics 2015. [DOI: 10.5351/kjas.2015.28.2.343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
|
2
|
Li J, Yang J, Levin AM, Montgomery CG, Datta I, Trudeau S, Adrianto I, McKeigue P, Iannuzzi MC, Rybicki BA. Efficient generalized least squares method for mixed population and family-based samples in genome-wide association studies. Genet Epidemiol 2014; 38:430-8. [PMID: 24845555 DOI: 10.1002/gepi.21811] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 06/24/2013] [Revised: 03/26/2014] [Accepted: 03/26/2014] [Indexed: 12/16/2022]
Abstract
Genome-wide association studies (GWAS) that draw samples from multiple studies with a mixture of relationship structures are becoming more common. Analytical methods exist for using mixed-sample data, but few methods have been proposed for the analysis of genotype-by-environment (G×E) interactions. Using GWAS data from a study of sarcoidosis susceptibility genes in related and unrelated African Americans, we explored the current analytic options for genotype association testing in studies using both unrelated and family-based designs. We propose a novel method, generalized least squares (GLX), to estimate both SNP and G×E interaction effects for categorical environmental covariates and compared this method to generalized estimating equations (GEE), logistic regression, the Cochran-Armitage trend test, and the WQLS and MQLS methods. We used simulation to demonstrate that the GLX method reduces type I error under a variety of pedigree structures. We also demonstrate its superior power to detect SNP effects while offering computational advantages and comparable power to detect G×E interactions versus GEE. Using this method, we found two novel SNPs that demonstrate a significant genome-wide interaction with insecticide exposure: rs10499003 and rs7745248, located in the intronic and 3' UTR regions of the FUT9 gene on chromosome 6q16.1.
Affiliation(s)
- Jia Li
- Department of Public Health Sciences, Henry Ford Health System, Detroit, Michigan, United States of America
|
3
|
Molas M, Noh M, Lee Y, Lesaffre E. Joint hierarchical generalized linear models with multivariate Gaussian random effects. Comput Stat Data Anal 2013. [DOI: 10.1016/j.csda.2013.07.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
4
|
Lee Y, Noh M. Modelling random effect variance with double hierarchical generalized linear models. Stat Model 2012. [DOI: 10.1177/1471082x12460132] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
Abstract
Random-effect models are becoming increasingly popular in the analysis of data. Lee and Nelder (2006) introduced double hierarchical generalized linear models (DHGLMs) in which not only the mean but also the residual variance (overdispersion) can be further modelled as random-effect models. In this article, we introduce DHGLMs that allow random-effect models for both the variances of random effects and the residual variance. We show how to use this general model class for the analysis of data and discuss how to select the best fitting model using the likelihood and various model-checking plots.
Affiliation(s)
- Youngjo Lee
- Department of Statistics, Seoul National University, Seoul 151-742, South Korea
- Maengseok Noh
- Department of Statistics, Pukyong National University, Busan 608-737, South Korea
|
5
|
Neuhaus JM, McCulloch CE. The effect of misspecification of random effects distributions in clustered data settings with outcome-dependent sampling. Can J Stat 2011. [PMID: 23204632 DOI: 10.1002/cjs.10117] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Genetic epidemiologists often gather outcome-dependent samples of family data to measure within-family associations of genetic factors with disease outcomes. Generalized linear mixed models provide effective methods to estimate within-family associations but typically require parametric specification of the random effects distribution. Although misspecification of the random effects distribution often leads to little bias in estimated regression coefficients in standard, prospective clustered data settings, some recent studies suggest that such misspecification will impact parameter estimates from outcome-dependent cluster sampling designs. Using analytic results, simulation studies and fits to example data, this study examines the effect of misspecification of random effects distributions on parameter estimates in clustered data settings with outcome-dependent sampling. We show that the effects are consistent with results from prospective cluster sampling settings. In particular, ascertainment-corrected mixed model methods that assume normally distributed random intercepts, as well as conditional likelihood approaches, provide accurate estimates of within-family covariate effects even under a misspecified random effects distribution.
Affiliation(s)
- John M Neuhaus
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94143-0560, USA
|
6
|
Abstract
Correlated survival times can be modelled by introducing a random effect, or frailty component, into the hazard function. For multivariate survival data, we extend a non-proportional hazards (PH) model, the generalized time-dependent logistic survival model, to include random effects. The hierarchical likelihood procedure, which obviates the need for marginalization over the random effect distribution, is derived for this extended model and its properties are discussed. The extended model leads to a robust estimation result for the regression parameters against the misspecification of the form of the basic hazard function or frailty distribution compared to PH-based alternatives. The proposed method is illustrated by two practical examples and a simulation study which demonstrate the advantages of the new model.
Affiliation(s)
- Il Do Ha
- Department of Asset Management, Daegu Haany University, South Korea
- Gilbert MacKenzie
- Centre of Biostatistics, Department of Mathematics & Statistics, University of Limerick, Ireland and ENSAI, Rennes, France
|
7
|
Ma Y, Genton MG. Explicit estimating equations for semiparametric generalized linear latent variable models. J R Stat Soc Series B Stat Methodol 2010. [DOI: 10.1111/j.1467-9868.2010.00741.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]
|
8
|
Yip BH, Reilly M, Cnattingius S, Pawitan Y. Matched ascertainment of informative families for complex genetic modelling. Behav Genet 2010; 40:404-14. [PMID: 20033275 PMCID: PMC2953624 DOI: 10.1007/s10519-009-9322-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/18/2008] [Accepted: 11/28/2009] [Indexed: 11/26/2022]
Abstract
Family data are used extensively in quantitative genetic studies to disentangle the genetic and environmental contributions to various diseases. Many family studies based their analysis on population-based registers containing a large number of individuals composed of small family units. For binary trait analyses, exact marginal likelihood is a common approach, but, due to the computational demand of the enormous data sets, it allows only a limited number of effects in the model. This makes it particularly difficult to perform joint estimation of variance components for a binary trait and the potential confounders. We have developed a data-reduction method of ascertaining informative families from population-based family registers. We propose a scheme where the ascertained families match the full cohort with respect to some relevant statistics, such as the risk to relatives of an affected individual. The ascertainment-adjusted analysis, which we implement using a pseudo-likelihood approach, is shown to be efficient relative to the analysis of the whole cohort and robust to mis-specification of the random effect distribution.
Affiliation(s)
- Benjamin H. Yip
- Department of Psychiatry, University of Hong Kong, Hong Kong, China
- Marie Reilly
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Sven Cnattingius
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Yudi Pawitan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
|
9
|
|
10
|
Lee Y, Nelder JA. Rejoinder: Likelihood Inference for Models with Unobservables Another View. Stat Sci 2009. [DOI: 10.1214/09-sts277rej] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
11
|
Rue H, Martino S, Chopin N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J R Stat Soc Series B Stat Methodol 2009. [DOI: 10.1111/j.1467-9868.2008.00700.x] [Citation(s) in RCA: 2620] [Impact Index Per Article: 174.7] [Indexed: 11/30/2022]
|
12
|
Yip BH, Björk C, Lichtenstein P, Hultman CM, Pawitan Y. Covariance component models for multivariate binary traits in family data analysis. Stat Med 2008; 27:1086-105. [PMID: 17634971 DOI: 10.1002/sim.2996] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
For family studies, there is now an established analytical framework for binary-trait outcomes within the generalized linear mixed models (GLMMs). However, the corresponding analysis of multivariate binary-trait (MBT) outcomes is still limited. Certain diseases, such as schizophrenia and bipolar disorder, have similarities in epidemiological features, risk factor patterns and intermediate phenotypes. To have a better etiological understanding, it is important to investigate the common genetic and environmental factors driving the comorbidity of the diseases. In this paper, we develop a suitable GLMM for MBT outcomes from extended families, such as nuclear, paternal- and maternal-halfsib families. We motivate our problem with real questions from psychiatric epidemiology and demonstrate how different substantive issues of comorbidity between two diseases can be put into the analytical framework.
Affiliation(s)
- Benjamin H Yip
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobelvägen 12, Stockholm, Sweden
|
13
|
Bowden J, Thompson JR, Burton PR. A two-stage approach to the correction of ascertainment bias in complex genetic studies involving variance components. Ann Hum Genet 2007; 71:220-9. [PMID: 17354286 DOI: 10.1111/j.1469-1809.2006.00307.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Correction for ascertainment bias is a vital part of the analysis of genetic epidemiology studies that needs to be undertaken whenever subjects are not recruited at random. Adjustment often requires extensive numerical integration, which can be very slow or even computationally infeasible, especially if the model includes many fixed and random effects. In this paper we propose a two-stage method for ascertainment bias correction. In the first stage we estimate parameters that pertain to the ascertained population, that is the population that would be selected into the sample if the ascertainment criterion were applied to everyone. In the second stage we convert the estimates for the ascertained population into general population parameter estimates. We illustrate the method with simulations based on a simple model and then describe how the method can be used with complex models. The two-stage approach avoids some of the integration required in direct adjustment, hence speeding up the process of model fitting.
|
14
|
|