1
|
Ren B, Lipsitz SR, Fitzmaurice GM, Weiss RD. Permutation Tests for Assessing Potential Non-Linear Associations between Treatment Use and Multivariate Clinical Outcomes. Multivariate Behavioral Research 2024; 59:110-122. [PMID: 37379399 PMCID: PMC10753035 DOI: 10.1080/00273171.2023.2217662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2023]
Abstract
In many psychometric applications, the relationship between the mean of an outcome and a quantitative covariate is too complex to be described by simple parametric functions; instead, flexible nonlinear relationships can be incorporated using penalized splines. Penalized splines can be conveniently represented as a linear mixed effects model (LMM), where the coefficients of the spline basis functions are random effects. The LMM representation of penalized splines makes the extension to multivariate outcomes relatively straightforward. In the LMM, no effect of the quantitative covariate on the outcome corresponds to the null hypothesis that a fixed effect and a variance component are both zero. Under the null, the usual asymptotic chi-square distribution of the likelihood ratio test for the variance component does not hold. Therefore, we propose three permutation tests for the likelihood ratio test statistic: one based on permuting the quantitative covariate, the other two based on permuting residuals. We compare via simulation the Type I error rate and power of the three permutation tests obtained from joint models for multiple outcomes, as well as a commonly used parametric test. The tests are illustrated using data from a stimulant use disorder psychosocial clinical trial.
Affiliation(s)
- Boyu Ren
  - McLean Hospital, Belmont, MA, USA
|
2
|
Todem D, Hsu WW, Kim K. Nonparametric scanning tests of homogeneity for hierarchical models with continuous covariates. Biometrics 2023; 79:2063-2075. [PMID: 36454666 PMCID: PMC10232678 DOI: 10.1111/biom.13801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Accepted: 10/31/2022] [Indexed: 12/03/2022]
Abstract
In many applications of hierarchical models, there is often interest in evaluating the inherent heterogeneity in view of observed data. When the underlying hypothesis involves parameters resting on the boundary of their support space such as variances and mixture proportions, it is a usual practice to entertain testing procedures that rely on common heterogeneity assumptions. Such procedures, albeit omnibus for general alternatives, may entail a substantial loss of power for specific alternatives such as heterogeneity varying with covariates. We introduce a novel and flexible approach that uses covariate information to improve the power to detect heterogeneity, without imposing unnecessary restrictions. With continuous covariates, the approach does not impose a regression model relating heterogeneity parameters to covariates or rely on arbitrary discretizations. Instead, a scanning approach requiring continuous dichotomizations of the covariates is proposed. Empirical processes resulting from these dichotomizations are then used to construct the test statistics, with limiting null distributions shown to be functionals of tight random processes. We illustrate our proposals and results on a popular class of two-component mixture models, followed by simulation studies and applications to two real datasets in cancer and caries research.
Affiliation(s)
- David Todem
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, USA
- Wei-Wen Hsu
  - Department of Environmental and Public Health Sciences, University of Cincinnati, Cincinnati, Ohio, USA
- KyungMann Kim
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
|
3
|
Devogel N, Auer PL, Manansala R, Wang T. On asymptotic distributions of several test statistics for familial relatedness in linear mixed models. Stat Med 2023; 42:2962-2981. [PMID: 37345498 DOI: 10.1002/sim.9762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Revised: 03/16/2023] [Accepted: 04/26/2023] [Indexed: 06/23/2023]
Abstract
In this study, the asymptotic distributions of the likelihood ratio test (LRT), the restricted likelihood ratio test (RLRT), the F and the sequence kernel association test (SKAT) statistics for testing an additive effect of the expected familial relatedness (FR) in a linear mixed model (LMM) are examined based on an eigenvalue approach. First, the covariance structure for modeling the FR effect in an LMM is presented. Then, the multiplicity of eigenvalues for the log-likelihood and restricted log-likelihood is established under a replicate family setting and extended to a more general replicate family setting (GRFS) as well. After that, the asymptotic null distributions of LRT, RLRT, F and SKAT statistics under GRFS are derived. The asymptotic null distribution of SKAT for testing genetic rare variants is also constructed. In addition, a simple formula for sample size calculation is provided based on the restricted maximum likelihood estimate of the effect size for the expected FR. Finally, a power comparison of these test statistics for testing the expected FR effect is made via simulation. The four test statistics are also applied to a data set from the UK Biobank.
Affiliation(s)
- Nicholas Devogel
  - Division of Biostatistics, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Paul L Auer
  - Division of Biostatistics, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
- Regina Manansala
  - Centre for Health Economics Research & Modelling Infectious Diseases, Vaccine & Infectious Disease Institute WHO Collaborating Centre, Faculty of Medicine & Health Sciences, University of Antwerp, Antwerp, Belgium
- Tao Wang
  - Division of Biostatistics, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
|
4
|
Rubio FJ, Drikvandi R. MEGH: A parametric class of general hazard models for clustered survival data. Stat Methods Med Res 2022; 31:1603-1616. [PMID: 35668699 PMCID: PMC9315191 DOI: 10.1177/09622802221102620] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
In many applications of survival data analysis, the individuals are treated in different medical centres or belong to different clusters defined by geographical or administrative regions. The analysis of such data requires accounting for between-cluster variability. Ignoring such variability would impose unrealistic assumptions in the analysis and could affect the inference on the statistical models. We develop a novel parametric mixed-effects general hazard (MEGH) model that is particularly suitable for the analysis of clustered survival data. The proposed structure generalises the mixed-effects proportional hazards and mixed-effects accelerated failure time structures, among other structures, which are obtained as special cases of the MEGH structure. We develop a likelihood-based algorithm for parameter estimation in general subclasses of the MEGH model, which is implemented in our R package MEGH. We propose diagnostic tools for assessing the random effects and their distributional assumption in the proposed MEGH model. We investigate the performance of the MEGH model using theoretical and simulation studies, as well as a real data application on leukaemia.
Affiliation(s)
- Reza Drikvandi
  - Department of Mathematical Sciences, Durham University, Durham, UK
|
5
|
Ekvall KO, Bottai M. Confidence regions near singular information and boundary points with applications to mixed models. Ann Stat 2022. [DOI: 10.1214/22-aos2177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Karl Oskar Ekvall
  - Division of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet
- Matteo Bottai
  - Division of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet
|
6
|
Brown C, Templin J. Modification Indices for Diagnostic Classification Models. Multivariate Behavioral Research 2022:1-18. [PMID: 35507677 DOI: 10.1080/00273171.2022.2049672] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Diagnostic classification models (DCMs) are psychometric models for evaluating a student's mastery of the essential skills in a content domain based upon their responses to a set of test items. Currently, diagnostic model and/or Q-matrix misspecification is a known problem with limited avenues for remediation. To address this problem, this paper defines a one-sided score statistic that is a computationally efficient method for detecting under-specification at the item level of both the Q-matrix and the model parameters of the particular DCM chosen in an analysis. This method is analogous to the modification indices widely used in structural equation modeling. The results of a simulation study show the Type I error rates of modification indices for DCMs are acceptably close to the nominal significance level when the appropriate mixture χ² reference distribution is used. The simulation results indicate that modification indices are very powerful in the detection of an under-specified Q-matrix and have ample power to detect the omission of model parameters in large samples or when the items are highly discriminating. An application of modification indices for DCMs to an analysis of response data from a large-scale administration of a diagnostic test demonstrates how they can be useful in diagnostic model refinement.
Affiliation(s)
- Christy Brown
  - Department of Education and Human Development, Clemson University
- Jonathan Templin
  - Department of Psychological and Quantitative Foundations, University of Iowa
|
7
|
Gaston RT, Habyarimana F, Ramroop S. Joint modelling of anaemia and stunting in children less than five years of age in Lesotho: a cross-sectional case study. BMC Public Health 2022; 22:285. [PMID: 35148690 PMCID: PMC8840695 DOI: 10.1186/s12889-022-12690-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/09/2021] [Accepted: 01/27/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: Anaemia and stunting remain jointly a serious health issue worldwide, especially in developing countries. In Lesotho, their prevalence is high, particularly among children less than 5 years of age. OBJECTIVES: The primary objective was to determine the association between anaemia and stunting, and identify factors relating to both conditions among children younger than 5 years in Lesotho. METHODS: This cross-sectional study used secondary data from 3112 children collected during the 2014 Lesotho Demographic Health Survey (LDHS). Haemoglobin (Hb) levels were adjusted for altitude and a level less than 11 g per decilitre (11 g/dl) was determined as the cutoff for being anaemic. A child with the height-for-age z score (HAZ) below minus two standard deviations (SD) was considered to have stunting. We linked factors relating to anaemia and stunting using a multivariate joint model under the scope of the generalized linear mixed model (GLMM). RESULTS: The prevalences of anaemia and stunting in children younger than 5 years were 51% and 43%, respectively. The multivariate results revealed a strong association between anaemia and stunting. In addition, maternal education, urban vs. rural residence, wealth index and childbirth weight significantly impacted childhood stunting or malnutrition, while having fever and/or diarrhoea was linked to anaemia. Lastly, age was shown to have a significant effect on both stunting and anaemia. CONCLUSION: Anaemia and stunting or malnutrition showed linked longitudinal trajectories, suggesting both conditions could lead to synergetic improvements in overall child health. Demographic, socio-economic, and geographical characteristics were also important drivers of stunting and anaemia in children younger than 5 years. Thus, children living in similar resource settings as Lesotho could benefit from coordinated programs designed to address both malnutrition and anaemia.
Affiliation(s)
- Rugiranka Tony Gaston
  - School of Mathematics, Statistics and Computer Sciences, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg Campus, Scottsville, 3209, South Africa
  - Health Economics and HIV/AIDS Research Division (HEARD), University of KwaZulu-Natal, Westville Campus, Private Bag X01, Westville, 3629, South Africa
- Faustin Habyarimana
  - School of Mathematics, Statistics and Computer Sciences, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg Campus, Scottsville, 3209, South Africa
- Shaun Ramroop
  - School of Mathematics, Statistics and Computer Sciences, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg Campus, Scottsville, 3209, South Africa
|
8
|
Lee W, Kim J, Lee D. Revisiting the analysis pipeline for overdispersed Poisson and binomial data. J Appl Stat 2022; 50:1455-1476. [PMID: 37197756 PMCID: PMC10184615 DOI: 10.1080/02664763.2022.2026897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Overdispersion is a common feature in categorical data analysis and several methods have been developed for detecting and handling it in generalized linear models. The first aim of this study is to clarify the relationships among various score statistics for testing overdispersion and to compare their performances. In addition, we investigate a principled way to correct finite sample bias in the score statistic caused by estimating regression parameters with restricted likelihood. The second aim is to reconsider the current practice for handling overdispersed categorical data. Although the conventional models are based on substantially different mechanisms for generating overdispersion, model selection in practice has not been well studied. We perform an intensive numerical study for determining which method is more robust to various overdispersion mechanisms. In addition, we provide some graphical tools for identifying the better model. The last aim is to reconsider the key assumption for deriving the score statistics. We study the meaning of testing overdispersion when this assumption is violated, and we analytically show the conditions for which it is not appropriate to employ the current statistical practices for analyzing overdispersed data.
Affiliation(s)
- Woojoo Lee
  - Department of Public Health Science, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea
- Jeonghwan Kim
  - Department of Statistics, Ewha Womans University, Seoul, Republic of Korea
- Donghwan Lee
  - Department of Statistics, Ewha Womans University, Seoul, Republic of Korea
|
9
|
Collyer ML, Baken EK, Adams DC. A standardized effect size for evaluating and comparing the strength of phylogenetic signal. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13749] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]
Affiliation(s)
- Erica K. Baken
  - Department of Science, Chatham University, Pittsburgh, PA, USA
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
- Dean C. Adams
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
|
10
|
Li S, Cai TT, Li H. Inference for high-dimensional linear mixed-effects models: A quasi-likelihood approach. J Am Stat Assoc 2021; 117:1835-1846. [PMID: 36793369 PMCID: PMC9928173 DOI: 10.1080/01621459.2021.1888740] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/21/2019] [Revised: 01/15/2021] [Accepted: 02/04/2021] [Indexed: 10/22/2022]
Abstract
Linear mixed-effects models are widely used in analyzing clustered or repeated measures data. We propose a quasi-likelihood approach for estimation and inference of the unknown parameters in linear mixed-effects models with high-dimensional fixed effects. The proposed method is applicable to general settings where the dimension of the random effects and the cluster sizes are possibly large. Regarding the fixed effects, we provide rate optimal estimators and valid inference procedures that do not rely on the structural information of the variance components. We also study the estimation of variance components with high-dimensional fixed effects in general settings. The algorithms are easy to implement and computationally fast. The proposed methods are assessed in various simulation settings and are applied to a real study regarding the associations between body mass index and genetic polymorphic markers in a heterogeneous stock mice population.
Affiliation(s)
- Sai Li
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- T Tony Cai
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
|
11
|
Zhang C, Wang X, Chen M, Wang T. A comparison of hypothesis tests for homogeneity in meta-analysis with focus on rare binary events. Res Synth Methods 2021; 12:408-428. [PMID: 34231330 DOI: 10.1002/jrsm.1484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2019] [Revised: 01/01/2021] [Accepted: 02/03/2021] [Indexed: 11/06/2022]
Abstract
Analysis of rare binary events is an important problem for biomedical researchers. Due to the sparsity of events in such problems, meta-analysis that integrates information across multiple studies can be applied to increase the efficiency of statistical inference. Although it is critical to examine whether the effect sizes are homogeneous across all studies, a comprehensive review of homogeneity tests has been lacking, and in particular, no attention has been paid to infrequent dichotomous outcomes. We systematically review statistical methods for homogeneity testing. By conducting an extensive simulation analysis and two case studies, we examine the performance of 30 tests in meta-analysis of rare binary outcomes. When using log-odds ratio as the association measure, our simulation results suggest that there is no uniform winner. However, we recommend the test proposed by Kulinskaya and Dollinger (BMC Med Res Methodol, 2015, 15), which uses a gamma distribution to approximate the null distribution, for its generally good performance; for very rare events coupled with small within-study sample sizes, in addition to the Kulinskaya-Dollinger test, we further recommend the conditional score test based on the random-effects hypergeometric model proposed by Liang and Self (Biometrika, 1985, 72:353-358). One should be cautious about the use of the Wald tests, the Lipsitz tests (Biometrics, 1998, 54:148-160), and tests proposed by Bhaumik et al (J Am Stat Assoc, 2012, 107:555-567).
Affiliation(s)
- Chiyu Zhang
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA
- Xinlei Wang
  - Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA
- Min Chen
  - Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, USA
  - Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, Texas, USA
- Tao Wang
  - Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, Texas, USA
|
12
|
Smits DJM, De Boeck P, Vansteelandt K. The inhibition of verbally aggressive behaviour. European Journal of Personality 2020. [DOI: 10.1002/per.529] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/07/2022]
Abstract
We studied the inhibition of verbal aggression, defined as not displaying verbal aggression when one would want to. The approach we used was based on a situation–response questionnaire containing 15 anger provoking situations and three verbally aggressive responses. Two questions were asked for each combination of a situation and a response: one about wanting to react in a verbally aggressive way and one about actually displaying the reaction. This questionnaire was administered to 316 participants. Based on different theories about inhibition, several logistic mixed models were constructed and tested against each other. In the best fitting model, inhibition was conceptualized as a trait. Trait inhibition was negatively correlated with external measures of Anger Out and positively with Control of Anger Out. Copyright © 2004 John Wiley & Sons, Ltd.
|
13
|
Nobre JS, Singer JM, Batista MJ. Improved U-tests for variance components in one-way random effects models. Braz J Probab Stat 2020. [DOI: 10.1214/19-bjps436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
14
|
Multiple QTL Mapping in Autopolyploids: A Random-Effect Model Approach with Application in a Hexaploid Sweetpotato Full-Sib Population. Genetics 2020; 215:579-595. [PMID: 32371382 PMCID: PMC7337090 DOI: 10.1534/genetics.120.303080] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 12/30/2019] [Accepted: 04/26/2020] [Indexed: 11/18/2022] [Open Access]
Abstract
In developing countries, the sweetpotato, Ipomoea batatas (L.) Lam. [Formula: see text], is an important autopolyploid species, both socially and economically. However, quantitative trait loci (QTL) mapping has remained limited due to its genetic complexity. Current fixed-effect models can fit only a single QTL and are generally hard to interpret. Here, we report the use of a random-effect model approach to map multiple QTL based on score statistics in a sweetpotato biparental population ('Beauregard' × 'Tanzania') with 315 full-sibs. Phenotypic data were collected for eight yield component traits in six environments in Peru, and jointly adjusted means were obtained using mixed-effect models. An integrated linkage map consisting of 30,684 markers distributed along 15 linkage groups (LGs) was used to obtain the genotype conditional probabilities of putative QTL at every centiMorgan position. Multiple interval mapping was performed using our R package QTLpoly and detected a total of 13 QTL, ranging from none to four QTL per trait, which explained up to 55% of the total variance. Some regions, such as those on LGs 3 and 15, were consistently detected among root number and yield traits, and provided a basis for candidate gene search. In addition, some QTL were found to affect commercial and noncommercial root traits distinctly. Further best linear unbiased predictions were decomposed into additive allele effects and were used to compute multiple QTL-based breeding values for selection. Together with quantitative genotyping and its appropriate usage in linkage analyses, this QTL mapping methodology will facilitate the use of genomic tools in sweetpotato breeding as well as in other autopolyploids.
|
15
|
Nordgren R, Hedeker D, Dunton G, Yang C. Extending the mixed‐effects model to consider within‐subject variance for Ecological Momentary Assessment data. Stat Med 2019; 39:577-590. [PMID: 31846119 DOI: 10.1002/sim.8429] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/05/2018] [Revised: 09/17/2019] [Accepted: 09/30/2019] [Indexed: 11/06/2022]
Abstract
Ecological Momentary Assessment data present some new modeling opportunities. Typically, there are sufficient data to explicitly model the within-subject (WS) variance, and in many applications, it is of interest to allow the WS variance to depend on covariates as well as random subject effects. We describe a model that allows multiple random effects per subject in the mean model (e.g., random location intercept and slopes), as well as random scale in the error variance model. We present an example of the use of this model on a real dataset and a simulation study that shows the benefit of this model, relative to simpler approaches.
Affiliation(s)
- Rachel Nordgren
  - Division of Epidemiology and Biostatistics, School of Public Health, University of Illinois at Chicago, Chicago, Illinois
- Donald Hedeker
  - Department of Public Health Sciences, University of Chicago, Chicago, Illinois
- Genevieve Dunton
  - Department of Psychology, University of Southern California, Los Angeles, California
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California
- Chih‐Hsiang Yang
  - Department of Preventive Medicine, University of Southern California, Los Angeles, California
|
16
|
Ventrucci M, Cocchi D, Burgazzi G, Laini A. PC priors for residual correlation parameters in one-factor mixed models. Stat Method Appl-Ger 2019. [DOI: 10.1007/s10260-019-00501-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
|
17
|
Schweizer K, Troche S. The EV Scaling Method for Variances of Latent Variables. Methodology-European Journal of Research Methods for the Behavioral and Social Sciences 2019. [DOI: 10.1027/1614-2241/a000179] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022] [Open Access]
Abstract
The paper describes EV scaling for variances of latent variables included in confirmatory factor models. EV-scaled variances can be achieved in two ways: the estimation of variance parameters based on adjusted factor loadings and alternatively the summation of squared factor loadings obtained under the condition that the variance parameter is set equal to one. By definition, the second procedure yields values that are always positive. EV-scaled variances of latent variables show sizes similar to eigenvalues. The outcome of applying this scaling method is demonstrated in empirical data. The results of a simulation study reveal that the outcomes of the two ways virtually always correspond if the data are generated to include the contribution of a latent source. If there is no such source, the exclusion of solutions with negative error variances virtually always leads to correspondence.
Affiliation(s)
- Karl Schweizer
  - Department of Psychology, Goethe University, Frankfurt a.M., Germany
- Stefan Troche
  - Department of Psychology, University of Bern, Switzerland
|
18
|
|
19
|
Schweizer K, Troche SJ, DiStefano C. Scaling the Variance of a Latent Variable While Assuring Constancy of the Model. Front Psychol 2019; 10:887. [PMID: 31068871 PMCID: PMC6491693 DOI: 10.3389/fpsyg.2019.00887] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/04/2018] [Accepted: 04/03/2019] [Indexed: 12/02/2022] [Open Access]
Abstract
This paper investigates how the major outcome of a confirmatory factor investigation is preserved when scaling the variance of a latent variable by the various scaling methods. A constancy framework, based upon the underlying factor analysis formula that enables scaling by modifying components through scalar multiplication, is described; a proof is included to demonstrate the constancy property of the framework. It provides the basis for a scaling method that enables the comparison of the contribution of different latent variables of the same confirmatory factor model to observed scores, as for example, the contributions of trait and method latent variables. Furthermore, it is shown that available scaling methods are in line with this constancy framework and that the criterion number included in some scaling methods enables modifications. Both the impact of the number of manifest variables on the scaled variance parameter and the range of possible values can be modified. This enables the adaptation of scaling methods to the requirements of the field of application.
Affiliation(s)
- Karl Schweizer
  - Institute of Psychology, Goethe University Frankfurt, Frankfurt, Germany
- Stefan J Troche
  - Department of Psychology, University of Bern, Bern, Switzerland
- Christine DiStefano
  - Department of Educational Studies, University of South Carolina, Columbia, SC, United States
|
20
|
Hui FKC, Müller S, Welsh AH. Testing random effects in linear mixed models: another look at the F‐test (with discussion). Aust NZ J Stat 2019. [DOI: 10.1111/anzs.12256] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
|
21
|
Drikvandi R, Noorian S. Testing random effects in linear mixed-effects models with serially correlated errors. Biom J 2019; 61:802-812. [PMID: 30721539 DOI: 10.1002/bimj.201700203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/21/2017] [Revised: 10/28/2018] [Accepted: 12/19/2018] [Indexed: 11/07/2022]
Abstract
In linear mixed-effects models, random effects are used to capture the heterogeneity and variability between individuals due to unmeasured covariates or unknown biological differences. Testing for the need of random effects is a nonstandard problem because it requires testing on the boundary of parameter space, where the asymptotic chi-squared distribution of the classical tests such as likelihood ratio and score tests is incorrect. In the literature, several tests have been proposed to overcome this difficulty; however, all of these tests rely on the restrictive assumption of i.i.d. measurement errors. The presence of correlated errors, which often happens in practice, makes testing random effects much more difficult. In this paper, we propose a permutation test for random effects in the presence of serially correlated errors. The proposed test not only avoids issues with the boundary of parameter space, but also can be used for testing multiple random effects and any subset of them. Our permutation procedure includes the permutation procedure in Drikvandi, Verbeke, Khodadadi, and Partovi Nia (2013) as a special case when errors are i.i.d., though the test statistics are different. We use simulations and a real data analysis to evaluate the performance of the proposed permutation test. We have found that random slopes for linear and quadratic time effects may not be significant when measurement errors are serially correlated.
Affiliation(s)
- Reza Drikvandi
- Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, UK; Statistics Section, Department of Mathematics, Imperial College London, London, UK
- Sajad Noorian
- Department of Statistics, Faculty of Science, University of Qom, Qom, Iran
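The boundary problem mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' permutation test. It only contrasts the naive chi-square(1) p-value with the classical 50:50 mixture of a point mass at zero and a chi-square(1), which is the usual asymptotic reference for testing a single variance component under i.i.d. errors. The function names and the value of `lrt_stat` are hypothetical.

```python
import math

def chi2_1_sf(x: float) -> float:
    """Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))

def naive_pvalue(lrt: float) -> float:
    """P-value that ignores the boundary and uses chi-square(1) directly."""
    return chi2_1_sf(lrt)

def mixture_pvalue(lrt: float) -> float:
    """P-value under the 50:50 mixture of a point mass at 0 and chi-square(1).

    Assumption: one variance component is tested and errors are i.i.d.
    """
    if lrt <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(lrt)

# Hypothetical likelihood ratio statistic, for illustration only.
lrt_stat = 2.71
print("naive p-value:  ", naive_pvalue(lrt_stat))
print("mixture p-value:", mixture_pvalue(lrt_stat))
```

For any positive statistic the mixture p-value is half the naive one, so the naive test is conservative in this simple setting. With serially correlated errors, the case the paper addresses, neither reference distribution is guaranteed to hold.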
|
22
|
On testing the hidden heterogeneity in negative binomial regression models. METRIKA 2018. [DOI: 10.1007/s00184-018-0684-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
23
|
Luyts M, Molenberghs G, Verbeke G, Matthijs K, Ribeiro Jr EE, Demétrio CGB, Hinde J. A Weibull-count approach for handling under- and overdispersed longitudinal/clustered data structures. STAT MODEL 2018. [DOI: 10.1177/1471082x18789992] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
A Weibull-model-based approach is examined to handle under- and overdispersed discrete data in a hierarchical framework. This methodology was first introduced by Nakagawa and Osaki (1975, IEEE Transactions on Reliability, 24, 300–301), and later examined for under- and overdispersion by Klakattawi et al. (2018, Entropy, 20, 142) in the univariate case. Extensions to hierarchical approaches with under- and overdispersion were left unnoted, even though they can be obtained in a simple manner. This is of particular interest when analysing clustered/longitudinal data structures, where the underlying correlation structure is often more complex compared to cross-sectional studies. In this article, a random-effects extension of the Weibull-count model is proposed and applied to two motivating case studies, originating from the clinical and sociological research fields. A goodness-of-fit evaluation of the model is provided through a comparison of some well-known count models, that is, the negative binomial, Conway–Maxwell–Poisson and double Poisson models. Empirical results show that the proposed extension flexibly fits the data, more specifically, for heavy-tailed, zero-inflated, overdispersed and correlated count data. Discrete left-skewed time-to-event data structures are also flexibly modelled using the approach, with the ability to derive direct interpretations on the median scale, provided the complementary log–log link is used. Finally, a large simulated set of data is created to examine other characteristics such as computational ease and orthogonality properties of the model, with the conclusion that the approach behaves best for highly overdispersed cases.
Affiliation(s)
- Martial Luyts
- Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KU Leuven and Universiteit Hasselt, Leuven, Belgium
- Geert Molenberghs
- Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KU Leuven and Universiteit Hasselt, Leuven, Belgium
- Geert Verbeke
- Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KU Leuven and Universiteit Hasselt, Leuven, Belgium
- Koen Matthijs
- Family and Population Studies, KU Leuven, Leuven, Belgium
- John Hinde
- School of Mathematics, Statistics and Applied Mathematics, NUI Galway, Galway, Ireland
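As a rough illustration of the count distribution behind the entry above, the sketch below evaluates the discrete Weibull probability mass function in the Nakagawa and Osaki form, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., and a numerically truncated variance-to-mean ratio. It covers only the univariate distribution, not the random-effects extension proposed in the paper. The function names and the truncation point `y_max` are assumptions made for this example.

```python
def discrete_weibull_pmf(y: int, q: float, beta: float) -> float:
    """P(Y = y) = q**(y**beta) - q**((y + 1)**beta), with 0 < q < 1 and beta > 0."""
    if y < 0:
        return 0.0
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def dispersion_index(q: float, beta: float, y_max: int = 2000) -> float:
    """Variance-to-mean ratio from a truncated sum over 0..y_max.

    Assumption: y_max is large enough for the tail to be negligible,
    which may fail for very small beta (heavy tails).
    """
    mean = 0.0
    second_moment = 0.0
    for y in range(y_max + 1):
        p = discrete_weibull_pmf(y, q, beta)
        mean += y * p
        second_moment += y * y * p
    return (second_moment - mean ** 2) / mean

# A ratio above 1 indicates overdispersion relative to the Poisson,
# a ratio below 1 indicates underdispersion.
for beta in (0.7, 1.0, 3.0):
    print(beta, dispersion_index(0.5, beta))
```

With beta = 1 the distribution reduces to the geometric distribution, which gives a convenient sanity check on the implementation.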
|
24
|
Todem D, Hsu WW, Fine JP. A quasi-score statistic for homogeneity testing against covariate-varying heterogeneity. Scand Stat Theory Appl 2017; 45:465-481. [PMID: 30275656 DOI: 10.1111/sjos.12308] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/26/2022]
Abstract
In statistical modeling, it is often of interest to evaluate non-negative quantities that capture heterogeneity in the population such as variances, mixing proportions and dispersion parameters. In instances of covariate-dependent heterogeneity, the implied homogeneity hypotheses are nonstandard and existing inferential techniques are not applicable. In this paper, we develop a quasi-score test statistic to evaluate homogeneity against heterogeneity that varies with a covariate profile through a regression model. We establish the limiting null distribution of the proposed test as a functional of mixtures of chi-square processes. The methodology does not require the full distribution of the data to be entirely specified. Instead, a general estimating function for a finite dimensional component of the model that is of interest is assumed but other characteristics of the population are left completely unspecified. We apply the methodology to evaluate the excess zero proportion in zero-inflated models for count data. Our numerical simulations show that the proposed test can greatly improve efficiency over tests of homogeneity that neglect covariate information under the alternative hypothesis. An empirical application to dental caries indices demonstrates the importance and practical utility of the methodology in detecting excess zeros in the data.
Affiliation(s)
- David Todem
- Michigan State University, East Lansing, USA
|
25
|
Sun H. Testing for autocorrelation and random-effects in nonlinear mixed effects models based on M-estimation. COMMUN STAT-SIMUL C 2017. [DOI: 10.1080/03610918.2016.1230211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Huihui Sun
- School of Mathematics and Statistics, Yancheng Teachers University, Yancheng, China
|
26
|
Cai B, Bandyopadhyay D. Bayesian semiparametric variable selection with applications to periodontal data. Stat Med 2017; 36:2251-2264. [PMID: 28226392 DOI: 10.1002/sim.7255] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/29/2016] [Revised: 11/29/2016] [Accepted: 01/19/2017] [Indexed: 11/11/2022]
Abstract
A normality assumption is typically adopted for the random effects in a clustered or longitudinal data analysis using a linear mixed model. However, such an assumption is not always realistic, and it may lead to potential biases of the estimates, especially when variable selection is taken into account. Furthermore, flexibility of nonparametric assumptions (e.g., Dirichlet process) on these random effects may potentially cause centering problems, leading to difficulty of interpretation of fixed effects and variable selection. Motivated by these problems, we proposed a Bayesian method for fixed and random effects selection in nonparametric random effects models. We modeled the regression coefficients via centered latent variables which are distributed as probit stick-breaking scale mixtures. By using the mixture priors for centered latent variables along with covariance decomposition, we could avoid the aforementioned problems and allow efficient selection of fixed and random effects from the model. We demonstrated the advantages of our proposed approach over other competing alternatives through a simulated example and also via an illustrative application to a data set from a periodontal disease study. Copyright © 2017 John Wiley & Sons, Ltd.
Affiliation(s)
- Bo Cai
- Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, U.S.A.
|
27
|
Ha ID, Christian NJ, Jeong JH, Park J, Lee Y. Analysis of clustered competing risks data using subdistribution hazard models with multivariate frailties. Stat Methods Med Res 2016; 25:2488-2505. [PMID: 24619110 PMCID: PMC5771528 DOI: 10.1177/0962280214526193] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 11/16/2022]
Abstract
Competing risks data often exist within a center in multi-center randomized clinical trials where the treatment effects or baseline risks may vary among centers. In this paper, we propose a subdistribution hazard regression model with multivariate frailty to investigate heterogeneity in treatment effects among centers from multi-center clinical trials. For inference, we develop a hierarchical likelihood (or h-likelihood) method, which obviates the need for an intractable integration over the frailty terms. We show that the profile likelihood function derived from the h-likelihood is identical to the partial likelihood, and hence it can be extended to the weighted partial likelihood for the subdistribution hazard frailty models. The proposed method is illustrated with a dataset from a multi-center clinical trial on breast cancer as well as with a simulation study. We also demonstrate how to present heterogeneity in treatment effects among centers by using a confidence interval for the frailty for each individual center and how to perform a statistical test for such heterogeneity using a restricted h-likelihood.
Affiliation(s)
- Il Do Ha
- Department of Asset Management, Daegu Haany University, Gyeongsan, South Korea
- Jong-Hyeon Jeong
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA
- Junwoo Park
- Department of Statistics, Seoul National University, Seoul, South Korea
- Youngjo Lee
- Department of Statistics, Seoul National University, Seoul, South Korea
|
28
|
Savalli C, Paula GA, Cysneiros FJA. Assessment of variance components in elliptical linear mixed models. STAT MODEL 2016. [DOI: 10.1191/1471082x06st104oa] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 11/05/2022]
Abstract
The aim of this article is to discuss the problem of testing variance components in elliptical linear mixed models. The elliptical class includes all symmetrical continuous distributions, such as normal, Student-t, Pearson VII, power exponential, logistic I, logistic II, and so on. An approach for elliptical linear mixed models is proposed. The estimation procedure for the fixed parameters, variance components and random effects is also presented and a score-type test for one-sided alternatives to assess the variance components is investigated. Finally, two illustrative examples are given in which normal linear mixed models are compared with elliptical linear mixed models with heavy-tailed error distributions.
Affiliation(s)
- Carine Savalli
- Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
- Gilberto A Paula
- Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
|
29
|
Bellio R, Varin C. A pairwise likelihood approach to generalized linear models with crossed random effects. STAT MODEL 2016. [DOI: 10.1191/1471082x05st095oa] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 11/05/2022]
Abstract
Inference in generalized linear models with crossed effects is often made cumbersome by the high-dimensional intractable integrals involved in the likelihood function. We propose an inferential strategy based on the pairwise likelihood, which only requires the computation of bivariate distributions. The benefits of our approach are the simplicity of implementation and the potential to handle large data sets. The estimators based on the pairwise likelihood are generally consistent and asymptotically normally distributed. The pairwise likelihood makes it possible to improve on standard inferential procedures by means of bootstrap methods. The performance of the proposed methodology is illustrated by simulations and application to the well-known salamander mating data set.
|
30
|
Oliveira IRC, Molenberghs G, Verbeke G, Demétrio CGB, Dias CTS. Negative variance components for non-negative hierarchical data with correlation, over-, and/or underdispersion. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1191624] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
|
31
|
Hsu WW, Todem D, Kim K. A sup-score test for the cure fraction in mixture models for long-term survivors. Biometrics 2016; 72:1348-1357. [PMID: 27078815 DOI: 10.1111/biom.12514] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/01/2015] [Revised: 01/01/2016] [Accepted: 02/01/2016] [Indexed: 01/05/2023]
Abstract
The evaluation of cure fractions in oncology research under the well known cure rate model has attracted considerable attention in the literature, but most of the existing testing procedures have relied on restrictive assumptions. A common assumption has been to restrict the cure fraction to a constant under alternatives to homogeneity, thereby neglecting any information from covariates. This article extends the literature by developing a score-based statistic that incorporates covariate information to detect cure fractions, with the existing testing procedure serving as a special case. A complication of this extension, however, is that the implied hypotheses are not typical and standard regularity conditions to conduct the test may not even hold. Using empirical processes arguments, we construct a sup-score test statistic for cure fractions and establish its limiting null distribution as a functional of mixtures of chi-square processes. In practice, we suggest a simple resampling procedure to approximate this limiting distribution. Our simulation results show that the proposed test can greatly improve efficiency over tests that neglect the heterogeneity of the cure fraction under the alternative. The practical utility of the methodology is illustrated using ovarian cancer survival data with long-term follow-up from the surveillance, epidemiology, and end results registry.
Affiliation(s)
- Wei-Wen Hsu
- Department of Statistics, Kansas State University, 101 Dickens Hall, Manhattan, Kansas 66506, U.S.A.
- David Todem
- Department of Epidemiology and Biostatistics, Michigan State University, B601 West Fee Hall, East Lansing, Michigan 48824, U.S.A.
- KyungMann Kim
- Departments of Biostatistics & Medical Informatics and Statistics, University of Wisconsin-Madison, 600 Highland Ave., Madison, Wisconsin 53792, U.S.A.
|
32
|
Rauschenberger A, Jonker MA, van de Wiel MA, Menezes RX. Testing for association between RNA-Seq and high-dimensional data. BMC Bioinformatics 2016; 17:118. [PMID: 26951498 PMCID: PMC4782413 DOI: 10.1186/s12859-016-0961-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/12/2015] [Accepted: 02/18/2016] [Indexed: 11/13/2022]
Abstract
Background: Testing for association between RNA-Seq and other genomic data is challenging due to high variability of the former and high dimensionality of the latter. Results: Using the negative binomial distribution and a random-effects model, we develop an omnibus test that overcomes both difficulties. It may be conceptualised as a test of overall significance in regression analysis, where the response variable is overdispersed and the number of explanatory variables exceeds the sample size. Conclusions: The proposed test can detect genetic and epigenetic alterations that affect gene expression. It can examine complex regulatory mechanisms of gene expression. The R package globalSeq is available from Bioconductor. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0961-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Armin Rauschenberger
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, 1007 MB, The Netherlands.
- Marianne A Jonker
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, 1007 MB, The Netherlands.
- Mark A van de Wiel
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, 1007 MB, The Netherlands; Department of Mathematics, VU University, Amsterdam, 1081 HV, The Netherlands.
- Renée X Menezes
- Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, 1007 MB, The Netherlands.
|
33
|
Ivanova A, Molenberghs G, Verbeke G. Mixed models approaches for joint modeling of different types of responses. J Biopharm Stat 2015; 26:601-18. [PMID: 26098411 DOI: 10.1080/10543406.2015.1052487] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 10/23/2022]
Abstract
In many biomedical studies, one jointly collects longitudinal continuous, binary, and survival outcomes, possibly with some observations missing. Random-effects models, sometimes called shared-parameter models or frailty models, received a lot of attention. In such models, the corresponding variance components can be employed to capture the association between the various sequences. In some cases, random effects are considered common to various sequences, perhaps up to a scaling factor; in others, there are different but correlated random effects. Even though a variety of data types has been considered in the literature, less attention has been devoted to ordinal data. For univariate longitudinal or hierarchical data, the proportional odds mixed model (POMM) is an instance of the generalized linear mixed model (GLMM; Breslow and Clayton, 1993). Ordinal data are conveniently replaced by a parsimonious set of dummies, which in the longitudinal setting leads to a repeated set of dummies. When ordinal longitudinal data are part of a joint model, the complexity increases further. This is the setting considered in this paper. We formulate a random-effects based model that, in addition, allows for overdispersion. Using two case studies, it is shown that the combination of random effects to capture association with further correction for overdispersion can improve the model's fit considerably and that the resulting models allow to answer research questions that could not be addressed otherwise. Parameters can be estimated in a fairly straightforward way, using the SAS procedure NLMIXED.
Affiliation(s)
- Anna Ivanova
- Leuven Statistics Research Centre, KU Leuven, Leuven, Belgium; I-BioStat, KU Leuven, Leuven, Belgium
- Geert Molenberghs
- I-BioStat, KU Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
- Geert Verbeke
- I-BioStat, KU Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
|
34
|
Applying Linear Mixed Effects Models with Crossed Random Effects to Psycholinguistic Data: Multilevel Specification and Model Selection. The Quantitative Methods for Psychology 2015. [DOI: 10.20982/tqmp.11.2.p078] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
|
35
|
Ganjgahi H, Winkler AM, Glahn DC, Blangero J, Kochunov P, Nichols TE. Fast and powerful heritability inference for family-based neuroimaging studies. Neuroimage 2015; 115:256-68. [PMID: 25812717 PMCID: PMC4463976 DOI: 10.1016/j.neuroimage.2015.03.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 12/19/2014] [Accepted: 03/03/2015] [Indexed: 11/29/2022]
Abstract
Heritability estimation has become an important tool for imaging genetics studies. The large number of voxel- and vertex-wise measurements in imaging genetics studies presents a challenge both in terms of computational intensity and the need to account for elevated false positive risk because of the multiple testing problem. There is a gap in existing tools, as standard neuroimaging software cannot estimate heritability, and yet standard quantitative genetics tools cannot provide essential neuroimaging inferences, like family-wise error corrected voxel-wise or cluster-wise P-values. Moreover, available heritability tools rely on P-values that can be inaccurate with usual parametric inference methods. In this work we develop fast estimation and inference procedures for voxel-wise heritability, drawing on recent methodological results that simplify heritability likelihood computations (Blangero et al., 2013). We review the family of score and Wald tests and propose novel inference methods based on explained sum of squares of an auxiliary linear model. To address problems with inaccuracies with the standard results used to find P-values, we propose four different permutation schemes to allow semi-parametric inference (parametric likelihood-based estimation, non-parametric sampling distribution). In total, we evaluate 5 different significance tests for heritability, with either asymptotic parametric or permutation-based P-value computations. We identify a number of tests that are both computationally efficient and powerful, making them ideal candidates for heritability studies in the massive data setting. We illustrate our method on fractional anisotropy measures in 859 subjects from the Genetics of Brain Structure study.
Affiliation(s)
- Habib Ganjgahi
- Department of Statistics, The University of Warwick, Coventry, UK
- Anderson M Winkler
- Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK; Department of Psychiatry, Yale University School of Medicine, New Haven, USA
- David C Glahn
- Department of Psychiatry, Yale University School of Medicine, New Haven, USA; Olin Neuropsychiatry Research Center, Institute of Living, Hartford Hospital, Hartford, CT, USA
- John Blangero
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA
- Peter Kochunov
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Thomas E Nichols
- Department of Statistics, The University of Warwick, Coventry, UK; Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK; WMG, The University of Warwick, Coventry, UK.
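The permutation-based P-values mentioned in the abstract above follow a general Monte Carlo pattern that can be sketched as follows. This is a generic illustration, not one of the four permutation schemes proposed in the paper. It assumes observations are freely exchangeable under the null, which does not hold for related family members without further restrictions. The names `permutation_pvalue` and `abs_covariance` and the toy data are hypothetical.

```python
import random
from typing import Callable, Sequence

def permutation_pvalue(
    x: Sequence[float],
    y: Sequence[float],
    statistic: Callable[[Sequence[float], Sequence[float]], float],
    n_perm: int = 999,
    seed: int = 0,
) -> float:
    """Monte Carlo permutation p-value for a statistic where larger is more extreme.

    Assumption: the entries of y are exchangeable under the null hypothesis.
    """
    rng = random.Random(seed)
    observed = statistic(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any association between x and y
        if statistic(x, y_perm) >= observed:
            exceed += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return (exceed + 1) / (n_perm + 1)

def abs_covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute sample covariance, used here as a simple association statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))) / n

# Toy data with an exact linear relationship, for illustration only.
x = [float(i) for i in range(20)]
y = [2.0 * v + 1.0 for v in x]
print(permutation_pvalue(x, y, abs_covariance))
```

The smallest attainable p-value is 1 / (n_perm + 1), so the number of permutations bounds the resolution of the test.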
|
36
|
Swihart BJ, Goldsmith J, Crainiceanu CM. Restricted Likelihood Ratio Tests for Functional Effects in the Functional Linear Model. Technometrics 2014. [DOI: 10.1080/00401706.2013.863163] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 10/25/2022]
|
37
|
Mccarthy JM, Shea PR, Goldstein DB, Allen AS. Testing for risk and protective trends in genetic analyses of HIV acquisition. Biostatistics 2014; 16:268-80. [PMID: 25270736 DOI: 10.1093/biostatistics/kxu044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
Host genetics studies of HIV-1 acquisition are critically important for the identification of new targets for drug and vaccine development. Analyses of such studies typically focus on pairwise comparisons of three different groups: HIV-1 positive individuals, HIV-1 high-risk seronegative individuals, and population controls. Because there is a clear expectation of how gene frequencies of risk or protective alleles would be ordered in the three groups, we are able to construct a statistical framework that offers a consistent increase in power over a wide-range of the magnitude of risk/protective effects. In this paper, we develop tests that constrain the alternative hypothesis to appropriately reflect risk or protective trends jointly across the three groups and show that they lead to a substantial increase in power over the naive pairwise approach. We develop both likelihood-ratio and score statistics that test for genetic effects across the three groups while constraining the alternative hypothesis to reflect biologically motivated trends of risk or protection. The asymptotic distribution of both statistics (likelihood ratio and score) is derived. We investigate the performance of our approach via extensive simulation studies using a biologically motivated model of HIV-1 acquisition, and find that our proposed approach leads to an increase in power of roughly 10-28%. We illustrate our approach with an analysis of the effect of the CCR5Δ32 mutation on HIV acquisition.
Affiliation(s)
- Janice M Mccarthy
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
- Patrick R Shea
- Center for Human Genome Variation, Duke University, Durham, NC 27710, USA
- David B Goldstein
- Center for Human Genome Variation, Duke University, Durham, NC 27710, USA
- Andrew S Allen
- Department of Biostatistics and Bioinformatics and Center for Human Genome Variation, Duke University, Durham, NC 27710, USA
|
38
|
Abstract
Non-Gaussian outcomes are frequently modelled using members of the exponential family. In particular, the Bernoulli model for binary data and the Poisson model for count data are well-known. Two reasons for extending this family are (1) the occurrence of overdispersion, implying that the variability in the data is not adequately described by the models, and (2) the incorporation of hierarchical structure in the data. These issues are routinely addressed separately, the first one through overdispersion models, the second one, for example, by means of random effects within the generalized linear mixed models framework. Molenberghs et al. (2007, 2010) introduced a so-called ‘combined model’ that simultaneously addresses both. In these and subsequent papers, a lot of attention was given to binary outcomes, counts, and time-to-event responses. While common in practice, ordinal data have not been studied from this angle. In this article, a model for ordinal repeated measures, subject to overdispersion, is formulated. It can be fitted without difficulty using standard statistical software. The model is exemplified using data from an epidemiological study in diabetic patients and using data from a clinical trial in psychiatric patients.
Affiliation(s)
- Anna Ivanova
- KU Leuven – University of Leuven, LStat, Leuven, Belgium
- KU Leuven – University of Leuven, I-BioStat, Leuven, Belgium
- Geert Molenberghs
- KU Leuven – University of Leuven, I-BioStat, Leuven, Belgium
- Hasselt University, I-BioStat, Leuven, Belgium
- Geert Verbeke
- KU Leuven – University of Leuven, I-BioStat, Leuven, Belgium
- Hasselt University, I-BioStat, Leuven, Belgium
|
39
|
Lippert C, Xiang J, Horta D, Widmer C, Kadie C, Heckerman D, Listgarten J. Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics 2014; 30:3206-14. [PMID: 25075117 PMCID: PMC4221116 DOI: 10.1093/bioinformatics/btu504] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022]
Abstract
Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test (a score test) with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus we develop new computationally efficient methods. Results: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene–gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. Availability: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. Contact: heckerma@microsoft.com. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christoph Lippert
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jing Xiang
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Danilo Horta
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Christian Widmer
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Carl Kadie
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- David Heckerman
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jennifer Listgarten
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
|
40
|
Etxeberria J, Ugarte MD, Goicoa T, Militino AF. Age- and sex-specific spatio-temporal patterns of colorectal cancer mortality in Spain (1975-2008). Popul Health Metr 2014; 12:17. [PMID: 25136264 PMCID: PMC4131489 DOI: 10.1186/1478-7954-12-17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/26/2013] [Accepted: 06/25/2014] [Indexed: 01/04/2023]
Abstract
In this paper, space-time patterns of colorectal cancer (CRC) mortality risks are studied by sex and age group (50-69, ≥70) in Spanish provinces during the period 1975-2008. Space-time conditional autoregressive models are used to perform the statistical analyses. A pronounced increase in mortality risk has been observed in males for both age-groups. For males between 50 and 69 years of age, trends seem to stabilize from 2001 onward. In females, trends reflect a more stable pattern during the period in both age groups. However, for the 50-69 years group, risks take an upward trend in the period 2006-2008 after the slight decline observed in the second half of the period. This study offers interesting information regarding CRC mortality distribution among different Spanish provinces that could be used to improve prevention policies and resource allocation in different regions.
Affiliation(s)
- Jaione Etxeberria
- Department of Statistics and O. R., Public University of Navarre, Campus de Arrosadia, Pamplona, Navarre, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- María Dolores Ugarte
- Department of Statistics and O. R., Public University of Navarre, Campus de Arrosadia, Pamplona, Navarre, Spain
- Tomás Goicoa
- Department of Statistics and O. R., Public University of Navarre, Campus de Arrosadia, Pamplona, Navarre, Spain
- Research Network on Health Services in Chronic Diseases (REDISSEC), Pamplona, Spain
- Ana F Militino
- Department of Statistics and O. R., Public University of Navarre, Campus de Arrosadia, Pamplona, Navarre, Spain
|
41
|
Janmohamed SR, Oranje AP, Devillers AC, Rizopoulos D, van Praag MC, Van Gysel D, Goeteyn M, de Waard-van der Spek FB. The proactive wet-wrap method with diluted corticosteroids versus emollients in children with atopic dermatitis: A prospective, randomized, double-blind, placebo-controlled trial. J Am Acad Dermatol 2014; 70:1076-82. [PMID: 24698702 DOI: 10.1016/j.jaad.2014.01.898] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 09/03/2013] [Revised: 01/28/2014] [Accepted: 01/30/2014] [Indexed: 11/29/2022]
|
42
|
Zeng P, Zhao Y, Zhang L, Huang S, Chen F. Rare variants detection with kernel machine learning based on likelihood ratio test. PLoS One 2014; 9:e93355. [PMID: 24675868 PMCID: PMC3968153 DOI: 10.1371/journal.pone.0093355] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/16/2013] [Accepted: 03/03/2014] [Indexed: 11/18/2022] Open
Abstract
This paper uses likelihood-based tests to detect rare variants associated with a continuous phenotype under the framework of kernel machine learning. Both the likelihood ratio test (LRT) and the restricted likelihood ratio test (ReLRT) are investigated, and the relationship between kernel machine learning and the mixed effects model is discussed. Using the eigenvalue representation of the LRT and ReLRT, their exact finite-sample distributions are obtained by simulation. Numerical studies evaluate the performance of the proposed approaches in the contexts of the standard mixed effects model and kernel machine learning. The results show that the LRT and ReLRT control the type I error correctly at the given α level. The LRT and ReLRT consistently outperform the SKAT, regardless of the sample size and the proportion of negative causal rare variants, and suffer smaller power reductions than the SKAT when both positive and negative effects of rare variants are present. The LRT and ReLRT performed in the context of kernel machine learning have slightly higher power than those performed in the context of the standard mixed effects model. The Genetic Analysis Workshop 17 exome sequencing SNP data are used as an illustrative example, and the results of that analysis are discussed.
Affiliation(s)
- Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, China
- Yang Zhao
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Liwei Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Shuiping Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical College, Xuzhou, Jiangsu, China
- Feng Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
|
43
|
Cho SJ, De Boeck P, Embretson S, Rabe-Hesketh S. Additive multilevel item structure models with random residuals: item modeling for explanation and item generation. PSYCHOMETRIKA 2014; 79:84-104. [PMID: 24337937 DOI: 10.1007/s11336-013-9360-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/10/2011] [Indexed: 06/03/2023]
Abstract
An additive multilevel item structure (AMIS) model with random residuals is proposed. The model includes multilevel latent regressions of item discrimination and item difficulty parameters on covariates at both item and item category levels with random residuals at both levels. The AMIS model is useful for explanation purposes and also for prediction purposes as in an item generation context. The parameters can be estimated with an alternating imputation posterior algorithm that makes use of adaptive quadrature, and the performance of this algorithm is evaluated in a simulation study.
|
44
|
Codd CL, Cudeck R. Nonlinear random-effects mixture models for repeated measures. PSYCHOMETRIKA 2014; 79:60-83. [PMID: 24337936 DOI: 10.1007/s11336-013-9358-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/05/2012] [Indexed: 06/03/2023]
Abstract
A mixture model for repeated measures based on nonlinear functions with random effects is reviewed. The model can include individual schedules of measurement, data missing at random, nonlinear functions of the random effects, of covariates and of residuals. Individual group membership probabilities and individual random effects are obtained as empirical Bayes predictions. Although this is a complicated model that combines a mixture of populations, nonlinear regression, and hierarchical models, it is straightforward to estimate by maximum likelihood using SAS PROC NLMIXED. Many different models can be studied with this procedure. The model is more general than those that can be estimated with most special purpose computer programs currently available because the response function is essentially any form of nonlinear regression. Examples and sample code are included to illustrate the method.
Affiliation(s)
- Casey L Codd
- Psychology Department, Ohio State University, 240D Lazenby Hall, Columbus, OH, 43210, USA
|
45
|
Qu L, Guennel T, Marshall SL. Linear score tests for variance components in linear mixed models and applications to genetic association studies. Biometrics 2013; 69:883-92. [PMID: 24328714 DOI: 10.1111/biom.12095] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 09/01/2012] [Revised: 06/01/2013] [Accepted: 07/01/2013] [Indexed: 01/16/2023]
Abstract
Following the rapid development of genome-scale genotyping technologies, genetic association mapping has become a popular tool to detect genomic regions responsible for certain (disease) phenotypes, especially in early-phase pharmacogenomic studies with limited sample size. In response to such applications, a good association test needs to be (1) applicable to a wide range of possible genetic models, including, but not limited to, the presence of gene-by-environment or gene-by-gene interactions and non-linearity of a group of marker effects, (2) accurate in small samples, fast to compute on the genomic scale, and amenable to large scale multiple testing corrections, and (3) reasonably powerful to locate causal genomic regions. The kernel machine method represented in linear mixed models provides a viable solution by transforming the problem into testing the nullity of variance components. In this study, we consider score-based tests by choosing a statistic linear in the score function. When the model under the null hypothesis has only one error variance parameter, our test is exact in finite samples. When the null model has more than one variance parameter, we develop a new moment-based approximation that performs well in simulations. Through simulations and analysis of real data, we demonstrate that the new test possesses most of the aforementioned characteristics, especially when compared to existing quadratic score tests or restricted likelihood ratio tests.
Affiliation(s)
- Long Qu
- Department of Mathematics and Statistics, Wright State University, Dayton, Ohio 45435, U.S.A
|
46
|
|
47
|
Ritz C. Penalized likelihood ratio tests for repeated measurement models. TEST-SPAIN 2013. [DOI: 10.1007/s11749-013-0324-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
48
|
Sample Size Considerations for Hierarchical Populations. EFSA J 2013. [DOI: 10.2903/j.efsa.2013.3292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|
49
|
Wang WL. Multivariate t linear mixed models for irregularly observed multiple repeated measures with missing outcomes. Biom J 2013; 55:554-71. [PMID: 23740830 DOI: 10.1002/bimj.201200001] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 01/02/2012] [Revised: 03/27/2013] [Accepted: 03/30/2013] [Indexed: 11/08/2022]
Abstract
Missing outcomes or irregularly timed multivariate longitudinal data frequently occur in clinical trials or biomedical studies. The multivariate t linear mixed model (MtLMM) has been shown to be a robust approach to modeling multioutcome continuous repeated measures in the presence of outliers or heavy-tailed noise. This paper presents a framework for fitting the MtLMM with an arbitrary missing data pattern embodied within multiple outcome variables recorded at irregular occasions. To address the serial correlation among the within-subject errors, a damped exponential correlation structure is considered in the model. Under the missing at random mechanism, an efficient alternating expectation-conditional maximization (AECM) algorithm is used to carry out estimation of parameters and imputation of missing values. The techniques for the estimation of random effects and the prediction of future responses are also investigated. Applications to an HIV-AIDS study and a pregnancy study involving analysis of multivariate longitudinal data with missing outcomes, as well as a simulation study, highlight the superiority of MtLMMs in providing more adequate estimation, imputation, and prediction performance.
Affiliation(s)
- Wan-Lun Wang
- Department of Statistics, Graduate Institute of Statistics and Actuarial Science, Feng Chia University, Taichung 40724, Taiwan.
|
50
|
Yang M. Bayesian nonparametric centered random effects models with variable selection. Biom J 2013; 55:217-30. [PMID: 23322356 DOI: 10.1002/bimj.201100149] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/06/2011] [Revised: 11/19/2012] [Accepted: 11/30/2012] [Indexed: 11/08/2022]
Abstract
In a linear mixed effects model, it is common practice to assume that the random effects follow a parametric distribution such as a normal distribution with mean zero. However, in the case of variable selection, substantial violation of the normality assumption can potentially impact the subset selection and result in poor interpretation and even incorrect results. In nonparametric random effects models, the random effects generally have a nonzero mean, which causes an identifiability problem for the fixed effects that are paired with the random effects. In this article, we focus on a Bayesian method for variable selection. We characterize the subject-specific random effects nonparametrically with a Dirichlet process and resolve the bias simultaneously. In particular, we propose flexible modeling of the conditional distribution of the random effects with changes across the predictor space. The approach is implemented using a stochastic search Gibbs sampler to identify subsets of fixed effects and random effects to be included in the model. Simulations are provided to evaluate and compare the performance of our approach with that of existing ones. We then apply the new approach to a real data example, a cross-country and interlaboratory rodent uterotrophic bioassay.
Affiliation(s)
- Mingan Yang
- Department of Mathematics, Central Michigan University, Mt. Pleasant, MI 48859, USA.
|