1
|
Ren B, Lipsitz SR, Fitzmaurice GM, Weiss RD. Permutation Tests for Assessing Potential Non-Linear Associations between Treatment Use and Multivariate Clinical Outcomes. MULTIVARIATE BEHAVIORAL RESEARCH 2024; 59:110-122. [PMID: 37379399 PMCID: PMC10753035 DOI: 10.1080/00273171.2023.2217662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2023]
Abstract
In many psychometric applications, the relationship between the mean of an outcome and a quantitative covariate is too complex to be described by simple parametric functions; instead, flexible nonlinear relationships can be incorporated using penalized splines. Penalized splines can be conveniently represented as a linear mixed effects model (LMM), where the coefficients of the spline basis functions are random effects. The LMM representation of penalized splines makes the extension to multivariate outcomes relatively straightforward. In the LMM, no effect of the quantitative covariate on the outcome corresponds to the null hypothesis that a fixed effect and a variance component are both zero. Under the null, the usual asymptotic chi-square distribution of the likelihood ratio test for the variance component does not hold. Therefore, we propose three permutation tests for the likelihood ratio test statistic: one based on permuting the quantitative covariate, the other two based on permuting residuals. We compare via simulation the Type I error rate and power of the three permutation tests obtained from joint models for multiple outcomes, as well as a commonly used parametric test. The tests are illustrated using data from a stimulant use disorder psychosocial clinical trial.
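As a rough illustration of the first of these tests (permuting the quantitative covariate), the sketch below computes a permutation p-value for a likelihood ratio statistic. It is a simplification under stated assumptions, not the authors' procedure: a single outcome, an unpenalized truncated-line spline fitted by ordinary least squares in place of the penalized-spline LMM, and hypothetical helper names (`spline_design`, `lrt_stat`, `permutation_pvalue`).

```python
import numpy as np

def spline_design(x, n_knots=5):
    """Intercept, linear term, and truncated-line basis at interior quantile knots."""
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, trunc])

def lrt_stat(x, y):
    """Gaussian LRT of 'no effect of x' against an unpenalized spline fit.

    Assumption: ordinary least squares stands in for the penalized-spline LMM.
    """
    n = len(y)
    rss0 = np.sum((y - y.mean()) ** 2)            # null model: intercept only
    X = spline_design(x)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # alternative: spline in x
    rss1 = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss0 / rss1)

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permute the covariate to build the null distribution of the LRT statistic."""
    rng = np.random.default_rng(seed)
    observed = lrt_stat(x, y)
    perm = np.array([lrt_stat(rng.permutation(x), y) for _ in range(n_perm)])
    # Count the observed statistic as one permutation so p is never exactly 0.
    return (1 + np.sum(perm >= observed)) / (n_perm + 1)
```

Calling `permutation_pvalue(x, y)` on two equal-length float arrays returns a value in (0, 1]. Permuting `x` keeps its marginal distribution (and therefore the knots) fixed while breaking any link to `y`, which is the exchangeability that the null hypothesis of no covariate effect implies in this simplified setting.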
Affiliation(s)
- Boyu Ren
- McLean Hospital, Belmont, MA, USA
|
2
|
El-Horbaty YS. A note on covariance decomposition in linear models with nested-error structure: new and alternative derivations of the F-test. JOURNAL OF STATISTICAL THEORY AND PRACTICE 2022. [DOI: 10.1007/s42519-022-00291-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2022]
|
3
|
Wu Q, Vos P. Permutation confidence region for multiple regression and fidelity to asymptotic approximation. COMMUN STAT-THEOR M 2022. [DOI: 10.1080/03610926.2022.2076119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Qiang Wu
- Department of Biostatistics, East Carolina University, Greenville, North Carolina, USA
- Paul Vos
- Department of Biostatistics, East Carolina University, Greenville, North Carolina, USA
|
4
|
Bosch-Bayard J, Razzaq FA, Lopez-Naranjo C, Wang Y, Li M, Galan-Garcia L, Calzada-Reyes A, Virues-Alba T, Rabinowitz AG, Suarez-Murias C, Guo Y, Sanchez-Castillo M, Rogers K, Gallagher A, Prichep L, Anderson SG, Michel CM, Evans AC, Bringas-Vega ML, Galler JR, Valdes-Sosa PA. Early protein energy malnutrition impacts life-long developmental trajectories of the sources of EEG rhythmic activity. Neuroimage 2022; 254:119144. [PMID: 35342003 DOI: 10.1016/j.neuroimage.2022.119144] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/19/2022] [Revised: 03/20/2022] [Accepted: 03/23/2022] [Indexed: 02/07/2023] Open
Abstract
Protein Energy Malnutrition (PEM) has lifelong consequences for brain development and cognitive function. We studied the lifelong developmental trajectories of resting-state EEG source activity in 66 individuals with histories of PEM limited to the first year of life and in 83 matched classmate controls (CON), all of whom are participants in the 49-year longitudinal Barbados Nutrition Study (BNS). qEEGt source z-spectra measured deviation from normative values of EEG rhythmic activity sources at 5-11 years of age and 40 years later at 45-51 years of age. The PEM group showed qEEGt abnormalities in childhood, including a developmental delay in alpha rhythm maturation and an insufficient decrease in beta activity. These profiles may be correlated with accelerated cognitive decline.
Affiliation(s)
- Jorge Bosch-Bayard
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China; McGill Center for Integrative Neuroscience Center MCIN. Ludmer Center for Mental Health. Montreal Neurological Institute, McGill University, Montreal, Canada
- Fuleah Abdul Razzaq
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China.
- Carlos Lopez-Naranjo
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Ying Wang
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Min Li
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Arielle G Rabinowitz
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
- Yanbo Guo
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China
- Kassandra Rogers
- LION Lab, Sainte-Justine University Hospital Research Centre, University of Montreal, Montreal, QC, Canada
- Anne Gallagher
- LION Lab, Sainte-Justine University Hospital Research Centre, University of Montreal, Montreal, QC, Canada
- Simon G Anderson
- Caribbean Institute for Health Research, University of the West Indies, Barbados
- Alan C Evans
- McGill Center for Integrative Neuroscience Center MCIN. Ludmer Center for Mental Health. Montreal Neurological Institute, McGill University, Montreal, Canada
- Maria L Bringas-Vega
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China; Cuban Neuroscience Center, La Habana, Cuba
- Janina R Galler
- Division of Pediatric Gastroenterology and Nutrition, Mucosal Immunology and Biology Research Center, Mass General Hospital for Children, Boston, MA, USA
- Pedro A Valdes-Sosa
- The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, China; McGill Center for Integrative Neuroscience Center MCIN. Ludmer Center for Mental Health. Montreal Neurological Institute, McGill University, Montreal, Canada; Cuban Neuroscience Center, La Habana, Cuba.
|
5
|
El-Horbaty YS. Testing the absence of random effects in the nested-error regression model using orthogonal transformations. COMMUN STAT-SIMUL C 2019. [DOI: 10.1080/03610918.2019.1700278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
- Yahia S. El-Horbaty
- Department of Mathematics, Insurance and Applied Statistics, Helwan University, Greater Cairo, Egypt
|
6
|
Baey C, Cournède PH, Kuhn E. Asymptotic distribution of likelihood ratio test statistics for variance components in nonlinear mixed effects models. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2019.01.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
|
7
|
Wu Q, Vos P. Permutation inference distribution for linear regression and related models. J Nonparametr Stat 2019. [DOI: 10.1080/10485252.2019.1632306] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
Affiliation(s)
- Qiang Wu
- Department of Biostatistics, East Carolina University, Greenville, NC, USA
- Paul Vos
- Department of Biostatistics, East Carolina University, Greenville, NC, USA
|
8
|
Hui FKC, Müller S, Welsh AH. Testing random effects in linear mixed models: another look at the F‐test (with discussion). AUST NZ J STAT 2019. [DOI: 10.1111/anzs.12256] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
|
9
|
Schweiger R, Fisher E, Weissbrod O, Rahmani E, Müller-Nurasyid M, Kunze S, Gieger C, Waldenberger M, Rosset S, Halperin E. Detecting heritable phenotypes without a model using fast permutation testing for heritability and set-tests. Nat Commun 2018; 9:4919. [PMID: 30464216 PMCID: PMC6249264 DOI: 10.1038/s41467-018-07276-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/07/2017] [Accepted: 10/26/2018] [Indexed: 01/08/2023] Open
Abstract
Testing for association between a set of genetic markers and a phenotype is a fundamental task in genetic studies. Standard approaches for heritability and set testing strongly rely on parametric models that make specific assumptions regarding phenotypic variability. Here, we show that resulting p-values may be inflated by up to 15 orders of magnitude, in a heritability study of methylation measurements, and in a heritability and expression quantitative trait loci analysis of gene expression profiles. We propose FEATHER, a method for fast permutation-based testing of marker sets and of heritability, which properly controls for false-positive results. FEATHER eliminated 47% of methylation sites found to be heritable by the parametric test, suggesting a substantial inflation of false-positive findings by alternative methods. Our approach can rapidly identify heritable phenotypes out of millions of phenotypes acquired via high-throughput technologies, does not suffer from model misspecification and is highly efficient. Standard approaches for heritability and set testing in statistical genetics rely on parametric models that might not hold in reality and give inflated p-values. Here, the authors develop a fast method for permutation-based testing of marker sets and of heritability that does not suffer from model misspecification.
Affiliation(s)
- Regev Schweiger
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 6997801, Israel.
- Eyal Fisher
- School of Mathematical Sciences, Department of Statistics, Tel Aviv University, Tel Aviv, 69978, Israel
- Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA
- Elior Rahmani
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 6997801, Israel
- Martina Müller-Nurasyid
- Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, 85764, Germany; Department of Medicine I, Ludwig-Maximilians-Universität, Munich, 80539, Germany; DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, 80636, Germany
- Sonja Kunze
- Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, 85764, Neuherberg, Germany; Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Christian Gieger
- Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, 85764, Neuherberg, Germany; Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Melanie Waldenberger
- DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, 80636, Germany; Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, 85764, Neuherberg, Germany; Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Saharon Rosset
- School of Mathematical Sciences, Department of Statistics, Tel Aviv University, Tel Aviv, 69978, Israel
- Eran Halperin
- Los Angeles, University of California Los Angeles, Los Angeles, 90095, CA, USA; Department of Anesthesiology and Perioperative Medicine, University of California, Los Angeles, 90095, CA, USA
|
10
|
Mestdagh M, Verdonck S, Duisters K, Tuerlinckx F. Fingerprint resampling: A generic method for efficient resampling. Sci Rep 2015; 5:16970. [PMID: 26597870 PMCID: PMC4657057 DOI: 10.1038/srep16970] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/07/2015] [Accepted: 10/22/2015] [Indexed: 11/09/2022] Open
Abstract
In resampling methods, such as bootstrapping or cross validation, a very similar computational problem (usually an optimization procedure) is solved over and over again for a set of very similar data sets. If it is computationally burdensome to solve this computational problem once, the whole resampling method can become unfeasible. However, because the computational problems and data sets are so similar, the speed of the resampling method may be increased by taking advantage of these similarities in method and data. As a generic solution, we propose to learn the relation between the resampled data sets and their corresponding optima. Using this learned knowledge, we are then able to predict the optima associated with new resampled data sets. First, these predicted optima are used as starting values for the optimization process. Once the predictions become accurate enough, the optimization process may even be omitted completely, thereby greatly decreasing the computational burden. The suggested method is validated using two simple problems (where the results can be verified analytically) and two real-life problems (i.e., the bootstrap of a mixed model and a generalized extreme value distribution). The proposed method led on average to a tenfold increase in speed of the resampling method.
Affiliation(s)
- Stijn Verdonck
- University of Leuven, Oude Markt 13, 3000 Leuven, Belgium
- Kevin Duisters
- University of Leuven, Oude Markt 13, 3000 Leuven, Belgium; Leiden University, Rapenburg 70, 2311 EZ Leiden, Netherlands
|
11
|
Zeng P, Zhao Y, Li H, Wang T, Chen F. Permutation-based variance component test in generalized linear mixed model with application to multilocus genetic association study. BMC Med Res Methodol 2015; 15:37. [PMID: 25897803 PMCID: PMC4410500 DOI: 10.1186/s12874-015-0030-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 12/04/2014] [Accepted: 04/07/2015] [Indexed: 11/29/2022] Open
Abstract
Background: In many medical studies the likelihood ratio test (LRT) has been widely applied to examine whether the random effects variance component is zero within the mixed effects models framework; whereas little work about likelihood-ratio based variance component test has been done in the generalized linear mixed models (GLMM), where the response is discrete and the log-likelihood cannot be computed exactly. Before applying the LRT for variance component in GLMM, several difficulties need to be overcome, including the computation of the log-likelihood, the parameter estimation and the derivation of the null distribution for the LRT statistic. Methods: To overcome these problems, in this paper we make use of the penalized quasi-likelihood algorithm and calculate the LRT statistic based on the resulting working response and the quasi-likelihood. The permutation procedure is used to obtain the null distribution of the LRT statistic. We evaluate the permutation-based LRT via simulations and compare it with the score-based variance component test and the tests based on the mixture of chi-square distributions. Finally we apply the permutation-based LRT to multilocus association analysis in the case–control study, where the problem can be investigated under the framework of logistic mixed effects model. Results: The simulations show that the permutation-based LRT can effectively control the type I error rate, while the score test is sometimes slightly conservative and the tests based on mixtures cannot maintain the type I error rate. Our studies also show that the permutation-based LRT has higher power than these existing tests and still maintains a reasonably high power even when the random effects do not follow a normal distribution. The application to GAW17 data also demonstrates that the proposed LRT has a higher probability to identify the association signals than the score test and the tests based on mixtures. Conclusions: In the present paper the permutation-based LRT was developed for variance component in GLMM. The LRT outperforms existing tests and has a reasonably higher power under various scenarios; additionally, it is conceptually simple and easy to implement.
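The idea of obtaining a null distribution by permutation can be sketched in a much simpler setting than the one in this abstract. The example below assumes a Gaussian outcome with a single grouping factor, and it swaps in the one-way ANOVA F statistic for the PQL-based LRT statistic; the function names are hypothetical. Permuting the group labels breaks any association between cluster membership and outcome, which is what a zero variance component implies.

```python
import numpy as np

def between_group_f(y, groups):
    """One-way ANOVA F statistic; large values suggest a nonzero group variance."""
    labels, idx = np.unique(groups, return_inverse=True)
    k, n = len(labels), len(y)
    counts = np.bincount(idx)
    means = np.bincount(idx, weights=y) / counts
    ss_between = np.sum(counts * (means - y.mean()) ** 2)
    ss_within = np.sum((y - means[idx]) ** 2)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_null(y, groups, n_perm=999, seed=0):
    """Return the observed statistic, its permutation null sample, and a p-value.

    Assumption: F replaces the LRT statistic used in the abstract.
    """
    rng = np.random.default_rng(seed)
    observed = between_group_f(y, groups)
    null = np.array([
        between_group_f(y, rng.permutation(groups)) for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, null, p_value
```

The returned `null` array can be inspected directly (for example, its upper quantiles) instead of relying on a chi-square mixture approximation.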
Affiliation(s)
- Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China; Department of Epidemiology and Biostatistics, Center of Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical College, Xuzhou, 221004, Jiangsu, People's Republic of China.
- Yang Zhao
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China.
- Hongliang Li
- Center for Disease Control and Prevention of Pudong New Area, Pudong New Area, Shanghai, 200136, People's Republic of China.
- Ting Wang
- Department of Epidemiology and Biostatistics, Center of Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical College, Xuzhou, 221004, Jiangsu, People's Republic of China.
- Feng Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, 211166, Jiangsu, People's Republic of China.
|
12
|
Ganjgahi H, Winkler AM, Glahn DC, Blangero J, Kochunov P, Nichols TE. Fast and powerful heritability inference for family-based neuroimaging studies. Neuroimage 2015; 115:256-68. [PMID: 25812717 PMCID: PMC4463976 DOI: 10.1016/j.neuroimage.2015.03.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 12/19/2014] [Accepted: 03/03/2015] [Indexed: 11/29/2022] Open
Abstract
Heritability estimation has become an important tool for imaging genetics studies. The large number of voxel- and vertex-wise measurements in imaging genetics studies presents a challenge both in terms of computational intensity and the need to account for elevated false positive risk because of the multiple testing problem. There is a gap in existing tools, as standard neuroimaging software cannot estimate heritability, and yet standard quantitative genetics tools cannot provide essential neuroimaging inferences, like family-wise error corrected voxel-wise or cluster-wise P-values. Moreover, available heritability tools rely on P-values that can be inaccurate with usual parametric inference methods. In this work we develop fast estimation and inference procedures for voxel-wise heritability, drawing on recent methodological results that simplify heritability likelihood computations (Blangero et al., 2013). We review the family of score and Wald tests and propose novel inference methods based on explained sum of squares of an auxiliary linear model. To address problems with inaccuracies with the standard results used to find P-values, we propose four different permutation schemes to allow semi-parametric inference (parametric likelihood-based estimation, non-parametric sampling distribution). In total, we evaluate 5 different significance tests for heritability, with either asymptotic parametric or permutation-based P-value computations. We identify a number of tests that are both computationally efficient and powerful, making them ideal candidates for heritability studies in the massive data setting. We illustrate our method on fractional anisotropy measures in 859 subjects from the Genetics of Brain Structure study.
Affiliation(s)
- Habib Ganjgahi
- Department of Statistics, The University of Warwick, Coventry, UK
- Anderson M Winkler
- Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK; Department of Psychiatry, Yale University School of Medicine, New Haven, USA
- David C Glahn
- Department of Psychiatry, Yale University School of Medicine, New Haven, USA; Olin Neuropsychiatry Research Center, Institute of Living, Hartford Hospital, Hartford, CT, USA
- John Blangero
- Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA
- Peter Kochunov
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Thomas E Nichols
- Department of Statistics, The University of Warwick, Coventry, UK; Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK; WMG, The University of Warwick, Coventry, UK.
|
13
|
Loredo-Osti JC. A cautionary note on ignoring polygenic background when mapping quantitative trait loci via recombinant congenic strains. Front Genet 2014; 5:68. [PMID: 24765102 PMCID: PMC3980105 DOI: 10.3389/fgene.2014.00068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/29/2013] [Accepted: 03/17/2014] [Indexed: 11/13/2022] Open
Abstract
In gene mapping, it is common to test for association between the phenotype and the genotype at a large number of loci, i.e., the same response variable is used repeatedly to test a large number of non-independent and non-nested hypotheses. In many of these genetic problems, the underlying model is a mixed model consisting of one or very few major genes together with a genetic background effect, usually thought of as polygenic in nature and, consequently, modeled through a random effects term with a well-defined covariance structure that depends on the kinship between individuals. Either because the interest lies only in the major genes or to simplify the analysis, it is common to drop the random effects term and use a simple linear regression model, sometimes complemented with testing via resampling in an attempt to minimize the consequences of this practice. Here, it is shown that dropping the random effects term not only has extreme negative effects on the control of the type I error rate, but is also unlikely to be fixed by resampling because, whenever the mixed model is correct, this practice does not allow some basic requirements of resampling in a gene mapping context to be met. Furthermore, simulations show that the type I error rates when the random term is ignored can be unacceptably high. As an alternative, this paper introduces a new bootstrap procedure to handle the specific case of mapping by using recombinant congenic strains under a linear mixed model. A simulation study showed that the type I error rates of the proposed procedure are very close to the nominal ones, although they tend to be slightly inflated for larger values of the random effects variance. Overall, this paper illustrates the extent of the adverse consequences of ignoring the random effects term due to polygenic factors while testing for genetic linkage and warns of potential modeling issues whenever simple linear regression for a major gene yields multiple significant linkage peaks.
|