1
|
Zaloumis SG, Scurrah KJ, Harrap SB, Ellis JA, Gurrin LC. Non-proportional odds multivariate logistic regression of ordinal family data. Biom J 2014; 57:286-303. [PMID: 25287055 DOI: 10.1002/bimj.201300137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/15/2013] [Revised: 02/05/2014] [Accepted: 04/18/2014] [Indexed: 11/10/2022]
Abstract
Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study.
Affiliation(s)
- Sophie G Zaloumis
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, 207 Bouverie Street, Carlton, Victoria 3010, Australia
|
2
|
Li Q, Schwender H, Louis TA, Fallin MD, Ruczinski I. Efficient simulation of epistatic interactions in case-parent trios. Hum Hered 2013; 75:12-22. [PMID: 23548797 DOI: 10.1159/000348789] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/24/2012] [Accepted: 02/11/2013] [Indexed: 12/26/2022] Open
Abstract
Statistical approaches to evaluate interactions between single nucleotide polymorphisms (SNPs) and SNP-environment interactions are of great importance in genetic association studies, as susceptibility to complex disease might be related to the interaction of multiple SNPs and/or environmental factors. With these methods under active development, algorithms to simulate genomic data sets are needed to ensure proper type I error control of newly proposed methods and to compare power with existing methods. In this paper we propose an efficient method for a haplotype-based simulation of case-parent trios when the disease risk is thought to depend on possibly higher-order epistatic interactions or gene-environment interactions with binary exposures.
Affiliation(s)
- Qing Li
- Statistical Genetics Section, National Human Genome Research Institute, National Institutes of Health, Baltimore, MD, USA
|
3
|
Schwender H, Bowers K, Fallin MD, Ruczinski I. Importance measures for epistatic interactions in case-parent trios. Ann Hum Genet 2010; 75:122-32. [PMID: 21118192 DOI: 10.1111/j.1469-1809.2010.00623.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022]
Abstract
Ensemble methods (such as Bagging and Random Forests) take advantage of unstable base learners (such as decision trees) to improve predictions, and offer measures of variable importance useful for variable selection. LogicFS has been proposed as such an ensemble learner for case-control studies when interactions of single nucleotide polymorphisms (SNPs) are of particular interest. LogicFS uses bootstrap samples of the data and employs the Boolean trees derived in logic regression as base learners to create ensembles of models that allow for the quantification of the contributions of epistatic interactions to the disease risk. In this article, we propose an extension of logicFS suitable for case-parent trio data, and derive an additional importance measure that is much less influenced by linkage disequilibrium between SNPs than the measure originally used in logicFS. We illustrate the performance of the novel procedure in simulation studies and in a case study of 461 case-parent trios with autistic children.
Affiliation(s)
- Holger Schwender
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21218, USA
|
4
|
Diao G, Lin DY. Variance-components methods for linkage and association analysis of ordinal traits in general pedigrees. Genet Epidemiol 2010; 34:232-7. [PMID: 19918762 PMCID: PMC3003595 DOI: 10.1002/gepi.20453] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
Many complex human diseases such as alcoholism and cancer are rated on ordinal scales. Well-developed statistical methods for the genetic mapping of quantitative traits may not be appropriate for ordinal traits. We propose a class of variance-component models for the joint linkage and association analysis of ordinal traits. The proposed models accommodate arbitrary pedigrees and allow covariates and gene-environment interactions. We develop efficient likelihood-based inference procedures under the proposed models. The maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. An application to data from the Collaborative Study on the Genetics of Alcoholism is provided.
Affiliation(s)
- G Diao
- Department of Statistics, George Mason University, MS 4A7, 4400 University Drive, Fairfax, VA 22030-4444, USA.
|
5
|
Edwards TL, Turner SD, Torstenson ES, Dudek SM, Martin ER, Ritchie MD. A general framework for formal tests of interaction after exhaustive search methods with applications to MDR and MDR-PDT. PLoS One 2010; 5:e9363. [PMID: 20186329 PMCID: PMC2826406 DOI: 10.1371/journal.pone.0009363] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 06/15/2009] [Accepted: 12/16/2009] [Indexed: 02/03/2023] Open
Abstract
The initial presentation of multifactor dimensionality reduction (MDR) featured cross-validation to mitigate over-fitting, computationally efficient searches of the epistatic model space, and variable construction with constructive induction to alleviate the curse of dimensionality. However, the method was unable to differentiate association signals arising from true interactions from those due to independent main effects at individual loci. This issue leads to problems in inference and interpretability for the results from MDR and its family-based complement, the MDR-pedigree disequilibrium test (PDT). A suggestion from previous work was to fit regression models post hoc to specifically evaluate the null hypothesis of no interaction for MDR or MDR-PDT models. We demonstrate with simulation that fitting a regression model on the same data as that analyzed by MDR or MDR-PDT is not a valid test of interaction. This is likely to be true for any other procedure that searches for models and then performs an uncorrected test for interaction. We also show with simulation that, when strong main effects are present and the null hypothesis of no interaction is true, MDR and MDR-PDT reject at far greater than the nominal rate. We also provide a valid regression-based permutation test procedure that specifically tests the null hypothesis of no interaction and does not reject the null when only main effects are present. The regression-based permutation test implemented here conducts a valid test of interaction after a search for multilocus models, and can be applied to any method that conducts a search to find a multilocus model representing an interaction.
Affiliation(s)
- Todd L. Edwards
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Center for Genetic Epidemiology and Statistical Genetics, John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Stephen D. Turner
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Eric S. Torstenson
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Scott M. Dudek
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Eden R. Martin
- Center for Genetic Epidemiology and Statistical Genetics, John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Marylyn D. Ritchie
- Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
|
6
|
Naylor MG, Weiss ST, Lange C. Recommendations for using standardised phenotypes in genetic association studies. Hum Genomics 2009; 3:308-19. [PMID: 19706362 PMCID: PMC3525193 DOI: 10.1186/1479-7364-3-4-308] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
Genetic association studies of complex traits often rely on standardised quantitative phenotypes, such as percentage of predicted forced expiratory volume and body mass index, to measure an underlying trait of interest (e.g. lung function, obesity). These phenotypes are appealing because they provide an easy mechanism for comparing subjects, although such standardisations may not be the best way to control for confounders and other covariates. We recommend adjusting raw or standardised phenotypes within the study population via regression. We illustrate through simulation that optimal power in both population- and family-based association tests is attained by using the residuals from within-study adjustment as the complex trait phenotype. An application of family-based association analysis of forced expiratory volume in one second and obesity in the Childhood Asthma Management Program data illustrates that power is maintained or increased when adjusted phenotype residuals are used instead of typical standardised quantitative phenotypes.
Affiliation(s)
- Melissa G Naylor
- Department of Biostatistics, Harvard University, Boston, MA, USA.
|
7
|
Zeitz A, Spötter A, Blazyczek I, Diesterbeck U, Ohnesorge B, Deegen E, Distl O. Whole-genome scan for guttural pouch tympany in Arabian and German warmblood horses. Anim Genet 2009; 40:917-24. [DOI: 10.1111/j.1365-2052.2009.01942.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
8
|
Motsinger AA, Ritchie MD, Reif DM. Novel methods for detecting epistasis in pharmacogenomics studies. Pharmacogenomics 2007; 8:1229-41. [DOI: 10.2217/14622416.8.9.1229] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 01/17/2023] Open
Abstract
The importance of gene–gene and gene–environment interactions in the underlying genetic architecture of common, complex phenotypes is gaining wide recognition in the field of pharmacogenomics. In epidemiological approaches to mapping genetic variants that predict drug response, it is important that researchers investigate potential epistatic interactions. In the current review, we discuss data-mining tools available in genetic epidemiology to detect such interactions and appropriate applications. We survey several classes of novel methods available and present an organized collection of successful applications in the literature. Finally, we provide guidance as to how to incorporate these novel methods into a genetic analysis. The overall goal of this paper is to aid researchers in developing an analysis plan that accounts for gene–gene and gene–environment interactions in their own work.
Affiliation(s)
- Alison A Motsinger
- North Carolina State University, Bioinformatics Research Center, Department of Statistics, Raleigh, NC 27695, USA
- Marylyn D Ritchie
- Vanderbilt University, Center for Human Genetics Research, Department of Molecular Physiology and Biophysics, Nashville, TN 37232, USA
- David M Reif
- US Environmental Protection Agency, National Center for Computational Toxicology, MD 353-03, Research Triangle Park, NC 27709, USA
|