1
Bruni M, Flax JF, Buyske S, Shindhelm AD, Witton C, Brzustowicz LM, Bartlett CW. Behavioral and Molecular Genetics of Reading-Related AM and FM Detection Thresholds. Behav Genet 2017; 47:193-201. [PMID: 27826669 PMCID: PMC5305590 DOI: 10.1007/s10519-016-9821-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/15/2015] [Accepted: 09/28/2016] [Indexed: 12/24/2022]
Abstract
Auditory detection thresholds for certain frequencies of both amplitude modulated (AM) and frequency modulated (FM) dynamic auditory stimuli are associated with reading in typically developing and dyslexic readers. We present the first behavioral and molecular genetic characterization of these two auditory traits. Two extant extended family datasets were given reading tasks and psychoacoustic tasks to determine FM 2 Hz and AM 20 Hz sensitivity thresholds. Univariate heritabilities were significant for both AM (h² = 0.20) and FM (h² = 0.29). Bayesian posterior probability of linkage (PPL) analysis found loci for AM (12q, PPL = 81%) and FM (10p, PPL = 32%; 20q, PPL = 65%). Bivariate heritability analyses revealed that FM is genetically correlated with reading, while AM was not. Bivariate PPL analysis indicates that FM loci (10p, 20q) are not also associated with reading.
Affiliation(s)
- Matthew Bruni
  - The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Judy F Flax
  - Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Steven Buyske
  - Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
  - Department of Statistics, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Amber D Shindhelm
  - The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Caroline Witton
  - Aston Brain Centre, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK
- Linda M Brzustowicz
  - Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Christopher W Bartlett
  - Department of Pediatrics, College of Medicine, The Ohio State University, Columbus, OH, USA.
  - Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital & The Ohio State University, 575 Children's Crossroad, Columbus, OH, 43205, USA.
2
Roy S, Sarkar A, Das K. Analysis of bivariate binary data with possible chances of wrong ascertainment. J Stat Comput Sim 2014. [DOI: 10.1080/00949655.2012.722635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
3
Schildcrout JS, Garbett SP, Heagerty PJ. Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics 2013; 69:405-16. [PMID: 23409789 PMCID: PMC3880022 DOI: 10.1111/biom.12013] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 03/01/2012] [Revised: 11/01/2012] [Accepted: 11/01/2012] [Indexed: 11/30/2022]
Abstract
The analysis of longitudinal trajectories usually focuses on evaluation of explanatory factors that are either associated with rates of change, or with overall mean levels of a continuous outcome variable. In this article, we introduce valid design and analysis methods that permit outcome dependent sampling of longitudinal data for scenarios where all outcome data currently exist, but a targeted substudy is being planned in order to collect additional key exposure information on a limited number of subjects. We propose a stratified sampling based on specific summaries of individual longitudinal trajectories, and we detail an ascertainment corrected maximum likelihood approach for estimation using the resulting biased sample of subjects. In addition, we demonstrate that the efficiency of an outcome-based sampling design relative to use of a simple random sample depends highly on the choice of outcome summary statistic used to direct sampling, and we show a natural link between the goals of the longitudinal regression model and corresponding desirable designs. Using data from the Childhood Asthma Management Program, where genetic information required retrospective ascertainment, we study a range of designs that examine lung function profiles over 4 years of follow-up for children classified according to their genotype for the IL 13 cytokine.
Affiliation(s)
- Jonathan S Schildcrout
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, TN, USA.
4
Roy S, Das K, Sarkar A. Analysis of binary data with the possibility of wrong ascertainment. Stat Neerl 2013. [DOI: 10.1111/stan.12008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Surupa Roy
  - Department of Statistics, St. Xavier's College, Kolkata
- Kalyan Das
  - Department of Statistics, University of Calcutta, Kolkata
- Angshuman Sarkar
  - Department of Statistics, Siksha Bhavana, Visva-Bharati, Santiniketan
5
Lee Y, Noh M. Modelling random effect variance with double hierarchical generalized linear models. Stat Model 2012. [DOI: 10.1177/1471082x12460132] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
Abstract
Random-effect models are becoming increasingly popular in the analysis of data. Lee and Nelder (2006) introduced double hierarchical generalized linear models (DHGLMs) in which not only the mean but also the residual variance (overdispersion) can be further modelled as random-effect models. In this article, we introduce DHGLMs that allow random-effect models for both the variances of random effects and the residual variance. We show how to use this general model class for the analysis of data and discuss how to select the best fitting model using the likelihood and various model-checking plots.
Affiliation(s)
- Youngjo Lee
  - Department of Statistics, Seoul National University, Seoul 151–742, South Korea
- Maengseok Noh
  - Department of Statistics, Pukyong National University, Busan 608–737, South Korea
6
Schildcrout JS, Heagerty PJ. Outcome-dependent sampling from existing cohorts with longitudinal binary response data: study planning and analysis. Biometrics 2011; 67:1583-93. [PMID: 21457191 PMCID: PMC3134621 DOI: 10.1111/j.1541-0420.2011.01582.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
When novel scientific questions arise after longitudinal binary data have been collected, the subsequent selection of subjects from the cohort for whom further detailed assessment will be undertaken is often necessary to efficiently collect new information. Key examples of additional data collection include retrospective questionnaire data, novel data linkage, or evaluation of stored biological specimens. In such cases, all data required for the new analyses are available except for the new target predictor or exposure. We propose a class of longitudinal outcome-dependent sampling schemes and detail a design corrected conditional maximum likelihood analysis for highly efficient estimation of time-varying and time-invariant covariate coefficients when resource limitations prohibit exposure ascertainment on all participants. Additionally, we detail an important study planning phase that exploits available cohort data to proactively examine the feasibility of any proposed substudy as well as to inform decisions regarding the most desirable study design. The proposed designs and associated analyses are discussed in the context of a study that seeks to examine the modifying effect of an interleukin-10 cytokine single nucleotide polymorphism on asthma symptom regression in adolescents participating in the Childhood Asthma Management Program Continuation Study. Using this example we assume that all data necessary to conduct the study are available except subject-specific genotype data. We also assume that these data would be ascertained by analyzing stored blood samples, the cost of which limits the sample size.
Affiliation(s)
- Jonathan S. Schildcrout
  - Departments of Biostatistics and Anesthesiology, Vanderbilt University School of Medicine, 1161 21st Avenue South, S-2323 Medical Center North, Nashville, TN 37232, USA
- Patrick J. Heagerty
  - Department of Biostatistics, University of Washington, F-600 Health Sciences Building, Campus Mail Stop 357232, Seattle, WA, 98105-7232
7
Neuhaus JM, McCulloch CE. The effect of misspecification of random effects distributions in clustered data settings with outcome-dependent sampling. Can J Stat 2011. [PMID: 23204632 DOI: 10.1002/cjs.10117] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Genetic epidemiologists often gather outcome-dependent samples of family data to measure within-family associations of genetic factors with disease outcomes. Generalized linear mixed models provide effective methods to estimate within-family associations but typically require parametric specification of the random effects distribution. Although misspecification of the random effects distribution often leads to little bias in estimated regression coefficients in standard, prospective clustered data settings, some recent studies suggest that such misspecification will impact parameter estimates from outcome-dependent cluster sampling designs. Using analytic results, simulation studies and fits to example data, this study examines the effect of misspecification of random effects distributions on parameter estimates in clustered data settings with outcome-dependent sampling. We show that the effects are consistent with results from prospective cluster sampling settings. In particular, ascertainment corrected mixed model methods that assume normally distributed random intercepts and conditional likelihood approaches provide accurate estimates of within-family covariate effects even under a misspecified random effects distribution.
Affiliation(s)
- John M Neuhaus
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94143-0560, USA
8
Javaras KN, Hudson JI, Laird NM. Fitting ACE structural equation models to case-control family data. Genet Epidemiol 2010; 34:238-45. [PMID: 19918760 DOI: 10.1002/gepi.20454] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for members of families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contributions to the disease of additive genetic effects (A), shared family environment (C), and unique environment (E). We describe an ACE model for binary family data; this structural equation model, which has been described previously, combines a general-family extension of the classic ACE twin model with a (possibly covariate-specific) liability-threshold model for binary outcomes. We then introduce our contribution, a likelihood-based approach to fitting the model to singly ascertained case-control family data. The approach, which involves conditioning on the proband's disease status and also setting prevalence equal to a prespecified value that can be estimated from the data, makes it possible to obtain valid estimates of the A, C, and E variance components from case-control (rather than only from population-based) family data. In fact, simulation experiments suggest that our approach to fitting yields approximately unbiased estimates of the A, C, and E variance components, provided that certain commonly made assumptions hold. Further, when our approach is used to fit the ACE model to Austrian case-control family data on depression, the resulting estimate of heritability is very similar to those from previous analyses of twin data.
Affiliation(s)
- K N Javaras
  - Waisman Laboratory for Brain Imaging & Behavior, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA.
9
Yip BH, Reilly M, Cnattingius S, Pawitan Y. Matched ascertainment of informative families for complex genetic modelling. Behav Genet 2010; 40:404-14. [PMID: 20033275 PMCID: PMC2953624 DOI: 10.1007/s10519-009-9322-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/18/2008] [Accepted: 11/28/2009] [Indexed: 11/26/2022]
Abstract
Family data are used extensively in quantitative genetic studies to disentangle the genetic and environmental contributions to various diseases. Many family studies based their analysis on population-based registers containing a large number of individuals composed of small family units. For binary trait analyses, exact marginal likelihood is a common approach, but, due to the computational demand of the enormous data sets, it allows only a limited number of effects in the model. This makes it particularly difficult to perform joint estimation of variance components for a binary trait and the potential confounders. We have developed a data-reduction method of ascertaining informative families from population-based family registers. We propose a scheme where the ascertained families match the full cohort with respect to some relevant statistics, such as the risk to relatives of an affected individual. The ascertainment-adjusted analysis, which we implement using a pseudo-likelihood approach, is shown to be efficient relative to the analysis of the whole cohort and robust to mis-specification of the random effect distribution.
Affiliation(s)
- Benjamin H. Yip
  - Department of Psychiatry, University of Hong Kong, Hong Kong, China
- Marie Reilly
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Sven Cnattingius
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Yudi Pawitan
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
10
Epstein MP, Hunter JE, Allen EG, Sherman SL, Lin X, Boehnke M. A Variance-Component Framework for Pedigree Analysis of Continuous and Categorical Outcomes. Statistics in Biosciences 2009; 1:181-198. [PMID: 20436936 PMCID: PMC2860148 DOI: 10.1007/s12561-009-9010-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]
Abstract
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflations in type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.
Affiliation(s)
- Emily G. Allen
  - Department of Human Genetics, Emory University, Atlanta, GA
- Xihong Lin
  - Department of Biostatistics, Harvard University, Boston, MA
- Michael Boehnke
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
11
12
Bowden J, Thompson JR, Burton PR. A two-stage approach to the correction of ascertainment bias in complex genetic studies involving variance components. Ann Hum Genet 2007; 71:220-9. [PMID: 17354286 DOI: 10.1111/j.1469-1809.2006.00307.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Correction for ascertainment bias is a vital part of the analysis of genetic epidemiology studies that needs to be undertaken whenever subjects are not recruited at random. Adjustment often requires extensive numerical integration, which can be very slow or even computationally infeasible, especially if the model includes many fixed and random effects. In this paper we propose a two-stage method for ascertainment bias correction. In the first stage we estimate parameters that pertain to the ascertained population, that is the population that would be selected into the sample if the ascertainment criterion were applied to everyone. In the second stage we convert the estimates for the ascertained population into general population parameter estimates. We illustrate the method with simulations based on a simple model and then describe how the method can be used with complex models. The two-stage approach avoids some of the integration required in direct adjustment, hence speeding up the process of model fitting.
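As an illustration of why ascertainment correction matters, the following is a minimal sketch in a setting far simpler than the variance-components models this abstract addresses. It is not the authors' two-stage method. It assumes a plain binomial model in which each sib is affected independently with probability `p`, and complete ascertainment, meaning every sibship with at least one affected child enters the sample. The function names and the toy data are hypothetical. The corrected likelihood conditions on the ascertainment event by dividing each family's binomial probability by the probability of at least one affected sib.

```python
import math


def naive_estimate(families):
    """Pooled proportion affected, ignoring how families were sampled."""
    total_sibs = sum(n for n, _ in families)
    total_affected = sum(r for _, r in families)
    return total_affected / total_sibs


def corrected_loglik(p, families):
    """Log-likelihood conditional on each family having >= 1 affected sib."""
    ll = 0.0
    for n, r in families:
        # Binomial kernel (the binomial coefficient does not depend on p).
        ll += r * math.log(p) + (n - r) * math.log(1.0 - p)
        # Ascertainment correction: divide by P(at least one affected).
        ll -= math.log(1.0 - (1.0 - p) ** n)
    return ll


def corrected_estimate(families, lo=1e-6, hi=1.0 - 1e-6, iters=200):
    """Maximise the conditional log-likelihood by ternary search on p."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if corrected_loglik(m1, families) < corrected_loglik(m2, families):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


# Hypothetical data: ten two-child families, each (n_sibs, n_affected);
# eight have one affected sib and two have both sibs affected.
families = [(2, 1)] * 8 + [(2, 2)] * 2
print(naive_estimate(families))
print(corrected_estimate(families))
```

For these toy data the naive pooled proportion is 12/20 = 0.6, whereas the conditional likelihood is maximised at p = 1/3, so ignoring the sampling scheme would substantially overstate the risk. A one-dimensional search suffices here only because there is a single parameter and no random effects; the numerical integration burden described in the abstract arises in richer models.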
13
Lindström L, Pawitan Y, Reilly M, Hemminki K, Lichtenstein P, Czene K. Estimation of genetic and environmental factors for melanoma onset using population-based family data. Stat Med 2007; 25:3110-23. [PMID: 16372390 DOI: 10.1002/sim.2266] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
Estimation of genetic and environmental contributions to cancers falls in the framework of generalized linear mixed modelling with several random effect components. Computational challenges remain, however, in dealing with binary or survival phenotypes. In this paper, we consider the analysis of melanoma onset in a population of 2.6 million nuclear families in Sweden, for which none of the current survival-based methodologies is feasible. We treat the disease outcome as a binary phenotype, so that the standard proportional hazard model leads to a generalized linear model with the complementary-log link function. For rare diseases this link is very close to the probit link, and thus allows the use of marginal likelihood for the estimation of the variance components. We correct for the survival length bias by censoring the parent generation within each family at the time they attain the same cumulative hazard as the child generation, thus improving the validity of the estimates. Our finding that childhood shared environment in addition to genetic factors had a considerable effect on the development of melanoma is consistent with epidemiological studies.
Affiliation(s)
- L Lindström
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute P.O. Box 281, 17177 Stockholm, Sweden
14
Noh M, Yip B, Lee Y, Pawitan Y. Multicomponent variance estimation for binary traits in family-based studies. Genet Epidemiol 2006; 30:37-47. [PMID: 16265627 DOI: 10.1002/gepi.20099] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
In biometrical genetic analyses of binary traits, the use of family data overcomes some limitations of twin studies, particularly in terms of sample size and types of genetic or environmental factors that can be estimated. However, because of computational problems, recent methods in the application of generalized linear mixed models for family data structure have limited the ability to handle large data sets with general covariates. In this paper, we investigate the use of the hierarchical likelihood approach to the analysis of binary traits from family data. In a simulation study, the method is shown to be highly accurate for the estimation of both the variance components and fixed regression parameters, even for small family sizes. For illustration, we analyze a real data set of familial aggregation of preeclampsia, a pregnancy-induced hypertension. When possible, the analysis is compared with the exact maximum likelihood approach.
Affiliation(s)
- M Noh
  - Department of Statistics, Seoul National University, South Korea
15
Igo RP, Chapman NH, Berninger VW, Matsushita M, Brkanac Z, Rothstein JH, Holzman T, Nielsen K, Raskind WH, Wijsman EM. Genomewide scan for real-word reading subphenotypes of dyslexia: novel chromosome 13 locus and genetic complexity. Am J Med Genet B Neuropsychiatr Genet 2006; 141B:15-27. [PMID: 16331673 PMCID: PMC2556979 DOI: 10.1002/ajmg.b.30245] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Indexed: 12/12/2022]
Abstract
Dyslexia is a common learning disability exhibited as a delay in acquiring reading skills despite adequate intelligence and instruction. Reading single real words (real-word reading, RWR) is especially impaired in many dyslexics. We performed a genome scan, using variance components (VC) linkage analysis and Bayesian Markov chain Monte Carlo (MCMC) joint segregation and linkage analysis, for three quantitative measures of RWR in 108 multigenerational families, with follow up of the strongest signals with parametric LOD score analyses. We used single-word reading efficiency (SWE) to assess speed and accuracy of RWR, and word identification (WID) to assess accuracy alone. Adjusting SWE for WID provided a third measure of RWR efficiency. All three methods of analysis identified a strong linkage signal for SWE on chromosome 13q. Based on multipoint analysis with 13 markers we obtained a MCMC intensity ratio (IR) of 53.2 (chromosome-wide P < 0.004), a VC LOD score of 2.29, and a parametric LOD score of 2.94, based on a quantitative-trait model from MCMC segregation analysis (SA). A weaker signal for SWE on chromosome 2q occurred in the same location as a significant linkage peak seen previously in a scan for phonological decoding. MCMC oligogenic SA identified three models of transmission for WID, which could be assigned to two distinct linkage peaks on chromosomes 12 and 15. Taken together, these results indicate a locus for efficiency and accuracy of RWR on chromosome 13, and a complex model for inheritance of RWR accuracy with loci on chromosomes 12 and 15.
Affiliation(s)
- Robert P. Igo
  - Department of Medicine, University of Washington, Seattle, WA
  - Department of Biostatistics, University of Washington, Seattle, WA
- Mark Matsushita
  - Department of Medicine, University of Washington, Seattle, WA
- Zoran Brkanac
  - Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA
- Kathleen Nielsen
  - Department of Educational Psychology, University of Washington, Seattle, WA
- Wendy H. Raskind
  - Department of Medicine, University of Washington, Seattle, WA
  - Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA
- Ellen M. Wijsman
  - Department of Medicine, University of Washington, Seattle, WA
  - Department of Biostatistics, University of Washington, Seattle, WA
16
Raskind WH, Igo RP, Chapman NH, Berninger VW, Thomson JB, Matsushita M, Brkanac Z, Holzman T, Brown M, Wijsman EM. A genome scan in multigenerational families with dyslexia: Identification of a novel locus on chromosome 2q that contributes to phonological decoding efficiency. Mol Psychiatry 2005; 10:699-711. [PMID: 15753956 DOI: 10.1038/sj.mp.4001657] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.8] [Indexed: 12/13/2022]
Abstract
Dyslexia is a common and complex developmental disorder manifested by unexpected difficulty in learning to read. Multiple different measures are used for diagnosis, and may reflect different biological pathways related to the disorder. Impaired phonological decoding (translation of written words without meaning cues into spoken words) is thought to be a core deficit. We present a genome scan of two continuous measures of phonological decoding ability: phonemic decoding efficiency (PDE) and word attack (WA). PDE measures both accuracy and speed of phonological decoding, whereas WA measures accuracy alone. Multipoint variance component linkage analyses (VC) and Markov chain Monte-Carlo (MCMC) multipoint joint linkage and segregation analyses were performed on 108 families. A strong signal was observed on chromosome 2 for PDE using both VC (LOD=2.65) and MCMC methods (intensity ratio (IR)=32.1). The IR is an estimate of the ratio of the posterior to prior probability of linkage in MCMC analysis. The chromosome 2 signal was not seen for WA. More detailed mapping with additional markers provided statistically significant evidence for linkage of PDE to chromosome 2, with VC-LOD=3.0 and IR=59.6 at D2S1399. Parametric analyses of PDE, using a model obtained by complex segregation analysis, provided a multipoint maximum LOD=2.89. The consistency of results from three analytic approaches provides strong evidence for a locus on chromosome 2 that influences speed but not accuracy of phonological decoding.
Affiliation(s)
- W H Raskind
  - Department of Medicine, University of Washington, Seattle, WA, USA.
17
Hsu L, Chen L, Gorfine M, Malone K. Semiparametric estimation of marginal hazard function from case-control family studies. Biometrics 2005; 60:936-44. [PMID: 15606414 DOI: 10.1111/j.0006-341x.2004.00249.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
Abstract
Estimating marginal hazard function from the correlated failure time data arising from case-control family studies is complicated by noncohort study design and risk heterogeneity due to unmeasured, shared risk factors among the family members. Accounting for both factors in this article, we propose a two-stage estimation procedure. At the first stage, we estimate the dependence parameter in the distribution for the risk heterogeneity without obtaining the marginal distribution first or simultaneously. Assuming that the dependence parameter is known, at the second stage we estimate the marginal hazard function by iterating between estimation of the risk heterogeneity (frailty) for each family and maximization of the partial likelihood function with an offset to account for the risk heterogeneity. We also propose an iterative procedure to improve the efficiency of the dependence parameter estimate. The simulation study shows that both methods perform well under finite sample sizes. We illustrate the method with a case-control family study of early onset breast cancer.
Affiliation(s)
- Li Hsu
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109-1024, USA.
18
Iversen ES, Chen S. Population-Calibrated Gene Characterization: Estimating Age at Onset Distributions Associated With Cancer Genes. J Am Stat Assoc 2005; 100:399-409. [PMID: 18418465 DOI: 10.1198/016214505000000196] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
Phenotypic characterization of rare disease genes poses a significant statistical challenge, but the need to do so is clear. Clinical management of patients carrying a disease gene depends crucially on an accurate characterization of the genetically predisposed disease, including its likelihood of occurrence among mutation carriers, natural history, and response to treatment. We propose a formal yet practical method for controlling for bias due to ignoring ascertainment, defined as the sampling mechanism, when quantifying the association between genotype and disease using data on high-risk families. The approach is more statistically efficient than conditioning on the variables used in sampling. In it, the likelihood is adjusted by a factor that is a function of sampling weights in strata defined by those variables. It requires that these variables and the sampling probabilities in the strata they define either are known or can be estimated. The latter requires a second, population-based dataset. As an example, we derive ascertainment-corrected estimates of penetrance for the breast cancer susceptibility genes BRCA1 and BRCA2. The Bayesian analysis that we use incorporates a modified segregation model and prior data on penetrance derived from the literature. Markov chain Monte Carlo methods are used for inference.
Affiliation(s)
- Edwin S Iversen
  - Edwin S. Iversen, Jr. is Research Assistant Professor, Department of Biostatistics and Bioinformatics and Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708. Sining Chen is Postdoctoral Fellow, Oncology Biostatistics, Johns Hopkins University, Baltimore, MD 21205.
19
Abstract
Nonrandom ascertainment is commonly used in genetic studies of rare diseases, since this design is often more convenient than the random-sampling design. When there is an underlying latent heterogeneity, Epstein et al. ([2002] Am. J. Hum. Genet. 70:886-895) showed that it is possible to get unbiased or consistent estimation of population parameters under ascertainment adjustment, but Glidden and Liang ([2002] Genet. Epidemiol. 23:201-208) showed in a simulation study that the resulting estimates are highly sensitive to misspecification of the latent components. To overcome this difficulty, we consider a heavy-tailed model for latent variables that allows a robust estimation of the parameters. We describe a hierarchical-likelihood approach that avoids the integration used in the standard marginal likelihood approach. We revisit and extend the previous simulation, and show that the resulting estimator is efficient and robust against misspecification of the distribution of latent variables.
Affiliation(s)
- Maengseok Noh
  - Department of Statistics, Seoul University, Seoul, Republic of Korea
20
21
Burton PR. Correcting for nonrandom ascertainment in generalized linear mixed models (GLMMs), fitted using Gibbs sampling. Genet Epidemiol 2003; 24:24-35. [PMID: 12508253 DOI: 10.1002/gepi.10206] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
Abstract
Gibbs sampling-based generalized linear mixed models (GLMMs) provide a convenient and flexible way to extend variance components models for multivariate normally distributed continuous traits to other classes of phenotype. This includes binary traits and right-censored failure times such as age-at-onset data. The approach has applications in many areas of genetic epidemiology. However, the required GLMMs are sensitive to nonrandom ascertainment. In the absence of an appropriate correction for ascertainment, they can exhibit marked positive bias in the estimated grand mean and serious shrinkage in the estimated magnitude of variance components. To compound practical difficulties, it is currently difficult to implement a conventional adjustment for ascertainment because of the need to undertake repeated integration across the distribution of random effects. This is prohibitively slow when it must be repeated at every iteration of the Markov chain Monte Carlo (MCMC) procedure. This paper motivates a correction for ascertainment that is based on sampling random effects rather than integrating across them and can therefore be implemented in a general-purpose Gibbs sampling environment such as WinBUGS. The approach has the characteristic that it returns ascertainment-adjusted parameter estimates that pertain to the true distribution of determinants in the ascertained sample rather than in the general population. The implications of this characteristic are investigated and discussed. This paper extends the utility of Gibbs sampling-based GLMMs to a variety of settings in which family data are ascertained nonrandomly.
Affiliation(s)
- Paul R Burton
  - Department of Epidemiology and Public Health and Institute of Genetics, University of Leicester, Leicester, UK.
22
Affiliation(s)
- David V Glidden
  - Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California 94143-0560, USA.