1
Schaid DJ, McDonnell SK, Thibodeau SN. Familial recurrence risk with varying amount of family history. Genet Epidemiol 2019; 43:440-448. [PMID: 30740785 DOI: 10.1002/gepi.22193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2018] [Revised: 12/13/2018] [Accepted: 01/24/2019] [Indexed: 11/09/2022]
Abstract
The familial recurrence risk is the probability a person will have disease, given a reported family history. When family histories are obtained as simple counts of disease among family members, as often obtained in cancer registries or surveys, we propose methods to estimate recurrence risks based on truncated binomial distributions. By this approach, we are able to obtain unbiased estimates of risk for a person with at least k affected relatives, where k can be specified to determine how risk varies with k. We also derive robust variances of the recurrence risk estimate, to account for correlations within families, such as those induced by shared genes or shared environment, without explicitly modeling the factors that cause familial correlations. Furthermore, we illustrate how mixture models can be used to account for a sample composed of low- and high-risk families. Using simulations, we illustrate the properties of the proposed methods. Application of our methods to a family history survey of prostate cancer shows that the recurrence risk for prostate cancer increased from 16%, when there was at least one affected relative, to 52%, when there were at least five affected relatives.
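The truncated binomial idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes, purely for illustration, that each of `n` relatives is affected independently with a common risk `p`, and it shows how conditioning on "at least `k` affected" changes the distribution of the affected count. The function names and the example values (`n_relatives = 6`, `risk = 0.15`) are hypothetical.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def truncated_binom_pmf(x: int, n: int, p: float, k: int) -> float:
    """P(X = x | X >= k): binomial pmf left-truncated at k."""
    if x < k or x > n:
        return 0.0
    # Probability of the retained region X >= k (the normalizing constant).
    tail = sum(binom_pmf(j, n, p) for j in range(k, n + 1))
    return binom_pmf(x, n, p) / tail


def mean_affected_given_at_least(n: int, p: float, k: int) -> float:
    """E[X | X >= k] under the truncated binomial."""
    return sum(x * truncated_binom_pmf(x, n, p, k) for x in range(k, n + 1))


# Hypothetical example values, chosen only for illustration.
n_relatives, risk = 6, 0.15
for k in (1, 2, 3):
    print(k, round(mean_affected_given_at_least(n_relatives, risk, k), 3))
```

Under these assumptions the conditional mean count grows with `k`, which is why a naive proportion computed only from families reported with affected relatives would overstate risk unless the truncation is modeled.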
Affiliation(s)
- Daniel J Schaid
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota
- Stephen N Thibodeau
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
2
Guo F, Kim I, Klauer SG. Semiparametric Bayesian models for evaluating time-variant driving risk factors using naturalistic driving data and case-crossover approach. Stat Med 2017; 38:160-174. [DOI: 10.1002/sim.7574] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/30/2016] [Revised: 11/02/2017] [Accepted: 11/05/2017] [Indexed: 11/12/2022]
Affiliation(s)
- Feng Guo
- Department of Statistics, Virginia Tech, Blacksburg, VA 24060, USA
- Virginia Tech Transportation Institute, Blacksburg, VA 24060, USA
- Inyoung Kim
- Department of Statistics, Virginia Tech, Blacksburg, VA 24060, USA
3
Zhong Y, Cook RJ. Second-Order Estimating Equations for Clustered Current Status Data from Family Studies Using Response-Dependent Sampling. Statistics in Biosciences 2017; 10:160-183. [PMID: 30147803 PMCID: PMC6097126 DOI: 10.1007/s12561-017-9201-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/01/2016] [Accepted: 07/13/2017] [Indexed: 11/25/2022]
Abstract
Studies about the genetic basis for disease are routinely conducted through family studies under response-dependent sampling in which affected individuals called probands are sampled from a disease registry, and their respective family members (non-probands) are recruited for study. The extent to which the dependence in some feature of the disease process (e.g., presence, age of onset, severity) varies according to the kinship of individuals reflects the evidence of a genetic cause for disease. When the probands are selected from a disease registry, it is common for them to provide quite detailed information regarding their disease history, but non-probands often simply provide their disease status at the time of contact. We develop conditional second-order estimating equations for studying the nature and extent of within-family dependence which recognizes the biased sampling scheme employed in family studies and the current status data provided by the non-probands. Simulation studies are carried out to evaluate the finite sample performance of different estimating functions and to quantify the empirical relative efficiency of the various methods. Sensitivity to model misspecification is also explored. An application to a motivating psoriatic arthritis family study is given for illustration.
Affiliation(s)
- Yujie Zhong
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge, CB2 0SR UK
- Richard J. Cook
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON N2L 3G1 Canada
4
Bruni M, Flax JF, Buyske S, Shindhelm AD, Witton C, Brzustowicz LM, Bartlett CW. Behavioral and Molecular Genetics of Reading-Related AM and FM Detection Thresholds. Behav Genet 2017; 47:193-201. [PMID: 27826669 PMCID: PMC5305590 DOI: 10.1007/s10519-016-9821-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/15/2015] [Accepted: 09/28/2016] [Indexed: 12/24/2022]
Abstract
Auditory detection thresholds for certain frequencies of both amplitude modulated (AM) and frequency modulated (FM) dynamic auditory stimuli are associated with reading in typically developing and dyslexic readers. We present the first behavioral and molecular genetic characterization of these two auditory traits. Two extant extended family datasets were given reading tasks and psychoacoustic tasks to determine FM 2 Hz and AM 20 Hz sensitivity thresholds. Univariate heritabilities were significant for both AM (h² = 0.20) and FM (h² = 0.29). Bayesian posterior probability of linkage (PPL) analysis found loci for AM (12q, PPL = 81%) and FM (10p, PPL = 32%; 20q, PPL = 65%). Bivariate heritability analyses revealed that FM is genetically correlated with reading, while AM was not. Bivariate PPL analysis indicates that FM loci (10p, 20q) are not also associated with reading.
Affiliation(s)
- Matthew Bruni
- The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Judy F Flax
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Steven Buyske
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Department of Statistics, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Amber D Shindhelm
- The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Caroline Witton
- Aston Brain Centre, School of Life and Health Sciences, Aston University, Birmingham, B4 7ET, UK
- Linda M Brzustowicz
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers The State University of New Jersey, Piscataway, NJ, USA
- Christopher W Bartlett
- Department of Pediatrics, College of Medicine, The Ohio State University, Columbus, OH, USA.
- Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital & The Ohio State University, 575 Children's Crossroad, Columbus, OH, 43205, USA.
5
Song YE, Stein CM, Morris NJ. strum: an R package for structural modeling of latent variables for general pedigrees. BMC Genet 2015; 16:35. [PMID: 25887541 PMCID: PMC4404673 DOI: 10.1186/s12863-015-0190-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/11/2014] [Accepted: 03/19/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Structural equation modeling (SEM) is an extremely general and powerful approach to account for measurement error and causal pathways when analyzing data, and it has been used in a wide range of applied sciences. There are many commercial and freely available software packages for SEM. However, it is difficult to use any of the packages to analyze general pedigree data, and SEM packages for genetics are limited in their application. RESULTS We present the new R package strum to serve the need for a suitable SEM software tool for genetic analysis. It implements a general framework for SEM within the context of general pedigree data. This context requires specialized considerations such as familial correlations and ascertainment. Our package is an extraordinarily flexible tool capable of modeling genetic association, linkage analysis, polygenic effects, shared environment, and ascertainment combined with confirmatory factor analysis and general SEM. It also provides a convenient tool for model visualization, and integrates tools for simulating pedigree data. The various features of this package are tested through a simulation study to evaluate performance, and our results show that strum is very reliable and robust in terms of the accuracy and coverage of parameter estimates. CONCLUSIONS strum is a valuable new tool for genetic analysis. It can be easily used with general pedigree data, incorporating both measurement and structural models, giving it some significant advantages over other software packages. It also includes a built-in approach for handling ascertainment, a helpful integrated tool for genetic data simulation, and built-in tools for model visualization, providing a significant addition to biomedical research.
Affiliation(s)
- Yeunjoo E Song
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, 44106, USA.
- Catherine M Stein
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, 44106, USA.
- Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH, 44106, USA.
- Nathan J Morris
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, 44106, USA.
- Center for Clinical Investigation, Case Western Reserve University, Cleveland, OH, 44106, USA.
6
Roy S, Sarkar A, Das K. Analysis of bivariate binary data with possible chances of wrong ascertainment. J Stat Comput Sim 2014. [DOI: 10.1080/00949655.2012.722635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
7
Lee Y, Ghosh D, Zhang Y. Association testing to detect gene-gene interactions on sex chromosomes in trio data. Front Genet 2013; 4:239. [PMID: 24312118 PMCID: PMC3826485 DOI: 10.3389/fgene.2013.00239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/31/2013] [Accepted: 10/24/2013] [Indexed: 11/13/2022] Open
Abstract
Autism Spectrum Disorder (ASD) occurs more often among males than females in a 4:1 ratio. Among theories used to explain the causes of ASD, the X chromosome and the Y chromosome theories attribute ASD to the X-linked mutation and the male-limited gene expressions on the Y chromosome, respectively. Despite the rationale of the theory, studies have failed to attribute the sex-biased ratio to the significant linkage or association on the regions of interest on X chromosome. We further study the gender biased ratio by examining the possible interaction effects between two genes in the sex chromosomes. We propose a logistic regression model with mixed effects to detect gene–gene interactions on sex chromosomes. We investigated the power and type I error rates of the approach for a range of minor allele frequencies and varying linkage disequilibrium between markers and QTLs. We also evaluated the robustness of the model to population stratification. We applied the model to a trio-family data set with an ASD affected male child to study gene–gene interactions on sex chromosomes.
Affiliation(s)
- Yeonok Lee
- Department of Statistics, Penn State University, University Park PA, USA
8
Abstract
Dupuytren's disease is a complex condition, with both genetic and environmental factors contributing to its aetiology. We aimed to quantify the extent to which genetic factors predispose to the disease, through the calculation of sibling recurrence risk (λs), and to calculate the proportion of heritability accounted for by currently known genetic loci. From 174 siblings of patients with surgically confirmed disease, 100 were randomly selected. Controls were recruited from patients attending an ophthalmology outpatient clinic for eye conditions unrelated to diabetes. There were no statistically significant differences in baseline characteristics between the case and control groups. In siblings, 47% had Dupuytren's disease, compared with 10% of controls, giving a λs of 4.5. Currently known loci that predispose to Dupuytren's disease account for 12.1% of the total heritability of the disease. Dupuytren's disease was significantly more common in siblings than in controls. These results accurately quantify the magnitude of the genetic predisposition to Dupuytren's disease.
Affiliation(s)
- R Capstick
- Nuffield Department of Surgical Sciences, University of Oxford, UK
9
Roy S, Das K, Sarkar A. Analysis of binary data with the possibility of wrong ascertainment. Stat Neerl 2013. [DOI: 10.1111/stan.12008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Surupa Roy
- Department of Statistics; St. Xavier's College; Kolkata
- Kalyan Das
- Department of Statistics; University of Calcutta; Kolkata
- Angshuman Sarkar
- Department of Statistics, Siksha Bhavana; Visva-Bharati; Santiniketan
10
Roy S. Accounting for Response Misclassification and Covariate Measurement Error Using a Random Effects Logit Model. Commun Stat-Simul C 2012. [DOI: 10.1080/03610918.2011.611312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
11
Al-Chalabi A, Lewis CM. Modelling the effects of penetrance and family size on rates of sporadic and familial disease. Hum Hered 2011; 71:281-8. [PMID: 21846995 DOI: 10.1159/000330167] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Received: 03/24/2011] [Accepted: 06/11/2011] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND/AIMS Many complex diseases show a diversity of inheritance patterns ranging from familial disease, manifesting with autosomal dominant inheritance, through to simplex families in which only one person is affected, manifesting as apparently sporadic disease. The role of ascertainment bias in generating apparent patterns of inheritance is often overlooked. We therefore explored the role of two key parameters that influence ascertainment, penetrance and family size, in rates of observed familiality. METHODS We develop a mathematical model of familiality of disease, with parameters for penetrance, mutation frequency and family size, and test this in a complex disease: amyotrophic lateral sclerosis. RESULTS Monogenic, high-penetrance variants can explain patterns of inheritance in complex diseases and account for a large proportion of those with no apparent family history. With current demographic trends, rates of familiality will drop further. For example, a variant with penetrance 0.5 will cause apparently sporadic disease in 12% of families of size 10, but 80% of families of size 1. A variant with penetrance 0.9 has only an 11% chance of appearing sporadic in families of a size similar to those of Ireland in the past, compared with 57% in one-child families like many in China. CONCLUSIONS These findings have implications for genetic counselling, disease classification and the design of gene-hunting studies. The distinction between familial and apparently sporadic disease should be considered artificial.
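The dependence of apparent sporadic disease on penetrance and family size can be sketched numerically. The model below is an assumed simplification and not necessarily the authors' exact formulation: one parent carries the variant, each of `n_children` offspring inherits it with probability 1/2, carriers are affected independently with probability `penetrance`, and a family counts as "apparently sporadic" when exactly one member is affected among families with at least one affected member. The function name is hypothetical.

```python
def prob_apparently_sporadic(penetrance: float, n_children: int) -> float:
    """P(exactly one affected | at least one affected) in a nuclear family.

    Assumed structure (illustrative only): one carrier parent plus
    n_children offspring, each inheriting the variant with probability 1/2.
    """
    f = penetrance
    q = f / 2.0  # chance that a given child is both a carrier and affected
    no_child_affected = (1 - q) ** n_children
    p_none = (1 - f) * no_child_affected
    p_parent_only = f * no_child_affected
    # Parent unaffected and exactly one of the children affected.
    p_one_child_only = (1 - f) * n_children * q * (1 - q) ** (n_children - 1)
    return (p_parent_only + p_one_child_only) / (1 - p_none)


for f, n in [(0.5, 1), (0.5, 10), (0.9, 1)]:
    print(f, n, round(prob_apparently_sporadic(f, n), 3))
```

Under these assumptions the sketch gives about 0.80 and 0.57 for one-child families at penetrance 0.5 and 0.9, in line with the figures quoted in the abstract, and roughly 0.13 for ten children at penetrance 0.5, close to but not identical with the reported 12%, so the published model presumably differs in some detail.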
Affiliation(s)
- Ammar Al-Chalabi
- Department of Clinical Neuroscience, Medical Research Council Centre for Neurodegeneration Research, London, UK
12
13
Javaras KN, Hudson JI, Laird NM. Fitting ACE structural equation models to case-control family data. Genet Epidemiol 2010; 34:238-45. [PMID: 19918760 DOI: 10.1002/gepi.20454] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for members of families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contributions to the disease of additive genetic effects (A), shared family environment (C), and unique environment (E). We describe an ACE model for binary family data; this structural equation model, which has been described previously, combines a general-family extension of the classic ACE twin model with a (possibly covariate-specific) liability-threshold model for binary outcomes. We then introduce our contribution, a likelihood-based approach to fitting the model to singly ascertained case-control family data. The approach, which involves conditioning on the proband's disease status and also setting prevalence equal to a prespecified value that can be estimated from the data, makes it possible to obtain valid estimates of the A, C, and E variance components from case-control (rather than only from population-based) family data. In fact, simulation experiments suggest that our approach to fitting yields approximately unbiased estimates of the A, C, and E variance components, provided that certain commonly made assumptions hold. Further, when our approach is used to fit the ACE model to Austrian case-control family data on depression, the resulting estimate of heritability is very similar to those from previous analyses of twin data.
Affiliation(s)
- K N Javaras
- Waisman Laboratory for Brain Imaging & Behavior, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA.
14
Epstein MP, Hunter JE, Allen EG, Sherman SL, Lin X, Boehnke M. A Variance-Component Framework for Pedigree Analysis of Continuous and Categorical Outcomes. Statistics in Biosciences 2009; 1:181-198. [PMID: 20436936 PMCID: PMC2860148 DOI: 10.1007/s12561-009-9010-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]
Abstract
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflations in type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.
Affiliation(s)
- Emily G. Allen
- Department of Human Genetics, Emory University, Atlanta, GA
- Xihong Lin
- Department of Biostatistics, Harvard University, Boston, MA
- Michael Boehnke
- Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI
15
Khoury MJ, Bertram L, Boffetta P, Butterworth AS, Chanock SJ, Dolan SM, Fortier I, Garcia-Closas M, Gwinn M, Higgins JPT, Janssens ACJW, Ostell J, Owen RP, Pagon RA, Rebbeck TR, Rothman N, Bernstein JL, Burton PR, Campbell H, Chockalingam A, Furberg H, Little J, O'Brien TR, Seminara D, Vineis P, Winn DM, Yu W, Ioannidis JPA. Genome-wide association studies, field synopses, and the development of the knowledge base on genetic variation and human diseases. Am J Epidemiol 2009; 170:269-79. [PMID: 19498075 PMCID: PMC2714948 DOI: 10.1093/aje/kwp119] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.7] [Indexed: 12/21/2022] Open
Abstract
Genome-wide association studies (GWAS) have led to a rapid increase in available data on common genetic variants and phenotypes and numerous discoveries of new loci associated with susceptibility to common complex diseases. Integrating the evidence from GWAS and candidate gene studies depends on concerted efforts in data production, online publication, database development, and continuously updated data synthesis. Here the authors summarize current experience and challenges on these fronts, which were discussed at a 2008 multidisciplinary workshop sponsored by the Human Genome Epidemiology Network. Comprehensive field synopses that integrate many reported gene-disease associations have been systematically developed for several fields, including Alzheimer's disease, schizophrenia, bladder cancer, coronary heart disease, preterm birth, and DNA repair genes in various cancers. The authors summarize insights from these field synopses and discuss remaining unresolved issues, especially in the light of evidence from GWAS, for which they summarize empirical P-value and effect-size data on 223 discovered associations for binary outcomes (142 with P < 10⁻⁷). They also present a vision of collaboration that builds reliable cumulative evidence for genetic associations with common complex diseases and a transparent, distributed, authoritative knowledge base on genetic variation and human health. As a next step in the evolution of Human Genome Epidemiology reviews, the authors invite investigators to submit field synopses for possible publication in the American Journal of Epidemiology.
Affiliation(s)
- Muin J Khoury
- Office of Public Health Genomics, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA 30341, USA.
16
Khoury MJ, Bertram L, Boffetta P, Butterworth AS, Chanock SJ, Dolan SM, Fortier I, Garcia-Closas M, Gwinn M, Higgins JPT, Janssens ACJW, Ostell J, Owen RP, Pagon RA, Rebbeck TR, Rothman N, Bernstein JL, Burton PR, Campbell H, Chockalingam A, Furberg H, Little J, O'Brien TR, Seminara D, Vineis P, Winn DM, Yu W, Ioannidis JPA. Genome-wide association studies, field synopses, and the development of the knowledge base on genetic variation and human diseases. Am J Epidemiol 2009. [PMID: 19498075 DOI: 10.1093/aje.kwp119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/12/2022] Open
17
Ma J, Amos CI, Warwick Daw E. Ascertainment correction for Markov chain Monte Carlo segregation and linkage analysis of a quantitative trait. Genet Epidemiol 2007; 31:594-604. [PMID: 17487893 DOI: 10.1002/gepi.20231] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait in the spirit of the sequential sampling theory of Cannings and Thompson [Cannings and Thompson [1977] Clin. Genet. 12:208-212]. Using our simulated data, we detected no bias in estimates of the trait locus location. However, in addition to allele frequencies, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, bias was also found in the estimation of this parameter. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci, and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple, but a fixed number of, trait loci.
Affiliation(s)
- Jianzhong Ma
- Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center, Houston, Texas 77005, USA
18
McDonnell SM, Sinsheimer J, Price AJ, Carr AJ. Genetic influences in the aetiology of anteromedial osteoarthritis of the knee. J Bone Joint Surg Br 2007; 89:901-3. [PMID: 17673582 DOI: 10.1302/0301-620x.89b7.18915] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022]
Abstract
We report a study of 112 patients with primary anteromedial osteoarthritis of the knee and their families. Sibling risk was determined using randomly selected single siblings. Spouses were used as controls. The presence of symptomatic osteoarthritis was determined using an Oxford knee score of ≥ 29 supported by a Kellgren and Lawrence radiological score of II or greater. Using Fisher’s exact test, we found a significantly increased risk of anteromedial osteoarthritis (OA) relative to the control group (p = 0.031). The recurrence risk of anteromedial OA to siblings was 3.21 (95% confidence interval 1.12 to 9.27). These findings imply that genetic factors may play a major role in the development of anteromedial OA of the knee.
Affiliation(s)
- S M McDonnell
- University of Oxford, Nuffield Department of Orthopaedic Surgery, Nuffield Orthopaedic Centre, Windmill Road, Headington, Oxford OX3 7LD, UK
19
Leu M, Czene K, Reilly M. “Population Lab”: The Creation of Virtual Populations for Genetic Epidemiology Research. Epidemiology 2007; 18:433-40. [PMID: 17486019 DOI: 10.1097/ede.0b013e31805d8ab2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
Abstract
BACKGROUND Studies of familial aggregation of disease routinely use linked population registers to construct retrospective cohorts. Although such resources have provided numerous estimates of familial risk, little is known regarding the sensitivity of the estimates to assumed disease models, changing demographics and incidence, and incompleteness of the data. Furthermore, there are no standard tools for testing the validity of estimates from standard epidemiologic designs and from new analytic strategies using register data. METHODS We present a method and a software package for simulating realistic populations of related individuals, using easily available vital statistics (population counts and fertility and mortality rates). The virtual population is stored in a pedigree file, allowing for easy retrieval of relatives and family structures. We simulate breast cancer in our population using age-specific incidence rates. RESULTS The Swedish population is simulated as dynamically evolving over the calendar period 1955-2002. The simulated and real population agree well on important features such as age profile, sibship size distribution, and average age at first birth. Using breast cancer as an example, we present several models of familial disease aggregation and show that the parameters used in the simulations are faithfully estimated. In addition, we illustrate how our simulated population provides insight into how incomplete family history in real register data can affect estimates of familial risk. CONCLUSIONS This simulation method can be used to investigate various underlying models of disease aggregation in families and enhance the development of optimal approaches for family studies. The software package, Population Lab, is available for free download (http://www.meb.ki.se/~marrei/software/poplab/ and http://cran.at.r-project.org/).
Affiliation(s)
- Monica Leu
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
|
20
|
Iyengar SK, Freedman BI, Sedor JR. Mining the genome for susceptibility to diabetic nephropathy: the role of large-scale studies and consortia. Semin Nephrol 2007; 27:208-22. [PMID: 17418689 DOI: 10.1016/j.semnephrol.2007.01.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
Abstract
Approximately 30% of individuals with type 1 and type 2 diabetes develop persistent albuminuria, lose renal function, and are at increased risk for cardiovascular and other microvascular complications. Diabetes and kidney diseases rank within the top 10 causes of death in Westernized countries and cause significant morbidity. Given these observations, genetic, genomic, and proteomic investigations have been initiated to better define basic mechanisms for disease initiation and progression, to identify individuals at risk for diabetic complications, and to develop more efficacious therapies. In this review we have focused on linkage analyses of candidate genes or chromosomal regions, or coarse genome-wide scans, which have mapped either categorical (chronic kidney disease or end-stage renal disease) or quantitative kidney traits (albuminuria/proteinuria or glomerular filtration rate). Most loci identified to date have not been replicated; however, several linked chromosomal regions are concordant between independent samples, suggesting the presence of a diabetic nephropathy gene. Two genes, carnosinase (CNDP1) on 18q, and engulfment and cell motility 1 (ELMO1) on 7p14, have been identified as diabetic nephropathy susceptibility genes, but these results require authentication. The availability of patient data sets with large sample sizes, improvements in informatics, genotyping technology, and statistical methodologies should accelerate the discovery of valid diabetic nephropathy susceptibility genes.
Affiliation(s)
- Sudha K Iyengar
- Department of Epidemiology and Biostatistics, Case Western Reserve University, 2103 Cornell Road, Cleveland, OH 44106, USA.
|
21
|
Bowden J, Thompson JR, Burton PR. A two-stage approach to the correction of ascertainment bias in complex genetic studies involving variance components. Ann Hum Genet 2007; 71:220-9. [PMID: 17354286 DOI: 10.1111/j.1469-1809.2006.00307.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Correction for ascertainment bias is a vital part of the analysis of genetic epidemiology studies that needs to be undertaken whenever subjects are not recruited at random. Adjustment often requires extensive numerical integration, which can be very slow or even computationally infeasible, especially if the model includes many fixed and random effects. In this paper we propose a two-stage method for ascertainment bias correction. In the first stage we estimate parameters that pertain to the ascertained population, that is the population that would be selected into the sample if the ascertainment criterion were applied to everyone. In the second stage we convert the estimates for the ascertained population into general population parameter estimates. We illustrate the method with simulations based on a simple model and then describe how the method can be used with complex models. The two-stage approach avoids some of the integration required in direct adjustment, hence speeding up the process of model fitting.
|
22
|
Noh M, Yip B, Lee Y, Pawitan Y. Multicomponent variance estimation for binary traits in family-based studies. Genet Epidemiol 2006; 30:37-47. [PMID: 16265627 DOI: 10.1002/gepi.20099] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
In biometrical genetic analyses of binary traits, the use of family data overcomes some limitations of twin studies, particularly in terms of sample size and types of genetic or environmental factors that can be estimated. However, because of computational problems, recent methods in the application of generalized linear mixed models for family data structure have limited the ability to handle large data sets with general covariates. In this paper, we investigate the use of the hierarchical likelihood approach to the analysis of binary traits from family data. In a simulation study, the method is shown to be highly accurate for the estimation of both the variance components and fixed regression parameters, even for small family sizes. For illustration, we analyze a real data set of familial aggregation of preeclampsia, a pregnancy-induced hypertension. When possible, the analysis is compared with the exact maximum likelihood approach.
Affiliation(s)
- M Noh
- Department of Statistics, Seoul National University, South Korea
|
23
|
Feng R, Zhang H. Ascertainment adjustment in genetic studies of ordinal traits. Hum Genet 2006; 119:429-35. [PMID: 16528520 DOI: 10.1007/s00439-006-0147-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 07/20/2005] [Accepted: 01/17/2006] [Indexed: 10/24/2022]
Abstract
Most genetic studies recruit high-risk families, and the discoveries are based on non-randomly selected groups. We must consider the consequences of this ascertainment process in order to apply the results of genetic research to the general population. In previous reports, we developed a latent variable model to assess the familial aggregation and inheritability of ordinal-scaled diseases, and found a major gene component of alcoholism after applying the model to the data from the Yale family study of comorbidity of alcoholism and anxiety (YFSCAA). In this report, we examine the ascertainment effects on parameter estimates and correct potential bias in the latent variable model. The simulation studies for various ascertainment schemes suggest that our ascertainment adjustment is necessary and effective. We also find that the estimated effects are relatively unbiased for the particular ascertainment scheme used in the YFSCAA, which assures the validity of our earlier conclusion.
Affiliation(s)
- Rui Feng
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA
|
24
|
Oddy WH, Pal S, Kusel MMH, Vine D, de Klerk NH, Hartmann P, Holt PG, Sly PD, Burton PR, Stanley FJ, Landau LI. Atopy, eczema and breast milk fatty acids in a high-risk cohort of children followed from birth to 5 yr. Pediatr Allergy Immunol 2006; 17:4-10. [PMID: 16426248 DOI: 10.1111/j.1399-3038.2005.00340.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
BACKGROUND The incidence of atopic diseases such as eczema is increasing in westernized societies. The suggestion that there is a "protective" association between the unique fatty acid composition of breast milk, particularly the omega-3 (n-3) and omega-6 (n-6) essential polyunsaturated fatty acid content, and the development of atopic disease in children was investigated in a cohort study of 263 infants born into families with a history of allergy (one or both parents had asthma, hayfever, eczema). The objectives of this study were to determine the lipid profile [specifically in relation to long-chain polyunsaturated fatty acid (LC-PUFA) composition] in maternal breast milk samples collected at 6 wk and at 6 months following birth, and to investigate the potential role of these fatty acids in modulating the phenotype of children at high genetic risk of developing atopic disease. METHOD Breast milk samples were available from 91 atopic mothers at their child's ages of 6 wk and 6 months. These samples were analysed for the fatty acid spectrum. Analysis of variance was used to detect differences between groups of outcomes (no atopy or eczema, non-atopic eczema, atopy, atopic eczema) at ages 6 months and 5 yr, and a multiple comparisons procedure was conducted to isolate the parameters producing the different results (F-test, LSD test). For the exposure variables, n-3 and n-6 fatty acids are expressed as weight percentage and as a ratio (at both time-points). RESULTS The fatty acid profiles of maternal breast milk at 6 wk and 6 months were similar. An increased ratio of n-6: n-3 fatty acids in both 6 wk and 6 month milk samples was associated with non-atopic eczema (p < 0.005) but not atopy alone or atopic eczema. CONCLUSION We found milk fatty acids were a significant modulator of non-atopic eczema but not atopy or atopic eczema in infants at 6 months. 
In mothers with a history of asthma, hayfever or eczema, their 6-month-old infants were more likely to develop non-atopic eczema if their milk had a higher ratio of n-6: n-3 LC-PUFA.
Affiliation(s)
- Wendy H Oddy
- Telethon Institute for Child Health Research, Centre for Child Health Research, University of Western Australia, PO Box 855, West Perth, Perth, WA, Australia.
|
25
|
Abstract
This article is the first in a series of seven that will provide an overview of central concepts and topical issues in modern genetic epidemiology. In this article, we provide an overall framework for investigating the role of familial factors, especially genetic determinants, in the causation of complex diseases such as diabetes. The discrete steps of the framework to be outlined integrate the biological science underlying modern genetics and the population science underpinning mainstream epidemiology. In keeping with the broad readership of The Lancet and the diverse background of today's genetic epidemiologists, we provide introductory sections to equip readers with basic concepts and vocabulary. We anticipate that, depending on their professional background and specialist knowledge, some readers will wish to skip some of this article.
Affiliation(s)
- Paul R Burton
- Department of Health Sciences, University of Leicester, Leicester, UK.
|
26
|
Abstract
Nonrandom ascertainment is commonly used in genetic studies of rare diseases, since this design is often more convenient than the random-sampling design. When there is an underlying latent heterogeneity, Epstein et al. ([2002] Am. J. Hum. Genet. 70:886-895) showed that it is possible to get unbiased or consistent estimation of population parameters under ascertainment adjustment, but Glidden and Liang ([2002] Genet. Epidemiol. 23:201-208) showed in a simulation study that the resulting estimates are highly sensitive to misspecification of the latent components. To overcome this difficulty, we consider a heavy-tailed model for latent variables that allows a robust estimation of the parameters. We describe a hierarchical-likelihood approach that avoids the integration used in the standard marginal likelihood approach. We revisit and extend the previous simulation, and show that the resulting estimator is efficient and robust against misspecification of the distribution of latent variables.
Affiliation(s)
- Maengseok Noh
- Department of Statistics, Seoul University, Seoul, Republic of Korea
|
27
|
Abstract
This report points out that some sibling genetic risk parameters can be regarded as the ratios of the characteristic values in the ascertainment subpopulation. Based on this observation, we reconsider Olson and Cordell's ([2000] Genet. Epidemiol. 18:217-235) and Cordell and Olson's ([2000] Genet. Epidemiol. 18:307-321) estimators, and re-derive these estimators. Furthermore, we provide the closed-form variance estimators. Simulation results suggest that our proposed estimators perform very well, and single ascertainment may be better than complete ascertainment for estimating these genetic parameters.
Affiliation(s)
- Guohua Zou
- Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Connecticut 06520-8034, USA
|
28
|
Sun W, Li H. Ascertainment-adjusted maximum likelihood estimation for the additive genetic gamma frailty model. Lifetime Data Anal 2004; 10:229-245. [PMID: 15456105 DOI: 10.1023/b:lida.0000036390.15378.5e] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/24/2023]
Abstract
The additive genetic gamma frailty model has been proposed for genetic linkage analysis for complex diseases to account for variable age of onset and possible covariate effects. To avoid ascertainment biases in parameter estimates, retrospective likelihood ratio tests are often used, which may result in loss of efficiency due to conditioning. This paper considers the case in which sibships are ascertained by having at least two affected sibs with the disease before a given age and provides two approaches for estimating the parameters in the additive gamma frailty model. One approach is based on the likelihood function conditioning on the ascertainment event; the other is based on maximizing a full ascertainment-adjusted likelihood. Explicit forms for these likelihood functions are derived. Simulation studies indicate that when the baseline hazard function can be correctly pre-specified, both approaches give accurate estimates of the model parameters. However, when the baseline hazard function has to be estimated simultaneously, only the ascertainment-adjusted likelihood method gives an unbiased estimate of the parameters. These results imply that the ascertainment-adjusted likelihood ratio test in the context of the additive genetic gamma frailty may be used for genetic linkage analysis.
Affiliation(s)
- Wanlong Sun
- Rowe Program in Human Genetics, University of California, Davis School of Medicine, Davis, CA, USA
|
29
|
|
30
|
Burton PR. Correcting for nonrandom ascertainment in generalized linear mixed models (GLMMs), fitted using Gibbs sampling. Genet Epidemiol 2003; 24:24-35. [PMID: 12508253 DOI: 10.1002/gepi.10206] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
Abstract
Gibbs sampling-based generalized linear mixed models (GLMMs) provide a convenient and flexible way to extend variance components models for multivariate normally distributed continuous traits to other classes of phenotype. This includes binary traits and right-censored failure times such as age-at-onset data. The approach has applications in many areas of genetic epidemiology. However, the required GLMMs are sensitive to nonrandom ascertainment. In the absence of an appropriate correction for ascertainment, they can exhibit marked positive bias in the estimated grand mean and serious shrinkage in the estimated magnitude of variance components. To compound practical difficulties, it is currently difficult to implement a conventional adjustment for ascertainment because of the need to undertake repeated integration across the distribution of random effects. This is prohibitively slow when it must be repeated at every iteration of the Markov chain Monte Carlo (MCMC) procedure. This paper motivates a correction for ascertainment that is based on sampling random effects rather than integrating across them and can therefore be implemented in a general-purpose Gibbs sampling environment such as WinBUGS. The approach has the characteristic that it returns ascertainment-adjusted parameter estimates that pertain to the true distribution of determinants in the ascertained sample rather than in the general population. The implications of this characteristic are investigated and discussed. This paper extends the utility of Gibbs sampling-based GLMMs to a variety of settings in which family data are ascertained nonrandomly.
Affiliation(s)
- Paul R Burton
- Department of Epidemiology and Public Health and Institute of Genetics, University of Leicester, Leicester, UK.
|
31
|
Affiliation(s)
- David V Glidden
- Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California 94143-0560, USA.
|
32
|
Affiliation(s)
- Michael P Epstein
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109-2029, USA.
|
33
|
Abstract
Genetic studies of complex diseases must confront two statistically difficult issues simultaneously. First, in many settings, to minimize the number of individuals to be genotyped, families enriched for disease must be oversampled. Also, statistical models in family studies should allow for residual association. This association will represent unmeasured genetic and environmental factors influencing disease risk. Dealing with these features simultaneously is both compelling and challenging. Burton et al. [2000] (Am. J. Hum. Genet. 69: 1505-14) recently discussed this issue and suggested that ascertainment corrections may lead to problematic parameter estimation. We revisit the issues and examples of Burton et al. [2000] (Am. J. Hum. Genet. 69: 1505-14) and present a more optimistic assessment. Estimation in this context is conceptually straightforward, but may be more problematic in practice. Specifically, we find that even slight misspecification of the random effects distribution in ascertainment-adjusted likelihood can yield severely biased parameter estimates. This result should make scientists wary when interpreting results from ascertainment-adjusted variance-component models.
Affiliation(s)
- David V Glidden
- Department of Epidemiology and Biostatistics, University of California at San Francisco, San Francisco, California 94143-0560, USA.
|
34
|
Affiliation(s)
- Paul R Burton
- Department of Epidemiology and Public Health, University of Leicester, Leicester, United Kingdom.
|
35
|
Burton PR, Palmer LJ, Keen KJ, Olson JM, Elston RC. Response to Epstein et al. Am J Hum Genet 2002; 71:441-2. [PMID: 12154780 PMCID: PMC379179 DOI: 10.1086/341663] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/03/2022] Open
|
36
|
Jacobs KB, Burton PR, Iyengar SK, Elston RC, Palmer LJ. Pooling data and linkage analysis in the chromosome 5q candidate region for asthma. Genet Epidemiol 2002; 21 Suppl 1:S103-8. [PMID: 11793650 DOI: 10.1002/gepi.2001.21.s1.s103] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
We investigated a variety of methods for pooling data from eight data sets (n = 5,424 subjects) to validate evidence for linkage of markers in the cytokine cluster on chromosome 5q31-33 to asthma and asthma-associated phenotypes. Chromosome 5 markers were integrated into current genetic linkage and physical maps, and a consensus map was constructed to facilitate effective data pooling. To provide more informative phenotypes with better distributional properties, variance component models were fitted using Gibbs sampling methods in order to generate residual additive genetic effects, or sigma-squared-A-random-effects (SSARs), which were used as derived phenotypes in subsequent linkage analyses. Multipoint estimates of alleles shared identically by descent (IBD) were computed for all full sibling pairs. Linkage analyses were performed with a new Haseman-Elston method that uses generalized-least-squares and a weighted combination of the mean-corrected trait-sum squared and trait-difference squared as the dependent variable. Analyses were performed with all data sets pooled together, and also separately with the resulting linkage statistics pooled by several meta-analytic methods. Our results provide no significant evidence that loci conferring susceptibility to asthma affection or atopy, as measured by total serum IgE levels, are present in the 5q31-33 region. This study has provided a clearer understanding of the significance, or lack of significance, of the 5q31-33 region in asthma genetics for the phenotypes studied.
Affiliation(s)
- K B Jacobs
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, USA
|
37
|
Epstein MP, Lin X, Boehnke M. Ascertainment-adjusted parameter estimates revisited. Am J Hum Genet 2002; 70:886-95. [PMID: 11880949 PMCID: PMC379117 DOI: 10.1086/339517] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Received: 10/01/2001] [Accepted: 01/07/2002] [Indexed: 11/03/2022] Open
Abstract
Ascertainment-adjusted parameter estimates from a genetic analysis are typically assumed to reflect the parameter values in the original population from which the ascertained data were collected. Burton et al. (2000) recently showed that, given unmodeled parameter heterogeneity, the standard ascertainment adjustment leads to biased parameter estimates of the population-based values. This finding has important implications in complex genetic studies, because of the potential existence of unmodeled genetic parameter heterogeneity. The authors further stated the important point that, given unmodeled heterogeneity, the ascertainment-adjusted parameter estimates reflect the true parameter values in the ascertained subpopulation. They illustrated these statements with two examples. By revisiting these examples, we demonstrate that if the ascertainment scheme and the nature of the data can be correctly modeled, then an ascertainment-adjusted analysis returns population-based parameter estimates. We further demonstrate that if the ascertainment scheme and data cannot be modeled properly, then the resulting ascertainment-adjusted analysis produces parameter estimates that generally do not reflect the true values in either the original population or the ascertained subpopulation.
Affiliation(s)
- Michael P Epstein
- Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109-2029, USA
|
38
|
Abstract
Genetic mapping in analysis of medical disease is performed under several assumptions and (experimental) conditions, which are made about the data in general and the disease in particular. Here we discuss these conditions, what they mean, and what kind of deleterious effects they might have on the analysis. We also illustrate how to proceed and what kind of possibilities the statistical analysis may provide to medical scientists.
|