1
Zhong Y, Cook RJ, Yu A. Analysis of secondary failure time responses in studies with response-dependent sampling schemes. Stat Med 2023; 42:4763-4775. [PMID: 37643587 DOI: 10.1002/sim.9887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 06/20/2023] [Accepted: 08/15/2023] [Indexed: 08/31/2023]
Abstract
Response-dependent sampling is routinely used as an enrichment strategy in the design of family studies investigating the heritable nature of disease. In addition to the response of primary interest, investigators often wish to investigate the association between biomarkers and secondary responses related to possible comorbidities. Statistical analysis regarding genetic biomarkers and their association with the secondary outcome must address the biased sampling scheme involving the primary response. In this article, we develop composite likelihoods and two-stage estimation procedures for such secondary analyses in which the within-family dependence structure for the primary and secondary outcomes is modeled via a Gaussian copula. The dependence among responses within family members is modeled based on kinship coefficients. Auxiliary data from independent individuals are exploited by augmenting the composite likelihoods to increase precision of marginal parameter estimates and enhance the efficiency of estimators of the dependence parameters. Simulation studies are carried out to evaluate the finite sample performance of the proposed method, and an application to a motivating family study in psoriatic arthritis is given for illustration.
Affiliation(s)
- Yujie Zhong
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
  - Oncology Statistics, R&D China AstraZeneca, Shanghai, China
- Richard J Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Aiai Yu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
2
Lakhal‐Chaieb L, Cook RJ, Zhong Y. Testing the heritability and parent‐of‐origin hypotheses for ages at onset of psoriatic arthritis under biased sampling. Biometrics 2019; 76:293-303. [DOI: 10.1111/biom.13138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/01/2017] [Accepted: 07/19/2019] [Indexed: 11/29/2022]
Affiliation(s)
- Lajmi Lakhal‐Chaieb
  - Département de Mathématiques et de Statistique, Université Laval, Québec, Québec, Canada
- Richard J. Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Yujie Zhong
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
3
Zhong Y, Cook RJ. Augmented composite likelihood for copula modeling in family studies under biased sampling. Biostatistics 2016; 17:437-52. [DOI: 10.1093/biostatistics/kxv054] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/20/2015] [Accepted: 12/22/2015] [Indexed: 11/12/2022] Open
4
Measuring missing heritability: inferring the contribution of common variants. Proc Natl Acad Sci U S A 2014; 111:E5272-81. [PMID: 25422463 DOI: 10.1073/pnas.1419064111] [Citation(s) in RCA: 178] [Impact Index Per Article: 17.8] [Indexed: 01/02/2023] Open
Abstract
Genome-wide association studies (GWASs), also called common variant association studies (CVASs), have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability. One explanation for the "missing heritability" is that there are many additional disease-associated common variants whose effects are too small to detect with current sample sizes. It therefore is useful to have methods to quantify the heritability due to common variation, without having to identify all causal variants. Recent studies applied restricted maximum likelihood (REML) estimation to case-control studies for diseases. Here, we show that REML considerably underestimates the fraction of heritability due to common variation in this setting. The degree of underestimation increases with the rarity of disease, the heritability of the disease, and the size of the sample. Instead, we develop a general framework for heritability estimation, called phenotype correlation-genotype correlation (PCGC) regression, which generalizes the well-known Haseman-Elston regression method. We show that PCGC regression yields unbiased estimates. Applying PCGC regression to six diseases, we estimate the proportion of the phenotypic variance due to common variants to range from 25% to 56% and the proportion of heritability due to common variants from 41% to 68% (mean 60%). These results suggest that common variants may explain at least half the heritability for many diseases. PCGC regression also is readily applicable to other settings, including analyzing extreme-phenotype studies and adjusting for covariates such as sex, age, and population structure.
5
Cobat A, Abel L, Alcaïs A, Schurr E. A general efficient and flexible approach for genome-wide association analyses of imputed genotypes in family-based designs. Genet Epidemiol 2014; 38:560-71. [PMID: 25044438 DOI: 10.1002/gepi.21842] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 11/18/2013] [Revised: 05/13/2014] [Accepted: 05/19/2014] [Indexed: 01/10/2023]
Abstract
Genotype imputation is a critical technique for following up genome-wide association studies. Efficient methods are available for dealing with the probabilistic nature of imputed single nucleotide polymorphisms (SNPs) in population-based designs, but not for family-based studies. We have developed a new analytical approach (FBATdosage), using imputed allele dosage in the general framework of family-based association tests to bridge this gap. Simulation studies showed that FBATdosage yielded highly consistent type I error rates, whatever the level of genotype uncertainty, and a much higher power than the best-guess genotype approach. FBATdosage allows fast linkage and association testing of several million imputed variants with binary or quantitative phenotypes in nuclear families of arbitrary size with arbitrary missing data for the parents. The application of this approach to a family-based association study of leprosy susceptibility successfully refined the association signal at two candidate loci, C1orf141-IL23R on chromosome 1 and RAB32-C6orf103 on chromosome 6.
Affiliation(s)
- Aurélie Cobat
  - Departments of Human Genetics and Medicine, McGill International TB Center, McGill University Health Center, Montreal, QC, Canada
6
Zhao Y, Yu H, Zhu Y, Ter-Minassian M, Peng Z, Shen H, Diao N, Chen F. Genetic association analysis using sibship data: a multilevel model approach. PLoS One 2012; 7:e31134. [PMID: 22312441 PMCID: PMC3270036 DOI: 10.1371/journal.pone.0031134] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/20/2011] [Accepted: 01/03/2012] [Indexed: 11/29/2022] Open
Abstract
Family-based association studies (FBAS) have the advantages of controlling for population stratification and testing for linkage and association simultaneously. We propose a retrospective multilevel model (rMLM) approach to analyze sibship data by using genotypic information as the dependent variable. Simulated data sets were generated using the simulation of linkage and association (SIMLA) program. We compared rMLM to the sib transmission/disequilibrium test (S-TDT), the sibling disequilibrium test (SDT), conditional logistic regression (CLR) and generalized estimating equations (GEE) on the measures of power, type I error, estimation bias and standard error. The results indicated that rMLM was a valid test of association in the presence of linkage using sibship data. The advantages of rMLM became more evident when the data contained concordant sibships. Compared to GEE, rMLM underestimated the odds ratio (OR) to a lesser degree. Our results support the application of rMLM to detect gene-disease associations using sibship data. However, caution is warranted regarding the risk of an inflated type I error rate when there is association without linkage between the disease locus and the genotyped marker.
Affiliation(s)
- Yang Zhao
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
  - Environmental and Occupational Medicine and Epidemiology Program, Department of Environmental Health, Harvard School of Public Health, Harvard University, Boston, Massachusetts, United States of America
- Hao Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Ying Zhu
  - Imperial College Business School, Imperial College London, London, United Kingdom
- Monica Ter-Minassian
  - Environmental and Occupational Medicine and Epidemiology Program, Department of Environmental Health, Harvard School of Public Health, Harvard University, Boston, Massachusetts, United States of America
- Zhihang Peng
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Hongbing Shen
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
- Nancy Diao
  - Environmental and Occupational Medicine and Epidemiology Program, Department of Environmental Health, Harvard School of Public Health, Harvard University, Boston, Massachusetts, United States of America
- Feng Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, China
7
Javaras KN, Hudson JI, Laird NM. Fitting ACE structural equation models to case-control family data. Genet Epidemiol 2010; 34:238-45. [PMID: 19918760 DOI: 10.1002/gepi.20454] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for members of families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contributions to the disease of additive genetic effects (A), shared family environment (C), and unique environment (E). We describe an ACE model for binary family data; this structural equation model, which has been described previously, combines a general-family extension of the classic ACE twin model with a (possibly covariate-specific) liability-threshold model for binary outcomes. We then introduce our contribution, a likelihood-based approach to fitting the model to singly ascertained case-control family data. The approach, which involves conditioning on the proband's disease status and also setting prevalence equal to a prespecified value that can be estimated from the data, makes it possible to obtain valid estimates of the A, C, and E variance components from case-control (rather than only from population-based) family data. In fact, simulation experiments suggest that our approach to fitting yields approximately unbiased estimates of the A, C, and E variance components, provided that certain commonly made assumptions hold. Further, when our approach is used to fit the ACE model to Austrian case-control family data on depression, the resulting estimate of heritability is very similar to those from previous analyses of twin data.
Affiliation(s)
- K N Javaras
  - Waisman Laboratory for Brain Imaging & Behavior, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA.
8
Yip BH, Reilly M, Cnattingius S, Pawitan Y. Matched ascertainment of informative families for complex genetic modelling. Behav Genet 2010; 40:404-14. [PMID: 20033275 PMCID: PMC2953624 DOI: 10.1007/s10519-009-9322-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/18/2008] [Accepted: 11/28/2009] [Indexed: 11/26/2022]
Abstract
Family data are used extensively in quantitative genetic studies to disentangle the genetic and environmental contributions to various diseases. Many family studies base their analyses on population-based registers containing a large number of individuals composed of small family units. For binary trait analyses, exact marginal likelihood is a common approach, but, due to the computational demand of the enormous data sets, it allows only a limited number of effects in the model. This makes it particularly difficult to perform joint estimation of variance components for a binary trait and the potential confounders. We have developed a data-reduction method of ascertaining informative families from population-based family registers. We propose a scheme where the ascertained families match the full cohort with respect to some relevant statistics, such as the risk to relatives of an affected individual. The ascertainment-adjusted analysis, which we implement using a pseudo-likelihood approach, is shown to be efficient relative to the analysis of the whole cohort and robust to mis-specification of the random effect distribution.
Affiliation(s)
- Benjamin H. Yip
  - Department of Psychiatry, University of Hong Kong, Hong Kong, China
- Marie Reilly
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Sven Cnattingius
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
- Yudi Pawitan
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
9
Pitkäniemi J, Varvio SL, Corander J, Lehti N, Partanen J, Tuomilehto-Wolf E, Tuomilehto J, Thomas A, Arjas E. Full likelihood analysis of genetic risk with variable age at onset disease--combining population-based registry data and demographic information. PLoS One 2009; 4:e6836. [PMID: 19718441 PMCID: PMC2730012 DOI: 10.1371/journal.pone.0006836] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/05/2008] [Accepted: 07/30/2009] [Indexed: 11/24/2022] Open
Abstract
Background: In genetic studies of rare complex diseases it is common to ascertain familial data from population-based registries through all incident cases diagnosed during a pre-defined enrollment period. Such an ascertainment procedure is typically taken into account in the statistical analysis of the familial data by constructing either a retrospective or prospective likelihood expression, which conditions on the ascertainment event. Both of these approaches lead to a substantial loss of valuable data. Methodology and Findings: Here we consider instead the possibilities provided by a Bayesian approach to risk analysis, which also incorporates the ascertainment procedure and reference information concerning the genetic composition of the target population into the considered statistical model. Furthermore, the proposed Bayesian hierarchical survival model does not require the considered genotype or haplotype effects to be expressed as functions of corresponding allelic effects. Our modeling strategy is illustrated by a risk analysis of type 1 diabetes mellitus (T1D) in the Finnish population, based on the HLA-A, HLA-B and DRB1 human leucocyte antigen (HLA) information available for both ascertained sibships and a large number of unrelated individuals from the Finnish bone marrow donor registry. The heterozygous genotype DR3/DR4 at the DRB1 locus was associated with the lowest predictive probability of T1D-free survival to the age of 15, the estimate being 0.936 (95% credible interval: 0.926 to 0.945), compared to the average population T1D-free survival probability of 0.995. Significance: The proposed statistical method can be adapted to other population-based family data ascertained from a disease registry, provided that the ascertainment process is well documented and that external information concerning the sizes of birth cohorts and a suitable reference sample are available. We confirm the earlier findings from the same data concerning the HLA-DR3/4 related risks for T1D, and also provide here estimated predictive probabilities of disease-free survival as a function of age.
Affiliation(s)
- Janne Pitkäniemi
  - Department of Public Health, University of Helsinki, Helsinki, Finland.
10
Pitkäniemi J, Moltchanova E, Haapala L, Harjutsalo V, Tuomilehto J, Hakulinen T. Genetic random effects model for family data with long-term survivors: analysis of diabetic nephropathy in type 1 diabetes. Genet Epidemiol 2008; 31:697-708. [PMID: 17487884 DOI: 10.1002/gepi.20234] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
A shared and additive genetic variance component-long-term survivor (LTS) model for familial aggregation studies of complex diseases with variable age-at-onset phenotype and non-susceptible subjects in the study cohort is proposed. LTS models have been used since the early 1970s, especially in epidemiological studies of cancer. The LTS model utilizes information on the age at onset (survival) distribution to make inference on partially latent susceptibility. Bayesian modeling with uninformative priors is used, and estimates of the posterior distribution of age at onset and susceptibility parameters of interest have been obtained using Bayesian Markov chain Monte Carlo (MCMC) methods with the OpenBUGS program. A simulation study confirms that we obtain posterior estimates of the model parameters on shared and genetic variance components of age at onset and susceptibility with good coverage rates. Further, we analyze familial aggregation of diabetic nephropathy (DN) in a large Finnish cohort of 528 sibships with type 1 diabetes (T1D). According to the estimated variance components, substantial familial variation in the susceptibility to DN exists among families, while time to DN is less influenced by shared familial factors.
Affiliation(s)
- Janne Pitkäniemi
  - Department of Public Health, University of Helsinki, Helsinki, Finland.
11
Ma J, Amos CI, Warwick Daw E. Ascertainment correction for Markov chain Monte Carlo segregation and linkage analysis of a quantitative trait. Genet Epidemiol 2007; 31:594-604. [PMID: 17487893 DOI: 10.1002/gepi.20231] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait in the spirit of the sequential sampling theory of Cannings and Thompson ([1977] Clin. Genet. 12:208-212). Using our simulated data, we detected no bias in estimates of the trait locus location. However, bias was found in the estimated allele frequencies and, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, also in the estimate of that mean. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci, and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple trait loci, provided that their number is fixed.
Affiliation(s)
- Jianzhong Ma
  - Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center, Houston, Texas 77005, USA
12
Bowden J, Thompson JR, Burton PR. A two-stage approach to the correction of ascertainment bias in complex genetic studies involving variance components. Ann Hum Genet 2007; 71:220-9. [PMID: 17354286 DOI: 10.1111/j.1469-1809.2006.00307.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
Correction for ascertainment bias is a vital part of the analysis of genetic epidemiology studies that needs to be undertaken whenever subjects are not recruited at random. Adjustment often requires extensive numerical integration, which can be very slow or even computationally infeasible, especially if the model includes many fixed and random effects. In this paper we propose a two-stage method for ascertainment bias correction. In the first stage we estimate parameters that pertain to the ascertained population, that is the population that would be selected into the sample if the ascertainment criterion were applied to everyone. In the second stage we convert the estimates for the ascertained population into general population parameter estimates. We illustrate the method with simulations based on a simple model and then describe how the method can be used with complex models. The two-stage approach avoids some of the integration required in direct adjustment, hence speeding up the process of model fitting.
13
Tobin MD, Raleigh SM, Newhouse S, Braund P, Bodycote C, Ogleby J, Cross D, Gracey J, Hayes S, Smith T, Ridge C, Caulfield M, Sheehan NA, Munroe PB, Burton PR, Samani NJ. Association of WNK1 gene polymorphisms and haplotypes with ambulatory blood pressure in the general population. Circulation 2005; 112:3423-9. [PMID: 16301342 DOI: 10.1161/circulationaha.105.555474] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.1] [Indexed: 11/16/2022]
Abstract
BACKGROUND: Blood pressure (BP) is a heritable trait of major public health concern. The WNK1 and WNK4 genes, which encode proteins in the WNK family of serine-threonine kinases, are involved in renal electrolyte homeostasis. Mutations in the WNK1 and WNK4 genes cause a rare monogenic hypertensive syndrome, pseudohypoaldosteronism type II. We investigated whether polymorphisms in these WNK genes influence BP in the general population. METHODS AND RESULTS: Associations between 9 single-nucleotide polymorphisms (SNPs) in WNK1 and 1 in WNK4 with ambulatory BP were studied in a population-based sample of 996 subjects from 250 white European families. The heritability estimates of mean 24-hour systolic BP (SBP) and diastolic BP (DBP) were 63.4% and 67.9%, respectively. We found statistically significant (P<0.05) associations of several common SNPs and haplotypes in WNK1 with mean 24-hour SBP and/or DBP. The minor allele (C) of rs880054, with a frequency of 44%, reduced mean 24-hour SBP and DBP by 1.37 (95% confidence interval, -2.45 to -0.23) and 1.14 (95% confidence interval, -1.93 to -0.38) mm Hg, respectively, per copy of the allele. CONCLUSIONS: Common variants in WNK1 contribute to BP variation in the general population. This study shows that a gene causing a rare monogenic form of hypertension also plays a significant role in BP regulation in the general population. The findings provide a basis to identify functional variants of WNK1, elucidate any interactions of these variants with dietary intake or with response to antihypertensive drugs, and determine their impact on cardiovascular morbidity and mortality.
Affiliation(s)
- Martin D Tobin
  - Department of Health Sciences, University of Leicester, Leicester, England
14
Abstract
This article is the first in a series of seven that will provide an overview of central concepts and topical issues in modern genetic epidemiology. In this article, we provide an overall framework for investigating the role of familial factors, especially genetic determinants, in the causation of complex diseases such as diabetes. The discrete steps of the framework to be outlined integrate the biological science underlying modern genetics and the population science underpinning mainstream epidemiology. In keeping with the broad readership of The Lancet and the diverse background of today's genetic epidemiologists, we provide introductory sections to equip readers with basic concepts and vocabulary. We anticipate that, depending on their professional background and specialist knowledge, some readers will wish to skip some of this article.
Affiliation(s)
- Paul R Burton
  - Department of Health Sciences, University of Leicester, Leicester, UK.
15
Scurrah K, Gurrin L, Palmer L, Burton P. Estimation of genetic and environmental factors for binary traits using family data by Y. Pawitan, M. Reilly, E. Nilsson, S. Cnattingius and P. Lichtenstein, Statistics in Medicine 2004;23:449–465. Stat Med 2005; 24:1613-7; author reply 1617-8. [PMID: 15880579 DOI: 10.1002/sim.2066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022]
16
Abstract
Nonrandom ascertainment is commonly used in genetic studies of rare diseases, since this design is often more convenient than the random-sampling design. When there is an underlying latent heterogeneity, Epstein et al. ([2002] Am. J. Hum. Genet. 70:886-895) showed that it is possible to get unbiased or consistent estimation of population parameters under ascertainment adjustment, but Glidden and Liang ([2002] Genet. Epidemiol. 23:201-208) showed in a simulation study that the resulting estimates are highly sensitive to misspecification of the latent components. To overcome this difficulty, we consider a heavy-tailed model for latent variables that allows a robust estimation of the parameters. We describe a hierarchical-likelihood approach that avoids the integration used in the standard marginal likelihood approach. We revisit and extend the previous simulation, and show that the resulting estimator is efficient and robust against misspecification of the distribution of latent variables.
Affiliation(s)
- Maengseok Noh
  - Department of Statistics, Seoul University, Seoul, Republic of Korea
17
Green MJ, Burton PR, Green LE, Schukken YH, Bradley AJ, Peeler EJ, Medley GF. The use of Markov chain Monte Carlo for analysis of correlated binary data: patterns of somatic cells in milk and the risk of clinical mastitis in dairy cows. Prev Vet Med 2004; 64:157-74. [PMID: 15325770 DOI: 10.1016/j.prevetmed.2004.05.006] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 07/08/2003] [Revised: 03/30/2004] [Accepted: 05/06/2004] [Indexed: 10/26/2022]
Abstract
Two analytical approaches were used to investigate the relationship between somatic cell concentrations in monthly quarter milk samples and subsequent, naturally occurring clinical mastitis in three dairy herds. Firstly, cows with clinical mastitis were selected and a conventional matched analysis was used to compare affected and unaffected quarters of the same cow. The second analysis included all cows, and in order to overcome potential bias associated with the correlation structure, a hierarchical Bayesian generalised linear mixed model was specified. A Markov chain Monte Carlo (MCMC) approach, that is Gibbs sampling, was used to estimate parameters. The results of both the matched analysis and the hierarchical modelling suggested that quarters with a somatic cell count (SCC) in the range 41,000-100,000 cells/ml had a lower risk of clinical mastitis during the next month than quarters with an SCC <41,000 cells/ml. Quarters with an SCC >200,000 cells/ml were at the greatest risk of clinical mastitis in the next month. There was a reduced risk of clinical mastitis between 1 and 2 months later in quarters with an SCC of 81,000-150,000 cells/ml compared with quarters below this level. The hierarchical modelling analysis identified a further reduced risk of clinical mastitis between 2 and 3 months later in quarters with an SCC of 61,000-150,000 cells/ml, compared to other quarters. We conclude that low concentrations of somatic cells in milk are associated with increased risk of clinical mastitis, and that high concentrations are indicative of pre-existing immunological mobilisation against infection. The variation in risk between quarters of affected cows suggests that local quarter immunological events, rather than solely whole cow factors, have an important influence on the risk of clinical mastitis. MCMC proved a useful tool for estimating parameters in a hierarchical Bernoulli model. Model construction and an approach to assessing goodness of model fit are described.
Affiliation(s)
- M J Green
  - Ecology and Epidemiology Group, Department of Biological Sciences, University of Warwick, Coventry CV4 7AL, UK.
18