1
MacNab YC. Revisiting Gaussian Markov random fields and Bayesian disease mapping. Stat Methods Med Res 2023; 32:207-225. [PMID: 36317373 PMCID: PMC9814028 DOI: 10.1177/09622802221129040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
We revisit several conditionally formulated Gaussian Markov random fields, known as the intrinsic conditional autoregressive model, the proper conditional autoregressive model, and the Leroux et al. conditional autoregressive model, as well as convolution models such as the well known Besag, York and Mollie model, its (adaptive) re-parameterization, and its scaled alternatives, for their roles of modelling underlying spatial risks in Bayesian disease mapping. Analytic and simulation studies, with graphic visualizations, and disease mapping case studies, present insights and critique on these models for their nature and capacities in characterizing spatial dependencies, local influences, and spatial covariance and correlation functions, and in facilitating stabilized and efficient posterior risk prediction and inference. It is illustrated that these models are Gaussian (Markov) random fields of different spatial dependence, local influence, and (covariance) correlation functions and can play different and complementary roles in Bayesian disease mapping applications.
Affiliation(s)
- Ying C MacNab
- School of Population and Public Health, University of British Columbia, Vancouver, Canada
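As a rough illustration of the three conditionally formulated models named in the abstract above, the sketch below builds their precision matrices from a binary adjacency matrix, using the commonly cited unit-scale forms. The function name `car_precisions`, the parameter names `rho` and `lam`, and the 4-area lattice are assumptions made for this example; they are not taken from the article.

```python
import numpy as np

def car_precisions(W, rho=0.9, lam=0.5):
    """Return ICAR, proper CAR, and Leroux precision matrices (unit scale).

    W   : symmetric 0/1 adjacency matrix with zero diagonal
    rho : spatial dependence parameter of the proper CAR (assumed |rho| < 1)
    lam : mixing parameter of the Leroux CAR (assumed 0 <= lam < 1)
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))      # number of neighbours of each area
    n = W.shape[0]
    icar = D - W                    # singular: every row sums to zero
    proper = D - rho * W            # positive definite when no area is isolated
    leroux = lam * (D - W) + (1.0 - lam) * np.eye(n)
    return icar, proper, leroux

# Hypothetical lattice of four areas arranged in a line: 1-2-3-4
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
icar, proper, leroux = car_precisions(W)
```

The ICAR precision is rank-deficient, which is why that model is improper, whereas the other two can be inverted to inspect the implied covariance and correlation between areas.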
2
Zhang J, Zhang YY, Tao J, Chen MH. Bayesian Item Response Theory Models With Flexible Generalized Logit Links. Appl Psychol Meas 2022; 46:382-405. [PMID: 35812812 PMCID: PMC9265488 DOI: 10.1177/01466216221089343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
In educational and psychological research, the logit and probit links are often used to fit the binary item response data. The appropriateness and importance of the choice of links within the item response theory (IRT) framework has not been investigated yet. In this paper, we present a family of IRT models with generalized logit links, which include the traditional logistic and normal ogive models as special cases. This family of models are flexible enough not only to adjust the item characteristic curve tail probability by two shape parameters but also to allow us to fit the same link or different links to different items within the IRT model framework. In addition, the proposed models are implemented in the Stan software to sample from the posterior distributions. Using readily available Stan outputs, the four Bayesian model selection criteria are computed for guiding the choice of the links within the IRT model framework. Extensive simulation studies are conducted to examine the empirical performance of the proposed models and the model fittings in terms of "in-sample" and "out-of-sample" predictions based on the deviance. Finally, a detailed analysis of the real reading assessment data is carried out to illustrate the proposed methodology.
Affiliation(s)
- Jiwei Zhang
- Faculty of Education, Northeast Normal University, Changchun, China
- Ying-Ying Zhang
- Department of Statistics and Actuarial Science, Chongqing University, Chongqing, China
- Jian Tao
- School of Mathematics and Statistics, Northeast Normal University, Changchun, China
- Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, CT, USA
3
Joo SH, Lee P, Stark S. Bayesian Approaches for Detecting Differential Item Functioning Using the Generalized Graded Unfolding Model. Appl Psychol Meas 2022; 46:98-115. [PMID: 35281341 PMCID: PMC8908411 DOI: 10.1177/01466216211066606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Differential item functioning (DIF) analysis is one of the most important applications of item response theory (IRT) in psychological assessment. This study examined the performance of two Bayesian DIF methods, Bayes factor (BF) and deviance information criterion (DIC), with the generalized graded unfolding model (GGUM). The Type I error and power were investigated in a Monte Carlo simulation that manipulated sample size, DIF source, DIF size, DIF location, subpopulation trait distribution, and type of baseline model. We also examined the performance of two likelihood-based methods, the likelihood ratio (LR) test and Akaike information criterion (AIC), using marginal maximum likelihood (MML) estimation for comparison with past DIF research. The results indicated that the proposed BF and DIC methods provided well-controlled Type I error and high power using a free-baseline model implementation; their performance was superior to LR and AIC in terms of Type I error rates when the reference and focal group trait distributions differed. The implications and recommendations for applied research are discussed.
4
Spineli LM. A Revised Framework to Evaluate the Consistency Assumption Globally in a Network of Interventions. Med Decis Making 2021; 42:637-648. [PMID: 34961377 PMCID: PMC9189723 DOI: 10.1177/0272989x211068005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Background: The unrelated mean effects (UME) model has been proposed for evaluating the consistency assumption globally in the network of interventions. However, the UME model does not accommodate multiarm trials properly and omits comparisons between nonbaseline interventions in the multiarm trials not investigated in 2-arm trials.
Methods: We proposed a refinement of the UME model that tackles the limitations mentioned above. We also accompanied the scatterplots on the posterior mean deviance contributions of the trial arms under the network meta-analysis (NMA) and UME models with Bland-Altman plots to detect outlying trials contributing to poor model fit. We applied the refined and original UME models to 2 networks with multiarm trials.
Results: The original UME model omitted more than 20% of the observed comparisons in both networks. The thorough inspection of the individual data points’ deviance contribution using complementary plots in conjunction with the measures of model fit and the estimated between-trial variance indicated that the refined and original UME models revealed possible inconsistency in both examples.
Conclusions: The refined UME model allows proper accommodation of the multiarm trials and visualization of all observed evidence in complex networks of interventions. Furthermore, considering several complementary plots to investigate deviance helps draw informed conclusions on the possibility of global inconsistency in the network.
Affiliation(s)
- Loukia M Spineli
- Midwifery Research and Education Unit, Hannover Medical School, Hannover, Germany
5
Fu Z, Zhang S, Su YH, Shi N, Tao J. A Gibbs sampler for the multidimensional four-parameter logistic item response model via a data augmentation scheme. Br J Math Stat Psychol 2021; 74:427-464. [PMID: 34002857 DOI: 10.1111/bmsp.12234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2019] [Revised: 12/30/2020] [Indexed: 06/12/2023]
Abstract
The four-parameter logistic (4PL) item response model, which includes an upper asymptote for the correct response probability, has drawn increasing interest due to its suitability for many practical scenarios. This paper proposes a new Gibbs sampling algorithm for estimation of the multidimensional 4PL model based on an efficient data augmentation scheme (DAGS). With the introduction of three continuous latent variables, the full conditional distributions are tractable, allowing easy implementation of a Gibbs sampler. Simulation studies are conducted to evaluate the proposed method and several popular alternatives. An empirical data set was analysed using the 4PL model to show its improved performance over the three-parameter and two-parameter logistic models. The proposed estimation scheme is easily accessible to practitioners through the open-source IRTlogit package.
Affiliation(s)
- Zhihui Fu
- Department of Statistics, School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, Fujian, China
- Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, Jilin, China
- Susu Zhang
- Departments of Psychology and Statistics, University of Illinois at Urbana-Champaign, IL, USA
- Ya-Hui Su
- Department of Psychology, National Chung Cheng University, Chiayi County, Taiwan
- Ningzhong Shi
- Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, Jilin, China
- Jian Tao
- Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, Jilin, China
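To make the role of the upper asymptote in the 4PL abstract above concrete, here is a minimal sketch of a multidimensional 4PL item response function. The parameterization (linear predictor `a·theta - b`, lower asymptote `c`, upper asymptote `d`) is a commonly used form and is an assumption here; the article and its IRTlogit package may parameterize the model differently, and the function name `irf_4pl` is hypothetical.

```python
import numpy as np

def irf_4pl(theta, a, b, c, d):
    """Probability of a correct response under an assumed 4PL form.

    theta : latent trait vector
    a     : discrimination vector, same length as theta
    b     : location (difficulty-type) term
    c, d  : lower and upper asymptotes, with 0 <= c < d <= 1
    """
    z = np.dot(a, theta) - b
    return c + (d - c) / (1.0 + np.exp(-z))

# Hypothetical item with two latent dimensions
p_mid = irf_4pl(np.array([0.0, 0.0]), np.array([1.0, 1.0]), 0.0, 0.2, 0.9)
```

Setting `d = 1` recovers the three-parameter model, and additionally setting `c = 0` recovers the two-parameter model, which are the comparison models mentioned in the abstract.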
6
Lu J, Zhang J, Zhang Z, Xu B, Tao J. A Novel and Highly Effective Bayesian Sampling Algorithm Based on the Auxiliary Variables to Estimate the Testlet Effect Models. Front Psychol 2021; 12:509575. [PMID: 34456774 PMCID: PMC8386915 DOI: 10.3389/fpsyg.2021.509575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2019] [Accepted: 07/06/2021] [Indexed: 11/29/2022]
Abstract
In this paper, a new two-parameter logistic testlet response theory model for dichotomous items is proposed by introducing testlet discrimination parameters to model the local dependence among items within a common testlet. In addition, a highly effective Bayesian sampling algorithm based on auxiliary variables is proposed to estimate the testlet effect models. The new algorithm not only avoids the tedious adjustment of tuning parameters that the Metropolis-Hastings algorithm requires to achieve an appropriate acceptance probability, but also overcomes the dependence of the Gibbs sampling algorithm on conjugate prior distributions. Compared with the traditional Bayesian estimation methods, the advantages of the new algorithm are analyzed under various types of prior distributions. Based on the Markov chain Monte Carlo (MCMC) output, two Bayesian model assessment methods are investigated concerning the goodness of fit between models. Finally, three simulation studies and an empirical example analysis are given to further illustrate the advantages of the new testlet effect model and Bayesian sampling algorithm.
Affiliation(s)
- Jing Lu
- Key Laboratory of Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
- Jiwei Zhang
- Key Lab of Statistical Modeling and Data Analysis of Yunnan Province, School of Mathematics and Statistics, Yunnan University, Kunming, China
- Zhaoyuan Zhang
- Department of Statistics, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
- Bao Xu
- Institute of Mathematics, Jilin Normal University, Siping, China
- Jian Tao
- Key Laboratory of Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun, China
7
MacNab YC. Bayesian estimation of multivariate Gaussian Markov random fields with constraint. Stat Med 2020; 39:4767-4788. [PMID: 32935375 DOI: 10.1002/sim.8752] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/27/2020] [Revised: 06/28/2020] [Accepted: 08/08/2020] [Indexed: 11/10/2022]
Abstract
This article concerns conditionally formulated multivariate Gaussian Markov random fields (MGMRF) for modeling multivariate local dependencies with unknown dependence parameters subject to a positivity constraint. In the context of Bayesian hierarchical modeling of lattice data in general and Bayesian disease mapping in particular, analytic and simulation studies provide new insights into various approaches to posterior estimation of dependence parameters under a "hard" or "soft" positivity constraint, including the well-known strict diagonal dominance criterion and options of hierarchical priors. Hierarchical centering is examined as a means to gain computational efficiency in Bayesian estimation of multivariate generalized linear mixed effects models in the presence of spatial confounding and weakly identified model parameters. Simulated data on irregular or regular lattices, and three datasets from the multivariate and spatiotemporal disease mapping literature, are used for illustration. The present investigation also sheds light on the use of the deviance information criterion for model comparison, choice, and interpretation in the context of posterior risk predictions judged by borrowing-information and bias-precision tradeoffs. The article concludes with a summary discussion and directions for future work. Potential applications of MGMRF in spatial information fusion and image analysis are briefly mentioned.
Affiliation(s)
- Ying C MacNab
- School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada
8
Ballesta P, Bush D, Silva FF, Mora F. Genomic Predictions Using Low-Density SNP Markers, Pedigree and GWAS Information: A Case Study with the Non-Model Species Eucalyptus cladocalyx. Plants (Basel) 2020; 9:E99. [PMID: 31941085 PMCID: PMC7020392 DOI: 10.3390/plants9010099] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/25/2019] [Revised: 12/20/2019] [Accepted: 01/09/2020] [Indexed: 11/16/2022]
Abstract
High-throughput genotyping techniques have enabled large-scale genomic analysis to precisely predict complex traits in many plant species. However, not all species can be well represented in commercial SNP (single nucleotide polymorphism) arrays. In this study, a high-density SNP array (60 K) developed for commercial Eucalyptus was used to genotype a breeding population of Eucalyptus cladocalyx, yielding only ~3.9 K informative SNPs. Traditional Bayesian genomic models were investigated to predict flowering, stem quality and growth traits by considering the following effects: (i) polygenic background and all informative markers (GS model) and (ii) polygenic background, QTL-genotype effects (determined by GWAS) and SNP markers that were not associated with any trait (GSq model). The estimates of pedigree-based heritability and genomic heritability varied from 0.08 to 0.34 and 0.002 to 0.5, respectively, whereas the predictive ability varied from 0.19 (GS) to 0.45 (GSq). The GSq approach outperformed GS models in terms of predictive ability when the proportion of the variance explained by the significant marker-trait associations was higher than that explained by the polygenic background and non-significant markers. This approach can be particularly useful for plant/tree species poorly represented in the high-density SNP arrays developed for economically important species, or when high-density marker panels are not available.
Affiliation(s)
- Paulina Ballesta
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
- David Bush
- CSIRO–Australian Tree Seed Centre, Acton 2601, Australia
- Fabyano Fonseca Silva
- Department of Animal Science, Universidade Federal de Viçosa, Viçosa 36570-900, Brazil
- Freddy Mora
- Institute of Biological Sciences, University of Talca, 2 Norte 685, Talca 3460000, Chile
9
Pooley CM, Marion G. Bayesian model evidence as a practical alternative to deviance information criterion. R Soc Open Sci 2018; 5:171519. [PMID: 29657762 PMCID: PMC5882686 DOI: 10.1098/rsos.171519] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/05/2017] [Accepted: 02/13/2018] [Indexed: 06/08/2023]
Abstract
While model evidence is considered by Bayesian statisticians as a gold standard for model selection (the ratio in model evidence between two models giving the Bayes factor), its calculation is often viewed as too computationally demanding for many applications. By contrast, the widely used deviance information criterion (DIC), a different measure that balances model accuracy against complexity, is commonly considered a much faster alternative. However, recent advances in computational tools for efficient multi-temperature Markov chain Monte Carlo algorithms, such as steppingstone sampling (SS) and thermodynamic integration schemes, enable efficient calculation of the Bayesian model evidence. This paper compares both the capability (i.e. ability to select the true model) and speed (i.e. CPU time to achieve a given accuracy) of DIC with model evidence calculated using SS. Three important model classes are considered: linear regression models, mixed models and compartmental models widely used in epidemiology. While DIC was found to correctly identify the true model when applied to linear regression models, it led to incorrect model choice in the other two cases. On the other hand, model evidence led to correct model choice in all cases considered. Importantly, and perhaps surprisingly, DIC and model evidence were found to run at similar computational speeds, a result reinforced by analytically derived expressions.
Affiliation(s)
- C. M. Pooley
- The Roslin Institute, The University of Edinburgh, Midlothian EH25 9RG, UK
- Biomathematics and Statistics Scotland, James Clerk Maxwell Building, The King's Buildings, Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK
- G. Marion
- Biomathematics and Statistics Scotland, James Clerk Maxwell Building, The King's Buildings, Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK
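Since the deviance information criterion recurs throughout the entries in this listing, a minimal sketch of how it is typically computed from MCMC output may help. It uses the standard definition, DIC = mean deviance + p_D, with p_D = mean deviance minus the deviance at the posterior mean. The toy normal-mean model, the simulated data, and the names `dic` and `normal_deviance` are assumptions for illustration and do not reproduce any analysis from the articles listed here.

```python
import numpy as np

def dic(deviance_fn, samples):
    """Return (DIC, p_D) from posterior draws of a scalar or vector parameter."""
    samples = np.asarray(samples, dtype=float)
    d_bar = np.mean([deviance_fn(s) for s in samples])  # posterior mean deviance
    d_hat = deviance_fn(samples.mean(axis=0))           # deviance at posterior mean
    p_d = d_bar - d_hat                                 # effective number of parameters
    return d_bar + p_d, p_d

# Toy example (hypothetical data): normal mean with known unit variance.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=1.0, size=50)

def normal_deviance(mu):
    # -2 * log-likelihood of y under N(mu, 1)
    return np.sum((y - mu) ** 2) + y.size * np.log(2.0 * np.pi)

# Under a flat prior the posterior of mu is N(mean(y), 1/n), so draw from it directly.
draws = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(y.size), size=5000)
dic_value, p_d = dic(normal_deviance, draws)
```

With a single free parameter, `p_d` should come out close to 1 in this toy setting. Computing the model evidence instead requires extra machinery such as the steppingstone sampling discussed in the abstract, which this sketch does not attempt.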
10
Gamado K, Marion G, Porphyre T. Data-Driven Risk Assessment from Small Scale Epidemics: Estimation and Model Choice for Spatio-Temporal Data with Application to a Classical Swine Fever Outbreak. Front Vet Sci 2017; 4:16. [PMID: 28293559 PMCID: PMC5329025 DOI: 10.3389/fvets.2017.00016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/01/2016] [Accepted: 01/30/2017] [Indexed: 11/30/2022]
Abstract
Livestock epidemics have the potential to give rise to significant economic, welfare, and social costs. Incursions of emerging and re-emerging pathogens may lead to small and repeated outbreaks. Analysis of the resulting data is statistically challenging but can inform disease preparedness, reducing potential future losses. We present a framework for spatial risk assessment of disease incursions based on data from small localized historic outbreaks. We focus on between-farm spread of livestock pathogens and illustrate our methods by application to data on the small outbreak of Classical Swine Fever (CSF) that occurred in 2000 in East Anglia, UK. We apply models based on continuous time semi-Markov processes, using data-augmentation Markov Chain Monte Carlo techniques within a Bayesian framework to infer disease dynamics and detection from incompletely observed outbreaks. The spatial transmission kernel describing pathogen spread between farms, and the distribution of times between infection and detection, are estimated alongside unobserved exposure times. Our results demonstrate inference is reliable even for relatively small outbreaks when the data-generating model is known. However, associated risk assessments depend strongly on the form of the fitted transmission kernel. Therefore, for real applications, methods are needed to select the most appropriate model in light of the data. We assess standard Deviance Information Criteria (DIC) model selection tools and recently introduced latent residual methods of model assessment, in selecting the functional form of the spatial transmission kernel. These methods are applied to the CSF data, and tested in simulated scenarios which represent field data, but assume the data generation mechanism is known. Analysis of simulated scenarios shows that latent residual methods enable reliable selection of the transmission kernel even for small outbreaks whereas the DIC is less reliable. Moreover, compared with DIC, model choice based on latent residual assessment correlated better with predicted risk.
Affiliation(s)
- Glenn Marion
- Biomathematics and Statistics Scotland, Edinburgh, UK
- Thibaud Porphyre
- Epidemiology Research Group, Center for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, UK; The Roslin Institute, University of Edinburgh, Easter Bush Campus, Edinburgh, UK
11
Abstract
This paper concerns multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressive models is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattices and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics.
Affiliation(s)
- Ying C MacNab
- Division of Epidemiology and Biostatistics, School of Population and Public Health, University of British Columbia, Vancouver, Canada
12
Abstract
Meta-analysis has been widely applied to rare adverse event data because it is very difficult to reliably detect the effect of a treatment on such events in an individual clinical study. However, it is known that standard meta-analysis methods are often biased, especially when the background incidence rate is very low. A recent work by Bhaumik et al. (2012) proposed new moment-based approaches under a natural random effects model, to improve estimation and testing of the treatment effect and the between-study heterogeneity parameter. It has been demonstrated that for rare binary events, their methods have superior performance to commonly-used meta-analysis methods. However, their comparison does not include any Bayesian methods, although Bayesian approaches are a natural and attractive choice under the random-effects model. In this paper, we study a Bayesian hierarchical approach to estimation and testing in meta-analysis of rare binary events using the random effects model in Bhaumik et al. (2012). We develop Bayesian estimators of the treatment effect and the heterogeneity parameter, as well as hypothesis testing methods based on Bayesian model selection procedures. We compare them with the existing methods through simulation. A data example is provided to illustrate the Bayesian approach as well.
Affiliation(s)
- Ou Bai
- Department of Statistical Science, Southern Methodist University
- Min Chen
- Department of Mathematical Sciences, University of Texas at Dallas
- Xinlei Wang
- Department of Statistical Science, Southern Methodist University
13
Jiao H, Zhang Y. Polytomous multilevel testlet models for testlet-based assessments with complex sampling designs. Br J Math Stat Psychol 2015; 68:65-83. [PMID: 24571376 DOI: 10.1111/bmsp.12035] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/05/2012] [Revised: 12/16/2013] [Indexed: 06/03/2023]
Abstract
Applications of standard item response theory models assume local independence of items and persons. This paper presents polytomous multilevel testlet models for dual dependence due to item and person clustering in testlet-based assessments with clustered samples. Simulation and survey data were analysed with a multilevel partial credit testlet model. This model was compared with three alternative models - a testlet partial credit model (PCM), multilevel PCM, and PCM - in terms of model parameter estimation. The results indicated that the deviance information criterion was the fit index that always correctly identified the true multilevel testlet model based on the quantified evidence in model selection, while the Akaike and Bayesian information criteria could not identify the true model. In general, the estimation model and the magnitude of item and person clustering impacted the estimation accuracy of ability parameters, while only the estimation model and the magnitude of item clustering affected the item parameter estimation accuracy. Furthermore, ignoring item clustering effects produced higher total errors in item parameter estimates but did not have much impact on the accuracy of ability parameter estimates, while ignoring person clustering effects yielded higher total errors in ability parameter estimates but did not have much effect on the accuracy of item parameter estimates. When both clustering effects were ignored in the PCM, item and ability parameter estimation accuracy was reduced.
Affiliation(s)
- Hong Jiao
- Measurement, Statistics and Evaluation, Department of Human Development and Quantitative Methodology, University of Maryland, College Park, USA
14
Abstract
A Bayesian hierarchical model is developed for count data with spatial and temporal correlations as well as excessive zeros, uneven sampling intensities, and inference on missing spots. Our contribution is to develop a model on zero-inflated count data that provides flexibility in modeling spatial patterns in a dynamic manner and also improves the computational efficiency via dimension reduction. The proposed methodology is of particular importance for studying species presence and abundance in the field of ecological sciences. The proposed model is employed in the analysis of the survey data by the Northeast Fisheries Sciences Center (NEFSC) for estimation and prediction of the Atlantic cod in the Gulf of Maine - Georges Bank region. Model comparisons based on the deviance information criterion and the log predictive score show the improvement by the proposed spatial-temporal model.
Affiliation(s)
- Xia Wang
- Department of Mathematical Sciences, University of Cincinnati, 2815 Commons Way Cincinnati, OH 45221-0025, USA
- Ming-Hui Chen
- Department of Statistics, University of Connecticut, 215 Glenbrook Road, U-4120 Storrs, CT 06269-4120, USA
- Rita C Kuo
- Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Dipak K Dey
- Department of Statistics, University of Connecticut, 215 Glenbrook Road, U-4120 Storrs, CT 06269-4120, USA
15
Revell LJ. Ancestral character estimation under the threshold model from quantitative genetics. Evolution 2013; 68:743-59. [PMID: 24152239 DOI: 10.1111/evo.12300] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Received: 12/15/2012] [Accepted: 10/13/2013] [Indexed: 10/26/2022]
Abstract
Evolutionary biology is a study of life's history on Earth. In researching this history, biologists are often interested in attempting to reconstruct phenotypes for the long extinct ancestors of living species. Various methods have been developed to do this on a phylogeny from the data for extant taxa. In the present article, I introduce a new approach for ancestral character estimation for discretely valued traits. This approach is based on the threshold model from evolutionary quantitative genetics. Under the threshold model, the value exhibited by an individual or species for a discrete character is determined by an underlying, unobserved continuous trait called "liability." In this new method for ancestral state reconstruction, I use Bayesian Markov chain Monte Carlo (MCMC) to sample the liabilities of ancestral and tip species, and the relative positions of two or more thresholds, from their joint posterior probability distribution. Using data simulated under the model, I find that the method has very good performance in ancestral character estimation. Use of the threshold model for ancestral state reconstruction relies on a priori specification of the order of the discrete character states along the liability axis. I test the use of a Bayesian MCMC information theoretic criterion based approach to choose among different hypothesized orderings for the discrete character. Finally, I apply the method to the evolution of feeding mode in centrarchid fishes.
Affiliation(s)
- Liam J Revell
- Department of Biology, University of Massachusetts Boston, Boston, Massachusetts, 02125.
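The threshold model described in the abstract above maps an unobserved continuous liability to an ordered discrete state according to where the liability falls relative to a set of thresholds. A minimal sketch of that mapping follows; the function name `liability_to_state` and the numeric values are hypothetical, and the MCMC sampling of liabilities and thresholds that the article performs is not shown.

```python
import numpy as np

def liability_to_state(liability, thresholds):
    """Map continuous liabilities to ordered discrete states 0..K.

    With K thresholds there are K + 1 states; a liability below the first
    threshold gives state 0, one between the first and second gives state 1,
    and so on.
    """
    return np.digitize(liability, np.sort(thresholds))

# Hypothetical liabilities for three species and two thresholds
states = liability_to_state(np.array([-1.2, 0.3, 2.5]), thresholds=[0.0, 1.0])
```

The order of the states along the liability axis is fixed by the thresholds, which is why the abstract notes that the ordering must be specified a priori.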
16
Abstract
Mixed-effects models have recently become popular for analyzing sparse longitudinal data that arise naturally in biological, agricultural and biomedical studies. Traditional approaches assume independent residuals over time and explain the longitudinal dependence by random effects. However, when bivariate or multivariate traits are measured longitudinally, this fundamental assumption is likely to be violated because of intertrait dependence over time. We provide a more general framework where the dependence of the observations from the same subject over time is not assumed to be explained completely by the random effects of the model. We propose a novel, mixed model-based approach and estimate the error-covariance structure nonparametrically under a generalized linear model framework. We use penalized splines to model the general effect of time, and we consider a Dirichlet process mixture of normal prior for the random-effects distribution. We analyze blood pressure data from the Framingham Heart Study where body mass index, gender and time are treated as covariates. We compare our method with traditional methods including parametric modeling of the random effects and independent residual errors over time. We conduct extensive simulation studies to investigate the practical usefulness of the proposed method. The current approach is very helpful in analyzing bivariate irregular longitudinal traits.
Affiliation(s)
- Kiranmoy Das
- Department of Statistics, Temple University, Philadelphia, PA 19122, U.S.A
|
17
|
Alegana VA, Atkinson PM, Wright JA, Kamwi R, Uusiku P, Katokele S, Snow RW, Noor AM. Estimation of malaria incidence in northern Namibia in 2009 using Bayesian conditional-autoregressive spatial-temporal models. Spat Spatiotemporal Epidemiol 2013; 7:25-36. [PMID: 24238079 PMCID: PMC3839406 DOI: 10.1016/j.sste.2013.09.001] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 01/23/2013] [Revised: 08/05/2013] [Accepted: 09/05/2013] [Indexed: 10/29/2022]
Abstract
As malaria transmission declines, it becomes increasingly important to monitor changes in malaria incidence rather than prevalence. Here, a spatio-temporal model was used to identify constituencies with high malaria incidence to guide malaria control. Malaria cases were assembled across all age groups along with several environmental covariates. A Bayesian conditional-autoregressive model was used to model the spatial and temporal variation of incidence after adjusting for test positivity rates and health facility utilisation. Of the 144,744 malaria cases recorded in Namibia in 2009, 134,851 were suspected and 9893 were parasitologically confirmed. The mean annual incidence based on the Bayesian model predictions was 13 cases per 1000 population with the highest incidence predicted for constituencies bordering Angola and Zambia. The smoothed maps of incidence highlight trends in disease incidence. For Namibia, the 2009 maps provide a baseline for monitoring the targets of pre-elimination.
Affiliation(s)
- Victor A Alegana
- Malaria Public Health Department, KEMRI-Wellcome Trust-University of Oxford Collaborative Programme, P.O. Box 43640, 00100 GPO Nairobi, Kenya; Centre for Geographical Health Research, Geography and Environment, University of Southampton, Highfield, Southampton SO17 1BJ, UK.
|
18
|
Boonstra PS, Mukherjee B, Taylor JMG, Nilbert M, Moreno VM, Gruber SB. Bayesian modeling for genetic anticipation in presence of mutational heterogeneity: a case study in Lynch syndrome. Biometrics 2011; 67:1627-37. [PMID: 21627626 PMCID: PMC3176998 DOI: 10.1111/j.1541-0420.2011.01607.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/04/2023]
Abstract
Genetic anticipation, described by earlier age of onset (AOO) and more aggressive symptoms in successive generations, is a phenomenon noted in certain hereditary diseases. Its extent may vary between families and/or between mutation subtypes known to be associated with the disease phenotype. In this article, we posit a Bayesian approach to infer genetic anticipation under flexible random effects models for censored data that capture the effect of successive generations on AOO. Primary interest lies in the random effects. Misspecifying the distribution of random effects may result in incorrect inferential conclusions. We compare the fit of four candidate random effects distributions via Bayesian model fit diagnostics. A related statistical issue here is isolating the confounding effect of changes in secular trends, screening, and medical practices that may affect time to disease detection across birth cohorts. Using historic cancer registry data, we borrow from relative survival analysis methods to adjust for changes in age-specific incidence across birth cohorts. Our motivating case study comes from a Danish cancer register of 124 families with mutations in mismatch repair (MMR) genes known to cause hereditary nonpolyposis colorectal cancer, also called Lynch syndrome (LS). We find evidence for a decrease in AOO between generations in this article. Our model predicts family-level anticipation effects that are potentially useful in genetic counseling clinics for high-risk families.
Affiliation(s)
- Bhramar Mukherjee
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Mef Nilbert
- Clinical Research Centre, Copenhagen University Hospital, Hvidovre, Denmark
- Victor M. Moreno
- Cancer Prevention and Control Program, Catalan Institute of Oncology, IDIBELL, Barcelona, Spain
- Department of Clinical Sciences, School of Medicine, University of Barcelona, Spain
- Stephen B. Gruber
- Departments of Internal Medicine, Epidemiology and Human Genetics, University of Michigan, Ann Arbor, MI, USA
|
19
|
White LJ, Buttery J, Cooper B, Nokes DJ, Medley GF. Rotavirus within day care centres in Oxfordshire, UK: characterization of partial immunity. J R Soc Interface 2008; 5:1481-90. [PMID: 18477541 PMCID: PMC2475553 DOI: 10.1098/rsif.2008.0115] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 03/20/2008] [Revised: 04/22/2008] [Accepted: 04/22/2008] [Indexed: 12/04/2022] Open
Abstract
Repeated measures data for rotavirus infection in children within 14 day care centres (DCCs) in the Oxfordshire area, UK, are used to explore aspects of rotavirus transmission and immunity. A biologically realistic model for the transmission of infection is presented as a set of probability models suitable for application to the data. Two transition events are modelled separately: incidence and recovery. The complexity of the underlying mechanistic model is reflected in the choice of the fixed variables in the probability models. Parameter estimation was carried out using a Bayesian Markov chain Monte Carlo method. We use the parameter estimates obtained to build a profile of the natural history of rotavirus reinfection in an individual child. We infer that rotavirus transmission in children in DCCs is dependent on the DCC prevalence, with symptomatic infection of longer duration, but no more infectious per day of infectious period, than asymptomatic infection. There was evidence that a recent previous infection reduces the risk of disease and, to a lesser extent, reinfection, but not duration of infection. The results provide evidence that partial immunity to rotavirus infection develops over several time scales.
Affiliation(s)
- L J White
- Mahidol Oxford Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok 10400, Thailand.
|
20
|
Ibrahim JG, Chen MH, Kim S. Bayesian variable selection for the Cox regression model with missing covariates. Lifetime Data Anal 2008; 14:496-520. [PMID: 18836829 PMCID: PMC2858597 DOI: 10.1007/s10985-008-9101-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/22/2008] [Accepted: 09/10/2008] [Indexed: 05/26/2023]
Abstract
In this paper, we develop Bayesian methodology and computational algorithms for variable subset selection in Cox proportional hazards models with missing covariate data. A new joint semi-conjugate prior for the piecewise exponential model is proposed in the presence of missing covariates and its properties are examined. The covariates are assumed to be missing at random (MAR). Under this new prior, a version of the Deviance Information Criterion (DIC) is proposed for Bayesian variable subset selection in the presence of missing covariates. Monte Carlo methods are developed for computing the DICs for all possible subset models in the model space. A Bone Marrow Transplant (BMT) dataset is used to illustrate the proposed methodology.
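For orientation, the commonly used form of the Deviance Information Criterion can be written as below. This is the standard definition, given here as background only; the version proposed in the paper for missing covariates is not stated in the abstract and may differ. The symbols $y$ (observed data), $\theta$ (model parameters), and $\bar{\theta}$ (posterior mean of $\theta$) are notation assumed here.

```latex
\begin{aligned}
D(\theta) &= -2 \log p(y \mid \theta), \\
p_D &= \overline{D(\theta)} - D(\bar{\theta}), \\
\mathrm{DIC} &= D(\bar{\theta}) + 2\,p_D ,
\end{aligned}
```

Here $\overline{D(\theta)}$ is the posterior mean of the deviance, typically approximated by averaging $D(\theta)$ over MCMC draws, and $p_D$ acts as an effective number of parameters. Among candidate subset models, smaller DIC values are preferred.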
Affiliation(s)
- Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Ming-Hui Chen
- Department of Statistics, University of Connecticut, Storrs, CT 06269, USA
- Sungduk Kim
- Division of Epidemiology, Statistics and Prevention Research, National Institute of Child Health and Human Development, NIH, Rockville, MD 20852, USA
|