201. Kurz CF, Stafford S. Isolating cost drivers in interstitial lung disease treatment using nonparametric Bayesian methods. Biom J 2020;62:1896-1908. PMID: 32954516; DOI: 10.1002/bimj.202000076. Citations in RCA: 1.
Abstract
Mixture modeling is a popular approach to accommodate overdispersion, skewness, and multimodality features that are very common for health care utilization data. However, mixture modeling tends to rely on subjective judgment regarding the appropriate number of mixture components or some hypothesis about how to cluster the data. In this work, we adopt a nonparametric, variational Bayesian approach to allow the model to select the number of components while estimating their parameters. Our model allows for a probabilistic classification of observations into clusters and simultaneous estimation of a Gaussian regression model within each cluster. When we apply this approach to data on patients with interstitial lung disease, we find distinct subgroups of patients with differences in means and variances of health care costs, health and treatment covariates, and relationships between covariates and costs. The subgroups identified are readily interpretable, suggesting that this nonparametric variational approach to inference can discover valid insights into the factors driving treatment costs. Moreover, the learning algorithm we employed is very fast and scalable, which should make the technique accessible for a broad range of applications.
202. Galimberti M, Leuenberger C, Wolf B, Szilágyi SM, Foll M, Wegmann D. Detecting Selection from Linked Sites Using an F-Model. Genetics 2020;216:1205-1215. PMID: 33067324; PMCID: PMC7768260; DOI: 10.1534/genetics.120.303780. Citations in RCA: 4.
Abstract
Allele frequencies vary across populations and loci, even in the presence of migration. While most differences may be due to genetic drift, divergent selection will further increase differentiation at some loci. Identifying those loci is key to studying local adaptation but remains statistically challenging. A particularly elegant way to describe allele frequency differences among populations connected by migration is the F-model, which measures differences in allele frequencies by population-specific FST coefficients. This model readily accounts for multiple evolutionary forces by partitioning FST coefficients into locus- and population-specific components reflecting selection and drift, respectively. Here we present an extension of this model to linked loci by means of a hidden Markov model (HMM), which characterizes the effect of selection on linked markers through correlations in the locus-specific component along the genome. Using extensive simulations, we show that the statistical power of our method is up to twofold higher than that of previous implementations that assume sites to be independent. Finally, we find evidence of selection in the human genome by applying our method to data from the Human Genome Diversity Project (HGDP).
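The F-model's core assumption can be sketched with the standard Balding–Nichols beta parameterization, in which a population-specific FST governs how far each population's allele frequency drifts from the ancestral frequency. This is a minimal stdlib-Python illustration of the model family, not the authors' HMM implementation; the function and parameter names are ours.

```python
import random

def sample_population_freqs(p_anc, fst, n_pops, rng=None):
    """Balding-Nichols sketch of the F-model: each population's allele
    frequency is drawn from Beta(p(1-F)/F, (1-p)(1-F)/F), so draws are
    centred on the ancestral frequency p with spread governed by F."""
    rng = rng or random.Random()
    a = p_anc * (1.0 - fst) / fst
    b = (1.0 - p_anc) * (1.0 - fst) / fst
    return [rng.betavariate(a, b) for _ in range(n_pops)]

# five populations diverging from an ancestral frequency of 0.3 at FST = 0.05
freqs = sample_population_freqs(p_anc=0.3, fst=0.05, n_pops=5)
```

Larger FST values widen the beta distribution, producing stronger differentiation among populations at that locus.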
203. Garcia Barrado L, Burzykowski T. Bayesian biomarker-driven outcome-adaptive randomization with an imperfect biomarker assay. Clin Trials 2020;18:137-146. PMID: 33231131; DOI: 10.1177/1740774520964202. Citations in RCA: 0.
Abstract
OBJECTIVE: We investigate the impact of the accuracy of a biomarker assay on the operating characteristics of a Bayesian biomarker-driven outcome-adaptive randomization design. METHODS: In a simulation study, we assume a trial with two treatments, two biomarker-based strata, and a binary clinical outcome (response). Pbt denotes the probability of response for treatment t (t = 0 or 1) in biomarker stratum b (b = 0 or 1). Four scenarios of true underlying response probabilities are considered: a null (P00 = P01 = 0.25, P10 = P11 = 0.25) and a consistent (P00 = P10 = 0.25, P01 = 0.5) treatment-effect scenario, as well as a quantitative (P00 = P01 = P10 = 0.25, P11 = 0.5) and a qualitative (P00 = P11 = 0.5, P01 = P10 = 0.25) stratum-treatment interaction. For each scenario, we compare the case of a perfect biomarker assay with that of an imperfect assay with sensitivity of 0.8 and specificity of 0.7. In addition, biomarker-positive prevalence values P(B = 1) = 0.2 and 0.5 are investigated. RESULTS: The use of an imperfect assay affects the operating characteristics of the Bayesian biomarker-based outcome-adaptive randomization design. In particular, the misclassification causes a substantial reduction in power accompanied by a considerable increase in the type-I error probability. The magnitude of these effects depends on the sensitivity and specificity of the assay, as well as on the distribution of the biomarker in the patient population. CONCLUSION: With an imperfect biomarker assay, the decision to apply a biomarker-based outcome-adaptive randomization design may require careful reflection.
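The attenuation the authors report can be seen in a small deterministic calculation: mixing the true biomarker strata according to assay sensitivity and specificity shrinks the apparent response difference between the observed (assay-defined) strata. A sketch under the abstract's quantitative-interaction scenario; the function name is ours.

```python
def observed_stratum_response(p0t, p1t, prev, sens, spec):
    """Response probability within assay-positive / assay-negative strata
    when the assay misclassifies. p0t, p1t: true response probabilities in
    biomarker-negative / -positive patients; prev = P(B = 1)."""
    pos_true = prev * sens                   # truly positive, classified positive
    pos_false = (1.0 - prev) * (1.0 - spec)  # truly negative, classified positive
    neg_true = (1.0 - prev) * spec           # truly negative, classified negative
    neg_false = prev * (1.0 - sens)          # truly positive, classified negative
    p_pos = (pos_true * p1t + pos_false * p0t) / (pos_true + pos_false)
    p_neg = (neg_true * p0t + neg_false * p1t) / (neg_true + neg_false)
    return p_pos, p_neg

# quantitative-interaction scenario for treatment t = 1: true rates 0.5 vs 0.25
p_pos, p_neg = observed_stratum_response(0.25, 0.5, prev=0.5, sens=0.8, spec=0.7)
# the observed-stratum gap is narrower than the true 0.25 difference
```

With a perfect assay (sens = spec = 1) the true rates are recovered exactly; any misclassification pulls the two observed strata toward each other, which is what erodes power in the adaptive design.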
204. Depaoli S, Winter SD, Visser M. The Importance of Prior Sensitivity Analysis in Bayesian Statistics: Demonstrations Using an Interactive Shiny App. Front Psychol 2020;11:608045. PMID: 33324306; PMCID: PMC7721677; DOI: 10.3389/fpsyg.2020.608045. Citations in RCA: 27.
Abstract
The current paper highlights a new, interactive Shiny App that can be used to aid in understanding and teaching the important task of conducting a prior sensitivity analysis when implementing Bayesian estimation methods. In this paper, we discuss the importance of examining prior distributions through a sensitivity analysis. We argue that conducting a prior sensitivity analysis is equally important when so-called diffuse priors are implemented as it is with subjective priors. As a proof of concept, we conducted a small simulation study, which illustrates the impact of priors on final model estimates. The findings from the simulation study highlight the importance of conducting a sensitivity analysis of priors. This concept is further extended through an interactive Shiny App that we developed. The Shiny App allows users to explore the impact of various forms of priors using empirical data. We introduce this Shiny App and thoroughly detail an example using a simple multiple regression model that users at all levels can understand. In this paper, we highlight how to determine the different settings for a prior sensitivity analysis, how to visually and statistically compare results obtained in the sensitivity analysis, and how to display findings and write up disparate results obtained across the sensitivity analysis. The goal is that novice users can follow the process outlined here and work within the interactive Shiny App to gain a deeper understanding of the role of prior distributions and the importance of a sensitivity analysis when implementing Bayesian methods. The intended audience is broad (e.g., undergraduate or graduate students, faculty, and other researchers) and can include those with limited exposure to Bayesian methods or the specific model presented here.
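For a conjugate toy model, the effect a prior sensitivity analysis is meant to expose can be computed in closed form: the posterior mean of a normal mean is a precision-weighted average of the prior mean and the sample mean. A minimal sketch, not the authors' Shiny App; all numbers are illustrative.

```python
def posterior_mean_normal(ybar, n, sigma2, mu0, tau2):
    """Posterior mean for a normal mean with known data variance sigma2
    and a N(mu0, tau2) prior: a precision-weighted average of the
    sample mean and the prior mean."""
    w_data = n / sigma2       # data precision
    w_prior = 1.0 / tau2      # prior precision
    return (w_data * ybar + w_prior * mu0) / (w_data + w_prior)

ybar, n, sigma2 = 2.0, 25, 4.0
diffuse = posterior_mean_normal(ybar, n, sigma2, mu0=0.0, tau2=1000.0)
informative = posterior_mean_normal(ybar, n, sigma2, mu0=0.0, tau2=0.01)
# a diffuse prior leaves the estimate near ybar; a tight prior at 0 pulls it down
```

Re-running a model over a grid of such prior settings, and reporting how much the estimates move, is exactly the kind of sensitivity analysis the paper argues should accompany both diffuse and subjective priors.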
205. Alquier P. Approximate Bayesian Inference. Entropy 2020;22:e22111272. PMID: 33287041; PMCID: PMC7711853; DOI: 10.3390/e22111272. Citations in RCA: 4.
Abstract
This is the Editorial article summarizing the scope of the Special Issue: Approximate Bayesian Inference.
206. Vaiente MA, Scotch M. Going back to the roots: Evaluating Bayesian phylogeographic models with discrete trait uncertainty. Infect Genet Evol 2020;85:104501. PMID: 32798768; PMCID: PMC7686256; DOI: 10.1016/j.meegid.2020.104501. Citations in RCA: 2.
Abstract
Phylogeography is a popular way to analyze virus sequences annotated with discrete, epidemiologically relevant trait data. For applied public health surveillance, a key quantity of interest is often the state at the root of the inferred phylogeny. In epidemiological terms, this represents the geographic origin of the observed outbreak. Since determining the origin of an outbreak is often critical for public health intervention, it is prudent to understand how well phylogeographic models perform this root state classification task under various analytical scenarios. Specifically, we investigate how the size of the discrete state space and of the sequence data set influences root state classification accuracy. We performed phylogeographic inference on several simulated DNA data sets while i) increasing the number of sequences and ii) increasing the total number of possible discrete trait values. We show that phylogeographic models tend to perform best at intermediate sequence data set sizes. Further, we demonstrate that a popular metric used for evaluation of phylogeographic models, the Kullback-Leibler (KL) divergence, increases with both the size of the discrete state space and the size of the data set. Moreover, by modeling phylogeographic root state classification accuracy using logistic regression, we show that KL divergence is not supported as a predictor of model accuracy, indicating its limited utility for assessing phylogeographic model performance on empirical data. These results suggest that relying solely on the KL metric may lead to artificially inflated support for models with finer discretization schemes and larger data set sizes. These results will be important for public health practitioners seeking to use phylogeographic models for applied infectious disease surveillance.
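The KL divergence discussed above, here taken between a root-state posterior and a uniform prior over the discrete states, is straightforward to compute; the example values are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# hypothetical root-state posterior over three locations vs a uniform prior
posterior = [0.7, 0.2, 0.1]
uniform = [1.0 / 3.0] * 3
kl = kl_divergence(posterior, uniform)
```

Note that simply enlarging the state space inflates this number even when classification accuracy does not improve, which is the pitfall the paper documents.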
207. Bürkner PC, Charpentier E. Modelling monotonic effects of ordinal predictors in Bayesian regression models. Br J Math Stat Psychol 2020;73:420-451. PMID: 31943157; DOI: 10.1111/bmsp.12195. Citations in RCA: 38.
Abstract
Ordinal predictors are commonly used in regression models. They are often incorrectly treated as either nominal or metric, thus under- or overestimating the information contained. Such practices may lead to worse inference and predictions compared to methods which are specifically designed for this purpose. We propose a new method for modelling ordinal predictors that applies in situations in which it is reasonable to assume their effects to be monotonic. The parameterization of such monotonic effects is realized in terms of a scale parameter b representing the direction and size of the effect and a simplex parameter ς modelling the normalized differences between categories. This ensures that predictions increase or decrease monotonically, while changes between adjacent categories may vary across categories. This formulation generalizes to interaction terms as well as multilevel structures. Monotonic effects may be applied not only to ordinal predictors, but also to other discrete variables for which a monotonic relationship is plausible. In simulation studies we show that the model is well calibrated and, if there is monotonicity present, exhibits predictive performance similar to or even better than other approaches designed to handle ordinal predictors. Using Stan, we developed a Bayesian estimation method for monotonic effects which allows us to incorporate prior information and to check the assumption of monotonicity. We have implemented this method in the R package brms, so that fitting monotonic effects in a fully Bayesian framework is now straightforward.
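The parameterization described, a scale parameter times a cumulative sum over a simplex, can be sketched as follows. We take b to be the average difference between adjacent categories (hence the factor D, the number of category steps), which matches the paper's description as we read it; treat the exact scaling as an assumption rather than the package's definitive formula.

```python
def monotonic_effect(x, b, zeta):
    """Monotonic effect of an ordinal predictor x in 0..D, with D = len(zeta)
    category steps: b * D * (sum of the first x simplex entries), so b acts
    as the average difference between adjacent categories."""
    assert abs(sum(zeta) - 1.0) < 1e-9, "zeta must be a simplex"
    D = len(zeta)
    return b * D * sum(zeta[:x])

# effect rises quickly then flattens, but never decreases (for b > 0)
effects = [monotonic_effect(x, b=2.0, zeta=[0.5, 0.3, 0.2]) for x in range(4)]
```

Because zeta is constrained to a simplex, predictions are guaranteed monotone in x while each step size is free to differ, which is precisely the middle ground between nominal and metric coding.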
208. Iwata K, Morishita N, Nishiwaki M, Miyakoshi C. Use of Rifampin Compared with Isoniazid for the Treatment of Latent Tuberculosis Infection in Japan: A Bayesian Inference with Markov Chain Monte Carlo Method. Intern Med 2020;59:2687-2691. PMID: 32669488; PMCID: PMC7691023; DOI: 10.2169/internalmedicine.3477-19. Citations in RCA: 2.
Abstract
Objective: Treating latent tuberculosis infection (LTBI) is essential for eliminating the serious endemicity of tuberculosis. A shorter regimen is preferred to longer regimens because the former has better adherence with a better safety profile. However, lengthy treatment with isoniazid is still recommended in Japan. Based on the latest evidence, we switched from a conventional nine-month isoniazid regimen to a shorter four-month rifampin regimen for the treatment of LTBI. Methods: To evaluate the safety and efficacy of the shorter regimen, we conducted Bayesian analyses using a stochastic mathematical model to calculate the posterior probabilities of several parameters. Patients: Clinical data of 13 patients in the isoniazid group and 5 in the rifampin group were used for the Bayesian analyses. The outcomes measured were completion of the treatment, adverse effects, number of clinic visits, and medical costs. Results: The median posterior probability of the isoniazid group completing the treatment was 66% [95% credible interval (CrI) 43-89%], whereas that of the rifampin group was 86% (95% CrI 60-100%). The probability that the completion rate in the rifampin group was better than that in the isoniazid group was as high as 88% (95% CrI 0-100%). Other parameters, such as the number of clinic visits and duration of treatment, were better with rifampin therapy than with isoniazid therapy, with comparable medical costs. Conclusion: Four months of rifampin therapy might be preferred to isoniazid for treating LTBI in Japan.
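The comparison of two completion rates can be reproduced in spirit with independent Beta posteriors and Monte Carlo sampling. The completion counts below are hypothetical stand-ins, not the paper's data.

```python
import random

def prob_first_beats_second(s1, n1, s2, n2, draws=20000, seed=1):
    """Monte Carlo estimate of P(p1 > p2) under independent Beta(1+s, 1+n-s)
    posteriors (uniform priors) for two completion probabilities."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p1 = rng.betavariate(1 + s1, 1 + n1 - s1)
        p2 = rng.betavariate(1 + s2, 1 + n2 - s2)
        wins += p1 > p2
    return wins / draws

# hypothetical counts: 5/5 completions in one group vs 9/13 in the other
p_better = prob_first_beats_second(s1=5, n1=5, s2=9, n2=13)
```

This is the Bayesian analogue of the paper's "probability that the completion rate in one group is better than in the other", obtained directly from posterior draws rather than a p-value.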
209. Ferreira D, Ludes PO, Diemunsch P, Noll E, Torp KD, Meyer N. Bayesian predictive probabilities: a good way to monitor clinical trials. Br J Anaesth 2020;126:550-555. PMID: 33129491; DOI: 10.1016/j.bja.2020.08.062. Citations in RCA: 1.
Abstract
BACKGROUND: Bayesian methods, with the predictive probability (PredP), allow multiple interim analyses with interim posterior probability (PostP) computation, without the need to correct for multiple looks at the data. The objective of this paper was to illustrate the use of PredP by simulating a sequential analysis of a clinical trial. METHODS: We used data from the Laryngobloc trial, which planned to include 480 patients to demonstrate the equivalence of success between a laryngoscopy performed with the Laryngobloc® device and a control device. A crossover Bayesian design was used. The success rates of the two laryngoscopy devices were compared. Interim analyses, computed from random numbers of subjects, were simulated. RESULTS: The PostP of equivalence rapidly reached the predefined bound of 0.95. The PredP computed with an equivalence margin of 10% reached the efficacy bound between 352 and 409 of the 480 included patients. If a frequentist analysis had been performed on the basis of 217 of the 480 subjects, the study would have been prematurely stopped for equivalence. The PredP indicated that this result was nonetheless unstable and that equivalence was, thus far, not guaranteed. CONCLUSIONS: Based on these interim analyses, we can conclude with a sufficiently high probability that equivalence would have been met on the primary outcome before the predetermined end of this particular trial. A Bayesian approach using PredP would have allowed early termination of the trial, reducing the calculated sample size by 15-20%.
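A predictive probability of the kind described is computable in closed form for a binary endpoint: average the chance of eventually meeting a success criterion over the beta-binomial distribution of the remaining patients' outcomes. This sketch uses a simple single-arm success criterion, not the trial's crossover equivalence design; all names and numbers are ours.

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(y, m, a, b):
    """P(Y = y successes in m future trials) when p ~ Beta(a, b)."""
    return math.exp(math.log(math.comb(m, y))
                    + log_beta(a + y, b + m - y) - log_beta(a, b))

def predictive_prob_success(s, f, n_total, s_needed, a=1.0, b=1.0):
    """Predictive probability that total successes reach s_needed once the
    remaining n_total - s - f patients are observed (Beta(a, b) prior)."""
    m = n_total - s - f
    post_a, post_b = a + s, b + f      # interim posterior after s successes, f failures
    need = max(0, s_needed - s)
    return sum(beta_binomial_pmf(y, m, post_a, post_b) for y in range(need, m + 1))
```

Monitoring this quantity at each interim look, and stopping only when it is very high or very low, is what distinguishes PredP monitoring from a naive look at the interim posterior alone.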
210. Cortegiani A, Absalom AR. Importance of proper conduct of clinical trials. Br J Anaesth 2020;126:354-356. PMID: 33121749; DOI: 10.1016/j.bja.2020.09.030. Citations in RCA: 1.
211. Executive Functions Are Associated with Fall Risk but not Balance in Chronic Cerebrovascular Disease. J Clin Med 2020;9:jcm9113405. PMID: 33114243; PMCID: PMC7690867; DOI: 10.3390/jcm9113405. Citations in RCA: 3.
Abstract
BACKGROUND: Deficits in executive functions (EF) in older people have been shown to lead to higher fall risk, greater postural sway, and reduced speed. Crucially, EF impairments are even more pronounced in individuals with chronic cerebrovascular disease (CVD), namely vascular cognitive impairment. METHODS: In this retrospective cross-sectional study, we used a complete neuropsychological battery, including the Trail Making Test (TMT), together with physical measures such as the Morse fall and EQUI scales, to assess 66 individuals with chronic CVD. Linear regressions, Bayesian analyses, and model selection were performed to assess the impact of EF, global cognition, and vascular parkinsonism/hemiplegia on the physical measures (fall risk and balance). RESULTS: TMT parts B and B-A correlated with the Morse fall scale (r = 0.44 and r = 0.45, respectively). Only EF significantly explained fall risk, whereas global cognition and vascular parkinsonism/hemiplegia did not. These findings were confirmed by Bayesian evidence and parsimony model selection. Balance was not significantly correlated with any of the neuropsychological tests. CONCLUSIONS: This is the first study investigating the relationship between cognitive and physical measures in a sample of older people with chronic CVD. The results are consistent with previous findings that link EF with fall risk in CVD.
212. Jamal R, Mubarak S, Sahulka SQ, Kori JA, Tajammul A, Ahmed J, Mahar RB, Olsen MS, Goel R, Weidhaas J. Informing water distribution line rehabilitation through quantitative microbial risk assessment. Sci Total Environ 2020;739:140021. PMID: 32758946; DOI: 10.1016/j.scitotenv.2020.140021. Citations in RCA: 1.
Abstract
Poor urban water quality has been linked to diminished source water quality, poorly functioning water treatment systems, and infiltration into distribution lines after treatment, resulting in microbiological contamination. With limited funding to rehabilitate distribution lines, developing nations need tools to identify the areas of greatest concern to human health so as to target cost-effective remediation approaches. Herein, a case study of Hyderabad, Pakistan was used to demonstrate the efficacy of combining quantitative microbial risk assessment (QMRA) for multiple pathogens with spatial distribution system modeling to identify areas for pipe rehabilitation. Abundances of Escherichia coli, Enterococcus (enterococci), Salmonella spp., Shigella spp., Giardia intestinalis, Vibrio cholerae, norovirus GI, and adenovirus 40/41 were determined at 85 locations, including the source water, treatment plant effluent, and the city distribution lines. Bayesian statistics and Monte Carlo simulations were used in the QMRA to account for left-censored microbial abundance distributions. Bacterial and viral abundances in the distribution system samples decreased as follows: 9400 ± 19,800 norovirus gene copies/100 mL (average ± standard deviation, 100% of samples positive); 340 ± 2200 enterococci CFU/100 mL (94%); 71 ± 97 Shigella sp. CFU/100 mL (97%); 60 ± 360 E. coli CFU/100 mL (89%); 35 ± 79 adenovirus gene copies/100 mL (100%); and 21 ± 46 Salmonella sp. CFU/100 mL (76%). The QMRA revealed unacceptable probabilities of illness (>1 in 10,000 illness level) from the four exposure routes considered (drinking water, or only showering, tooth brushing, and rinsing vegetables consumed raw). Disease severity indices based on the QMRA, combined with mapping of the distribution system, revealed areas for targeted rehabilitation. This combined intensive sampling, risk assessment, and mapping can be used in low- and middle-income countries to target distribution system rehabilitation efforts and improve health outcomes.
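The Monte Carlo QMRA machinery referred to above can be sketched for a single pathogen and exposure route: sample a concentration, apply a dose-response model, and compound daily risks into an annual risk. All parameter values below are illustrative, and the exponential dose-response form is one common choice, not necessarily the one used for each pathogen in the study.

```python
import math
import random

def annual_infection_risk(mu_log, sd_log, r, volume_l, days=365,
                          draws=10000, seed=7):
    """Monte Carlo QMRA sketch: lognormal pathogen concentration (per litre),
    exponential dose-response P(inf) = 1 - exp(-r * dose) for one day's
    intake, compounded into an annual risk over `days` exposures."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        conc = rng.lognormvariate(mu_log, sd_log)      # organisms per litre
        p_day = 1.0 - math.exp(-r * conc * volume_l)   # daily infection risk
        total += 1.0 - (1.0 - p_day) ** days           # annual risk
    return total / draws

risk = annual_infection_risk(mu_log=-2.0, sd_log=1.0, r=0.5, volume_l=1.0)
```

In the paper, the concentration distributions are additionally fitted with Bayesian methods to handle non-detects (left-censoring), which this sketch omits.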
213. Baldi P, Shahbaba B. Bayesian Causality. Am Stat 2020;74:249-257. PMID: 33041343; DOI: 10.1080/00031305.2019.1647876. Citations in RCA: 0.
Abstract
Although no universally accepted definition of causality exists, in practice one is often faced with the question of statistically assessing causal relationships in different settings. We present a uniform general approach to causality problems derived from the axiomatic foundations of the Bayesian statistical framework. In this approach, causality statements are viewed as hypotheses, or models, about the world and the fundamental object to be computed is the posterior distribution of the causal hypotheses, given the data and the background knowledge. Computation of the posterior, illustrated here in simple examples, may involve complex probabilistic modeling but this is no different than in any other Bayesian modeling situation. The main advantage of the approach is its connection to the axiomatic foundations of the Bayesian framework, and the general uniformity with which it can be applied to a variety of causality settings, ranging from specific to general cases, or from causes of effects to effects of causes.
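The fundamental object described, a posterior distribution over competing hypotheses, can be illustrated with a deliberately simple (non-causal) model-comparison toy: two hypotheses about a Bernoulli probability, scored by their marginal likelihoods. The setup is ours, not an example from the paper.

```python
import math

def posterior_over_hypotheses(successes, failures, prior_h1=0.5):
    """Posterior probability of H1 for binary data under two hypotheses:
    H0: p = 0.5 exactly (marginal likelihood 0.5^n) versus
    H1: p ~ Uniform(0, 1) (marginal likelihood s! f! / (n+1)!)."""
    n = successes + failures
    ml_h0 = 0.5 ** n
    ml_h1 = math.exp(math.lgamma(successes + 1) + math.lgamma(failures + 1)
                     - math.lgamma(n + 2))
    num = prior_h1 * ml_h1
    return num / (num + (1.0 - prior_h1) * ml_h0)

p_h1_skewed = posterior_over_hypotheses(18, 2)    # lopsided data favour H1
p_h1_even = posterior_over_hypotheses(10, 10)     # balanced data favour H0
```

Replacing H0 and H1 with competing causal structures, each inducing a likelihood for the observed data, gives the paper's recipe: the "causal conclusion" is just the posterior over those hypotheses.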
214. Zanin M, Belkoura S, Gomez J, Alfaro C, Cano J. Uncertainty in Functional Network Representations of Brain Activity of Alcoholic Patients. Brain Topogr 2020;34:6-18. PMID: 33044705; DOI: 10.1007/s10548-020-00799-w. Citations in RCA: 2.
Abstract
In spite of the large attention received by brain activity analyses through functional networks, the effects of uncertainty on such representations have mostly been neglected. Here we elaborate the hypothesis that such uncertainty is not just a nuisance but is, on the contrary, condition-dependent. We test this hypothesis by analysing a large set of EEG brain recordings corresponding to control subjects and patients suffering from alcoholism, through the reconstruction of the corresponding Maximum Spanning Trees (MSTs), the assessment of their topological differences, and the comparison of frequentist and Bayesian reconstruction approaches. A machine learning model demonstrates that the Bayesian reconstruction encodes more information than the frequentist one, and that this additional information is related to the uncertainty of the topological structures. We finally show how the Bayesian approach is more effective than the frequentist one in the validation of generative models, by proposing and disproving two models based on additive noise.
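Reconstructing a maximum spanning tree from pairwise connectivity weights, as done for the EEG networks above, reduces to Kruskal's algorithm run in descending weight order. A self-contained sketch with hypothetical weights:

```python
def maximum_spanning_tree(n, edges):
    """Kruskal's algorithm keeping the heaviest edges first: scan edges in
    descending weight order and keep those joining two different components
    (tracked with a union-find structure)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# hypothetical pairwise connectivity weights between 4 EEG channels
edges = [(0.9, 0, 1), (0.2, 0, 2), (0.6, 1, 2), (0.5, 1, 3), (0.1, 2, 3)]
mst = maximum_spanning_tree(4, edges)
```

The paper's point is that the weights feeding this step carry uncertainty: a Bayesian reconstruction propagates that uncertainty into the tree, whereas a single frequentist estimate discards it.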
215. Liu D, Mitchell L, Cope RC, Carlson SJ, Ross JV. Elucidating user behaviours in a digital health surveillance system to correct prevalence estimates. Epidemics 2020;33:100404. PMID: 33002805; DOI: 10.1016/j.epidem.2020.100404. Citations in RCA: 1.
Abstract
Estimating seasonal influenza prevalence is of undeniable public health importance, but remains challenging with traditional datasets due to cost and timeliness. Digital epidemiology has the potential to address this challenge, but can introduce sampling biases that are distinct to traditional systems. In online participatory health surveillance systems, the voluntary nature of the data generating process must be considered to address potential biases in estimates. Here we examine user behaviours in one such platform, FluTracking, from 2011 to 2017. We build a Bayesian model to estimate probabilities of an individual reporting in each week, given their past reporting behaviour, and to infer the weekly prevalence of influenza-like-illness (ILI) in Australia. We show that a model that corrects for user behaviour can substantially affect ILI estimates. The model examined here elucidates several factors, such as the status of having ILI and consistency of prior reporting, that are strongly associated with the likelihood of participating in online health surveillance systems. This framework could be applied to other digital participatory health systems where participation is inconsistent and sampling bias may be of concern.
216. Masmaliyeva RC, Babai KH, Murshudov GN. Local and global analysis of macromolecular atomic displacement parameters. Acta Crystallogr D Struct Biol 2020;76:926-937. PMID: 33021494; PMCID: PMC7543658; DOI: 10.1107/s2059798320011043. Citations in RCA: 8.
Abstract
This paper describes the global and local analysis of atomic displacement parameters (ADPs) of macromolecules in X-ray crystallography. The distribution of ADPs is shown to follow the shifted inverse-gamma distribution or a mixture of these distributions. The mixture parameters are estimated using the expectation-maximization algorithm. In addition, a method for the resolution- and individual ADP-dependent local analysis of neighbouring atoms has been designed. This method facilitates the detection of mismodelled atoms, heavy-metal atoms and disordered and/or incorrectly modelled ligands. Both global and local analyses can be used to detect errors in atomic models, thus helping in the (re)building, refinement and validation of macromolecular structures. This method can also serve as an additional validation tool during PDB deposition.
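One way to see how a (shifted) inverse-gamma distribution can be fitted to ADPs is simple moment matching, solving the mean and variance formulas for the shape and scale. The paper itself uses expectation-maximization for the mixture case, so this is only a single-component sketch with names of our choosing.

```python
def invgamma_from_moments(mean, variance, shift=0.0):
    """Moment-match a (shifted) inverse-gamma distribution IG(alpha, beta):
    mean = shift + beta/(alpha-1), variance = beta^2/((alpha-1)^2 (alpha-2)),
    which solves to alpha = m^2/variance + 2 and beta = m*(alpha-1)
    for m = mean - shift (requires alpha > 2, i.e. finite variance)."""
    m = mean - shift
    alpha = m * m / variance + 2.0
    beta = m * (alpha - 1.0)
    return alpha, beta

# e.g. ADPs with sample mean 20 and variance 25, no shift
alpha, beta = invgamma_from_moments(20.0, 25.0)
```

Such moment estimates are a common way to initialize an EM fit of the full mixture of shifted inverse-gamma components.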
217. Overmann AL, Clark DM, Tsagkozis P, Wedin R, Forsberg JA. Validation of PATHFx 2.0: An open-source tool for estimating survival in patients undergoing pathologic fracture fixation. J Orthop Res 2020;38:2149-2156. PMID: 32492213; DOI: 10.1002/jor.24763. Citations in RCA: 11.
Abstract
Treatment decisions in patients with metastatic bone disease rely on accurate survival estimation. We developed the original PATHFx models using expensive, proprietary software and now seek to provide a more cost-effective solution. We asked whether PATHFx version 2.0 could be created using open-source machine learning software and externally validated in two unique patient populations. A training set of 189 well-characterized patient records and the bnlearn package within R version 3.5.1 (R Foundation for Statistical Computing) were used to establish a series of Bayesian belief network models designed to predict survival at 1, 3, 6, 12, 18, and 24 months. Each model was externally validated in both a Scandinavian (n = 815 patients) and a Japanese (n = 261 patients) data set. Brier scores and receiver operating characteristic curves were used to assess discriminatory ability. Decision curve analysis (DCA) evaluated whether the models should be used clinically. DCA showed that the model should be used clinically at all time points in the Scandinavian data set. For the 1-month time point, DCA of the Japanese data set suggested that better outcomes would be obtained by assuming all patients survive longer than 1 month. Brier scores for each curve demonstrate that the models are accurate at each time point. Statement of clinical significance: We successfully transitioned to PATHFx 2.0 using open-source software and externally validated it in two unique patient populations; it can be used as a cost-effective option to guide surgical decisions in patients with metastatic bone disease.
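The Brier score used for validation is the mean squared difference between predicted probabilities and observed binary outcomes; the inputs below are illustrative, not the study's data.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and observed
    0/1 outcomes; 0 is perfect, and always guessing 0.5 scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# hypothetical survival predictions against observed 0/1 outcomes
score = brier_score([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0])
```

Unlike an ROC curve, the Brier score penalizes miscalibration as well as poor discrimination, which is why the two are reported together.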
218. Vandewalle V, Caron A, Delettrez C, Périchon R, Pelayo S, Duhamel A, Dervaux B. Estimating the number of usability problems affecting medical devices: modelling the discovery matrix. BMC Med Res Methodol 2020;20:234. PMID: 32948143; PMCID: PMC7653970; DOI: 10.1186/s12874-020-01091-y. Citations in RCA: 1.
Abstract
Background Usability testing of medical devices is mandatory for market access. Its goal is to identify usability problems that could cause harm to the user or limit the device's effectiveness. In practice, human factors engineers study participants under actual conditions of use and list the problems encountered. This results in a binary discovery matrix in which each row corresponds to a participant and each column corresponds to a usability problem. One of the main challenges in usability testing is estimating the total number of problems, in order to assess the completeness of the discovery process. Today's margin-based methods fit the column sums to a binomial model of problem detection. However, the discovery matrix actually observed is truncated because of undiscovered problems, which corresponds to fitting the marginal sums without the zeros. Margin-based methods fail to overcome the bias related to this truncation of the matrix. The objective of the present study was to develop and test a matrix-based method for estimating the total number of usability problems. Methods The matrix-based model was based on the full discovery matrix (including unobserved columns) and not solely on a summary of the data (e.g. the margins). This model also circumvents a drawback of margin-based methods by simultaneously estimating the model's parameters and the total number of problems. Furthermore, the matrix-based method takes account of a heterogeneous probability of detection, which reflects real-life settings. As suggested in the usability literature, we assumed that the probability of detection had a logit-normal distribution. Results We assessed the matrix-based method's performance in a range of settings reflecting real-life usability testing, with heterogeneous probabilities of problem detection. In our simulations, the matrix-based method improved the estimation of the number of problems (in terms of bias, consistency, and coverage probability) in a wide range of settings. We also applied our method to five real datasets from usability testing. Conclusions Estimation models (and particularly matrix-based models) are of value in estimating and monitoring the detection process during usability testing. Matrix-based models have a solid mathematical grounding and, with a view to facilitating the decision-making process for both regulators and device manufacturers, should be incorporated into current standards.
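A minimal simulation sketch of the truncated discovery matrix this abstract describes; the logit-normal parameters and sample sizes below are illustrative assumptions, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants, n_problems = 15, 30

# Heterogeneous per-problem detection probabilities, drawn logit-normal
# as the abstract assumes (location and scale are illustrative).
p = 1.0 / (1.0 + np.exp(-rng.normal(loc=-1.0, scale=1.0, size=n_problems)))

# Full discovery matrix: rows = participants, columns = usability problems.
full = rng.random((n_participants, n_problems)) < p

# The analyst only ever sees the truncated matrix: columns with at least
# one detection. Undiscovered problems vanish, which is the source of the
# bias that margin-based fits fail to correct.
observed = full[:, full.any(axis=0)]
print(observed.shape[1], "of", n_problems, "problems discovered")
```

The gap between the observed column count and `n_problems` is exactly the quantity the matrix-based estimator targets.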
Collapse
|
219
|
National population mapping from sparse survey data: A hierarchical Bayesian modeling framework to account for uncertainty. Proc Natl Acad Sci U S A 2020; 117:24173-24179. [PMID: 32929009 PMCID: PMC7533662 DOI: 10.1073/pnas.1913050117] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Abstract
High-resolution population estimates are essential for government planning, development projects, and public health campaigns, but countries where this information is most needed are often where recent national census data are least available. We present a modeling framework that combines recent neighborhood-scale microcensus surveys with national-scale data from satellite images and digital maps to estimate population sizes for every 100-m grid square nationally. We present a case study from Nigeria where population estimates with national coverage were produced using household survey data from 1,141 locations. This work represents a significant step toward achieving high-resolution population estimates with national coverage from sparse population data while providing reliable estimates of uncertainty at any spatial scale. Population estimates are critical for government services, development projects, and public health campaigns. Such data are typically obtained through a national population and housing census. However, population estimates can quickly become inaccurate in localized areas, particularly where migration or displacement has occurred. Some conflict-affected and resource-poor countries have not conducted a census in over 10 y. We developed a hierarchical Bayesian model to estimate population numbers in small areas based on enumeration data from sample areas and nationwide information about administrative boundaries, building locations, settlement types, and other factors related to population density. We demonstrated this model by estimating population sizes in every 100-m grid cell in Nigeria with national coverage. These gridded population estimates and areal population totals derived from them are accompanied by estimates of uncertainty based on Bayesian posterior probabilities.
The model had an overall error rate of 67 people per hectare (mean of absolute residuals) or 43% (using scaled residuals) for predictions in out-of-sample survey areas (approximately 3 ha each), with increased precision expected for aggregated population totals in larger areas. This statistical approach represents a significant step toward estimating populations at high resolution with national coverage in the absence of a complete and recent census, while also providing reliable estimates of uncertainty to support informed decision making.
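A sketch of the abstract's key property that uncertainty is available at any spatial scale: aggregate posterior draws per draw first, then summarise the totals. The gamma draws below are a hypothetical stand-in for the hierarchical model's actual posterior:

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws, n_cells = 1000, 400

# Stand-in posterior draws of people per 100-m grid cell (illustrative only;
# the paper's posterior comes from the fitted hierarchical Bayesian model).
draws = rng.gamma(shape=2.0, scale=5.0, size=(n_draws, n_cells))

# Areal total for any subset of cells: sum within each posterior draw, then
# summarise, so uncertainty propagates correctly to the aggregate.
region = draws[:, :100].sum(axis=1)
lo, hi = np.percentile(region, [2.5, 97.5])
print(f"regional total: {region.mean():.0f} people (95% interval {lo:.0f}-{hi:.0f})")
```

Summing cell-level point estimates instead would discard the correlation structure and understate aggregate uncertainty.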
Collapse
|
220
|
Turner NA, Pan W, Martinez-Bianchi VS, Panayotti GMM, Planey AM, Woods CW, Lantos PM. Racial, Ethnic, and Geographic Disparities in Novel Coronavirus (Severe Acute Respiratory Syndrome Coronavirus 2) Test Positivity in North Carolina. Open Forum Infect Dis 2020; 8:ofaa413. [PMID: 33575416 PMCID: PMC7499753 DOI: 10.1093/ofid/ofaa413] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2020] [Accepted: 09/03/2020] [Indexed: 12/19/2022] Open
Abstract
Background Emerging evidence suggests that black and Hispanic communities in the United States are disproportionately affected by coronavirus disease 2019 (COVID-19). A complex interplay of socioeconomic and healthcare disparities likely contributes to disproportionate COVID-19 risk. Methods We conducted a geospatial analysis to determine whether individual- and neighborhood-level attributes predict local odds of testing positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We analyzed 29 138 SARS-CoV-2 tests within the 6-county catchment area for Duke University Health System from March to June 2020. We used generalized additive models to analyze the spatial distribution of SARS-CoV-2 positivity. Adjusted models included individual-level age, gender, and race, as well as neighborhood-level Area Deprivation Index, population density, demographic composition, and household size. Results Our dataset included 27 099 negative and 2039 positive unique SARS-CoV-2 tests. The odds of a positive SARS-CoV-2 test were higher for males (odds ratio [OR], 1.43; 95% credible interval [CI], 1.30–1.58), blacks (OR, 1.47; 95% CI, 1.27–1.70), and Hispanics (OR, 4.25; 95% CI, 3.55–5.12). Among neighborhood-level predictors, percentage of black population (OR, 1.14; 95% CI, 1.05–1.25) and percentage Hispanic population (OR, 1.23; 95% CI, 1.07–1.41) also influenced the odds of a positive SARS-CoV-2 test. Population density, average household size, and Area Deprivation Index were not associated with SARS-CoV-2 test results after adjusting for race. Conclusions The odds of testing positive for SARS-CoV-2 were higher for both black and Hispanic individuals, as well as within neighborhoods with a higher proportion of black or Hispanic residents—confirming that black and Hispanic communities are disproportionately affected by SARS-CoV-2.
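The odds ratios reported here are exponentiated model coefficients. A back-of-the-envelope conversion with hypothetical log-odds estimates and standard errors (illustrative values, not the paper's fitted model):

```python
import numpy as np

# Hypothetical log-odds (beta) and standard errors for three predictors.
beta = np.array([0.36, 0.39, 1.45])   # male, Black, Hispanic (illustrative)
se = np.array([0.050, 0.074, 0.093])

# Exponentiate the point estimate and the Wald interval endpoints.
or_point = np.exp(beta)
or_lo = np.exp(beta - 1.96 * se)
or_hi = np.exp(beta + 1.96 * se)
for name, o, l, h in zip(["male", "Black", "Hispanic"], or_point, or_lo, or_hi):
    print(f"{name}: OR {o:.2f} (95% interval {l:.2f}-{h:.2f})")
```

Note the paper reports Bayesian credible intervals rather than the frequentist Wald intervals sketched above; the exponentiation step is the same either way.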
Collapse
|
221
|
Vokó Z, Bitter I, Mersich B, Réthelyi J, Molnár A, Pitter JG, Götze Á, Horváth M, Kóczián K, Fonticoli L, Lelli F, Németh B. Using informative prior based on expert opinion in Bayesian estimation of the transition probability matrix in Markov modelling-an example from the cost-effectiveness analysis of the treatment of patients with predominantly negative symptoms of schizophrenia with cariprazine. COST EFFECTIVENESS AND RESOURCE ALLOCATION 2020; 18:28. [PMID: 32874137 PMCID: PMC7457290 DOI: 10.1186/s12962-020-00224-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2019] [Accepted: 08/17/2020] [Indexed: 11/18/2022] Open
Abstract
Background When patient health state transition evidence is missing from clinical literature, analysts are inclined to make simple assumptions to complete the transition matrices within a health economic model. Our aim was to provide a solution for estimating transition matrices by the Bayesian statistical method within a health economic model when empirical evidence is lacking. Methods We used a previously published cost-effectiveness analysis of the use of cariprazine compared to that of risperidone in patients with predominantly negative symptoms of schizophrenia. We generated the treatment-specific state transition probability matrices in three different ways: (1) based only on the observed clinical trial data; (2) based on Bayesian estimation where prior transition probabilities came from experts' opinions; and (3) based on Bayesian estimation with vague prior transition probabilities (i.e., assigning equal prior probabilities to the missing transitions from one state to the others). For the second approach, we elicited Dirichlet prior distributions from three clinical experts. We compared the transition probability matrices and the incremental quality-adjusted life years (QALYs) across the three approaches. Results The estimates of the prior transition probabilities from the experts were feasible to obtain and showed considerable consistency with the clinical trial data. As expected, the estimated health benefit of the treatments was different when only the clinical trial data were considered (QALY difference 0.0260), when their combination with the experts' beliefs was used in the economic model (QALY difference 0.0253), and when vague prior distributions were used (QALY difference 0.0243). Conclusions Imputing zeros to missing transition probabilities in Markov models might be untenable from the clinical perspective and may result in inappropriate estimates.
Bayesian statistics provides an appropriate framework for imputing missing values without making overly simple assumptions. Informative priors based on expert opinions might be more appropriate than vague priors.
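The conjugate mechanics behind the approach are compact: each row of the transition matrix gets a Dirichlet prior, and observed transition counts update it by simple addition. The prior weights below are hypothetical stand-ins for the expert-elicited values:

```python
import numpy as np

# Dirichlet-multinomial update for one row of a Markov transition matrix.
# A zero count (a transition never observed in the trial) no longer forces
# a zero probability once an informative prior is added.
prior = np.array([8.0, 1.5, 0.5])   # expert pseudo-counts for 3 health states
counts = np.array([40, 0, 3])       # observed transitions out of one state
posterior = prior + counts          # conjugacy: Dirichlet + multinomial
probs = posterior / posterior.sum() # posterior mean transition probabilities
print(probs)
```

With a vague prior (equal weights) the zero-count transition would still receive positive probability, but the expert-informed prior shifts how much, which is why the three approaches in the abstract yield slightly different QALY estimates.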
Collapse
|
222
|
Bluschke A, Chmielewski WX, Roessner V, Beste C. Intact Context-Dependent Modulation of Conflict Monitoring in Childhood ADHD. J Atten Disord 2020; 24:1503-1510. [PMID: 27114409 DOI: 10.1177/1087054716643388] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Objective: Conflict monitoring is well known to be modulated by context. This is known as the Gratton effect, meaning that the degree of interference is smaller when a stimulus-response conflict was encountered on the previous trial. It is unclear to what extent these processes are changed in ADHD. Method: Children with ADHD (combined subtype) and healthy controls performed a modified version of the sequence flanker task. Results: Patients with ADHD made significantly more errors than healthy controls, indicating general performance deficits. However, there were no differences regarding reaction times, indicating an intact Gratton effect in ADHD. These results were supported by Bayesian statistics. Conclusion: The results suggest that the ability to take contextual information into account during conflict monitoring is preserved in patients with ADHD despite this disorder being associated with changes in executive control functions overall. These findings are discussed in light of different theoretical accounts on contextual modulations of conflict monitoring.
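The Gratton effect in miniature: flanker interference (incongruent minus congruent reaction time) shrinks after an incongruent trial. All reaction times below are hypothetical:

```python
# Mean reaction times in ms, keyed by (previous trial, current trial).
rt = {
    ("congruent", "congruent"): 450.0,
    ("congruent", "incongruent"): 520.0,     # large interference after congruent
    ("incongruent", "congruent"): 460.0,
    ("incongruent", "incongruent"): 495.0,   # reduced interference after conflict
}

# Interference = incongruent RT - congruent RT, split by previous-trial type.
after_congruent = rt[("congruent", "incongruent")] - rt[("congruent", "congruent")]
after_incongruent = rt[("incongruent", "incongruent")] - rt[("incongruent", "congruent")]
print(f"interference after congruent: {after_congruent:.0f} ms; "
      f"after incongruent: {after_incongruent:.0f} ms")
```

An intact Gratton effect, as reported for the ADHD group here, means the second difference is reliably smaller than the first.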
Collapse
|
223
|
Kern JL, Culpepper SA. A Restricted Four-Parameter IRT Model: The Dyad Four-Parameter Normal Ogive (Dyad-4PNO) Model. PSYCHOMETRIKA 2020; 85:575-599. [PMID: 32803390 DOI: 10.1007/s11336-020-09716-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/31/2020] [Indexed: 06/11/2023]
Abstract
Recently, there has been a renewed interest in the four-parameter item response theory model as a way to capture guessing and slipping behaviors in responses. Research has shown, however, that the nested three-parameter model suffers from issues of unidentifiability (San Martín et al. in Psychometrika 80:450-467, 2015), which places concern on the identifiability of the four-parameter model. Borrowing from recent advances in the identification of cognitive diagnostic models, in particular, the DINA model (Gu and Xu in Stat Sin https://doi.org/10.5705/ss.202018.0420, 2019), a new model is proposed with restrictions inspired by this new literature to help with the identification issue. Specifically, we show conditions under which the four-parameter model is strictly and generically identified. These conditions inform the presentation of a new exploratory model, which we call the dyad four-parameter normal ogive (Dyad-4PNO) model. This model is developed by placing a hierarchical structure on the DINA model and imposing equality constraints on a priori unknown dyads of items. We present a Bayesian formulation of this model, and show that model parameters can be accurately recovered. Finally, we apply the model to a real dataset.
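The four-parameter normal ogive item response function underlying this model family is standard: a guessing floor c, a slipping ceiling d, and a normal-ogive curve between them. A minimal sketch (parameter values in the demo call are illustrative):

```python
from math import erf, sqrt

def p_correct(theta, a, b, c, d):
    """Four-parameter normal ogive: P(correct) = c + (d - c) * Phi(a*theta - b).

    c is the lower asymptote (guessing), d the upper asymptote (slipping);
    setting c = 0 and d = 1 recovers the two-parameter normal ogive.
    """
    phi = 0.5 * (1.0 + erf((a * theta - b) / sqrt(2.0)))  # standard normal CDF
    return c + (d - c) * phi

# At a*theta = b the curve sits midway between the two asymptotes.
print(p_correct(0.0, a=1.0, b=0.0, c=0.20, d=0.90))
```

The identifiability concern the abstract addresses arises because different (a, b, c, d) combinations can produce nearly indistinguishable curves; the dyad constraints restrict that freedom.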
Collapse
|
224
|
Jin N, Li J, Jin M, Zhang X. Spatiotemporal variation and determinants of population's PM2.5 exposure risk in China, 1998-2017: a case study of the Beijing-Tianjin-Hebei region. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2020; 27:31767-31777. [PMID: 32504429 DOI: 10.1007/s11356-020-09484-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/05/2020] [Accepted: 05/27/2020] [Indexed: 06/11/2023]
Abstract
PM2.5 pollution has emerged as a global human health risk. The best measure of its impact is a population's PM2.5 exposure (PPM2.5E), an index that simultaneously considers PM2.5 concentrations and population spatial density. The spatiotemporal variation of PPM2.5E over the Beijing-Tianjin-Hebei (BTH) region, which is the national capital region of China, was investigated using a Bayesian space-time model, and the influence patterns of the anthropic and geographical factors were identified using the GeoDetector model and Pearson correlation analysis. The spatial pattern of PPM2.5E maintained a stable structure over the BTH region's distinct terrain, which has been described as "high in the northwest, low in the southeast". The spatial difference of PPM2.5E intensified annually. An overall increase of 6.192 (95% CI 6.186, 6.203) × 10³ μg/m³·persons/km² per year occurred over the BTH region from 1998 to 2017. The evolution of PPM2.5E in the region can be described as "high value, high increase" and "low value, low increase", since human activities related to gross domestic product (GDP) and energy consumption (EC) were the main factors in its occurrence. GDP had the strongest explanatory power of 76% (P < 0.01), followed by EC and elevation (EL), which accounted for 61% (P < 0.01) and 40% (P < 0.01), respectively. There were four factors, proportion of secondary industry (PSI), normalized differential vegetation index (NDVI), relief amplitude (RA), and EL, associated negatively with PPM2.5E and four factors, GDP, EC, annual precipitation (AP), and annual average temperature (AAT), associated positively with PPM2.5E. Remarkably, the interaction of GDP and NDVI, which was 90%, had the greatest explanatory power for PPM2.5E's diffusion and impact on the BTH region.
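As defined in this abstract, the PPM2.5E index combines concentration and population density cell by cell. A minimal sketch with two illustrative 2×2 rasters (values are hypothetical, not the study's data):

```python
import numpy as np

# Per-cell PM2.5 concentration and population density for a tiny grid.
pm25 = np.array([[55.0, 80.0],
                 [35.0, 60.0]])          # μg/m³ per grid cell
density = np.array([[1200.0, 300.0],
                    [150.0, 900.0]])     # persons/km² per grid cell

# PPM2.5E: elementwise product, in μg/m³ · persons/km².
ppm25e = pm25 * density
print(ppm25e)
```

A moderately polluted but densely populated cell can therefore score higher than a heavily polluted, sparsely populated one, which is the point of weighting concentration by where people actually live.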
Collapse
|
225
|
Oliver RC, Potrzebowski W, Najibi SM, Pedersen MN, Arleth L, Mahmoudi N, André I. Assembly of Capsids from Hepatitis B Virus Core Protein Progresses through Highly Populated Intermediates in the Presence and Absence of RNA. ACS NANO 2020; 14:10226-10238. [PMID: 32672447 PMCID: PMC7458484 DOI: 10.1021/acsnano.0c03569] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/29/2020] [Accepted: 07/16/2020] [Indexed: 05/17/2023]
Abstract
The genetic material of viruses is protected by protein shells that are assembled from a large number of subunits in a process that is efficient and robust. Many of the mechanistic details underpinning efficient assembly of virus capsids are still unknown. The assembly mechanism of hepatitis B capsids has been intensively researched using a truncated core protein lacking the C-terminal domain responsible for binding genomic RNA. To resolve the assembly intermediates of hepatitis B virus (HBV), we studied the formation of nucleocapsids and empty capsids from full-length hepatitis B core proteins, using time-resolved small-angle X-ray scattering. We developed a detailed structural model of the HBV capsid assembly process using a combination of analysis with multivariate curve resolution, structural modeling, and Bayesian ensemble inference. The detailed structural analysis supports an assembly pathway that proceeds through the formation of two highly populated intermediates: a trimer of dimers and a partially closed shell consisting of around 40 dimers. These intermediates are on-path and transient, and efficiently convert into fully formed capsids. In the presence of an RNA oligo that binds specifically to the C-terminal domain, the assembly proceeds via a similar mechanism to that in the absence of nucleic acids. Comparisons between truncated and full-length HBV capsid proteins reveal that the unstructured C-terminal domain has a significant impact on the assembly process and is required to obtain a more complete mechanistic understanding of HBV capsid formation. These results also illustrate how combining scattering information from different time points during time-resolved experiments can be utilized to derive a structural model of protein self-assembly pathways.
Collapse
|