1
Tong G, Tong J, Jiang Y, Esserman D, Harhay MO, Warren JL. Hierarchical Bayesian modeling of heterogeneous outcome variance in cluster randomized trials. Clin Trials 2024; 21:451-460. [PMID: 38197388 PMCID: PMC11233424 DOI: 10.1177/17407745231222018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
BACKGROUND Heterogeneous outcome correlations across treatment arms and clusters have been increasingly acknowledged in cluster randomized trials with binary endpoints, where analytical methods have been developed to study such heterogeneity. However, cluster-specific outcome variances and correlations have yet to be studied for cluster randomized trials with continuous outcomes. METHODS This article proposes models fitted in the Bayesian setting with hierarchical variance structure to quantify heterogeneous variances across clusters and explain them with cluster-level covariates when the outcome is continuous. The models can also be extended to analyze heterogeneous variances in individually randomized group treatment trials, with arm-specific cluster-level covariates, or in partially nested designs. Simulation studies are carried out to validate the performance of the newly introduced models across different settings. RESULTS Simulations showed that overall the newly introduced models have good performance, reporting low bias and approximately 95% coverage for the intraclass correlation coefficients and regression parameters in the variance model. When variances are heterogeneous, our proposed models had improved model fit over models with homogeneous variances. When used to analyze data from the Kerala Diabetes Prevention Program study, our models identified heterogeneous variances and intraclass correlation coefficients across clusters and examined cluster-level characteristics associated with such heterogeneity. CONCLUSION We proposed new hierarchical Bayesian variance models to accommodate cluster-specific variances in cluster randomized trials. The newly developed methods inform the understanding of how an intervention strategy is implemented and disseminated differently across clusters and can help improve future trial design.
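The abstract does not spell out the variance model, so the following is only a hedged sketch of one way cluster-specific variances could be tied to a cluster-level covariate: a log-linear model for the residual variance, with each cluster's ICC computed from an assumed common between-cluster variance. The function name, the parameters `gamma0`, `gamma1`, `tau2`, `sd_noise`, and the log-linear form itself are illustrative assumptions, not the authors' specification.

```python
import math
import random


def simulate_cluster_variances(z, gamma0, gamma1, tau2, sd_noise=0.0, seed=0):
    """Return (residual variance, ICC) per cluster under an ASSUMED model.

    Assumed model (hypothetical, for illustration only):
        log(sigma2_j) = gamma0 + gamma1 * z_j + e_j,  e_j ~ N(0, sd_noise^2)
        ICC_j         = tau2 / (tau2 + sigma2_j)
    where z_j is a cluster-level covariate and tau2 is a between-cluster
    variance shared by all clusters.
    """
    rng = random.Random(seed)
    results = []
    for z_j in z:
        log_var = gamma0 + gamma1 * z_j + rng.gauss(0.0, sd_noise)
        sigma2_j = math.exp(log_var)          # cluster-specific residual variance
        icc_j = tau2 / (tau2 + sigma2_j)      # cluster-specific ICC
        results.append((sigma2_j, icc_j))
    return results


if __name__ == "__main__":
    # Two clusters whose covariate differs by one unit; no extra noise.
    for sigma2, icc in simulate_cluster_variances(
        z=[0.0, 1.0], gamma0=0.0, gamma1=math.log(2.0), tau2=1.0
    ):
        print(f"sigma2={sigma2:.3f}  ICC={icc:.3f}")
```

Under this assumed form, a positive `gamma1` gives clusters with larger covariate values a larger residual variance and therefore a smaller ICC.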
Affiliation(s)
- Guangyu Tong
  - Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
  - Center for Methods in Implementation and Prevention Science, Yale School of Public Health, New Haven, Connecticut, USA
- Jiaqi Tong
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
  - Center for Methods in Implementation and Prevention Science, Yale School of Public Health, New Haven, Connecticut, USA
- Yi Jiang
  - Department of Biostatistics, Penn State College of Medicine, Hershey, Pennsylvania, USA
- Denise Esserman
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
  - Yale Center for Analytical Science, Yale School of Public Health, New Haven, Connecticut, USA
- Michael O Harhay
  - Clinical Trials Methods and Outcomes Lab, Palliative and Advanced Illness Research (PAIR) Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Joshua L Warren
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
2
Westgate PM, Cheng DM, Feaster DJ, Fernández S, Shoben AB, Vandergrift N. Marginal modeling in community randomized trials with rare events: Utilization of the negative binomial regression model. Clin Trials 2022; 19:162-171. [PMID: 34991359 PMCID: PMC9038610 DOI: 10.1177/17407745211063479] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022]
Abstract
BACKGROUND/AIMS This work is motivated by the HEALing Communities Study, which is a post-test only cluster randomized trial in which communities are randomized to two different trial arms. The primary interest is in reducing opioid overdose fatalities, which will be collected as a count outcome at the community level. Communities range in size from thousands to over one million residents, and fatalities are expected to be rare. Traditional marginal modeling approaches in the cluster randomized trial literature include the use of generalized estimating equations with an exchangeable correlation structure when utilizing subject-level data, or analogously quasi-likelihood based on an over-dispersed binomial variance when utilizing community-level data. These approaches account for and estimate the intra-cluster correlation coefficient, which should be provided in the results from a cluster randomized trial. Alternatively, the coefficient of variation or R coefficient could be reported. In this article, we show that negative binomial regression can also be utilized when communities are large and events are rare. The objectives of this article are (1) to show that the negative binomial regression approach targets the same marginal regression parameter(s) as an over-dispersed binomial model and to explain why the estimates may differ; (2) to derive formulas relating the negative binomial overdispersion parameter k with the intra-cluster correlation coefficient, coefficient of variation, and R coefficient; and (3) to analyze pre-intervention data from the HEALing Communities Study to demonstrate and contrast models and to show how to report the intra-cluster correlation coefficient, coefficient of variation, and R coefficient when utilizing negative binomial regression. METHODS Negative binomial and over-dispersed binomial regression modeling are contrasted in terms of model setup, regression parameter estimation, and formulation of the overdispersion parameter. Three specific models are used to illustrate concepts and address the third objective. RESULTS The negative binomial regression approach targets the same marginal regression parameter(s) as an over-dispersed binomial model, although estimates may differ. Practical differences arise in regard to how overdispersion, and hence the intra-cluster correlation coefficient, is modeled. The negative binomial overdispersion parameter is approximately equal to the ratio of the intra-cluster correlation coefficient to the marginal probability, the square of the coefficient of variation, and the R coefficient minus 1. As a result, estimates corresponding to all four of these different types of overdispersion parameterizations can be reported when utilizing negative binomial regression. CONCLUSION Negative binomial regression provides a valid, practical alternative approach to the analysis of count data, and corresponding reporting of overdispersion parameters, from community randomized trials in which communities are large and events are rare.
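The approximate relations stated in the RESULTS (k ≈ ICC / marginal probability, k ≈ CV², k ≈ R − 1) can be turned into a small conversion helper. This is a minimal sketch that simply inverts those three approximations; the function name and argument names are hypothetical, and the approximations are only expected to hold in the large-community, rare-event setting the abstract describes.

```python
import math


def overdispersion_summaries(k, marginal_prob):
    """Approximate ICC, CV and R implied by a negative binomial overdispersion k.

    Uses the approximations quoted in the abstract:
        k ~ ICC / marginal_prob,   k ~ CV**2,   k ~ R - 1
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0.0 < marginal_prob < 1.0:
        raise ValueError("marginal_prob must lie strictly between 0 and 1")
    return {
        "icc": k * marginal_prob,   # intra-cluster correlation coefficient
        "cv": math.sqrt(k),         # coefficient of variation
        "r": 1.0 + k,               # R coefficient
    }


if __name__ == "__main__":
    # Illustrative inputs only, not values from the HEALing Communities Study.
    print(overdispersion_summaries(k=0.25, marginal_prob=0.001))
```

Because the ICC scales with the marginal probability, a rare event can have a very small ICC even when the coefficient of variation is substantial.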
Affiliation(s)
- Philip M Westgate
  - Department of Biostatistics, College of Public Health, University of Kentucky, Lexington, KY, USA
- Debbie M Cheng
  - Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
- Daniel J Feaster
  - Department of Public Health Sciences, Miller School of Medicine, University of Miami, Coral Gables, FL, USA
- Soledad Fernández
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Abigail B Shoben
  - Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
3
Nikolay B, Ribeiro Dos Santos G, Lipsitch M, Rahman M, Luby SP, Salje H, Gurley ES, Cauchemez S. Assessing the feasibility of Nipah vaccine efficacy trials based on previous outbreaks in Bangladesh. Vaccine 2021; 39:5600-5606. [PMID: 34426025 DOI: 10.1016/j.vaccine.2021.08.027] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/09/2020] [Revised: 08/03/2021] [Accepted: 08/06/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND Nipah virus (NiV) is an emerging, bat-borne pathogen that can be transmitted from person to person. Vaccines are currently being developed for NiV, and studies have been funded to evaluate their safety and immunogenicity. An important unanswered question is whether it will be possible to evaluate the efficacy of vaccine candidates in phase III clinical trials in a context where spillovers from the zoonotic reservoir are infrequent and associated with small outbreaks. The objective of this study was to investigate the feasibility of conducting a phase III vaccine trial in Bangladesh, the only country regularly reporting NiV cases. METHODS We used simulations based on previously observed NiV cases from Bangladesh, an assumed vaccine efficacy of 90% and other NiV vaccine target characteristics, to compare three vaccination study designs: (i) cluster randomized ring vaccination, (ii) cluster randomized mass vaccination, and (iii) an observational case-control study design. RESULTS The simulations showed that, assuming a ramp-up period of 10 days and a mean hospitalization delay of 4 days, a cluster-randomized ring vaccination trial would require 516 years and over 163,000 vaccine doses under current epidemic conditions. A cluster-randomized mass vaccination trial in the two most affected districts would take 43 years and 1.83 million vaccine doses. An observational case-control design in these two districts would require seven years and 2.5 million vaccine doses. DISCUSSION Without a change in the epidemiology of NiV, ring vaccination or mass vaccination trials are unlikely to be completed within a reasonable time window. In this light, the remaining options are: (i) not conducting a phase III trial until the epidemiology of NiV changes, or (ii) identifying alternative pathways to licensure, such as observational studies or controlled studies in animals, as under the US Food and Drug Administration's (FDA) Animal Rule.
Affiliation(s)
- Birgit Nikolay
  - Mathematical Modelling of Infectious Diseases Unit, Institut Pasteur, UMR2000, CNRS, 75015 Paris, France
- Marc Lipsitch
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Stephen P Luby
  - Infectious Diseases and Geographic Medicine Division, Stanford University, Stanford, CA, USA
- Henrik Salje
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Emily S Gurley
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Simon Cauchemez
  - Mathematical Modelling of Infectious Diseases Unit, Institut Pasteur, UMR2000, CNRS, 75015 Paris, France
4
Mbekwe Yepnang AM, Caille A, Eldridge SM, Giraudeau B. Association of intracluster correlation measures with outcome prevalence for binary outcomes in cluster randomised trials. Stat Methods Med Res 2021; 30:1988-2003. [PMID: 34218744 DOI: 10.1177/09622802211026004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
In cluster randomised trials, a measure of intracluster correlation such as the intraclass correlation coefficient (ICC) should be reported for each primary outcome. Providing intracluster correlation estimates may help in calculating sample size of future cluster randomised trials and also in interpreting the results of the trial from which they are derived. For a binary outcome, the ICC is known to be associated with its prevalence, which raises at least two issues. First, it questions the use of ICC estimates obtained on a binary outcome in a trial for sample size calculations in a subsequent trial in which the same binary outcome is expected to have a different prevalence. Second, it challenges the interpretation of ICC estimates because they do not solely depend on clustering level. Other intracluster correlation measures proposed for clustered binary data settings include the variance partition coefficient, the median odds ratio and the tetrachoric correlation coefficient. Under certain assumptions, the theoretical maximum possible value for an ICC associated with a binary outcome can be derived, and we proposed the relative deviation of an ICC estimate to this maximum value as another measure of the intracluster correlation. We conducted a simulation study to explore the dependence of these intracluster correlation measures on outcome prevalence and found that all are associated with prevalence. Even if all depend on prevalence, the tetrachoric correlation coefficient computed with Kirk's approach was less dependent on the outcome prevalence than the other measures when the intracluster correlation was about 0.05. We also observed that for lower values, such as 0.01, the analysis of variance estimator of the ICC is preferred.
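Two of the alternative measures named above have well-known closed forms under a logistic random-intercept model, which the following sketch illustrates. It assumes the usual latent-variable definitions (variance partition coefficient with a level-1 variance of π²/3, and the median odds ratio based on the 75th percentile of the standard normal); the abstract does not say which definitions the authors used, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def vpc_latent(sigma2_u):
    """Variance partition coefficient on the latent logistic scale.

    sigma2_u is the between-cluster (random intercept) variance; the
    individual-level variance of the standard logistic is pi^2 / 3.
    """
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3.0)


def median_odds_ratio(sigma2_u):
    """Median odds ratio: exp( sqrt(2 * sigma2_u) * Phi^{-1}(0.75) )."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * sigma2_u) * z75)


if __name__ == "__main__":
    for s2 in (0.0, 0.2, 1.0):
        print(f"sigma2_u={s2:.1f}  VPC={vpc_latent(s2):.3f}  "
              f"MOR={median_odds_ratio(s2):.3f}")
```

Both quantities depend on the outcome only through the random-intercept variance, which is why they are sometimes proposed as less prevalence-dependent summaries; the simulation results summarised above suggest that in practice they remain associated with prevalence.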
Affiliation(s)
- Agnès Caille
  - Université de Tours, Université de Nantes, INSERM, Tours, France
  - INSERM CIC1415, CHRU de Tours, Tours, France
- Sandra M Eldridge
  - Centre for Primary Care and Public Health, Queen Mary University of London, London, UK
- Bruno Giraudeau
  - Université de Tours, Université de Nantes, INSERM, Tours, France
  - INSERM CIC1415, CHRU de Tours, Tours, France
5
Chatfield MD, Farewell DM. Understanding between-cluster variation in prevalence and limits for how much variation is plausible. Stat Methods Med Res 2020; 30:286-298. [PMID: 32907496 DOI: 10.1177/0962280220951831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
In clinical trials and observational studies of clustered binary data, understanding between-cluster variation is essential: in sample size and power calculations of cluster randomised trials, for example, the intra-cluster correlation coefficient is often specified. However, quantifications of between-cluster variation can be unintuitive, and an intra-cluster correlation coefficient as low as 0.04 may correspond to surprisingly large between-cluster differences. We suggest that understanding is improved through visualising the implied distribution of true cluster prevalences - possibly by assuming they follow a beta distribution - or by calculating their standard deviation, which is more readily interpretable than the intra-cluster correlation coefficient. Even so, the bounded nature of binary data complicates the interpretation of variances as primary measures of uncertainty, and entropy offers an attractive alternative. Appealing to maximum entropy theory, we propose the following rule of thumb: that plausible intra-cluster correlation coefficients and standard deviations of true cluster prevalences are both bounded above by the overall prevalence, its complement, and one third. We also provide corresponding bounds for the coefficient of variation, and for a different standard deviation and intra-cluster correlation defined on the log odds scale. Using previously published data, we observe the quantities defined on the log odds scale to be more transportable between studies with different outcomes with different prevalences than the intra-cluster correlation and coefficient of variation. The latter increase and decrease, respectively, as prevalence increases from 0% to 50%, and the same is true for our bounds. Our work will help clinical trialists better understand between-cluster variation and avoid specifying implausibly high values for the intra-cluster correlation in sample size and power calculations.
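The quantities discussed above can be computed directly. The sketch below assumes the standard definition of the ICC for a binary outcome (between-cluster variance of the true prevalences divided by π(1 − π)), the matching beta distribution for the true cluster prevalences, and reads the rule of thumb as an upper bound of min(π, 1 − π, 1/3). Function names are hypothetical.

```python
import math


def sd_true_prevalence(icc, prevalence):
    """SD of true cluster prevalences implied by an ICC and overall prevalence."""
    return math.sqrt(icc * prevalence * (1.0 - prevalence))


def beta_parameters(icc, prevalence):
    """Beta(a, b) whose mean is `prevalence` and whose implied ICC is `icc`.

    For a beta distribution of cluster prevalences, ICC = 1 / (a + b + 1).
    """
    total = (1.0 - icc) / icc          # a + b
    return prevalence * total, (1.0 - prevalence) * total


def rule_of_thumb_bound(prevalence):
    """Upper bound for plausible ICCs and SDs, as read from the abstract."""
    return min(prevalence, 1.0 - prevalence, 1.0 / 3.0)


if __name__ == "__main__":
    icc, p = 0.04, 0.5
    a, b = beta_parameters(icc, p)
    print(f"SD of true prevalences: {sd_true_prevalence(icc, p):.3f}")
    print(f"Beta parameters: a={a:.1f}, b={b:.1f}")
    print(f"Rule-of-thumb bound at prevalence {p}: {rule_of_thumb_bound(p):.3f}")
```

With an ICC of 0.04 and an overall prevalence of 50%, the implied standard deviation of true cluster prevalences is 0.1, so roughly 95% of clusters would have true prevalences between about 30% and 70%, which illustrates why a seemingly small ICC can correspond to large between-cluster differences.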
Affiliation(s)
- Mark D Chatfield
  - Faculty of Medicine, The University of Queensland, Brisbane, Australia
- Daniel M Farewell
  - Division of Population Medicine, School of Medicine, Cardiff University, Cardiff, UK
6
Chatfield MD, Farewell DM. Letter to the Editor: Is the R coefficient of interest in cluster randomized trials with a binary outcome? Stat Methods Med Res 2020; 29:1763-1764. [DOI: 10.1177/0962280220912783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Affiliation(s)
- Mark D Chatfield
  - Faculty of Medicine, The University of Queensland, Brisbane, Australia
7
Yepnang AMM, Caille A, Eldridge SM, Giraudeau B. Is the R coefficient of interest in cluster randomized trials with a binary outcome? Stat Methods Med Res 2020; 29:2470-2480. [DOI: 10.1177/0962280219900200] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/14/2022]
Abstract
In cluster randomized trials, the intraclass correlation coefficient (ICC) is classically used to measure clustering. When the outcome is binary, the ICC is known to be associated with the prevalence of the outcome. This association challenges its interpretation and can be problematic for sample size calculation. To overcome these situations, Crespi et al. extended a coefficient named R, initially proposed by Rosner for ophthalmologic data, to cluster randomized trials. Crespi et al. asserted that R may be less influenced by the outcome prevalence than is the ICC, although the authors provided only empirical data to support their assertion. They also asserted that “the traditional ICC approach to sample size determination tends to overpower studies under many scenarios, calling for more clusters than truly required”, although they did not consider empirical power. The aim of this study was to investigate whether R could indeed be considered independent of the outcome prevalence. We also considered whether sample size calculation should be better based on the R coefficient or the ICC. Considering the particular case of 2 individuals per cluster, we theoretically demonstrated that R is not symmetrical around the 0.5 prevalence value. This in itself demonstrates the dependence of R on prevalence. We also conducted a simulation study to explore the case of both fixed and variable cluster sizes greater than 2. This simulation study demonstrated that R decreases when prevalence increases from 0 to 1. Both the analytical and simulation results demonstrate that R depends on the outcome prevalence. In terms of sample size calculation, we showed that an approach based on the ICC is preferable to an approach based on the R coefficient because with the former, the empirical power is closer to the nominal one. Hence, the R coefficient does not outperform the ICC for binary outcomes because it does not offer any advantage over the ICC.
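The dependence of R on prevalence can be illustrated with a short calculation. The sketch assumes the pairwise definition of R as the probability that two members of the same cluster both have the outcome, divided by the same probability under independence (π²); with an exchangeable ICC ρ this gives R = 1 + ρ(1 − π)/π. The abstract does not restate the definition, so treat this form, and the function name, as assumptions.

```python
def r_coefficient(icc, prevalence):
    """R for a pair of cluster members under an ASSUMED pairwise definition.

    P(both have the outcome) = pi^2 + icc * pi * (1 - pi), so
    R = P(both) / pi^2 = 1 + icc * (1 - pi) / pi.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    joint = prevalence ** 2 + icc * prevalence * (1.0 - prevalence)
    return joint / prevalence ** 2


if __name__ == "__main__":
    # Fixed ICC, varying prevalence: R falls as prevalence rises.
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"prevalence={p:.1f}  R={r_coefficient(0.05, p):.4f}")
```

Under this form, R at prevalence π differs from R at 1 − π for any non-zero ICC, which is consistent with the lack of symmetry around 0.5 reported above.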
Affiliation(s)
- Agnès Caille
  - Université de Tours, Université de Nantes, INSERM, SPHERE U1246, Tours, France
  - INSERM CIC1415, CHRU de Tours, Tours, France
- Sandra M Eldridge
  - Centre for Primary Care and Public Health, Queen Mary University of London, London, UK
- Bruno Giraudeau
  - Université de Tours, Université de Nantes, INSERM, SPHERE U1246, Tours, France
  - INSERM CIC1415, CHRU de Tours, Tours, France
8
Mbekwe Yepnang A, Caille A, Eldridge S, Giraudeau B. A note about the R Coefficient, the intraclass correlation coefficient and their association with outcome prevalence. Rev Epidemiol Sante Publique 2019. [DOI: 10.1016/j.respe.2019.03.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] [Open access]
9
Westgate PM. A readily available improvement over method of moments for intra-cluster correlation estimation in the context of cluster randomized trials and fitting a GEE-type marginal model for binary outcomes. Clin Trials 2018; 16:41-51. [PMID: 30295512 DOI: 10.1177/1740774518803635] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/28/2023]
Abstract
BACKGROUND/AIMS Cluster randomized trials are popular in health-related research due to the need or desire to randomize clusters of subjects to different trial arms as opposed to randomizing each subject individually. As outcomes from subjects within the same cluster tend to be more alike than outcomes from subjects within other clusters, an exchangeable correlation arises that is measured via the intra-cluster correlation coefficient. Intra-cluster correlation coefficient estimation is especially important due to the increasing awareness of the need to publish such values from studies in order to help guide the design of future cluster randomized trials. Therefore, numerous methods have been proposed to accurately estimate the intra-cluster correlation coefficient, with much attention given to binary outcomes. As marginal models are often of interest, we focus on intra-cluster correlation coefficient estimation in the context of fitting such a model with binary outcomes using generalized estimating equations. Traditionally, intra-cluster correlation coefficient estimation with generalized estimating equations has been based on the method of moments, although such estimators can be negatively biased. Furthermore, alternative estimators that work well, such as the analysis of variance estimator, are not as readily applicable in the context of practical data analyses with generalized estimating equations. Therefore, in this article we assess, in terms of bias, the readily available residual pseudo-likelihood approach to intra-cluster correlation coefficient estimation with the GLIMMIX procedure of SAS (SAS Institute, Cary, NC). Furthermore, we study a possible corresponding approach to confidence interval construction for the intra-cluster correlation coefficient. METHODS We utilize a simulation study and application example to assess bias in intra-cluster correlation coefficient estimates obtained from GLIMMIX using residual pseudo-likelihood. This estimator is contrasted with method of moments and analysis of variance estimators, which are standards of comparison. The approach to confidence interval construction is assessed by examining coverage probabilities. RESULTS Overall, the residual pseudo-likelihood estimator performs very well. It has considerably less bias than moment estimators, which are its competitors for general generalized estimating equation-based analyses, and therefore, it is a major improvement in practice. Furthermore, it works almost as well as analysis of variance estimators when they are applicable. Confidence intervals have near-nominal coverage when the intra-cluster correlation coefficient estimate has negligible bias. CONCLUSION Our results show that the residual pseudo-likelihood estimator is a good option for intra-cluster correlation coefficient estimation when conducting a generalized estimating equation-based analysis of binary outcome data arising from cluster randomized trials. The estimator is practical in that it is simply a result from fitting a marginal model with GLIMMIX, and a confidence interval can be easily obtained. An additional advantage is that, unlike most other options for performing generalized estimating equation-based analyses, GLIMMIX provides analysts the option to utilize small-sample adjustments that ensure valid inference.
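For readers unfamiliar with the analysis of variance estimator used above as a standard of comparison, here is a minimal sketch of the usual one-way ANOVA form with unequal cluster sizes. It is not the residual pseudo-likelihood estimator studied in the article, it ignores covariates and trial arms, and the function name is hypothetical.

```python
def anova_icc(clusters):
    """One-way ANOVA estimator of the ICC.

    `clusters` is a list of lists of outcomes (0/1 for a binary outcome).
    ICC = (MSB - MSW) / (MSB + (n0 - 1) * MSW), where n0 is the usual
    adjusted mean cluster size for unequal cluster sizes.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 clusters and more subjects than clusters")

    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    # Between- and within-cluster sums of squares.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = msb + (n0 - 1.0) * msw
    if denom == 0:
        raise ValueError("ICC undefined: no variation in the outcome")
    return (msb - msw) / denom


if __name__ == "__main__":
    data = [[1, 1, 0, 1], [0, 0, 0, 1], [1, 0, 1, 1], [0, 0, 0, 0]]
    print(f"ANOVA ICC estimate: {anova_icc(data):.3f}")
```

The estimate can be negative when outcomes vary more within clusters than between them; whether to truncate at zero is a reporting choice this sketch leaves to the analyst.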
Affiliation(s)
- Philip M Westgate
  - Department of Biostatistics, College of Public Health, University of Kentucky, Lexington, KY, USA
10
Wu S, Wong WK, Crespi CM. Maximin optimal designs for cluster randomized trials. Biometrics 2017; 73:916-926. [PMID: 28182835 PMCID: PMC5550375 DOI: 10.1111/biom.12659] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/01/2015] [Revised: 04/01/2016] [Accepted: 12/01/2016] [Indexed: 12/13/2022]
Abstract
We consider design issues for cluster randomized trials (CRTs) with a binary outcome where both unit costs and intraclass correlation coefficients (ICCs) in the two arms may be unequal. We first propose a design that maximizes cost efficiency (CE), defined as the ratio of the precision of the efficacy measure to the study cost. Because such designs can be highly sensitive to the unknown ICCs and the anticipated success rates in the two arms, a local strategy based on a single set of best guesses for the ICCs and success rates can be risky. To mitigate this issue, we propose a maximin optimal design that permits ranges of values to be specified for the success rate and the ICC in each arm. We derive maximin optimal designs for three common measures of the efficacy of the intervention, risk difference, relative risk and odds ratio, and study their properties. Using a real cancer control and prevention trial example, we ascertain the efficiency of the widely used balanced design relative to the maximin optimal design and show that the former can be quite inefficient and less robust to mis-specifications of the ICCs and the success rates in the two arms.
Affiliation(s)
- Sheng Wu
  - Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095-1772
- Weng Kee Wong
  - Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095-1772
- Catherine M. Crespi
  - Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA 90095-1772
11
Turner EL, Li F, Gallis JA, Prague M, Murray DM. Review of Recent Methodological Developments in Group-Randomized Trials: Part 1-Design. Am J Public Health 2017; 107:907-915. [PMID: 28426295 PMCID: PMC5425852 DOI: 10.2105/ajph.2017.303706] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Accepted: 02/05/2017] [Indexed: 11/04/2022]
Abstract
In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis.
Affiliation(s)
- Elizabeth L Turner
  - Elizabeth L. Turner and John A. Gallis are with the Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, and the Duke Global Health Institute, Duke University. Fan Li is with the Department of Biostatistics and Bioinformatics, Duke University. Melanie Prague is with the Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, and Inria, project team SISTM, Bordeaux, France. David M. Murray is with the Office of Disease Prevention, Division of Program Coordination and Strategic Planning, and the Office of the Director, National Institutes of Health, Rockville, MD.
- Fan Li
- John A Gallis
- Melanie Prague
- David M Murray
12
Westgate PM. Intra-cluster correlation selection for cluster randomized trials. Stat Med 2016; 35:3272-84. [DOI: 10.1002/sim.6922] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/18/2015] [Revised: 01/14/2016] [Accepted: 02/07/2016] [Indexed: 12/15/2022]
Affiliation(s)
- Philip M. Westgate
  - Department of Biostatistics, College of Public Health; University of Kentucky; Lexington 40536 KY U.S.A
13
Hund L, Bedrick EJ, Pagano M. Choosing a Cluster Sampling Design for Lot Quality Assurance Sampling Surveys. PLoS One 2015; 10:e0129564. [PMID: 26125967 PMCID: PMC4488393 DOI: 10.1371/journal.pone.0129564] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/08/2014] [Accepted: 05/11/2015] [Indexed: 11/30/2022] [Open access]
Abstract
Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis.
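As background for the "standard binomial model" case mentioned above, the sketch below computes the two misclassification risks of a simple LQAS decision rule when clustering is ignored. The decision convention (classify a lot as acceptable when the number of successes is at least `d`), the threshold names `p_upper` and `p_lower`, and the function names are assumptions for illustration; the cluster designs compared in the paper adjust these calculations for clustering, which this sketch does not do.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    if x < 0:
        return 0.0
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(min(x, n) + 1))


def lqas_risks(n, d, p_upper, p_lower):
    """Misclassification risks of the rule 'acceptable if successes >= d'.

    alpha: P(classified unacceptable | true coverage = p_upper)
    beta:  P(classified acceptable   | true coverage = p_lower)
    """
    alpha = binom_cdf(d - 1, n, p_upper)
    beta = 1.0 - binom_cdf(d - 1, n, p_lower)
    return alpha, beta


if __name__ == "__main__":
    # Illustrative design values, not taken from the paper.
    alpha, beta = lqas_risks(n=19, d=13, p_upper=0.8, p_lower=0.5)
    print(f"alpha={alpha:.3f}  beta={beta:.3f}")
```

Positive intra-cluster correlation makes the count of successes more variable than the binomial model assumes, so a rule tuned with these formulas would generally have larger risks than computed here.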
Affiliation(s)
- Lauren Hund
- Department of Family and Community Medicine, University of New Mexico, Albuquerque, NM, USA
- Edward J Bedrick
- Department of Biostatistics and Informatics, University of Colorado, Aurora, CO, USA
- Marcello Pagano
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
|
14
|
Nørskov AK, Lundstrøm LH, Rosenstock CV, Wetterslev J. Detailed statistical analysis plan for the difficult airway management (DIFFICAIR) trial. Trials 2014; 15:173. [PMID: 24885548 PMCID: PMC4030275 DOI: 10.1186/1745-6215-15-173] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/03/2014] [Accepted: 05/08/2014] [Indexed: 02/08/2023]
Abstract
Background Preoperative airway assessment in Denmark is based on a non-specific clinical assessment left to the discretion of the responsible anesthesiologist. The DIFFICAIR trial compares the effect of using a systematic and consistent airway assessment versus a non-specific clinical assessment on the frequency of unanticipated difficult airway management. To prevent outcome bias and selective reporting, we hereby present a detailed statistical analysis plan as an amendment (update) to the previously published protocol for the DIFFICAIR trial. Method/Design The DIFFICAIR trial is a stratified, parallel group, cluster (cluster = department) randomized multicenter trial involving 28 departments of anesthesia in Denmark randomized to airway assessment either by the Simplified Airway Risk Index (SARI) or by a usual non-specific assessment. Data from patients’ preoperative airway assessment are registered in the Danish Anesthesia Database. An objective score grading the severity of the intubations, as well as the frequency of unanticipated difficult intubation, is measured for each group. Primary outcome measures are the fraction of unanticipated difficult and easy intubations. The database is programmed so that the registration of the SARI is mandatory for the intervention group but invisible to controls. Data collection commenced in October 2012 and ended in late December 2013. Conclusion We intend to increase the transparency of the data analyses regarding the DIFFICAIR trial by an a priori publication of a statistical analysis plan. Trial registration ClinicalTrials.gov: NCT01718561.
Affiliation(s)
- Anders Kehlet Nørskov
- Department of Anaesthesiology, Nordsjællands Hospital, Copenhagen University Hospital, Hillerød, Capital region of Denmark 3400, Denmark.
|
15
|
Litaker MS, Gordan VV, Rindal DB, Fellows JL, Gilbert GH. Cluster Effects in a National Dental PBRN restorative study. J Dent Res 2013; 92:782-7. [PMID: 23857643 DOI: 10.1177/0022034513497752] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]
Abstract
Items in clusters, such as patients of the same clinician or teeth within the same patient, tend to be more similar than items from different groups. This within-group similarity, represented by the intraclass correlation coefficient (ICC), reduces precision, yielding less statistical power and wider confidence intervals, compared with non-clustered samples of the same size. This must be considered in the design of studies including clusters. We present ICC estimates from a study of 7,826 restorations placed in previously unrestored tooth surfaces of 4,672 patients by 222 clinicians in the National Dental Practice-Based Research Network, as a resource for sample size planning in restorative studies. Our findings suggest that magnitudes of ICCs in practice-based research can be substantial. These can have large effects on precision and the power to detect treatment effects. Generally, we found relatively large ICCs for characteristics that are influenced by clinician choice (e.g., 0.36 for rubber dam use). ICCs for outcomes within individual patients, such as tooth surfaces affected by a caries lesion, tended to be smaller (from 0.03 to 0.15), but were still sufficiently large to substantially affect statistical power. Clustering should be taken into account in the design of oral health studies and derivation of statistical power estimates for these studies (ClinicalTrials.gov, NCT00847470).
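The loss of precision described in this abstract is commonly summarized by the design effect, 1 + (m - 1) × ICC, for clusters of equal size m. The following is a minimal sketch under that equal-cluster-size assumption; the cluster size of 20 and the total of 2,000 observations are hypothetical values chosen for illustration, while the ICC values 0.36 and 0.03 are the ones quoted in the abstract. The function names are illustrative and do not come from the cited study.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation factor for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    m = 20          # hypothetical number of observations per cluster
    n_total = 2000  # hypothetical total sample size
    for icc in (0.36, 0.03):  # ICC values quoted in the abstract
        deff = design_effect(m, icc)
        ess = effective_sample_size(n_total, m, icc)
        print(f"ICC={icc:.2f}: design effect={deff:.2f}, effective n={ess:.0f}")
```

Under these assumed values, even the smaller ICC inflates the variance by more than half, which is consistent with the abstract's point that small ICCs can still substantially affect power.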
Affiliation(s)
- M S Litaker
- Department of Clinical and Community Sciences, School of Dentistry, University of Alabama at Birmingham, USA.
|
16
|
Berger VW. Reply to letter-to-the-editor: efficacy and degree of bias in knee injury prevention studies: a systematic review of RCTs. Clin Orthop Relat Res 2013; 471:341-2. [PMID: 23129476 PMCID: PMC3528924 DOI: 10.1007/s11999-012-2688-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Affiliation(s)
- Vance W. Berger
- Biometry Research Group, National Cancer Institute, and UMBC, Executive Plaza North, Suite 3131, 6130 Executive Boulevard, MSC 7354, Bethesda, MD 20892-7354 USA
|
17
|
Affiliation(s)
- Vance W Berger
- Biometry Research Group National Cancer Institute and UMBC Bethesda, MD 20892-7354, USA
|
18
|
Crespi CM, Wong WK, Wu S. Response to the letter ‘Conservative Handling of Cluster Randomized Trials’. Clin Trials 2012. [DOI: 10.1177/1740774512444638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Affiliation(s)
- Catherine M Crespi
- Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA, USA
- Weng K Wong
- Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA, USA
- Sheng Wu
- Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, CA, USA
|
19
|
Wu S, Crespi CM, Wong WK. Comparison of methods for estimating the intraclass correlation coefficient for binary responses in cancer prevention cluster randomized trials. Contemp Clin Trials 2012; 33:869-80. [PMID: 22627076 DOI: 10.1016/j.cct.2012.05.004] [Citation(s) in RCA: 164] [Impact Index Per Article: 13.7] [Received: 09/08/2011] [Revised: 05/07/2012] [Accepted: 05/13/2012] [Indexed: 01/02/2023]
Abstract
The intraclass correlation coefficient (ICC) is a fundamental parameter of interest in cluster randomized trials as it can greatly affect statistical power. We compare common methods of estimating the ICC in cluster randomized trials with binary outcomes, with a specific focus on their application to community-based cancer prevention trials with a primary outcome of self-reported cancer screening. Using three real data sets from cancer screening intervention trials with different numbers and types of clusters and cluster sizes, we obtained point estimates and 95% confidence intervals for the ICC using five methods: the analysis of variance estimator, the Fleiss-Cuzick estimator, the Pearson estimator, an estimator based on generalized estimating equations and an estimator from a random intercept logistic regression model. We compared estimates of the ICC for the overall sample and by study condition. Our results show that ICC estimates from different methods can be quite different, although confidence intervals generally overlap. The ICC varied substantially by study condition in two studies, suggesting that the common practice of assuming a common ICC across all clusters in the trial is questionable. A simulation study confirmed pitfalls of erroneously assuming a common ICC. Investigators should consider using sample size and analysis methods that allow the ICC to vary by study condition.
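Of the five methods listed in this abstract, the analysis of variance estimator is the simplest to write down. The sketch below uses the standard one-way ANOVA form, (MSB - MSW) / (MSB + (n0 - 1) × MSW), applied to per-cluster success counts. It is an illustration only, not the authors' code: the function name and the toy counts are hypothetical, and confidence intervals are not computed.

```python
from typing import Sequence


def anova_icc(successes: Sequence[int], sizes: Sequence[int]) -> float:
    """One-way ANOVA estimator of the ICC for binary outcomes.

    successes[i] is the number of positive responses in cluster i,
    sizes[i] is the number of individuals in cluster i.
    """
    k = len(sizes)                 # number of clusters
    n_total = sum(sizes)           # total number of individuals
    p_overall = sum(successes) / n_total

    # Between-cluster mean square.
    msb = sum(n * (y / n - p_overall) ** 2
              for y, n in zip(successes, sizes)) / (k - 1)
    # Within-cluster mean square; for 0/1 data the within-cluster
    # sum of squares in cluster i equals y_i * (1 - y_i / n_i).
    msw = sum(y * (1 - y / n)
              for y, n in zip(successes, sizes)) / (n_total - k)
    # Adjusted average cluster size, which handles unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    return (msb - msw) / (msb + (n0 - 1) * msw)


if __name__ == "__main__":
    # Hypothetical counts for two study conditions, estimated separately
    # so that the ICC is allowed to differ by condition.
    intervention = anova_icc([9, 2, 8, 3], [10, 10, 10, 10])
    control = anova_icc([5, 6, 4, 5], [10, 10, 10, 10])
    print(f"intervention ICC: {intervention:.3f}")
    print(f"control ICC: {control:.3f}")
```

Estimating the ICC separately for each condition, as in the usage lines, mirrors the abstract's recommendation not to assume a common ICC across all clusters. Note that this estimator can return negative values when clusters are more alike than chance would predict.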
Affiliation(s)
- Sheng Wu
- Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, Center for the Health Sciences, Los Angeles, CA 90095-1772, USA.
|