451
Hardt J, Herke M, Leonhart R. Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research. BMC Med Res Methodol 2012; 12:184. [PMID: 23216665 PMCID: PMC3538666 DOI: 10.1186/1471-2288-12-184] [Citation(s) in RCA: 88] [Impact Index Per Article: 6.8] [Received: 07/05/2012] [Accepted: 11/28/2012] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. METHODS A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r=.10) vs. moderate correlations (r=.50) with X's and Y. RESULTS The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. CONCLUSION More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3.
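The closing rule of thumb can be turned into a quick check before fitting an imputation model. The sketch below is illustrative only: it assumes the rule means at least three complete cases per variable in the imputation model, and the function names are hypothetical, not taken from the paper.

```python
def max_imputation_variables(n_complete_cases: int, cases_per_variable: int = 3) -> int:
    """Largest variable count allowed under the assumed reading of the 1 : 3 rule."""
    if n_complete_cases < 0 or cases_per_variable <= 0:
        raise ValueError("counts must be non-negative and the ratio positive")
    return n_complete_cases // cases_per_variable


def within_rule_of_thumb(n_variables: int, n_complete_cases: int,
                         cases_per_variable: int = 3) -> bool:
    """True if there are at least `cases_per_variable` complete cases per variable."""
    return n_variables * cases_per_variable <= n_complete_cases


if __name__ == "__main__":
    # Hypothetical example: 50 complete cases, analysis variables Y, X1, X2
    # plus 10 auxiliary variables (13 variables in total).
    print(max_imputation_variables(50))
    print(within_rule_of_thumb(13, 50))
```

Under this reading, adding 40 or 80 auxiliary variables to a sample of 50 would clearly violate the rule, which matches the direction of the reported warning.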
Affiliation(s)
- Jochen Hardt
- Medical Psychology and Medical Sociology, Clinic for Psychosomatic Medicine and Psychotherapy, University of Mainz, Duesbergweg 6, Mainz 55128, Germany.
452
Roumie CL, Hung AM, Greevy RA, Grijalva CG, Liu X, Murff HJ, Elasy TA, Griffin MR. Comparative effectiveness of sulfonylurea and metformin monotherapy on cardiovascular events in type 2 diabetes mellitus: a cohort study. Ann Intern Med 2012; 157:601-10. [PMID: 23128859 PMCID: PMC4667563 DOI: 10.7326/0003-4819-157-9-201211060-00003] [Citation(s) in RCA: 211] [Impact Index Per Article: 16.2] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The effects of sulfonylureas and metformin on outcomes of cardiovascular disease (CVD) in type 2 diabetes are not well-characterized. OBJECTIVE To compare the effects of sulfonylureas and metformin on CVD outcomes (acute myocardial infarction and stroke) or death. DESIGN Retrospective cohort study. SETTING National Veterans Health Administration databases linked to Medicare files. PATIENTS Veterans who initiated metformin or sulfonylurea therapy for diabetes. Patients with chronic kidney disease or serious medical illness were excluded. MEASUREMENTS Composite outcome of hospitalization for acute myocardial infarction or stroke, or death, adjusted for baseline demographic characteristics; medications; cholesterol, hemoglobin A1c, and serum creatinine levels; blood pressure; body mass index; health care utilization; and comorbid conditions. RESULTS Among 253 690 patients initiating treatment (98 665 with sulfonylurea therapy and 155 025 with metformin therapy), crude rates of the composite outcome were 18.2 per 1000 person-years in sulfonylurea users and 10.4 per 1000 person-years in metformin users (adjusted incidence rate difference, 2.2 [95% CI, 1.4 to 3.0] more CVD events with sulfonylureas per 1000 person-years; adjusted hazard ratio [aHR], 1.21 [CI, 1.13 to 1.30]). Results were consistent for both glyburide (aHR, 1.26 [CI, 1.16 to 1.37]) and glipizide (aHR, 1.15 [CI, 1.06 to 1.26]) in subgroups by CVD history, age, body mass index, and albuminuria; in a propensity score-matched cohort analysis; and in sensitivity analyses. LIMITATION Most of the veterans in the study population were white men; data on women and minority groups were limited but reflective of the Veterans Health Administration population. CONCLUSION Use of sulfonylureas compared with metformin for initial treatment of diabetes was associated with an increased hazard of CVD events or death. PRIMARY FUNDING SOURCE Agency for Healthcare Research and Quality and the U.S. Department of Health and Human Services.
Affiliation(s)
- Christianne L Roumie
- Veterans Affairs Tennessee Valley Healthcare System, 1310 24th Avenue South, Geriatric Research Education Clinical Center, Nashville, TN 37212, USA.
453
Williamson EJ, Forbes A, Wolfe R. Doubly robust estimators of causal exposure effects with missing data in the outcome, exposure or a confounder. Stat Med 2012; 31:4382-400. [PMID: 23086504 DOI: 10.1002/sim.5643] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 10/30/2011] [Revised: 08/31/2012] [Accepted: 09/07/2012] [Indexed: 11/06/2022]
Abstract
We consider the estimation of the causal effect of a binary exposure on a continuous outcome. Confounding and missing data are both likely to occur in practice when observational data are used to estimate this causal effect. In dealing with each of these problems, model misspecification is likely to introduce bias. We present augmented inverse probability weighted (AIPW) estimators that account for both confounding and missing data, with the latter occurring in a single variable only. These estimators have an element of robustness to misspecification of the models used. Our estimators require two models to be specified to deal with confounding and two to deal with missing data. Only one of each of these models needs to be correctly specified. When either the outcome or the exposure of interest is missing, we derive explicit expressions for the AIPW estimator. When a confounder is missing, explicit derivation is complex, so we use a simple algorithm, which can be applied using standard statistical software, to obtain an approximation to the AIPW estimator.
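For orientation, the standard complete-data AIPW estimator of the average effect of a binary exposure on a continuous outcome can be written as below. This is the textbook form, not the missing-data extension derived in the paper; the symbols are notation assumed here: $Z_i$ is the exposure, $Y_i$ the outcome, $X_i$ the confounders, $\hat e$ the fitted propensity score model and $\hat m_z$ the fitted outcome regression for exposure level $z$.

```latex
\hat{\Delta}_{\mathrm{AIPW}}
  = \frac{1}{n}\sum_{i=1}^{n}\left[
      \frac{Z_i Y_i}{\hat e(X_i)}
      - \frac{Z_i - \hat e(X_i)}{\hat e(X_i)}\,\hat m_1(X_i)
    \right]
  - \frac{1}{n}\sum_{i=1}^{n}\left[
      \frac{(1 - Z_i)\,Y_i}{1 - \hat e(X_i)}
      + \frac{Z_i - \hat e(X_i)}{1 - \hat e(X_i)}\,\hat m_0(X_i)
    \right]
```

The estimator is consistent if either the propensity score model or the outcome regression is correctly specified, which is the "only one of each" property the abstract describes for the confounding part of the problem.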
Affiliation(s)
- E J Williamson
- Department of Epidemiology and Preventive Medicine, Monash University, Australia.
454
Descalzo MÁ, Garcia VV, González-Alvaro I, Carbonell J, Balsa A, Sanmartí R, Lisbona P, Hernandez-Barrera V, Jiménez-Garcia R, Carmona L. Tackling missing radiographic progression data: multiple imputation technique compared with inverse probability weights and complete case analysis. Rheumatology (Oxford) 2012; 52:331-6. [PMID: 23024115 DOI: 10.1093/rheumatology/kes245] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/09/2023] Open
Abstract
OBJECTIVE To describe the results of different statistical ways of addressing radiographic outcome affected by missing data--multiple imputation technique, inverse probability weights and complete case analysis--using data from an observational study. METHODS A random sample of 96 RA patients was selected for a follow-up study in which radiographs of hands and feet were scored. Radiographic progression was tested by comparing the change in the total Sharp-van der Heijde radiographic score (TSS) and the joint erosion score (JES) from baseline to the end of the second year of follow-up. MI technique, inverse probability weights in weighted estimating equation (WEE) and CC analysis were used to fit a negative binomial regression. RESULTS Major predictors of radiographic progression were JES and joint space narrowing (JSN) at baseline, together with baseline disease activity measured by DAS28 for TSS and MTX use for JES. Results from CC analysis show larger coefficients and s.e.s compared with MI and weighted techniques. The results from the WEE model were quite in line with those of MI. CONCLUSION If it seems plausible that CC or MI analysis may be valid, then MI should be preferred because of its greater efficiency. CC analysis resulted in inefficient estimates or, translated into non-statistical terminology, could guide us into inaccurate results and unwise conclusions. The methods discussed here will contribute to the use of alternative approaches for tackling missing data in observational studies.
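The weighting idea behind inverse probability weights can be illustrated on a simple mean. This is only a sketch: the study itself fitted a negative binomial regression with weighted estimating equations, and the response probabilities below are made-up values that in practice would come from a fitted model for being a complete case. The function name is hypothetical.

```python
def ipw_mean(observed_values, prob_observed):
    """Weighted mean of complete cases, each weighted by 1 / P(observed)."""
    if len(observed_values) != len(prob_observed):
        raise ValueError("one probability is needed per observed value")
    if any(p <= 0 or p > 1 for p in prob_observed):
        raise ValueError("probabilities must lie in (0, 1]")
    weights = [1.0 / p for p in prob_observed]
    weighted_sum = sum(w * y for w, y in zip(weights, observed_values))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical complete cases: the second patient belongs to a group that
    # is observed only a quarter of the time, so it counts for four patients.
    print(ipw_mean([1.0, 3.0], [0.5, 0.25]))
```

Complete cases that resemble frequently missing patients receive larger weights, which is how the weighted analysis compensates for selective dropout.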
455
Decline in early childhood respiratory tract infections in the Norwegian mother and child cohort study after introduction of pneumococcal conjugate vaccination. Pediatr Infect Dis J 2012; 31:951-5. [PMID: 22627867 PMCID: PMC3421039 DOI: 10.1097/inf.0b013e31825d2f76] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/26/2022]
Abstract
BACKGROUND The 7-valent pneumococcal conjugate vaccine (PCV7) was introduced into the Norwegian Childhood Immunization Program in 2006. A substantial effectiveness of PCV7 immunization against invasive pneumococcal disease has been demonstrated, whereas evidence of its impact on respiratory tract infections is less consistent. METHODS This study included children participating in the Norwegian Mother and Child Cohort Study, which recruited pregnant women between 1999 and 2008. Maternal report of acute otitis media (AOM), lower respiratory tract infections (LRTIs) and asthma in the child was compared by PCV7 immunization status, as obtained from the Norwegian Immunization Registry. Generalized linear models with the log-link function were used to report adjusted relative risks (RRs) and 95% confidence intervals (CIs). RESULTS For children who had received 3 or more PCV7 immunizations by 12 months of age, the adjusted RRs of AOM and LRTIs between 12 and 18 months were 0.86 (95% CI: 0.81, 0.91) and 0.78 (95% CI: 0.70, 0.87) respectively, when compared with nonimmunized children. A reduced risk of AOM, RR 0.92 (95% CI: 0.90, 0.94), and LRTIs, RR 0.75 (95% CI: 0.71, 0.80), between 18 and 36 months of age was also identified among children who had received 3 or more immunizations by 18 months of age. No association was seen between PCV7 immunization and asthma at 36 months of age. CONCLUSION Reduced incidences of AOM and LRTIs before 36 months of age were observed among children immunized with PCV7 through the childhood immunization program.
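To make the reported quantity concrete, an unadjusted relative risk and its 95% confidence interval can be computed from a two-by-two table on the log scale. This is a minimal sketch with hypothetical counts; the study reported adjusted estimates from log-link generalized linear models, which this does not reproduce, and the function name is an assumption.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Unadjusted relative risk with a Wald confidence interval on the log scale."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 immunized and 40 of 100 nonimmunized
    # children with the outcome.
    rr, lower, upper = relative_risk(20, 100, 40, 100)
    print(f"RR {rr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

Working on the log scale keeps the interval positive and asymmetric around the point estimate, as in the intervals quoted in the abstract.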
456
Jonker FAM, Calis JCJ, van Hensbroek MB, Phiri K, Geskus RB, Brabin BJ, Leenstra T. Iron status predicts malaria risk in Malawian preschool children. PLoS One 2012; 7:e42670. [PMID: 22916146 PMCID: PMC3420896 DOI: 10.1371/journal.pone.0042670] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 03/26/2012] [Accepted: 07/10/2012] [Indexed: 11/30/2022] Open
Abstract
Introduction Iron deficiency is highly prevalent in pre-school children in developing countries and an important health problem in sub-Saharan Africa. A debate exists on the possible protective effect of iron deficiency against malaria and other infections; yet consensus is lacking due to limited data. Recent studies have focused on the risks of iron supplementation but the effect of an individual's iron status on malaria risk remains unclear. Studies of iron status in areas with a high burden of infections often are exposed to bias. The aim of this study was to assess the predictive value of baseline iron status for malaria risk explicitly taking potential biases into account. Methods and materials We prospectively assessed the relationship between baseline iron deficiency (serum ferritin <30 µg/L) and malaria risk in a cohort of 727 Malawian preschool children during a year of follow-up. Data were analyzed using marginal structural Cox regression models and confounders were selected using causal graph theory. Sensitivity of results to bias resulting from misclassification of iron status by concurrent inflammation and to bias from unmeasured confounding were assessed using modern causal inference methods. Results and Conclusions The overall incidence of malaria parasitemia and clinical malaria was 1.9 (95% CI 1.8–2.0) and 0.7 (95% CI 0.6–0.8) events per person-year, respectively. Children with iron deficiency at baseline had a lower incidence of malaria parasitemia and clinical malaria during a year of follow-up; adjusted hazard ratios 0.55 (95%-CI:0.41–0.74) and 0.49 (95%-CI:0.33–0.73), respectively. Our results suggest that iron deficiency protects against malaria parasitemia and clinical malaria in young children. Therefore the clinical importance of treating iron deficiency in a pre-school child should be weighed carefully against potential harms. In malaria endemic areas treatment of iron deficiency in children requires sustained prevention of malaria.
Affiliation(s)
- Femkje A M Jonker
- Global Child Health Group, Emma Children's Hospital, Academic Medical Centre, Amsterdam, The Netherlands.
457
Karahalios A, Baglietto L, Carlin JB, English DR, Simpson JA. A review of the reporting and handling of missing data in cohort studies with repeated assessment of exposure measures. BMC Med Res Methodol 2012; 12:96. [PMID: 22784200 PMCID: PMC3464662 DOI: 10.1186/1471-2288-12-96] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Received: 12/12/2011] [Accepted: 07/11/2012] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Retaining participants in cohort studies with multiple follow-up waves is difficult. Commonly, researchers are faced with the problem of missing data, which may introduce biased results as well as a loss of statistical power and precision. The STROBE guidelines von Elm et al. (Lancet, 370:1453-1457, 2007); Vandenbroucke et al. (PLoS Med, 4:e297, 2007) and the guidelines proposed by Sterne et al. (BMJ, 338:b2393, 2009) recommend that cohort studies report on the amount of missing data, the reasons for non-participation and non-response, and the method used to handle missing data in the analyses. We have conducted a review of publications from cohort studies in order to document the reporting of missing data for exposure measures and to describe the statistical methods used to account for the missing data. METHODS A systematic search of English language papers published from January 2000 to December 2009 was carried out in PubMed. Prospective cohort studies with a sample size greater than 1,000 that analysed data using repeated measures of exposure were included. RESULTS Among the 82 papers meeting the inclusion criteria, only 35 (43%) reported the amount of missing data according to the suggested guidelines. Sixty-eight papers (83%) described how they dealt with missing data in the analysis. Most of the papers excluded participants with missing data and performed a complete-case analysis (n=54, 66%). Other papers used more sophisticated methods including multiple imputation (n=5) or fully Bayesian modeling (n=1). Methods known to produce biased results were also used, for example, Last Observation Carried Forward (n=7), the missing indicator method (n=1), and mean value substitution (n=3). For the remaining 14 papers, the method used to handle missing data in the analysis was not stated. CONCLUSIONS This review highlights the inconsistent reporting of missing data in cohort studies and the continuing use of inappropriate methods to handle missing data in the analysis. Epidemiological journals should invoke the STROBE guidelines as a framework for authors so that the amount of missing data and how this was accounted for in the analysis is transparent in the reporting of cohort studies.
Affiliation(s)
- Amalia Karahalios
- Cancer Epidemiology Centre, Cancer Council Victoria, Carlton, VIC, Australia
458
Lee KJ, Carlin JB. Recovery of information from multiple imputation: a simulation study. Emerg Themes Epidemiol 2012; 9:3. [PMID: 22695083 PMCID: PMC3544721 DOI: 10.1186/1742-7622-9-3] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Received: 10/19/2011] [Accepted: 05/22/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases. METHODS Simulated datasets (n = 1000) drawn from a synthetic population were used to explore information recovery from multiple imputation in estimating the coefficient of a binary exposure variable when various proportions of data (10-90%) were set missing at random in a highly-skewed continuous covariate or in the binary exposure. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage non-normality. Bias, precision, mean-squared error and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses. RESULTS For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable, compared with complete case analysis, with larger gains in precision with more missing data. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate's effect when skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure. CONCLUSIONS Although multiple imputation can be useful if covariates required for confounding adjustment are missing, benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on effective application of this method.
Affiliation(s)
- Katherine J Lee
- Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, The Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia.
459
Héraud-Bousquet V, Larsen C, Carpenter J, Desenclos JC, Le Strat Y. Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data. BMC Med Res Methodol 2012; 12:73. [PMID: 22681630 PMCID: PMC3537570 DOI: 10.1186/1471-2288-12-73] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 06/15/2011] [Accepted: 05/09/2012] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Multiple Imputation as usually implemented assumes that data are Missing At Random (MAR), meaning that the underlying missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the method proposed by Carpenter et al. (2007). This approach aims to approximate inferences under a Missing Not At Random (MNAR) mechanism by reweighting estimates obtained after multiple imputation where the weights depend on the assumed degree of departure from the MAR assumption. METHODS The method is illustrated with epidemiological data from a surveillance system of hepatitis C virus (HCV) infection in France during the 2001-2007 period. The subpopulation studied included 4343 HCV infected patients who reported drug use. Risk factors for severe liver disease were assessed. After performing complete-case and multiple imputation analyses, we applied the sensitivity analysis to 3 risk factors of severe liver disease: past excessive alcohol consumption, HIV co-infection and infection with HCV genotype 3. RESULTS In these data, the association between severe liver disease and HIV was underestimated, if given the observed data the chance of observing HIV status is high when this is positive. Inferences for two other risk factors were robust to plausible local departures from the MAR assumption. CONCLUSIONS We have demonstrated the practical utility of, and advocate, a pragmatic widely applicable approach to exploring plausible departures from the MAR assumption post multiple imputation. We have developed guidelines for applying this approach to epidemiological studies.
Affiliation(s)
- Vanina Héraud-Bousquet
- Département des maladies infectieuses, Institut de Veille Sanitaire, 12 rue du Val d'Osne, 94415 St Maurice, France.
460
Determinants of aortic stiffness: 16-year follow-up of the Whitehall II study. PLoS One 2012; 7:e37165. [PMID: 22629363 PMCID: PMC3358295 DOI: 10.1371/journal.pone.0037165] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 11/16/2011] [Accepted: 04/16/2012] [Indexed: 11/30/2022] Open
Abstract
Background Aortic stiffness is a strong predictor of cardiovascular disease endpoints. Cross-sectional studies have shown associations of various cardiovascular risk factors with aortic pulse wave velocity, a measure of aortic stiffness, but the long-term impact of these factors on aortic stiffness is unknown. Methods In 3,769 men and women from the Whitehall II cohort, a wide range of traditional and novel cardiovascular risk factors were determined at baseline (1991–1993) and aortic pulse wave velocity was measured at follow-up (2007–2009). The prospective associations between each baseline risk factor and aortic pulse wave velocity at follow-up were assessed through sex stratified linear regression analysis adjusted for relevant confounders. Missing data on baseline determinants were imputed using the Multivariate Imputation by Chained Equations. Results Among men, the strongest predictors were waist circumference, waist-hip ratio, heart rate and interleukin 1 receptor antagonist, and among women, adiponectin, triglycerides, pulse pressure and waist-hip ratio. The impact of a 10 centimeter increase in waist circumference on aortic pulse wave velocity was twice as large for men compared with women (men: 0.40 m/s (95%-CI: 0.24;0.56); women: 0.17 m/s (95%-CI: −0.01;0.35)), whereas the opposite was true for the impact of a two-fold increase in adiponectin (men: −0.30 m/s (95%-CI: −0.51;−0.10); women: −0.61 m/s (95%-CI: −0.86;−0.35)). Conclusion In this large prospective study, central obesity was a strong predictor of aortic stiffness. Additionally, heart rate in men and adiponectin in women predicted aortic pulse wave velocity suggesting that strategies to prevent aortic stiffening should be focused differently by sex.
461
Abstract
OBJECTIVE To examine the association between preterm and low-birth-weight (PTLBW) delivery and maternal occupation among Latina women in California. METHODS A cohort of 1024 Latina women in Stockton, California, was observed from baseline to delivery. The association between PTLBW delivery and maternal occupation (farmwork, nonfarmwork, no work) was analyzed using multiple logistic regression models. RESULTS Demographic characteristics varied widely between the three occupation groups. The adjusted odds ratio of a PTLBW delivery for farmworkers compared with women who did not work was 1.28 (95% CI, 0.65 to 2.54). CONCLUSIONS We did not observe a statistically significant association between PTLBW delivery and farmwork in this population. Nevertheless, the relationship between acculturation and risky health behaviors suggests that studies investigating the association between maternal employment and adverse pregnancy outcomes among Latinas need to account for a participant's acculturation status.
462
Treyvaud K, Inder TE, Lee KJ, Northam EA, Doyle LW, Anderson PJ. Can the home environment promote resilience for children born very preterm in the context of social and medical risk? J Exp Child Psychol 2012; 112:326-37. [PMID: 22480454 DOI: 10.1016/j.jecp.2012.02.009] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 11/07/2011] [Revised: 02/27/2012] [Accepted: 02/27/2012] [Indexed: 10/28/2022]
Abstract
Relationships between the home environment and early developmental outcomes were examined in 166 children born very preterm in one tertiary maternity hospital to explore whether a more optimal home environment could promote resilience. In particular, we explored whether this effect was apparent over and above social risk and children's biological risk, as measured by cerebral white matter abnormality (WMA) evaluated using magnetic resonance imaging (MRI) at term-corrected age and length of hospital stay (LOS), and whether the effect of the home environment differed according to WMA. The home environment and social-emotional outcomes were assessed at 2 years' corrected age using the Home Screening Questionnaire (HSQ) and the Infant-Toddler Social and Emotional Assessment (ITSEA). Children's cognitive and motor development was assessed using the Bayley Scales of Infant Development II. A more optimal home environment was associated with better cognitive and social-emotional development after adjusting for social risk, WMA, and LOS. Neonatal cerebral WMA moderated the relationship between the home environment and dysregulation problems only, such that the home environment had less effect on dysregulation for children with mild or moderate to severe WMA. The need to support parents to create an optimal home environment is discussed.
Affiliation(s)
- Karli Treyvaud
- Murdoch Children's Research Institute, Royal Children's Hospital, Parkville, Victoria 3052, Australia.
463
Abstract
Two approaches commonly used to deal with missing data are multiple imputation (MI) and inverse-probability weighting (IPW). IPW is also used to adjust for unequal sampling fractions. MI is generally more efficient than IPW but more complex. Whereas IPW requires only a model for the probability that an individual has complete data (a univariate outcome), MI needs a model for the joint distribution of the missing data (a multivariate outcome) given the observed data. Inadequacies in either model may lead to important bias if large amounts of data are missing. A third approach combines MI and IPW to give a doubly robust estimator. A fourth approach (IPW/MI) combines MI and IPW but, unlike doubly robust methods, imputes only isolated missing values and uses weights to account for remaining larger blocks of unimputed missing data, such as would arise, e.g., in a cohort study subject to sample attrition, and/or unequal sampling fractions. In this article, we examine the performance, in terms of bias and efficiency, of IPW/MI relative to MI and IPW alone and investigate whether the Rubin's rules variance estimator is valid for IPW/MI. We prove that the Rubin's rules variance estimator is valid for IPW/MI for linear regression with an imputed outcome, we present simulations supporting the use of this variance estimator in more general settings, and we demonstrate that IPW/MI can have advantages over alternatives. IPW/MI is applied to data from the National Child Development Study.
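The Rubin's rules variance estimator discussed in this abstract combines the within-imputation and between-imputation variability of the per-imputation estimates. The sketch below shows the standard pooling formulas for a scalar parameter; the function name and the input numbers are illustrative, and it does not implement the IPW/MI weighting step itself.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances using Rubin's rules.

    Returns the pooled estimate and the total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation variance
    and B is the between-imputation (sample) variance of the estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # divisor m - 1
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


if __name__ == "__main__":
    # Hypothetical results from three imputed datasets.
    estimate, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(estimate, total)
```

The factor (1 + 1/m) inflates the between-imputation component to account for using a finite number of imputations.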
464
Groenwold RHH, White IR, Donders ART, Carpenter JR, Altman DG, Moons KGM. Missing covariate data in clinical research: when and when not to use the missing-indicator method for analysis. CMAJ 2012; 184:1265-9. [PMID: 22371511 DOI: 10.1503/cmaj.110977] [Citation(s) in RCA: 305] [Impact Index Per Article: 23.5] [Indexed: 11/01/2022] Open
Affiliation(s)
- Rolf H H Groenwold
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands.
465
Coghill AE, Hansen S, Littman AJ. Risk factors for eclampsia: a population-based study in Washington State, 1987-2007. Am J Obstet Gynecol 2011; 205:553.e1-7. [PMID: 21855842 DOI: 10.1016/j.ajog.2011.06.079] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 01/31/2011] [Revised: 05/24/2011] [Accepted: 06/21/2011] [Indexed: 01/08/2023]
Abstract
OBJECTIVE We sought to investigate whether previously identified risk factors are associated with eclampsia in a contemporary, heterogeneous cohort of women. STUDY DESIGN Data were collected from birth certificate and hospital discharge records and used to conduct a population-based case-control study among women giving birth to singletons in Washington State from 1987 through 2007. We used multivariable logistic regression to estimate odds ratios and 95% confidence intervals. Multiple imputation procedures were used to address missing data. RESULTS Risk of eclampsia was greater in nulliparous compared to parous women. Being a young mother (< 20 years) or an older mother (≥ 35 years) were each associated with elevated eclampsia risk. Longer birth interval, low socioeconomic status, gestational diabetes, prepregnancy obesity, and weight gain during pregnancy above or below recommended guidelines were positively associated with eclampsia. Multiparity and smoking were inversely associated with eclampsia risk. CONCLUSION Exposures identified more than a decade ago continue to be associated with eclampsia in contemporary birth cohorts.
466
Triant VA, Josephson F, Rochester CG, Althoff KN, Marcus K, Munk R, Cooper C, D'Agostino RB, Costagliola D, Sabin CA, Williams PL, Hughes S, Post WS, Chandra-Strobos N, Guaraldi G, Young SS, Obenchain R, Bedimo R, Miller V, Strobos J. Adverse outcome analyses of observational data: assessing cardiovascular risk in HIV disease. Clin Infect Dis 2011; 54:408-13. [PMID: 22095570 DOI: 10.1093/cid/cir829] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/20/2022] Open
Abstract
Clinical decisions are ideally based on randomized trials but must often rely on observational data analyses, which are less straightforward and more influenced by methodology. The authors, from a series of expert roundtables convened by the Forum for Collaborative HIV Research on the use of observational studies to assess cardiovascular disease risk in human immunodeficiency virus infection, recommend that clinicians who review or interpret epidemiological publications consider 7 key statistical issues: (1) clear explanation of confounding and adjustment; (2) handling and impact of missing data; (3) consistency and clinical relevance of outcome measurements and covariate risk factors; (4) multivariate modeling techniques including time-dependent variables; (5) how multiple testing is addressed; (6) distinction between statistical and clinical significance; and (7) need for confirmation from independent databases. Recommendations to permit better understanding of potential methodological limitations include both responsible public access to de-identified source data, where permitted, and exploration of novel statistical methods.
Affiliation(s)
- V A Triant
- Department of Medicine, Division of Infectious Diseases, Massachusetts General Hospital, Boston, MA, USA
|
467
|
Immune activation, CD4+ T cell counts, and viremia exhibit oscillatory patterns over time in patients with highly resistant HIV infection. PLoS One 2011; 6:e21190. [PMID: 21701594 PMCID: PMC3118814 DOI: 10.1371/journal.pone.0021190] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 01/24/2011] [Accepted: 05/22/2011] [Indexed: 11/19/2022]
Abstract
The rates of immunologic and clinical progression are lower in patients with drug-resistant HIV compared to wild-type HIV. This difference is not fully explained by viral load. It has been argued that reductions in T cell activation and/or viral fitness might result in preserved target cells and an altered relationship between the level of viremia and the rate of CD4+ T cell loss. We tested this hypothesis over time in a cohort of patients with highly resistant HIV. Fifty-four antiretroviral-treated patients with multi-drug resistant HIV and detectable plasma HIV RNA were followed longitudinally. CD4+ T cell counts and HIV RNA levels were measured every 4 weeks and T cell activation (CD38/HLA-DR) was measured every 16 weeks. We found that the levels of CD4+ T cell activation over time were a strong independent predictor of CD4+ T cell counts while CD8+ T cell activation was more strongly associated with viremia. Using spectral analysis, we found strong evidence for oscillatory (or cyclic) behavior in CD4+ T cell counts, HIV RNA levels, and T cell activation. Each of the cell populations exhibited an oscillatory behavior with similar frequencies. Collectively, these data suggest that there may be a mechanistic link between T cell activation, CD4+ T cell counts, and viremia, and lend support to the hypothesis of altered predator-prey dynamics as a possible explanation of the stability of CD4+ T cell counts in the presence of sustained multi-drug resistant viremia.
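As a rough illustration of how spectral analysis can reveal cyclic behavior in a regularly sampled series, the sketch below computes a plain periodogram and reports the dominant period. It is not the authors' analysis: the function names, the synthetic CD4-like series, and its 32-week cycle are assumptions made up for the example; only the 4-week sampling interval comes from the abstract.

```python
import cmath
import math


def periodogram(series):
    """Return the power at each Fourier frequency k = 1 .. n//2 (cycles per series)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant level first
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        powers.append(abs(coeff) ** 2 / n)
    return powers


def dominant_period(series, sampling_interval):
    """Return the period (in the units of sampling_interval) with the most power."""
    powers = periodogram(series)
    k = powers.index(max(powers)) + 1  # powers[0] corresponds to k = 1
    return len(series) * sampling_interval / k


# Synthetic, purely illustrative series: 64 visits, one cycle every 8 visits.
n_visits = 64
series = [500 + 50 * math.sin(2 * math.pi * t / 8) for t in range(n_visits)]

# Visits every 4 weeks, as in the abstract; the cycle length itself is invented.
period_weeks = dominant_period(series, sampling_interval=4)
print(period_weeks)
```

Real longitudinal data are noisy and often irregularly spaced, so a published analysis would typically add smoothing, detrending, and a significance assessment that this sketch omits.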
|
468
|
Javitz HS, Zbikowski SM, Deprey M, McAfee TA, McClure JB, Richards J, Catz SL, Jack LM, Swan GE. Cost-effectiveness of varenicline and three different behavioral treatment formats for smoking cessation. Transl Behav Med 2011; 1:182-190. [PMID: 21731592 PMCID: PMC3124766 DOI: 10.1007/s13142-010-0009-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 10/18/2022]
Abstract
There is a lack of evidence on the relative cost-effectiveness of proactive telephone counseling (PTC) and Web-based delivery of smoking cessation services in conjunction with pharmacotherapy. We calculated the differential cost-effectiveness of three behavioral smoking cessation modalities with varenicline treatment in a randomized trial of current smokers from a large health system. Eligible participants were randomized to one of three smoking cessation interventions: Web-based counseling (n=401), PTC (n=402), or combined PTC-Web counseling (n=399). All participants received a standard 12-week course of varenicline. The primary outcome was 7-day point-prevalence nonsmoking at the 6-month follow-up. The Web intervention was the least expensive, followed by the PTC and PTC-Web groups. Costs per additional 6-month nonsmoker and per additional lifetime quitter were $1,278 and $2,601 for Web, $1,472 and $2,995 for PTC, and $1,617 and $3,291 for PTC-Web. Cost per life-year (LY) and quality-adjusted life-year (QALY) saved were $1,148 and $1,136 for Web, $1,320 and $1,308 for PTC, and $1,450 and $1,437 for PTC-Web. Based on the cost per LY and QALY saved, these interventions are among the most cost-effective life-saving medical treatments. Web, PTC, and combined PTC-Web treatments were all highly cost-effective, with the Web treatment being marginally more cost-effective than the PTC or combined PTC-Web treatments.
Affiliation(s)
- Harold S Javitz
- Center for Health Sciences, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025 USA
- Susan M Zbikowski
- Free & Clear, Inc., 999 Third Avenue, Suite 2100, Seattle, WA 98104 USA
- Mona Deprey
- Free & Clear, Inc., 999 Third Avenue, Suite 2100, Seattle, WA 98104 USA
- Timothy A McAfee
- Centers for Disease Control and Prevention, 1600 Clifton Rd., Atlanta, GA 30333 USA
- Jennifer B McClure
- Group Health Research Institute (formerly the Group Health Center for Health Studies), 1730 Minor Avenue, Suite 1600, Seattle, WA 98101 USA
- Julie Richards
- Group Health Research Institute (formerly the Group Health Center for Health Studies), 1730 Minor Avenue, Suite 1600, Seattle, WA 98101 USA
- Sheryl L Catz
- Group Health Research Institute (formerly the Group Health Center for Health Studies), 1730 Minor Avenue, Suite 1600, Seattle, WA 98101 USA
- Lisa M Jack
- Center for Health Sciences, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025 USA
- Gary E Swan
- Center for Health Sciences, SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025 USA
|
469
|
White IR, Royston P, Wood AM. Multiple imputation using chained equations: Issues and guidance for practice. Stat Med 2011; 30:377-99. [PMID: 21225900 DOI: 10.1002/sim.4067] [Citation(s) in RCA: 5766] [Impact Index Per Article: 411.9] [Received: 09/03/2009] [Accepted: 07/14/2010] [Indexed: 12/20/2022]
Abstract
Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.
Affiliation(s)
- Ian R White
- MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K.
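For readers unfamiliar with how multiply imputed data are analysed, the standard pooling step (Rubin's rules) can be sketched in a few lines. This is an illustrative sketch, not the Stata code the paper provides; the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error of that coefficient in each data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)           # average of the per-imputation estimates
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from m = 3 imputed data sets, for illustration only.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_estimate, total_variance ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty due to the missing data; analysing a single imputed data set leaves it out and understates the standard error.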
|