1
Song Y, Hughes JP, Ye T. Adjusting for incomplete baseline covariates in randomized controlled trials: a cross-world imputation framework. Biometrics 2024; 80:ujae094. [PMID: 39271117 PMCID: PMC11398886 DOI: 10.1093/biomtc/ujae094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 06/25/2024] [Accepted: 08/22/2024] [Indexed: 09/15/2024]
Abstract
In randomized controlled trials, adjusting for baseline covariates is commonly used to improve the precision of treatment effect estimation. However, covariates often have missing values. Recently, Zhao and Ding studied two simple strategies, the single imputation method and missingness-indicator method (MIM), to handle missing covariates and showed that both methods can provide an efficiency gain compared to not adjusting for covariates. To better understand and compare these two strategies, we propose and investigate a novel theoretical imputation framework termed cross-world imputation (CWI). This framework includes both single imputation and MIM as special cases, facilitating the comparison of their efficiency. Through the lens of CWI, we show that MIM implicitly searches for the optimal CWI values and thus achieves optimal efficiency. We also derive conditions under which the single imputation method, by searching for the optimal single imputation values, can achieve the same efficiency as the MIM. We illustrate our findings through simulation studies and a real data analysis based on the Childhood Adenotonsillectomy Trial. We conclude by discussing the practical implications of our findings.
Affiliation(s)
- Yilin Song
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, United States
- James P Hughes
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, United States
- Ting Ye
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, United States
2
Yu X, Zoh RS, Fluharty DA, Mestre LM, Valdez D, Tekwe CD, Vorland CJ, Jamshidi-Naeini Y, Chiou SH, Lartey ST, Allison DB. Misstatements, misperceptions, and mistakes in controlling for covariates in observational research. eLife 2024; 13:e82268. [PMID: 38752987 PMCID: PMC11098558 DOI: 10.7554/elife.82268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 04/02/2024] [Indexed: 05/18/2024]
Abstract
We discuss 12 misperceptions, misstatements, or mistakes concerning the use of covariates in observational or nonrandomized research. Additionally, we offer advice to help investigators, editors, reviewers, and readers make more informed decisions about conducting and interpreting research where the influence of covariates may be at issue. We primarily address misperceptions in the context of statistical management of the covariates through various forms of modeling, although we also emphasize design and model or variable selection. Other approaches to addressing the effects of covariates, including matching, have logical extensions from what we discuss here but are not dwelled upon heavily. The misperceptions, misstatements, or mistakes we discuss include accurate representation of covariates, effects of measurement error, overreliance on covariate categorization, underestimation of power loss when controlling for covariates, misinterpretation of significance in statistical models, and misconceptions about confounding variables, selecting on a collider, and p value interpretations in covariate-inclusive analyses. This condensed overview serves to correct common errors and improve research quality in general and in nutrition research specifically.
Affiliation(s)
- Xiaoxin Yu
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Roger S Zoh
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- David A Fluharty
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Luis M Mestre
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Danny Valdez
  - Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Carmen D Tekwe
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Colby J Vorland
  - Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Yasaman Jamshidi-Naeini
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
- Sy Han Chiou
  - Department of Statistics and Data Science, Southern Methodist University, Dallas, United States
- Stella T Lartey
  - University of Memphis, School of Public Health, Memphis, United States
- David B Allison
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, United States
3
Zang H, Kim HJ, Huang B, Szczesniak R. Bayesian causal inference for observational studies with missingness in covariates and outcomes. Biometrics 2023; 79:3624-3636. [PMID: 37553770 PMCID: PMC10840608 DOI: 10.1111/biom.13918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 07/13/2023] [Indexed: 08/10/2023]
Abstract
Missing data are a pervasive issue in observational studies using electronic health records or patient registries. They present unique challenges for statistical inference, especially causal inference. Inappropriate handling of missing data in causal inference can bias causal estimation. Besides missing data problems, observational health data structures typically have mixed-type variables (continuous and categorical covariates) whose joint distribution is often too complex to be modeled by simple parametric models. The existence of missing values in covariates and outcomes makes causal inference even more challenging, while most standard causal inference approaches assume fully observed data or start their work only after missing values have been imputed in a separate preprocessing stage. To address these problems, we introduce a Bayesian nonparametric causal model to estimate causal effects with missing data. The proposed approach can simultaneously impute missing values, account for multiple outcomes, and estimate causal effects under the potential outcomes framework. We provide three simulation studies to show the performance of our proposed method under complicated data settings whose features are similar to our case studies. For example, Simulation Study 3 considers the case where missing values exist in both outcomes and covariates. Two case studies were conducted applying our method to evaluate the comparative effectiveness of treatments for chronic disease management in juvenile idiopathic arthritis and cystic fibrosis.
Affiliation(s)
- Huaiyu Zang
  - Heart Institute, Cincinnati Children’s Hospital Medical Center, OH, U.S.A
- Hang J. Kim
  - Division of Statistics and Data Science, University of Cincinnati, OH, U.S.A
- Bin Huang
  - Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, OH, U.S.A
  - Department of Pediatrics, University of Cincinnati, OH, U.S.A
- Rhonda Szczesniak
  - Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, OH, U.S.A
  - Department of Pediatrics, University of Cincinnati, OH, U.S.A
4
Li Y, Miao W, Shpitser I, Tchetgen Tchetgen EJ. A self-censoring model for multivariate nonignorable nonmonotone missing data. Biometrics 2023; 79:3203-3214. [PMID: 37488709 DOI: 10.1111/biom.13916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 07/10/2023] [Indexed: 07/26/2023]
Abstract
We introduce an itemwise modeling approach called "self-censoring" for multivariate nonignorable nonmonotone missing data, where the missingness process of each outcome can be affected by its own value and associated with missingness indicators of other outcomes, while conditionally independent of the other outcomes. The self-censoring model complements previous graphical approaches for the analysis of multivariate nonignorable missing data. It is identified under a completeness condition stating that any variability in one outcome can be captured by variability in the other outcomes among complete cases. For estimation, we propose a suite of semiparametric estimators including doubly robust estimators that deliver valid inferences under partial misspecification of the full-data distribution. We also provide a novel and flexible global sensitivity analysis procedure anchored at the self-censoring. We evaluate the performance of the proposed methods with simulations and apply them to analyze a study about the effect of highly active antiretroviral therapy on preterm delivery of HIV-positive mothers.
Affiliation(s)
- Yilin Li
  - Department of Probability and Statistics, Peking University, Beijing, China
- Wang Miao
  - Department of Probability and Statistics, Peking University, Beijing, China
- Ilya Shpitser
  - Department of Computer Science, Johns Hopkins University, Baltimore, USA
- Eric J Tchetgen Tchetgen
  - Department of Statistics, The Wharton School of the University of Pennsylvania, Philadelphia, Pennsylvania, USA
5
Mayer I, Josse J. Generalizing treatment effects with incomplete covariates: Identifying assumptions and multiple imputation algorithms. Biom J 2023; 65:e2100294. [PMID: 36907999 DOI: 10.1002/bimj.202100294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2021] [Revised: 01/24/2023] [Accepted: 02/13/2023] [Indexed: 03/14/2023]
Abstract
We focus on the problem of generalizing a causal effect estimated on a randomized controlled trial (RCT) to a target population described by a set of covariates from observational data. Available methods such as inverse propensity sampling weighting are not designed to handle missing values, which are however common in both data sources. In addition to coupling the assumptions for causal effect identifiability and for the missing values mechanism and to defining appropriate estimation strategies, one difficulty is to consider the specific structure of the data with two sources and treatment and outcome only available in the RCT. We propose three multiple imputation strategies to handle missing values when generalizing treatment effects, each handling the multisource structure of the problem differently (separate imputation, joint imputation with fixed effect, joint imputation ignoring source information). As an alternative to multiple imputation, we also propose a direct estimation approach that treats incomplete covariates as semidiscrete variables. The multiple imputation strategies and the latter alternative rely on different sets of assumptions concerning the impact of missing values on identifiability. We discuss these assumptions and assess the methods through an extensive simulation study. This work is motivated by the analysis of a large registry of over 20,000 major trauma patients and an RCT studying the effect of tranexamic acid administration on mortality in major trauma patients admitted to intensive care units. The analysis illustrates how the missing values handling can impact the conclusion about the effect generalized from the RCT to the target population.
Affiliation(s)
- Imke Mayer
  - Institute of Public Health, Charité - Universitätsmedizin, Berlin, Germany
  - PreMeDICaL, Inria Sophia-Antipolis, Montpellier, France
- Julie Josse
  - PreMeDICaL, Inria Sophia-Antipolis, Montpellier, France
6
Mao X, Wang Z, Yang S. Matrix completion under complex survey sampling. Ann Inst Stat Math 2023; 75:463-492. [PMID: 37645434 PMCID: PMC10465119 DOI: 10.1007/s10463-022-00851-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Revised: 08/12/2022] [Accepted: 08/17/2022] [Indexed: 01/10/2023]
Abstract
Multivariate nonresponse is often encountered in complex survey sampling, and simply ignoring it leads to erroneous inference. In this paper, we propose a new matrix completion method for complex survey sampling. Unlike existing works, which conduct either row-wise or column-wise imputation, the proposed method treats the data matrix as a whole, which allows both row and column patterns to be exploited simultaneously. A column-space-decomposition model is adopted, incorporating a low-rank structured matrix for the finite population with easy-to-obtain demographic information as covariates. We also propose a computationally efficient projection strategy to identify the model parameters under complex survey sampling. An augmented inverse probability weighting estimator is then used to estimate the parameter of interest, and the corresponding asymptotic upper bound of the estimation error is derived. Simulation studies show that the proposed estimator has a smaller mean squared error than other competitors, and the corresponding variance estimator performs well. The proposed method is applied to assess the health status of the U.S. population.
Affiliation(s)
- Xiaojun Mao
  - School of Mathematical Sciences, Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai 200240, People’s Republic of China
- Zhonglei Wang
  - Wang Yanan Institute for Studies in Economics and School of Economics, Xiamen University, Xiamen 361005, Fujian, People’s Republic of China
- Shu Yang
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
7
Orenstein L, Chetrit A, Goldman A, Novikov I, Dankner R. Polypharmacy is differentially associated with 20-year mortality among community-dwelling elderly women and men: The Israel Glucose Intolerance, Obesity and Hypertension cohort study. Mech Ageing Dev 2023; 211:111788. [PMID: 36758642 DOI: 10.1016/j.mad.2023.111788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 01/17/2023] [Accepted: 02/01/2023] [Indexed: 02/10/2023]
Abstract
BACKGROUND: Elderly individuals are characterized by multimorbidity and high medication intake, entailing risks for adverse events. We examined the overall and sex-specific association of polypharmacy (≥5 drugs concurrently) with 20-year mortality among community-dwelling older adults. METHODS: Survivors of the longitudinal Israel Study of Glucose Intolerance, Obesity, and Hypertension underwent extensive evaluation during 1999-2004 and were followed up for all-cause mortality until 2019. Cox regression was used to examine the association of polypharmacy with all-cause mortality. RESULTS: Data included 1210 participants (mean baseline age 72.9 ± 7.4 years, 53% female), of whom 50.7% died over a median follow-up of 12.8 years. Women received a higher mean number of drugs (4.3 vs 3.5; p < 0.0001), were twice as likely to take vitamins, and had higher comorbidity. Polypharmacy prevalence was 38.3%, and it was more frequent with age, female sex, European-American origin, sedentary lifestyle, and poor self-rated health. Polypharmacy was independently associated with mortality in women only (HR = 1.41, 95% CI: 1.05-1.89). An interaction with sex was found (p = 0.045). CONCLUSIONS: Polypharmacy was more prevalent in older women than in older men and was associated with increased 20-year mortality in women only. Sex-specific adaptation of guidelines for appropriate drug use among community-dwelling older adults is warranted.
Affiliation(s)
- Liat Orenstein
  - Unit for Cardiovascular Epidemiology, Gertner Institute for Epidemiology and Health Policy Research, Sheba Medical Center, Ramat-Gan 52621, Israel; Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv 6997801, Israel.
- Angela Chetrit
  - Unit for Cardiovascular Epidemiology, Gertner Institute for Epidemiology and Health Policy Research, Sheba Medical Center, Ramat-Gan 52621, Israel.
- Adam Goldman
  - Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv 6997801, Israel; Department of Internal Medicine, Sheba Medical Center, Ramat-Gan 52621, Israel.
- Ilya Novikov
  - Biostatistics and Biomathematics Unit, Gertner Institute for Epidemiology and Health Policy Research, Sheba Medical Center, Ramat-Gan 52621, Israel.
- Rachel Dankner
  - Unit for Cardiovascular Epidemiology, Gertner Institute for Epidemiology and Health Policy Research, Sheba Medical Center, Ramat-Gan 52621, Israel; Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv 6997801, Israel.
8
Cuerden MS, Diao L, Cotton CA, Cook RJ. Doubly weighted mean score estimating functions with a partially observed effect modifier. Commun Stat Theory Methods 2023. [DOI: 10.1080/03610926.2023.2166790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/21/2023]
Affiliation(s)
- Liqun Diao
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
- Cecilia A. Cotton
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
- Richard J. Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
9
Wei K, Qin G, Zhang J, Sui X. Multiply robust estimation of the average treatment effect with missing outcomes. J Stat Comput Simul 2022. [DOI: 10.1080/00949655.2022.2143501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Affiliation(s)
- Kecheng Wei
  - Department of Biostatistics, School of Public Health, and The Key Laboratory of Public Health Safety of Ministry of Education, Fudan University, Shanghai, People's Republic of China
- Guoyou Qin
  - Department of Biostatistics, School of Public Health, and The Key Laboratory of Public Health Safety of Ministry of Education, Fudan University, Shanghai, People's Republic of China
  - Shanghai Institute of Infectious Disease and Biosecurity, Shanghai, People's Republic of China
- Jiajia Zhang
  - Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, USA
- Xuemei Sui
  - Department of Exercise Science, University of South Carolina, Columbia, SC, USA
10
Zhao A, Ding P. To adjust or not to adjust? Estimating the average treatment effect in randomized experiments with missing covariates. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2123814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Anqi Zhao
  - Department of Statistics and Data Science, National University of Singapore
- Peng Ding
  - Department of Statistics, University of California, Berkeley
11
Dai M, Shen W, Stern HS. Nonparametric tests for treatment effect heterogeneity in observational studies. Can J Stat 2022. [DOI: 10.1002/cjs.11728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Maozhu Dai
  - Department of Statistics, University of California, Irvine, California, USA
- Weining Shen
  - Department of Statistics, University of California, Irvine, California, USA
- Hal S. Stern
  - Department of Statistics, University of California, Irvine, California, USA
12
Bagmar MSH, Shen H. Causal inference with missingness in confounder. J Stat Comput Simul 2022. [DOI: 10.1080/00949655.2022.2089672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Md. Shaddam Hossain Bagmar
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
  - Institute of Statistical Research and Training (ISRT), University of Dhaka, Dhaka, Bangladesh
- Hua Shen
  - Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
13
Reich BJ, Yang S, Guan Y, Giffin AB, Miller MJ, Rappold A. A Review of Spatial Causal Inference Methods for Environmental and Epidemiological Applications. Int Stat Rev 2021; 89:605-634. [PMID: 37197445 PMCID: PMC10187770 DOI: 10.1111/insr.12452] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/07/2020] [Accepted: 04/30/2021] [Indexed: 11/30/2022]
Abstract
The scientific rigor and computational methods of causal inference have had great impacts on many disciplines but have only recently begun to take hold in spatial applications. Spatial causal inference poses analytic challenges due to complex correlation structures and interference between the treatment at one location and the outcomes at others. In this paper, we review the current literature on spatial causal inference and identify areas of future work. We first discuss methods that exploit spatial structure to account for unmeasured confounding variables. We then discuss causal analysis in the presence of spatial interference including several common assumptions used to reduce the complexity of the interference patterns under consideration. These methods are extended to the spatiotemporal case where we compare and contrast the potential outcomes framework with Granger causality and to geostatistical analyses involving spatial random fields of treatments and responses. The methods are introduced in the context of observational environmental and epidemiological studies and are compared using both a simulation study and analysis of the effect of ambient air pollution on COVID-19 mortality rate. Code to implement many of the methods using the popular Bayesian software OpenBUGS is provided.
Affiliation(s)
- Brian J Reich
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Shu Yang
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Yawen Guan
  - Department of Statistics, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Andrew B Giffin
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Matthew J Miller
  - Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Ana Rappold
  - US Environmental Protection Agency, Research Triangle Park, NC 27709, USA
14
Diao L, Cook RJ. Nested doubly robust estimating equations for causal analysis with an incomplete effect modifier. Can J Stat 2021. [DOI: 10.1002/cjs.11650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Liqun Diao
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Richard J. Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
15
Mayer I, Sverdrup E, Gauss T, Moyer JD, Wager S, Josse J. Doubly robust treatment effect estimation with missing attributes. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1356] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]