1
Jiang C, Thompson M, Wallace M. Estimating dynamic treatment regimes for ordinal outcomes with household interference: Application in household smoking cessation. Stat Methods Med Res 2024:9622802241242313. [PMID: 38623615 DOI: 10.1177/09622802241242313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
The focus of precision medicine is on decision support, often in the form of dynamic treatment regimes, which are sequences of decision rules. At each decision point, the decision rules determine the next treatment according to the patient's baseline characteristics, the information on treatments and responses accrued by that point, and the patient's current health status, including symptom severity and other measures. However, dynamic treatment regime estimation with ordinal outcomes is rarely studied, and rarer still in the context of interference, where one patient's treatment may affect another's outcome. In this paper, we introduce the weighted proportional odds model: a regression-based, approximately doubly robust approach to single-stage dynamic treatment regime estimation for ordinal outcomes. This method also accounts for the possibility of interference between individuals sharing a household through the use of covariate balancing weights derived from joint propensity scores. Examining different types of balancing weights, we verify the approximate double robustness of the weighted proportional odds model with our adjusted weights via simulation studies. We further extend the weighted proportional odds model to multi-stage dynamic treatment regime estimation with household interference, namely the dynamic weighted proportional odds model. Lastly, we demonstrate our proposed methodology in the analysis of longitudinal survey data from the Population Assessment of Tobacco and Health study, which motivates this work. Furthermore, considering interference, we provide optimal treatment strategies for households to achieve smoking cessation of the pair in the household.
Affiliation(s)
- Cong Jiang
  - Faculty of Pharmacy, Université de Montréal, Montreal, Canada
- Mary Thompson
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
- Michael Wallace
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
2
Bian Z, Moodie EEM, Shortreed SM, Lambert SD, Bhatnagar S. Variable selection for individualised treatment rules with discrete outcomes. J R Stat Soc Ser C Appl Stat 2024; 73:298-313. [PMID: 38487498 PMCID: PMC10930223 DOI: 10.1093/jrsssc/qlad096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 07/31/2023] [Accepted: 09/29/2023] [Indexed: 03/17/2024]
Abstract
An individualised treatment rule (ITR) is a decision rule that aims to improve individuals' health outcomes by recommending treatments according to subject-specific information. In observational studies, collected data may contain many variables that are irrelevant to treatment decisions. Including all variables in an ITR could yield low efficiency and a complicated treatment rule that is difficult to implement. Thus, selecting variables to improve the treatment rule is crucial. We propose a doubly robust variable selection method for ITRs, and show that it compares favourably with competing approaches. We illustrate the proposed method on data from an adaptive, web-based stress management tool.
Affiliation(s)
- Zeyu Bian
  - Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec H3A 0G4, Canada
  - Miami Herbert Business School, University of Miami, Miami, FL 33146, USA
- Erica E M Moodie
  - Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec H3A 0G4, Canada
- Susan M Shortreed
  - Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
  - Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Sylvie D Lambert
  - Ingram School of Nursing, McGill University, Montreal, Quebec, Canada
  - St. Mary’s Research Centre, Montreal, Quebec, Canada
- Sahir Bhatnagar
  - Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec H3A 0G4, Canada
3
Manschot C, Laber E, Davidian M. Interim monitoring of sequential multiple assignment randomized trials using partial information. Biometrics 2023; 79:2881-2894. [PMID: 36896962 DOI: 10.1111/biom.13854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 03/02/2023] [Indexed: 03/11/2023]
Abstract
The sequential multiple assignment randomized trial (SMART) is the gold standard trial design to generate data for the evaluation of multistage treatment regimes. As with conventional (single-stage) randomized clinical trials, interim monitoring allows early stopping; however, there are few methods for principled interim analysis in SMARTs. Because SMARTs involve multiple stages of treatment, a key challenge is that not all enrolled participants will have progressed through all treatment stages at the time of an interim analysis. Wu et al. (2021) propose basing interim analyses on an estimator for the mean outcome under a given regime that uses data only from participants who have completed all treatment stages. We propose an estimator for the mean outcome under a given regime that gains efficiency by using partial information from enrolled participants regardless of their progression through treatment stages. Using the asymptotic distribution of this estimator, we derive associated Pocock and O'Brien-Fleming testing procedures for early stopping. In simulation experiments, the estimator controls type I error and achieves nominal power while reducing expected sample size relative to the method of Wu et al. (2021). We present an illustrative application of the proposed estimator based on a recent SMART evaluating behavioral pain interventions for breast cancer patients.
Affiliation(s)
- Cole Manschot
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Eric Laber
  - Department of Statistical Science and Department of Biostatistics & Bioinformatics, Duke University, Durham, North Carolina, USA
- Marie Davidian
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
4
Yu Y, Zhang M, Mukherjee B. An inverse probability weighted regression method that accounts for right-censoring for causal inference with multiple treatments and a binary outcome. Stat Med 2023; 42:3699-3715. [PMID: 37392070 DOI: 10.1002/sim.9826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 05/29/2023] [Accepted: 06/01/2023] [Indexed: 07/02/2023]
Abstract
Comparative effectiveness research often involves evaluating the differences in the risks of an event of interest between two or more treatments using observational data. Often, the post-treatment outcome of interest is whether the event happens within a pre-specified time window, which leads to a binary outcome. One source of bias for estimating the causal treatment effect is the presence of confounders, which are usually controlled using propensity score-based methods. An additional source of bias is right-censoring, which occurs when the information on the outcome of interest is not completely available due to dropout, study termination, or treatment switch before the event of interest. We propose an inverse probability weighted regression-based estimator that can simultaneously handle both confounding and right-censoring, calling the method CIPWR, with the letter C highlighting the censoring component. CIPWR estimates the average treatment effects by averaging the predicted outcomes obtained from a logistic regression model that is fitted using a weighted score function. The CIPWR estimator has a double robustness property such that estimation consistency can be achieved when either the model for the outcome or the models for both treatment and censoring are correctly specified. We establish the asymptotic properties of the CIPWR estimator for conducting inference, and compare its finite sample performance with that of several alternatives through simulation studies. The methods under comparison are applied to a cohort of prostate cancer patients from an insurance claims database for comparing the adverse effects of four candidate drugs for advanced stage prostate cancer.
Affiliation(s)
- Youfei Yu
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Min Zhang
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Bhramar Mukherjee
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
5
Jin P, Lu W, Chen Y, Liu M. Change-plane analysis for subgroup detection with a continuous treatment. Biometrics 2023; 79:1920-1933. [PMID: 36134534 PMCID: PMC10030385 DOI: 10.1111/biom.13762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Accepted: 09/14/2022] [Indexed: 11/30/2022]
Abstract
Detecting and characterizing subgroups with differential effects of a binary treatment has been widely studied and led to improvements in patient outcomes and population risk management. Under the setting of a continuous treatment, however, such investigations remain scarce. We propose a semiparametric change-plane model and consequently a doubly robust test statistic for assessing the existence of two subgroups with differential treatment effects under a continuous treatment. The proposed testing procedure is valid when either the baseline function for the covariate effects or the generalized propensity score function for the continuous treatment is correctly specified. The asymptotic distributions of the test statistic under the null and local alternative hypotheses are established. When the null hypothesis of no subgroup is rejected, the change-plane parameters that define the subgroups can be estimated. This paper provides a unified framework of the change-plane method to handle various types of outcomes, including the exponential family of distributions and time-to-event outcomes. Additional extensions with nonparametric estimation approaches are also provided. We evaluate the performance of our proposed methods through extensive simulation studies under various scenarios. An application to the Health Effects of Arsenic Longitudinal Study with a continuous environmental exposure of arsenic is presented.
Affiliation(s)
- Peng Jin
  - Division of Biostatistics, Department of Population Health, NYU Grossman School of Medicine, New York, New York 10016, U.S.A
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695, U.S.A
- Yu Chen
  - Division of Epidemiology, Department of Population Health, NYU Grossman School of Medicine, New York, New York 10016, U.S.A
  - Department of Environmental Medicine, NYU Grossman School of Medicine, New York, New York 10016, U.S.A
- Mengling Liu
  - Division of Biostatistics, Department of Population Health, NYU Grossman School of Medicine, New York, New York 10016, U.S.A
  - Department of Environmental Medicine, NYU Grossman School of Medicine, New York, New York 10016, U.S.A
6
Dahabreh IJ, Robins JM, Haneuse SJP, Saeed I, Robertson SE, Stuart EA, Hernán MA. Sensitivity analysis using bias functions for studies extending inferences from a randomized trial to a target population. Stat Med 2023; 42:2029-2043. [PMID: 36847107 PMCID: PMC10219839 DOI: 10.1002/sim.9550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Revised: 05/20/2022] [Accepted: 07/21/2022] [Indexed: 03/01/2023]
Abstract
Extending (i.e., generalizing or transporting) causal inferences from a randomized trial to a target population requires assumptions that randomized and nonrandomized individuals are exchangeable conditional on baseline covariates. These assumptions are made on the basis of background knowledge, which is often uncertain or controversial, and need to be subjected to sensitivity analysis. We present simple methods for sensitivity analyses that directly parameterize violations of the assumptions using bias functions and do not require detailed background knowledge about specific unknown or unmeasured determinants of the outcome or modifiers of the treatment effect. We show how the methods can be applied to non-nested trial designs, where the trial data are combined with a separately obtained sample of nonrandomized individuals, as well as to nested trial designs, where the trial is embedded within a cohort sampled from the target population.
Affiliation(s)
- Issa J. Dahabreh
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- James M. Robins
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Iman Saeed
  - Center for Evidence Synthesis in Health, Department of Health Services, Policy & Practice, Brown University School of Public Health, Providence, RI
- Sarah E. Robertson
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA
- Elizabeth A. Stuart
  - Departments of Mental Health, Biostatistics, and Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
- Miguel A. Hernán
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
  - Harvard-MIT Division of Health Sciences and Technology, Boston, MA
7
Bian Z, Moodie EEM, Shortreed SM, Bhatnagar S. Variable selection in regression-based estimation of dynamic treatment regimes. Biometrics 2023; 79:988-999. [PMID: 34837380 DOI: 10.1111/biom.13608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2021] [Accepted: 11/18/2021] [Indexed: 11/27/2022]
Abstract
Dynamic treatment regimes (DTRs) consist of a sequence of decision rules, one per stage of intervention, that aim to recommend effective treatments for individual patients according to patient information history. DTRs can be estimated from models which include interactions between treatment and a (typically small) number of covariates which are often chosen a priori. However, with increasingly large and complex data being collected, it can be difficult to know which prognostic factors might be relevant in the treatment rule. Therefore, a more data-driven approach to select these covariates might improve the estimated decision rules and simplify models to make them easier to interpret. We propose a variable selection method for DTR estimation using penalized dynamic weighted least squares. Our method has the strong heredity property, that is, an interaction term can be included in the model only if the corresponding main terms have also been selected. We show our method has both the double robustness property and the oracle property theoretically; and the newly proposed method compares favorably with other variable selection approaches in numerical studies. We further illustrate the proposed method on data from the Sequenced Treatment Alternatives to Relieve Depression study.
Affiliation(s)
- Zeyu Bian
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
- Erica E M Moodie
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
- Susan M Shortreed
  - Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
  - Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Sahir Bhatnagar
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
  - Department of Diagnostic Radiology, McGill University, Montreal, Quebec, Canada
8
Abstract
Propensity score matching has a long-standing tradition for handling confounding in causal inference, but it requires stringent model assumptions. In this article, we propose novel double score matching (DSM) utilizing both the propensity score and prognostic score. To guard against possible model misspecification, we posit multiple candidate models for each score. We show that the de-biasing DSM estimator achieves the multiple robustness property in that it is consistent if any one of the score models is correctly specified. We characterize the asymptotic distribution for the DSM estimator requiring only one correct model specification based on the martingale representations of the matching estimators and theory for local Normal experiments. We also provide a two-stage replication method for variance estimation and extend DSM for quantile estimation. Simulation demonstrates that DSM outperforms single score matching and prevailing multiply robust weighting estimators in the presence of extreme propensity scores.
Affiliation(s)
- Shu Yang
  - Department of Statistics, North Carolina State University
- Yunshu Zhang
  - Department of Statistics, North Carolina State University
9
Li X, Miao W, Lu F, Zhou XH. Improving efficiency of inference in clinical trials with external control data. Biometrics 2023; 79:394-403. [PMID: 34694626 DOI: 10.1111/biom.13583] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/25/2020] [Revised: 07/29/2021] [Accepted: 09/30/2021] [Indexed: 01/13/2023]
Abstract
Suppose we are interested in the effect of a treatment in a clinical trial. The efficiency of inference may be limited due to small sample size. However, external control data are often available from historical studies. Motivated by an application to Helicobacter pylori infection, we show how to borrow strength from such data to improve efficiency of inference in the clinical trial. Under an exchangeability assumption about the potential outcome mean, we show that the semiparametric efficiency bound for estimating the average treatment effect can be reduced by incorporating both the clinical trial data and external controls. We then derive a doubly robust and locally efficient estimator. The improvement in efficiency is prominent especially when the external control data set has a large sample size and small variability. Our method allows for a relaxed overlap assumption, and we illustrate with the case where the clinical trial only contains a treated group. We also develop doubly robust and locally efficient approaches that extrapolate the causal effect in the clinical trial to the external population and the overall population. Our results also offer a meaningful implication for trial design and data collection. We evaluate the finite-sample performance of the proposed estimators via simulation. In the Helicobacter pylori infection application, our approach shows that the combination treatment has potential efficacy advantages over the triple therapy.
Affiliation(s)
- Xinyu Li
  - School of Mathematical Sciences & Center for Statistical Science, Peking University, Beijing, China
- Wang Miao
  - School of Mathematical Sciences & Center for Statistical Science, Peking University, Beijing, China
- Fang Lu
  - Xiyuan Hospital, China Academy of Chinese Medical Sciences, Beijing, China
- Xiao-Hua Zhou
  - Department of Biostatistics & Beijing International Center for Mathematical Research, Peking University, Beijing, China
10
Wen L, Marcus JL, Young JG. Intervention treatment distributions that depend on the observed treatment process and model double robustness in causal survival analysis. Stat Methods Med Res 2023; 32:509-523. [PMID: 36597699 PMCID: PMC9983057 DOI: 10.1177/09622802221146311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
The generalized g-formula can be used to estimate the probability of survival under a sustained treatment strategy. When treatment strategies are deterministic, estimators derived from the so-called efficient influence function (EIF) for the g-formula will be doubly robust to model misspecification. In recent years, several practical applications have motivated estimation of the g-formula under non-deterministic treatment strategies where treatment assignment at each time point depends on the observed treatment process. In this case, EIF-based estimators may or may not be doubly robust. In this paper, we provide sufficient conditions to ensure the existence of doubly robust estimators for intervention treatment distributions that depend on the observed treatment process for point treatment interventions and give a class of intervention treatment distributions dependent on the observed treatment process that guarantee model doubly and multiply robust estimators in longitudinal settings. Motivated by an application to pre-exposure prophylaxis (PrEP) initiation studies, we propose a new treatment intervention dependent on the observed treatment process. We show there exist (1) estimators that are doubly and multiply robust to model misspecification and (2) estimators that when used with machine learning algorithms can attain fast convergence rates for our proposed intervention. Finally, we explore the finite sample performance of our estimators via simulation studies.
Affiliation(s)
- Lan Wen
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Julia L Marcus
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
- Jessica G Young
  - Department of Population Medicine, Harvard Medical School, Boston, MA, USA
11
Talbot D, Moodie EEM, Diorio C. Double robust estimation of optimal partially adaptive treatment strategies: An application to breast cancer treatment using hormonal therapy. Stat Med 2023; 42:178-192. [PMID: 36408723 DOI: 10.1002/sim.9608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 09/17/2022] [Accepted: 11/05/2022] [Indexed: 11/22/2022]
Abstract
Precision medicine aims to tailor treatment decisions according to patients' characteristics. G-estimation and dynamic weighted ordinary least squares are double robust methods to identify optimal adaptive treatment strategies. It is underappreciated that they require modeling all existing treatment-confounder interactions to be consistent. Identifying optimal partially adaptive treatment strategies that tailor treatments according to only a few covariates, ignoring some interactions, may be preferable in practice. Building on G-estimation and dWOLS, we propose estimators of such partially adaptive strategies and demonstrate their double robustness. We investigate these estimators in a simulation study. Using data maintained by the Centre des Maladies du Sein, we estimate a partially adaptive treatment strategy for tailoring hormonal therapy use in breast cancer patients. R software implementing our estimators is provided.
Affiliation(s)
- Denis Talbot
  - Département de médecine sociale et préventive, Université Laval, Québec, Canada
  - Axe santé des Populations et Pratiques Optimales en Santé, Centre de Recherche du CHU de Québec - Université Laval, Québec, Canada
- Erica E M Moodie
  - Department of Epidemiology, Biostatistics & Occupational Health, McGill University, Québec, Canada
- Caroline Diorio
  - Département de médecine sociale et préventive, Université Laval, Québec, Canada
  - Axe oncologie, Centre de recherche du CHU de Québec - Université Laval, Québec, Canada
12
Chen S, Hoch JS. Net-benefit regression with censored cost-effectiveness data from randomized or observational studies. Stat Med 2022; 41:3958-3974. [PMID: 35665527 PMCID: PMC9427707 DOI: 10.1002/sim.9486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Revised: 03/25/2022] [Accepted: 05/18/2022] [Indexed: 11/10/2022]
Abstract
Cost-effectiveness analysis is an essential part of the evaluation of new medical interventions. While in many studies both costs and effectiveness (eg, survival time) are censored, standard survival analysis techniques are often invalid due to the induced dependent censoring problem. We propose methods for censored cost-effectiveness data using the net-benefit regression framework, which allow covariate-adjustment and subgroup identification when comparing two intervention groups. The methods provide a straightforward way to construct cost-effectiveness acceptability curves with censored data. We also propose a more efficient doubly robust estimator of average causal incremental net benefit, which increases the likelihood that the results will represent a valid inference in observational studies. Lastly, we conduct extensive numerical studies to examine the finite-sample performance of the proposed methods, and illustrate the proposed methods with a real data example using both survival time and quality-adjusted survival time as the measures of effectiveness.
Affiliation(s)
- Shuai Chen
  - Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, Davis, California, USA
- Jeffrey S. Hoch
  - Division of Health Policy and Management, Department of Public Health Sciences, University of California, Davis, Sacramento, California, USA
  - Center for Healthcare Policy and Research, University of California, Davis, Sacramento, California, USA
13
Han L, Wang X, Cai T. Identifying surrogate markers in real-world comparative effectiveness research. Stat Med 2022; 41:5290-5304. [PMID: 36062392 DOI: 10.1002/sim.9569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/14/2022] [Revised: 08/15/2022] [Accepted: 08/18/2022] [Indexed: 11/09/2022]
Abstract
In comparative effectiveness research (CER), leveraging short-term surrogates to infer treatment effects on long-term outcomes can guide policymakers evaluating new treatments. Numerous statistical procedures for identifying surrogates have been proposed for randomized clinical trials (RCTs), but no methods currently exist to evaluate the proportion of treatment effect (PTE) explained by surrogates in real-world data (RWD), which have become increasingly common. To address this knowledge gap, we propose inverse probability weighted (IPW) and doubly robust (DR) estimators of an optimal transformation of the surrogate and the corresponding PTE measure. We demonstrate that the proposed estimators are consistent and asymptotically normal, and the DR estimator is consistent when either the propensity score model or outcome regression model is correctly specified. Our proposed estimators are evaluated through extensive simulation studies. In two RWD settings, we show that our method can identify and validate surrogate markers for inflammatory bowel disease (IBD).
Affiliation(s)
- Larry Han
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Xuan Wang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Tianxi Cai
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
14
Li F, Buchanan AL, Cole SR. Generalizing trial evidence to target populations in non-nested designs: Applications to AIDS clinical trials. J R Stat Soc Ser C Appl Stat 2022; 71:669-697. [PMID: 35968541 PMCID: PMC9367209 DOI: 10.1111/rssc.12550] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/15/2023]
Abstract
Comparative effectiveness evidence from randomized trials may not be directly generalizable to a target population of substantive interest when, as in most cases, trial participants are not randomly sampled from the target population. Motivated by the need to generalize evidence from two trials conducted in the AIDS Clinical Trials Group (ACTG), we consider weighting, regression and doubly robust estimators to estimate the causal effects of HIV interventions in a specified population of people living with HIV in the USA. We focus on a non-nested trial design and discuss strategies for both point and variance estimation of the target population average treatment effect. Specifically in the generalizability context, we demonstrate both analytically and empirically that estimating the known propensity score in trials does not increase the variance for each of the weighting, regression and doubly robust estimators. We apply these methods to generalize the average treatment effects from two ACTG trials to specified target populations and operationalize key practical considerations. Finally, we report on a simulation study that investigates the finite-sample operating characteristics of the generalizability estimators and their sandwich variance estimators.
Affiliation(s)
- Fan Li
- Department of Biostatistics, Yale University School of Public Health, New Haven, Connecticut, USA
- Center for Methods in Implementation and Prevention Science, Yale University, New Haven, Connecticut, USA
| | - Ashley L. Buchanan
- Department of Pharmacy Practice, College of Pharmacy, University of Rhode Island, Kingston, Rhode Island, USA
| | - Stephen R. Cole
- Department of Epidemiology, Gillings School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| |
|
15
|
Zhang Y, Yang S, Ye W, Faries DE, Lipkovich I, Kadziola Z. Practical recommendations on double score matching for estimating causal effects. Stat Med 2021; 41:1421-1445. [PMID: 34957585 DOI: 10.1002/sim.9289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2021] [Revised: 11/26/2021] [Accepted: 12/01/2021] [Indexed: 11/09/2022]
Abstract
Unlike in randomized clinical trials (RCTs), confounding control is critical for estimating causal effects from observational studies because treatment is not randomized. Under the unconfoundedness assumption, matching methods are popular because they can be used to emulate an RCT that is hidden in the observational study. To ensure that this key assumption holds, an effort is often made to collect a large number of possible confounders, rendering dimension reduction imperative in matching. Three matching schemes based on the propensity score (PSM), prognostic score (PGM), and double score (DSM, ie, the collection of the first two scores) have been proposed in the literature. However, a comprehensive comparison of the three matching schemes is lacking, and best practices, including variable selection, choice of caliper, and replacement, have not been established. In this article, we explore the statistical and numerical properties of PSM, PGM, and DSM via extensive simulations. Our study supports that DSM performs as well as, if not better than, the two single-score matching schemes in terms of bias and variance. In particular, DSM is doubly robust in the sense that the matching estimator is consistent provided that either the propensity score model or the prognostic score model is correctly specified. Variable selection on the propensity score model and matching with replacement are suggested for DSM, and we illustrate the recommendations with comprehensive simulation studies. An R package is available at https://github.com/Yunshu7/dsmatch.
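To make the idea of matching on a double score concrete, the following Python sketch matches each treated unit, with replacement, to its nearest control in the two-dimensional space of (propensity score, prognostic score). It is only an illustration under stated assumptions, not the paper's procedure or the `dsmatch` package: the function name `double_score_match_att`, the choice of an ATT-style contrast, 1-nearest-neighbor matching, and Euclidean distance on standardized scores are all hypothetical choices, and both scores are assumed to have been estimated beforehand.

```python
import numpy as np

def double_score_match_att(y, a, ps, pg):
    """1-nearest-neighbor matching with replacement on (propensity, prognostic) scores.

    y  : observed outcomes
    a  : binary treatment indicator (1 = treated, 0 = control)
    ps : estimated propensity scores (assumed to be fitted elsewhere)
    pg : estimated prognostic scores (assumed to be fitted elsewhere)
    Returns the mean treated-minus-matched-control difference (an ATT-style estimate).
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=int)
    scores = np.column_stack([ps, pg]).astype(float)

    # Standardize each score so neither dominates the distance (illustrative choice).
    scale = scores.std(axis=0)
    scale[scale == 0] = 1.0
    scores = (scores - scores.mean(axis=0)) / scale

    treated = np.flatnonzero(a == 1)
    controls = np.flatnonzero(a == 0)
    if treated.size == 0 or controls.size == 0:
        raise ValueError("need at least one treated and one control unit")

    # Squared Euclidean distances, shape (n_treated, n_controls).
    diff = scores[treated][:, None, :] - scores[controls][None, :, :]
    dist = (diff ** 2).sum(axis=2)

    # With replacement: the same control may serve as the match for several treated units.
    nearest = controls[dist.argmin(axis=1)]
    return float(np.mean(y[treated] - y[nearest]))

if __name__ == "__main__":
    est = double_score_match_att(
        y=[5.0, 1.0, 3.0, 2.0],
        a=[1, 0, 1, 0],
        ps=[0.8, 0.7, 0.3, 0.2],
        pg=[1.0, 0.9, 0.2, 0.1],
    )
    print(est)
```

In the toy call, each treated unit has one control that is clearly closest on both scores, so the estimate is simply the average of the two within-pair differences. A caliper, which the abstract mentions as a practical choice, is deliberately left out of this sketch.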
Affiliation(s)
- Yunshu Zhang
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
| | - Shu Yang
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
| | - Wenyu Ye
- Eli Lilly and Company, Indianapolis, Indiana, USA
| | | | | | | |
|
16
|
Abstract
The mean residual life (MRL) function defines the remaining life expectancy of a subject who has survived to a time point and is an important alternative to the hazard function for characterizing the distribution of a time-to-event variable. Existing MRL models primarily focus on studying the association between risk factors and disease risks using linear model specifications on a multiplicative or additive scale. When risk factors have complex correlation structures, nonlinear effects, or interactions, the prefixed linearity assumption may be insufficient to capture the relationship. The single-index modeling framework offers flexibility in reducing dimensionality and modeling nonlinear effects. In this article, we propose a class of partially linear single-index generalized MRL models, the regression component of which consists of both a semiparametric single-index part and a linear regression part. A regression spline technique is employed to approximate the nonparametric single-index function, and parameters are estimated using an iterative algorithm. Double-robust estimators are also proposed to protect against the misspecification of the censoring distribution or the MRL models. A further contribution of this article is a nonparametric test proposed to formally evaluate the linearity of the single-index function. Asymptotic properties of the estimators are established, and the finite-sample performance is evaluated through extensive numerical simulations. The proposed models and inference approaches are demonstrated on a New York University Langone Health (NYULH) COVID-19 dataset.
Affiliation(s)
- Peng Jin
- Division of Biostatistics, Department of Population Health, New York University Grossman School of Medicine, New York, NY 10016, U.S.A
| | - Mengling Liu
- Division of Biostatistics, Department of Population Health, New York University Grossman School of Medicine, New York, NY 10016, U.S.A
- Department of Environmental Medicine, New York University Grossman School of Medicine, New York, NY 10016, U.S.A
| |
|
17
|
Liu Y, Schnitzer ME, Wang G, Kennedy E, Viiklepp P, Vargas MH, Sotgiu G, Menzies D, Benedetti A. Modeling treatment effect modification in multidrug-resistant tuberculosis in an individual patient data meta-analysis. Stat Methods Med Res 2021; 31:689-705. [PMID: 34903098 PMCID: PMC8961254 DOI: 10.1177/09622802211046383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Effect modification occurs when the effect of the treatment is not homogeneous across the different strata of patient characteristics. When the effect of treatment may vary from individual to individual, precision medicine can be improved by identifying patient covariates to estimate the size and direction of the effect at the individual level. However, this task is statistically challenging and typically requires large amounts of data. Investigators may be interested in using the individual patient data from multiple studies to estimate these treatment effect models. Our data arise from a systematic review of observational studies contrasting different treatments for multidrug-resistant tuberculosis, where multiple antimicrobial agents are taken concurrently to cure the infection. We propose a marginal structural model for effect modification by different patient characteristics and co-medications in a meta-analysis of observational individual patient data. We develop, evaluate, and apply a targeted maximum likelihood estimator for the doubly robust estimation of the parameters of the proposed marginal structural model in this context. In particular, we allow for differential availability of treatments across studies, measured confounding within and across studies, and random effects by study.
Affiliation(s)
- Yan Liu
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Canada
| | - Mireille E Schnitzer
- Faculty of Pharmacy, Université de Montréal, Canada.,Department of Social and Preventive Medicine, Université de Montréal, Canada
| | - Guanbo Wang
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Canada
| | - Edward Kennedy
- Department of Statistics & Data Science, Carnegie Mellon University, USA
| | | | - Mario H Vargas
- Instituto Nacional de Enfermedades Respiratorias, Mexico
| | - Giovanni Sotgiu
- Clinical Epidemiology and Medical Statistics Unit, Department of Medical, Surgical and Experimental Sciences, University of Sassari, Italy
| | - Dick Menzies
- Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation, Research Institute of the McGill University Health Centre, Montréal, Canada.,Montréal Chest Institute & McGill International TB Centre, Research Institute of the McGill University Health Centre, Montréal, Canada
| | - Andrea Benedetti
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Canada.,Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation, Research Institute of the McGill University Health Centre, Montréal, Canada.,Department of Medicine, McGill University, Canada
| |
|
18
|
Lee D, Yang S, Dong L, Wang X, Zeng D, Cai J. Improving trial generalizability using observational studies. Biometrics 2021. [PMID: 34862966 PMCID: PMC9166225 DOI: 10.1111/biom.13609] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/19/2020] [Revised: 11/06/2021] [Accepted: 11/22/2021] [Indexed: 11/29/2022]
Abstract
Complementary features of randomized controlled trials (RCTs) and observational studies (OSs) can be used jointly to estimate the average treatment effect of a target population. We propose a calibration weighting estimator that enforces the covariate balance between the RCT and OS, therefore improving the trial-based estimator's generalizability. Exploiting semiparametric efficiency theory, we propose a doubly robust augmented calibration weighting estimator that achieves the efficiency bound derived under the identification assumptions. A nonparametric sieve method is provided as an alternative to the parametric approach, which enables the robust approximation of the nuisance functions and data-adaptive selection of outcome predictors for calibration. We establish asymptotic results and confirm the finite sample performances of the proposed estimators by simulation experiments and an application on the estimation of the treatment effect of adjuvant chemotherapy for early-stage non-small cell lung cancer patients after surgery.
Affiliation(s)
- Dasom Lee
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
| | - Shu Yang
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
| | - Lin Dong
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
| | - Xiaofei Wang
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
| | - Donglin Zeng
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| |
|
19
|
Lin R, Chan KG, Shi H. A unified Bayesian framework for exact inference of area under the receiver operating characteristic curve. Stat Methods Med Res 2021; 30:2269-2287. [PMID: 34468238 DOI: 10.1177/09622802211037070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The area under the receiver operating characteristic curve is a widely used measure for evaluating the performance of a diagnostic test. Common approaches for inference on area under the receiver operating characteristic curve are usually based upon approximation. For example, the normal approximation based inference tends to suffer from the problem of low accuracy for small sample sizes. Frequentist empirical likelihood based approaches for area under the receiver operating characteristic curve estimation may perform better, but are usually conducted through approximation in order to reduce the computational burden, so the inference is not exact. By contrast, we propose an exact inferential procedure by adapting the empirical likelihood into a Bayesian framework and draw inference from the posterior samples of the area under the receiver operating characteristic curve obtained via a Gibbs sampler. The full conditional distributions within the Gibbs sampler only involve empirical likelihoods with linear constraints, which greatly simplifies the computation. To further enhance the applicability and flexibility of the Bayesian empirical likelihood, we extend our method to the estimation of partial area under the receiver operating characteristic curve, comparison of multiple tests, and the doubly robust estimation of area under the receiver operating characteristic curve in the presence of missing test results. Simulation studies confirm the desirable performance of the proposed methods, and a real application is presented to illustrate their usefulness.
Affiliation(s)
- Ruitao Lin
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, USA
| | - Kc Gary Chan
- Department of Biostatistics, University of Washington, USA
| | - Haolun Shi
- Department of Statistics and Actuarial Science, Simon Fraser University, Canada
| |
|
20
|
Liu M, Zhang YI, Zhou D. Double/debiased machine learning for logistic partially linear model. Econom J 2021; 24:559-588. [PMID: 38223304 PMCID: PMC10786638 DOI: 10.1093/ectj/utab019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
We propose double/debiased machine learning approaches to infer a parametric component of a logistic partially linear model. Our framework is based on a Neyman orthogonal score equation consisting of two nuisance models for the nonparametric component of the logistic model and conditional mean of the exposure with the control group. To estimate the nuisance models, we separately consider the use of high dimensional (HD) sparse regression and (nonparametric) machine learning (ML) methods. In the HD case, we derive certain moment equations to calibrate the first order bias of the nuisance models, which preserves the model double robustness property. In the ML case, we handle the nonlinearity of the logit link through a novel and easy-to-implement 'full model refitting' procedure. We evaluate our methods through simulation and apply them in assessing the effect of the emergency contraceptive pill on early gestation and new births based on a 2008 policy reform in Chile.
Affiliation(s)
- Molei Liu
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
| | - Y I Zhang
- Department of Statistics, Harvard University, One Oxford Street, Cambridge, MA 02138-2901, USA
| | - Doudou Zhou
- Department of Statistics, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
| |
|
21
|
Balzer LB, Petersen ML. Invited Commentary: Machine Learning in Causal Inference-How Do I Love Thee? Let Me Count the Ways. Am J Epidemiol 2021; 190:1483-1487. [PMID: 33751059 DOI: 10.1093/aje/kwab048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Revised: 02/02/2021] [Accepted: 02/04/2021] [Indexed: 12/24/2022]
Abstract
In this issue of the Journal, Mooney et al. (Am J Epidemiol. 2021;190(8):1476-1482) discuss machine learning as a tool for causal research in the style of Internet headlines. Here we comment by adapting famous literary quotations, including the one in our title (from "Sonnet 43" by Elizabeth Barrett Browning (Sonnets From the Portuguese, Adelaide Hanscom Leeson, 1850)). We emphasize that any use of machine learning to answer causal questions must be founded on a formal framework for both causal and statistical inference. We illustrate the pitfalls that can occur without such a foundation. We conclude with some practical recommendations for integrating machine learning into causal analyses in a principled way and highlight important areas of ongoing work.
|
22
|
Van Lancker K, Vandebosch A, Vansteelandt S. Efficient, doubly robust estimation of the effect of dose switching for switchers in a randomized clinical trial. Biom J 2021; 63:1464-1475. [PMID: 34247409 DOI: 10.1002/bimj.202000269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2020] [Revised: 03/09/2021] [Accepted: 03/14/2021] [Indexed: 11/09/2022]
Abstract
Motivated by a clinical trial conducted by Janssen Pharmaceutica in which a flexible dosing regimen is compared to placebo, we evaluate how switchers in the treatment arm (i.e., patients who were switched to the higher dose) would have fared had they been kept on the low dose. This is done in order to understand whether flexible dosing is potentially beneficial for them. Simply comparing these patients' responses with those of patients who stayed on the low dose does not likely entail a satisfactory evaluation because the latter patients are usually in a better health condition. Because the available information in the considered trial is too limited to enable a reliable adjustment, we will instead transport data from a fixed dosing trial that has been conducted concurrently on the same target, albeit not in an identical patient population. In particular, we propose an estimator that relies on an outcome model, a model for switching, and a propensity score model for the association between study and patient characteristics. The proposed estimator is asymptotically unbiased if either the outcome or the propensity score model is correctly specified, and efficient (under the semiparametric model where the randomization probabilities are known and independent of baseline covariates) when all models are correctly specified. The proposed method for transporting information from an external study is more broadly applicable in studies where a classical confounding adjustment is not possible due to near positivity violation (e.g., studies where switching takes place in a (near) deterministic manner). Monte Carlo simulations and application to the motivating study demonstrate adequate performance.
Affiliation(s)
- Kelly Van Lancker
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - An Vandebosch
- Janssen R&D, a division of Janssen Pharmaceutica NV, Beerse, Belgium
| | - Stijn Vansteelandt
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.,Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, United Kingdom
| |
|
23
|
Abstract
This article discusses the augmented inverse propensity weighted (AIPW) estimator as an estimator for average treatment effects. The AIPW combines the properties of the regression-based estimator and the inverse probability weighted (IPW) estimator and is therefore a “doubly robust” method in that it requires only one of the propensity or outcome models to be correctly specified, not necessarily both. Even though this estimator has been known for years, it is rarely used in practice. After explaining the estimator and proving the double robustness property, I conduct a simulation study to compare the AIPW efficiency with IPW and regression under different scenarios of misspecification. In 2 real-world examples, I provide a step-by-step guide on implementing the AIPW estimator in practice. I show that it is an easily usable method that extends the IPW to reduce variability and improve estimation accuracy.
Highlights
• Average treatment effects are often estimated by regression or inverse probability weighting methods, but both are vulnerable to bias.
• The augmented inverse probability weighted estimator is an easy-to-use method for average treatment effects that can be less biased because of the double robustness property.
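The double robustness property described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the article's step-by-step guide: the function names `aipw_ate` and `simulate`, the toy data-generating process, and the use of known (rather than fitted) nuisance functions are all hypothetical. The estimator averages the standard AIPW contrast, in which each arm's outcome-model prediction is corrected by an inverse-probability-weighted residual.

```python
import numpy as np

def aipw_ate(y, a, ps, mu1, mu0):
    """AIPW estimate of the average treatment effect E[Y(1) - Y(0)].

    y   : observed outcomes
    a   : binary treatment indicator (1 = treated, 0 = control)
    ps  : propensity scores Pr(A = 1 | X), strictly between 0 and 1
    mu1 : outcome-model predictions E[Y | A = 1, X] for every unit
    mu0 : outcome-model predictions E[Y | A = 0, X] for every unit
    """
    y, a, ps, mu1, mu0 = (np.asarray(v, dtype=float) for v in (y, a, ps, mu1, mu0))
    if np.any((ps <= 0.0) | (ps >= 1.0)):
        raise ValueError("propensity scores must lie strictly between 0 and 1")
    # Outcome-model prediction plus an inverse-probability-weighted residual, per arm.
    psi1 = mu1 + a * (y - mu1) / ps
    psi0 = mu0 + (1.0 - a) * (y - mu0) / (1.0 - ps)
    return float(np.mean(psi1 - psi0))

def simulate(n=100_000, seed=0):
    """Toy data (hypothetical): x confounds treatment and outcome; the true ATE is 2."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    ps = 1.0 / (1.0 + np.exp(-x))           # true propensity score
    a = rng.binomial(1, ps)
    y = 2.0 * a + x + rng.normal(size=n)    # true outcome model: E[Y | A, X] = 2A + X
    return x, a, y, ps

if __name__ == "__main__":
    x, a, y, ps = simulate()
    zeros = np.zeros_like(x)
    # Correct propensity model, deliberately wrong outcome model (all zeros).
    print(aipw_ate(y, a, ps, zeros, zeros))
    # Deliberately wrong propensity model (constant 0.5), correct outcome model.
    print(aipw_ate(y, a, np.full_like(x, 0.5), x + 2.0, x))
```

In this toy setup both calls should land close to the true effect of 2, even though each one misspecifies one of the two nuisance models. In applied work both models would be estimated from data, which this sketch leaves out.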
Affiliation(s)
- Christoph F Kurz
- Munich School of Management and Munich Center of Health Sciences, Ludwig-Maximilians-Universität Munich, Munich, Germany.,Institute of Health Economics and Health Care Management, Helmholtz Zentrum München, Neuherberg, Germany
| |
|
24
|
Dahabreh IJ, Robertson SE, Steingrimsson JA, Stuart EA, Hernán MA. Extending inferences from a randomized trial to a new target population. Stat Med 2020; 39:1999-2014. [PMID: 32253789 DOI: 10.1002/sim.8426] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 05/01/2018] [Revised: 07/02/2019] [Accepted: 10/02/2019] [Indexed: 12/20/2022]
Abstract
When treatment effect modifiers influence the decision to participate in a randomized trial, the average treatment effect in the population represented by the randomized individuals will differ from the effect in other populations. In this tutorial, we consider methods for extending causal inferences about time-fixed treatments from a trial to a new target population of nonparticipants, using data from a completed randomized trial and baseline covariate data from a sample from the target population. We examine methods based on modeling the expectation of the outcome, the probability of participation, or both (doubly robust). We compare the methods in a simulation study and show how they can be implemented in software. We apply the methods to a randomized trial nested within a cohort of trial-eligible patients to compare coronary artery surgery plus medical therapy versus medical therapy alone for patients with chronic coronary artery disease. We conclude by discussing issues that arise when using the methods in applied analyses.
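The three families of methods named above (modeling the outcome, modeling participation, or both) can be sketched in Python as follows. This is a hedged illustration rather than the tutorial's software: the function name `transport_mean` and its arguments are hypothetical, the nuisance functions are assumed to be supplied as already-fitted arrays, and the estimand is taken to be the potential outcome mean among nonparticipants. The participation-based version uses the usual inverse-odds-of-participation weights combined with inverse probability of treatment weights.

```python
import numpy as np

def transport_mean(y, s, a, arm, p_part, e_arm, g_arm, method="dr"):
    """Estimate E[Y^arm | S = 0], the potential outcome mean among nonparticipants.

    y      : outcomes (only used for trial participants; may be NaN elsewhere)
    s      : 1 = randomized trial participant, 0 = sampled from the target population
    a      : assigned treatment (only used for trial participants)
    arm    : treatment level of interest
    p_part : Pr(S = 1 | X), probability of trial participation (assumed fitted elsewhere)
    e_arm  : Pr(A = arm | X, S = 1), randomization probability for this arm
    g_arm  : E[Y | X, S = 1, A = arm], outcome-model prediction for every unit
    method : "outcome", "iow" (inverse odds weighting), or "dr" (doubly robust)
    """
    s = np.asarray(s, dtype=float)
    p_part, e_arm, g_arm = (np.asarray(v, dtype=float) for v in (p_part, e_arm, g_arm))
    in_arm = (s == 1) & (np.asarray(a) == arm)
    # Zero out outcomes that are never used, so NaNs outside the arm do no harm.
    y = np.where(in_arm, np.asarray(y, dtype=float), 0.0)
    n_target = np.sum(s == 0)

    # Inverse odds of participation times inverse probability of the assigned arm.
    w = in_arm * (1.0 - p_part) / (p_part * e_arm)

    if method == "outcome":
        return float(np.sum((1.0 - s) * g_arm) / n_target)
    if method == "iow":
        return float(np.sum(w * y) / n_target)
    if method == "dr":
        return float(np.sum(w * (y - g_arm) + (1.0 - s) * g_arm) / n_target)
    raise ValueError("method must be 'outcome', 'iow', or 'dr'")

if __name__ == "__main__":
    # Tiny hand-checkable example: two trial participants, two target-population units.
    y = [3.0, 1.0, np.nan, np.nan]
    s = [1, 1, 0, 0]
    a = [1, 0, 0, 0]
    half = [0.5, 0.5, 0.5, 0.5]
    g1 = [2.0, 2.0, 2.0, 4.0]
    for m in ("outcome", "iow", "dr"):
        print(m, transport_mean(y, s, a, 1, half, half, g1, method=m))
```

A treatment effect in the target population would be the difference between two such means, one per arm. Variance estimation and model fitting, which the tutorial covers, are outside the scope of this sketch.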
Affiliation(s)
- Issa J Dahabreh
- Center for Evidence Synthesis in Health, Brown University, Providence, Rhode Island.,Department of Health Services, Policy & Practice, Brown University, Providence, Rhode Island.,Department of Epidemiology, Brown University, Providence, Rhode Island.,Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Sarah E Robertson
- Center for Evidence Synthesis in Health, Brown University, Providence, Rhode Island.,Department of Health Services, Policy & Practice, Brown University, Providence, Rhode Island
| | - Jon A Steingrimsson
- Department of Biostatistics, School of Public Health, Brown University, Providence, Rhode Island
| | - Elizabeth A Stuart
- Departments of Mental Health, Biostatistics, and Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
| | - Miguel A Hernán
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.,Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.,Harvard-MIT Division of Health Sciences and Technology, Boston, Massachusetts
| |
|
25
|
Shu D, Yi GY. Causal inference with noisy data: Bias analysis and estimation approaches to simultaneously addressing missingness and misclassification in binary outcomes. Stat Med 2020; 39:456-468. [PMID: 31802532 DOI: 10.1002/sim.8419] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/26/2019] [Revised: 08/21/2019] [Accepted: 10/13/2019] [Indexed: 11/08/2022]
Abstract
Causal inference has been widely conducted in various fields and many methods have been proposed for different settings. However, for noisy data with both mismeasurements and missing observations, those methods often break down. In this paper, we consider the problem in which binary outcomes are subject to both missingness and misclassification, when the interest is in estimation of the average treatment effect (ATE). We examine the asymptotic biases caused by ignoring missingness and/or misclassification and establish the intrinsic connections between missingness effects and misclassification effects on the estimation of the ATE. We develop valid weighted estimation methods to simultaneously correct for missingness and misclassification effects. To provide protection against model misspecification, we further propose a doubly robust correction method which yields consistent estimators even when one of the treatment model or the outcome model is misspecified. Simulation studies are conducted to assess the performance of the proposed methods. An application to smoking cessation data is reported to illustrate the use of the proposed methods.
Affiliation(s)
- Di Shu
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts.,Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
| | - Grace Y Yi
- Department of Statistical and Actuarial Sciences, Department of Computer Science, University of Western Ontario, London, Ontario, Canada.,Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
| |
|
26
|
Wang G, Schnitzer ME, Menzies D, Viiklepp P, Holtz TH, Benedetti A. Estimating treatment importance in multidrug-resistant tuberculosis using Targeted Learning: An observational individual patient data network meta-analysis. Biometrics 2019; 76:1007-1016. [PMID: 31868919 DOI: 10.1111/biom.13210] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/15/2018] [Revised: 12/06/2019] [Accepted: 12/09/2019] [Indexed: 01/25/2023]
Abstract
Persons with multidrug-resistant tuberculosis (MDR-TB) have a disease resulting from a strain of tuberculosis (TB) that does not respond to at least isoniazid and rifampicin, the two most effective anti-TB drugs. MDR-TB is always treated with multiple antimicrobial agents. Our data consist of individual patient data from 31 international observational studies with varying prescription practices, access to medications, and distributions of antibiotic resistance. In this study, we develop identifiability criteria for the estimation of a global treatment importance metric in the context where not all medications are observed in all studies. With stronger causal assumptions, this treatment importance metric can be interpreted as the effect of adding a medication to the existing treatments. We then use this metric to rank 15 observed antimicrobial agents in terms of their estimated add-on value. Using the concept of transportability, we propose an implementation of targeted maximum likelihood estimation, a doubly robust and locally efficient plug-in estimator, to estimate the treatment importance metric. A clustered sandwich estimator is adopted to compute variance estimates and produce confidence intervals. Simulation studies are conducted to assess the performance of our estimator, verify the double robustness property, and assess the appropriateness of the variance estimation approach.
Affiliation(s)
- Guanbo Wang
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
| | - Mireille E Schnitzer
- Faculty of Pharmacy, Université de Montréal, Montréal, Québec, Canada.,Department of Social and Preventive Medicine, Université de Montréal, Montréal, Québec, Canada
| | - Dick Menzies
- Respiratory Epidemiology and Clinical Research Unit, McGill University Health Centre, Montréal, Québec, Canada.,Department of Medicine, McGill University, Montréal, Québec, Canada
| | - Piret Viiklepp
- Estonian Tuberculosis Registry, National Institute for Health Development, Tallinn, Estonia
| | - Timothy H Holtz
- Division of Global HIV and TB, Centers for Disease Control and Prevention, Atlanta, Georgia
| | - Andrea Benedetti
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada.,Respiratory Epidemiology and Clinical Research Unit, McGill University Health Centre, Montréal, Québec, Canada.,Department of Medicine, McGill University, Montréal, Québec, Canada
| |
|
27
|
Díaz I. Statistical inference for data-adaptive doubly robust estimators with survival outcomes. Stat Med 2019; 38:2735-2748. [PMID: 30950107 DOI: 10.1002/sim.8156] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/05/2018] [Revised: 02/25/2019] [Accepted: 03/08/2019] [Indexed: 11/06/2022]
Abstract
The consistency of doubly robust estimators relies on the consistent estimation of at least one of two nuisance regression parameters. In moderate-to-large dimensions, the use of flexible data-adaptive regression estimators may aid in achieving this consistency. However, $n^{1/2}$-consistency of doubly robust estimators is not guaranteed if one of the nuisance estimators is inconsistent. In this paper, we present a doubly robust estimator for survival analysis with the novel property that it converges to a Gaussian variable at an $n^{1/2}$-rate for a large class of data-adaptive estimators of the nuisance parameters, under the only assumption that at least one of them is consistently estimated at an $n^{1/4}$-rate. This result is achieved through the adaptation of recent ideas in semiparametric inference, which amount to (i) Gaussianizing (ie, making asymptotically linear) a drift term that arises in the asymptotic analysis of the doubly robust estimator and (ii) using cross-fitting to avoid entropy conditions on the nuisance estimators. We present the formula of the asymptotic variance of the estimator, which allows for the computation of doubly robust confidence intervals and p values. We illustrate the finite-sample properties of the estimator in simulation studies and demonstrate its use in a phase III clinical trial for estimating the effect of a novel therapy for the treatment of human epidermal growth factor receptor 2 (HER2)-positive breast cancer.
Affiliation(s)
- Iván Díaz
- Division of Biostatistics, Weill Cornell Medicine, New York, New York
| |
|
28
|
Dahabreh IJ, Robertson SE, Tchetgen EJT, Stuart EA, Hernán MA. Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals. Biometrics 2019; 75:685-694. [PMID: 30488513 PMCID: PMC10938232 DOI: 10.1111/biom.13009] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 09/20/2017] [Accepted: 11/02/2018] [Indexed: 12/20/2022]
Abstract
We consider methods for causal inference in randomized trials nested within cohorts of trial-eligible individuals, including those who are not randomized. We show how baseline covariate data from the entire cohort, and treatment and outcome data only from randomized individuals, can be used to identify potential (counterfactual) outcome means and average treatment effects in the target population of all eligible individuals. We review identifiability conditions, propose estimators, and assess the estimators' finite-sample performance in simulation studies. As an illustration, we apply the estimators in a trial nested within a cohort of trial-eligible individuals to compare coronary artery bypass grafting surgery plus medical therapy vs. medical therapy alone for chronic coronary artery disease.
Affiliation(s)
- Issa J. Dahabreh
- Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, RI, U.S.A
- Departments of Health Services, Policy & Practice and Epidemiology, Brown University, Providence, RI, U.S.A
- Department of Epidemiology, Harvard-T.H. Chan School of Public Health, Boston, MA, U.S.A
- Sarah E. Robertson
- Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, RI, U.S.A
- Elizabeth A. Stuart
- Departments of Mental Health, Biostatistics, and Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A
- Miguel A. Hernán
- Department of Epidemiology, Harvard-T.H. Chan School of Public Health, Boston, MA, U.S.A
- Department of Biostatistics, Harvard-T.H. Chan School of Public Health, Boston, MA, U.S.A
- Harvard-MIT Division of Health Sciences and Technology, Boston, MA, U.S.A
29
Zhang Z, Hu Z, Liu C. Estimating the Population Average Treatment Effect in Observational Studies with Choice-Based Sampling. Int J Biostat 2019; 15:/j/ijb.ahead-of-print/ijb-2018-0093/ijb-2018-0093.xml. [PMID: 30990786 DOI: 10.1515/ijb-2018-0093] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/11/2018] [Accepted: 04/02/2019] [Indexed: 11/15/2022]
Abstract
We consider causal inference in observational studies with choice-based sampling, in which subject enrollment is stratified on treatment choice. Choice-based sampling has been considered mainly in the econometrics literature, but it can be useful for biomedical studies as well, especially when one of the treatments being compared is uncommon. We propose new methods for estimating the population average treatment effect under choice-based sampling, including doubly robust methods motivated by semiparametric theory. A doubly robust, locally efficient estimator may be obtained by replacing nuisance functions in the efficient influence function with estimates based on parametric models. The use of machine learning methods to estimate nuisance functions leads to estimators that are consistent and asymptotically efficient under broader conditions. The methods are compared in simulation experiments and illustrated in the context of a large observational study in obstetrics. We also make suggestions on how to choose the target proportion of treated subjects and the sample size in designing a choice-based observational study.
Affiliation(s)
- Zhiwei Zhang
- Department of Statistics, University of California, Riverside, CA, USA
- Zonghui Hu
- Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, MD, USA
- Chunling Liu
- Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China
30
Miles CH, Schwartz J, Tchetgen EJT. A class of semiparametric tests of treatment effect robust to confounder measurement error. Stat Med 2018; 37:3403-3416. [PMID: 29938816 PMCID: PMC10712939 DOI: 10.1002/sim.7852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2017] [Revised: 02/10/2018] [Accepted: 05/18/2018] [Indexed: 11/06/2022]
Abstract
When assessing the presence of an exposure causal effect on a given outcome, measurement error of a confounder can inflate the type I error rate of a treatment effect in even the simplest of settings. In this paper, we develop a large class of semiparametric test statistics of an exposure causal effect, which are completely robust to additive unbiased measurement error of a subset of confounders. A unique and appealing feature of our proposed methodology is that it requires no external information such as validation data or replicates of error-prone confounders. We present a doubly robust form of this test that requires the exposure mean model to be linear in the mismeasured confounders, and only one of two models involving error-free confounders to be correctly specified for the resulting test statistic to have correct type I error rate. We demonstrate validity within our class of test statistics through simulation studies. We apply the methods to a multi-US-city time-series data set to test for an effect of temperature on mortality while adjusting for atmospheric particulate matter with diameter of 2.5 micrometres or less, which is known to be measured with error.
Affiliation(s)
- Caleb H. Miles
- Division of Biostatistics, University of California at Berkeley, Berkeley, CA, U.S.A
- Joel Schwartz
- Departments of Environmental Health and Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, U.S.A
- Eric J. Tchetgen Tchetgen
- Departments of Biostatistics and Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA 94720-7358, U.S.A
31
Cefalu M, Dominici F, Arvold N, Parmigiani G. Model averaged double robust estimation. Biometrics 2017; 73:410-421. [PMID: 27893927 PMCID: PMC5466877 DOI: 10.1111/biom.12622] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 08/01/2014] [Revised: 09/01/2016] [Accepted: 09/01/2016] [Indexed: 11/27/2022]
Abstract
Researchers estimating causal effects are increasingly challenged with decisions on how to best control for a potentially high-dimensional set of confounders. Typically, a single propensity score model is chosen and used to adjust for confounding, while the uncertainty surrounding which covariates to include in the propensity score model is often ignored, and failure to include even one important confounder will result in bias. We propose a practical and generalizable approach that overcomes the limitations described above through the use of model averaging. We develop and evaluate this approach in the context of double robust estimation. More specifically, we introduce the model averaged double robust (MA-DR) estimators, which account for model uncertainty in both the propensity score and outcome model through the use of model averaging. The MA-DR estimators are defined as weighted averages of double robust estimators, where each double robust estimator corresponds to a specific choice of the outcome model and the propensity score model. The MA-DR estimators extend the desirable double robustness property by achieving consistency under the much weaker assumption that either the true propensity score model or the true outcome model be within a specified, possibly large, class of models. Using simulation studies, we also assessed small sample properties, and found that MA-DR estimators can reduce mean squared error substantially, particularly when the set of potential confounders is large relative to the sample size. We apply the methodology to estimate the average causal effect of temozolomide plus radiotherapy versus radiotherapy alone on one-year survival in a cohort of 1887 Medicare enrollees who were diagnosed with glioblastoma between June 2005 and December 2009.
Affiliation(s)
- Nils Arvold
- St. Luke's Radiation Oncology Associates, Duluth, MN, USA
- Giovanni Parmigiani
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
32
Kreif N, Gruber S, Radice R, Grieve R, Sekhon JS. Evaluating treatment effectiveness under model misspecification: A comparison of targeted maximum likelihood estimation with bias-corrected matching. Stat Methods Med Res 2016; 25:2315-2336. [PMID: 24525488 PMCID: PMC5051604 DOI: 10.1177/0962280214521341] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/17/2022]
Abstract
Statistical approaches for estimating treatment effectiveness commonly model the endpoint, or the propensity score, using parametric regressions such as generalised linear models. Misspecification of these models can lead to biased parameter estimates. We compare two approaches that combine the propensity score and the endpoint regression, and can make weaker modelling assumptions, by using machine learning approaches to estimate the regression function and the propensity score. Targeted maximum likelihood estimation is a double-robust method designed to reduce bias in the estimate of the parameter of interest. Bias-corrected matching reduces bias due to covariate imbalance between matched pairs by using regression predictions. We illustrate the methods in an evaluation of different types of hip prosthesis on the health-related quality of life of patients with osteoarthritis. We undertake a simulation study, grounded in the case study, to compare the relative bias, efficiency and confidence interval coverage of the methods. We consider data generating processes with non-linear functional form relationships, normal and non-normal endpoints. We find that across the circumstances considered, bias-corrected matching generally reported less bias, but higher variance than targeted maximum likelihood estimation. When either targeted maximum likelihood estimation or bias-corrected matching incorporated machine learning, bias was much reduced, compared to using misspecified parametric models.
Affiliation(s)
- Noémi Kreif
- Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
- Susan Gruber
- Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA
- Richard Grieve
- Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
33
Liu W, Zhang Z, Schroeder RJ, Ho M, Zhang B, Long C, Zhang H, Irony TZ. Joint Estimation of Treatment and Placebo Effects in Clinical Trials with Longitudinal Blinding Assessments. J Am Stat Assoc 2015; 111:538-548. [PMID: 27110045 DOI: 10.1080/01621459.2015.1130633] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
Abstract
In some therapeutic areas, treatment evaluation is frequently complicated by a possible placebo effect (i.e., the psychobiological effect of a patient's knowledge or belief of being treated). When a substantial placebo effect is likely to exist, it is important to distinguish the treatment and placebo effects in quantifying the clinical benefit of a new treatment. These causal effects can be formally defined in a joint causal model that includes treatment (e.g., new versus placebo) and treatmentality (i.e., a patient's belief or mentality about which treatment she or he has received) as separate exposures. Information about the treatmentality exposure can be obtained from blinding assessments, which are increasingly common in clinical trials where blinding success is in question. Assuming that treatmentality has a lagged effect and is measured at multiple time points, this article is concerned with joint evaluation of treatment and placebo effects in clinical trials with longitudinal follow-up, possibly with monotone missing data. We describe and discuss several methods adapted from the longitudinal causal inference literature, apply them to a weight loss study, and compare them in simulation experiments that mimic the weight loss study.
Affiliation(s)
- Wei Liu
- Department of Mathematics, Harbin Institute of Technology, Harbin, P. R. China; Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Zhiwei Zhang
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- R Jason Schroeder
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Martin Ho
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Bo Zhang
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Cynthia Long
- Division of Reproductive, Gastro-Renal, and Urological Devices, Office of Device Evaluation, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Hui Zhang
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Telba Z Irony
- Office of Biostatistics and Epidemiology, Center for Biologic Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, USA
34
Zhang Z, Liu W, Zhang B, Tang L, Zhang J. Causal inference with missing exposure information: Methods and applications to an obstetric study. Stat Methods Med Res 2013; 25:2053-2066. [PMID: 24318273 DOI: 10.1177/0962280213513758] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022]
Abstract
Causal inference in observational studies is frequently challenged by the occurrence of missing data, in addition to confounding. Motivated by the Consortium on Safe Labor, a large observational study of obstetric labor practice and birth outcomes, this article focuses on the problem of missing exposure information in a causal analysis of observational data. This problem can be approached from different angles (i.e. missing covariates and causal inference), and useful methods can be obtained by drawing upon the available techniques and insights in both areas. In this article, we describe and compare a collection of methods based on different modeling assumptions, under standard assumptions for missing data (i.e. missing-at-random and positivity) and for causal inference with complete data (i.e. no unmeasured confounding and another positivity assumption). These methods involve three models: one for treatment assignment, one for the dependence of outcome on treatment and covariates, and one for the missing data mechanism. In general, consistent estimation of causal quantities requires correct specification of at least two of the three models, although there may be some flexibility as to which two models need to be correct. Such flexibility is afforded by doubly robust estimators adapted from the missing covariates literature and the literature on causal inference with complete data, and by a newly developed triply robust estimator that is consistent if any two of the three models are correct. The methods are applied to the Consortium on Safe Labor data and compared in a simulation study mimicking the Consortium on Safe Labor.
Affiliation(s)
- Zhiwei Zhang
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, MD, USA
- Wei Liu
- Department of Mathematics, Harbin Institute of Technology, Harbin, P.R. China
- Bo Zhang
- Biostatistics Core, School of Biological and Population Health Sciences, College of Public Health and Human Sciences, Oregon State University, Corvallis, OR, USA
- Li Tang
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jun Zhang
- MOE and Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, P.R. China
35
Haneuse S, Rotnitzky A. Estimation of the effect of interventions that modify the received treatment. Stat Med 2013; 32:5260-77. [PMID: 23913589 DOI: 10.1002/sim.5907] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 11/27/2012] [Revised: 04/29/2013] [Accepted: 06/14/2013] [Indexed: 11/08/2022]
Abstract
Motivated by a study of surgical operating time and post-operative outcomes for lung cancer, we consider the estimation of causal effects of continuous point-exposure treatments. To investigate causality, the standard paradigm postulates a series of treatment-specific counterfactual outcomes and establishes conditions under which we may learn about them from observational study data. While many choices are possible, causal effects are typically defined in terms of variation of the mean of counterfactual outcomes in hypothetical worlds in which specific treatment strategies are 'applied' to all individuals. For example, one might compare two worlds: one where each individual receives some specific dose and a second where each individual receives some other dose. For our motivating study, defining causal effects in this way corresponds to (hypothetical) interventions that could not conceivably be implemented in the real world. In this work, we consider an alternative, complementary framework that investigates variation in the mean of counterfactual outcomes under hypothetical treatment strategies where each individual receives a treatment dose corresponding to that actually received but modified in some pre-specified way. Quantification of this variation is defined in terms of contrasts for specific interventions as well as in terms of the parameters of a new class of marginal structural mean models. Within this framework, we propose three estimators: an outcome regression estimator, an inverse probability of treatment weighted estimator and a doubly robust estimator. We illustrate the methods with an analysis of the motivating data.
Affiliation(s)
- S Haneuse
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, U.S.A
36
Abstract
An important scientific goal of studies in the health and social sciences is increasingly to determine to what extent the total effect of a point exposure is mediated by an intermediate variable on the causal pathway between the exposure and the outcome. A causal framework has recently been proposed for mediation analysis, which gives rise to new definitions, formal identification results and novel estimators of direct and indirect effects. In the present paper, the author describes a new inverse odds ratio-weighted approach to estimate so-called natural direct and indirect effects. The approach, which uses as a weight the inverse of an estimate of the odds ratio function relating the exposure and the mediator, is universal in that it can be used to decompose total effects in a number of regression models commonly used in practice. Specifically, the approach may be used for effect decomposition in generalized linear models with a nonlinear link function, and in a number of other commonly used models such as the Cox proportional hazards regression for a survival outcome. The approach is simple and can be implemented in standard software provided a weight can be specified for each observation. An additional advantage of the method is that it easily incorporates multiple mediators of a categorical, discrete or continuous nature.
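The weight described in this abstract can be written out for one common special case. The notation below is an assumption made for illustration and does not appear in the abstract: a binary exposure $A$, a mediator $M$, baseline covariates $C$, and a logistic working model for the exposure given the mediator and covariates.

```latex
\operatorname{logit}\Pr(A = 1 \mid M, C) = \beta_0 + \beta_M^{\top} M + \beta_C^{\top} C,
\qquad
\widehat{\mathrm{OR}}(A, M \mid C) = \exp\!\left(\hat\beta_M^{\top} M \, A\right),
\qquad
\hat W = \widehat{\mathrm{OR}}(A, M \mid C)^{-1} = \exp\!\left(-\hat\beta_M^{\top} M \, A\right).
```

Under this working model, unexposed individuals receive weight 1 and exposed individuals receive weight $\exp(-\hat\beta_M^{\top} M)$. A regression of the outcome on $A$ and $C$ weighted by $\hat W$ would then target the natural direct effect, the same regression without weights the total effect, and their difference the natural indirect effect, all on the scale of the chosen link function.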
Affiliation(s)
- Eric J Tchetgen Tchetgen
- Department of Epidemiology, Harvard University, Boston, MA, U.S.A.; Department of Biostatistics, Harvard University, Boston, MA, U.S.A
37
Abstract
Summarizing the effect of many covariates through a few linear combinations is an effective way of reducing covariate dimension and is the backbone of (sufficient) dimension reduction. Because the replacement of high-dimensional covariates by low-dimensional linear combinations is performed with minimal assumptions on the specific regression form, it enjoys attractive advantages but also encounters unique challenges in comparison with the variable selection approach. We review the current literature on dimension reduction with an emphasis on the two most popular models, where the dimension reduction affects the conditional distribution and the conditional mean, respectively. We discuss various estimation and inference procedures in different levels of detail, focusing on their underlying ideas rather than technicalities. We also discuss some unsolved problems in this area for potential future research.
Affiliation(s)
- Yanyuan Ma
- Department of Statistics, Texas A&M University, College Station, TX 77843, USA
38
Abstract
The current statistical literature on causal inference is mostly concerned with binary or categorical exposures, even though exposures of a quantitative nature are frequently encountered in epidemiologic research. In this article, we review the available methods for estimating the dose-response curve for a quantitative exposure, which include ordinary regression based on an outcome regression model, inverse propensity weighting and stratification based on a propensity function model, and an augmented inverse propensity weighting method that is doubly robust with respect to the two models. We note that an outcome regression model often imposes an implicit constraint on the dose-response curve, and propose a flexible modeling strategy that avoids constraining the dose-response curve. We also propose two new methods: a weighted regression method that combines ordinary regression with inverse propensity weighting and a stratified regression method that combines ordinary regression with stratification. The proposed methods are similar to the augmented inverse propensity weighting method in the sense of double robustness, but easier to implement and more generally applicable. The methods are illustrated with an obstetric example and compared in simulation studies.
Affiliation(s)
- Zhiwei Zhang
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Jie Zhou
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Weihua Cao
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Jun Zhang
- MOE and Shanghai Key Laboratory of Children's Environmental Health, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, P.R. China
39
Abstract
Estimating the causal effect of an intervention on a population typically involves defining parameters in a nonparametric structural equation model (Pearl, 2000, Causality: Models, Reasoning, and Inference) in which the treatment or exposure is deterministically assigned in a static or dynamic way. We define a new causal parameter that takes into account the fact that intervention policies can result in stochastically assigned exposures. The statistical parameter that identifies the causal parameter of interest is established. Inverse probability of treatment weighting (IPTW), augmented IPTW (A-IPTW), and targeted maximum likelihood estimators (TMLE) are developed. A simulation study is performed to demonstrate the properties of these estimators, which include the double robustness of the A-IPTW and the TMLE. An application example using physical activity data is presented.
Affiliation(s)
- Iván Díaz Muñoz
- Division of Biostatistics, School of Public Health, 101 Haviland Hall, University of California at Berkeley, Berkeley, California 94720-7358, U.S.A
- Mark van der Laan
- Division of Biostatistics, School of Public Health, 101 Haviland Hall, University of California at Berkeley, Berkeley, California 94720-7358, U.S.A
40
Abstract
An objective of randomized placebo-controlled preventive HIV vaccine efficacy trials is to assess the relationship between the vaccine effect to prevent infection and the genetic distance of the exposing HIV to the HIV strain represented in the vaccine construct. Motivated by this objective, a mark-specific proportional hazards model with a continuum of competing risks has recently been studied, where the genetic distance of the transmitting strain is the continuous 'mark' defined and observable only in failures. A high percentage of genetic marks of interest may be missing for a variety of reasons, predominantly due to rapid evolution of HIV sequences after transmission before a blood sample is drawn from which HIV sequences are measured. This research investigates the stratified mark-specific proportional hazards model with missing marks where the baseline functions may vary with strata. We develop two consistent estimation approaches, the first based on the inverse probability weighted complete-case (IPW) technique, and the second based on augmenting the IPW estimator by incorporating auxiliary information predictive of the mark. We investigate the asymptotic properties and finite-sample performance of the two estimators, and show that the augmented IPW estimator, which satisfies a double robustness property, is more efficient.
Affiliation(s)
- Yanqing Sun
- Department of Mathematics and Statistics, The University of North Carolina at Charlotte, Charlotte, NC 28223, USA.
41
Abstract
We consider nonparametric regression of a scalar outcome on a covariate when the outcome is missing at random (MAR) given the covariate and other observed auxiliary variables. We propose a class of augmented inverse probability weighted (AIPW) kernel estimating equations for nonparametric regression under MAR. We show that AIPW kernel estimators are consistent when the probability that the outcome is observed, that is, the selection probability, is either known by design or estimated under a correctly specified model. In addition, we show that a specific AIPW kernel estimator in our class that employs the fitted values from a model for the conditional mean of the outcome given covariates and auxiliaries is double-robust, that is, it remains consistent if this model is correctly specified even if the selection probabilities are modeled or specified incorrectly. Furthermore, when both models happen to be right, this double-robust estimator attains the smallest possible asymptotic variance of all AIPW kernel estimators and maximally extracts the information in the auxiliary variables. We also describe a simple correction to the AIPW kernel estimating equations that, while preserving double-robustness, ensures efficiency improvement over nonaugmented IPW estimation when the selection model is correctly specified, regardless of the validity of the second model used in the augmentation term. We perform simulations to evaluate the finite sample performance of the proposed estimators, and apply the methods to the analysis of the AIDS Costs and Services Utilization Survey data. Technical proofs are available online.
Affiliation(s)
- Lu Wang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109
- Andrea Rotnitzky
- Department of Economics, Di Tella University, Buenos Aires, 1425, Argentina and Adjunct Professor, Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115
- Xihong Lin
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115