1
Wang T, Zhao H, Yang S, Tang S, Cui Z, Li L, Faries DE. Propensity score matching for estimating a marginal hazard ratio. Stat Med 2024; 43:2783-2810. [PMID: 38705726 PMCID: PMC11178458 DOI: 10.1002/sim.10103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 01/31/2024] [Accepted: 04/24/2024] [Indexed: 05/07/2024]
Abstract
Propensity score matching is commonly used to draw causal inference from observational survival data. However, its asymptotic properties have yet to be established, and variance estimation is still open to debate. We derive the statistical properties of the propensity score matching estimator of the marginal causal hazard ratio based on matching with replacement and a fixed number of matches. We also propose a double-resampling technique for variance estimation that takes into account the uncertainty due to propensity score estimation prior to matching.
Affiliation(s)
- Honghe Zhao
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Shu Yang
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Shuhan Tang
  - Eli Lilly and Company, Indianapolis, Indiana, USA
- Zhanglin Cui
  - Eli Lilly and Company, Indianapolis, Indiana, USA
- Li Li
  - Eli Lilly and Company, Indianapolis, Indiana, USA
2
Park JE, Campbell H, Towle K, Yuan Y, Jansen JP, Phillippo D, Cope S. Unanchored Population-Adjusted Indirect Comparison Methods for Time-to-Event Outcomes Using Inverse Odds Weighting, Regression Adjustment, and Doubly Robust Methods With Either Individual Patient or Aggregate Data. Value in Health: The Journal of the International Society for Pharmacoeconomics and Outcomes Research 2024; 27:278-286. [PMID: 38135212 DOI: 10.1016/j.jval.2023.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 10/18/2023] [Accepted: 11/15/2023] [Indexed: 12/24/2023]
Abstract
OBJECTIVES: Several methods for unanchored population-adjusted indirect comparisons (PAICs) are available. Exploring alternative adjustment methods, depending on the available individual patient data (IPD) and the aggregate data (AD) in the external study, may help minimize bias in unanchored indirect comparisons. However, methods for time-to-event outcomes are not well understood. This study provides an overview and comparison of methods using a case study to increase familiarity. A recent method is applied to marginalize conditional hazard ratios, which allows for the comparisons of methods, and a doubly robust method is proposed.
METHODS: The following PAIC methods were compared through a case study in third-line small cell lung cancer, comparing nivolumab with standard of care based on a single-arm phase II trial (CheckMate 032) and real-world study (Flatiron) in terms of overall survival: IPD-IPD analyses using inverse odds weighting, regression adjustment, and a doubly robust method; IPD-AD analyses using matching-adjusted indirect comparison, simulated treatment comparison, and a doubly robust method.
RESULTS: Nivolumab extended survival versus standard of care with hazard ratios ranging from 0.63 (95% CI 0.44-0.90) in naive comparisons (identical estimates for IPD-IPD and IPD-AD analyses) to 0.69 (95% CI 0.44-0.98) in the IPD-IPD analyses using regression adjustment. Regression-based and doubly robust estimates yielded slightly wider confidence intervals versus the propensity score-based analyses.
CONCLUSIONS: The proposed doubly robust approach for time-to-event outcomes may help to minimize bias due to model misspecification. However, all methods for unanchored PAIC rely on the strong assumption that all prognostic covariates have been included.
Affiliation(s)
- Julie E Park
  - PRECISIONheor, Evidence Synthesis and Decision Modeling, Vancouver, BC, Canada
- Harlan Campbell
  - PRECISIONheor, Evidence Synthesis and Decision Modeling, Vancouver, BC, Canada
  - University of British Columbia, Vancouver, BC, Canada
- Kevin Towle
  - PRECISIONheor, Evidence Synthesis and Decision Modeling, Vancouver, BC, Canada
- Yong Yuan
  - Worldwide Health Economics and Outcomes Research, Bristol Myers Squibb, Princeton, NJ, USA
- Jeroen P Jansen
  - PRECISIONheor, Evidence Synthesis and Decision Modeling, Vancouver, BC, Canada
- David Phillippo
  - University of Bristol, Bristol Medical School, Bristol, England, UK
- Shannon Cope
  - PRECISIONheor, Evidence Synthesis and Decision Modeling, Vancouver, BC, Canada
3
Wolock CJ, Gilbert PB, Simon N, Carone M. A framework for leveraging machine learning tools to estimate personalized survival curves. J Comput Graph Stat 2024; 33:1098-1108. [PMID: 39175935 PMCID: PMC11338658 DOI: 10.1080/10618600.2024.2304070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 01/06/2024] [Indexed: 08/24/2024]
Abstract
The conditional survival function of a time-to-event outcome subject to censoring and truncation is a common target of estimation in survival analysis. This parameter may be of scientific interest and also often appears as a nuisance in nonparametric and semiparametric problems. In addition to classical parametric and semiparametric methods (e.g., based on the Cox proportional hazards model), flexible machine learning approaches have been developed to estimate the conditional survival function. However, many of these methods are either implicitly or explicitly targeted toward risk stratification rather than overall survival function estimation. Others apply only to discrete-time settings or require inverse probability of censoring weights, which can be as difficult to estimate as the outcome survival function itself. Here, we employ a decomposition of the conditional survival function in terms of observable regression models in which censoring and truncation play no role. This allows application of an array of flexible regression and classification methods rather than only approaches that explicitly handle the complexities inherent to survival data. We outline estimation procedures based on this decomposition, empirically assess their performance, and demonstrate their use on data from an HIV vaccine trial. Supplementary materials for this article are available online.
Affiliation(s)
- Charles J. Wolock
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
- Peter B. Gilbert
  - Department of Biostatistics, University of Washington
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center
- Noah Simon
  - Department of Biostatistics, University of Washington
- Marco Carone
  - Department of Biostatistics, University of Washington
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center
4
Westling T, Luedtke A, Gilbert PB, Carone M. Inference for treatment-specific survival curves using machine learning. J Am Stat Assoc 2023; 119:1541-1553. [PMID: 39184837 PMCID: PMC11339859 DOI: 10.1080/01621459.2023.2205060] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/11/2021] [Accepted: 04/11/2023] [Indexed: 08/27/2024]
Abstract
In the absence of data from a randomized trial, researchers may aim to use observational data to draw causal inference about the effect of a treatment on a time-to-event outcome. In this context, interest often focuses on the treatment-specific survival curves, that is, the survival curves were the population under study to be assigned to receive the treatment or not. Under certain conditions, including that all confounders of the treatment-outcome relationship are observed, the treatment-specific survival curve can be identified with a covariate-adjusted survival curve. In this article, we propose a novel cross-fitted doubly-robust estimator that incorporates data-adaptive (e.g. machine learning) estimators of the conditional survival functions. We establish conditions on the nuisance estimators under which our estimator is consistent and asymptotically linear, both pointwise and uniformly in time. We also propose a novel ensemble learner for combining multiple candidate estimators of the conditional survival estimators. Notably, our methods and results accommodate events occurring in discrete or continuous time, or an arbitrary mix of the two. We investigate the practical performance of our methods using numerical studies and an application to the effect of a surgical treatment to prevent metastases of parotid carcinoma on mortality.
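The identification step mentioned in this abstract can be written out as a short formula. This is a sketch in notation that is not taken from the source: T(a) is assumed to denote the event time had treatment a been assigned, A the observed treatment, and W the measured confounders. Under consistency, no unmeasured confounding, and positivity, the treatment-specific survival curve equals the covariate-adjusted survival curve:

```latex
S_a(t) \;=\; P\{T(a) > t\} \;=\; E\big[\, P(T > t \mid A = a, W) \,\big]
```

The inner conditional survival function is the nuisance quantity that, per the abstract, is estimated with data-adaptive methods; recovering it from censored data requires a further assumption on the censoring mechanism, which is not spelled out here.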
Affiliation(s)
- Ted Westling
  - Department of Mathematics and Statistics, University of Massachusetts Amherst
- Alex Luedtke
  - Department of Statistics, University of Washington
- Peter B. Gilbert
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center
- Marco Carone
  - Department of Biostatistics, University of Washington
5
Tackney MS, Morris T, White I, Leyrat C, Diaz-Ordaz K, Williamson E. A comparison of covariate adjustment approaches under model misspecification in individually randomized trials. Trials 2023; 24:14. [PMID: 36609282 PMCID: PMC9817411 DOI: 10.1186/s13063-022-06967-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/05/2021] [Accepted: 11/28/2022] [Indexed: 01/09/2023] Open
Abstract
Adjustment for baseline covariates in randomized trials has been shown to lead to gains in power and can protect against chance imbalances in covariates. For continuous covariates, there is a risk that the form of the relationship between the covariate and outcome is misspecified when taking an adjusted approach. Using a simulation study focusing on individually randomized trials with small sample sizes, we explore whether a range of adjustment methods are robust to misspecification, either in the covariate-outcome relationship or through an omitted covariate-treatment interaction. Specifically, we aim to identify potential settings where G-computation, inverse probability of treatment weighting (IPTW), augmented inverse probability of treatment weighting (AIPTW) and targeted maximum likelihood estimation (TMLE) offer improvement over the commonly used analysis of covariance (ANCOVA). Our simulations show that all adjustment methods are generally robust to model misspecification if adjusting for a few covariates, sample size is 100 or larger, and there are no covariate-treatment interactions. When there is a non-linear interaction of treatment with a skewed covariate and sample size is small, all adjustment methods can suffer from bias; however, methods that allow for interactions (such as G-computation with interaction and IPTW) show improved results compared to ANCOVA. When there is a high number of covariates to adjust for, ANCOVA retains good properties while other methods suffer from under- or over-coverage. An outstanding issue for G-computation, IPTW and AIPTW in small samples is that standard errors are underestimated; they should be used with caution without the availability of small-sample corrections, development of which is needed. These findings are relevant for covariate adjustment in interim analyses of larger trials.
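To make one of the compared methods concrete, here is a minimal sketch of an AIPTW estimate of the average treatment effect in a two-arm randomized trial. It is an illustration only, not the authors' simulation code: the function names `fit_line` and `aiptw_ate`, the single continuous covariate, the arm-specific linear working models, and the known randomization probability `p_treat` are all assumptions made for this example.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single covariate."""
    xbar, ybar = fmean(xs), fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return ybar - slope * xbar, slope


def aiptw_ate(y, a, x, p_treat=0.5):
    """AIPTW estimate of E[Y(1)] - E[Y(0)].

    y: outcomes, a: 0/1 treatment indicators, x: one baseline covariate.
    p_treat: randomization probability, assumed known (hypothetical 1:1 trial).
    """
    preds = {}
    for arm in (0, 1):
        # Fit the working outcome model within one arm only ...
        xs = [xi for xi, ai in zip(x, a) if ai == arm]
        ys = [yi for yi, ai in zip(y, a) if ai == arm]
        b0, b1 = fit_line(xs, ys)
        # ... then predict for every participant, regardless of arm.
        preds[arm] = [b0 + b1 * xi for xi in x]

    # Outcome-model prediction plus an inverse-probability-weighted residual.
    mu1 = fmean(m + ai * (yi - m) / p_treat
                for m, ai, yi in zip(preds[1], a, y))
    mu0 = fmean(m + (1 - ai) * (yi - m) / (1 - p_treat)
                for m, ai, yi in zip(preds[0], a, y))
    return mu1 - mu0
```

Dropping the residual terms leaves a G-computation estimate, and dropping the model predictions leaves an IPTW estimate, which is one way to see how the three approaches in the abstract relate. The sketch returns a point estimate only; it does not address the small-sample standard error problem the abstract raises.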
Affiliation(s)
- Mia S. Tackney
  - Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
- Tim Morris
  - MRC Clinical Trials Unit at UCL, London, UK
- Ian White
  - MRC Clinical Trials Unit at UCL, London, UK
- Clemence Leyrat
  - Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
- Karla Diaz-Ordaz
  - Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
  - Department of Statistical Science, UCL, London, United Kingdom
- Elizabeth Williamson
  - Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
6
Vansteelandt S, Dukes O, Van Lancker K, Martinussen T. Assumption-lean Cox regression. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2126362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Stijn Vansteelandt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University
  - Department of Medical Statistics, London School of Hygiene and Tropical Medicine
- Oliver Dukes
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University
- Kelly Van Lancker
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University
7
Lee Y, Kennedy EH, Mitra N. Doubly robust nonparametric instrumental variable estimators for survival outcomes. Biostatistics 2021; 24:518-537. [PMID: 34676400 DOI: 10.1093/biostatistics/kxab036] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/21/2021] [Revised: 09/15/2021] [Accepted: 09/20/2021] [Indexed: 11/12/2022] Open
Abstract
Instrumental variable (IV) methods allow us the opportunity to address unmeasured confounding in causal inference. However, most IV methods are only applicable to discrete or continuous outcomes with very few IV methods for censored survival outcomes. In this article, we propose nonparametric estimators for the local average treatment effect on survival probabilities under both covariate-dependent and outcome-dependent censoring. We provide an efficient influence function-based estimator and a simple estimation procedure when the IV is either binary or continuous. The proposed estimators possess double-robustness properties and can easily incorporate nonparametric estimation using machine learning tools. In simulation studies, we demonstrate the flexibility and double robustness of our proposed estimators under various plausible scenarios. We apply our method to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial for estimating the causal effect of screening on survival probabilities and investigate the causal contrasts between the two interventions under different censoring assumptions.
Affiliation(s)
- Youjin Lee
  - Department of Biostatistics, Brown University, 121 S Main St, Providence, RI 02912, USA
- Edward H Kennedy
  - Department of Statistics and Data Science, Carnegie Mellon University, 132 J Baker Hall, Pittsburgh, PA 15213, USA
- Nandita Mitra
  - Department of Biostatistics and Epidemiology, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA 19104, USA
8
Díaz I, Williams N, Hoffman KL, Schenck EJ. Nonparametric Causal Effects Based on Longitudinal Modified Treatment Policies. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1955691] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]
Affiliation(s)
- Iván Díaz
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Nicholas Williams
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Katherine L. Hoffman
  - Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York
- Edward J. Schenck
  - Division of Pulmonary & Critical Care Medicine, Department of Medicine, Weill Cornell Medicine, New York
9
Martínez-Camblor P, MacKenzie TA, Staiger DO, Goodney PP, O'Malley AJ. Summarizing causal differences in survival curves in the presence of unmeasured confounding. Int J Biostat 2020; 17:223-240. [PMID: 32946418 DOI: 10.1515/ijb-2019-0146] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/03/2019] [Accepted: 08/10/2020] [Indexed: 11/15/2022]
Abstract
Proportional hazard Cox regression models are frequently used to analyze the impact of different factors on time-to-event outcomes. Most practitioners are familiar with and interpret research results in terms of hazard ratios. Direct differences in survival curves are, however, easier to understand for the general population of users and to visualize graphically. Analyzing the difference among the survival curves for the population at risk allows easy interpretation of the impact of a therapy over the follow-up. When the available information is obtained from observational studies, the observed results are potentially subject to a plethora of measured and unmeasured confounders. Although there are procedures to adjust survival curves for measured covariates, the case of unmeasured confounders has not yet been considered in the literature. In this article we provide a semi-parametric procedure for adjusting survival curves for measured and unmeasured confounders. The method augments our novel instrumental variable estimation method for survival time data in the presence of unmeasured confounding with a procedure for mapping estimates onto the survival probability and the expected survival time scales.
Affiliation(s)
- Pablo Martínez-Camblor
  - Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire, USA
- Todd A MacKenzie
  - Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire, USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine, Lebanon, New Hampshire, USA
- Douglas O Staiger
  - The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine, Lebanon, New Hampshire, USA
  - Department of Economics, Dartmouth College, Hanover, New Hampshire, USA
- Phillip P Goodney
  - The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine, Lebanon, New Hampshire, USA
  - Section of Vascular Surgery, Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire, USA
- A James O'Malley
  - Department of Biomedical Data Sciences, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire, USA
  - The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine, Lebanon, New Hampshire, USA
10
Díaz I, Savenkov O, Kamel H. Nonparametric targeted Bayesian estimation of class proportions in unlabeled data. Biostatistics 2020; 23:274-293. [PMID: 32529244 DOI: 10.1093/biostatistics/kxaa022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/04/2019] [Revised: 04/21/2020] [Accepted: 04/23/2020] [Indexed: 12/20/2022] Open
Abstract
We introduce a novel Bayesian estimator for the class proportion in an unlabeled dataset, based on the targeted learning framework. The procedure requires the specification of a prior (and outputs a posterior) only for the target of inference, and yields a tightly concentrated posterior. When the scientific question can be characterized by a low-dimensional parameter functional, this focus on target prior and posterior distributions perfectly aligns with Bayesian subjectivism. We prove a Bernstein-von Mises-type result for our proposed Bayesian procedure, which guarantees that the posterior distribution converges to the distribution of an efficient, asymptotically linear estimator. In particular, the posterior is Gaussian, doubly robust, and efficient in the limit, under the only assumption that certain nuisance parameters are estimated at slower-than-parametric rates. We perform numerical studies illustrating the frequentist properties of the method. We also illustrate their use in a motivating application to estimate the proportion of embolic strokes of undetermined source arising from occult cardiac sources or large-artery atherosclerotic lesions. Though we focus on the motivating example of the proportion of cases in an unlabeled dataset, the procedure is general and can be adapted to estimate any pathwise differentiable parameter in a non-parametric model.
Affiliation(s)
- Iván Díaz
  - Division of Biostatistics, Weill Cornell Medicine, New York, NY 10065, USA
- Hooman Kamel
  - Department of Neurology, Weill Cornell Medicine, New York, NY 10065, USA