1. Robbins MW, Bauhoff S, Burgette L. Data fusion for predicting long-term program impacts. Stat Med 2024. PMID: 38890124. DOI: 10.1002/sim.10147.
Abstract
Policymakers often require information on programs' long-term impacts that is not available when decisions are made. For example, while rigorous evidence from the Oregon Health Insurance Experiment (OHIE) shows that having health insurance influences short-term health and financial measures, the impact on long-term outcomes, such as mortality, will not be known for many years following the program's implementation. We demonstrate how data fusion methods may be used to address the problem of missing final outcomes and predict long-run impacts of interventions before the requisite data are available. We implement this method by concatenating data on an intervention (such as the OHIE) with auxiliary long-term data and then imputing missing long-term outcomes using short-term surrogate outcomes, while approximating uncertainty with replication methods. We use simulations to examine the performance of the methodology and apply the method in a case study. Specifically, we fuse data on the OHIE with data from the National Longitudinal Mortality Study and estimate that being eligible to apply for subsidized health insurance will lead to a statistically significant improvement in long-term mortality.
Affiliation(s)
- Sebastian Bauhoff: School of Public Health, Harvard University, Cambridge, Massachusetts
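The fuse-then-impute workflow described in the abstract above can be illustrated with a toy simulation. All data, coefficients, and variable names here are invented for illustration; this is not the authors' code or the OHIE analysis, and the replication-based uncertainty step is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Auxiliary cohort: both the short-term surrogate S and the
# long-term outcome Y are observed (hypothetical simulated data).
n_aux = 2000
s_aux = rng.normal(0.0, 1.0, n_aux)
y_aux = 0.7 * s_aux + rng.normal(0.0, 0.5, n_aux)

# Trial data: S observed in both arms, long-term Y not yet observed.
n_trial = 1000
arm = rng.integers(0, 2, n_trial)               # 0 = control, 1 = treated
s_trial = 0.4 * arm + rng.normal(0.0, 1.0, n_trial)

# Step 1: model Y given the surrogate S in the auxiliary data.
beta1, beta0 = np.polyfit(s_aux, y_aux, deg=1)  # slope, intercept

# Step 2: impute the missing long-term outcomes in the trial.
y_imputed = beta0 + beta1 * s_trial

# Step 3: contrast imputed long-term outcomes across arms.
effect = y_imputed[arm == 1].mean() - y_imputed[arm == 0].mean()
print(round(effect, 2))
```

Because the surrogate effect is 0.4 and the surrogate-outcome slope is 0.7, the long-run contrast lands near 0.28 in this toy setup.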
2. Cole SR, Edwards JK, Breskin A, Rosin S, Zivich PN, Shook-Sa BE, Hudgens MG. Cole et al. Respond to "Combining Information From Diverse Sources". Am J Epidemiol 2024; 193:751-752. PMID: 37067469. DOI: 10.1093/aje/kwad084.
3. Dahabreh IJ. Invited Commentary: Combining Information to Answer Epidemiologic Questions About a Target Population. Am J Epidemiol 2024; 193:741-750. PMID: 38456780. DOI: 10.1093/aje/kwad014.
Abstract
Epidemiologists are attempting to address research questions of increasing complexity by developing novel methods for combining information from diverse sources. Cole et al. (Am J Epidemiol. 2023;192(3):467-474) provide 2 examples of the process of combining information to draw inferences about a population proportion. In this commentary, we consider combining information to learn about a target population as an epidemiologic activity and distinguish it from more conventional meta-analyses. We examine possible rationales for combining information and discuss broad methodological considerations, with an emphasis on study design, assumptions, and sources of uncertainty.
4. Ross RK, Cole SR, Edwards JK, Zivich PN, Westreich D, Daniels JL, Price JT, Stringer JSA. Leveraging External Validation Data: The Challenges of Transporting Measurement Error Parameters. Epidemiology 2024; 35:196-207. PMID: 38079241. PMCID: PMC10841744. DOI: 10.1097/ede.0000000000001701.
Abstract
Approaches to address measurement error frequently rely on validation data to estimate measurement error parameters (e.g., sensitivity and specificity). Acquisition of validation data can be costly, thus secondary use of existing data for validation is attractive. To use these external validation data, however, we may need to address systematic differences between these data and the main study sample. Here, we derive estimators of the risk and the risk difference that leverage external validation data to account for outcome misclassification. If misclassification is differential with respect to covariates that themselves are differentially distributed in the validation and study samples, the misclassification parameters are not immediately transportable. We introduce two ways to account for such covariates: (1) standardize by these covariates or (2) iteratively model the outcome. If conditioning on a covariate for transporting the misclassification parameters induces bias of the causal effect (e.g., M-bias), the former but not the latter approach is biased. We provide proof of identification, describe estimation using parametric models, and assess performance in simulations. We also illustrate implementation to estimate the risk of preterm birth and the effect of maternal HIV infection on preterm birth. Measurement error should not be ignored and it can be addressed using external validation data via transportability methods.
Affiliation(s)
- Rachael K Ross: Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
- Stephen R Cole: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
- Jessie K Edwards: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
- Paul N Zivich: Institute of Global Health and Infectious Diseases, School of Medicine, University of North Carolina at Chapel Hill, NC
- Daniel Westreich: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
- Julie L Daniels: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
- Joan T Price: Department of Obstetrics and Gynecology, School of Medicine, University of North Carolina, Chapel Hill, NC
- Jeffrey S A Stringer: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC; Department of Obstetrics and Gynecology, School of Medicine, University of North Carolina, Chapel Hill, NC
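The misclassification correction these estimators build on can be sketched with the classic Rogan-Gladen formula. This is a deliberate simplification of the paper's approach: it assumes the sensitivity and specificity transport directly to the study sample, with no covariates to standardize over:

```python
def rogan_gladen(p_obs, sensitivity, specificity):
    """Correct an observed proportion for outcome misclassification.

    Classic Rogan-Gladen estimator; valid only when the sensitivity
    and specificity apply to (are transportable to) the study sample.
    """
    return (p_obs + specificity - 1.0) / (sensitivity + specificity - 1.0)

# With 30% observed risk, sensitivity 0.90, and specificity 0.95
# (assumed illustrative values):
corrected = rogan_gladen(0.30, 0.90, 0.95)
print(round(corrected, 4))  # prints 0.2941
```

When misclassification is differential with respect to covariates distributed differently across samples, this simple plug-in fails, which is exactly the problem the standardization and iterated-outcome-model approaches in the paper address.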
5. Shook-Sa BE, Zivich PN, Rosin SP, Edwards JK, Adimora AA, Hudgens MG, Cole SR. Fusing trial data for treatment comparisons: Single vs multi-span bridging. Stat Med 2024; 43:793-815. PMID: 38110289. PMCID: PMC10843571. DOI: 10.1002/sim.9989.
Abstract
While randomized controlled trials (RCTs) are critical for establishing the efficacy of new therapies, there are limitations regarding what comparisons can be made directly from trial data. RCTs are limited to a small number of comparator arms and often compare a new therapeutic to a standard of care which has already proven efficacious. It is sometimes of interest to estimate the efficacy of the new therapy relative to a treatment that was not evaluated in the same trial, such as a placebo or an alternative therapy that was evaluated in a different trial. Such dual-study comparisons are challenging because of potential differences between trial populations that can affect the outcome. In this article, two bridging estimators are considered that allow for comparisons of treatments evaluated in different trials, accounting for measured differences in trial populations. A "multi-span" estimator leverages a shared arm between two trials, while a "single-span" estimator does not require a shared arm. A diagnostic statistic that compares the outcome in the standardized shared arms is provided. The two estimators are compared in simulations, where both estimators demonstrate minimal empirical bias and nominal confidence interval coverage when the identification assumptions are met. The estimators are applied to data from the AIDS Clinical Trials Group 320 and 388 to compare the efficacy of two-drug vs four-drug antiretroviral therapy on CD4 cell counts among persons with advanced HIV. The single-span approach requires weaker identification assumptions and was more efficient in simulations and the application.
Affiliation(s)
- Bonnie E. Shook-Sa: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Paul N. Zivich: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Institute of Global Health and Infectious Diseases, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Samuel P. Rosin: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jessie K. Edwards: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Adaora A. Adimora: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Michael G. Hudgens: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Stephen R. Cole: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
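A minimal sketch of the single-span idea: model the outcome in one trial's arm and standardize it to the other trial's covariate distribution (g-computation) before contrasting. Everything below is hypothetical simulated data, not the ACTG trials, and it ignores the variance estimation and diagnostics the paper develops:

```python
import numpy as np

rng = np.random.default_rng(1)

# Trial 1 evaluates treatment A; trial 2 evaluates treatment C.
# The trials differ in a covariate X that affects the outcome Y.
n = 2000
x1 = rng.normal(0.0, 1.0, n)                       # trial 1 population
y1_a = 1.0 + 0.5 * x1 + rng.normal(0.0, 0.3, n)    # outcomes under A

x2 = rng.normal(0.8, 1.0, n)                       # trial 2 population (shifted X)
y2_c = 0.6 + 0.5 * x2 + rng.normal(0.0, 0.3, n)    # outcomes under C

# Single-span bridge: model Y under C given X in trial 2, then
# standardize to trial 1's covariate distribution.
slope, intercept = np.polyfit(x2, y2_c, deg=1)
y_c_in_trial1 = intercept + slope * x1

# Bridged contrast of A vs C in the trial 1 population.
bridged_effect = y1_a.mean() - y_c_in_trial1.mean()
print(round(bridged_effect, 2))
```

In this setup the true standardized contrast is 0.4; a naive comparison of raw trial means would be pulled toward zero by the covariate shift, which is the bias bridging is meant to remove.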
6. Ross RK, Zivich PN, Stringer JSA, Cole SR. M-estimation for common epidemiological measures: introduction and applied examples. Int J Epidemiol 2024; 53:dyae030. PMID: 38423105. PMCID: PMC10904145. DOI: 10.1093/ije/dyae030.
Abstract
M-estimation is a statistical procedure that is particularly advantageous for some common epidemiological analyses, including approaches to estimate an adjusted marginal risk contrast (i.e. inverse probability weighting and g-computation) and data fusion. In such settings, maximum likelihood variance estimates are not consistent. Thus, epidemiologists often resort to the bootstrap to estimate the variance. In contrast, M-estimation allows for consistent variance estimates in these settings without requiring the computational complexity of the bootstrap. In this paper, we introduce M-estimation and provide four illustrative examples of implementation along with software code in multiple languages. M-estimation is a flexible and computationally efficient estimation procedure that is a powerful addition to the epidemiologist's toolbox.
Affiliation(s)
- Rachael K Ross: Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA
- Paul N Zivich: Institute for Global Health and Infectious Diseases, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jeffrey S A Stringer: Department of Obstetrics and Gynecology, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Stephen R Cole: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
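As a toy illustration of the sandwich variance that M-estimation delivers, consider estimating a mean from the estimating equation psi(y, theta) = y - theta. The data are simulated, and this is not one of the paper's four worked examples:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(2.0, 1.0, 500)   # hypothetical data, true mean 2.0

# The estimating equation psi(y, theta) = y - theta is solved by
# the sample mean: sum(y - theta_hat) = 0.
theta_hat = y.mean()

# Empirical sandwich variance B^{-1} M B^{-1} / n, where
# B = -E[d psi / d theta] = 1 and M = E[psi^2].
psi = y - theta_hat
bread = 1.0
meat = np.mean(psi ** 2)
var_sandwich = meat / (bread ** 2 * len(y))
se = np.sqrt(var_sandwich)

print(round(theta_hat, 2), round(se, 3))
```

For stacked estimating equations (e.g., weight model plus outcome model), the same bread/meat construction yields variances that correctly propagate the uncertainty from every nuisance model, which is what the bootstrap is usually invoked to do.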
7. Cole SR, Shook-Sa BE, Zivich PN, Edwards JK, Richardson DB, Hudgens MG. Higher-order evidence. Eur J Epidemiol 2024; 39:1-11. PMID: 38195955. PMCID: PMC11129850. DOI: 10.1007/s10654-023-01062-9.
Abstract
Higher-order evidence is evidence about evidence. Epidemiologic examples of higher-order evidence include settings where the study data constitute the first-order evidence and estimates of misclassification (e.g., sensitivity, specificity) of a binary exposure or outcome collected in the main study comprise the second-order evidence. While sampling variability in higher-order evidence is typically acknowledged, higher-order evidence is often assumed to be free of measurement error (e.g., gold standard measures). Here we provide two examples, each with multiple scenarios where second-order evidence is imperfectly measured, and this measurement error can either amplify or attenuate standard corrections to first-order evidence. We propose a way to account for such imperfections that requires third-order evidence. Further illustrations and exploration of how higher-order evidence impacts the results of epidemiologic studies are warranted.
Affiliation(s)
- Stephen R Cole: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Bonnie E Shook-Sa: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Paul N Zivich: Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jessie K Edwards: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- David B Richardson: Department of Epidemiology, University of California Irvine, Irvine, CA, USA
- Michael G Hudgens: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
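The effect the abstract describes can be seen numerically with a standard Rogan-Gladen correction: an error in the second-order evidence (here, sensitivity) shifts the corrected first-order estimate. The numbers are assumed for illustration and are not from the paper:

```python
def rogan_gladen(p_obs, se, sp):
    # Standard misclassification correction of an observed proportion.
    return (p_obs + sp - 1.0) / (se + sp - 1.0)

p_obs = 0.30
truth = rogan_gladen(p_obs, se=0.90, sp=0.95)        # true sensitivity
mismeasured = rogan_gladen(p_obs, se=0.80, sp=0.95)  # sensitivity measured with error

print(round(truth, 3), round(mismeasured, 3))  # prints 0.294 0.333
```

A 0.10 error in the second-order sensitivity moves the corrected proportion by about 4 percentage points here, illustrating why the quality of second-order evidence (and, in turn, third-order evidence about it) matters.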
8. Johnson CY, Fujishiro K. Identifying occupational health inequities in the absence of suitable data: are there inequities in access to adequate bathrooms in US workplaces? Occup Environ Med 2023; 80:572-579. PMID: 37669856. DOI: 10.1136/oemed-2023-108900.
Abstract
OBJECTIVES: Our research questions are often chosen based on the existence of suitable data for analysis or prior research in the area. For new interdisciplinary research areas, such as occupational health equity, suitable data might not yet exist. In this manuscript, we describe how we approached a research question in the absence of suitable data using the example of identifying inequities in adequate bathrooms in US workplaces.
METHODS: We created a conceptual model that hypothesises causal mechanisms for occupational health inequities, and from this model we identified a series of questions that could be answered using separate data sets to better understand inequities in adequate workplace bathrooms. Breaking up the analysis into multiple steps allowed us to use multiple data sources and analysis methods, which helped compensate for limitations in each data set.
RESULTS: Using the conceptual model as a guide, we were able to identify some jobs that likely have inadequate bathrooms as well as subpopulations potentially at higher risk for inadequate bathrooms. We also identified specific data gaps by reflecting on the challenges we faced in our multistep analysis. These gaps, which indicated future data collection needs, included difficulty finding data sources for some predictors of inadequate bathrooms that prevented us from fully investigating potential inequities.
CONCLUSIONS: We share our conceptual model and our example analysis to motivate researchers to avoid letting availability of data limit the research questions they pursue.
Affiliation(s)
- Candice Y Johnson: Family Medicine and Community Health, Duke University, Durham, North Carolina, USA
- Kaori Fujishiro: Division of Field Studies and Engineering, National Institute for Occupational Safety and Health, Centers for Disease Control and Prevention, Cincinnati, Ohio, USA