1. Wan W, Murugesan M, Nocon RS, Bolton J, Konetzka RT, Chin MH, Huang ES. Comparison of two propensity score-based methods for balancing covariates: the overlap weighting and fine stratification methods in real-world claims data. BMC Med Res Methodol 2024; 24:122. PMID: 38831393; PMCID: PMC11145799; DOI: 10.1186/s12874-024-02228-z.
Abstract
BACKGROUND Two propensity score (PS) based covariate-balancing methods, the overlap weighting method (OW) and the fine stratification method (FS), produce superb covariate balance. OW has been compared with various weighting methods, while FS has been compared with the traditional stratification method and various matching methods. However, no study has yet compared OW and FS. In addition, OW has not yet been evaluated in large claims data with low-prevalence exposure and low-frequency outcomes, a context in which optimal use of balancing methods is critical. In this study, we aimed to compare OW and FS using real-world data and simulations with low-prevalence exposure and low-frequency outcomes. METHODS We used the Texas State Medicaid claims data on adult beneficiaries with diabetes in 2012 as an empirical example (N = 42,628). Based on its real-world research question, we estimated the average treatment effect of health center vs. non-health center attendance in the total population. We also performed simulations to evaluate the two methods' relative performance. To preserve associations between covariates, we used the plasmode approach to simulate outcomes and/or exposures with N = 4,000. We simulated both homogeneous and heterogeneous treatment effects with various outcome risks (1-30%, or observed: 27.75%) and/or exposure prevalences (2.5-30%, or observed: 10.55%). We used a weighted generalized linear model to estimate the exposure effect and the cluster-robust standard error (SE) method to estimate its SE. RESULTS In the empirical example, we found that OW had smaller standardized mean differences in all covariates (range: OW: 0.0-0.02 vs. FS: 0.22-3.26) and a smaller Mahalanobis balance distance (MB) (< 0.001 vs. > 0.049) than FS. In simulations, OW also achieved smaller MB (homogeneity: < 0.04 vs. > 0.04; heterogeneity: 0.0-0.11 vs. 0.07-0.29), smaller relative bias (homogeneity: 4.04-56.20 vs. 20-61.63; heterogeneity: 7.85-57.6 vs. 15.0-60.4), smaller square root of mean squared error (homogeneity: 0.332-1.308 vs. 0.385-1.365; heterogeneity: 0.263-0.526 vs. 0.313-0.620), and coverage probability closer to the nominal level (homogeneity: 0.0-80.4% vs. 0.0-69.8%; heterogeneity: 0.0-97.6% vs. 0.0-92.8%) than FS in most cases. CONCLUSIONS These findings suggest that OW can yield nearly perfect covariate balance and thereby enhance the accuracy of average-treatment-effect estimation in the total population.
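The weighting scheme behind this abstract is simple to sketch. Below is a minimal Python illustration (not the authors' code; it uses simulated data and an assumed logistic PS model) of how overlap weighting assigns each exposed patient a weight of 1 − PS and each unexposed patient a weight of PS, and of the near-exact covariate balance this produces:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 2))                      # two confounders
logit = -2.5 + 0.8 * x[:, 0] + 0.5 * x[:, 1]     # low-prevalence exposure
z = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# propensity score from a (nearly unpenalized) logistic model
ps = LogisticRegression(C=1e6).fit(x, z).predict_proba(x)[:, 1]

# overlap weights: 1 - PS for the exposed, PS for the unexposed
w = np.where(z == 1, 1 - ps, ps)

def smd(v, wt):
    """Weighted standardized mean difference for one covariate."""
    m1 = np.average(v[z == 1], weights=wt[z == 1])
    m0 = np.average(v[z == 0], weights=wt[z == 0])
    s = np.sqrt((v[z == 1].var() + v[z == 0].var()) / 2)
    return (m1 - m0) / s

raw = smd(x[:, 0], np.ones(n))   # imbalanced before weighting
bal = smd(x[:, 0], w)            # near zero after overlap weighting
```

A known property of overlap weights motivates the near-zero SMDs reported above: when the PS is fit by maximum-likelihood logistic regression, the weighted means of the modeled covariates are balanced exactly.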
Affiliation(s)
- Wen Wan
- Section of General Internal Medicine, Department of Medicine, The University of Chicago, 5841 S. Maryland Ave, MC 2007, Chicago, IL 60637, USA.
- Manoradhan Murugesan
- Department of Public Health Sciences, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Robert S Nocon
- Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA, USA
- Joshua Bolton
- Department of Information Systems, University of Maryland, Baltimore, MD, USA
- R Tamara Konetzka
- Department of Public Health Sciences, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Marshall H Chin
- Section of General Internal Medicine, Department of Medicine, The University of Chicago, 5841 S. Maryland Ave, MC 2007, Chicago, IL 60637, USA
- Elbert S Huang
- Section of General Internal Medicine, Department of Medicine, The University of Chicago, 5841 S. Maryland Ave, MC 2007, Chicago, IL 60637, USA
2. Weberpals J, Raman SR, Shaw PA, Lee H, Russo M, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. A Principled Approach to Characterize and Analyze Partially Observed Confounder Data from Electronic Health Records. Clin Epidemiol 2024; 16:329-343. PMID: 38798915; PMCID: PMC11127690; DOI: 10.2147/clep.s436131.
Abstract
Objective Partially observed confounder data pose challenges to the statistical analysis of electronic health records (EHR), and systematic assessments of the potentially underlying missingness mechanisms are lacking. We aimed to provide a principled approach to empirically characterize missing-data processes and to investigate the performance of analytic methods. Methods Three empirical sub-cohorts of diabetic SGLT2- or DPP4-inhibitor initiators with complete information on HbA1c, BMI and smoking as confounders of interest (COI) formed the basis of data simulation under a plasmode framework. We simulated a true null treatment effect, with the COI included in the outcome-generation model, and four missingness mechanisms for the COI: completely at random (MCAR), at random (MAR), and two not-at-random (MNAR) mechanisms, in which missingness depended on an unmeasured confounder and on the value of the COI itself, respectively. We evaluated the ability of three groups of diagnostics to differentiate between mechanisms: 1) differences in characteristics between patients with and without the observed COI (using averaged standardized mean differences [ASMD]), 2) predictive ability of the missingness indicator based on observed covariates, and 3) association of the missingness indicator with the outcome. We then compared analytic methods, including complete-case analysis, inverse probability weighting, and single and multiple imputation, in their ability to recover true treatment effects. Results The diagnostics successfully identified characteristic patterns of the simulated missingness mechanisms. For MAR, but not MCAR, the patient characteristics showed substantial differences (median ASMD 0.20 vs 0.05) and, consequently, discrimination of the prediction models for missingness was also higher (0.59 vs 0.50). For MNAR, but not MAR or MCAR, missingness was significantly associated with the outcome even in models adjusting for other observed covariates. Comparing analytic methods, multiple imputation using a random forest algorithm resulted in the lowest root-mean-squared error. Conclusion Principled diagnostics provided reliable insights into missingness mechanisms. When assumptions allow, multiple imputation with nonparametric models could help reduce bias.
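The second diagnostic above (predictive ability of a missingness-indicator model) is easy to illustrate. The following Python sketch (illustrative only; simulated data and an assumed logistic model for the missingness indicator) shows why the model's discrimination stays near 0.5 under MCAR but rises under MAR:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=(n, 3))                    # fully observed covariates

# MCAR: missingness ignores the data; MAR: driven by an observed covariate
m_mcar = rng.binomial(1, 0.3, size=n)
m_mar = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 1.5 * x[:, 0]))))

def missingness_auc(m):
    """Discrimination of a model predicting the missingness indicator."""
    p = LogisticRegression().fit(x, m).predict_proba(x)[:, 1]
    return roc_auc_score(m, p)

auc_mcar = missingness_auc(m_mcar)   # close to 0.5
auc_mar = missingness_auc(m_mar)     # clearly above 0.5
```

Under MNAR the same model would also hover near 0.5, which is why the paper pairs this diagnostic with the missingness-outcome association check.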
Affiliation(s)
- Janick Weberpals
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Sudha R Raman
- Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA
- Pamela A Shaw
- Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Hana Lee
- Office of Biostatistics, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Massimiliano Russo
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Bradley G Hammill
- Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA
- Sengwee Toh
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
- John G Connolly
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
- Kimberly J Dandreo
- Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, MA, USA
- Fang Tian
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Wei Liu
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Jie Li
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- José J Hernández-Muñoz
- Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD, USA
- Robert J Glynn
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Rishi J Desai
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
3. Martin GL, Petri C, Rozenberg J, Simon N, Hajage D, Kirchgesner J, Tubach F, Létinier L, Dechartres A. A methodological review of the high-dimensional propensity score in comparative-effectiveness and safety-of-interventions research finds incomplete reporting relative to algorithm development and robustness. J Clin Epidemiol 2024; 169:111305. PMID: 38417583; DOI: 10.1016/j.jclinepi.2024.111305.
Abstract
OBJECTIVES The use of secondary databases has become popular for evaluating the effectiveness and safety of interventions in real-life settings. However, the absence of important confounders in these databases is challenging. To address this issue, the high-dimensional propensity score (hdPS) algorithm was developed in 2009. This algorithm uses proxy variables for mitigating confounding by combining information available across several healthcare dimensions. This study assessed the methodology and reporting of the hdPS in comparative effectiveness and safety research. STUDY DESIGN AND SETTING In this methodological review, we searched PubMed and Google Scholar from July 2009 to May 2022 for studies that used the hdPS for evaluating the effectiveness or safety of healthcare interventions. Two reviewers independently extracted study characteristics and assessed how the hdPS was applied and reported. Risk of bias was evaluated with the Risk Of Bias In Non-randomised Studies - of Interventions (ROBINS-I) tool. RESULTS In total, 136 studies met the inclusion criteria; the median publication year was 2018 (Q1-Q3 2016-2020). The studies included 192 datasets, mostly North American databases (n = 132, 69%). The hdPS was used in primary analysis in 120 studies (88%). Dimensions were defined in 101 studies (74%), with a median of 5 (Q1-Q3 4-6) dimensions included. A median of 500 (Q1-Q3 200-500) empirically identified covariates were selected. Regarding hdPS reporting, only 11 studies (8%) reported all recommended items. Most studies (n = 81, 60%) had a moderate overall risk of bias. CONCLUSION There is room for improvement in the reporting of hdPS studies, especially regarding the transparency of methodological choices that underpin the construction of the hdPS.
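As a concrete illustration of the hdPS idea of empirically generated proxy covariates, the sketch below (a simplified Python rendering, not the published hdPS macros; the claims counts are simulated) expands each candidate code in one data dimension into three binary frequency indicators in the spirit of the original 2009 algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
n_pat, n_codes = 1000, 50
# toy claims dimension: per-patient occurrence counts of each candidate code
counts = rng.poisson(0.3, size=(n_pat, n_codes))

def expand(c):
    """hdPS-style indicators for one code: ever, >= median, >= 75th pct."""
    nz = c[c > 0]
    if nz.size == 0:                       # code never observed
        return np.zeros((c.size, 3), dtype=int)
    med, q3 = np.median(nz), np.quantile(nz, 0.75)
    return np.stack([c >= 1, c >= med, c >= q3], axis=1).astype(int)

covariates = np.hstack([expand(counts[:, j]) for j in range(n_codes)])
# three binary empirical covariates per candidate code
```

In the full algorithm these empirical covariates are then prioritized (e.g., by an estimated confounding impact) and the top few hundred, as the review notes a median of 500, enter the PS model.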
Affiliation(s)
- Guillaume Louis Martin
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, Paris, France; Synapse Medicine, Bordeaux, France.
- Camille Petri
- UKRI Centre for Doctoral Training in AI for Healthcare, Imperial College London, London, UK; National Heart and Lung Institute, Imperial College London, London, UK
- Noémie Simon
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, Paris, France
- David Hajage
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, Paris, France
- Julien Kirchgesner
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Saint-Antoine, Département de Gastroentérologie et Nutrition, Paris, France
- Florence Tubach
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, Paris, France
- Agnès Dechartres
- Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, Paris, France
4. Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024; 25:326-339. PMID: 38216661; DOI: 10.1038/s41576-023-00679-6.
Abstract
Technological advances enabling massively parallel measurement of biological features - such as microarrays, high-throughput sequencing and mass spectrometry - have ushered in the omics era, now in its third decade. The resulting complex landscape of analytical methods has naturally fostered the growth of an omics benchmarking industry. Benchmarking refers to the process of objectively comparing and evaluating the performance of different computational or analytical techniques when processing and analysing large-scale biological data sets, such as transcriptomics, proteomics and metabolomics. With thousands of omics benchmarking studies published over the past 25 years, the field has matured to the point where the foundations of benchmarking have been established and well described. However, generating meaningful benchmarking data and properly evaluating performance in this complex domain remains challenging. In this Review, we highlight some common oversights and pitfalls in omics benchmarking. We also establish a methodology to bring the issues that can be addressed into focus and to be transparent about those that cannot: this takes the form of a spreadsheet template of guidelines for comprehensive reporting, intended to accompany publications. In addition, a survey of recent developments in benchmarking is provided as well as specific guidance for commonly encountered difficulties.
Affiliation(s)
- Thomas G Brooks
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Nicholas F Lahens
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Antonijo Mrčela
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- Gregory R Grant
- Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA.
5. Schreck N, Slynko A, Saadati M, Benner A. Statistical plasmode simulations - Potentials, challenges and recommendations. Stat Med 2024; 43:1804-1825. PMID: 38356231; DOI: 10.1002/sim.10012.
Abstract
Statistical data simulation is essential in the development of statistical models and methods, as well as in their performance evaluation. To capture complex data structures, particularly for high-dimensional data, a variety of simulation approaches have been introduced, including parametric simulations and the so-called plasmode simulations. While there are concerns about the realism of parametrically simulated data, it is widely claimed that plasmodes come very close to reality, with some aspects of the "truth" known. However, there are no explicit guidelines or established state of the art for performing plasmode data simulations. In the present paper, we first review the existing literature and introduce the concept of statistical plasmode simulation. We then discuss advantages and challenges of statistical plasmodes and provide a step-wise procedure for their generation, including key steps to their implementation and reporting. Finally, we illustrate the concept of statistical plasmodes, as well as the proposed plasmode generation procedure, by means of a public real RNA data set on breast carcinoma patients.
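The core of a statistical plasmode simulation can be sketched in a few lines. The Python fragment below is a schematic under assumed models, not the paper's procedure: it resamples real covariate rows to preserve their joint structure and then simulates outcomes from an outcome model estimated on the real data, so that the simulation "truth" is known:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# stand-in for the real data (in practice: the empirical covariate matrix)
real_x = rng.normal(size=(800, 4))
true_logit = 0.3 * real_x[:, 0] - 0.2 * real_x[:, 2]
real_y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

# step 1: estimate the outcome-generating model on the real data
fit = LogisticRegression().fit(real_x, real_y)

# step 2: resample covariate rows, preserving their joint distribution
idx = rng.integers(0, real_x.shape[0], size=2000)
sim_x = real_x[idx]

# step 3: simulate outcomes from the fitted model; its coefficients
# now play the role of the known simulation "truth"
sim_y = rng.binomial(1, fit.predict_proba(sim_x)[:, 1])
```

Method comparisons can then be run on many such (sim_x, sim_y) replicates and judged against the known coefficients.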
Affiliation(s)
- Nicholas Schreck
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Alla Slynko
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Maral Saadati
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Axel Benner
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
6. DiPrete BL, Girman CJ, Mavros P, Breskin A, Brookhart MA. Characterizing Imbalance in the Tails of the Propensity Score Distribution. Am J Epidemiol 2024; 193:389-403. PMID: 37830395; DOI: 10.1093/aje/kwad200.
Abstract
Understanding the characteristics of patients with propensity scores in the tails of the propensity score (PS) distribution is relevant for inverse-probability-of-treatment-weighted and PS-based estimation in observational studies. Here we outline a method for identifying the variables most responsible for extreme propensity scores. The approach is illustrated in 3 scenarios: 1) a plasmode simulation of adult patients in the National Ambulatory Medical Care Survey (2011-2015); 2) timing of dexamethasone initiation; and 3) timing of remdesivir initiation, the latter two in patients hospitalized for coronavirus disease 2019 from February 2020 through January 2021. PS models were fitted using relevant baseline covariates, and the tails of the PS distribution were defined using asymmetric 1st and 99th percentiles. After fitting of the PS model in each original data set, values of each key covariate were permuted and model-agnostic variable importance measures were examined. Visualization and variable importance techniques were helpful in identifying the variables most responsible for extreme propensity scores and may help identify individual characteristics that make patients inappropriate for inclusion in a study (e.g., off-label use). Subsetting or restricting the study sample based on variables identified using this approach may help investigators avoid the need for trimming or overlap weights in studies.
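The permutation step described above can be sketched as follows (an illustrative Python version with simulated data; the paper examined richer model-agnostic importance measures): after fitting the PS model and flagging observations in the asymmetric 1st/99th-percentile tails, each covariate is permuted in turn and the resulting disturbance of the predicted scores among tail patients is taken as its importance:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=(n, 3))
z = rng.binomial(1, 1 / (1 + np.exp(-(2.0 * x[:, 0] + 0.2 * x[:, 1]))))

model = LogisticRegression().fit(x, z)
ps = model.predict_proba(x)[:, 1]
lo, hi = np.quantile(ps, [0.01, 0.99])
tail = (ps < lo) | (ps > hi)                 # asymmetric 1st/99th tails

def importance(j):
    """Mean |change in PS| among tail patients after permuting covariate j."""
    xp = x.copy()
    xp[:, j] = rng.permutation(xp[:, j])
    return np.abs(model.predict_proba(xp)[:, 1] - ps)[tail].mean()

imp = [importance(j) for j in range(3)]
# the strongly prognostic covariate (column 0) drives the extreme scores
```

Covariates whose permutation most disturbs the tail scores are the natural candidates for the subsetting or restriction strategy the abstract recommends.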
7. Friedrich S, Friede T. On the role of benchmarking data sets and simulations in method comparison studies. Biom J 2024; 66:e2200212. PMID: 36810737; DOI: 10.1002/bimj.202200212.
Abstract
Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, they are often not neutral but favor a novel method. Apart from the choice of design and proper reporting of the findings, there are different approaches concerning the underlying data for such method comparison studies. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, that is, real-world data that serve as a gold standard in the community. Simulation studies, on the other hand, are much less common in this context. The aim of this paper is to investigate differences and similarities between these approaches, to discuss their advantages and disadvantages, and ultimately to develop new approaches to method evaluation that pick the best of both worlds. To this end, we borrow ideas from different contexts, such as mixed methods research and Clinical Scenario Evaluation.
Affiliation(s)
- Sarah Friedrich
- Institute of Mathematics, University of Augsburg, Augsburg, Germany
- Centre for Advanced Analytics and Predictive Sciences, University of Augsburg, Augsburg, Germany
- Tim Friede
- Department of Medical Statistics, University Medical Center Göttingen, Humboldtallee, Göttingen, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Göttingen, Göttingen, Germany
8. Ayilara OF, Platt RW, Dahl M, Coulombe J, Ginestet PG, Chateau D, Lix LM. Generating synthetic data from administrative health records for drug safety and effectiveness studies. Int J Popul Data Sci 2023; 8:2176. PMID: 38414538; PMCID: PMC10898503; DOI: 10.23889/ijpds.v8i1.2176.
Abstract
Introduction Administrative health records (AHRs) are used to conduct population-based post-market drug safety and comparative effectiveness studies to inform healthcare decision making. However, the cost of data extraction and the challenges associated with privacy and securing approvals can make it difficult for researchers to conduct methodological research in a timely manner using real data. Generating synthetic AHRs that reasonably represent the real-world data is beneficial for developing analytic methods and for training analysts to rapidly implement study protocols. We generated synthetic AHRs using two methods, compared these synthetic AHRs to real-world AHRs, and described the challenges associated with using synthetic AHRs for real-world studies. Methods The real-world AHRs comprised prescription drug records for individuals with healthcare insurance coverage in the Population Research Data Repository (PRDR) from Manitoba, Canada, for the 10-year period from 2008 to 2017. Synthetic data were generated using the Observational Medical Dataset Simulator II (OSIM2) and a modification of it (ModOSIM). Synthetic and real-world data were described using frequencies and percentages. Agreement of prescription drug use measures in PRDR, OSIM2 and ModOSIM was estimated with the concordance coefficient. Results The PRDR cohort included 169,586,633 drug records and 1,395 drug types for 1,604,734 individuals. Synthetic data for 1,000,000 individuals were generated using OSIM2 and ModOSIM. Sex and age group distributions were similar in the real-world and synthetic AHRs. However, both OSIM2 and ModOSIM differed significantly from PRDR in the number of drug records and the number of unique drugs per person. For the average number of days of drug use, concordance with the PRDR was 16% (95% confidence interval [CI]: 12%-19%) for OSIM2 and 88% (95% CI: 87%-90%) for ModOSIM. Conclusions ModOSIM data were more similar to PRDR than OSIM2 data on many measures. Synthetic AHRs consistent with those found in real-world settings can be generated using ModOSIM. Synthetic data will benefit rapid implementation of methodological studies and data analyst training.
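The concordance coefficient used above to quantify agreement can be illustrated with Lin's concordance correlation coefficient (our assumption; the abstract does not name the exact estimator). A minimal Python sketch with toy measure vectors:

```python
import numpy as np

def ccc(a, b):
    """Lin's concordance correlation coefficient between paired measures."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return 2 * cov / (a.var() + b.var() + (a.mean() - b.mean()) ** 2)

real = np.array([5.0, 7.0, 9.0, 11.0])    # e.g. mean days of drug use (real)
close = real + 0.1                        # synthetic data tracking the real
off = real[::-1]                          # synthetic data systematically off

high, low = ccc(real, close), ccc(real, off)
```

Unlike Pearson correlation, CCC penalizes location and scale shifts as well as scatter, which is why near-identical measures score close to 1 while systematically discordant ones do not.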
Affiliation(s)
- Olawale F Ayilara
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
- Robert W Platt
- Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada
- Matt Dahl
- Manitoba Centre for Health Policy, University of Manitoba, Winnipeg, Canada
- Janie Coulombe
- Department of Mathematics and Statistics, Université de Montréal, Montreal, Canada
- Pablo Gonzalez Ginestet
- Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
- Dan Chateau
- College of Health & Medicine, Australian National University, Canberra, Australia
- Lisa M Lix
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
9. Souli Y, Trudel X, Diop A, Brisson C, Talbot D. Longitudinal plasmode algorithms to evaluate statistical methods in realistic scenarios: an illustration applied to occupational epidemiology. BMC Med Res Methodol 2023; 23:242. PMID: 37853309; PMCID: PMC10585912; DOI: 10.1186/s12874-023-02062-9.
Abstract
INTRODUCTION Plasmode simulations are a type of simulation that uses real data to determine the synthetic data-generating equations. Such simulations thus allow evaluating statistical methods under realistic conditions. As far as we know, no plasmode algorithm has been proposed for simulating longitudinal data. In this paper, we propose a longitudinal plasmode framework to generate realistic data with both a time-varying exposure and time-varying covariates. This work was motivated by the objective of comparing different methods for estimating the causal effect of a cumulative exposure to psychosocial stressors at work over time. METHODS We developed two longitudinal plasmode algorithms: a parametric and a nonparametric algorithm. Data from the PROspective Québec (PROQ) Study on Work and Health were used as an input to generate data with the proposed plasmode algorithms. We evaluated the performance of multiple estimators of the parameters of marginal structural models (MSMs): inverse probability of treatment weighting, g-computation and targeted maximum likelihood estimation. These estimators were also compared to standard regression approaches with adjustment either for baseline covariates only or for both baseline and time-varying covariates. RESULTS Standard regression methods were susceptible to yielding biased estimates, with confidence intervals whose coverage probability was lower than the nominal level. The bias was much lower, and the coverage of confidence intervals much closer to the nominal level, when considering MSMs. Among MSM estimators, g-computation overall produced the best results with respect to bias, root mean squared error and coverage of confidence intervals. No method produced unbiased estimates with adequate coverage for all parameters in the more realistic nonparametric plasmode simulation. CONCLUSION The proposed longitudinal plasmode algorithms can be important methodological tools for evaluating and comparing analytical methods in realistic simulation scenarios. To facilitate the use of these algorithms, we provide R functions on GitHub. We also recommend using MSMs when estimating the effect of cumulative exposure to psychosocial stressors at work.
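Among the MSM estimators compared above, the inverse-probability-of-treatment-weighting idea for a time-varying exposure can be sketched briefly. The Python fragment below is only an illustration, with simulated data, two time points and unstabilized weights (the authors' R functions on GitHub are the reference implementation); it multiplies per-period treatment weights, each conditioning on the history up to that period:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 4000

# time 0: covariate L0 affects exposure A0; time 1: L1 depends on A0
L0 = rng.normal(size=n)
A0 = rng.binomial(1, 1 / (1 + np.exp(-L0)))
L1 = L0 + 0.5 * A0 + rng.normal(size=n)
A1 = rng.binomial(1, 1 / (1 + np.exp(-L1)))

def iptw(a, covs):
    """Unstabilized inverse-probability-of-treatment weight for one period."""
    p = LogisticRegression().fit(covs, a).predict_proba(covs)[:, 1]
    return np.where(a == 1, 1 / p, 1 / (1 - p))

# MSM weight: product of the per-period weights
w = iptw(A0, L0.reshape(-1, 1)) * iptw(A1, np.column_stack([L0, A0, L1]))
# w would then be used in a weighted regression of the outcome on (A0, A1)
```

Stabilized weights (numerator models for the marginal treatment probabilities) are usually preferred in practice to tame extreme weights.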
Affiliation(s)
- Youssra Souli
- Institute for Stochastics, Johannes Kepler University, Linz, Austria
- Xavier Trudel
- Université Laval, Département de médecine sociale et préventive, Québec, Canada
- Centre de recherche du CHU de Québec - Université Laval, Axe santé des populations et pratiques optimales en santé, Québec, Canada
- Awa Diop
- Université Laval, Département de médecine sociale et préventive, Québec, Canada
- Centre de recherche du CHU de Québec - Université Laval, Axe santé des populations et pratiques optimales en santé, Québec, Canada
- Chantal Brisson
- Université Laval, Département de médecine sociale et préventive, Québec, Canada
- Centre de recherche du CHU de Québec - Université Laval, Axe santé des populations et pratiques optimales en santé, Québec, Canada
- Denis Talbot
- Université Laval, Département de médecine sociale et préventive, Québec, Canada.
- Centre de recherche du CHU de Québec - Université Laval, Axe santé des populations et pratiques optimales en santé, Québec, Canada.
10. Oh IS, Jeong HE, Lee H, Filion KB, Noh Y, Shin JY. Validating an approach to overcome the immeasurable time bias in cohort studies: a real-world example and Monte Carlo simulation study. Int J Epidemiol 2023; 52:1534-1544. PMID: 37172269; DOI: 10.1093/ije/dyad049.
Abstract
BACKGROUND Immeasurable time bias arises from the lack of in-hospital medication information. It has been suggested that time-varying adjustment for hospitalization may minimize this potential bias. However, we previously examined this issue in only one case study, and the validity of the approach remains to be assessed in other settings. METHODS Using a Monte Carlo simulation, we generated synthetic immeasurable time-varying hospitalization-related factors of duration, frequency and timing. Nine scenarios were created by combining three frequency scenarios and three duration scenarios, with the empirical cohort distribution of hospitalization used to simulate the timing. We used Korea's healthcare database and a case example of β-blocker use and mortality among patients with heart failure. We estimated the gold-standard hazard ratio (HR) with 95% CI using inpatient and outpatient drug data, and the corresponding estimate for a pseudo-outpatient setting using outpatient data only. We assessed the validity of adjusting for time-varying hospitalization in the nine scenarios using relative bias, confidence limit ratio (CLR) and mean squared error (MSE), compared with the empirical gold-standard estimate across bootstrap resamples. RESULTS With the real-world gold standard (HR 0.73; 95% CI 0.67-0.80) as the reference estimate, adjusting for time-varying hospitalization (0.71; 0.63-0.80) effectively reduced the immeasurable time bias and had the following performance metrics across the nine scenarios: relative bias (range: -7.08% to 0.61%), CLR (1.28 to 1.36) and MSE (0.0005 to 0.0031). CONCLUSIONS The approach of adjusting for time-varying hospitalization consistently reduced the immeasurable time bias in Monte Carlo simulated data.
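The three performance metrics used above are straightforward to compute. The Python helper below is our own sketch of standard definitions (the abstract does not specify the scale for relative bias, so the log-HR scale is an assumption) for evaluating one estimate against a gold-standard log hazard ratio:

```python
import numpy as np

def performance(est_log_hr, ci_low_hr, ci_high_hr, gold_log_hr):
    """Relative bias (%), confidence limit ratio, and squared error of an
    estimate versus a gold-standard log hazard ratio (squared errors are
    averaged across resamples to obtain the MSE)."""
    rel_bias = 100 * (est_log_hr - gold_log_hr) / gold_log_hr
    clr = ci_high_hr / ci_low_hr          # ratio of the CI bounds (HR scale)
    sq_err = (est_log_hr - gold_log_hr) ** 2
    return rel_bias, clr, sq_err

# the abstract's empirical example: gold-standard HR 0.73, adjusted HR 0.71
rb, clr, se = performance(np.log(0.71), 0.63, 0.80, np.log(0.73))
```

A CLR near 1 indicates a tight interval; the abstract's reported CLR range of 1.28 to 1.36 corresponds to fairly precise estimates.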
Affiliation(s)
- In-Sun Oh
- School of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada
- Centre for Clinical Epidemiology, Lady Davis Research Institute-Jewish General Hospital, Montreal, Quebec, Canada
- Han Eol Jeong
- School of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Biohealth Regulatory Science, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Hyesung Lee
- School of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Biohealth Regulatory Science, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Kristian B Filion
- Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada
- Centre for Clinical Epidemiology, Lady Davis Research Institute-Jewish General Hospital, Montreal, Quebec, Canada
- Department of Medicine, McGill University, Montreal, Quebec, Canada
- Yunha Noh
- School of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada
- Centre for Clinical Epidemiology, Lady Davis Research Institute-Jewish General Hospital, Montreal, Quebec, Canada
- Ju-Young Shin
- School of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Biohealth Regulatory Science, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Department of Clinical Research Design & Evaluation, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, South Korea
11
Williamson BD, Wyss R, Stuart EA, Dang LE, Mertens AN, Neugebauer RS, Wilson A, Gruber S. An application of the Causal Roadmap in two safety monitoring case studies: Causal inference and outcome prediction using electronic health record data. J Clin Transl Sci 2023; 7:e208. [PMID: 37900347] [PMCID: PMC10603358] [DOI: 10.1017/cts.2023.632]
Abstract
Background Real-world data, such as administrative claims and electronic health records, are increasingly used for safety monitoring and to help guide regulatory decision-making. In these settings, it is important to document analytic decisions transparently and objectively to assess and ensure that analyses meet their intended goals. Methods The Causal Roadmap is an established framework that can guide and document analytic decisions through each step of the analytic pipeline, which will help investigators generate high-quality real-world evidence. Results In this paper, we illustrate the utility of the Causal Roadmap using two case studies previously led by workgroups sponsored by the Sentinel Initiative - a program for actively monitoring the safety of regulated medical products. Each case example focuses on different aspects of the analytic pipeline for drug safety monitoring. The first case study shows how the Causal Roadmap encourages transparency, reproducibility, and objective decision-making for causal analyses. The second case study highlights how this framework can guide analytic decisions beyond inference on causal parameters, improving outcome ascertainment in clinical phenotyping. Conclusion These examples provide a structured framework for implementing the Causal Roadmap in safety surveillance and guide transparent, reproducible, and objective analysis.
Affiliation(s)
- Brian D. Williamson
  - Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Richard Wyss
  - Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Elizabeth A. Stuart
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Lauren E. Dang
  - Department of Biostatistics, University of California, Berkeley, CA, USA
- Andrew N. Mertens
  - Department of Biostatistics, University of California, Berkeley, CA, USA
12
Vader DT, Mamtani R, Li Y, Griffith SD, Calip GS, Hubbard RA. Inverse Probability of Treatment Weighting and Confounder Missingness in Electronic Health Record-based Analyses: A Comparison of Approaches Using Plasmode Simulation. Epidemiology 2023; 34:520-530. [PMID: 37155612] [PMCID: PMC10231933] [DOI: 10.1097/ede.0000000000001618]
Abstract
BACKGROUND Electronic health record (EHR) data represent a critical resource for comparative effectiveness research, allowing investigators to study intervention effects in real-world settings with large patient samples. However, high levels of missingness in confounder variables are common, challenging the perceived validity of EHR-based investigations. METHODS We investigated the performance of multiple imputation and propensity score (PS) calibration when conducting inverse probability of treatment weighting (IPTW)-based comparative effectiveness research using EHR data with missingness in confounder variables and outcome misclassification. Our motivating example compared effectiveness of immunotherapy versus chemotherapy treatment of advanced bladder cancer with missingness in a key prognostic variable. We captured complexity in EHR data structures using a plasmode simulation approach to spike investigator-defined effects into resamples of a cohort of 4361 patients from a nationwide deidentified EHR-derived database. We characterized statistical properties of IPTW hazard ratio estimates when using multiple imputation or PS calibration missingness approaches. RESULTS Multiple imputation and PS calibration performed similarly, maintaining ≤0.05 absolute bias in the marginal hazard ratio even when ≥50% of subjects had missing at random or missing not at random confounder data. Multiple imputation required greater computational resources, taking nearly 40 times as long as PS calibration to complete. Outcome misclassification minimally increased bias of both methods. CONCLUSION Our results support multiple imputation and PS calibration approaches to missingness in missing completely at random or missing at random confounder variables in EHR-based IPTW comparative effectiveness analyses, even with missingness ≥50%. PS calibration represents a computationally efficient alternative to multiple imputation.
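For readers unfamiliar with the IPTW step underlying these comparisons, a minimal sketch of stabilized ATE weights computed from an already-fitted propensity score (the paper's imputation and calibration machinery is not reproduced here):

```python
import numpy as np

def stabilized_iptw(treated, ps):
    """Stabilized inverse-probability-of-treatment weights for the ATE.
    treated: 0/1 array; ps: estimated P(treated = 1 | covariates)."""
    p_treat = treated.mean()  # marginal treatment prevalence stabilizes the weights
    return np.where(treated == 1, p_treat / ps, (1 - p_treat) / (1 - ps))

# Toy example with four patients and hand-picked propensity scores
treated = np.array([1, 1, 0, 0])
ps = np.array([0.8, 0.4, 0.4, 0.2])
w = stabilized_iptw(treated, ps)
```

The weighted outcome model (here, a weighted Cox model) is then fit on the pseudo-population these weights create.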
Affiliation(s)
- Daniel T. Vader
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA
- Ronac Mamtani
  - Division of Hematology and Oncology, University of Pennsylvania, Philadelphia, PA
- Yun Li
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA
- Rebecca A. Hubbard
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA
13
Sarayani A, Brown JD, Hampp C, Donahoo WT, Winterstein AG. Adaptability of High Dimensional Propensity Score Procedure in the Transition from ICD-9 to ICD-10 in the US Healthcare System. Clin Epidemiol 2023; 15:645-660. [PMID: 37274833] [PMCID: PMC10237200] [DOI: 10.2147/clep.s405165]
Abstract
Background The High-Dimensional Propensity Score (HDPS) procedure is a data-driven approach to assist control for confounding in pharmacoepidemiologic research. The transition from the International Classification of Diseases, Ninth Revision (ICD-9) to the Tenth Revision (ICD-10) in the US health system may pose uncertainty in applying the HDPS procedure. Methods We assembled a base cohort of patients in MarketScan® Commercial Claims Database who had newly initiated celecoxib or traditional NSAIDs to compare gastrointestinal bleeding risk. We then created bootstrapped hypothetical cohorts from the base cohort with predefined patient selection patterns from the ICD eras. Three strategies for HDPS deployment were tested: 1) split the cohort by ICD era, deploy HDPS twice, and pool the relative risks (pooled RR), 2) consider codes from each ICD era as a separate data dimension and deploy HDPS in the entire cohort (data dimensions), and 3) map ICD codes from both eras to Clinical Classifications Software (CCS) concepts before deploying HDPS in the entire cohort (CCS mapping). We calculated percent bias and root-mean-squared error to compare the strategies. Results A similar bias reduction was observed in cohorts where the patient selection pattern from each ICD era was comparable between the exposure groups. In the presence of considerable disparity in patient selection, we observed a bimodal distribution of propensity scores in the data dimensions strategy, indicating instrument-like covariates. Moreover, the CCS mapping strategy resulted in at least 30% less bias than the pooled RR and data dimensions strategies (RMSE: 0.14, 0.19, 0.21, respectively) in this scenario. Conclusion Mapping ICD codes to a stable terminology like CCS serves as a helpful strategy to reduce residual bias when deploying HDPS in pharmacoepidemiologic studies spanning both ICD eras.
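The CCS-mapping strategy amounts to translating each claim's ICD-9 or ICD-10 code into a version-stable concept before HDPS counts covariate prevalence. A toy sketch with a hypothetical four-code crosswalk (real CCS maps are large lookup files distributed by AHRQ; the codes below are only illustrative):

```python
# Hypothetical fragment of an ICD-to-CCS crosswalk (illustrative only).
ICD_TO_CCS = {
    ("icd9", "530.81"): "CCS:138",   # esophageal disorders (incl. GERD)
    ("icd10", "K21.9"): "CCS:138",
    ("icd9", "578.9"): "CCS:153",    # gastrointestinal hemorrhage
    ("icd10", "K92.2"): "CCS:153",
}

def to_ccs_dimension(claims):
    """Collapse (patient, icd_version, code) claims into per-patient sets of
    version-stable CCS concepts, forming one combined HDPS data dimension."""
    out = {}
    for patient, version, code in claims:
        concept = ICD_TO_CCS.get((version, code))
        if concept is not None:
            out.setdefault(patient, set()).add(concept)
    return out

claims = [(1, "icd9", "530.81"), (1, "icd10", "K21.9"), (2, "icd10", "K92.2")]
concepts = to_ccs_dimension(claims)
```

Note how patient 1's ICD-9 and ICD-10 GERD codes collapse into a single concept, which is what keeps the covariate dimension stable across the transition.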
Affiliation(s)
- Amir Sarayani
  - Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, FL, USA
  - Center for Drug Safety and Evaluation, University of Florida, Gainesville, FL, USA
- Joshua D Brown
  - Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, FL, USA
  - Center for Drug Safety and Evaluation, University of Florida, Gainesville, FL, USA
- Christian Hampp
  - Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, FL, USA
  - Regeneron Pharmaceuticals Inc., Tarrytown, NY, USA
- William T Donahoo
  - Division of Endocrinology, Diabetes, & Metabolism, College of Medicine, University of Florida, Gainesville, FL, USA
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Almut G Winterstein
  - Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, FL, USA
  - Center for Drug Safety and Evaluation, University of Florida, Gainesville, FL, USA
14
Laurent T, Lambrelli D, Wakabayashi R, Hirano T, Kuwatsuru R. Strategies to Address Current Challenges in Real-World Evidence Generation in Japan. Drugs Real World Outcomes 2023. [PMID: 37178273] [PMCID: PMC10182751] [DOI: 10.1007/s40801-023-00371-5]
Abstract
The generation of real-world evidence (RWE), which describes patient characteristics or treatment patterns using real-world data (RWD), is rapidly growing more popular as a tool for decision-making in Japan. The aim of this review was to summarize challenges to RWE generation in Japan related to pharmacoepidemiology, and to propose strategies to address some of these challenges. We first focused on data-related issues, including the lack of transparency of RWD sources, linkage across different care settings, definitions of clinical outcomes, and the overall assessment framework of RWD when used for research purposes. Next, we reviewed methodology-related challenges. Because lack of design transparency impairs study reproducibility, transparent reporting of study design is critical for stakeholders. For this review, we considered different sources of bias and time-varying confounding, along with potential study design and methodological solutions. Additionally, the implementation of robust assessment of definition uncertainty, misclassification, and unmeasured confounders would enhance RWE credibility in light of RWD source-related limitations, and is being strongly considered by task forces in Japan. Overall, the development of guidance for best practices on data source selection, design transparency, and analytical methods to address different sources of bias and robustness in the process of RWE generation will enhance credibility for stakeholders and local decision-makers.
Affiliation(s)
- Thomas Laurent
  - Real-World Evidence and Data Assessment (READS), Graduate School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
  - Clinical Study Support Inc., 2F Daiei Bldg., 1-11-20 Nishiki Naka-ku, Nagoya, 460-0003, Japan
- Dimitra Lambrelli
  - Real-World Evidence and Data Assessment (READS), Graduate School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
  - Real-World Evidence, Evidera, The Ark, 2nd Floor, 201 Talgarth Road, London, W6 8BJ, UK
- Ryozo Wakabayashi
  - Real-World Evidence and Data Assessment (READS), Graduate School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
  - Clinical Study Support Inc., 2F Daiei Bldg., 1-11-20 Nishiki Naka-ku, Nagoya, 460-0003, Japan
- Takahiro Hirano
  - Real-World Evidence and Data Assessment (READS), Graduate School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
  - Clinical Study Support Inc., 2F Daiei Bldg., 1-11-20 Nishiki Naka-ku, Nagoya, 460-0003, Japan
- Ryohei Kuwatsuru
  - Real-World Evidence and Data Assessment (READS), Graduate School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
  - Department of Radiology, School of Medicine, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo, 113-8421, Japan
15
Getz K, Hubbard RA, Linn KA. Performance of Multiple Imputation Using Modern Machine Learning Methods in Electronic Health Records Data. Epidemiology 2023; 34:206-215. [PMID: 36722803] [DOI: 10.1097/ede.0000000000001578]
Abstract
BACKGROUND Missing data are common in studies using electronic health record (EHR)-derived data. Missingness in EHR data is related to healthcare utilization patterns, resulting in complex and potentially missing not at random missingness mechanisms. Prior research has suggested that machine learning-based multiple imputation methods may outperform traditional methods and may perform well even in settings of missing not at random missingness. METHODS We used plasmode simulations based on a nationwide EHR-derived de-identified database for patients with metastatic urothelial carcinoma to compare the performance of multiple imputation using chained equations, random forests, and denoising autoencoders in terms of bias and precision of hazard ratio estimates under varying proportions of observations with missing values and missingness mechanisms (missing completely at random, missing at random, and missing not at random). RESULTS Multiple imputation by chained equations and random forest methods had low bias and similar standard errors for parameter estimates under missingness completely at random. Under missingness at random, denoising autoencoders had higher bias than multiple imputation by chained equations and random forests. Contrary to results of prior studies of denoising autoencoders, all methods exhibited substantial bias under missingness not at random, with bias increasing in direct proportion to the amount of missing data. CONCLUSIONS We found no advantage of denoising autoencoders for multiple imputation in the setting of an epidemiologic study conducted using EHR data. Results suggested that denoising autoencoders may overfit the data, leading to poor confounder control. Use of more flexible imputation approaches does not mitigate bias induced by missingness not at random and can produce estimates with spurious precision.
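Whichever imputation engine is used (chained equations, random forests, or denoising autoencoders), the M completed-data estimates are combined the same way, with Rubin's rules. A minimal sketch of the pooling step:

```python
def rubin_pool(estimates, variances):
    """Pool M completed-data point estimates and their within-imputation
    variances using Rubin's rules; returns (pooled estimate, total variance)."""
    m = len(estimates)
    qbar = sum(estimates) / m                              # pooled point estimate
    ubar = sum(variances) / m                              # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total_var = ubar + (1 + 1 / m) * b
    return qbar, total_var

# e.g. log hazard ratios and their variances from three imputed datasets
est, var = rubin_pool([0.10, 0.14, 0.12], [0.04, 0.05, 0.045])
```

The between-imputation term is what distinguishes an honest multiply-imputed variance from the spuriously precise single-imputation variance the abstract warns about.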
Affiliation(s)
- Kylie Getz
  - Department of Biostatistics and Epidemiology, School of Public Health, Rutgers University, Piscataway, NJ
- Rebecca A Hubbard
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
  - Abramson Cancer Center, University of Pennsylvania, Philadelphia, PA
- Kristin A Linn
  - Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
16
Targeted learning: Towards a future informed by real-world evidence. Stat Biopharm Res 2023. [DOI: 10.1080/19466315.2023.2182356]
17
Abrahamowicz M, Beauchamp ME, Moura CS, Bernatsky S, Ferreira Guerra S, Danieli C. Adapting SIMEX to correct for bias due to interval-censored outcomes in survival analysis with time-varying exposure. Biom J 2022; 64:1467-1485. [PMID: 36065586] [DOI: 10.1002/bimj.202100013]
Abstract
Many clinical and epidemiological applications of survival analysis focus on interval-censored events that can be ascertained only at discrete times of clinic visits. This implies that the values of time-varying covariates are not correctly aligned with the true, unknown event times, inducing a bias in the estimated associations. To address this issue, we adapted the simulation-extrapolation (SIMEX) methodology, based on assessing how the estimates change with the artificially increased time between clinic visits. We propose diagnostics to choose the extrapolating function. In simulations, the SIMEX-corrected estimates considerably reduced the bias toward the null and generally yielded a better bias/variance trade-off than conventional estimates. In a real-life pharmacoepidemiological application, the proposed method increased by 27% the estimated excess hazard for the association between a time-varying exposure, representing the 2-year cumulative duration of past use of an antihypertensive medication, and the hazard of nonmelanoma skin cancer (an interval-censored event). These simulation-based and real-life results suggest that the proposed SIMEX-based correction may help improve the accuracy of estimated associations between time-varying exposures and the hazard of interval-censored events in large cohort studies where the events are recorded only at relatively sparse times of clinic visits/assessments. However, these advantages may be less certain for smaller studies and/or weak associations.
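The core of SIMEX is the extrapolation step: estimates obtained at artificially inflated error levels λ ≥ 0 are fitted with a parametric curve and extrapolated back to λ = -1, the error-free limit. A minimal numpy sketch with a quadratic extrapolant (the paper's diagnostics for choosing the extrapolating function are not reproduced); the inputs below are fabricated to lie on an exact quadratic so the extrapolation is exact:

```python
import numpy as np

def simex_extrapolate(lambdas, estimates, degree=2):
    """Fit estimate(lambda) with a polynomial and extrapolate to lambda = -1,
    the SIMEX approximation of the error-free estimate."""
    coefs = np.polyfit(lambdas, estimates, degree)
    return float(np.polyval(coefs, -1.0))

# Illustrative estimates on the exact quadratic 2 + 0.5*lam + 0.1*lam**2
lams = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
ests = 2 + 0.5 * lams + 0.1 * lams**2
corrected = simex_extrapolate(lams, ests)  # value of the quadratic at lambda = -1
```

In the interval-censoring adaptation, λ indexes the artificially widened spacing between clinic visits rather than classical measurement-error variance.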
Affiliation(s)
- Michal Abrahamowicz
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Marie-Eve Beauchamp
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Cristiano Soares Moura
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Sasha Bernatsky
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Steve Ferreira Guerra
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, Canada
- Coraline Danieli
  - Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
18
Weinstein SM, Vandekar SN, Baller EB, Tu D, Adebimpe A, Tapera TM, Gur RC, Gur RE, Detre JA, Raznahan A, Alexander-Bloch AF, Satterthwaite TD, Shinohara RT, Park JY. Spatially-enhanced clusterwise inference for testing and localizing intermodal correspondence. Neuroimage 2022; 264:119712. [PMID: 36309332] [PMCID: PMC10062374] [DOI: 10.1016/j.neuroimage.2022.119712]
Abstract
With the increasing availability of neuroimaging data from multiple modalities-each providing a different lens through which to study brain structure or function-new techniques for comparing, integrating, and interpreting information within and across modalities have emerged. Recent developments include hypothesis tests of associations between neuroimaging modalities, which can be used to determine the statistical significance of intermodal associations either throughout the entire brain or within anatomical subregions or functional networks. While these methods provide a crucial foundation for inference on intermodal relationships, they cannot be used to answer questions about where in the brain these associations are most pronounced. In this paper, we introduce a new method, called CLEAN-R, that can be used both to test intermodal correspondence throughout the brain and also to localize this correspondence. Our method involves first adjusting for the underlying spatial autocorrelation structure within each modality before aggregating information within small clusters to construct a map of enhanced test statistics. Using structural and functional magnetic resonance imaging data from a subsample of children and adolescents from the Philadelphia Neurodevelopmental Cohort, we conduct simulations and data analyses where we illustrate the high statistical power and nominal type I error levels of our method. By constructing an interpretable map of group-level correspondence using spatially-enhanced test statistics, our method offers insights beyond those provided by earlier methods.
Affiliation(s)
- Sarah M Weinstein
  - Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Simon N Vandekar
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Erica B Baller
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Danni Tu
  - Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Azeez Adebimpe
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
  - Strategy Innovation & Deployment Section, Johnson and Johnson, Raritan, NJ, 08869, USA
- Tinashe M Tapera
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Ruben C Gur
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Raquel E Gur
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- John A Detre
  - Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Armin Raznahan
  - Section on Developmental Neurogenomics, National Institute of Mental Health Intramural Research Program, Bethesda, MD 20892, USA
- Aaron F Alexander-Bloch
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
  - Department of Child and Adolescent Psychiatry and Behavioral Science, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Theodore D Satterthwaite
  - Department of Psychiatry, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Russell T Shinohara
  - Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA 19104, USA
- Jun Young Park
  - Department of Statistical Sciences and Department of Psychology, University of Toronto, Toronto, ON, M5G 1Z5, Canada
19
Duchesneau ED, Jackson BE, Webster-Clark M, Lund JL, Reeder-Hayes KE, Nápoles AM, Strassle PD. The Timing, the Treatment, the Question: Comparison of Epidemiologic Approaches to Minimize Immortal Time Bias in Real-World Data Using a Surgical Oncology Example. Cancer Epidemiol Biomarkers Prev 2022; 31:2079-2086. [PMID: 35984990] [PMCID: PMC9627261] [DOI: 10.1158/1055-9965.epi-22-0495]
Abstract
BACKGROUND Studies evaluating the effects of cancer treatments are prone to immortal time bias that, if unaddressed, can lead to treatments appearing more beneficial than they are. METHODS To demonstrate the impact of immortal time bias, we compared results across several analytic approaches (dichotomous exposure, dichotomous exposure excluding immortal time, time-varying exposure, landmark analysis, clone-censor-weight method), using surgical resection among women with metastatic breast cancer as an example. All adult women diagnosed with incident metastatic breast cancer from 2013-2016 in the National Cancer Database were included. To quantify immortal time bias, we also conducted a simulation study where the "true" relationship between surgical resection and mortality was known. RESULTS Overall, 24,329 women (median age 61, IQR 51-71) were included, and 24% underwent surgical resection. The largest association between resection and mortality was observed when using a dichotomized exposure [HR, 0.54; 95% confidence interval (CI), 0.51-0.57], followed by dichotomous exposure with exclusion of immortal time (HR, 0.62; 95% CI, 0.59-0.65). Results from the time-varying exposure, landmark, and clone-censor-weight analyses were closer to the null (HR, 0.67-0.84). In the plasmode simulation, the time-varying exposure, landmark, and clone-censor-weight models all produced unbiased HRs (bias, -0.003 to 0.016). Both the standard dichotomous exposure (HR, 0.84; bias, -0.177) and dichotomous exposure with exclusion of immortal time (HR, 0.93; bias, -0.074) produced meaningfully biased estimates. CONCLUSIONS Researchers should use time-varying exposures with a treatment assessment window or the clone-censor-weight method when immortal time is present. IMPACT Using methods that appropriately account for immortal time will improve evidence and decision-making from research using real-world data.
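Of the approaches compared, the landmark analysis is the simplest to sketch: exposure is classified at a fixed landmark time and follow-up restarts there, so no pre-treatment person-time is misattributed to the treated group. An illustrative pandas sketch with hypothetical column names (not the study's actual code):

```python
import pandas as pd

def landmark_cohort(df, landmark_day):
    """Keep patients still at risk at the landmark, classify exposure by
    whether surgery occurred on or before it, and restart the clock there."""
    at_risk = df[df["followup_days"] > landmark_day].copy()
    at_risk["exposed"] = (
        at_risk["surgery_day"].notna() & (at_risk["surgery_day"] <= landmark_day)
    )
    at_risk["time_since_landmark"] = at_risk["followup_days"] - landmark_day
    return at_risk

# Toy cohort: patient 2 dies before the 90-day landmark and is excluded;
# patient 4's later surgery does NOT count as exposure at the landmark.
df = pd.DataFrame({
    "followup_days": [400, 60, 300, 500],
    "surgery_day": [45, 30, None, 150],
})
cohort = landmark_cohort(df, landmark_day=90)
```

Treating post-landmark surgery as unexposed is exactly the trade-off of landmark analysis: it eliminates immortal time at the cost of some exposure misclassification after the landmark.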
Affiliation(s)
- Emilie D. Duchesneau
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Bradford E. Jackson
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Michael Webster-Clark
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Jennifer L. Lund
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Katherine E. Reeder-Hayes
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Division of Oncology, Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Anna M. Nápoles
  - Division of Intramural Research, National Institute on Minority Health and Health Disparities, NIH, Bethesda, Maryland
- Paula D. Strassle
  - Division of Intramural Research, National Institute on Minority Health and Health Disparities, NIH, Bethesda, Maryland
  - Corresponding Author: Paula D. Strassle, Division of Intramural Research, National Institute on Minority Health and Health Disparities, NIH, Bethesda, MD 20892. Phone: 301-594-5175
20
Shi J, Wang D, Tesei G, Norgeot B. Generating high-fidelity privacy-conscious synthetic patient data for causal effect estimation with multiple treatments. Front Artif Intell 2022; 5:918813. [PMID: 36187323] [PMCID: PMC9515575] [DOI: 10.3389/frai.2022.918813]
Abstract
In the past decade, there has been exponentially growing interest in the use of observational data collected as a part of routine healthcare practice to determine the effect of a treatment with causal inference models. Validation of these models, however, has been a challenge because the ground truth is unknown: only one treatment-outcome pair for each person can be observed. There have been multiple efforts to fill this void using synthetic data where the ground truth can be generated. However, to date, these datasets have been severely limited in their utility either by being modeled after small non-representative patient populations, being dissimilar to real target populations, or only providing known effects for two cohorts (treated vs. control). In this work, we produced a large-scale and realistic synthetic dataset that provides ground truth effects for over 10 hypertension treatments on blood pressure outcomes. The synthetic dataset was created by modeling a nationwide cohort of more than 580,000 hypertension patient data including each person's multi-year history of diagnoses, medications, and laboratory values. We designed a data generation process by combining an adapted ADS-GAN model for fictitious patient information generation and a neural network for treatment outcome generation. Wasserstein distance of 0.35 demonstrates that our synthetic data follows a nearly identical joint distribution to the patient cohort used to generate the data. Patient privacy was a primary concern for this study; the ϵ-identifiability metric, which estimates the probability of actual patients being identified, is 0.008%, ensuring that our synthetic data cannot be used to identify any actual patients. To demonstrate its usage, we tested the bias in causal effect estimation of four well-established models using this dataset. The approach we used can be readily extended to other types of diseases in the clinical domain, and to datasets in other domains as well.
21
Wyss R, Schneeweiss S, Lin KJ, Miller DP, Kalilani L, Franklin JM. Synthetic Negative Controls: Using Simulation to Screen Large-scale Propensity Score Analyses. Epidemiology 2022; 33:541-550. [PMID: 35439779] [PMCID: PMC9156547] [DOI: 10.1097/ede.0000000000001482]
Abstract
The propensity score has become a standard tool to control for large numbers of variables in healthcare database studies. However, little has been written on the challenge of comparing large-scale propensity score analyses that use different methods for confounder selection and adjustment. In these settings, balance diagnostics are useful but do not inform researchers on which variables balance should be assessed or quantify the impact of residual covariate imbalance on bias. Here, we propose a framework to supplement balance diagnostics when comparing large-scale propensity score analyses. Instead of focusing on results from any single analysis, we suggest conducting and reporting results for many analytic choices and using both balance diagnostics and synthetically generated control studies to screen analyses that show signals of bias caused by measured confounding. To generate synthetic datasets, the framework does not require simulating the outcome-generating process. In healthcare database studies, outcome events are often rare, making it difficult to identify and model all predictors of the outcome to simulate a confounding structure closely resembling the given study. Therefore, the framework uses a model for treatment assignment to divide the comparator population into pseudo-treatment groups where covariate differences resemble those in the study cohort. The partially simulated datasets have a confounding structure approximating the study population under the null (synthetic negative control studies). The framework is used to screen analyses that likely violate partial exchangeability due to lack of control for measured confounding. We illustrate the framework using simulations and an empirical example.
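The pseudo-treatment construction described here (splitting the comparator arm so covariate differences mimic the real treated/comparator contrast, with a null effect by design) can be sketched as follows; the cohort, model, and coefficients are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 20_000

# Invented cohort: one confounder x drives both treatment and outcome,
# and there is NO true treatment effect
x = rng.normal(size=n)
treated = rng.random(n) < 1 / (1 + np.exp(-(-1.0 + 0.8 * x)))
y = rng.random(n) < 1 / (1 + np.exp(-(-2.0 + 0.5 * x)))

# Model treatment assignment from covariates in the full cohort
ps_model = LogisticRegression().fit(x.reshape(-1, 1), treated)

# Restrict to the comparator arm and assign *pseudo*-treatment from the
# fitted model, so pseudo-group covariate differences resemble the real
# treated vs. comparator contrast while the effect stays null by construction
comp_x = x[~treated]
comp_y = y[~treated]
pseudo = rng.random(comp_x.size) < ps_model.predict_proba(comp_x.reshape(-1, 1))[:, 1]

# A nonzero crude "effect" here reflects measured confounding only; an
# analysis pipeline that still shows it after adjustment fails the screen
rd = comp_y[pseudo].mean() - comp_y[~pseudo].mean()
print(f"pseudo-treated fraction: {pseudo.mean():.3f}")
print(f"crude risk difference under the null: {rd:.3f}")
```

The crude risk difference is positive purely because the pseudo-treated group inherits higher values of the confounder; a candidate propensity score analysis would be run on (`comp_x`, `pseudo`, `comp_y`) and screened out if its adjusted estimate stays away from zero.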
Affiliation(s)
- Richard Wyss
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Sebastian Schneeweiss
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Kueiyu Joshua Lin
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Division of General Internal Medicine, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Jessica M Franklin
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
22.
Robertson SE, Steingrimsson JA, Dahabreh IJ. Using Numerical Methods to Design Simulations: Revisiting the Balancing Intercept. Am J Epidemiol 2022; 191:1283-1289. [PMID: 34736280 DOI: 10.1093/aje/kwab264]
Abstract
In this paper, we consider methods for generating draws of a binary random variable whose expectation conditional on covariates follows a logistic regression model with known covariate coefficients. We examine approximations for finding a "balancing intercept," that is, a value for the intercept of the logistic model that leads to a desired marginal expectation for the binary random variable. We show that a recently proposed analytical approximation can produce inaccurate results, especially when targeting more extreme marginal expectations or when the linear predictor of the regression model has high variance. We then formulate the balancing intercept as a solution to an integral equation, implement a numerical approximation for solving the equation based on Monte Carlo methods, and show that the approximation works well in practice. Our approach to the basic problem of the balancing intercept provides an example of a broadly applicable strategy for formulating and solving problems that arise in the design of simulation studies used to evaluate or teach epidemiologic methods.
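The numerical approach described here (formulating the balancing intercept as the root of an equation and solving it over a Monte Carlo sample of the linear predictor) can be sketched with simple bisection; the covariate model and target prevalence below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

def balancing_intercept(lin_pred, target, lo=-20.0, hi=20.0, tol=1e-8):
    """Find b0 so that mean(expit(b0 + lin_pred)) equals `target`, by bisection.

    `lin_pred` holds Monte Carlo draws of the linear predictor X @ beta
    (without intercept); the sample mean stands in for the integral over
    the covariate distribution."""
    def marginal(b0):
        return np.mean(1 / (1 + np.exp(-(b0 + lin_pred))))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal(mid) < target:   # marginal prevalence increases in b0
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
lin_pred = 1.5 * x                   # high-variance linear predictor

b0 = balancing_intercept(lin_pred, target=0.05)
achieved = np.mean(1 / (1 + np.exp(-(b0 + lin_pred))))
print(f"b0 = {b0:.3f}, achieved marginal prevalence = {achieved:.4f}")

# The naive analytical guess logit(0.05) ignores the nonlinearity and misses:
naive = np.log(0.05 / 0.95)
naive_prev = np.mean(1 / (1 + np.exp(-(naive + lin_pred))))
print(f"naive logit(0.05) = {naive:.3f}, prevalence it yields = {naive_prev:.4f}")
```

The naive intercept overshoots the 5% target precisely in the regime the paper flags (extreme marginal expectation, high-variance linear predictor), while the numerically solved intercept hits it.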
23.
Wyss R, Yanover C, El-Hay T, Bennett D, Platt RW, Zullo AR, Sari G, Wen X, Ye Y, Yuan H, Gokhale M, Patorno E, Lin KJ. Machine learning for improving high-dimensional proxy confounder adjustment in healthcare database studies: an overview of the current literature. Pharmacoepidemiol Drug Saf 2022; 31:932-943. [PMID: 35729705 PMCID: PMC9541861 DOI: 10.1002/pds.5500]
Abstract
Controlling for large numbers of variables that collectively serve as 'proxies' for unmeasured factors can often improve confounding control in pharmacoepidemiologic studies utilizing administrative healthcare databases. There is a growing body of evidence showing that data-driven machine learning algorithms for high-dimensional proxy confounder adjustment can supplement investigator-specified variables to improve confounding control compared to adjustment based on investigator-specified variables alone. Consequently, there has been a recent focus on the development of data-driven methods for high-dimensional proxy confounder adjustment. In this paper, we discuss the considerations underpinning three areas for data-driven high-dimensional proxy confounder adjustment: (1) feature generation: transforming raw data into covariates (or features) to be used for proxy adjustment; (2) covariate prioritization, selection, and adjustment; and (3) diagnostic assessment. We survey current approaches and recent advancements within each area, including the most widely used approach to proxy confounder adjustment in healthcare database studies (the high-dimensional propensity score, or hdPS). We also discuss limitations of the hdPS and outline recent advancements that incorporate the principles of proxy adjustment with machine learning extensions to improve performance. We further discuss challenges and avenues of future development within each area. This manuscript is endorsed by the International Society for Pharmacoepidemiology (ISPE).
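As a concrete illustration of the feature-generation step, the hdPS algorithm expands each claims code into recurrence-based indicator features. Below is a minimal sketch with toy counts and simplified cutoffs (the published algorithm additionally prioritizes features by their potential for confounding, which is omitted here):

```python
import pandas as pd

# Toy claims counts: rows = patients, columns = diagnosis/drug codes
counts = pd.DataFrame(
    {"dx_E11": [0, 1, 4, 0, 9], "rx_C09": [2, 0, 1, 3, 0]},
    index=[f"pt{i}" for i in range(5)],
)

def hdps_features(counts):
    """hdPS-style recurrence features: for each code, flag patients whose
    count reaches 1, the median, and the 75th percentile of the nonzero
    counts observed for that code."""
    feats = {}
    for code in counts.columns:
        nonzero = counts[code][counts[code] > 0]
        for label, cut in [("once", 1),
                           ("sporadic", nonzero.median()),
                           ("frequent", nonzero.quantile(0.75))]:
            feats[f"{code}_{label}"] = (counts[code] >= cut).astype(int)
    return pd.DataFrame(feats, index=counts.index)

features = hdps_features(counts)
print(features)
```

Each raw code thus yields up to three binary covariates, which is how a few thousand codes become the tens of thousands of candidate proxies that the prioritization step then ranks.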
Affiliation(s)
- Richard Wyss
- Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Tal El-Hay
- KI Research Institute, Kfar Malal, Israel; IBM Research-Haifa Labs, Haifa, Israel
- Dimitri Bennett
- Global Evidence and Outcomes, Takeda Pharmaceutical Company Ltd., Cambridge, MA, USA
- Andrew R Zullo
- Department of Health Services, Policy, and Practice, Brown University School of Public Health and Center of Innovation in Long-Term Services and Supports, Providence Veterans Affairs Medical Center, Providence, RI, USA
- Grammati Sari
- Real World Evidence Strategy Lead, Visible Analytics Ltd, Oxford, UK
- Xuerong Wen
- Health Outcomes, Pharmacy Practice, College of Pharmacy, University of Rhode Island, Kingston, RI, USA
- Yizhou Ye
- Global Epidemiology, AbbVie Inc., North Chicago, IL, USA
- Hongbo Yuan
- Canadian Agency for Drugs and Technologies in Health, Ottawa, Canada
- Mugdha Gokhale
- Pharmacoepidemiology, Center for Observational and Real-world Evidence, Merck, PA, USA
- Elisabetta Patorno
- Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Kueiyu Joshua Lin
- Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
24.
Rodriguez PJ, Veenstra DL, Heagerty PJ, Goss CH, Ramos KJ, Bansal A. A Framework for Using Real-World Data and Health Outcomes Modeling to Evaluate Machine Learning-Based Risk Prediction Models. Value Health 2022; 25:350-358. [PMID: 35227445 PMCID: PMC9311314 DOI: 10.1016/j.jval.2021.11.1360]
Abstract
OBJECTIVES We propose a framework of health outcomes modeling with dynamic decision making and real-world data (RWD) to evaluate the potential utility of novel risk prediction models in clinical practice. Lung transplant (LTx) referral decisions in cystic fibrosis offer a complex case study. METHODS We used longitudinal RWD for a cohort of adults (n = 4247) from the Cystic Fibrosis Foundation Patient Registry to compare outcomes of an LTx referral policy based on machine learning (ML) mortality risk predictions with referral based on (1) forced expiratory volume in 1 second (FEV1) alone and (2) heterogeneous usual care (UC). We then developed a patient-level simulation model to project the number of patients referred for LTx and 5-year survival, accounting for transplant availability, organ allocation policy, and heterogeneous treatment effects. RESULTS Only 12% of patients (95% confidence interval 11%-13%) were referred for LTx over 5 years under UC, compared with 19% (18%-20%) under FEV1 and 20% (19%-22%) under ML. Of 309 patients who died before LTx referral under UC, 31% (27%-36%) would have been referred under FEV1 and 40% (35%-45%) would have been referred under ML. Given a fixed supply of organs, differences in referral time did not lead to significant differences in transplants, pre- or post-transplant deaths, or overall survival in 5 years. CONCLUSIONS Health outcomes modeling with RWD may help to identify novel ML risk prediction models with high potential real-world clinical utility and rule out further investment in models that are unlikely to offer meaningful real-world benefits.
Affiliation(s)
- Patricia J Rodriguez
- The Comparative Health Outcomes, Policy & Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA
- David L Veenstra
- The Comparative Health Outcomes, Policy & Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA
- Christopher H Goss
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, University of Washington, Seattle, WA, USA; Division of Pulmonology, Department of Pediatrics, University of Washington, Seattle, WA, USA
- Kathleen J Ramos
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, University of Washington, Seattle, WA, USA
- Aasthaa Bansal
- The Comparative Health Outcomes, Policy & Economics (CHOICE) Institute, University of Washington, Seattle, WA, USA
25.
Shan M, Faries D, Dang A, Zhang X, Cui Z, Sheffield KM. A Simulation-Based Evaluation of Statistical Methods for Hybrid Real-World Control Arms in Clinical Trials. Stat Biosci 2022. [DOI: 10.1007/s12561-022-09334-w]
26.
Stopsack KH, Tyekucheva S, Wang M, Gerke TA, Vaselkiv JB, Penney KL, Kantoff PW, Finn SP, Fiorentino M, Loda M, Lotan TL, Parmigiani G, Mucci LA. Extent, impact, and mitigation of batch effects in tumor biomarker studies using tissue microarrays. eLife 2021; 10:71265. [PMID: 34939926 PMCID: PMC8849344 DOI: 10.7554/elife.71265]
Abstract
Tissue microarrays (TMAs) have been used in thousands of cancer biomarker studies. The extent to which batch effects (measurement error in biomarker levels between slides) affect TMA-based studies has not been assessed systematically. We evaluated 20 protein biomarkers on 14 TMAs with prospectively collected tumor tissue from 1448 primary prostate cancers. In half of the biomarkers, more than 10% of biomarker variance was attributable to between-TMA differences (range, 1–48%). We implemented different methods to mitigate batch effects (R package batchtma), tested in plasmode simulation. Biomarker levels were more similar between mitigation approaches compared to uncorrected values. For some biomarkers, associations with clinical features changed substantially after addressing batch effects. Batch effects and resulting bias are not an error of an individual study but an inherent feature of TMA-based protein biomarker studies. They always need to be considered during study design and addressed analytically in studies using more than one TMA. To understand cancer, researchers need to know which molecules tumor cells use. These so-called ‘biomarkers’ tag cancer cells as being different from healthy cells, and can be used to predict how aggressive a tumor may be, or how well it might respond to treatment. A popular technique for assessing biomarkers across multiple tumors is to use tissue microarrays. This involves taking samples from different tumors and embedding them in a block of wax, which is then cut into micro-thin slices and stained with reagents that can detect specific biomarkers, such as proteins. Each block contains hundreds of samples, which all experience the same conditions. So, any patterns detected in the staining are likely to represent real variations in the biomarkers present.
Many cancer studies, however, often compare samples from multiple tissue microarrays, which may increase the risk of technical artifacts: for example, staining may look stronger in one batch of tissue samples than another, even though the amount of biomarker present in these different arrays is roughly the same. These ‘batch effects’ could potentially bias the results of the experiment and lead to the identification of misleading patterns. To evaluate how batch effects impact tissue microarray studies, Stopsack et al. examined 14 wax blocks which contained tumor samples from 1,448 men with prostate cancer. This revealed that for some biomarkers, but not others, there were noticeable differences between tissue microarrays that were clearly the result of batch effects. Stopsack et al. then tested six different ways of fixing these discrepancies using statistical methods. All six approaches were successful, even if the arrays included tumors with different characteristics, such as tumors that had been diagnosed more or less recently. This work highlights the importance of considering batch effects when using tissue microarrays to study cancer. Stopsack et al. have used their statistical approaches to develop freely available software which can reduce the biases that sometimes arise from these technical artifacts. This could help researchers avoid misleading patterns in their data and make it easier to detect real variations in the biomarkers present between tumor samples.
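One of the simplest mitigation strategies, mean-centering biomarker levels per TMA, can be sketched in a few lines. The data below are synthetic with invented additive batch shifts; the batchtma R package referenced above implements this and more covariate-aware variants:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Hypothetical biomarker measurements from three TMAs with additive batch shifts
tma = np.repeat(["TMA1", "TMA2", "TMA3"], 200)
shift = {"TMA1": 0.0, "TMA2": 1.2, "TMA3": -0.8}
marker = rng.normal(5.0, 1.0, size=600) + pd.Series(tma).map(shift).to_numpy()

df = pd.DataFrame({"tma": tma, "marker": marker})

# Mean-centering correction: remove each TMA's mean, restore the grand mean
grand = df["marker"].mean()
df["corrected"] = (df["marker"]
                   - df.groupby("tma")["marker"].transform("mean")
                   + grand)

print(df.groupby("tma")[["marker", "corrected"]].mean().round(2))
```

A caveat the article makes explicit: plain mean-centering assumes the TMAs have comparable case mix; when tumors on different arrays genuinely differ (e.g., by diagnosis era), corrections that adjust for clinical covariates are preferable.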
Affiliation(s)
- Konrad H Stopsack
- Department of Epidemiology, Harvard T.H. Chan School of Public Health
- Molin Wang
- Department of Epidemiology, Harvard T.H. Chan School of Public Health
- Travis A Gerke
- Department of Cancer Epidemiology, Moffitt Cancer Center
- J Bailey Vaselkiv
- Department of Epidemiology, Harvard T.H. Chan School of Public Health
- Massimo Loda
- Department of Pathology, Weill Cornell Medical Center
- Lorelei A Mucci
- Department of Epidemiology, Harvard T.H. Chan School of Public Health
27.
Madjar K, Zucknick M, Ickstadt K, Rahnenführer J. Combining heterogeneous subgroups with graph-structured variable selection priors for Cox regression. BMC Bioinformatics 2021; 22:586. [PMID: 34895139 PMCID: PMC8665528 DOI: 10.1186/s12859-021-04483-z]
Abstract
BACKGROUND Important objectives in cancer research are the prediction of a patient’s risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g., genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous. RESULTS We propose a new Bayesian approach suitable for continuous molecular measurements and survival outcomes that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts. CONCLUSIONS Results demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1186/s12859-021-04483-z.
Affiliation(s)
- Katrin Madjar
- Department of Statistics, TU Dortmund University, 44221, Dortmund, Germany
- Manuela Zucknick
- Department of Biostatistics, Oslo Centre for Biostatistics and Epidemiology, University of Oslo, 0317, Oslo, Norway
- Katja Ickstadt
- Department of Statistics, TU Dortmund University, 44221, Dortmund, Germany
- Jörg Rahnenführer
- Department of Statistics, TU Dortmund University, 44221, Dortmund, Germany
28.
Soeorg H, Sverrisdóttir E, Andersen M, Lund TM, Sessa M. The PHARMACOM-EPI Framework for Integrating Pharmacometric Modelling Into Pharmacoepidemiological Research Using Real-World Data: Application to Assess Death Associated With Valproate. Clin Pharmacol Ther 2021; 111:840-856. [PMID: 34860420 DOI: 10.1002/cpt.2502]
Abstract
In pharmacoepidemiology, it is usually expected that the observed association should be directly or indirectly related to the pharmacological effects of the drug(s) under investigation. Pharmacological effects are, in turn, strongly connected to the pharmacokinetic and pharmacodynamic properties of a drug, which can be characterized and investigated using pharmacometric models. Recently, the use of pharmacometrics has been proposed to provide pharmacological substantiation of pharmacoepidemiological findings derived from real-world data. However, validated frameworks suggesting how to combine these two disciplines for the aforementioned purpose are missing. Therefore, we propose PHARMACOM-EPI, a framework that provides a structured approach to identifying, characterizing, and applying pharmacometric models, with practical details on how to choose software, format datasets, handle missing covariate/dosing data, perform the external evaluation of pharmacometric models in real-world data, and provide pharmacological substantiation of pharmacoepidemiological findings. PHARMACOM-EPI was tested in a proof-of-concept study to pharmacologically substantiate death associated with valproate use in the Danish population aged ≥ 65 years. Pharmacological substantiation of death during a follow-up period of 1 year showed that all individuals who died (n = 169) had individual predictions within the subtherapeutic range, compared with 52.8% of those who did not die (n = 1,084). Of individuals who died, 66.3% (n = 112) had a cause of death possibly related to valproate and 33.7% (n = 57) had a well-defined cause of death unlikely to be related to valproate. This proof-of-concept study showed that PHARMACOM-EPI was able to provide pharmacological substantiation for death associated with valproate use in the study population.
Affiliation(s)
- Hiie Soeorg
- Department of Drug Design and Pharmacology, Pharmacovigilance Research Center, University of Copenhagen, Copenhagen, Denmark; Department of Drug Design and Pharmacology, Pharmacometrics Research Group, University of Copenhagen, Copenhagen, Denmark
- Eva Sverrisdóttir
- Department of Drug Design and Pharmacology, Pharmacometrics Research Group, University of Copenhagen, Copenhagen, Denmark
- Morten Andersen
- Department of Drug Design and Pharmacology, Pharmacovigilance Research Center, University of Copenhagen, Copenhagen, Denmark
- Trine Meldgaard Lund
- Department of Drug Design and Pharmacology, Pharmacometrics Research Group, University of Copenhagen, Copenhagen, Denmark
- Maurizio Sessa
- Department of Drug Design and Pharmacology, Pharmacovigilance Research Center, University of Copenhagen, Copenhagen, Denmark
29.
Naimi AI, Mishler AE, Kennedy EH. Practical Strategies for Mitigating the Unknowable. Am J Epidemiol 2021; 192:kwab202. [PMID: 34268571 DOI: 10.1093/aje/kwab202]
Affiliation(s)
- Alan E Mishler
- Department of Statistics & Data Science, Carnegie Mellon University
- Edward H Kennedy
- Department of Statistics & Data Science, Carnegie Mellon University
30.
Filion KB, Yu YH. Invited Commentary: The Prevalent New-User Design in Pharmacoepidemiology-Challenges and Opportunities. Am J Epidemiol 2021; 190:1349-1352. [PMID: 33350439 DOI: 10.1093/aje/kwaa284]
Abstract
The prevalent new-user design includes a broader study population than the traditional new-user approach that is frequently used in pharmacoepidemiologic research. In an article appearing in this issue (Am J Epidemiol. 2021;190(7):1341-1348), Webster-Clark et al. describe the treatment initiator types included in the prevalent new-user design and contrast the causal questions assessed using a prevalent new-user design versus a new-user design. They further applied a series of simulation studies showing the importance of accounting for treatment history in addition to time since initiation of the comparator in the prevalent new-user design. In this commentary, we put their findings in the broader context with a discussion of the strengths and limitations of the prevalent new-user design and settings where it would be most useful. The prevalent new-user design and new-user design both address unique questions of clinical and public health importance. Real-world evidence generated by pharmacoepidemiologic research is increasingly being used by regulators and other knowledge users to inform their decision-making. Understanding the causal questions addressed by different designs is crucial in this process; the study by Webster-Clark et al. represents an important step in addressing this issue.
31.
Webster-Clark M, Ross RK, Lund JL. Initiator Types and the Causal Question of the Prevalent New-User Design: A Simulation Study. Am J Epidemiol 2021; 190:1341-1348. [PMID: 33350433 DOI: 10.1093/aje/kwaa283]
Abstract
New-user designs restricting to treatment initiators have become the preferred design for studying drug comparative safety and effectiveness using nonexperimental data. This design reduces confounding by indication and healthy-adherer bias at the cost of smaller study sizes and reduced external validity, particularly when assessing a newly approved treatment compared with standard treatment. The prevalent new-user design includes adopters of a new treatment who switched from or previously used standard treatment (i.e., the comparator), expanding study sample size and potentially broadening the study population for inference. Previous work has suggested the use of time-conditional propensity-score matching to mitigate prevalent user bias. In this study, we describe 3 "types" of initiators of a treatment: new users, direct switchers, and delayed switchers. Using these initiator types, we articulate the causal questions answered by the prevalent new-user design and compare them with those answered by the new-user design. We then show, using simulation, how conditioning on time since initiating the comparator (rather than full treatment history) can still result in a biased estimate of the treatment effect. When implemented properly, the prevalent new-user design estimates new and important causal effects distinct from the new-user design.
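The three initiator types can be operationalized from each patient's treatment history. Below is a deliberately simplified sketch in which a single hypothetical grace period separates direct from delayed switchers; real implementations also handle stockpiling, multiple comparators, and censoring:

```python
def initiator_type(days_since_last_a, grace=30):
    """Classify an initiator of study drug B by prior use of comparator A.

    days_since_last_a: days from the most recent A dispensing to B initiation
                       (None if A was never dispensed).
    grace: hypothetical maximum gap, in days, for a 'direct' switch.
    """
    if days_since_last_a is None:
        return "new user"          # no comparator history at all
    if days_since_last_a <= grace:
        return "direct switcher"   # B started right after stopping A
    return "delayed switcher"      # prior A use, then a treatment gap

# Toy patients: (id, days since last comparator dispensing)
for pid, gap in [("p1", None), ("p2", 10), ("p3", 200)]:
    print(pid, initiator_type(gap))
```

The paper's simulation point is that conditioning only on `days_since_last_a` (time since the comparator) while discarding the fuller treatment history can still leave the prevalent new-user estimate biased.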
32.
Abstract
BACKGROUND In perinatal epidemiology, the development of risk prediction models is complicated by parity; how repeat pregnancies influence the predictive accuracy of models that include obstetrical history is unclear. METHODS To assess the influence of repeat pregnancies on the association between predictors and the outcomes, as well as the influence of ignoring the nonindependence between pregnancies, we created four analytical cohorts using the Clinical Practice Research Datalink. The cohorts included (1) first deliveries, (2) a random sample of one delivery per woman, (3) all eligible deliveries per woman, and (4) all eligible deliveries and censoring of follow-up at subsequent pregnancies. Using Plasmode simulations, we varied the predictor-outcome association across cohorts. RESULTS We found minimal differences in the relative contribution of predictors to the overall predictions and the discriminative accuracy of models in the cohort of randomly sampled deliveries versus the all deliveries cohort (C-statistic: 0.62 vs. 0.63; Nagelkerke's R2: 0.03 for both). Accounting for clustering and censoring upon subsequent pregnancies also had negligible influence on model performance. We found important differences in model performance between the models developed in the cohort of first deliveries and the random sample of deliveries. CONCLUSIONS In our study, a model including first deliveries had the best predictive accuracy but was not generalizable to women of varying parities. Moreover, including repeat pregnancies did not improve the predictive accuracy of the models. Multiple models may be needed to improve the transportability and accuracy of prediction models when the outcome of interest is influenced by parity.
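Plasmode simulation, used in this study and in several others in this list, keeps the real covariate structure while generating outcomes with a known, investigator-controlled effect size. A minimal sketch with invented data and coefficients:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 5_000

# Stand-in "real" cohort: one predictor, binary outcome
x = rng.normal(size=n)
y = rng.random(n) < 1 / (1 + np.exp(-(-1.0 + 0.4 * x)))

# Plasmode idea: preserve the empirical covariate distribution by resampling
# real rows, but simulate outcomes from a fitted model whose coefficient is
# set by the investigator (here, doubled to vary the association strength)
fit = LogisticRegression().fit(x.reshape(-1, 1), y)
b0, b1 = fit.intercept_[0], fit.coef_[0, 0]

boot = rng.choice(x, size=n, replace=True)   # keep real covariate structure
beta_target = 2 * b1                         # known "true" effect by design
y_sim = rng.random(n) < 1 / (1 + np.exp(-(b0 + beta_target * boot)))

refit = LogisticRegression().fit(boot.reshape(-1, 1), y_sim)
print(f"target beta = {beta_target:.2f}, recovered = {refit.coef_[0, 0]:.2f}")
```

Because the "true" coefficient is known by construction, bias and coverage of any candidate analysis can be evaluated against it, which is exactly how the varied predictor-outcome associations above were produced.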
33.
Weberpals J, Becker T, Davies J, Schmich F, Rüttinger D, Theis FJ, Bauer-Mehren A. Deep Learning-based Propensity Scores for Confounding Control in Comparative Effectiveness Research: A Large-scale, Real-world Data Study. Epidemiology 2021; 32:378-388. [PMID: 33591049 DOI: 10.1097/ede.0000000000001338]
Abstract
BACKGROUND Due to the non-randomized nature of real-world data, prognostic factors need to be balanced, which is often done by propensity scores (PSs). This study aimed to investigate whether autoencoders, which are unsupervised deep learning architectures, might be leveraged to compute PS. METHODS We selected patient-level data of 128,368 first-line treated cancer patients from the Flatiron Health EHR-derived de-identified database. We trained an autoencoder architecture to learn a lower-dimensional patient representation, which we used to compute PS. To compare the performance of an autoencoder-based PS with established methods, we performed a simulation study. We assessed the balancing and adjustment performance using standardized mean differences, root mean square errors (RMSE), percent bias, and confidence interval coverage. To illustrate the application of the autoencoder-based PS, we emulated the PRONOUNCE trial by applying the trial's protocol elements within an observational database setting, comparing two chemotherapy regimens. RESULTS All methods but the manual variable selection approach led to well-balanced cohorts with average standardized mean differences <0.1. LASSO yielded on average the lowest deviation of resulting estimates (RMSE 0.0205), followed by the autoencoder approach (RMSE 0.0248). Altering the hyperparameter setup in sensitivity analysis, the autoencoder approach led to results similar to LASSO (RMSE 0.0203 and 0.0205, respectively). In the case study, all methods provided a similar conclusion, with point estimates clustered around the null (e.g., HR_autoencoder 1.01 [95% confidence interval = 0.80, 1.27] vs. HR_PRONOUNCE 1.07 [0.83, 1.36]). CONCLUSIONS Autoencoder-based PS computation was a feasible approach to control for confounding but did not perform better than some established approaches like LASSO.
Affiliation(s)
- Janick Weberpals
- Data Science, Pharmaceutical Research and Early Development Informatics (pREDi), Roche Innovation Center Munich (RICM), Penzberg, Germany
- Tim Becker
- xValue GmbH, Willich, Germany, on behalf of Data Science IV, Pharmaceutical Research and Early Development Informatics (pREDi), Roche Innovation Center Munich (RICM), Penzberg, Germany
- Jessica Davies
- F. Hoffmann-La Roche Ltd, Welwyn Garden City, United Kingdom
- Fabian Schmich
- Data Science, Pharmaceutical Research and Early Development Informatics (pREDi), Roche Innovation Center Munich (RICM), Penzberg, Germany
- Dominik Rüttinger
- Early Clinical Development Oncology, Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Munich (RICM), Penzberg, Germany
- Fabian J Theis
- Institute of Computational Biology, German Research Center for Environmental Health, Helmholtz Center Munich, Neuherberg, Germany; Department of Mathematics, Technical University of Munich, Garching, Germany
- Anna Bauer-Mehren
- Data Science, Pharmaceutical Research and Early Development Informatics (pREDi), Roche Innovation Center Munich (RICM), Penzberg, Germany
34.
Acton EK, Hennessy S. Use of prescription drug samples in the US and implications for pharmacoepidemiologic research: a systematic search of the literature. Expert Rev Pharmacoecon Outcomes Res 2021; 21:541-551. [PMID: 33730962 DOI: 10.1080/14737167.2021.1905528]
Abstract
INTRODUCTION Free drug samples are not captured in the pharmacy claims databases used in many pharmacoepidemiologic studies, which could lead to misclassification of drug exposure status and thus bias study results. AREAS COVERED We systematically searched the literature in PubMed/MEDLINE, Embase, and Scopus from database inception to August 2020 for studies assessing the magnitude of exposure misclassification in pharmacy claims data associated with uncaptured drug sample utilization. Our review identified five US-based studies with substantially different characteristics, contexts, methods, and results. Taken together, these studies suggest that the risk of sample-related bias may be higher for (1) studies of newly approved, patented brand-only drugs in specific classes and contexts; (2) studies of populations where sample use is common and the unexposed cohort is small; and (3) studies where the outcomes of interest are expected to be early-onset or acute, with non-constant hazards. EXPERT OPINION In light of declining overall trends in sample use, future research on sample-related exposure misclassification should focus on delineating bias across those modern contexts where sample use remains high and optimizing bias quantification methods to create a more standardized approach. Additionally, further assessment is warranted for other sources of misclassified exposure status in claims-based pharmacoepidemiology research.
Affiliation(s)
- Emily K Acton
- Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; Department of Neurology, Translational Center of Excellence for Neuroepidemiology and Neurology Outcomes Research, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; Center for Pharmacoepidemiology Research and Training, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
- Sean Hennessy
- Department of Biostatistics and Epidemiology, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; Center for Pharmacoepidemiology Research and Training, Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA; Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA
35
Methods to Account for Uncertainty in Latent Class Assignments When Using Latent Classes as Predictors in Regression Models, with Application to Acculturation Strategy Measures. Epidemiology 2021; 31:194-204. [PMID: 31809338 DOI: 10.1097/ede.0000000000001139] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Latent class models have become a popular means of summarizing survey questionnaires and other large sets of categorical variables. Often these classes are of primary interest to better understand complex patterns in data. Increasingly, these latent classes are reified into predictors of other outcomes of interest, treating the most likely class as the true class to which an individual belongs even though there is uncertainty in class membership. This uncertainty can be viewed as a form of measurement error in predictors, leading to bias in the estimates of the regression parameters associated with the latent classes. Despite this fact, there is very limited literature treating latent class predictors as a measurement error problem. Most applications ignore this issue and fit a two-stage model that treats the modal class prediction as truth. Here, we develop two approaches-one likelihood-based, the other Bayesian-to implement a joint model for latent class analysis and outcome prediction. We apply these methods to an analysis of how acculturation behaviors predict depression in South Asian immigrants to the United States. A simulation study gives guidance for when a two-stage model can be safely implemented and when the joint model may be required.
36
Conover MM, Rothman KJ, Stürmer T, Ellis AR, Poole C, Jonsson Funk M. Propensity score trimming mitigates bias due to covariate measurement error in inverse probability of treatment weighted analyses: A plasmode simulation. Stat Med 2021; 40:2101-2112. [PMID: 33622016 DOI: 10.1002/sim.8887] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2019] [Revised: 11/15/2020] [Accepted: 01/08/2021] [Indexed: 11/12/2022]
Abstract
BACKGROUND Inverse probability of treatment weighting (IPTW) may be biased by influential observations, which can occur from misclassification of strong exposure predictors. METHODS We evaluated bias and precision of IPTW estimators in the presence of a misclassified confounder and assessed the effect of propensity score (PS) trimming. We generated 1000 plasmode cohorts of size N = 10 000, sampled with replacement from 6063 NHANES respondents (1999-2014) age 40 to 79 with labs and no statin use. We simulated statin exposure as a function of demographics and CVD risk factors; and outcomes as a function of 10-year CVD risk score and statin exposure (rate ratio [RR] = 0.5). For 5% of the people in selected populations (eg, all patients, exposed, those with outcomes), we randomly misclassified a confounder that strongly predicted exposure. We fit PS models and estimated RRs using IPTW and 1:1 PS matching, with and without asymmetric trimming. RESULTS IPTW bias was substantial when misclassification was differential by outcome (RR range: 0.38-0.63) and otherwise minimal (RR range: 0.51-0.53). However, trimming reduced bias for IPTW, nearly eliminating it at 5% trimming (RR range: 0.49-0.52). In one scenario, when the confounder was misclassified for 5% of those with outcomes (0.3% of cohort), untrimmed IPTW was more biased and less precise (RR = 0.37 [SE(logRR) = 0.21]) than matching (RR = 0.50 [SE(logRR) = 0.13]). After 1% trimming, IPTW estimates were unbiased and more precise (RR = 0.49 [SE(logRR) = 0.12]) than matching (RR = 0.51 [SE(logRR) = 0.14]). CONCLUSIONS Differential misclassification of a strong predictor of exposure resulted in biased and imprecise IPTW estimates. Asymmetric trimming reduced bias, with more precise estimates than matching.
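The asymmetric trimming evaluated here is easy to illustrate on synthetic data. The sketch below is not the paper's plasmode setup: the cohort, the 5% non-differential misclassification, and the Stürmer-style percentile cutoffs are illustrative assumptions. It shows the mechanism, namely that a misclassified strong exposure predictor produces extreme propensity scores, and hence extreme weights, which asymmetric trimming removes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# One strong binary predictor of exposure (C) and one ordinary confounder (X).
C = rng.binomial(1, 0.3, n)
X = rng.normal(size=n)
T = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + 3.0 * C + 0.5 * X))))

# Non-differentially misclassify C for 5% of subjects; the analyst only
# ever sees C_obs and fits the PS to it.
flip = rng.random(n) < 0.05
C_obs = np.where(flip, 1 - C, C)

# Logistic-regression PS fitted by gradient ascent on (1, C_obs, X).
D = np.column_stack([np.ones(n), C_obs, X])
beta = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-D @ beta))
    beta += 0.5 * D.T @ (T - p) / n
ps = 1 / (1 + np.exp(-D @ beta))

w = np.where(T == 1, 1 / ps, 1 / (1 - ps))  # IPTW (ATE) weights

# Asymmetric trimming: drop everyone below the 5th percentile of PS among
# the treated or above the 95th percentile of PS among the untreated.
lo = np.percentile(ps[T == 1], 5)
hi = np.percentile(ps[T == 0], 95)
keep = (ps >= lo) & (ps <= hi)

print(w.max(), w[keep].max(), keep.mean())  # the most extreme weights are removed
```

The subjects generating the largest weights sit in exactly the PS tails that the asymmetric rule discards, which is why trimming stabilizes the IPTW estimator in the paper's differential-misclassification scenarios.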
Affiliation(s)
- Mitchell M Conover
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kenneth J Rothman
- RTI Health Solutions, RTI International, Research Triangle Park, North Carolina, USA; Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, USA
- Til Stürmer
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Alan R Ellis
- School of Social Work, North Carolina State University, Raleigh, North Carolina, USA
- Charles Poole
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Michele Jonsson Funk
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
37
Tao R, Mercaldo ND, Haneuse S, Maronge JM, Rathouz PJ, Heagerty PJ, Schildcrout JS. Two-wave two-phase outcome-dependent sampling designs, with applications to longitudinal binary data. Stat Med 2021; 40:1863-1876. [PMID: 33442883 DOI: 10.1002/sim.8876] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2020] [Revised: 12/07/2020] [Accepted: 12/25/2020] [Indexed: 12/26/2022]
Abstract
Two-phase outcome-dependent sampling (ODS) designs are useful when resource constraints prohibit expensive exposure ascertainment on all study subjects. One class of ODS designs for longitudinal binary data stratifies subjects into three strata according to those who experience the event at none, some, or all follow-up times. For time-varying covariate effects, exclusively selecting subjects with response variation can yield highly efficient estimates. However, if interest lies in the association of a time-invariant covariate, or the joint associations of time-varying and time-invariant covariates with the outcome, then the optimal design is unknown. Therefore, we propose a class of two-wave two-phase ODS designs for longitudinal binary data. We split the second-phase sample selection into two waves, between which an interim design evaluation analysis is conducted. The interim design evaluation analysis uses first-wave data to conduct a simulation-based search for the optimal second-wave design that will improve the likelihood of study success. Although we focus on longitudinal binary response data, the proposed design is general and can be applied to other response distributions. We believe that the proposed designs can be useful in settings where (1) the expected second-phase sample size is fixed and one must tailor stratum-specific sampling probabilities to maximize estimation efficiency, or (2) relative sampling probabilities are fixed across sampling strata and one must tailor sample size to achieve a desired precision. We describe the class of designs, examine finite sampling operating characteristics, and apply the designs to an exemplar longitudinal cohort study, the Lung Health Study.
Affiliation(s)
- Ran Tao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Nathaniel D Mercaldo
- Departments of Radiology and Neurology, Massachusetts General Hospital and Harvard University, Boston, Massachusetts, USA
- Sebastien Haneuse
- Department of Biostatistics, Harvard University, Boston, Massachusetts, USA
- Jacob M Maronge
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Paul J Rathouz
- Department of Population Health, University of Texas, Austin, Texas, USA
- Patrick J Heagerty
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Jonathan S Schildcrout
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
38
Garrido MM, Lum J, Pizer SD. Vector-based kernel weighting: A simple estimator for improving precision and bias of average treatment effects in multiple treatment settings. Stat Med 2020; 40:1204-1223. [PMID: 33327037 DOI: 10.1002/sim.8836] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2019] [Revised: 10/27/2020] [Accepted: 11/14/2020] [Indexed: 11/08/2022]
Abstract
Treatment effect estimation must account for observed confounding, in which factors affect treatment assignment and outcomes simultaneously. Ignoring observed confounding risks concluding that a helpful treatment is not beneficial or that a treatment is safe when actually harmful. Propensity score matching or weighting adjusts for observed confounding, but the best way to use propensity scores for multiple treatments is unknown. It is unclear when choice of a different weighting or matching strategy leads to divergent inferences. We used Monte Carlo simulations (1000 replications) to examine sensitivity of multivalued treatment inferences to propensity score weighting or matching strategies. We consider five variants of propensity score adjustment: inverse probability of treatment weights, generalized propensity score matching, kernel weights (KW), vector matching, and a new hybrid that is easily implemented-vector-based kernel weighting (VBKW). VBKW matches observations with similar propensity score vectors, assigning greater KW to observations with similar probabilities within a given bandwidth. We varied degree of propensity score model misspecification, sample size, treatment effect heterogeneity, initial covariate imbalance, and sample distribution across treatment groups. We evaluated sensitivity of results to propensity score estimation technique (multinomial logit or multinomial probit). Across simulations, VBKW performed equally or better than the other methods in terms of bias, efficiency, and covariate balance measured via prognostic scores. Our simulations suggest that VBKW is amenable to full automation and is less sensitive to PS model misspecification than other methods used to account for observed confounding in multivalued treatment analyses.
Affiliation(s)
- Melissa M Garrido
- Partnered Evidence-based Policy Resource Center, Boston VA Healthcare System, Boston, Massachusetts, USA; Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, Massachusetts, USA
- Jessica Lum
- Partnered Evidence-based Policy Resource Center, Boston VA Healthcare System, Boston, Massachusetts, USA
- Steven D Pizer
- Partnered Evidence-based Policy Resource Center, Boston VA Healthcare System, Boston, Massachusetts, USA; Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, Massachusetts, USA
39
Abstract
Supplemental Digital Content is available in the text. We use simulated data to examine the consequences of depletion of susceptibles for hazard ratio (HR) estimators based on a propensity score (PS). First, we show that the depletion of susceptibles attenuates marginal HRs toward the null by amounts that increase with the incidence of the outcome, the variance of susceptibility, and the impact of susceptibility on the outcome. If susceptibility is binary then the Bross bias multiplier, originally intended to quantify bias in a risk ratio from a binary confounder, also quantifies the ratio of the instantaneous marginal HR to the conditional HR as susceptibles are depleted differentially. Second, we show how HR estimates that are conditioned on a PS tend to be between the true conditional and marginal HRs, closer to the conditional HR if treatment status is strongly associated with susceptibility and closer to the marginal HR if treatment status is weakly associated with susceptibility. We show that associations of susceptibility with the PS matter to the marginal HR in the treated (ATT) though not to the marginal HR in the entire cohort (ATE). Third, we show how the PS can be updated periodically to reduce depletion-of-susceptibles bias in conditional estimators. Although marginal estimators can hit their ATE or ATT targets consistently without updating the PS, we show how their targets themselves can be misleading as they are attenuated toward the null. Finally, we discuss implications for the interpretation of HRs and their relevance to underlying scientific and clinical questions. See video Abstract: http://links.lww.com/EDE/B727.
40
Bykov K, Wang SV, Hallas J, Pottegård A, Maclure M, Gagne JJ. Bias in case-crossover studies of medications due to persistent use: A simulation study. Pharmacoepidemiol Drug Saf 2020; 29:1079-1085. [PMID: 32548875 DOI: 10.1002/pds.5031] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2019] [Revised: 05/01/2020] [Accepted: 05/05/2020] [Indexed: 11/09/2022]
Abstract
PURPOSE The case-crossover design is increasingly used to evaluate the effects of chronic medications; however, as traditionally implemented in pharmacoepidemiology, with referent period preceding the outcome, it may lead to bias in the presence of persistent exposures. We aimed to evaluate the extent and magnitude of bias in case-crossover analyses of chronic and persistent exposures, using simulations. METHODS We simulated cohorts with either 30-day, 180-day, or 2-year exposure duration; and with varying degrees of persistence (10%, 30%, 50%, 70%, or 90% of patients not stopping exposure). We evaluated all scenarios under the null and the scenario with 30% persistence under varying exposure effects (odds ratios of 0.25 to 4.0). Cohorts were analyzed using conditional logistic regression that compared the odds of exposure on the outcome day to the odds of exposure on a referent day 30 days prior to the outcome. We further implemented the case-time-control design to evaluate its ability to adjust for bias from persistence. RESULTS Case-crossover analyses produced unbiased estimates across all scenarios without persistent users, regardless of exposure duration. In scenarios where some patients persisted on treatment, case-crossover analyses resulted in upward bias, which increased with increasing proportion of persistent users, but did not vary substantially in relation to the magnitude of the true effect. Case-time-control analyses removed bias in all scenarios. CONCLUSIONS Investigators should be aware of bias due to treatment persistence in unidirectional case-crossover analyses of chronic medications, which can be remedied with a control group of similarly persistent noncases.
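The core of a unidirectional case-crossover analysis, and the persistence bias this abstract describes, reduces to counting discordant exposure pairs: with one referent period per case, the Mantel-Haenszel odds ratio is simply the ratio of the two discordant counts. A minimal simulation under the null (all parameters illustrative, not the paper's scenarios):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50000

# Null scenario: the outcome day is independent of drug exposure.
start = rng.uniform(0, 365, n)           # day each subject initiates the drug
persistent = rng.random(n) < 0.30        # 30% of users never stop
stop = np.where(persistent, np.inf, start + 30)  # others stop after a 30-day supply
event = rng.uniform(60, 365, n)          # outcome day (room for a prior referent)

def exposed(day):
    return (start <= day) & (day < stop)

e_case = exposed(event)       # exposure status on the outcome day
e_ref = exposed(event - 30)   # exposure status on the referent day, 30 days earlier

# Mantel-Haenszel estimator for 1:1 case-crossover data: only discordant
# person-moments contribute.
n10 = np.sum(e_case & ~e_ref)  # exposed at outcome, unexposed at referent
n01 = np.sum(~e_case & e_ref)  # unexposed at outcome, exposed at referent
or_cc = n10 / n01
print(or_cc)                   # above 1.0 despite the null: bias from persistence
```

The asymmetry arises because everyone who initiates in the 30 days before the event is exposed on the outcome day, while only the non-persistent stoppers can be exposed at the referent and not at the outcome; setting the persistence fraction to zero restores an odds ratio near 1.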
Affiliation(s)
- Katsiaryna Bykov
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Shirley V Wang
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Jesper Hallas
- Clinical Pharmacology and Pharmacy, Department of Public Health, University of Southern Denmark, Odense, Denmark; Department of Clinical Biochemistry and Clinical Pharmacology, Odense University Hospital, Odense, Denmark
- Anton Pottegård
- Clinical Pharmacology and Pharmacy, Department of Public Health, University of Southern Denmark, Odense, Denmark
- Malcolm Maclure
- Department of Anesthesiology, Pharmacology and Therapeutics, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Joshua J Gagne
- Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
41
Ripollone JE, Huybrechts KF, Rothman KJ, Ferguson RE, Franklin JM. Evaluating the Utility of Coarsened Exact Matching for Pharmacoepidemiology Using Real and Simulated Claims Data. Am J Epidemiol 2020; 189:613-622. [PMID: 31845719 DOI: 10.1093/aje/kwz268] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2019] [Revised: 11/14/2019] [Accepted: 11/19/2019] [Indexed: 01/27/2023] Open
Abstract
Coarsened exact matching (CEM) is a matching method proposed as an alternative to other techniques commonly used to control confounding. We compared CEM with 3 techniques that have been used in pharmacoepidemiology: propensity score matching, Mahalanobis distance matching, and fine stratification by propensity score (FS). We evaluated confounding control and effect-estimate precision using insurance claims data from the Pharmaceutical Assistance Contract for the Elderly (1999-2002) and Medicaid Analytic eXtract (2000-2007) databases (United States) and from simulated claims-based cohorts. CEM generally achieved the best covariate balance. However, it often led to high bias and low precision of the risk ratio due to extreme losses in study size and numbers of outcomes (i.e., sparse data bias)-especially with larger covariate sets. FS usually was optimal with respect to bias and precision and always created good covariate balance. Propensity score matching usually performed almost as well as FS, especially with higher index exposure prevalence. The performance of Mahalanobis distance matching was relatively poor. These findings suggest that CEM, although it achieves good covariate balance, might not be optimal for large claims-database studies with rich covariate information; it might be ideal if only a few (<10) strong confounders must be controlled.
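CEM itself is only a few steps: coarsen each covariate into bins, exact-match on the coarsened signature, and reweight the comparison group within retained strata. A hedged sketch on simulated data (the covariates and bin cutpoints are illustrative, not from the paper):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)
n = 5000

age = rng.uniform(40, 80, n)
sev = rng.normal(size=n)  # a disease-severity score
T = rng.binomial(1, 1 / (1 + np.exp(-(0.04 * (age - 60) + 0.8 * sev))))

# Step 1: coarsen each covariate into a few substantively chosen bins.
age_bin = np.digitize(age, [50, 60, 70])
sev_bin = np.digitize(sev, [-1.0, 0.0, 1.0])
keys = list(zip(age_bin, sev_bin))

# Step 2: exact-match on the coarsened signature; only strata containing
# both treated and untreated subjects are retained.
counts = defaultdict(lambda: [0, 0])
for key, t in zip(keys, T):
    counts[key][t] += 1
matched = {k for k, (c0, c1) in counts.items() if c0 > 0 and c1 > 0}

# Step 3: CEM weights. Treated subjects keep weight 1; untreated subjects in
# each retained stratum are reweighted to mirror that stratum's treated count.
w = np.zeros(n)
for i, key in enumerate(keys):
    if key in matched:
        c0, c1 = counts[key]
        w[i] = 1.0 if T[i] == 1 else c1 / c0

# Balance check on age: raw difference vs. CEM-weighted difference in means.
raw_diff = age[T == 1].mean() - age[T == 0].mean()
sel = (w > 0) & (T == 0)
cem_diff = age[(w > 0) & (T == 1)].mean() - np.average(age[sel], weights=w[sel])
print(raw_diff, cem_diff)
```

The sparse-data problem the authors report is visible in this mechanism: with many covariates the number of coarsened strata grows multiplicatively, so ever fewer strata contain both exposure groups and the retained sample shrinks.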
42
Izem R, Liao J, Hu M, Wei Y, Akhtar S, Wernecke M, MaCurdy TE, Kelman J, Graham DJ. Comparison of propensity score methods for pre-specified subgroup analysis with survival data. J Biopharm Stat 2020; 30:734-751. [DOI: 10.1080/10543406.2020.1730868] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Rima Izem
- Division of Biostatistics and Epidemiology, Children’s National Research Institute, and Department of Pediatrics, George Washington University, Washington, USA
- Mao Hu
- Acumen LLC, Burlingame, CA, USA
- Thomas E. MaCurdy
- Acumen LLC, Burlingame, CA, USA; Department of Economics, Stanford University
- Jeffrey Kelman
- The Center for Medicaid at the Centers for Medicare and Medicaid Services, Baltimore, MD, USA
- David J Graham
- Food and Drug Administration, Center for Drug Evaluations and Research, Silver Spring, MD, USA
43
Shi X, Wellman R, Heagerty PJ, Nelson JC, Cook AJ. Safety surveillance and the estimation of risk in select populations: Flexible methods to control for confounding while targeting marginal comparisons via standardization. Stat Med 2020; 39:369-386. [PMID: 31823406 PMCID: PMC7768802 DOI: 10.1002/sim.8410] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2017] [Revised: 09/04/2019] [Accepted: 09/26/2019] [Indexed: 11/07/2022]
Abstract
We consider the critical problem of pharmacosurveillance for adverse events once a drug or medical product is incorporated into routine clinical care. When making inference on comparative safety using large-scale electronic health records, we often encounter an extremely rare binary adverse outcome with a large number of potential confounders. In this context, it is challenging to offer flexible methods to adjust for high-dimensional confounders, whereas use of the propensity score (PS) can help address this challenge by providing both confounding control and dimension reduction. Among PS methods, regression adjustment using the PS as a covariate in an outcome model has been incompletely studied and potentially misused. Previous studies have suggested that simple linear adjustment may not provide sufficient control of confounding. Moreover, no formal representation of the statistical procedure and associated inference has been detailed. In this paper, we characterize a three-step procedure, which performs flexible regression adjustment of the estimated PS followed by standardization to estimate the causal effect in a select population. We also propose a simple variance estimation method for performing inference. Through a realistic simulation mimicking data from the Food and Drug Administration's Sentinel Initiative comparing the effect of angiotensin-converting enzyme inhibitors and beta blockers on incidence of angioedema, we show that flexible regression on the PS resulted in less bias without loss of efficiency, and can outperform other methods when the PS model is correctly specified. In addition, the direct variance estimation method is a computationally fast and reliable approach for inference.
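The three-step procedure characterized here (fit the PS, regress the outcome flexibly on the estimated PS, then standardize) can be sketched as follows. The simulation, the cubic-polynomial choice of "flexible" adjustment, and the continuous outcome are simplifying assumptions for illustration, not the paper's Sentinel-like setup or its variance estimator:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10000

X1 = rng.normal(size=n)
X2 = rng.binomial(1, 0.4, n)
T = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * X1 - 0.5 * X2))))
# Outcome depends nonlinearly on the confounders; the true effect of T is 1.0.
Y = 1.0 * T + np.sin(X1) + X2 * X1 + rng.normal(size=n)

# Step 1: estimate the PS (logistic regression by gradient ascent).
D = np.column_stack([np.ones(n), X1, X2])
beta = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-D @ beta))
    beta += 0.5 * D.T @ (T - p) / n
ps = 1 / (1 + np.exp(-D @ beta))

# Step 2: flexible regression adjustment. Outcome model with treatment, a
# cubic polynomial in the estimated PS, and a treatment-by-PS interaction.
B = np.column_stack([np.ones(n), T, ps, ps**2, ps**3, T * ps])
coef, *_ = np.linalg.lstsq(B, Y, rcond=None)

# Step 3: standardization. Predict for everyone under T=1 and under T=0 and
# average over the chosen target population (here: everyone, i.e. the ATE).
B1 = np.column_stack([np.ones(n), np.ones(n), ps, ps**2, ps**3, ps])
B0 = np.column_stack([np.ones(n), np.zeros(n), ps, ps**2, ps**3, np.zeros(n)])
ate = np.mean(B1 @ coef) - np.mean(B0 @ coef)
print(ate)  # close to the true effect of 1.0
```

Because the PS is a balancing score, conditioning on a sufficiently flexible function of it removes the confounding even though the raw covariates never enter the outcome model; the simple linear-in-PS adjustment the paper warns about corresponds to dropping the polynomial and interaction terms.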
Affiliation(s)
- Xu Shi
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Robert Wellman
- Department of Biostatistics, University of Washington, Seattle, Washington
- Patrick J. Heagerty
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Jennifer C. Nelson
- Department of Biostatistics, University of Washington, Seattle, Washington; Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Andrea J. Cook
- Department of Biostatistics, University of Washington, Seattle, Washington; Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
44
Ding LJ, Schlüter HM, Szucs MJ, Ahmad R, Wu Z, Xu W. Comparison of Statistical Tests and Power Analysis for Phosphoproteomics Data. J Proteome Res 2020; 19:572-582. [PMID: 31789524 DOI: 10.1021/acs.jproteome.9b00280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Advances in protein tagging and mass spectrometry have enabled generation of large quantitative proteome and phosphoproteome data sets for identifying differentially expressed targets in case-control studies. Studying the statistical power of candidate tests is critical for designing strategies for effective target identification and control of experimental cost. Here, we develop a simulation framework to generate realistic phospho-peptide data with known changes between cases and controls. Using this framework, we quantify the performance of traditional t-tests, Bayesian tests, and the ranking-by-fold-change test. Bayesian tests, which share variance information among peptides, outperform the traditional t-tests. Although ranking-by-fold-change has power similar to that of the Bayesian tests, its type I error rate cannot be properly controlled without proper permutation analysis; therefore, simply relying on the ranking is likely to introduce false positives. Two-sample Bayesian tests considering dependencies between intensity and variance are superior for data sets with complex variance. While increasing the sample size enhances the statistical tests' performance, balanced numbers of cases and controls are recommended over a design weighted toward one group. Further, higher peptide standard deviations require higher fold changes to achieve the same statistical power. Together, these results highlight the importance of model-informed experimental design and principled statistical analyses when working with large-scale proteomics and phosphoproteomics data.
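The variance-sharing idea behind the Bayesian tests can be illustrated with a fixed-weight shrinkage of per-peptide variances toward their across-peptide mean. This is a deliberate simplification of limma-style empirical-Bayes fitting (the prior weight d0 is fixed by hand rather than estimated), and all data parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(5)
n_pep, n_rep = 2000, 3                 # many peptides, few replicates per group

sigma2 = rng.gamma(2.0, 0.05, n_pep)   # true per-peptide variances
changed = rng.random(n_pep) < 0.05     # 5% of peptides truly differ
delta = np.where(changed, 1.0, 0.0)

ctrl = rng.normal(0.0, np.sqrt(sigma2)[:, None], (n_pep, n_rep))
case = rng.normal(delta[:, None], np.sqrt(sigma2)[:, None], (n_pep, n_rep))

diff = case.mean(axis=1) - ctrl.mean(axis=1)
s2 = (case.var(axis=1, ddof=1) + ctrl.var(axis=1, ddof=1)) / 2  # pooled variance

# Ordinary two-sample t: unstable with only 2 + 2 degrees of freedom, so
# peptides with accidentally tiny variance estimates get huge statistics.
t_ord = diff / np.sqrt(s2 * 2 / n_rep)

# Moderated t: shrink each peptide's variance toward the across-peptide mean
# with a fixed prior weight d0 (pseudo-replicates borrowed from all peptides).
d0, s0_2, d = 4.0, s2.mean(), 2 * (n_rep - 1)
s2_mod = (d0 * s0_2 + d * s2) / (d0 + d)
t_mod = diff / np.sqrt(s2_mod * 2 / n_rep)

# Compare how many truly changed peptides land in the top-k of each ranking.
k = int(changed.sum())
top = lambda t: np.argsort(-np.abs(t))[:k]
rec_ord = changed[top(t_ord)].mean()
rec_mod = changed[top(t_mod)].mean()
print(rec_ord, rec_mod)
```

With three replicates per group, the raw variance estimates are so noisy that null peptides routinely dominate the ordinary-t ranking; shrinkage tames those spurious extremes, which is the mechanism behind the abstract's finding that variance-sharing tests outperform the traditional t-test.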
Affiliation(s)
- Hannah M Schlüter
- Department of Computing, Imperial College London, South Kensington, London SW7 2AZ, United Kingdom
- Matthew J Szucs
- Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, Massachusetts 02139, United States
- Rushdy Ahmad
- Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, Massachusetts 02139, United States
- Zheyang Wu
- Department of Mathematical Sciences and Program of Bioinformatics and Computational Biology and Program of Data Science, Worcester Polytechnic Institute (WPI), 100 Institute Road, Worcester, Massachusetts 01609, United States
45
Jagdhuber R, Lang M, Stenzl A, Neuhaus J, Rahnenführer J. Cost-Constrained feature selection in binary classification: adaptations for greedy forward selection and genetic algorithms. BMC Bioinformatics 2020; 21:26. [PMID: 31992203 PMCID: PMC6986087 DOI: 10.1186/s12859-020-3361-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2019] [Accepted: 01/10/2020] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND With modern methods in biotechnology, the search for biomarkers has advanced to a challenging statistical task exploring high dimensional data sets. Feature selection is a widely researched preprocessing step to handle huge numbers of biomarker candidates and has special importance for the analysis of biomedical data. Such data sets often include many input features not related to the diagnostic or therapeutic target variable. A less researched, but also relevant aspect for medical applications are costs of different biomarker candidates. These costs are often financial costs, but can also refer to other aspects, for example the decision between a painful biopsy marker and a simple urine test. In this paper, we propose extensions to two feature selection methods to control the total amount of such costs: greedy forward selection and genetic algorithms. In comprehensive simulation studies of binary classification tasks, we compare the predictive performance, the run-time and the detection rate of relevant features for the new proposed methods and five baseline alternatives to handle budget constraints. RESULTS In simulations with a predefined budget constraint, our proposed methods outperform the baseline alternatives, with just minor differences between them. Only in the scenario without an actual budget constraint, our adapted greedy forward selection approach showed a clear drop in performance compared to the other methods. However, introducing a hyperparameter to adapt the benefit-cost trade-off in this method could overcome this weakness. CONCLUSIONS In feature cost scenarios, where a total budget has to be met, common feature selection algorithms are often not suitable to identify well performing subsets for a modelling task. Adaptations of these algorithms such as the ones proposed in this paper can help to tackle this problem.
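Budget-constrained greedy forward selection can be sketched directly. Scoring candidates by fit improvement per unit cost is one reasonable adaptation of the kind described (the exact benefit-cost trade-off used in the paper may differ), and all data, costs, and the budget below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 600, 12

X = rng.normal(size=(n, p))
# Only the first four features carry signal; each feature has its own cost.
beta = np.array([2.0, 1.5, 1.0, 0.8] + [0.0] * (p - 4))
y = X @ beta + rng.normal(size=n)
cost = rng.uniform(1, 5, p)
budget = 8.0

def rss(feats):
    """Residual sum of squares of an OLS fit on the chosen features."""
    if not feats:
        return np.sum((y - y.mean()) ** 2)
    A = np.column_stack([np.ones(n), X[:, feats]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ coef) ** 2)

# Greedy forward selection under a total-cost budget: at each step add the
# affordable feature with the largest RSS improvement per unit cost; stop
# when no remaining feature fits in the budget.
selected, spent = [], 0.0
while True:
    best, best_gain = None, 0.0
    base = rss(selected)
    for j in range(p):
        if j in selected or spent + cost[j] > budget:
            continue
        gain = (base - rss(selected + [j])) / cost[j]
        if gain > best_gain:
            best, best_gain = j, gain
    if best is None:
        break
    selected.append(best)
    spent += cost[best]
print(selected, round(spent, 2))
```

Because any added feature reduces RSS at least slightly, a plain greedy rule keeps spending until the budget is exhausted; this is the behavior behind the paper's observation that, without a tunable benefit-cost hyperparameter, the greedy approach degrades when no real budget constraint binds.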
Affiliation(s)
- Rudolf Jagdhuber
- Department of Statistics, TU Dortmund, Vogelpothsweg 87, Dortmund, 44227 Germany
- numares AG, Am BioPark 9, Regensburg, 93053 Germany
- Michel Lang
- Department of Statistics, TU Dortmund, Vogelpothsweg 87, Dortmund, 44227 Germany
- Arnulf Stenzl
- Klinik für Urologie, Universitätsklinikum Tübingen, Hoppe-Seyler-Str. 3, Tübingen, 72076 Germany
- Jochen Neuhaus
- Universitätsklinikum Leipzig AöR, Department für Operative Medizin, Klinik und Poliklinik für Urologie, Liebigstr. 20, Leipzig, 04103 Germany
- Jörg Rahnenführer
- Department of Statistics, TU Dortmund, Vogelpothsweg 87, Dortmund, 44227 Germany
46
Visualization tool of variable selection in bias-variance tradeoff for inverse probability weights. Ann Epidemiol 2020; 41:56-59. [PMID: 31982245 DOI: 10.1016/j.annepidem.2019.12.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2019] [Revised: 11/27/2019] [Accepted: 12/10/2019] [Indexed: 11/23/2022]
Abstract
PURPOSE Inverse probability weighted (IPW) estimators are commonly used to adjust for time-fixed or time-varying confounders. However, in high-dimensional settings, including all identified confounders may result in unstable weights, leading to higher variance. We aimed to develop a visualization tool demonstrating the impact of each confounder on the bias and variance of IPW estimates, as well as on the propensity score overlap. METHODS A SAS macro was developed for this visualization tool, and we demonstrate how it can be used to identify potentially problematic confounders of the association of statin use after myocardial infarction with one-year mortality in a plasmode simulation study using a cohort of 39,792 patients from the UK (1998-2012). RESULTS Through the tool's output, we can identify problematic confounders (two instrumental variables) and important confounders by comparing the estimated pseudo-MSE with that of the fully adjusted model and by inspecting the propensity score overlap plot. CONCLUSION Our results suggest that the analytic impact of all confounders should be considered carefully when fitting IPW estimators.
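The bias-variance trade-off this tool visualizes can be reproduced in miniature: adding an instrument-like variable to the PS model leaves the effect estimate roughly unbiased but inflates the variance of the weights. Everything below is a simulated assumption, not the macro's UK cohort or its pseudo-MSE diagnostic:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000

C = rng.normal(size=n)  # true confounder: affects treatment and outcome
Z = rng.normal(size=n)  # instrument-like variable: affects treatment only
T = rng.binomial(1, 1 / (1 + np.exp(-(0.6 * C + 1.5 * Z))))
Y = 2.0 * T + 1.0 * C + rng.normal(size=n)  # true treatment effect = 2.0

def fit_ps(D):
    """Logistic regression by gradient ascent; returns the fitted PS."""
    b = np.zeros(D.shape[1])
    for _ in range(2000):
        p = 1 / (1 + np.exp(-D @ b))
        b += 0.5 * D.T @ (T - p) / n
    return 1 / (1 + np.exp(-D @ b))

results = {}
pt = T.mean()
for label, D in [("C only", np.column_stack([np.ones(n), C])),
                 ("C + instrument", np.column_stack([np.ones(n), C, Z]))]:
    ps = fit_ps(D)
    sw = np.where(T == 1, pt / ps, (1 - pt) / (1 - ps))  # stabilized weights
    est = (np.sum(sw * T * Y) / np.sum(sw * T)
           - np.sum(sw * (1 - T) * Y) / np.sum(sw * (1 - T)))
    results[label] = (est, sw.var())
    print(label, results[label])
```

Plotting the weight variance (or a pseudo-MSE) as each candidate variable enters the PS model is exactly the kind of per-confounder diagnostic the abstract advocates; the instrument adds weight variance without removing any confounding.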
47
Use of Time-Dependent Propensity Scores to Adjust Hazard Ratio Estimates in Cohort Studies with Differential Depletion of Susceptibles. Epidemiology 2020; 31:82-89. [DOI: 10.1097/ede.0000000000001107]
48
Edelmann D, Hummel M, Hielscher T, Saadati M, Benner A. Marginal variable screening for survival endpoints. Biom J 2019; 62:610-626. [DOI: 10.1002/bimj.201800269]
Affiliation(s)
- Dominic Edelmann
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Manuela Hummel
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Thomas Hielscher
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Maral Saadati
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Axel Benner
- Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
49
Missing Data in Marginal Structural Models: A Plasmode Simulation Study Comparing Multiple Imputation and Inverse Probability Weighting. Med Care 2019; 57:237-243. [PMID: 30664611] [DOI: 10.1097/mlr.0000000000001063]
Abstract
BACKGROUND The use of marginal structural models (MSMs) to adjust for time-varying confounding has increased in epidemiologic studies. However, in the setting of MSMs, recommendations for how best to handle missing data are contradictory. We present a plasmode simulation study to compare the validity and precision of MSM estimates using complete case analysis (CC), multiple imputation (MI), and inverse probability weighting (IPW) in the presence of missing data on time-independent and time-varying confounders. MATERIALS AND METHODS Simulations were based on a cohort substudy using data from the Osteoarthritis Initiative, which estimated the marginal causal effect of intra-articular injection use on yearly changes in knee pain. We simulated 81 scenarios with parameter values varied on missingness mechanism (MCAR, MAR, and MNAR), percentage of missing data (10%, 20%, and 30%), type of confounder affected (time-independent, time-varying, or both), and analytical approach (CC, IPW, and MI). The performance of the CC, IPW, and MI methods was compared using relative bias, mean squared error of the estimates of interest, and empirical power. RESULTS Across scenarios defined by missing data mechanism, extent of missing data, and confounder type, MI generally produced less biased estimates (range: 1.2%-6.7%) with better precision (range: 0.17-0.18) compared with IPW (relative bias: -5.3% to 8.0%; precision: 0.19-0.53). Empirical power was constant across the scenarios using MI. CONCLUSIONS Under simple yet realistically constructed scenarios, MI seems to confer an advantage over IPW in MSM applications.
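As background to the IPW arm of this comparison: MSMs are typically fitted with stabilized inverse probability weights, which replace the plain 1/P(A|L) weights to tame weight variability. A minimal single-time-point sketch in Python on simulated data (hypothetical variables, not the Osteoarthritis Initiative cohort):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 3000
L = rng.normal(size=n)                           # confounder
a = rng.binomial(1, 1 / (1 + np.exp(-1.2 * L)))  # exposure depends on L

# Denominator: conditional P(A = a_i | L_i); numerator: marginal P(A = a_i).
ps = LogisticRegression().fit(L.reshape(-1, 1), a).predict_proba(L.reshape(-1, 1))[:, 1]
num = np.where(a == 1, a.mean(), 1 - a.mean())
den = np.where(a == 1, ps, 1 - ps)
sw = num / den  # for time-varying exposures, MSMs take the product over visits

print(f"mean stabilized weight: {sw.mean():.3f}")  # close to 1 by construction
```

Under IPW handling of missing confounder data, the denominator model can only use complete records, which is one route by which the weights lose precision relative to MI in the scenarios above.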
50
Tian Y, Schuemie MJ, Suchard MA. Evaluating large-scale propensity score performance through real-world and synthetic data experiments. Int J Epidemiol 2019; 47:2005-2014. [PMID: 29939268] [DOI: 10.1093/ije/dyy120]
Abstract
Background Propensity score adjustment is a popular approach for confounding control in observational studies. Reliable frameworks are needed to determine relative propensity score performance in large-scale studies, and to establish optimal propensity score model selection methods. Methods We detail a propensity score evaluation framework that includes synthetic and real-world data experiments. Our synthetic experimental design extends the 'plasmode' framework and simulates survival data under known effect sizes, and our real-world experiments use a set of negative control outcomes with presumed null effect sizes. In reproductions of two published cohort studies, we compare two propensity score estimation methods that contrast in their model selection approach: L1-regularized regression, which conducts a penalized likelihood regression, and the 'high-dimensional propensity score' (hdPS), which employs a univariate covariate screen. We evaluate the methods on a range of outcome-dependent and outcome-independent metrics. Results L1-regularized propensity score methods achieve superior model fit, covariate balance and negative control bias reduction compared with the hdPS. Simulation results are mixed and fluctuate with simulation parameters, revealing a limitation of simulation under the proportional hazards framework. Including regularization with the hdPS reduces commonly reported non-convergence issues but has little effect on propensity score performance. Conclusions L1-regularization incorporates all covariates simultaneously into the propensity score model and offers propensity score performance superior to the hdPS marginal screen.
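The L1-regularized approach this abstract favors can be sketched with off-the-shelf tools. The snippet below is an illustration on simulated covariates, not the authors' pipeline, and the penalty strength `C=0.1` is an arbitrary choice for the demonstration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n, p = 2000, 200                        # many candidate covariates
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 1.0                          # only five truly drive exposure
a = rng.binomial(1, 1 / (1 + np.exp(-X @ beta)))

# The L1 penalty considers all covariates jointly and shrinks irrelevant
# coefficients to exactly zero, unlike a univariate (hdPS-style) screen.
ps_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, a)
n_selected = int((ps_model.coef_ != 0).sum())
print(f"{n_selected} of {p} covariates retained in the propensity model")
```

In practice the penalty strength would be chosen by cross-validation rather than fixed, which is what makes the joint-selection approach scale to the large covariate sets the abstract targets.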
Affiliation(s)
- Yuxi Tian
- Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA
- Martijn J Schuemie
- Epidemiology Department, Janssen Research and Development LLC, Titusville, NJ, USA
- Marc A Suchard
- Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA; Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA, USA