1
Bottigliengo D, Baldi I, Lanera C, Lorenzoni G, Bejko J, Bottio T, Tarzia V, Carrozzini M, Gerosa G, Berchialla P, Gregori D. Oversampling and replacement strategies in propensity score matching: a critical review focused on small sample size in clinical settings. BMC Med Res Methodol 2021; 21:256. [PMID: 34809559 PMCID: PMC8609749 DOI: 10.1186/s12874-021-01454-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/14/2021] [Accepted: 10/26/2021] [Indexed: 12/03/2022] Open
Abstract
Background Propensity score matching is a statistical method that is often used to make inferences on treatment effects in observational studies. In recent years, the technique has been widely used in the cardiothoracic surgery literature to evaluate the potential benefits of new surgical therapies or procedures. However, the small sample size and the strong dependence of the treatment assignment on the baseline covariates that often characterize these studies make such an evaluation statistically challenging. In such settings, the use of propensity score matching in combination with oversampling and replacement may provide a solution to these issues by increasing the initial sample size of the study and thus improving the statistical power that is needed to detect the effect of interest. In this study, we review the use of propensity score matching in combination with oversampling and replacement in small sample size settings. Methods We performed a series of Monte Carlo simulations to evaluate how the sample size, the proportion of treated, and the assignment mechanism affect the performance of the proposed approaches. We assessed performance with overall balance, relative bias, root mean squared error and nominal coverage. Moreover, we illustrate the methods using a real case study from the cardiac surgery literature. Results Matching without replacement produced estimates with lower bias and better nominal coverage than matching with replacement when 1:1 matching was considered. In contrast, matching with replacement showed better balance, relative bias, and root mean squared error than matching without replacement for increasing levels of oversampling. The best nominal coverage was obtained by using the estimator that accounts for uncertainty in the matching procedure on sets of units obtained after matching with replacement.
Conclusions The use of replacement provides the most reliable treatment effect estimates, and no more than 1 or 2 units from the control group should be matched to each treated observation. Moreover, the variance estimator that accounts for the uncertainty in the matching procedure should be used to estimate the treatment effect. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-021-01454-z.
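To make the distinction between matching with and without replacement concrete, here is a minimal sketch of greedy 1:1 nearest-neighbor matching on the propensity score. It is an illustration only and not the authors' implementation: the function name `greedy_match` and the toy propensity scores are hypothetical, and the scores are assumed to have been estimated beforehand.

```python
def greedy_match(treated_ps, control_ps, replace=False):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Returns one control index per treated unit, or None when no control
    is left (only possible when matching without replacement).
    """
    available = set(range(len(control_ps)))
    matches = []
    for ps in treated_ps:
        if not available:
            matches.append(None)
            continue
        # Closest control by absolute score distance; ties go to the lower index.
        best = min(available, key=lambda j: (abs(control_ps[j] - ps), j))
        matches.append(best)
        if not replace:
            # Without replacement, a control can be used only once.
            available.remove(best)
    return matches


# Hypothetical propensity scores for three treated and three control units.
treated = [0.62, 0.60, 0.30]
controls = [0.61, 0.35, 0.10]

with_repl = greedy_match(treated, controls, replace=True)
without_repl = greedy_match(treated, controls, replace=False)
```

With replacement, the same close control can serve several treated units, which keeps matches close but reuses information. Without replacement, later treated units may be forced onto more distant controls, which is one reason balance can deteriorate in small samples.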
Affiliation(s)
- Daniele Bottigliengo
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy
- Ileana Baldi
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy
- Corrado Lanera
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy
- Giulia Lorenzoni
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy
- Jonida Bejko
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy
- Tomaso Bottio
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy
- Vincenzo Tarzia
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy
- Massimiliano Carrozzini
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy
- Gino Gerosa
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy
- Paola Berchialla
  - Department of Clinical and Biological Sciences, University of Torino, Torino, Italy
- Dario Gregori
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy.
2
Chu AA, Li W, Zhu YQ, Meng XX, Liu GY. Effect of coronary collateral circulation on the prognosis of elderly patients with acute ST-segment elevation myocardial infarction treated with underwent primary percutaneous coronary intervention. Medicine (Baltimore) 2019; 98:e16502. [PMID: 31374011 PMCID: PMC6709020 DOI: 10.1097/md.0000000000016502] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023] Open
Abstract
To investigate the effect of coronary collateral circulation (CCC) on the prognosis of elderly patients with acute ST-segment elevation myocardial infarction (STEMI) and acute total occlusion (ATO) of a single epicardial coronary artery. Three hundred forty-six advanced-age patients (age ≥60 years) with STEMI and ATO who underwent primary percutaneous coronary intervention (PCI) were enrolled in this study. According to the Rentrop grades, the patients were assigned to the poor CCC group (Rentrop grade 0-1) and good CCC group (Rentrop grade 2-3). Multivariate logistic regression analysis revealed that poor coronary collateral circulation was an independent factor for Killip class ≥2 (odds ratio [OR]: -1.559; 95% confidence interval [CI]: 1.346-2.378; P = .013), the use of an intra-aortic balloon pump (IABP) (OR: -1.302; 95% CI: 0.092-0.805; P = .019), and myocardial blush grade (MBG) 3 (OR: 1.516; 95% CI: 2.148-9.655; P < .001). We completed a 12-month follow-up, during which 52 patients (15.0%) were lost to follow-up and 19 patients (5.5%) died. Univariate analysis (Kaplan-Meier and log-rank tests) suggested that poor CCC had a significant effect on all-cause mortality (P = .046), while multivariate analysis (Cox regression analysis) indicated that CCC had no statistically significant effect on all-cause mortality (P = .089) after the exclusion of other confounding factors. After excluding the influence of other confounding factors, this study showed that the mortality rate increased by 26.9% within 1 year for every 1-hour increment of time of onset. The mortality rate in patients with Killip class ≥2 was 8.287 times higher than that in patients with Killip class 0 to 1. The mortality rate in patients over 75 years was 8.25 times higher than that in patients aged 60 to 75 years.
The mortality rate in patients with myocardial blush grade 3 (MBG 3) was 5.7% higher than that in patients with MBG 0-2. The conditions of CCC in the acute phase had no significant direct effect on all-cause mortality in patients, but those with good CCC had a higher rate of MBG 3 after primary PCI and a lower rate of Killip ≥2.
Affiliation(s)
- Ai-Ai Chu
  - Department of Cardiology, Gansu Provincial Hospital
- Wei Li
  - Department of Cardiology, Qinghai Provincial Hospital, Xining
- You-Qi Zhu
  - Heart Center, The First Affiliated Hospital, Lanzhou University, Lanzhou
- Xiao-Xue Meng
  - Heart Center, The First Affiliated Hospital, Lanzhou University, Lanzhou
- Guo-Yong Liu
  - Heart Center, The First Affiliated Hospital, Lanzhou University, Lanzhou
  - Weihai Municipal Hospital, Shandong Province, China
3
Austin PC. Statistical criteria for selecting the optimal number of untreated subjects matched to each treated subject when using many-to-one matching on the propensity score. Am J Epidemiol 2010; 172:1092-7. [PMID: 20802241 PMCID: PMC2962254 DOI: 10.1093/aje/kwq224] [Citation(s) in RCA: 410] [Impact Index Per Article: 29.3] [Received: 04/21/2010] [Accepted: 06/18/2010] [Indexed: 12/20/2022] Open
Abstract
Propensity-score matching is increasingly being used to estimate the effects of treatments using observational data. In many-to-one (M:1) matching on the propensity score, M untreated subjects are matched to each treated subject using the propensity score. The authors used Monte Carlo simulations to examine the effect of the choice of M on the statistical performance of matched estimators. They considered matching 1-5 untreated subjects to each treated subject using both nearest-neighbor matching and caliper matching in 96 different scenarios. Increasing the number of untreated subjects matched to each treated subject tended to increase the bias in the estimated treatment effect; conversely, increasing the number of untreated subjects matched to each treated subject decreased the sampling variability of the estimated treatment effect. Using nearest-neighbor matching, the mean squared error of the estimated treatment effect was minimized in 67.7% of the scenarios when 1:1 matching was used. Using nearest-neighbor matching or caliper matching, the mean squared error was minimized in approximately 84% of the scenarios when, at most, 2 untreated subjects were matched to each treated subject. The authors recommend that, in most settings, researchers match either 1 or 2 untreated subjects to each treated subject when using propensity-score matching.
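The bias side of the trade-off described above can be illustrated with a small sketch of M:1 nearest-neighbor matching: as M grows, each treated subject is paired with progressively more distant untreated subjects. This is a simplified illustration, not the simulation design of the paper; the names `match_m_to_one` and `mean_match_distance` and the toy scores are hypothetical, and matching is done without replacement and without a caliper.

```python
def match_m_to_one(treated_ps, control_ps, m):
    """Greedy M:1 nearest-neighbor matching without replacement.

    Returns, for each treated subject, the indices of up to m untreated
    subjects with the closest propensity scores still available.
    """
    available = set(range(len(control_ps)))
    matched_sets = []
    for ps in treated_ps:
        # Rank remaining untreated subjects by distance; ties go to the lower index.
        ranked = sorted(available, key=lambda j: (abs(control_ps[j] - ps), j))
        chosen = ranked[:m]
        available.difference_update(chosen)
        matched_sets.append(chosen)
    return matched_sets


def mean_match_distance(treated_ps, control_ps, matched_sets):
    """Average absolute propensity-score distance over all matched pairs."""
    dists = [
        abs(control_ps[j] - ps)
        for ps, chosen in zip(treated_ps, matched_sets)
        for j in chosen
    ]
    return sum(dists) / len(dists)


# Hypothetical scores: one treated subject, four untreated subjects.
treated = [0.50]
controls = [0.48, 0.55, 0.30, 0.90]

sets_1 = match_m_to_one(treated, controls, m=1)
sets_2 = match_m_to_one(treated, controls, m=2)
dist_1 = mean_match_distance(treated, controls, sets_1)
dist_2 = mean_match_distance(treated, controls, sets_2)
```

In this toy example the average match distance is larger for M = 2 than for M = 1, which mirrors the increase in bias reported above, while the larger matched sample is what reduces sampling variability.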
Affiliation(s)
- Peter C Austin
- Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada.
4
Austin PC. Primer on statistical interpretation or methods report card on propensity-score matching in the cardiology literature from 2004 to 2006: a systematic review. Circ Cardiovasc Qual Outcomes 2010; 1:62-7. [PMID: 20031790 DOI: 10.1161/circoutcomes.108.790634] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.9] [Indexed: 11/16/2022]
Abstract
BACKGROUND Propensity-score matching is frequently used in the cardiology literature. Recent systematic reviews have found that this method is, in general, poorly implemented in the medical literature. The study objective was to examine the quality of the implementation of propensity-score matching in the general cardiology literature. METHODS AND RESULTS A total of 44 articles published in the American Heart Journal, the American Journal of Cardiology, Circulation, the European Heart Journal, Heart, the International Journal of Cardiology, and the Journal of the American College of Cardiology between January 1, 2004, and December 31, 2006, were examined. Twenty of the 44 studies did not provide adequate information on how the propensity-score-matched pairs were formed. Fourteen studies did not report whether matching on the propensity score balanced baseline characteristics between treated and untreated subjects in the matched sample. Only 4 studies explicitly used statistical methods appropriate for matched studies to compare baseline characteristics between treated and untreated subjects. Only 11 (25%) of the 44 studies explicitly used statistical methods appropriate for the analysis of matched data when estimating the effect of treatment on the outcomes. Only 2 studies described the matching method used, assessed balance in baseline covariates by appropriate methods, and used appropriate statistical methods to estimate the treatment effect and its significance. CONCLUSIONS Application of propensity-score matching was poor in the cardiology literature. Suggestions for improving the reporting and analysis of studies that use propensity-score matching are provided.
Affiliation(s)
- Peter C Austin
- Institute for Clinical Evaluative Sciences, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario M4N 3M5, Canada.
5
Austin PC. Assessing balance in measured baseline covariates when using many-to-one matching on the propensity-score. Pharmacoepidemiol Drug Saf 2009; 17:1218-25. [PMID: 18972455 DOI: 10.1002/pds.1674] [Citation(s) in RCA: 137] [Impact Index Per Article: 9.1] [Indexed: 11/09/2022]
Abstract
The propensity score is defined to be a subject's probability of treatment selection, conditional on observed baseline covariates. Conditional on the propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a commonly used propensity score method for estimating the effects of treatment on outcomes. Balance diagnostics have been previously described for use when 1:1 matching on the propensity score is employed. We illustrate that these methods can be misleading when many-to-one matching on the propensity score is employed. We then propose modifications of these methods that involve weighting each untreated subject by the inverse of the number of untreated subjects in the matched set. We describe both quantitative and qualitative methods to assess the balance in baseline covariates between treated and untreated subjects in a sample obtained by many-to-one matching on the propensity score. The quantitative method uses the weighted standardized difference. The qualitative methods employ graphical methods to compare the distribution of continuous baseline covariates between treated and untreated subjects in the weighted sample. We illustrate our methods using a large sample of patients discharged from hospital with a diagnosis of a heart attack (acute myocardial infarction). The exposure was receipt of a prescription for a statin at hospital discharge.
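The weighted standardized difference mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact estimator: treated subjects are given weight 1, each untreated subject is weighted by the inverse of the number of untreated subjects in its matched set, and the weighted variance uses a commonly used reliability-weight form that reduces to the ordinary sample variance when all weights are equal. The function names and the toy ages are hypothetical.

```python
from math import sqrt


def weighted_mean(x, w):
    """Weighted mean of x with weights w."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x, w):
    """Weighted variance (assumed form); equals the sample variance for equal weights."""
    m = weighted_mean(x, w)
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    return sw / (sw * sw - sw2) * sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w))


def weighted_std_diff(x_treated, x_control, w_control):
    """Standardized difference of a covariate in a many-to-one matched sample."""
    w_treated = [1.0] * len(x_treated)
    diff = weighted_mean(x_treated, w_treated) - weighted_mean(x_control, w_control)
    pooled_sd = sqrt(
        (weighted_var(x_treated, w_treated) + weighted_var(x_control, w_control)) / 2
    )
    return diff / pooled_sd


# Hypothetical ages: two treated subjects, each matched to two untreated subjects,
# so every untreated subject gets weight 1/2.
age_treated = [60.0, 70.0]
age_control = [58.0, 62.0, 70.0, 74.0]
w_control = [0.5, 0.5, 0.5, 0.5]

d = weighted_std_diff(age_treated, age_control, w_control)
```

The absolute value of `d` would then be compared against whatever balance threshold the analyst has chosen; the abstract itself does not specify one.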
Affiliation(s)
- Peter C Austin
- Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada.
6
Toumpoulis IK, Anagnostopoulos CE, Ioannidis JP, Toumpoulis SK, Chamogeorgakis T, Swistel DG, Derose JJ. The importance of independent risk-factors for long-term mortality prediction after cardiac surgery. Eur J Clin Invest 2006; 36:599-607. [PMID: 16919041 DOI: 10.1111/j.1365-2362.2006.01703.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
The purpose of the present study was to determine independent predictors for long-term mortality after cardiac surgery. The European System for Cardiac Operative Risk Evaluation (EuroSCORE) was developed to score in-hospital mortality and recent studies have shown its ability to predict long-term mortality as well. We compared forecasts based on EuroSCORE with other models based on independent predictors. Medical records of patients with cardiac surgery who were discharged alive (n = 4852) were retrospectively reviewed. Their operative surgical risks were calculated according to EuroSCORE. Patients were randomly divided into two groups: training dataset (n = 3233) and validation dataset (n = 1619). Long-term survival data (mean follow-up 5.1 years) were obtained from the National Death Index. We compared four models: standard EuroSCORE (M1); logistic EuroSCORE (M2); M2 and other preoperative, intra-operative and post-operative selected variables (M3); and selected variables only (M4). M3 and M4 were determined with multivariable Cox regression analysis using the training dataset. The estimated five-year survival rates by risk quartile in the validation dataset were: 94.5%, 87.8%, 77.1%, 64.9% for M1; 95.1%, 88.0%, 80.5%, 64.4% for M2; 93.4%, 89.4%, 80.8%, 64.1% for M3; and 95.8%, 90.9%, 81.0%, 59.9% for M4. In the four models, the odds of death in the highest-risk quartile were 8.4-, 8.5-, 9.4- and 15.6-fold higher, respectively, than the odds of death in the lowest-risk quartile (P < 0.0001 for all). EuroSCORE is a good predictor of long-term mortality after cardiac surgery. We developed and validated a model using selected preoperative, intra-operative and post-operative variables that has better discriminatory ability.
Affiliation(s)
- I K Toumpoulis
- College of Physicians and Surgeons of Columbia University, New York, USA.