1
|
Nieser KJ, Harris AHS. Split-sample reliability estimation in health care quality measurement: Once is not enough. Health Serv Res 2024; 59:e14310. [PMID: 38659301 PMCID: PMC11250135 DOI: 10.1111/1475-6773.14310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024] Open
Abstract
OBJECTIVE To examine the sensitivity of split-sample reliability estimates to the random split of the data and propose alternative methods for improving the stability of the split-sample method. DATA SOURCES AND STUDY SETTING Data were simulated to reflect a variety of real-world quality measure distributions and scenarios. There is no date range to report as the data are simulated. STUDY DESIGN Simulation studies of split-sample reliability estimation were conducted under varying practical scenarios. DATA COLLECTION/EXTRACTION METHODS All data were simulated using functions in R. PRINCIPAL FINDINGS Single split-sample reliability estimates can be very dependent on the random split of the data, especially in low sample size and low variability settings. Averaging split-sample estimates over many splits of the data can yield a more stable reliability estimate. CONCLUSIONS Measure developers and evaluators using the split-sample reliability method should average a series of reliability estimates calculated from many resamples of the data without replacement to obtain a more stable reliability estimate.
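The recommendation in this abstract can be illustrated with a short sketch. It is not the authors' code (their simulations were run in R). It assumes a binary patient-level outcome, the mean outcome as each entity's measure score, a Pearson correlation between half-sample scores, and a Spearman-Brown correction; all function names and the simulated data are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_sample_reliability(outcomes_by_entity, rng):
    """One random split: halve each entity's patients without replacement,
    score each half, and correlate the half scores across entities."""
    first, second = [], []
    for outcomes in outcomes_by_entity:
        shuffled = outcomes[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first.append(mean(shuffled[:half]))
        second.append(mean(shuffled[half:]))
    r = pearson(first, second)
    # Spearman-Brown step-up from half-sample to full-sample reliability
    # (an assumption of this sketch, not stated in the abstract).
    return 2 * r / (1 + r)


def averaged_split_sample_reliability(outcomes_by_entity, n_splits=200, seed=0):
    """Average the estimate over many random splits; also report the range
    to show how much a single split can vary."""
    rng = random.Random(seed)
    estimates = [split_sample_reliability(outcomes_by_entity, rng)
                 for _ in range(n_splits)]
    return mean(estimates), min(estimates), max(estimates)


# Hypothetical simulated data: 40 entities, 60 patients each.
sim = random.Random(42)
true_rates = [sim.uniform(0.05, 0.35) for _ in range(40)]
data = [[1 if sim.random() < p else 0 for _ in range(60)] for p in true_rates]

avg, low, high = averaged_split_sample_reliability(data)
print(f"averaged estimate: {avg:.3f}; single-split range: {low:.3f} to {high:.3f}")
```

The gap between `low` and `high` shows the split-to-split variability that a single estimate hides, while `avg` is the more stable quantity the abstract recommends reporting.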
Affiliation(s)
- Kenneth J. Nieser
- Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California, USA
- Stanford‐Surgery Policy Improvement Research and Education Center, Department of Surgery, Stanford University, Stanford, California, USA
| | - Alex H. S. Harris
- Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California, USA
- Stanford‐Surgery Policy Improvement Research and Education Center, Department of Surgery, Stanford University, Stanford, California, USA
| |
|
2
|
Jones DW, Simons JP, Osborne NH, Schermerhorn M, Dimick JB, Schanzer A. Earned outcomes correlate with reliability-adjusted surgical mortality after abdominal aortic aneurysm repair and predict future performance. J Vasc Surg 2024:S0741-5214(24)01082-6. [PMID: 38697233 DOI: 10.1016/j.jvs.2024.04.056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 04/18/2024] [Accepted: 04/25/2024] [Indexed: 05/04/2024]
Abstract
OBJECTIVE Cumulative, probability-based metrics are regularly used to measure quality in professional sports, but these methods have not been applied to health care delivery. These techniques have the potential to be particularly useful in describing surgical quality, where case volume is variable and outcomes tend to be dominated by statistical "noise." The established statistical technique used to adjust for differences in case volume is reliability-adjustment, which emphasizes statistical "signal" but has several limitations. We sought to validate a novel measure of surgical quality based on earned outcomes methods (deaths above average [DAA]) against reliability-adjusted mortality rates, using abdominal aortic aneurysm (AAA) repair outcomes to illustrate the measure's performance. METHODS Earned outcomes methods were used to calculate the outcome of interest for each patient: DAA. Hospital-level DAA was calculated for non-ruptured open AAA repair and endovascular aortic repair (EVAR) in the Vascular Quality Initiative database from 2016 to 2019. DAA for each center is the sum of observed - predicted risk of death for each patient; predicted risk of death was calculated using established multivariable logistic regression modeling. Correlations of DAA with reliability-adjusted mortality rates and procedure volume were determined. Because an accurate quality metric should correlate with future results, outcomes from 2016 to 2017 were used to categorize hospital quality based on: (1) risk-adjusted mortality; (2) risk- and reliability-adjusted mortality; and (3) DAA. The best performing quality metric was determined by comparing the ability of these categories to predict 2018 to 2019 risk-adjusted outcomes. RESULTS During the study period, 3734 patients underwent open repair (106 hospitals), and 20,680 patients underwent EVAR (183 hospitals). 
DAA was closely correlated with reliability-adjusted mortality rates for open repair (r = 0.94; P < .001) and EVAR (r = 0.99; P < .001). DAA also correlated with hospital case volume for open repair (r = -.54; P < .001), but not EVAR (r = 0.07; P = .3). In 2016 to 2017, most hospitals had 0% mortality (55% open repair, 57% EVAR), making it impossible to evaluate these hospitals using traditional risk-adjusted mortality rates alone. Further, zero mortality hospitals in 2016 to 2017 did not demonstrate improved outcomes in 2018 to 2019 for open repair (3.8% vs 4.6%; P = .5) or EVAR (0.8% vs 1.0%; P = .2) compared with all other hospitals. In contrast to traditional risk-adjustment, 2016 to 2017 DAA evenly divided centers into quality quartiles that predicted 2018 to 2019 performance with increased mortality rate associated with each decrement in quality quartile (Q1, 3.2%; Q2, 4.0%; Q3, 5.1%; Q4, 6.0%). There was a significantly higher risk of mortality at worst quartile open repair hospitals compared with best quartile hospitals (odds ratio, 2.01; 95% confidence interval, 1.07-3.76; P = .03). Using 2016 to 2019 DAA to define quality, highest quality quartile open repair hospitals had lower median DAA compared with lowest quality quartile hospitals (-1.18 DAA vs +1.32 DAA; P < .001), correlating with lower median reliability-adjusted mortality rates (3.6% vs 5.1%; P < .001). CONCLUSIONS Adjustment for differences in hospital volume is essential when measuring hospital-level outcomes. Earned outcomes accurately categorize hospital quality and correlate with reliability-adjustment but are easier to calculate and interpret. From 2016 to 2019, highest quality open AAA repair hospitals prevented >40 perioperative deaths compared with the average hospital, and >80 perioperative deaths compared with lowest quality hospitals.
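The DAA definition in the Methods (for each center, the sum over patients of observed minus predicted risk of death) can be sketched as follows. The patient records and predicted risks below are hypothetical illustration values, not data from the study, and the function name is an assumption; in the study the predicted risks came from a multivariable logistic regression model.

```python
from collections import defaultdict


def deaths_above_average(patients):
    """Sum (observed death - predicted risk) per hospital.

    `patients` is an iterable of (hospital, died, predicted_risk) tuples,
    where `died` is 0 or 1 and `predicted_risk` is a probability.
    """
    daa = defaultdict(float)
    for hospital, died, predicted_risk in patients:
        daa[hospital] += died - predicted_risk
    return dict(daa)


# Hypothetical records: hospital A has one death, hospital B has none.
patients = [
    ("A", 0, 0.02),
    ("A", 1, 0.10),
    ("A", 0, 0.05),
    ("B", 0, 0.03),
    ("B", 0, 0.04),
]

daa = deaths_above_average(patients)
for hospital, value in sorted(daa.items()):
    print(f"hospital {hospital}: DAA = {value:+.2f}")
```

A negative DAA means fewer deaths than predicted. Because each survivor contributes only a small negative amount, a low-volume hospital with zero deaths stays close to zero rather than being ranked as a top performer.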
Affiliation(s)
- Douglas W Jones
- Division of Vascular and Endovascular Surgery, University of Massachusetts Medical Center, University of Massachusetts Chan Medical School, Worcester, MA.
| | - Jessica P Simons
- Division of Vascular and Endovascular Surgery, University of Massachusetts Medical Center, University of Massachusetts Chan Medical School, Worcester, MA
| | | | - Marc Schermerhorn
- Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA
| | - Justin B Dimick
- Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Andres Schanzer
- Division of Vascular and Endovascular Surgery, University of Massachusetts Medical Center, University of Massachusetts Chan Medical School, Worcester, MA
| |
|
3
|
Lee JD, Zheng R, Okusanya OT, Evans NR, Grenda TR. Association between surgical quality and long-term survival in lung cancer. Lung Cancer 2024; 190:107511. [PMID: 38417278 DOI: 10.1016/j.lungcan.2024.107511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 02/16/2024] [Accepted: 02/21/2024] [Indexed: 03/01/2024]
Abstract
OBJECTIVES There are significant variations in both perioperative and long-term outcomes after lung cancer resection. While perioperative outcomes are often used as comparative measures of quality, they are unreliable, and their association with long-term outcomes remains unclear. In this context, we evaluated whether historical perioperative mortality after lung cancer resection is associated with 5-year survival. PATIENTS AND METHODS The National Cancer Database (NCDB) was queried to identify patients diagnosed with non-small cell lung cancer (NSCLC) in 2010-2016 who underwent surgical resection (n = 234,200). Hospital-level reliability-adjusted 90-day mortality rate quartiles for 2010-2013 were used as the independent variable to analyze 5-year survival for patients diagnosed in 2014-2016 (n = 85,396). RESULTS There were 85,396 patients in the 2014-2016 cohort across 1,086 hospitals. Overall observed 90-day mortality rate was 3.2% (SD 17.6%) with 2.6% (SD 16.0%) for the historically best performing quartile vs. 3.9% (SD 19.4%) for the worst performing quartile (p < 0.0001). Patients who underwent resection at hospitals with the best historical mortality rate had significantly better 5-year survival across all stages compared to those treated at hospitals in the worst performing quartile in multivariate Cox regression analysis (all stages - HR 1.21 [95% CI 1.15-1.26]; stage I - HR 1.19 [95% CI 1.12-1.25]; stage II - HR 1.20 [95% CI 1.09-1.32]; stage III - HR 1.36 [95% CI 1.20-1.54]) and Kaplan-Meier survival estimates (all stages - p < 0.0001; stage I - p < 0.0001; stage II - p = 0.0004; stage III - p < 0.0001). CONCLUSION With expanded lung cancer screening criteria and likely increase in early-stage detection, profiling performance is paramount to ensuring mortality benefits. We found that episodes surrounding surgical resection may be used to profile long-term outcomes that likely reflect quality across a broader context of care.
Evaluating lung cancer care quality using perioperative outcomes may be useful in profiling provider performance and guiding value-based payment policies.
Affiliation(s)
- James D Lee
- Division of Pulmonary, Allergy, and Critical Care, Penn Presbyterian Medical Center, Department of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
| | - Richard Zheng
- Division of Surgical Oncology, Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Olugbenga T Okusanya
- Division of Thoracic Surgery, Department of Surgery, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
| | - Nathaniel R Evans
- Division of Thoracic Surgery, Department of Surgery, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
| | - Tyler R Grenda
- Division of Thoracic Surgery, Department of Surgery, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States
| |
|
4
|
Dorken-Gallastegi A, El Hechi M, Amram M, Naar L, Maurer LR, Gebran A, Dunn J, Zhuo YD, Levine J, Bertsimas D, Kaafarani HMA. Use of artificial intelligence for nonlinear benchmarking of surgical care. Surgery 2023; 174:1302-1308. [PMID: 37778969 DOI: 10.1016/j.surg.2023.08.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 07/07/2023] [Accepted: 08/16/2023] [Indexed: 10/03/2023]
Abstract
BACKGROUND Existent methodologies for benchmarking the quality of surgical care are linear and fail to capture the complex interactions of preoperative variables. We sought to leverage novel nonlinear artificial intelligence methodologies to benchmark emergency surgical care. METHODS Using a nonlinear but interpretable artificial intelligence methodology called optimal classification trees, first, the overall observed mortality rate at the index hospital's emergency surgery population (index cohort) was compared to the risk-adjusted expected mortality rate calculated by the optimal classification trees from the American College of Surgeons National Surgical Quality Improvement Program database (benchmark cohort). Second, the artificial intelligence optimal classification trees created different "nodes" of care representing specific patient phenotypes defined by the artificial intelligence optimal classification trees without human interference to optimize prediction. These nodes capture multiple iterative risk-adjusted comparisons, permitting the identification of specific areas of excellence and areas for improvement. RESULTS The index and benchmark cohorts included 1,600 and 637,086 patients, respectively. The observed and risk-adjusted expected mortality rates of the index cohort calculated by optimal classification trees were similar (8.06% [95% confidence interval: 6.8-9.5] vs 7.53%, respectively, P = .42). Two areas of excellence and 4 for improvement were identified. For example, the index cohort had lower-than-expected mortality when patients were older than 75 and in respiratory failure and septic shock preoperatively but higher-than-expected mortality when patients had respiratory failure preoperatively and were thrombocytopenic, with an international normalized ratio ≤1.7. CONCLUSION We used artificial intelligence methodology to benchmark the quality of emergency surgical care. 
Such nonlinear and interpretable methods promise a more comprehensive evaluation and a deeper dive into areas of excellence versus suboptimal care.
Affiliation(s)
- Ander Dorken-Gallastegi
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA
| | - Majed El Hechi
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA
| | | | - Leon Naar
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA
| | - Lydia R Maurer
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA
| | - Anthony Gebran
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA
| | | | | | | | | | - Haytham M A Kaafarani
- Trauma, Emergency Surgery, and Surgical Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA; Center for Outcomes and Patient Safety in Surgery, Massachusetts General Hospital, Boston, MA.
| |
|
5
|
Kollmann NP, Langenberger B, Busse R, Pross C. Stability of hospital quality indicators over time: A multi-year observational study of German hospital data. PLoS One 2023; 18:e0293723. [PMID: 37934753 PMCID: PMC10629650 DOI: 10.1371/journal.pone.0293723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 10/19/2023] [Indexed: 11/09/2023] Open
Abstract
BACKGROUND Retrospective hospital quality indicators can only be useful if they are trustworthy signals of current or future quality. Despite extensive longitudinal quality indicator data and many hospital quality public reporting initiatives, research on quality indicator stability over time is scarce and skepticism about their usefulness widespread. OBJECTIVE Based on aggregated, widely available hospital-level quality indicators, this paper sought to determine whether quality indicators are stable over time. Implications for health policy were drawn and the limited methodological foundation for stability assessments of hospital-level quality indicators enhanced. METHODS Two longitudinal datasets (self-reported and routine data), including all hospitals in Germany and covering the period from 2004 to 2017, were analysed. A logistic regression using Generalized Estimating Equations, a time-dependent, graphic quintile representation of risk-adjusted rates and Spearman's rank correlation coefficient were used. RESULTS For a total of eight German quality indicators, significant stability over time was demonstrated. The probability of remaining in the best quality cluster in the future across all hospitals ranged from 46.9% (CI: 42.4-51.6%) for hip replacement reoperations to 80.4% (CI: 76.4-83.8%) for decubitus. Furthermore, graphical descriptive analysis showed that the difference in adverse event rates for the 20% top performing compared to the 20% worst performing hospitals in the two following years is on average between 30% for stroke and AMI and 79% for decubitus. Stability over time has been shown to vary strongly between indicators and treatment areas. CONCLUSION Quality indicators were found to have sufficient stability over time for public reporting. Potentially, increasing case volumes per hospital, centralisation of medical services and minimum-quantity regulations may lead to more stable and reliable quality of care indicators.
Finally, more robust policy interventions, such as outcome-based payment, should only be applied to outcome indicators with a higher level of stability over time. This should be subject to future research.
Affiliation(s)
| | - Benedikt Langenberger
- Department of Health Care Management, Berlin University of Technology, Berlin, Germany
| | - Reinhard Busse
- Department of Health Care Management, Berlin University of Technology, Berlin, Germany
| | - Christoph Pross
- Department of Health Care Management, Berlin University of Technology, Berlin, Germany
| |
|
6
|
Boyle L, Lumley T, Cumin D, Campbell D, Merry AF. Using days alive and out of hospital to measure surgical outcomes in New Zealand: a cross-sectional study. BMJ Open 2023; 13:e063787. [PMID: 37491100 PMCID: PMC10373692 DOI: 10.1136/bmjopen-2022-063787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2023] Open
Abstract
OBJECTIVES To measure differences at various deciles in days alive and out of hospital to 90 days (DAOH90) and explore its utility for identifying outliers of performance among district health boards (DHBs). METHODS Days in hospital and mortality within 90 days of surgery were extracted by linking data from the New Zealand National Minimum Data Set and the births and deaths registry between 1 January 2011 and 31 December 2021 for all adults in New Zealand undergoing acute laparotomy (AL, a relatively high-risk group), elective total hip replacement (THR, a medium-risk group) or lower segment caesarean section (LSCS, a low-risk group). DAOH90 was calculated without censoring to zero in cases of mortality. For each DHB, direct risk standardisation was used to adjust for potential confounders and presented in deciles according to baseline patient risk. The Mann-Whitney U test assessed overall DAOH90 differences between DHBs, and comparisons are presented between selected deciles of DAOH90 for each operation. RESULTS We obtained national data for 35 175, 52 032 and 117 695 patients undergoing AL, THR and LSCS procedures, respectively. We have demonstrated that calculating DAOH without censoring to zero allows for differences between procedures and DHBs to be identified. Risk-adjusted national mean DAOH90 scores were 64.0 days, 79.0 days and 82.0 days at the 0.1 decile and 75.0 days, 82.0 days and 84.0 days at the 0.2 decile for AL, THR and LSCS, respectively, matching their expected risk profiles. Differences between procedures and DHBs were most marked at lower deciles of the DAOH90 distribution, and outlier DHBs were detectable. Corresponding 90-day mortality rates were 5.45%, 0.78% and 0.01%. CONCLUSION In New Zealand after direct risk adjustment, differences in DAOH90 between three types of surgical procedure reflected their respective risk levels and associated mortality rates. Outlier DHBs were identified for each procedure.
Thus, our approach to analysing DAOH90 appears to have considerable face validity and potential utility for contributing to the measurement of perioperative outcomes in an audit or quality improvement setting.
Affiliation(s)
- Luke Boyle
- Department of Statistics, The University of Auckland, Auckland, New Zealand
| | - Thomas Lumley
- Department of Statistics, The University of Auckland, Auckland, New Zealand
| | - David Cumin
- Department of Anaesthesiology, The University of Auckland, Auckland, New Zealand
| | - Doug Campbell
- Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
| | - Alan Forbes Merry
- Department of Anaesthesiology, The University of Auckland, Auckland, New Zealand
- Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
| |
|
7
|
Adams AM, Reames BN, Krell RW. Morbidity and Mortality of Non-pancreatectomy operations for pancreatic cancer: An ACS-NSQIP analysis. Am J Surg 2023; 225:315-321. [PMID: 36088140 DOI: 10.1016/j.amjsurg.2022.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 08/13/2022] [Accepted: 08/21/2022] [Indexed: 11/01/2022]
Abstract
BACKGROUND Patients with pancreas cancer may undergo palliative gastrointestinal or biliary bypass. Recent comparisons of post-operative outcomes following such procedures are lacking. METHODS We analyzed patients undergoing exploration, gastrojejunostomy, biliary bypass or double bypass for pancreatic cancer using data from the 2005-2019 American College of Surgeons National Surgical Quality Improvement Program. We compared 30-day mortality and complications across procedures and over time periods (2005-10, 2011-14, 2015-19) using multivariable regression models. Factors associated with postoperative mortality were identified. RESULTS Of 43,525 patients undergoing surgery with a postoperative diagnosis of pancreatic cancer, 5572 met inclusion criteria. Palliative operations included 1037 gastrojejunostomies, 792 biliary bypasses, 650 double bypasses, and 3093 explorations. The proportion of biliary and double bypass procedures decreased from 2005-10 to 2015-19. Gastrojejunostomy had higher 30-day mortality rate (11.5%) than other operations (p < 0.001). Adjusted 30-day mortality rates remained stable over time (7.8% vs 6.3%, p = 0.095), while rates of serious complications decreased over time (23.2% vs 17.1%, p < 0.001). CONCLUSIONS Palliative bypass for pancreatic cancer has not become safer over time, and 30-day mortality and complications remain high.
Affiliation(s)
- Alexandra M Adams
- Department of Surgery, Brooke Army Medical Center, Fort Sam Houston, TX, USA
| | - Bradley N Reames
- Department of Surgery, University of Nebraska Medical Center, Omaha, NE, USA
| | - Robert W Krell
- Department of Surgery, Brooke Army Medical Center, Fort Sam Houston, TX, USA.
| |
|
8
|
Prescott HC, Kadel RP, Eyman JR, Freyberg R, Quarrick M, Brewer D, Hasselbeck R. Risk-Adjusting Mortality in the Nationwide Veterans Affairs Healthcare System. J Gen Intern Med 2022; 37:3877-3884. [PMID: 35028862 PMCID: PMC9640507 DOI: 10.1007/s11606-021-07377-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/12/2021] [Accepted: 12/17/2021] [Indexed: 12/03/2022]
Abstract
BACKGROUND The US Veterans Affairs (VA) healthcare system began reporting risk-adjusted mortality for intensive care (ICU) admissions in 2005. However, while the VA's mortality model has been updated and adapted for risk-adjustment of all inpatient hospitalizations, recent model performance has not been published. We sought to assess the current performance of VA's 4 standardized mortality models: acute care 30-day mortality (acute care SMR-30); ICU 30-day mortality (ICU SMR-30); acute care in-hospital mortality (acute care SMR); and ICU in-hospital mortality (ICU SMR). METHODS Retrospective cohort study with split derivation and validation samples. Standardized mortality models were fit using derivation data, with coefficients applied to the validation sample. Nationwide VA hospitalizations that met model inclusion criteria during fiscal years 2017-2018 (derivation) and 2019 (validation) were included. Model performance was evaluated using c-statistics to assess discrimination and comparison of observed versus predicted deaths to assess calibration. RESULTS Among 1,143,351 hospitalizations eligible for the acute care SMR-30 during 2017-2019, in-hospital mortality was 1.8%, and 30-day mortality was 4.3%. C-statistics for the SMR models in validation data were 0.870 (acute care SMR-30); 0.864 (ICU SMR-30); 0.914 (acute care SMR); and 0.887 (ICU SMR). There were 16,036 deaths (4.29% mortality) in the SMR-30 validation cohort versus 17,458 predicted deaths (4.67%), reflecting 0.38% over-prediction. Across deciles of predicted risk, the absolute difference in observed versus predicted percent mortality was a mean of 0.38%, with a maximum error of 1.81% seen in the highest-risk decile. CONCLUSIONS AND RELEVANCE The VA's SMR models, which incorporate patient physiology on presentation, are highly predictive and demonstrate good calibration both overall and across risk deciles.
The current SMR models perform similarly to the initial ICU SMR model, indicating appropriate adaptation and re-calibration.
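The two evaluation quantities named in this abstract, the c-statistic for discrimination and observed versus predicted mortality for calibration, can be sketched on toy data. This is a minimal illustration and not the VA's implementation: the function names, predicted risks and outcomes are hypothetical, and the c-statistic is computed by brute-force pairwise comparison, which is only practical for small samples.

```python
def c_statistic(risks, outcomes):
    """Probability that a randomly chosen death had a higher predicted risk
    than a randomly chosen survivor (ties count as one half)."""
    deaths = [r for r, y in zip(risks, outcomes) if y == 1]
    survivors = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = sum((d > s) + 0.5 * (d == s)
                     for d in deaths for s in survivors)
    return concordant / (len(deaths) * len(survivors))


def calibration_gap(risks, outcomes):
    """Observed mortality minus mean predicted mortality.
    A negative value indicates over-prediction."""
    observed = sum(outcomes) / len(outcomes)
    predicted = sum(risks) / len(risks)
    return observed - predicted


# Hypothetical predicted risks and observed deaths for six hospitalizations.
risks = [0.01, 0.02, 0.05, 0.10, 0.30, 0.60]
outcomes = [0, 0, 0, 1, 0, 1]

c = c_statistic(risks, outcomes)
gap = calibration_gap(risks, outcomes)
print(f"c-statistic: {c:.3f}; observed minus predicted: {gap:+.3f}")
```

Applied to the percentages reported in the abstract, the same subtraction gives 4.29% observed minus 4.67% predicted, a gap of -0.38 percentage points, which is the over-prediction the authors describe.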
Affiliation(s)
- Hallie C Prescott
- VA Center for Clinical Management Research, Ann Arbor, MI, USA; University of Michigan, Department of Medicine, Ann Arbor, MI, USA.
| | - Rajendra P Kadel
- VA Center for Strategic Analytics and Reporting, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| | - Julie R Eyman
- VA Center for Strategic Analytics and Reporting, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| | - Ron Freyberg
- VA Center for Strategic Analytics and Reporting, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| | - Matthew Quarrick
- VA Center for Strategic Analytics and Reporting, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| | - David Brewer
- VA Center for Strategic Analytics and Reporting, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| | - Rachael Hasselbeck
- VA Inpatient Evaluation Center, Department of Veterans Affairs, Veterans Health Administration, 810 Vermont Ave. NW Room 668, Washington, DC, 20420, USA
| |
|
9
|
Sangji NF, Cain-Nielsen AH, Jakubus JL, Mikhail JN, Lussiez A, Neiman P, Montgomery JR, Oliphant BW, Scott JW, Hemmila MR. Application of power analysis to determine the optimal reporting time frame for use in statewide trauma system quality reporting. Surgery 2022; 172:1015-1020. [PMID: 35811165 DOI: 10.1016/j.surg.2022.05.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 04/27/2022] [Accepted: 05/30/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND Meaningful reporting of quality metrics relies on detecting a statistical difference when a true difference in performance exists. Larger cohorts and longer time frames can produce higher rates of statistical differences. However, older data are less relevant when attempting to enact change in the clinical setting. The selection of time frames must reflect a balance between being too small (type II errors) and too long (stale data). We explored the use of power analysis to optimize time frame selection for trauma quality reporting. METHODS Using data from 22 Level III trauma centers, we tested for differences in 4 outcomes within 4 cohorts of patients. With bootstrapping, we calculated the power for rejecting the null hypothesis that no difference exists amongst the centers for different time frames. From the entire sample for each site, we simulated randomly generated datasets. Each simulated dataset was tested for whether a difference was observed from the average. Power was calculated as the percentage of simulated datasets where a difference was observed. This process was repeated for each outcome. RESULTS The power calculations for the 4 cohorts revealed that the optimal time frame for Level III trauma centers to assess whether a single site's outcomes are different from the overall average was 2 years based on an 80% cutoff. CONCLUSION Power analysis with simulated datasets allows testing of different time frames to assess outcome differences. This type of analysis allows selection of an optimal time frame for benchmarking of Level III trauma center data.
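The simulation-based power calculation described in the Methods can be sketched as below. The abstract does not name the statistical test or the dataset sizes, so this sketch assumes a two-sided one-sample z-test of a site's event rate against a fixed overall rate, and treats a longer reporting time frame as a larger number of resampled patients. The function name, rates and sample sizes are hypothetical.

```python
import random
from statistics import NormalDist


def bootstrap_power(site_outcomes, overall_rate, n_per_dataset,
                    n_sim=2000, alpha=0.05, seed=0):
    """Fraction of resampled datasets in which the site's event rate differs
    significantly from the overall rate (assumed two-sided z-test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of a proportion under the null hypothesis.
    se = (overall_rate * (1 - overall_rate) / n_per_dataset) ** 0.5
    rejections = 0
    for _ in range(n_sim):
        # Resample the site's patients with replacement.
        sample = rng.choices(site_outcomes, k=n_per_dataset)
        rate = sum(sample) / n_per_dataset
        if abs(rate - overall_rate) / se > z_crit:
            rejections += 1
    return rejections / n_sim


# Hypothetical site: 30 events among 200 patients (15%), overall rate 8%.
site_outcomes = [1] * 30 + [0] * 170

power_short = bootstrap_power(site_outcomes, 0.08, n_per_dataset=100)
power_long = bootstrap_power(site_outcomes, 0.08, n_per_dataset=400)
print(f"power with 100 patients: {power_short:.2f}")
print(f"power with 400 patients: {power_long:.2f}")
```

Repeating the calculation across candidate time frames and choosing the shortest one that clears a target such as the 80% cutoff mentioned in the Results reflects the trade-off the authors describe between type II errors and stale data.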
Affiliation(s)
- Naveen F Sangji
- Department of Surgery, University of Michigan, Ann Arbor, MI; Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI.
| | - Anne H Cain-Nielsen
- Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI
| | - Jill L Jakubus
- Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI
| | - Judy N Mikhail
- Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI
| | - Alisha Lussiez
- National Clinician Scholars Program, University of Michigan, Ann Arbor, MI
| | - Pooja Neiman
- Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI; National Clinician Scholars Program, University of Michigan, Ann Arbor, MI; Department of Surgery, Brigham and Women's Hospital, Boston, MA
| | - John R Montgomery
- Department of Surgery, University of Michigan, Ann Arbor, MI; Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI
| | - Bryant W Oliphant
- Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI; Department of Orthopaedic Surgery, University of Michigan, Ann Arbor, MI. https://twitter.com/BonezNQuality
| | - John W Scott
- Department of Surgery, University of Michigan, Ann Arbor, MI; Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI. https://twitter.com/DrJohnScott
| | - Mark R Hemmila
- Department of Surgery, University of Michigan, Ann Arbor, MI; Center for Healthcare Outcomes and Policy, University of Michigan, Ann Arbor, MI
| |
|
10
|
Petrella F, Casiraghi M, Radice D, Bardoni C, Cara A, Mohamed S, Sances D, Spaggiari L. Unplanned Return to the Operating Room after Elective Oncologic Thoracic Surgery: A Further Quality Indicator in Surgical Oncology. Cancers (Basel) 2022; 14:cancers14092064. [PMID: 35565193 PMCID: PMC9104285 DOI: 10.3390/cancers14092064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 01/25/2023] Open
Abstract
Background: An unplanned return to the operating room (UROR) is defined as a readmission to the operating room because of a complication or an untoward outcome related to the initial surgery. The aim of the present report is to evaluate the role of URORs after elective oncologic thoracic surgery. Methods: In the study, 4012 consecutive patients were enrolled; among them, 71 patients (1.76%) had an unplanned return to the operating room. Age, sex, Charlson comorbidity index, induction treatments, type of the first operation, indication to readmission to the operating room and type of second operation, length of stay, complication after reoperation and outcomes were collected. Results: The mean age was 63.3 (SD: 13.0); there were 53 male patients (74.6%); the type of the first procedure was: lower lobectomy (11.3%), middle lobectomy (1.4%), upper lobectomy (22.5%), metastasectomy (5.6%), extrapleural pneumonectomy (4.2%), pneumonectomy (40.9%), pleural biopsy (5.6%) and other procedures (8.5%). Patients presenting complications after UROR had undergone a significantly longer first procedure (p < 0.02), had a longer length of stay (p < 0.001) and had higher post-operative mortality (p < 0.001). Conclusions: The patients experiencing UROR after elective oncologic thoracic surgery have significantly higher morbidity and mortality rates when compared to standard thoracic surgery. Bronchopleural fistula remains the most lethal complication in patients undergoing UROR.
Affiliation(s)
- Francesco Petrella
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
- Department of Oncology and Hemato-Oncology, Università degli Studi di Milano, 20122 Milan, Italy
- Correspondence: Tel.: +39-0257489362; Fax: +39-0294379218
| | - Monica Casiraghi
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
| | - Davide Radice
- Division of Epidemiology and Biostatistics, IRCCS European Institute of Oncology, 20141 Milan, Italy;
| | - Claudia Bardoni
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
| | - Andrea Cara
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
| | - Shehab Mohamed
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
| | - Daniele Sances
- Division of Anesthesiology, IRCCS European Institute of Oncology, 20141 Milan, Italy;
| | - Lorenzo Spaggiari
- Department of Thoracic Surgery, IRCCS European Institute of Oncology, 20141 Milan, Italy; (M.C.); (C.B.); (A.C.); (S.M.); (L.S.)
- Department of Oncology and Hemato-Oncology, Università degli Studi di Milano, 20122 Milan, Italy
|
11
|
Novel Surgical Quality Metrics in Abdominal Aortic Aneurysm Repair. J Vasc Surg 2022; 76:1229-1237.e5. [DOI: 10.1016/j.jvs.2022.03.877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 03/28/2022] [Indexed: 11/20/2022]
|
12
|
|
13
|
Wakeam E, Thumma JR, Bonner SN, Chang AC, Reddy RM, Lagisetty K, Lynch W, Grenda T, Chan K, Lyu D, Lin J. One-year Mortality Is Not a Reliable Indicator of Lung Transplant Center Performance. Ann Thorac Surg 2022; 114:225-232. [PMID: 35247344 DOI: 10.1016/j.athoracsur.2022.02.028] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/04/2021] [Revised: 01/25/2022] [Accepted: 02/09/2022] [Indexed: 12/01/2022]
Abstract
BACKGROUND In the United States, the Organ Procurement and Transplant Network uses one-year mortality as the primary measure of transplant center quality. We sought to evaluate the reliability of mortality outcomes in lung transplant and compare statistical methods of program performance evaluation. METHODS We used the Standard Transplant Analysis and Research files from the United Network for Organ Sharing to identify lung transplant recipients from 2013-2018 in the United States. We stratified hospitals based on 30-day, 1-year and 5-year survival using risk adjustment, reliability adjustment using empirical Bayes technique, and hierarchical Bayesian mixed-effects models currently used by the OPTN. We measured variation in mortality rates and identification of performance outliers between techniques. RESULTS We identified 12,769 recipients in 69 centers. Reliability adjustment reduced variation in hospital outcomes and had a large impact on hospital mortality rankings. For example, with 1-year mortality, 28% (5 hospitals) of the "best" hospitals (top 25%) and 18% (3 hospitals) of the "worst" hospitals (bottom 25%) were reclassified after reliability adjustment. The overall reliability of 1-year mortality was low at 0.42. Compared to the Bayesian method used by the OPTN, reliability adjustment identified fewer outliers. 5-year survival reached a higher reliability plateau with a lower volume of cases required. CONCLUSIONS The reliability of 1-year mortality in lung transplantation is low, while 5-year survival estimates may be more reliable at lower case volumes. Reliability adjustment yielded more conservative measures of center performance and fewer outliers compared to current Bayesian methods.
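The abstract does not spell out the reliability adjustment itself. The following is a minimal sketch, assuming the commonly used empirical Bayes shrinkage form in which a center's observed rate is pulled toward the overall mean in proportion to one minus its reliability. The function name and the example rates are hypothetical; only the reliability of 0.42 is taken from the abstract.

```python
def reliability_adjusted_rate(observed_rate, overall_rate, reliability):
    """Shrink a center's observed rate toward the overall mean.

    Assumed form (not stated in the abstract):
        adjusted = reliability * observed + (1 - reliability) * overall
    A reliability of 1 keeps the observed rate; 0 returns the overall mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return reliability * observed_rate + (1.0 - reliability) * overall_rate


# Hypothetical center: 20% observed 1-year mortality, 10% overall mean,
# with the overall reliability of 0.42 reported in the abstract.
print(round(reliability_adjusted_rate(0.20, 0.10, 0.42), 3))  # 0.142
```

With low reliability, most of the gap between a center and the mean is treated as noise, which is consistent with the reduced variation and fewer outliers the abstract reports.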
Affiliation(s)
- Elliot Wakeam
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI; Center for Health Outcomes and Policy, University of Michigan, Ann Arbor, MI.
| | - Jyothi R Thumma
- Center for Health Outcomes and Policy, University of Michigan, Ann Arbor, MI
| | - Sidra N Bonner
- Center for Health Outcomes and Policy, University of Michigan, Ann Arbor, MI; Section of General Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Andrew C Chang
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Rishindra M Reddy
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Kiran Lagisetty
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
| | - William Lynch
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
| | - Tyler Grenda
- Division of Thoracic Surgery, Thomas Jefferson University, Philadelphia, PA
| | - Kevin Chan
- Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, MI
| | - Dennis Lyu
- Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, MI
| | - Jules Lin
- Section of Thoracic Surgery, Department of Surgery, University of Michigan, Ann Arbor, MI
|
14
|
Glance LG, Nerenz DR, Joynt Maddox KE, Hall BL, Dick AW. Reproducibility of Hospital Rankings Based on Centers for Medicare & Medicaid Services Hospital Compare Measures as a Function of Measure Reliability. JAMA Netw Open 2021; 4:e2137647. [PMID: 34874402 PMCID: PMC8652605 DOI: 10.1001/jamanetworkopen.2021.37647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
IMPORTANCE Unreliable performance measures can mask poor-quality care and distort financial incentives in value-based purchasing. OBJECTIVE To examine the association between test-retest reliability and the reproducibility of hospital rankings. DESIGN, SETTING, AND PARTICIPANTS In a cross-sectional design, Centers for Medicare & Medicaid Services Hospital Compare data were analyzed for the 2017 (based on 2014-2017 data) and 2018 (based on 2015-2018 data) reporting periods. The study was conducted from December 13, 2020, to September 30, 2021. This analysis was based on 28 measures, including mortality (acute myocardial infarction, congestive heart failure, pneumonia, and coronary artery bypass grafting), readmissions (acute myocardial infarction, congestive heart failure, pneumonia, and coronary artery bypass grafting), and surgical complications (postoperative acute kidney failure, postoperative respiratory failure, postoperative sepsis, and failure to rescue). EXPOSURES Measure reliability based on test-retest reliability testing. MAIN OUTCOMES AND MEASURES The reproducibility of hospital rankings was quantified by calculating the reclassification rate across the 2017 and 2018 reporting periods after categorizing the hospitals into terciles, quartiles, deciles, and statistical outliers. Linear regression analysis was used to examine the association between the reclassification rate and the intraclass correlation coefficient for each of the classification systems. RESULTS The analytic cohort consisted of 28 measures from 4452 hospitals with a median of 2927 (IQR, 2378-3160) hospitals contributing data for each measure. The hospitals participating in the Inpatient Prospective Payment System (n = 3195) had a median bed size of 141 (IQR, 69-261), average daily census of 70 (IQR, 24-155) patients, and a median disproportionate share hospital percentage of 38.2% (IQR, 18.7%-36.6%). 
The median intraclass correlation coefficient was 0.78 (IQR, 0.72-0.81), ranging between 0.50 and 0.85. The median reclassification rate was 70% (IQR, 62%-71%) when hospitals were ranked by deciles, 43% (IQR, 39%-45%) when ranked by quartiles, 34% (IQR, 31%-36%) when ranked by terciles, and 3.8% (IQR, 2.0%-6.2%) when ranked by outlier status. Increases in measure reliability were not associated with decreases in the reclassification rate. Each 0.1-point increase in the intraclass correlation coefficient was associated with a 6.80 (95% CI, 2.28-11.30; P = .005) percentage-point increase in the reclassification rate when hospitals were ranked into performance deciles, 4.15 (95% CI, 1.16-7.14; P = .008) when ranked into performance quartiles, 1.47 (95% CI, -1.84 to 4.77; P = .37) when ranked into performance terciles, and 3.70 (95% CI, 1.30-6.09; P = .004) when ranked by outlier status. CONCLUSIONS AND RELEVANCE In this study, more reliable measures were not associated with lower rates of reclassifying hospitals using test-retest reliability testing. These findings suggest that measure reliability should not be assessed with test-retest reliability testing.
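The reclassification rate described above can be sketched as follows. This is an illustrative reading only: it assumes rank-based groups of near-equal size and counts a hospital as reclassified when its group differs between the two periods. The function names are hypothetical, and ties are broken by input order, which the abstract does not specify.

```python
def quantile_groups(scores, n_groups):
    """Assign each score to one of n_groups rank-based groups (0 = lowest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(scores)
    return groups


def reclassification_rate(period1, period2, n_groups):
    """Share of hospitals whose group changes between two reporting periods.

    period1 and period2 hold one score per hospital, in the same order.
    """
    if len(period1) != len(period2) or not period1:
        raise ValueError("periods must be non-empty and of equal length")
    g1 = quantile_groups(period1, n_groups)
    g2 = quantile_groups(period2, n_groups)
    changed = sum(a != b for a, b in zip(g1, g2))
    return changed / len(g1)


# Four hypothetical hospitals; the last two swap ranks in the second period.
print(reclassification_rate([1, 2, 3, 4], [1, 2, 4, 3], n_groups=4))  # 0.5
print(reclassification_rate([1, 2, 3, 4], [1, 2, 4, 3], n_groups=2))  # 0.0
```

The toy example shows why finer groupings such as deciles tend to produce higher reclassification rates than coarser ones: the same small rank change crosses more boundaries.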
Affiliation(s)
- Laurent G. Glance
- Department of Anesthesiology and Perioperative Medicine, University of Rochester School of Medicine, Rochester, New York
- Department of Public Health Sciences, University of Rochester School of Medicine, Rochester, New York
- RAND Health, RAND, Boston, Massachusetts
| | - David R. Nerenz
- Center for Health Policy and Health Services Research, Henry Ford Health System, Detroit, Michigan
| | - Karen E. Joynt Maddox
- Department of Medicine, Washington University in St Louis, St Louis, Missouri
- Center for Health Economics and Policy at the Institute for Public Health, Washington University in St Louis, St Louis, Missouri
| | - Bruce L. Hall
- Department of Surgery, Washington University in St Louis, St Louis, Missouri
- Olin Business School, Washington University in St Louis, St Louis, Missouri
- Division of Research and Optimal Patient Care, American College of Surgeons, Chicago, Illinois
|
15
|
Thompson MP, Hou H, Brescia AA, Pagani FD, Sukul D, McCullough JS, Likosky DS. Center Variability in Medicare Claims-Based Publicly Reported Transcatheter Aortic Valve Replacement Outcome Measures. J Am Heart Assoc 2021; 10:e021629. [PMID: 34689581 PMCID: PMC8751838 DOI: 10.1161/jaha.121.021629] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Background Public reporting of transcatheter aortic valve replacement (TAVR) claims–based outcome measures is used to identify high‐ and low‐performing centers. Whether claims‐based TAVR outcomes can reliably be used for center‐level comparisons is unknown. In this study, we sought to evaluate center variability in claims‐based TAVR outcomes used in public reporting. Methods and Results The study sample included 119 554 Medicare beneficiaries undergoing TAVR between January 2014 and October 2018 based on procedure codes in 100% Medicare inpatient claims. Multivariable hierarchical logistic regression was used to estimate center‐specific adjusted rates and reliability (R) of 30‐day mortality, discharge not to home/self‐care, 30‐day stroke, and 30‐day readmission. Reliability was defined as the ratio of between‐hospital variation to the sum of the between‐ and within‐hospital variation. The median (interquartile range [IQR]) center‐level adjusted outcome rates were 3.1% (2.9%–3.4%) for 30‐day mortality, 41.4% (31.3%–53.4%) for discharge not to home, 2.5% (2.3%–2.7%) for 30‐day stroke, and 14.9% (14.4%–15.5%) for 30‐day readmission. Median reliability was highest for the discharge not to home measure (R=0.95; IQR, 0.94–0.97), followed by the 30‐day stroke (R=0.92; IQR, 0.87–0.94), 30‐day mortality (R=0.86; IQR, 0.81–0.91), and 30‐day readmission measures (R=0.42; IQR, 0.35–0.51). Across outcomes, there was an inverse relationship between center volume and measure reliability. Conclusions Claims‐based TAVR outcome measures for mortality, discharge not to home, and stroke were reliable measures for center‐level comparisons, but readmission measures were unreliable. Stakeholders should consider these findings when evaluating claims‐based measures to compare center‐level TAVR performance.
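The reliability definition given in this abstract (between-hospital variation divided by the sum of between- and within-hospital variation) can be written out directly. This is a minimal sketch with hypothetical variance values; how the two variance components were estimated from the hierarchical model is not described in the abstract and is not reproduced here.

```python
def reliability(between_var, within_var):
    """Reliability R = between / (between + within), as defined in the abstract."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variances must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical variance components, for illustration only.
print(reliability(3.0, 1.0))  # 0.75
print(reliability(1.0, 3.0))  # 0.25
```

R approaches 1 when nearly all observed variation reflects true differences between centers, and approaches 0 when within-center noise dominates.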
Affiliation(s)
- Michael P Thompson
- Department of Cardiac Surgery, Michigan Medicine, Ann Arbor, MI; Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI
| | - Hechuan Hou
- Department of Cardiac Surgery, Michigan Medicine, Ann Arbor, MI
| | - Alexander A Brescia
- Department of Cardiac Surgery, Michigan Medicine, Ann Arbor, MI; Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI
| | - Francis D Pagani
- Department of Cardiac Surgery, Michigan Medicine, Ann Arbor, MI; Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI
| | - Devraj Sukul
- Division of Cardiovascular Medicine, Department of General Internal Medicine, Michigan Medicine, Ann Arbor, MI
| | - Jeffrey S McCullough
- Department of Health Management and Policy, School of Public Health, University of Michigan, Ann Arbor, MI
| | - Donald S Likosky
- Department of Cardiac Surgery, Michigan Medicine, Ann Arbor, MI; Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI
|
16
|
Outcomes in Lung Cancer Surgery: Capturing Reliable Metrics. Ann Thorac Surg 2021; 114:1245-1252. [PMID: 34547300 DOI: 10.1016/j.athoracsur.2021.07.105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2020] [Revised: 07/25/2021] [Accepted: 07/28/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND Measuring variation in perioperative outcomes to accurately discriminate performance between surgical providers may be limited by reliability. We aimed to evaluate reliability estimates of metrics associated with lung cancer resection. METHODS We performed a retrospective cohort study utilizing the 2015 National Cancer Database to identify patients undergoing lung cancer resection. Primary outcomes were reliability estimates for perioperative outcomes and for measures of adherence to clinical benchmarks, generated through hierarchical multi-level modeling techniques. RESULTS We identified 27,300 patients undergoing resection. Overall risk- and reliability-adjusted 30- and 90-day mortality rates were 1.7% and 3.3%, respectively; 61.0% and 41.1% of eligible patients received stage-appropriate adjuvant and neoadjuvant therapy. Video-assisted thoracoscopic surgery (VATS) was performed in 59.6% of cases with clinical stage I disease. The mean reliability of 30- and 90-day mortality was 0.11 (standard deviation (SD) 0.09) and 0.22 (SD 0.15), respectively; for performing VATS for stage I disease, 0.97 (SD 0.04). When stratified by hospital volume quartile, the mean reliability of 30-day mortality was 0.04 (SD 0.03) in the lowest and 0.20 (SD 0.10) in the highest quartile. Only 14% of hospitals met an established 0.7 reliability benchmark for 30- and 90-day mortality, but over 97% of hospitals exceeded these benchmarks for providing stage-appropriate systemic therapy and performing VATS for stage I disease. CONCLUSIONS Metrics used to compare lung cancer surgical performance between providers have varying levels of reliability. Reliability should be considered when profiling providers, which will become particularly important as lung cancer treatment under screening programs continues to expand.
|
17
|
A 10-year ACS-NSQIP Analysis of Trends in Esophagectomy Practices. J Surg Res 2020; 256:103-111. [DOI: 10.1016/j.jss.2020.06.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/28/2020] [Revised: 05/29/2020] [Accepted: 06/16/2020] [Indexed: 02/06/2023]
|
18
|
Schuttner L, Reddy A, White AA, Wong ES, Liao JM. Quality in the Context of Value. Am J Med Qual 2020; 35:465-473. [DOI: 10.1177/1062860620917205] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
|
19
|
Angel García D, Martínez Nicolás I, García Marín JA, Soria Aledo V. Risk-adjustment models for clean and colorectal surgery surgical site infection for the Spanish health system. Int J Qual Health Care 2020; 32:599-608. [PMID: 32901796 DOI: 10.1093/intqhc/mzaa104] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/30/2020] [Revised: 07/22/2020] [Accepted: 08/26/2020] [Indexed: 12/16/2022] Open
Abstract
OBJECTIVE To develop risk-adjusted models for two quality indicators addressing surgical site infection (SSI) in clean and colorectal surgery, to be used for benchmarking and quality improvement in the Spanish National Health System. STUDY DESIGN A literature review was undertaken to identify candidate adjustment variables. The candidate variables were revised by clinical experts to confirm their clinical relevance to SSI; experts also offered additional candidate variables that were not identified in the literature review. Two risk-adjustment models were developed using multiple logistic regression thus allowing calculation of the adjusted indicator rates. DATA SOURCE The two SSI indicators, with their corresponding risk-adjustment models, were calculated from administrative databases obtained from nine public hospitals. A dataset was obtained from a 10-year period (2006-2015), and it included data from 21 571 clean surgery patients and 6325 colorectal surgery patients. ANALYSIS METHODS Risk-adjustment regression models were constructed using Spanish National Health System data. Models were analysed so as to prevent overfitting, then tested for calibration and discrimination and finally bootstrapped. RESULTS Ten adjustment variables were identified for clean surgery SSI, and 23 for colorectal surgery SSI. The final adjustment models showed fair calibration (Hosmer-Lemeshow: clean surgery χ2 = 6.56, P = 0.58; colorectal surgery χ2 = 6.69, P = 0.57) and discrimination (area under receiver operating characteristic [ROC] curve: clean surgery 0.72, 95% confidence interval [CI] 0.67-0.77; colorectal surgery 0.62, 95% CI 0.60-0.65). CONCLUSIONS The proposed risk-adjustment models can be used to explain patient-based differences among healthcare providers. They can be used to adjust the two proposed SSI indicators.
Affiliation(s)
- Daniel Angel García
- Departamento de Fisioterapia, Facultad de Ciencias de la Salud, Universidad Católica San Antonio de Murcia, Murcia 30009, Spain
| | - Ismael Martínez Nicolás
- Departamento de Fisioterapia, Facultad de Ciencias de la Salud, Universidad Católica San Antonio de Murcia, Murcia 30009, Spain
| | - José Andrés García Marín
- General and gastrointestinal surgery Unit, Hospital Universitario Morales Meseguer, Murcia 30009, Spain
| | - Victoriano Soria Aledo
- Sección de Gestión de Calidad de la Asociación Española de Cirujanos, Servicio de Cirugía General, Hospital Morales Meseguer de Murcia, Murcia 30009, Spain
- Departamento de Cirugía, Facultad de Medicina, Universidad de Murcia, Murcia 30009, Spain
|
20
|
Rodriguez-Lopez M, Merlo J, Perez-Vicente R, Austin P, Leckie G. Cross-classified Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy (MAIHDA) to evaluate hospital performance: the case of hospital differences in patient survival after acute myocardial infarction. BMJ Open 2020; 10:e036130. [PMID: 33099490 PMCID: PMC7590346 DOI: 10.1136/bmjopen-2019-036130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE To describe a novel strategy, Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy (MAIHDA) to evaluate hospital performance, by analysing differences in 30-day mortality after a first-ever acute myocardial infarction (AMI) in Sweden. DESIGN Cross-classified study. SETTING 68 Swedish hospitals. PARTICIPANTS 43 247 patients admitted between 2007 and 2009, with a first-ever AMI. PRIMARY AND SECONDARY OUTCOME MEASURES We evaluate hospital performance by analysing differences in 30-day mortality after a first-ever AMI using a cross-classified multilevel analysis. We classified the patients into 10 categories according to a risk score (RS) for 30-day mortality and created 680 strata defined by combining hospital and RS categories. RESULTS In the cross-classified multilevel analysis the overall RS adjusted hospital 30-day mortality in Sweden was 4.78% and the between-hospital variation was very small (variance partition coefficient (VPC)=0.70%, area under the curve (AUC)=0.54). The benchmark value was therefore achieved by all hospitals. However, as expected, there were large differences between the RS categories (VPC=34.13%, AUC=0.77). CONCLUSIONS MAIHDA is a useful tool to evaluate hospital performance. The benefit of this novel approach to adjusting for patient RS is that it allowed one to estimate separate VPCs and AUC statistics to simultaneously evaluate the influence of RS categories and hospital differences on mortality. At the time of our analysis, all hospitals in Sweden were performing homogeneously well. That is, the benchmark target for 30-day mortality was fully achieved and there were no relevant hospital differences. Therefore, possible quality interventions should be universal and oriented to maintain the high hospital quality of care.
Affiliation(s)
- Merida Rodriguez-Lopez
- Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
- Department of Public Health and Epidemiology, Pontificia Universidad Javeriana - Cali, Cali, Colombia
| | - Juan Merlo
- Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
- Center for Primary Health Care Research, Region Skåne, Malmö, Sweden
| | - Raquel Perez-Vicente
- Unit for Social Epidemiology, Faculty of Medicine, Lund University, Malmö, Sweden
| | - Peter Austin
- Institute of Health Management, Policy and Evaluation, University of Toronto, Toronto, Ontario, Canada
- Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada
- Schulich Heart Research Program, Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - George Leckie
- Centre for Multilevel Modelling, University of Bristol, Bristol, UK
|
21
|
Mori M, Weininger GA, Shang M, Brooks C, Mullan CW, Najem M, Malczewska M, Vallabhajosyula P, Geirsson A. Association between coronary artery bypass graft center volume and year-to-year outcome variability: New York and California statewide analysis. J Thorac Cardiovasc Surg 2020; 161:1035-1041.e1. [PMID: 33070939 DOI: 10.1016/j.jtcvs.2020.07.119] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/12/2020] [Revised: 07/01/2020] [Accepted: 07/12/2020] [Indexed: 11/30/2022]
Abstract
OBJECTIVE We evaluated whether volume-based, rather than time-based, annual reporting of center outcomes for coronary artery bypass grafting may improve inference of quality, assuming that large center-level year-to-year outcome variability is related to statistical noise. METHODS We analyzed 2012 to 2016 data on isolated coronary artery bypass grafting using statewide outcome reports from New York and California. Annual changes in center-level observed-to-expected mortality ratio represented stability of year-to-year outcomes. Cubic spline fit related the annual observed-to-expected ratio change and center volume. Volume above the inflection point of the spline curve indicated centers with low year-to-year change in outcome. We compared observed-to-expected ratio changes between centers below and above the volume threshold and observed-to-expected ratio changes between consecutive annual and biennial measurements. RESULTS There were 155 centers with median annual volume of 89 (interquartile range, 55-160) for isolated coronary artery bypass grafting. The inflection point of observed-to-expected ratio variability was observed at 111 cases/year. Median year-to-year observed-to-expected ratio change for centers performing less than 111 cases (62 centers) was greater at 0.83 (0.26-1.59) compared with centers performing 111 cases or more (93 centers) at 0.49 (0.22-0.87) (P < .001). By aggregating the outcome over 2 years, centers above the 111-case threshold increased from 93 centers (60%) to 118 centers (76%), but the median observed-to-expected change for all centers was similar between annual aggregates at 0.70 (0.26-1.22) compared with observed-to-expected change between biennial aggregates at 0.54 (0.23-1.02) (P = .095). CONCLUSIONS Center-level, risk-adjusted coronary artery bypass grafting mortality varies significantly from one year to the next. Reporting outcomes by specific case volume may complement annual reports.
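The year-to-year stability measure described above can be illustrated with a short sketch. It assumes the change is taken as the absolute difference between consecutive annual observed-to-expected (O/E) ratios, which the abstract does not state explicitly; the function names and figures are hypothetical.

```python
def oe_ratio(observed_deaths, expected_deaths):
    """Observed-to-expected mortality ratio for one center-year."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return observed_deaths / expected_deaths


def year_to_year_changes(ratios):
    """Absolute change in O/E ratio between consecutive years (assumed form)."""
    return [abs(later - earlier) for earlier, later in zip(ratios, ratios[1:])]


# Hypothetical center with three consecutive annual O/E ratios.
ratios = [oe_ratio(2, 2.0), oe_ratio(3, 2.0), oe_ratio(3, 4.0)]
print(ratios)                        # [1.0, 1.5, 0.75]
print(year_to_year_changes(ratios))  # [0.5, 0.75]
```

For a low-volume center, one additional death can move the ratio substantially, which is the statistical noise the authors relate to center volume.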
Affiliation(s)
- Makoto Mori
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, New Haven, Conn
| | - Gabe A Weininger
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn
| | - Michael Shang
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn
| | - Cornell Brooks
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn
| | - Clancy W Mullan
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn
| | - Michael Najem
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn
| | | | | | - Arnar Geirsson
- Section of Cardiac Surgery, Yale School of Medicine, New Haven, Conn.
|
22
|
Commentary: Safety in numbers. J Thorac Cardiovasc Surg 2020; 161:1043-1045. [PMID: 32863033 DOI: 10.1016/j.jtcvs.2020.07.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2020] [Revised: 07/17/2020] [Accepted: 07/17/2020] [Indexed: 11/20/2022]
|
23
|
Barnett PG, Jacobs JC, Jarvik JG, Chou R, Boothroyd D, Lo J, Nevedal A. Assessment of Primary Care Clinician Concordance With Guidelines for Use of Magnetic Resonance Imaging in Patients With Nonspecific Low Back Pain in the Veterans Affairs Health System. JAMA Netw Open 2020; 3:e2010343. [PMID: 32658287 PMCID: PMC7358914 DOI: 10.1001/jamanetworkopen.2020.10343] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open
Abstract
IMPORTANCE Magnetic resonance imaging (MRI) of the lumbar spine that is not concordant with treatment guidelines for low back pain represents an unnecessary cost for US health plans and may be associated with adverse effects. Use of MRI in the US Department of Veterans Affairs (VA) primary care clinics remains unknown. OBJECTIVE To assess the use of MRI scans during the first 6 weeks (early MRI scans) of episodes of nonspecific low back pain in VA primary care sites and to determine if historical concordance can identify clinicians and sites that are the least concordant with guidelines. DESIGN, SETTING, AND PARTICIPANTS Retrospective cohort study of electronic health records from 944 VA primary care sites from the 3 years ending in 2016. Data were analyzed between January 2017 and August 2019. Participants were patients with new episodes of nonspecific low back pain and the primary care clinicians responsible for their care. EXPOSURES MRI scans. MAIN OUTCOMES AND MEASURES The proportion of early MRI scans at VA primary care clinics was assessed. Clinician concordance with published guidelines over 2 years was used to select clinicians expected to have low concordance in a third year. RESULTS A total of 1 285 405 new episodes of nonspecific low back pain from 920 547 patients (mean [SD] age, 56.7 [15.8] years; 93.6% men) were attributed to 9098 clinicians (mean [SD] age, 52.1 [10.1] years; 55.7% women). An early MRI scan of the lumbar spine was performed in 31 132 of the episodes (2.42%; 95% CI, 2.40%-2.45%). Historical concordance was better than a random draw in selecting the 10% of clinicians who were subsequently the least concordant with published guidelines. For primary care clinicians, the area under the receiver operating characteristic curve was 0.683 (95% CI, 0.658-0.701). For primary care sites, the area under this curve was 0.8035 (95% CI, 0.754-0.855). 
The 10% of clinicians with the least historical concordance were responsible for just 19.2% of the early MRI scans performed in the follow-up year. CONCLUSIONS AND RELEVANCE VA primary care clinics had low rates of use of early MRI scans. A history of low concordance with imaging guidelines was associated with subsequent low concordance but with limited potential to select clinicians most in need of interventions to implement guidelines.
Affiliation(s)
- Paul G. Barnett
- Veterans Affairs Health Economics Resource Center, VA Palo Alto Health Care System, Menlo Park, California
- Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California
| | - Josephine C. Jacobs
- Veterans Affairs Health Economics Resource Center, VA Palo Alto Health Care System, Menlo Park, California
- Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California
| | - Jeffrey G. Jarvik
- Department of Radiology, University of Washington, Seattle
- Department of Neurological Surgery, University of Washington, Seattle
- Department of Health Services, University of Washington, Seattle
| | - Roger Chou
- Department of Clinical Epidemiology and Medical Informatics, Oregon Health & Science University, Portland
- Department of Medicine, Oregon Health & Science University, Portland
| | - Derek Boothroyd
- Quantitative Research Unit, Stanford University Medical School, Stanford, California
| | - Jeanie Lo
- Veterans Affairs Health Economics Resource Center, VA Palo Alto Health Care System, Menlo Park, California
| | - Andrea Nevedal
- Center for Innovation to Implementation, VA Palo Alto Health Care System, Menlo Park, California
|
24
|
National Quality Forum Guidelines for Evaluating the Scientific Acceptability of Risk-adjusted Clinical Outcome Measures: A Report From the National Quality Forum Scientific Methods Panel. Ann Surg 2020; 271:1048-1055. [PMID: 31850998 DOI: 10.1097/sla.0000000000003592] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 01/20/2023]
Abstract
Quality measurement is at the heart of efforts to achieve high-quality surgical and medical care at a lower cost. Without accurate quality measures, it is not possible to appropriately align incentives with quality. The aim of these National Quality Forum (NQF) guidelines is to provide measure developers and other stakeholders with guidance on the standards used by the NQF to evaluate the scientific acceptability of performance measures. Using a methodologically rigorous and transparent process for evaluating health care quality measures is the best insurance that alternative payment plans will truly reward and promote higher quality care. Performance measures need to be credible in order for physicians and hospitals to willingly partner with payers in efforts to improve population outcomes. Our goal in creating this position paper is to promote the transparency of NQF evaluations, improve the quality of performance measurements, and engage surgeons and all other stakeholders to work together to advance the science of performance measurement.
Collapse
|
25
|
Ten-year Trends in Surgical Mortality, Complications, and Failure to Rescue in Medicare Beneficiaries. Ann Surg 2020; 271:855-861. [DOI: 10.1097/sla.0000000000003193] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/25/2022]
|
26
|
Schär RT, Tashi S, Branca M, Söll N, Cipriani D, Schwarz C, Pollo C, Schucht P, Ulrich CT, Beck J, Z'Graggen WJ, Raabe A. How safe are elective craniotomies in elderly patients in neurosurgery today? A prospective cohort study of 1452 consecutive cases. J Neurosurg 2020; 134:1113-1121. [PMID: 32330879 DOI: 10.3171/2020.2.jns193460] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/20/2019] [Accepted: 02/05/2020] [Indexed: 11/06/2022]
Abstract
OBJECTIVE With global aging, elective craniotomies are increasingly being performed in elderly patients. There is a paucity of prospective studies evaluating the impact of these procedures on the geriatric population. The goal of this study was to assess the safety of elective craniotomies for elderly patients in modern neurosurgery. METHODS For this cohort study, adult patients, who underwent elective craniotomies between November 1, 2011, and October 31, 2018, were allocated to 3 age groups (group 1, < 65 years [n = 1008], group 2, ≥ 65 to < 75 [n = 315], and group 3, ≥ 75 [n = 129]). Primary outcome was the 30-day mortality after craniotomy. Secondary outcomes included rate of delayed extubation (> 1 hour), need for emergency head CT scan and reoperation within 48 hours after surgery, length of postoperative intensive or intermediate care unit stay, hospital length of stay (LOS), and rate of discharge to home. Adjustment for American Society of Anesthesiologists Physical Status (ASA PS) class, estimated blood loss, and duration of surgery were analyzed as a comparison using multiple logistic regression. For significant differences a post hoc analysis was performed. RESULTS In total, 1452 patients (mean age 55.4 ± 14.7 years) were included. The overall mortality rate was 0.55% (n = 8), with no significant differences between groups (group 1: 0.5% [95% binomial CI 0.2%, 1.2%]; group 2: 0.3% [95% binomial CI 0.0%, 1.7%]; group 3: 1.6% [95% binomial CI 0.2%, 5.5%]). Deceased patients had a significantly higher ASA PS class (2.88 ± 0.35 vs 2.42 ± 0.62; difference 0.46 [95% CI 0.03, 0.89]; p = 0.036) and increased estimated blood loss (1444 ± 1973 ml vs 436 ± 545 ml [95% CI 618, 1398]; p < 0.001).
Significant differences were found in the rate of postoperative head CT scans (group 1: 6.65% [n = 67], group 2: 7.30% [n = 23], group 3: 15.50% [n = 20]; p = 0.006), LOS (group 1: median 5 days [IQR 4; 7 days], group 2: 5 days [IQR 4; 7 days], and group 3: 7 days [IQR 5; 9 days]; p = 0.001), and rate of discharge to home (group 1: 79.0% [n = 796], group 2: 72.0% [n = 227], and group 3: 44.2% [n = 57]; p < 0.001). CONCLUSIONS Mortality following elective craniotomy was low in all age groups. Today, elective craniotomy is safe for well-selected patients, including elderly patients. Elderly patients are more dependent on discharge to other hospitals and postacute care facilities after elective craniotomy. Clinical trial registration no.: NCT01987648 (clinicaltrials.gov).
Affiliation(s)
- Ralph T Schär
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Shpend Tashi
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Mattia Branca
- Clinical Trials Unit Bern, University of Bern, Switzerland
| | - Nicole Söll
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Debora Cipriani
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
- Department of Neurosurgery, Medical Center, University of Freiburg, Freiburg im Breisgau, Germany
| | - Christa Schwarz
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Claudio Pollo
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Philippe Schucht
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Christian T Ulrich
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Jürgen Beck
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
- Department of Neurosurgery, Medical Center, University of Freiburg, Freiburg im Breisgau, Germany
| | - Werner J Z'Graggen
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
| | - Andreas Raabe
- Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern
|
27
|
Tang AB, Childers CP, Dworsky JQ, Maggard-Gibbons M. Surgeon work captured by the National Surgical Quality Improvement Program across specialties. Surgery 2020; 167:550-555. [DOI: 10.1016/j.surg.2019.11.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/18/2019] [Revised: 11/03/2019] [Accepted: 11/12/2019] [Indexed: 11/27/2022]
|
28
|
Shahian D. Improving cardiac surgical quality: lessons from the Japanese experience. BMJ Qual Saf 2020; 29:531-535. [PMID: 32015051 DOI: 10.1136/bmjqs-2019-010125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 01/14/2020] [Indexed: 12/28/2022]
Affiliation(s)
- David Shahian
- Division of Cardiac Surgery, Department of Surgery, and Center for Quality and Safety, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
29
|
Reponen E, Tuominen H, Korja M. Quality of British and American Nationwide Quality of Care and Patient Safety Benchmarking Programs: Case Neurosurgery. Neurosurgery 2019; 85:500-507. [PMID: 30165390 DOI: 10.1093/neuros/nyy380] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/03/2018] [Accepted: 07/19/2018] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Multiple nationwide outcome registries are utilized for quality benchmarking between institutions and individual surgeons. OBJECTIVE To evaluate whether nationwide quality of care programs in the United Kingdom and United States can measure differences in neurosurgical quality. METHODS This prospective observational study comprised 418 consecutive adult patients undergoing elective craniotomy at Helsinki University Hospital between December 7, 2011 and December 31, 2012. We recorded outcome event rates and categorized them according to British Neurosurgical National Audit Programme (NNAP), American National Surgical Quality Improvement Program (NSQIP), and American National Neurosurgery Quality and Outcomes Database (N2QOD) to assess the applicability of these programs for quality benchmarking and estimated sample sizes required for reliable quality comparisons. RESULTS The rate of in-hospital major and minor morbidity was 18.7% and 38.0%, respectively, and 30-d mortality rate was 2.4%. The NSQIP criteria identified 96.2% of major but only 38.4% of minor complications. N2QOD performed better, but almost one-fourth (23.2%) of all patients with adverse outcomes, mostly minor, went unnoticed. For NNAP, a sample size of over 4200 patients per surgeon is required to detect a 50.0% increase in mortality rates between surgeons. The sample size required for reliable comparisons between the rates of complications exceeds 600 patients per center per year. CONCLUSION The implemented benchmarking programs in the United Kingdom and United States fail to identify a considerable number of complications in a high-volume center. Health care policy makers should be cautious as outcome comparisons between most centers and individual surgeons are questionable if based on the programs.
Affiliation(s)
- Elina Reponen
- Department of Anesthesiology, Intensive Care and Pain Medicine, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Hanna Tuominen
- Department of Anesthesiology, Intensive Care and Pain Medicine, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Miikka Korja
- Department of Neurosurgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
|
30
|
Preoperative risk stratification of patient mortality following elective craniotomy; a comparative analysis of prediction algorithms. J Clin Neurosci 2019; 67:24-31. [DOI: 10.1016/j.jocn.2019.06.037] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/03/2018] [Revised: 05/10/2019] [Accepted: 06/21/2019] [Indexed: 11/17/2022]
|
31
|
Begum H, Vishwanath S, Merenda M, Tacey M, Dean N, Elder E, Mureau M, Bezic R, Carter P, Cooter RD, Deva A, Earnest A, Higgs M, Klein H, Magnusson M, Moore C, Rakhorst H, Saunders C, Stark B, Hopper I. Defining Quality Indicators for Breast Device Surgery: Using Registries for Global Benchmarking. Plastic and Reconstructive Surgery-Global Open 2019; 7:e2348. [PMID: 31592377 PMCID: PMC6756659 DOI: 10.1097/gox.0000000000002348] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/20/2019] [Accepted: 05/29/2019] [Indexed: 11/26/2022]
Abstract
Breast device registries monitor devices encompassing breast implants, tissue expanders and dermal matrices, and the quality of care and patient outcomes for breast device surgery. Defining a standard set of quality indicators and risk adjustment factors will enable consistency and adjustment for case-mix in benchmarking quality of care across breast implant registries. This study aimed to develop a set of quality indicators to enable assessment and reporting of quality of care for breast device surgery which can be applied globally. METHODS A scoping literature review was undertaken, and potential quality indicators were identified. Consensus on the final list of quality indicators was obtained using a modified Delphi approach. This process involved a series of online surveys, and teleconferences over 6 months. The Delphi panel included participants from various countries and representation from surgical specialty groups including breast and general surgeons, plastic and reconstructive surgeons, cosmetic surgeons, a breast-care nurse, a consumer, a devices regulator (Therapeutic Goods Administration), and a biostatistician. A total of 12 candidate indicators were proposed: Intraoperative antibiotic wash, intraoperative antiseptic wash, preoperative antibiotics, nipple shields, surgical plane, volume of implant, funnels, immediate versus delayed reconstruction, time to revision, reoperation due to complications, patient satisfaction, and volume of activity. RESULTS Three of the 12 proposed indicators were endorsed by the panel: preoperative intravenous antibiotics, reoperation due to complication, and patient reported outcome measures. CONCLUSION The 3 endorsed quality indicator measures will enable breast device registries to standardize benchmarking of care internationally for patients undergoing breast device surgery.
Affiliation(s)
- Husna Begum
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Swarna Vishwanath
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Michelle Merenda
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Mark Tacey
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Nicola Dean
- Department of Plastic and Reconstructive Surgery, Flinders Medical Center, Flinders University, South Australia, Australia
| | - Elisabeth Elder
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
- Westmead Breast Cancer Institute, Westmead Hospital, New South Wales, Australia
| | - Marc Mureau
- Erasmus MC Cancer Institute, University Medical Centre Rotterdam, Rotterdam, The Netherlands
| | - Ron Bezic
- Refine Cosmetic Clinic, New South Wales, Australia
| | - Pamela Carter
- Therapeutic Goods Administration, Australian Capital Territory, Australia
| | - Rodney D. Cooter
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Anand Deva
- Macquarie Plastic & Reconstructive Surgery, Faculty of Medicine and Health Sciences, Macquarie University, New South Wales, Australia
| | - Arul Earnest
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
| | - Michael Higgs
- Parkside Cosmetic Surgery, South Australia, Australia
| | - Howard Klein
- South Island Plastic Surgery, Christchurch, New Zealand
| | - Mark Magnusson
- School of Medicine, Griffith University, Queensland, Australia; Australasian College of Cosmetic Surgery, New South Wales, Australia
| | - Colin Moore
- Refine Cosmetic Clinic, New South Wales, Australia
| | - Hinne Rakhorst
- Department of Plastic, Reconstructive and Hand surgery, Medisch Spectrum Twente and ZGT Almelo, Enschede, The Netherlands
| | - Christobel Saunders
- School of Medicine, University of Western Australia, Western Australia, Australia
| | - Birgit Stark
- Kliniken för Rekonstruktiv Plastikkirurgi Karolinska Institute, Stockholm, Sweden
| | - Ingrid Hopper
- Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia
|
32
|
Abel GA, Gomez-Cano M, Pham TM, Lyratzopoulos G. Reliability of hospital scores for the Cancer Patient Experience Survey: analysis of publicly reported patient survey data. BMJ Open 2019; 9:e029037. [PMID: 31345975 PMCID: PMC6661614 DOI: 10.1136/bmjopen-2019-029037] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/11/2019] [Revised: 06/04/2019] [Accepted: 06/06/2019] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVES To assess the degree to which variations in publicly reported hospital scores arising from the English Cancer Patient Experience Survey (CPES) are subject to chance. DESIGN Secondary analysis of publicly reported data. SETTING English National Health Service hospitals. PARTICIPANTS 72 756 patients who were recently treated for cancer in one of 146 hospitals and responded to the 2016 English CPES. MAIN OUTCOME MEASURES Spearman-Brown reliability of hospital scores on 51 evaluative questions regarding cancer care. RESULTS Hospitals varied in respondent sample size with a median hospital sample size of 419 responses (range 31-1972). There were some hospitals with generally highly reliable scores across most questions, whereas other hospitals had generally unreliable scores (the median reliability of question scores within individual hospitals varied between 0.11 and 0.86). Similarly, there were some questions with generally high reliability across most hospitals, whereas other questions had generally low reliability. Of the 7377 individual hospital scores publicly reported (146 hospitals by 51 questions, minus 69 suppressed scores), only 34% reached a reliability of 0.7, the minimum generally considered to be useful. In order for 80% of the individual hospital scores to reach a reliability of 0.7, some hospitals would require a fourfold increase in number of respondents; although in a few other hospitals sample sizes could be reduced. CONCLUSIONS The English Patient Experience Survey represents a globally unique source for understanding experience of a patient with cancer; but in its present form, it is not reliable for high-stakes comparisons of the performance of different hospitals. Revised sampling strategies and survey questions could help increase the reliability of hospital scores, and thus make the survey fit for use in performance comparisons.
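The sample-size statement in this abstract follows from the Spearman-Brown prophecy formula. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and the starting reliability of 0.37 is an assumed value chosen to show how a roughly fourfold increase in respondents can arise.

```python
def spearman_brown(reliability: float, factor: float) -> float:
    """Predicted reliability when the number of respondents is multiplied by `factor`."""
    return factor * reliability / (1.0 + (factor - 1.0) * reliability)


def factor_for_target(reliability: float, target: float) -> float:
    """Sample-size multiplier needed to move `reliability` to `target`.

    Obtained by solving the Spearman-Brown formula for the multiplier.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return target * (1.0 - reliability) / (reliability * (1.0 - target))


if __name__ == "__main__":
    current = 0.37  # assumed current reliability, not a value taken from the paper
    k = factor_for_target(current, 0.7)
    print(f"multiplier to reach 0.7: {k:.2f}")
    print(f"check: {spearman_brown(current, k):.3f}")
```

A multiplier below 1 means the sample could shrink while still meeting the target, which corresponds to the abstract's remark that some hospitals could reduce their sample sizes.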
Affiliation(s)
- Gary A Abel
- University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Mayam Gomez-Cano
- University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Tra My Pham
- Behavioural Science and Health, University College London, London, UK
- Primary Care and Population Health, University College London, London, UK
| | - Georgios Lyratzopoulos
- Department of Epidemiology and Public Health, Health Behaviour Research Centre, University College London, London, UK
|
33
|
Kristensen PK, Merlo J, Ghith N, Leckie G, Johnsen SP. Hospital differences in mortality rates after hip fracture surgery in Denmark. Clin Epidemiol 2019; 11:605-614. [PMID: 31410068 PMCID: PMC6643065 DOI: 10.2147/clep.s213898] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/03/2019] [Accepted: 06/14/2019] [Indexed: 11/23/2022] Open
Abstract
Background Thirty-day mortality after hip fracture is widely used when ranking hospital performance, but the reliability of such hospital ranking is seldom calculated. We aimed to quantify the variation in 30-day mortality across hospitals and to determine the hospital general contextual effect for understanding patient differences in 30-day mortality risk. Methods Patients aged ≥65 years with an incident hip fracture registered in the Danish Multidisciplinary Fracture Registry between 2007 and 2016 were identified (n=60,004). We estimated unadjusted and patient-mix adjusted risk of 30-day mortality in 32 hospitals. We performed a multilevel analysis of individual heterogeneity and discriminatory accuracy with patients nested within hospitals. We expressed the hospital general contextual effect by the median odds ratio (MOR), the area under the receiver operating characteristics curve and the variance partition coefficient (VPC). Results The overall 30-day mortality rate was 10%. Patient characteristics including high sociodemographic risk score, underweight, comorbidity, a subtrochanteric fracture, and living at a nursing home were strong predictors of 30-day mortality (area under the curve=0.728). The adjusted differences between hospital averages in 30-day mortality varied from 5% to 9% across the 32 hospitals, which corresponds to a MOR of 1.18 (95% CI: 1.12-1.25). However, the hospital general contextual effect was low, as the VPC was below 1% and adding the hospital level to a single-level model with adjustment for patient-mix increased the area under the receiver operating characteristics curve by only 0.004 units. Conclusions Only minor hospital differences were found in 30-day mortality after hip fracture. Mortality after hip fracture needs to be lowered in Denmark but possible interventions should be patient oriented and universal rather than focused on specific hospitals.
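The link between the reported MOR and VPC can be illustrated with the standard formulas for a two-level logistic model. This is a minimal sketch, not the authors' analysis: it assumes the latent-variable definition of the VPC (level-1 variance fixed at π²/3), which the abstract does not confirm, and the function names are hypothetical.

```python
import math

# 75th percentile of the standard normal distribution
Z_75 = 0.6744897501960817


def median_odds_ratio(var_u: float) -> float:
    """MOR implied by a hospital-level random-intercept variance `var_u`."""
    return math.exp(math.sqrt(2.0 * var_u) * Z_75)


def variance_from_mor(mor: float) -> float:
    """Invert the MOR formula to recover the random-intercept variance."""
    return (math.log(mor) / (math.sqrt(2.0) * Z_75)) ** 2


def vpc_latent(var_u: float) -> float:
    """VPC under the latent-variable approach (assumed here, not stated in the abstract)."""
    return var_u / (var_u + math.pi ** 2 / 3.0)


if __name__ == "__main__":
    var_u = variance_from_mor(1.18)  # MOR reported in the abstract
    print(f"hospital-level variance: {var_u:.4f}")
    print(f"VPC: {vpc_latent(var_u):.4%}")
```

Under these assumptions, a MOR of 1.18 implies a VPC under 1%, which agrees with the value the abstract reports.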
Affiliation(s)
- Pia Kjær Kristensen
- Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus N DK-8200, Denmark
- Department of Orthopedic Surgery, Regional Hospital Horsens, Horsens DK-8700, Denmark
| | - Juan Merlo
- Research Unit of Social Epidemiology, CRC, Faculty of Medicine, Lund University, Malmö SE-20502, Sweden
| | - Nermin Ghith
- Research Unit of Social Epidemiology, CRC, Faculty of Medicine, Lund University, Malmö SE-20502, Sweden
- Research Unit for Chronic Diseases and E-Health, Section for Health Promotion and Prevention, Center for Clinical Research and Prevention, Frederiksberg Hospital, Frederiksberg 2000, Denmark
| | - George Leckie
- Centre for Multilevel Modelling, School of Education, University of Bristol, Bristol BS8 1JA, UK
|
34
|
Haneuse S, Dominici F, Normand SL, Schrag D. Assessment of Between-Hospital Variation in Readmission and Mortality After Cancer Surgical Procedures. JAMA Netw Open 2018; 1:e183038. [PMID: 30646221 PMCID: PMC6324436 DOI: 10.1001/jamanetworkopen.2018.3038] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/05/2018] [Accepted: 07/31/2018] [Indexed: 01/29/2023] Open
Abstract
Importance Although current federal quality improvement programs do not include cancer surgery, the Centers for Medicare & Medicaid Services and other payers are considering extending readmission reduction initiatives to include these and other common high-cost episodes. Objectives To quantify between-hospital variation in quality-related outcomes and identify hospital characteristics associated with high and low performance. Design, Setting, and Participants This retrospective cohort study obtained data through linkage of the California Cancer Registry to hospital discharge claims databases maintained by the California Office of Statewide Health Planning and Development. All 351 acute care hospitals in California at which 1 or more adults underwent curative intent surgery between January 1, 2007, and December 31, 2011, with analyses finalized July 15, 2018, were included. A total of 138 799 adults undergoing surgery for colorectal, breast, lung, prostate, bladder, thyroid, kidney, endometrial, pancreatic, liver, or esophageal cancer within 6 months of diagnosis, with an American Joint Committee on Cancer stage of I to III at diagnosis, were included. Main Outcomes and Measures Measures included adjusted odds ratios and variance components from hierarchical mixed-effects logistic regression analyses of in-hospital mortality, 90-day readmission, and 90-day mortality, as well as hospital-specific risk-adjusted rates and risk-adjusted standardized rate ratios for hospitals with a mean annual surgical volume of 10 or more. Results Across 138 799 patients at the 351 included hospitals, 8.9% were aged 18 to 44 years and 45.9% were aged 65 years or older, 57.4% were women, and 18.2% were nonwhite. Among these, 1240 patients (0.9%) died during the index admission. Among 137 559 patients discharged alive, 19 670 (14.3%) were readmitted and 1754 (1.3%) died within 90 days. 
After adjusting for patient case-mix differences, evidence of statistically significant variation in risk across hospitals was identified, as characterized by the variance of the random effects in the mixed model, for all 3 metrics (P < .001). In addition, substantial variation was observed in hospital performance profiles: across 260 hospitals with a mean annual surgical volume of 10 or more, 59 (22.7%) had lower-than-expected rates for all 3 metrics, 105 (40.4%) had higher-than-expected rates for 2 of the 3, and 19 (7.3%) had higher-than-expected rates for all 3 metrics. Conclusions and Relevance Accounting for patient case-mix differences, there appears to be substantial between-hospital variation in in-hospital mortality, 90-day readmission, and 90-day mortality after cancer surgical procedures. Recognizing the multifaceted nature of hospital performance through consideration of mortality and readmission simultaneously may help to prioritize strategies for improving surgical outcomes.
Affiliation(s)
- Sebastien Haneuse
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Francesca Dominici
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - Sharon-Lise Normand
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts
| | - Deborah Schrag
- Division of Population Sciences, Dana Farber Cancer Institute, Boston, Massachusetts
|
35
|
Cheng C, Scott A, Sundararajan V, Yong J. On measuring the quality of hospitals. J Health Organ Manag 2018; 32:842-859. [PMID: 30465489 DOI: 10.1108/jhom-03-2018-0088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
PURPOSE Researchers, policymakers and hospital managers often encounter numerous quality measures when assessing hospital quality. The purpose of this paper is to address the challenge of summarising, interpreting and comparing multiple quality measures across different quality dimensions by proposing a simple method of constructing a composite quality index. The method is applied to hospital administrative data to demonstrate its use in analysing hospital performance. DESIGN/METHODOLOGY/APPROACH Logistic and fixed effects regression analyses are applied to secondary admitted patient data from all hospitals in the state of Victoria, Australia for the period 2000/2001-2011/2012. FINDINGS The derived composite quality index was used to rank hospital performance and to assess changes in state-wide average hospital quality over time. Further regression analyses found private hospitals, day hospitals and non-acute hospitals were associated with higher composite quality, while small hospitals were associated with lower quality. PRACTICAL IMPLICATIONS The method will enable policymakers and hospital managers to better monitor the performance of hospitals. It allows quality to be related to other attributes of hospitals such as size and volume, and enables policymakers and managers to focus on hospitals with relevant characteristics such that quantity and quality changes can be better understood, monitored and acted upon. ORIGINALITY/VALUE A simple method of constructing a composite quality is an indispensable practical tool in tracking the quality of hospitals when numerous measures are used to capture different aspects of quality. The derived composite quality can be used to summarise hospital performance and to identify factors associated with quality via regression analyses.
Affiliation(s)
- Choon Cheng
- Department of Health and Human Services, Melbourne, Australia
| | - Jongsay Yong
- Faculty of Business and Economics, University of Melbourne, Melbourne, Australia
|
36
|
Chang AC. Centralizing Esophagectomy to Improve Outcomes and Enhance Clinical Research: Invited Expert Review. Ann Thorac Surg 2018; 106:916-923. [DOI: 10.1016/j.athoracsur.2018.04.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/14/2018] [Accepted: 04/01/2018] [Indexed: 12/19/2022]
|
37
|
The relationship of hospital market concentration, costs, and quality for major surgical procedures. Am J Surg 2018; 216:1037-1045. [PMID: 30060911 DOI: 10.1016/j.amjsurg.2018.07.042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/26/2018] [Revised: 07/22/2018] [Accepted: 07/24/2018] [Indexed: 11/23/2022]
Abstract
BACKGROUND Our objective was to determine the association between indicators of surgical quality - incidence of major complications and failure-to-rescue - and hospital market concentration in light of differences in costs of care. METHODS Patients undergoing coronary artery bypass graft (CABG), colon resection, pancreatic resection, or liver resection in the 2008-2011 Nationwide Inpatient Sample were identified. The effect of hospital market concentration on major complications, failure-to-rescue, and inpatient costs was estimated at the lowest and highest mortality hospitals using multivariable regression techniques. RESULTS A weighted total of 527,459 patients were identified. Higher market concentration was associated with between 4% and 6% increased odds of failure-to-rescue across all four procedures. Across procedures, more concentrated markets had decreased inpatient costs (average marginal effect ranging from -$3064 [95% CI: -$5812 to -$316] for CABG to -$4876 [95% CI: -$7773 to -$1980] for liver resection). CONCLUSION In less competitive (more concentrated) hospital markets, higher overall risk of failure-to-rescue after complications was accompanied by lower inpatient costs, on average. These data suggest that market controls may be leveraged to influence surgical quality and costs.
|
38
|
Brakenhoff TB, Moons KG, Kluin J, Groenwold RH. Investigating Risk Adjustment Methods for Health Care Provider Profiling When Observations are Scarce or Events Rare. Health Serv Insights 2018; 11:1178632918785133. [PMID: 30083056 PMCID: PMC6069022 DOI: 10.1177/1178632918785133] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/11/2018] [Accepted: 05/24/2018] [Indexed: 12/03/2022] Open
Abstract
Background: When profiling health care providers, adjustment for case-mix is essential. However, conventional risk adjustment methods may perform poorly, especially when provider volumes are small or events rare. Propensity score (PS) methods, commonly used in observational studies of binary treatments, have been shown to perform well when the number of observations and/or events is low and can be extended to a multiple provider setting. The objective of this study was to evaluate the performance of different risk adjustment methods when profiling multiple health care providers that perform highly protocolized procedures, such as coronary artery bypass grafting. Methods: In a simulation study, provider effects estimated using PS adjustment, PS weighting, PS matching, and multivariable logistic regression were compared in terms of bias, coverage and mean squared error (MSE) when varying the event rate, sample size, provider volumes, and number of providers. An empirical example from the field of cardiac surgery was used to demonstrate the different methods. Results: Overall, PS adjustment, PS weighting, and logistic regression resulted in provider effects with little bias and good coverage. The PS matching and PS weighting with trimming led to biased effects and high MSE across several scenarios. Moreover, PS matching is not practical to implement when the number of providers surpasses three. Conclusions: None of the PS methods clearly outperformed logistic regression, except when sample sizes were relatively small. Propensity score matching performed worse than the other PS methods considered.
Affiliation(s)
- Timo B Brakenhoff
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Karel Gm Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Jolanda Kluin
- Heart Center, Academic Medical Center, Amsterdam, The Netherlands
| | - Rolf Hh Groenwold
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
|
39
|
|
40
Brakenhoff TB, Roes KCB, Moons KGM, Groenwold RHH. Outlier classification performance of risk adjustment methods when profiling multiple providers. BMC Med Res Methodol 2018; 18:54. [PMID: 29902975 PMCID: PMC6003201 DOI: 10.1186/s12874-018-0510-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/16/2017] [Accepted: 05/15/2018] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND When profiling multiple health care providers, adjustment for case-mix is essential to accurately classify the quality of providers. Unfortunately, misclassification of provider performance is not uncommon and can have grave implications. Propensity score (PS) methods have been proposed as viable alternatives to conventional multivariable regression. The objective was to assess the outlier classification performance of risk adjustment methods when profiling multiple providers. METHODS In a simulation study based on empirical data, the classification performance of logistic regression (fixed and random effects), PS adjustment, and three PS weighting methods was evaluated when varying parameters such as the number of providers, the average incidence of the outcome, and the percentage of outliers. Traditional classification accuracy measures were considered, including sensitivity and specificity. RESULTS Fixed effects logistic regression consistently had the highest sensitivity and negative predictive value, yet a low specificity and positive predictive value. Of the random effects methods, PS adjustment and random effects logistic regression performed equally well or better than all the remaining PS methods for all classification accuracy measures across the studied scenarios. CONCLUSIONS Of the evaluated PS methods, only PS adjustment can be considered a viable alternative to random effects logistic regression when profiling multiple providers in different scenarios.
Affiliation(s)
- Timo B. Brakenhoff
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO Box 85500, Utrecht, 3508 GA the Netherlands
- Kit C. B. Roes
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO Box 85500, Utrecht, 3508 GA the Netherlands
- Karel G. M. Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO Box 85500, Utrecht, 3508 GA the Netherlands
- Rolf H. H. Groenwold
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO Box 85500, Utrecht, 3508 GA the Netherlands
41
Abstract
A robust quality management system (QMS) will provide value to patients, providers, and hospitals or systems by focusing on system performance. The QMS must remain independent of provider-specific measures used for privileging. Some outcome measures may be used to assess system performance; they must not be used to assess individual provider performance. All anesthesia providers, especially leaders, must be guardians of an organization's safety culture.
Affiliation(s)
- John Allyn
- Department of Anesthesiology and Peri-operative Medicine, Spectrum Healthcare Partners, Maine Medical Center, 22 Bramhall Street, Portland, ME 04102, USA
- Craig Curry
- Department of Anesthesiology and Peri-operative Medicine, Spectrum Healthcare Partners, Maine Medical Center, 22 Bramhall Street, Portland, ME 04102, USA
42
Lingsma HF, Bottle A, Middleton S, Kievit J, Steyerberg EW, Marang-van de Mheen PJ. Evaluation of hospital outcomes: the relation between length-of-stay, readmission, and mortality in a large international administrative database. BMC Health Serv Res 2018; 18:116. [PMID: 29444713 PMCID: PMC5813333 DOI: 10.1186/s12913-018-2916-1] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Received: 11/11/2016] [Accepted: 02/06/2018] [Indexed: 11/21/2022] Open
Abstract
Background Hospital mortality, readmission and length of stay (LOS) are commonly used measures for quality of care. We aimed to disentangle the correlations between these interrelated measures and propose a new way of combining them to evaluate the quality of hospital care. Methods We analyzed administrative data from the Global Comparators Project from 26 hospitals on patients discharged between 2007 and 2012. We correlated standardized and risk-adjusted hospital outcomes on mortality, readmission and long LOS. We constructed a composite measure with 5 levels, based on literature review and expert advice, from survival without readmission and normal LOS (best) to mortality (worst outcome). This composite measure was analyzed using ordinal regression, to obtain a standardized outcome measure to compare hospitals. Results Overall, we observed a 3.1% mortality rate, 7.8% readmission rate (in survivors) and 20.8% long LOS rate among 4,327,105 admissions. Mortality and LOS were correlated at the patient and the hospital level. A patient in the upper quartile LOS had higher odds of mortality (odds ratio = 1.45, 95% confidence interval 1.43–1.47) than those in the lowest quartile. Hospitals with a high standardized mortality had higher proportions of long LOS (r = 0.79, p < 0.01). Readmission rates did not correlate with either mortality or long LOS rates. The interquartile range of the standardized ordinal composite outcome was 74–117. The composite outcome had similar or better reliability in ranking hospitals than individual outcomes. Conclusions Correlations between different outcome measures are complex and differ between hospital- and patient-level. The proposed composite measure combines three outcomes in an ordinal fashion for a more comprehensive and reliable view of hospital performance than its component indicators.
Affiliation(s)
- Hester F Lingsma
- Department of Public Health, Erasmus Medical Centre, PO box 2040, 3000, CA, Rotterdam, The Netherlands
- Alex Bottle
- Imperial College, Faculty of Medicine, School of Public Health, South Kensington Campus, London, SW7 2AZ, UK
- Job Kievit
- Department of Medical Decision Making, Leiden University Medical Centre, Albinusdreef 2, 2333, ZA, Leiden, The Netherlands
- Ewout W Steyerberg
- Department of Public Health, Erasmus Medical Centre, PO box 2040, 3000, CA, Rotterdam, The Netherlands
- Perla J Marang-van de Mheen
- Department of Medical Decision Making, Leiden University Medical Centre, Albinusdreef 2, 2333, ZA, Leiden, The Netherlands
43
Abel G, Saunders CL, Mendonca SC, Gildea C, McPhail S, Lyratzopoulos G. Variation and statistical reliability of publicly reported primary care diagnostic activity indicators for cancer: a cross-sectional ecological study of routine data. BMJ Qual Saf 2018; 27:21-30. [PMID: 28847789 PMCID: PMC5750427 DOI: 10.1136/bmjqs-2017-006607] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 02/17/2017] [Revised: 05/24/2017] [Accepted: 05/28/2017] [Indexed: 12/29/2022]
Abstract
OBJECTIVES Recent public reporting initiatives in England highlight general practice variation in indicators of diagnostic activity related to cancer. We aimed to quantify the size and sources of variation and the reliability of practice-level estimates of such indicators, to better inform how this information is interpreted and used for quality improvement purposes. DESIGN Ecological cross-sectional study. SETTING English primary care. PARTICIPANTS All general practices in England with at least 1000 patients. MAIN OUTCOME MEASURES Sixteen diagnostic activity indicators from the Cancer Services Public Health Profiles. RESULTS Mixed-effects logistic and Poisson regression showed that substantial proportions of the observed variance in practice scores reflected chance, variably so for different indicators (between 7% and 85%). However, after accounting for the role of chance, there remained substantial variation between practices (typically up to twofold variation between the 75th and 25th centiles of practice scores, and up to fourfold variation between the 90th and 10th centiles). The age and sex profile of practice populations explained some of this variation, by different amounts across indicators. Generally, the reliability of diagnostic process indicators relating to broader populations of patients most of whom do not have cancer (eg, rate of endoscopic investigations, or urgent referrals for suspected cancer (also known as 'two week wait referrals')) was high (≥0.80) or very high (≥0.90). In contrast, the reliability of diagnostic outcome indicators relating to incident cancer cases (eg, per cent of all cancer cases detected after an emergency presentation) ranged from 0.24 to 0.54, which is well below recommended thresholds (≥0.70). 
CONCLUSIONS Use of indicators of diagnostic activity in individual general practices should principally focus on process indicators which have adequate or high reliability and not outcome indicators which are unreliable at practice level.
Affiliation(s)
- Gary Abel
- Primary Care, University of Exeter, Exeter, UK
- Catherine L Saunders
- Cambridge Centre for Health Services Research, University of Cambridge, Cambridge, UK
- Silvia C Mendonca
- Cambridge Centre for Health Services Research, University of Cambridge, Cambridge, UK
- Carolynn Gildea
- Knowledge and Intelligence Team (East Midlands), Public Health England, Sheffield, UK
- Sean McPhail
- National Cancer Registration and Analysis Service, Public Health England, London, UK
- Georgios Lyratzopoulos
- Cambridge Centre for Health Services Research, University of Cambridge, Cambridge, UK
- National Cancer Registration and Analysis Service, Public Health England, London, UK
- Epidemiology of Cancer Healthcare and Outcomes (ECHO) Group, Department of Behavioural Science and Health, University College London, London, UK
44
Application of a simple, affordable quality metric tool to colorectal, upper gastrointestinal, hernia, and hepatobiliary surgery patients: the HARM score. Surg Endosc 2017; 32:2886-2893. [PMID: 29282576 DOI: 10.1007/s00464-017-5998-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/22/2017] [Accepted: 12/02/2017] [Indexed: 11/27/2022]
Abstract
BACKGROUND Quality is the major driver for both clinical and financial assessment. There remains a need for simple, affordable, quality metric tools to evaluate patient outcomes, which led us to develop the HospitAl length of stay, Readmission and Mortality (HARM) score. We hypothesized that the HARM score would be a reliable tool to assess patient outcomes across various surgical specialties. METHODS From 2011 to 2015, we identified colorectal, hepatobiliary, upper gastrointestinal, and hernia surgery admissions using the Vizient Clinical Database. Individual and hospital HARM scores were calculated from length of stay, 30-day readmission, and mortality rates. We evaluated the correlation of HARM scores with complication rates using the Clavien-Dindo classification. RESULTS We identified 525,083 surgical patients: 206,981 colorectal, 164,691 hepatobiliary, 97,157 hernia, and 56,254 upper gastrointestinal. Overall, 53.8% of patients were admitted electively with a mean HARM score of 2.24; 46.2% were admitted emergently with a mean HARM score of 1.45 (p < 0.0001). All HARM components correlated with patient complications on logistic regression (p < 0.0001). The mean length of stay increased from 3.2 ± 1.8 days for a HARM score < 2 to 15.1 ± 12.2 days for a HARM score > 4 (p < 0.001). In elective admissions, for HARM categories of < 2, 2-< 3, 3-4, and > 4, complication rates were 9.3, 23.2, 38.8, and 71.6%, respectively. There was a similar trend for increasing HARM score in emergent admissions as well. For all surgical procedure categories, increasing HARM score, with and without risk adjustment, correlated with increasing severity of complications by Clavien-Dindo classification. CONCLUSIONS The HARM score is an easy-to-use quality metric that correlates with increasing complication rates and complication severity across multiple surgical disciplines when evaluated on a large administrative database. This inexpensive tool could be adopted across multiple institutions to compare the quality of surgical care.
45
Martin JR, Wang TY, Loriaux D, Desai R, Kuchibhatla M, Karikari IO, Bagley CA, Gottfried ON. Race as a predictor of postoperative hospital readmission after spine surgery. J Clin Neurosci 2017; 46:21-25. [DOI: 10.1016/j.jocn.2017.08.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/15/2017] [Accepted: 08/10/2017] [Indexed: 12/29/2022]
46
Shahian DM, Jacobs JP, Badhwar V, D’Agostino RS, Bavaria JE, Prager RL. Risk Aversion and Public Reporting. Part 2: Mitigation Strategies. Ann Thorac Surg 2017; 104:2102-2110. [DOI: 10.1016/j.athoracsur.2017.06.076] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 06/16/2017] [Accepted: 06/25/2017] [Indexed: 01/25/2023]
47
Abstract
BACKGROUND Surgical site infection (SSI) rates are publicly reported as quality metrics and increasingly used to determine financial reimbursement. OBJECTIVE To evaluate the volume-outcome relationship as well as the year-to-year stability of performance rankings following coronary artery bypass graft (CABG) surgery and hip arthroplasty. RESEARCH DESIGN We performed a retrospective cohort study of Medicare beneficiaries who underwent CABG surgery or hip arthroplasty at US hospitals from 2005 to 2011, with outcomes analyzed through March 2012. Nationally validated claims-based surveillance methods were used to assess for SSI within 90 days of surgery. The relationship between procedure volume and SSI rate was assessed using logistic regression and generalized additive modeling. Year-to-year stability of SSI rates was evaluated using logistic regression to assess hospitals' movement in and out of performance rankings linked to financial penalties. RESULTS Case-mix adjusted SSI risk based on claims was highest in hospitals performing <50 CABG/year and <200 hip arthroplasty/year compared with hospitals performing ≥200 procedures/year. At that same time, hospitals in the worst quartile in a given year based on claims had a low probability of remaining in that quartile the following year. This probability increased with volume, and when using 2 years' experience, but the highest probabilities were only 0.59 for CABG (95% confidence interval, 0.52-0.66) and 0.48 for hip arthroplasty (95% confidence interval, 0.42-0.55). CONCLUSIONS Aggregate SSI risk is highest in hospitals with low annual procedure volumes, yet these hospitals are currently excluded from quality reporting. Even for higher volume hospitals, year-to-year random variation makes past experience an unreliable estimator of current performance.
48
Chang AC, Kosinski AS, Raymond DP, Magee MJ, DeCamp MM, Farjah F, Grogan EL, Seder CW, Allen MS, Blasberg JD, Blackmon SH, Burfeind WR, Cassivi SD, Park BJ, Shahian DM, Wormuth DW, Han JM, Wright CD, Fernandez FG, Kozower BD. The Society of Thoracic Surgeons Composite Score for Evaluating Esophagectomy for Esophageal Cancer. Ann Thorac Surg 2017; 103:1661-1667. [DOI: 10.1016/j.athoracsur.2016.10.027] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 10/04/2016] [Accepted: 10/05/2016] [Indexed: 11/25/2022]
49
50