Reference Citation Analysis

For: Shahian DM, Normand SLT. What is a performance outlier? BMJ Qual Saf 2015;24:95-9. [DOI: 10.1136/bmjqs-2015-003934] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 11/04/2022]

Cited by Other Article(s)

Hengelbrock J, Rauh J, Cederbaum J, Kähler M, Höhle M. Hospital profiling using Bayesian decision theory. Biometrics 2023;79:2757-2769. [PMID: 36401573 DOI: 10.1111/biom.13798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Accepted: 11/02/2022] [Indexed: 11/21/2022]

Boyle JM, van der Meulen J, Kuryba A, Cowling TE, Booth C, Fearnhead NS, Braun MS, Walker K, Aggarwal A. Measuring variation in the quality of systemic anti-cancer therapy delivery across hospitals: A national population-based evaluation. Eur J Cancer 2023;178:191-204. [PMID: 36459767 DOI: 10.1016/j.ejca.2022.10.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 10/10/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]

Abstract

AIM

To date, there has been little systematic assessment of the quality of care associated with systemic anti-cancer therapy (SACT) delivery across national healthcare systems. We evaluated hospital-level toxicity rates during SACT treatment as a means of identifying variation in care quality.

METHODS

All colorectal cancer (CRC) patients receiving SACT within 106 English National Health Service (NHS) hospitals between 2016 and 2019 were included. Severe acute toxicity rates were derived from hospital administrative data using a validated coding framework. Variation in hospital-level toxicity rates was assessed separately in the adjuvant and metastatic settings. Toxicity rates were adjusted for age, sex, comorbidity, performance status, tumour site, and TNM staging.

RESULTS

Eight thousand one hundred and seventy three patients received SACT in the adjuvant setting, and 7,683 patients in the metastatic setting. Adjusted severe acute toxicity rates varied between hospitals from 11% to 49% for the adjuvant cohort, and from 25% to 67% for the metastatic cohort. Compared to the national mean toxicity rate in the adjuvant cohort, six hospitals were more than two standard deviations (2SD) above, and four hospitals were more than 2SD below. In the metastatic cohort, six hospitals were more than 2SD above, and seven hospitals were more than 2SD below the national mean toxicity rate. Overall, 12 hospitals (12%) had toxicity rates more than 2SD above the national mean, and 11 (10%) had rates more than 2SD below.

CONCLUSION

There is substantial variation in hospital-level severe acute toxicity rates in both the adjuvant and metastatic settings, despite risk-adjustment. Ongoing reporting of this performance indicator can be used to focus further investigation of toxicity rates and stimulate quality improvement initiatives to improve care.
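
The "more than 2SD above or below the national mean" rule in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes binomial funnel-plot limits around the pooled rate and works on crude (unadjusted) counts, whereas the study used risk-adjusted rates. The function name `flag_outliers` and the example counts are hypothetical.

```python
import math

def flag_outliers(hospitals, z=2.0):
    """Classify each hospital as 'above', 'below', or 'within' z SDs of the pooled rate.

    hospitals: dict mapping hospital name -> (events, patients).
    Assumption: the SD for a hospital with n patients is the binomial
    standard error sqrt(p * (1 - p) / n) evaluated at the pooled rate p.
    """
    total_events = sum(events for events, _ in hospitals.values())
    total_patients = sum(n for _, n in hospitals.values())
    pooled = total_events / total_patients  # stand-in for the national mean

    flags = {}
    for name, (events, n) in hospitals.items():
        rate = events / n
        sd = math.sqrt(pooled * (1 - pooled) / n)
        if rate > pooled + z * sd:
            flags[name] = "above"
        elif rate < pooled - z * sd:
            flags[name] = "below"
        else:
            flags[name] = "within"
    return flags

# Hypothetical counts, for illustration only.
example = {"A": (30, 100), "B": (60, 100), "C": (10, 100), "D": (100, 300)}
flags = flag_outliers(example)
```

Because the limits narrow as patient volume grows, the same deviation from the pooled rate is more likely to be flagged at a large hospital than at a small one.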

Aman F, Rauf A, Ali R, Hussain J, Ahmed I. Balancing Complex Signals for Robust Predictive Modeling. Sensors 2021;21:s21248465. [PMID: 34960557 PMCID: PMC8706336 DOI: 10.3390/s21248465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 12/12/2021] [Accepted: 12/14/2021] [Indexed: 01/10/2023]

Harris AHS, Hagedorn HJ, Finlay AK. Delta Studies: Expanding the Concept of Deviance Studies to Design More Effective Improvement Interventions. J Gen Intern Med 2021;36:280-287. [PMID: 32935314 PMCID: PMC7878588 DOI: 10.1007/s11606-020-06199-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2019] [Accepted: 08/27/2020] [Indexed: 10/23/2022]

Abstract

BACKGROUND

The effects of improvement (implementation and de-implementation) interventions are often modest. Although positive and negative deviance studies have been extensively used in improvement science and quality improvement efforts, conceptual and methodological innovations are needed to improve our ability to use information about variation in quality to design more effective interventions.

OBJECTIVE

We describe a novel mixed methods extension of the deviance study we term "delta studies." Delta studies seek to quantitatively identify sites that have recently changed from low performers to high performers, or vice versa, in order to qualitatively learn about active strategies that produced recent change, challenges change agents faced and how they overcame them, and, where applicable, the causes of recent deterioration in performance. This information is intended to inform the design of improvement interventions for deployment in low performing sites. We provide examples of lessons learned from this method that may have been missed with traditional positive or negative deviance designs.

DESIGN

Considerations for quantitatively identifying delta sites are described including which quality metrics to track, over what timeframe to observe change, how to account for reliability of observed change, consideration of patient volume and initial performance as implementation context factors, and how to define clinically meaningful change. Methods to adapt qualitative protocols by integrating quantitative information about change in performance are also presented. We provide sample data and R code that can be used to graphically display distributions of initial status, change, and volume that are essential to delta studies.

PARTICIPANTS

Patients and facilities of the US Veterans Health Administration.

KEY RESULTS

As an example, we discuss what decisions we made regarding the delta study design considerations in a funded study of low-value preoperative testing. The method helped us find sites that had recently reduced the burden of low-value testing, and learn about the strategies they employed and challenges they faced.

CONCLUSIONS

The delta study concept is a promising mixed methods innovation to efficiently and effectively identify improvement strategies and other factors that have actually produced change in real-world settings.

Commentary: Safety in numbers. J Thorac Cardiovasc Surg 2020;161:1043-1045. [PMID: 32863033 DOI: 10.1016/j.jtcvs.2020.07.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2020] [Revised: 07/17/2020] [Accepted: 07/17/2020] [Indexed: 11/20/2022]

Raphael MJ, Siemens R, Peng Y, Vera-Badillo FE, Booth CM. Volume of systemic cancer therapy delivery and outcomes of patients with solid tumors: A systematic review and methodologic evaluation of the literature. J Cancer Policy 2020. [DOI: 10.1016/j.jcpo.2020.100215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/11/2023]

Bernard A, Falcoz PE, Thomas PA, Rivera C, Brouchet L, Baste JM, Puyraveau M, Quantin C, Pages PB, Dahan M. Comparison of Epithor clinical national database and medico-administrative database to identify the influence of case-mix on the estimation of hospital outliers. PLoS One 2019;14:e0219672. [PMID: 31339906 PMCID: PMC6655697 DOI: 10.1371/journal.pone.0219672] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/31/2019] [Accepted: 06/30/2019] [Indexed: 11/25/2022]

Abstract

Background

The national Epithor database was initiated in 2003 in France. Fifteen years on, a quality assessment of the recorded data seemed necessary. This study examines the completeness of the data recorded in Epithor through a comparison with the French PMSI database, which is the national medico-administrative reference database. The aim of this study was to demonstrate the influence of data quality with respect to identifying 30-day mortality hospital outliers.

Methods

We used each hospital’s individual FINESS code to compare the number of pulmonary resections and deaths recorded in Epithor to the figures found in the PMSI. Centers were classified into either the good-quality data (GQD) group or the low-quality data (LQD) group. To demonstrate the influence of case-mix quality on the ranking of centers with low-quality data, we used 2 methods to estimate the standardized mortality rate (SMR). For the first (SMR1), the expected number of deaths per hospital was estimated with risk-adjustment models fitted with low-quality data. For the second (SMR2), the expected number of deaths per hospital was estimated with a linear predictor for the LQD group using the coefficients of a logistic regression model developed from the GQD group.

Results

Of the hospitals that use Epithor, 25 were classified in the GQD group and 75 in the LQD group. The 30-day mortality rate was 2.8% (n = 300) in the GQD group vs. 1.9% (n = 181) in the LQD group (P <0.0001). The between-hospital differences in SMR1 appeared substantial (interquartile range (IQR) 0–1.036), and they were even higher in SMR2 (IQR 0–1.19). SMR1 identified 7 hospitals as high-mortality outliers. SMR2 identified 4 hospitals as high-mortality outliers. Some hospitals went from non-outlier to high mortality and vice-versa. Kappa values were roughly 0.46 and indicated moderate agreement.

Conclusion

We found that most hospitals provided Epithor with high-quality data, but other hospitals needed to improve the quality of the information provided. Quality control is essential for this type of database and necessary for the unbiased adjustment of regression models.
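
The SMR2 calculation described in the Methods above (expected deaths from a linear predictor built with coefficients fitted elsewhere) can be sketched as follows. This is a sketch under stated assumptions: it takes the SMR to be observed deaths divided by expected deaths, and expected deaths to be the sum of predicted probabilities from a logistic model. The function names, covariates, and coefficient values are hypothetical and are not taken from the study.

```python
import math

def expected_deaths(patients, intercept, coefs):
    """Sum of predicted death probabilities for one hospital's patients.

    patients: list of covariate vectors, one per patient.
    intercept, coefs: logistic regression coefficients, assumed here to
    have been fitted on a separate reference group of hospitals.
    """
    total = 0.0
    for covariates in patients:
        linear_predictor = intercept + sum(b * x for b, x in zip(coefs, covariates))
        total += 1.0 / (1.0 + math.exp(-linear_predictor))  # inverse logit
    return total

def smr(observed_deaths, patients, intercept, coefs):
    """Observed deaths divided by model-expected deaths."""
    return observed_deaths / expected_deaths(patients, intercept, coefs)

# Hypothetical hospital: four patients, one covariate, and coefficients that
# give each patient a predicted risk of exactly 0.5.
ratio = smr(3, [[1.0], [1.0], [1.0], [1.0]], intercept=0.0, coefs=[0.0])
```

A ratio above 1 means more deaths were observed than the reference model predicts for that case mix. If case-mix variables are recorded incompletely, the expected count is biased and so is the ratio.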

Arias-de la Torre J, Domingo L, Martínez O, Muñoz L, Robles N, Puigdomenech E, Pons-Cabrafiga M, Pallisó F, Mora X, Espallargues M. Evaluation of the effectiveness of hip and knee implant models used in Catalonia: a protocol for a prospective registry-based study. J Orthop Surg Res 2019;14:61. [PMID: 30791929 PMCID: PMC6385421 DOI: 10.1186/s13018-019-1087-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/29/2018] [Accepted: 02/04/2019] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Monitoring results regarding the effectiveness of knee and hip arthroplasties may be useful at the clinical, economic and patient level and help reduce the number of prosthesis revisions. In Spain, and specifically in Catalonia, there is currently no systematic monitoring of the different prosthesis models available on the market. Within this context, the aim of the project presented in this protocol is to evaluate the short- and medium-term effectiveness of knee and hip models implanted in Catalonia and to identify where the results could be better or worse than expected.

METHODS

A prospective observational design will be drawn up based on data from a population-based arthroplasty register for hip and knee replacements that includes data from 53 of the 61 public hospitals in Catalonia. The knee and hip prosthesis models used will be identified and classified according to the type of prosthesis, fixation and, in total hip replacements, the bearing surface. For the data analysis, two methodological approaches will be used sequentially: first, an approach based on a survival analysis, followed by an approach based on standardised revision ratios and funnel plots. Following the analyses, a panel of experts will evaluate the results to identify possible sources of bias. Lastly, those models with results better or worse than expected compared to those from the comparison group will be valued, and strengths and difficulties for routine implementation of this methodology within the Catalan Arthroplasty Register will be identified.

DISCUSSION

The study presented in this protocol will allow us to identify the hip and knee prosthesis models whose results might be better or worse than expected. This information could have a potential impact at the patient, orthopaedic surgeon, healthcare manager, decision-making and industry levels, both in the short term and in the medium and long term.

Ridgeway G, Nørgaard M, Rasmussen TB, Finkle WD, Pedersen L, Bøtker HE, Sørensen HT. Benchmarking Danish hospitals on mortality and readmission rates after cardiovascular admission. Clin Epidemiol 2019;11:67-80. [PMID: 30655706 PMCID: PMC6324920 DOI: 10.2147/clep.s189263] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022]

Abstract

Objective

The aim of this study was to examine hospital performance measures that account more comprehensively for unique mixes of patients' characteristics.

Design

Nationwide cohort registry-based study within a population-based health care system.

Participants

In this study, 331,513 patients discharged with a primary cardiovascular diagnosis from 1 of 26 Danish hospitals during 2011-2015 were included. Data covering all Danish hospitals were drawn from the Danish National Patient Registry and the Danish National Health Service Prescription Database.

Main outcome measures

Thirty-day post-admission mortality rates, 30-day post-discharge readmission rates, and the associated numbers needed to harm were measured.

Methods

For each index hospital, we used a non-parametric logistic regression model to compute propensity scores. Propensity score weighted patients treated at other hospitals collectively resembled patients treated at the index hospital in terms of age, sex, primary discharge diagnosis, diagnosis history, medications, previous cardiac procedures, and comorbidities. Outcomes for the weighted patients treated at other hospitals formed benchmarks for the index hospital. Doubly robust regression formally tested whether the outcomes of patients at the index hospital differed from the outcomes of the patients used to form the benchmarks. For each index hospital, we computed the false discovery rate, ie, the probability of being incorrect if we claimed the hospital differed from its benchmark.

Results

Five hospitals exceeded their benchmark for 30-day mortality rates, with the number needed to harm ranging between 55 and 137. Seven hospitals exceeded their benchmark for readmission, with the number needed to harm ranging from 22 to 71. Our benchmarking approach flagged fewer hospitals as outliers compared with conventional regression methods.

Conclusion

Conventional methods flag more hospitals as outliers than our benchmarking approach. Our benchmarking approach accounts more thoroughly for differences in hospitals' patient case mix, reducing the risk of false-positive selection of suspected outliers. A more comprehensive system of hospital performance measurement could be based on this approach.
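
Two quantities in the abstract above lend themselves to a short worked sketch: the weighted benchmark rate and the number needed to harm. The code assumes the benchmark is a weighted mean of outcomes among comparison patients and that the number needed to harm is the reciprocal of the absolute risk difference, which is its standard definition. The propensity-score and doubly robust regression steps are not reproduced here, and the function names and numbers are hypothetical.

```python
def weighted_rate(outcomes, weights):
    """Weighted mean of binary outcomes (1 = event, 0 = no event).

    The weights are assumed to come from a propensity-score model that
    makes the comparison patients resemble the index hospital's patients.
    """
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)

def number_needed_to_harm(index_rate, benchmark_rate):
    """Reciprocal of the excess risk at the index hospital over its benchmark."""
    excess = index_rate - benchmark_rate
    if excess <= 0:
        raise ValueError("index hospital does not exceed its benchmark")
    return 1.0 / excess

# Hypothetical values, for illustration only.
benchmark = weighted_rate([1, 0, 0, 1], [1.0, 1.0, 2.0, 0.0])
nnh = number_needed_to_harm(0.10, 0.08)
```

A smaller number needed to harm corresponds to a larger excess risk, so the hospitals at the low end of the reported ranges differ most from their benchmarks.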

Brakenhoff TB, Roes KCB, Moons KGM, Groenwold RHH. Outlier classification performance of risk adjustment methods when profiling multiple providers. BMC Med Res Methodol 2018;18:54. [PMID: 29902975 PMCID: PMC6003201 DOI: 10.1186/s12874-018-0510-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/16/2017] [Accepted: 05/15/2018] [Indexed: 12/25/2022]

O'Hara JK, Grasic K, Gutacker N, Street A, Foy R, Thompson C, Wright J, Lawton R. Identifying positive deviants in healthcare quality and safety: a mixed methods study. J R Soc Med 2018;111:276-291. [PMID: 29749286 DOI: 10.1177/0141076818772230] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/31/2022]

Kristoffersen DT, Helgeland J, Clench-Aas J, Laake P, Veierød MB. Observed to expected or logistic regression to identify hospitals with high or low 30-day mortality? PLoS One 2018;13:e0195248. [PMID: 29652941 PMCID: PMC5898724 DOI: 10.1371/journal.pone.0195248] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/24/2017] [Accepted: 03/19/2018] [Indexed: 11/19/2022]

Abstract

INTRODUCTION

A common quality indicator for monitoring and comparing hospitals is based on death within 30 days of admission. An important use is to determine whether a hospital has higher or lower mortality than other hospitals. Thus, the ability to identify such outliers correctly is essential. Two approaches for detection are: 1) calculating the ratio of observed to expected number of deaths (OE) per hospital and 2) including all hospitals in a logistic regression (LR) comparing each hospital to a form of average over all hospitals. The aim of this study was to compare OE and LR with respect to correctly identifying 30-day mortality outliers. Modifications of the methods, i.e., variance corrected approach of OE (OE-Faris), bias corrected LR (LR-Firth), and trimmed mean variants of LR and LR-Firth were also studied.

MATERIALS AND METHODS

To study the properties of OE and LR and their variants, we performed a simulation study by generating patient data from hospitals with known outlier status (low mortality, high mortality, non-outlier). Data from simulated scenarios with varying number of hospitals, hospital volume, and mortality outlier status, were analysed by the different methods and compared by level of significance (ability to falsely claim an outlier) and power (ability to reveal an outlier). Moreover, administrative data for patients with acute myocardial infarction (AMI), stroke, and hip fracture from Norwegian hospitals for 2012-2014 were analysed.

RESULTS

None of the methods achieved the nominal (test) level of significance for both low and high mortality outliers. For low mortality outliers, the levels of significance were increased four- to fivefold for OE and OE-Faris. For high mortality outliers, OE and OE-Faris, LR 25% trimmed and LR-Firth 10% and 25% trimmed maintained approximately the nominal level. The methods agreed with respect to outlier status for 94.1% of the AMI hospitals, 98.0% of the stroke, and 97.8% of the hip fracture hospitals.

CONCLUSION

We recommend, on balance, LR-Firth 10% or 25% trimmed for detection of both low and high mortality outliers.

Baxter R, Taylor N, Kellar I, Pye V, Mohammed MA, Lawton R. Identifying positively deviant elderly medical wards using routinely collected NHS Safety Thermometer data: an observational study. BMJ Open 2018;8:e020219. [PMID: 29453303 PMCID: PMC5829907 DOI: 10.1136/bmjopen-2017-020219] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/24/2017] [Revised: 12/08/2017] [Accepted: 12/20/2017] [Indexed: 11/17/2022]

Abstract

OBJECTIVE

The positive deviance approach seeks to identify and learn from exceptional performers. Although a framework exists to apply positive deviance within healthcare organisations, there is limited guidance to support its implementation. The approach has also rarely explored exceptional performance on broad outcomes, been implemented at ward level, or applied within the UK. This study develops and critically appraises a pragmatic method for identifying positively deviant wards using a routinely collected, broad measure of patient safety.

DESIGN

A two-phased observational study was conducted. During phase 1, cross-sectional and temporal analyses of Safety Thermometer data were conducted to identify a discrete group of positively deviant wards that consistently demonstrated exceptional levels of safety. A group of matched comparison wards with above average performances were also identified. During phase 2, multidisciplinary staff and patients on the positively deviant and comparison wards completed surveys to explore whether their perceptions of safety supported the identification of positively deviant wards.

SETTING

34 elderly medical wards within a northern region of England, UK.

PARTICIPANTS

Multidisciplinary staff (n=161) and patients (n=188) clustered within nine positively deviant and comparison wards.

RESULTS

Phase 1: A combination of analyses identified five positively deviant wards that performed best in the region, outperformed their organisation and performed consistently well over 12 months. Five above average matched comparator wards were also identified. Phase 2: Staff and patient perceptions of safety generally supported the identification of positively deviant wards using Safety Thermometer data, although patient perceptions of safety were less concordant with the routinely collected data.

CONCLUSIONS

This study tentatively supports a pragmatic method of using routinely collected data to identify positively deviant elderly medical wards; however, it also highlights the various challenges that are faced when conducting the first stage of the positive deviance approach.

TRIAL REGISTRATION NUMBER

UK Clinical Research Network Portfolio (reference-18050).

Wholey DR, Finch M, Kreiger R, Reeves D. Public Reporting of Primary Care Clinic Quality: Accounting for Sociodemographic Factors in Risk Adjustment and Performance Comparison. Popul Health Manag 2018;21:378-386. [PMID: 29298402 DOI: 10.1089/pop.2017.0137] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]

Identification of outliers and positive deviants for healthcare improvement: looking for high performers in hypoglycemia safety in patients with diabetes. BMC Health Serv Res 2017;17:738. [PMID: 29145834 PMCID: PMC5691393 DOI: 10.1186/s12913-017-2692-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/20/2016] [Accepted: 11/07/2017] [Indexed: 11/29/2022]

Abstract

Background

The study objectives were to determine: (1) how statistical outliers exhibiting low rates of diabetes overtreatment performed on a reciprocal measure – rates of diabetes undertreatment; and (2) the impact of different criteria on high performing outlier status.

Methods

The design was serial cross-sectional, using yearly Veterans Health Administration (VHA) administrative data (2009–2013). Our primary outcome measure was facility rate of HbA1c overtreatment of diabetes in patients at risk for hypoglycemia. Outlier status was assessed by using two approaches: calculating a facility outlier value within year, comparator group, and A1c threshold while incorporating at risk population sizes; and examining standardized model residuals across year and A1c threshold. Facilities with outlier values in the lowest decile for all years of data using more than one threshold and comparator or with time-averaged model residuals in the lowest decile for all A1c thresholds were considered high performing outliers.

Results

Using outlier values, three of the 27 high performers from 2009 were also identified in 2010–2013 and considered outliers. There was only modest overlap between facilities identified as top performers based on three thresholds: A1c < 6%, A1c < 6.5%, and A1c < 7%. There was little effect of facility complexity or regional Veterans Integrated Service Networks (VISNs) on outlier identification. Consistent high performing facilities for overtreatment had higher rates of undertreatment (A1c > 9%) than VA average in the population of patients at high risk for hypoglycemia.

Conclusions

Statistical identification of positive deviants for diabetes overtreatment was dependent upon the specific measures and approaches used. Moreover, because two facilities may arrive at the same results via very different pathways, it is important to consider that a “best” practice may actually reflect a separate “worst” practice.

Unpacking quality indicators: how much do they reflect differences in the quality of care? BMJ Qual Saf 2017;27:4-6. [DOI: 10.1136/bmjqs-2017-006782] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 09/08/2017] [Indexed: 12/29/2022]

Gaynor JW, Pasquali SK, Ohye RG, Spray TL. Potential benefits and consequences of public reporting of pediatric cardiac surgery outcomes. J Thorac Cardiovasc Surg 2016;153:904-907. [PMID: 27919455 DOI: 10.1016/j.jtcvs.2016.08.066] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/30/2016] [Revised: 08/26/2016] [Accepted: 08/26/2016] [Indexed: 11/27/2022]

Driessen SRC, Wallwiener M, Taran FA, Cohen SL, Kraemer B, Wallwiener CW, van Zwet EW, Brucker SY, Jansen FW. Hospital versus individual surgeon's performance in laparoscopic hysterectomy. Arch Gynecol Obstet 2016;295:111-117. [PMID: 27628752 PMCID: PMC5225188 DOI: 10.1007/s00404-016-4199-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/03/2016] [Accepted: 09/06/2016] [Indexed: 11/29/2022]

Pasquali SK, Wallace AS, Gaynor JW, Jacobs ML, O'Brien SM, Hill KD, Gaies MG, Romano JC, Shahian DM, Mayer JE, Jacobs JP. Congenital Heart Surgery Case Mix Across North American Centers and Impact on Performance Assessment. Ann Thorac Surg 2016;102:1580-1587. [PMID: 27457827 DOI: 10.1016/j.athoracsur.2016.04.034] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 01/26/2016] [Revised: 04/05/2016] [Accepted: 04/11/2016] [Indexed: 11/24/2022]

Abstract

BACKGROUND

Performance assessment in congenital heart surgery is challenging due to the wide heterogeneity of disease. We describe current case mix across centers, evaluate methodology inclusive of all cardiac operations versus the more homogeneous subset of Society of Thoracic Surgeons benchmark operations, and describe implications regarding performance assessment.

METHODS

Centers (n = 119) participating in the Society of Thoracic Surgeons Congenital Heart Surgery Database (2010 through 2014) were included. Index operation type and frequency across centers were described. Center performance (risk-adjusted operative mortality) was evaluated and classified when including the benchmark versus all eligible operations.

RESULTS

Overall, 207 types of operations were performed during the study period (112,140 total cases). Few operations were performed across all centers; only 25% were performed at least once by 75% or more of centers. There was 7.9-fold variation across centers in the proportion of total cases comprising high-complexity cases (STAT 5). In contrast, the benchmark operations made up 36% of cases, and all but 2 were performed by at least 90% of centers. When evaluating performance based on benchmark versus all operations, 15% of centers changed performance classification; 85% remained unchanged. Benchmark versus all operation methodology was associated with lower power, with 35% versus 78% of centers meeting sample size thresholds.

CONCLUSIONS

There is wide variation in congenital heart surgery case mix across centers. Metrics based on benchmark versus all operations are associated with strengths (less heterogeneity) and weaknesses (lower power), and lead to differing performance classification for some centers. These findings have implications for ongoing efforts to optimize performance assessment, including choice of target population and appropriate interpretation of reported metrics.

The Importance of Integrating Clinical Relevance and Statistical Significance in the Assessment of Quality of Care--Illustrated Using the Swedish Stroke Register. PLoS One 2016;11:e0153082. [PMID: 27054326 PMCID: PMC4824466 DOI: 10.1371/journal.pone.0153082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2015] [Accepted: 03/23/2016] [Indexed: 11/19/2022]

Abstract

Background

When profiling hospital performance, quality indicators are commonly evaluated through hospital-specific adjusted means with confidence intervals. When identifying deviations from a norm, large hospitals can have statistically significant results even for clinically irrelevant deviations while important deviations in small hospitals can remain undiscovered. We have used data from the Swedish Stroke Register (Riksstroke) to illustrate the properties of a benchmarking method that integrates considerations of both clinical relevance and level of statistical significance.

Methods

The performance measure used was case-mix adjusted risk of death or dependency in activities of daily living within 3 months after stroke. A hospital was labeled as having outlying performance if its case-mix adjusted risk exceeded a benchmark value with a specified statistical confidence level. The benchmark was expressed relative to the population risk and should reflect the clinically relevant deviation that is to be detected. A simulation study based on Riksstroke patient data from 2008–2009 was performed to investigate the effect of the choice of the statistical confidence level and benchmark value on the diagnostic properties of the method.

Results

Simulations were based on 18,309 patients in 76 hospitals. The widely used setting, comparing 95% confidence intervals to the national average, resulted in low sensitivity (0.252) and high specificity (0.991). There were large variations in sensitivity and specificity for different requirements of statistical confidence. Lowering statistical confidence improved sensitivity with a relatively smaller loss of specificity. Variations due to different benchmark values were smaller, especially for sensitivity. This allows the choice of a clinically relevant benchmark to be driven by clinical factors without major concerns about sufficiently reliable evidence.

Conclusions

The study emphasizes the importance of combining clinical relevance and level of statistical confidence when profiling hospital performance. To guide the decision process a web-based tool that gives ROC-curves for different scenarios is provided.
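
The labeling rule in the Methods above (a hospital is an outlier if its adjusted risk exceeds a benchmark with a specified statistical confidence) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes a one-sided lower confidence bound from a normal approximation, and a benchmark given as a multiple of the population risk. The function name, parameters, and numbers are hypothetical.

```python
from statistics import NormalDist

def is_outlier(risk, std_error, population_risk,
               relative_benchmark=1.0, confidence=0.95):
    """Return True if the hospital's risk exceeds the benchmark with the given confidence.

    risk, std_error: case-mix adjusted risk estimate and its standard error.
    relative_benchmark: benchmark expressed relative to the population risk,
        e.g. 1.2 for "20% above the population risk".
    Assumption: one-sided lower bound, risk - z * std_error, from a normal
    approximation.
    """
    benchmark = population_risk * relative_benchmark
    z = NormalDist().inv_cdf(confidence)  # standard normal quantile
    lower_bound = risk - z * std_error
    return lower_bound > benchmark

# Hypothetical hospital with adjusted risk 0.26 and standard error 0.02,
# against a population risk of 0.20 and a benchmark 20% above it (0.24).
strict = is_outlier(0.26, 0.02, 0.20, relative_benchmark=1.2, confidence=0.95)
lenient = is_outlier(0.26, 0.02, 0.20, relative_benchmark=1.2, confidence=0.60)
```

In this example the hospital is flagged only at the lower confidence level, which mirrors the reported trade-off: relaxing the confidence requirement raises sensitivity at some cost in specificity.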

Amaral ACKB, Cuthbertson BH. Balancing quality of care and resource utilisation in acute care hospitals. BMJ Qual Saf 2016;25:824-826. [PMID: 26762149 DOI: 10.1136/bmjqs-2015-005037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/21/2015] [Indexed: 11/03/2022]

Pandit JJ. Deaths by horsekick in the Prussian army - and other ‘Never Events’ in large organisations. Anaesthesia 2015;71:7-11. [DOI: 10.1111/anae.13261] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Abel G, Lyratzopoulos G. Ranking hospitals on avoidable death rates derived from retrospective case record review: methodological observations and limitations. BMJ Qual Saf 2015;24:554-7. [PMID: 26141503 PMCID: PMC4552920 DOI: 10.1136/bmjqs-2015-004366] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2015] [Revised: 06/15/2015] [Accepted: 06/17/2015] [Indexed: 11/11/2022]

Shahian DM, He X, Jacobs JP, Kurlansky PA, Badhwar V, Cleveland JC, Fazzalari FL, Filardo G, Normand SLT, Furnary AP, Magee MJ, Rankin JS, Welke KF, Han J, O'Brien SM. The Society of Thoracic Surgeons Composite Measure of Individual Surgeon Performance for Adult Cardiac Surgery: A Report of The Society of Thoracic Surgeons Quality Measurement Task Force. Ann Thorac Surg 2015;100:1315-24; discussion 1324-5. [PMID: 26330012 DOI: 10.1016/j.athoracsur.2015.06.122] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/12/2015] [Revised: 05/29/2015] [Accepted: 06/26/2015] [Indexed: 10/23/2022]

Abstract

BACKGROUND

Previous composite performance measures of The Society of Thoracic Surgeons (STS) were estimated at the STS participant level, typically a hospital or group practice. The STS Quality Measurement Task Force has now developed a multiprocedural, multidimensional composite measure suitable for estimating the performance of individual surgeons.

METHODS

The development sample from the STS National Database included 621,489 isolated coronary artery bypass grafting procedures, isolated aortic valve replacement, aortic valve replacement plus coronary artery bypass grafting, mitral, or mitral plus coronary artery bypass grafting procedures performed by 2,286 surgeons between July 1, 2011, and June 30, 2014. Each surgeon's composite score combined their aggregate risk-adjusted mortality and major morbidity rates (each weighted inversely by their standard deviations) and reflected the proportion of case types they performed. Model parameters were estimated in a Bayesian framework. Composite star ratings were examined using 90%, 95%, or 98% Bayesian credible intervals. Measure reliability was estimated using various 3-year case thresholds.

RESULTS

The final composite measure was defined as 0.81 × (1 minus risk-standardized mortality rate) + 0.19 × (1 minus risk-standardized complication rate). Risk-adjusted mortality (median, 2.3%; interquartile range, 1.7% to 3.0%), morbidity (median, 13.7%; interquartile range, 10.8% to 17.1%), and composite scores (median, 95.4%; interquartile range, 94.4% to 96.3%) varied substantially across surgeons. Using 98% Bayesian credible intervals, there were 207 1-star (lower performance) surgeons (9.1%), 1,701 2-star (as-expected performance) surgeons (74.4%), and 378 3-star (higher performance) surgeons (16.5%). With an eligibility threshold of 100 cases over 3 years, measure reliability was 0.81.
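
The composite formula above is simple arithmetic, sketched below with rates expressed as proportions. The function name is hypothetical, and the inputs in the example are the reported medians, used only as illustrative values; the composite of the median rates need not equal the median composite score across surgeons.

```python
def sts_surgeon_composite(mortality_rate, complication_rate):
    """Weighted composite as stated in the abstract:
    0.81 * (1 - risk-standardized mortality rate)
    + 0.19 * (1 - risk-standardized complication rate).
    Rates are proportions in [0, 1].
    """
    for rate in (mortality_rate, complication_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be proportions between 0 and 1")
    return 0.81 * (1.0 - mortality_rate) + 0.19 * (1.0 - complication_rate)


# Illustrative inputs: median mortality 2.3%, median morbidity 13.7%.
score = sts_surgeon_composite(0.023, 0.137)
```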

CONCLUSIONS

The STS has developed a multiprocedural composite measure suitable for evaluating performance at the individual surgeon level.

Deeny SR, Steventon A. Making sense of the shadows: priorities for creating a learning healthcare system based on routinely collected data. BMJ Qual Saf 2015;24:505-15. [PMID: 26065466 PMCID: PMC4515981 DOI: 10.1136/bmjqs-2015-004278] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2015] [Accepted: 04/13/2015] [Indexed: 11/08/2022]