1
Ross J, Lavallee LT, Hickling D, van Walraven C. Development of the multivariate administrative data cystectomy model and its impact on misclassification bias. BMC Med Res Methodol 2024; 24:73. [PMID: 38515018 PMCID: PMC10956281 DOI: 10.1186/s12874-024-02199-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 03/06/2024] [Indexed: 03/23/2024]
Abstract
BACKGROUND Misclassification bias (MB) is the deviation of measured from true values due to incorrect case assignment. This study compared MB when cystectomy status was determined using administrative database codes vs. predicted cystectomy probability. METHODS We identified every primary cystectomy-diversion type at a single hospital from 2009 to 2019. We linked to claims data to measure the true association of cystectomy with 30 patient and hospitalization factors. Associations were also measured when cystectomy status was assigned using billing codes and by cystectomy probability from a multivariate logistic regression model with covariates from administrative data. MB was the difference between measured and true associations. RESULTS 500 people underwent cystectomy (0.12% of 428 677 hospitalizations). Sensitivity and positive predictive values for cystectomy codes were 97.1% and 58.6% for incontinent diversions and 100.0% and 48.4% for continent diversions, respectively. The model accurately predicted cystectomy-incontinent diversion (c-statistic [C] 0.999, Integrated Calibration Index [ICI] 0.000) and cystectomy-continent diversion (C 1.000, ICI 0.000) probabilities. MB was significantly lower when model-based predictions were used to impute cystectomy-diversion type status for both incontinent cystectomy (F = 12.75; p < .0001) and continent cystectomy (F = 11.25; p < .0001). CONCLUSIONS A model using administrative data accurately returned the probability that cystectomy by diversion type occurred during a hospitalization. Using this model to impute cystectomy status minimized MB. Accuracy of administrative database research can be increased by using probabilistic imputation to determine case status instead of individual codes.
Affiliation(s)
- James Ross
- Department of Surgery, University of Ottawa, Ottawa, Canada
- Duane Hickling
- Department of Surgery, University of Ottawa, Ottawa, Canada
- Carl van Walraven
- Department of Medicine / Department of Epidemiology & Community Medicine, University of Ottawa, ASB1-003, 1053 Carling Ave, Ottawa, ON, K1Y 4E9, Canada.
- Ottawa Hospital Research Institute, ASB1-003, 1053 Carling Ave, Ottawa, ON, K1Y 4E9, Canada.
- ICES-uOttawa, ASB1-003, 1053 Carling Ave, Ottawa, ON, K1Y 4E9, Canada.
2
Adamczyk A, Grammatopoulos G, van Walraven C. Minimizing misclassification bias with a model to identify acetabular fractures using health administrative data: A cohort study. Medicine (Baltimore) 2021; 100:e28223. [PMID: 34967356 PMCID: PMC8718247 DOI: 10.1097/md.0000000000028223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Accepted: 11/24/2021] [Indexed: 01/05/2023]
Abstract
Acetabular fractures (AFs) are relatively uncommon, thereby limiting their study. Analyses using population-based health administrative data can return erroneous results if case identification is inaccurate ('misclassification bias'). This study measured the impact of an AF prediction model based exclusively on administrative data upon misclassification bias. We applied text analytical methods to all radiology reports over 11 years at a large, tertiary care teaching hospital to identify all AFs. Using clinically-based variable selection techniques, a logistic regression model was created. We identified 728 AFs in 438,098 hospitalizations (15.1 cases/10,000 admissions). The International Classification of Disease, 10th revision (ICD-10) code for AF (S32.4) missed almost half of cases and misclassified more than a quarter (sensitivity 51.2%, positive predictive value 73.0%). The AF model was very accurate (optimism adjusted R² 0.618, c-statistic 0.988, calibration slope 1.06). When model-based expected probabilities were used to determine AF status using bootstrap imputation methods, misclassification bias for AF prevalence and its association with other variables was much lower than with ICD-10 code S32.4 (median [range] relative difference 1.0% [0%-9.0%] vs 18.0% [5.4%-75.0%]). Lone administrative database diagnostic codes are inadequate to create AF cohorts. The probability of AF can be accurately determined using health administrative data. This probability can be used in bootstrap imputation methods to substantially reduce misclassification bias.
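The bootstrap imputation idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the procedure in detail, so the steps here (resample records with replacement, then draw case status as a Bernoulli trial from each record's model-based probability) are an assumption about the general technique, not the authors' implementation. The function name, the probabilities, and the sample size are all hypothetical.

```python
import random
import statistics


def bootstrap_impute_prevalence(probabilities, n_imputations=200, seed=0):
    """Estimate prevalence by repeatedly imputing case status from probabilities."""
    rng = random.Random(seed)
    n = len(probabilities)
    estimates = []
    for _ in range(n_imputations):
        cases = 0
        for _ in range(n):
            # Bootstrap step: resample one record with replacement.
            p = probabilities[rng.randrange(n)]
            # Imputation step: the record is a case with probability p.
            if rng.random() < p:
                cases += 1
        estimates.append(cases / n)
    estimates.sort()
    # Percentile interval across the imputed samples (2.5th to 97.5th).
    lower = estimates[int(0.025 * n_imputations)]
    upper = estimates[int(0.975 * n_imputations) - 1]
    return statistics.mean(estimates), (lower, upper)


# Hypothetical model output: 4 near-certain cases among 2,000 hospitalizations.
expected_probabilities = [0.9] * 4 + [0.0005] * 1996
mean_prevalence, (low, high) = bootstrap_impute_prevalence(expected_probabilities)
print(f"imputed prevalence {mean_prevalence:.4%} (interval {low:.4%} to {high:.4%})")
```

Because status is drawn from the probability rather than fixed by a single code, the averaged prevalence tracks the mean predicted probability, which is why the approach depends on the model being well calibrated.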
Affiliation(s)
- Andrew Adamczyk
- Department of Surgery, University of Ottawa, Ottawa Hospital Research Institute, Canada
- George Grammatopoulos
- Department of Surgery, University of Ottawa, Ottawa Hospital Research Institute, Canada
- Carl van Walraven
- Department of Medicine, University of Ottawa, Canada
- Department of Epidemiology & Community Medicine, University of Ottawa, Ottawa Hospital Research Institute, ICES, Canada
3
Kendzerska T, van Walraven C, McIsaac DI, Povitz M, Mulpuru S, Lima I, Talarico R, Aaron SD, Reisman W, Gershon AS. Case-Ascertainment Models to Identify Adults with Obstructive Sleep Apnea Using Health Administrative Data: Internal and External Validation. Clin Epidemiol 2021; 13:453-467. [PMID: 34168503 PMCID: PMC8216743 DOI: 10.2147/clep.s308852] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/03/2021] [Accepted: 05/12/2021] [Indexed: 01/29/2023]
Abstract
Background There is limited evidence on whether obstructive sleep apnea (OSA) can be accurately identified using health administrative data. Study Design and Methods We derived and validated a case-ascertainment model to identify OSA using linked provincial health administrative and clinical data from all consecutive adults who underwent a diagnostic sleep study (index date) at two large academic centers (Ontario, Canada) from 2007 to 2017. The presence of moderate/severe OSA (an apnea–hypopnea index ≥ 15) was defined using clinical data. Of 39 candidate health administrative variables considered, 32 were tested. We used classification and regression tree (CART) methods to identify the most parsimonious models via cost-complexity pruning. Identified variables were also used to create parsimonious logistic regression models. All individuals with an estimated probability of 0.5 or greater using the predictive models were classified as having OSA. Results The case-ascertainment models were derived and validated internally through bootstrapping on 5099 individuals from one center (33% moderate/severe OSA) and validated externally on 13,486 adults from the other (45% moderate/severe OSA). On the external cohort, parsimonious models demonstrated c-statistics of 0.75–0.81, sensitivities of 59–60%, specificities of 87–88%, positive predictive values of 79%, negative predictive values of 73%, positive likelihood ratios (+LRs) of 4.5–5.0 and –LRs of 0.5. Logistic models performed better than CART models (mean integrated calibration indices of 0.02–0.03 and 0.06–0.12, respectively). The best model included: sex, age, and hypertension at the index date, as well as an outpatient specialty physician visit for OSA, a repeated sleep study, and a positive airway pressure treatment claim within 1 year of the index date. Interpretation Among adults who underwent a sleep study, case-ascertainment models for identifying moderate/severe OSA using health administrative data had relatively low sensitivity but high specificity and good discriminative ability. These findings could help study trends and outcomes of individuals with OSA using routinely collected health care data.
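The likelihood ratios quoted in this abstract follow from sensitivity and specificity by the standard definitions, +LR = sensitivity / (1 − specificity) and −LR = (1 − sensitivity) / specificity. The short sketch below applies those definitions together with the 0.5 probability cut-off mentioned in the methods. The abstract reports only ranges, so pairing 59% with 87% and 60% with 88% is an assumption, and the function names and example probabilities are hypothetical.

```python
def classify(probabilities, threshold=0.5):
    """Label a record as a case when its estimated probability meets the threshold."""
    return [p >= threshold for p in probabilities]


def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from sensitivity and specificity given as proportions."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


# Hypothetical model probabilities for four individuals.
labels = classify([0.12, 0.50, 0.81, 0.49])

# Assumed pairings of the reported sensitivity and specificity ranges.
for sens, spec in [(0.59, 0.87), (0.60, 0.88)]:
    plus_lr, minus_lr = likelihood_ratios(sens, spec)
    print(f"sens={sens:.2f} spec={spec:.2f} +LR={plus_lr:.2f} -LR={minus_lr:.2f}")
```

Under these pairings the positive likelihood ratios come out at about 4.5 and 5.0 and the negative likelihood ratios at just under 0.5, in line with the values reported in the abstract.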
Affiliation(s)
- Tetyana Kendzerska
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada; ICES, Ottawa, Toronto, Ontario, Canada
- Carl van Walraven
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada; ICES, Ottawa, Toronto, Ontario, Canada
- Daniel I McIsaac
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; ICES, Ottawa, Toronto, Ontario, Canada; Departments of Anesthesiology & Pain Medicine, University of Ottawa and Ottawa Hospital, Ottawa, Ontario, Canada
- Marcus Povitz
- Department of Medicine at Schulich School of Medicine and Dentistry at Western University, London, Ontario, Canada; Cumming School of Medicine, Department of Medicine, University of Calgary, Calgary, Alberta, Canada
- Sunita Mulpuru
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Isac Lima
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; ICES, Ottawa, Toronto, Ontario, Canada
- Robert Talarico
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; ICES, Ottawa, Toronto, Ontario, Canada
- Shawn D Aaron
- Department of Medicine, The Ottawa Hospital Research Institute/The Ottawa Hospital, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- William Reisman
- Department of Medicine at Schulich School of Medicine and Dentistry at Western University, London, Ontario, Canada; Department of Medicine, London Health Sciences Centre, London, Ontario, Canada
- Andrea S Gershon
- ICES, Ottawa, Toronto, Ontario, Canada; Faculty of Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Medicine, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
4
Corrales-Medina VF, van Walraven C. Accuracy of Administrative Database Algorithms for Hospitalized Pneumonia in Adults: a Systematic Review. J Gen Intern Med 2021; 36:683-690. [PMID: 33420557 PMCID: PMC7947096 DOI: 10.1007/s11606-020-06211-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2020] [Accepted: 09/02/2020] [Indexed: 11/29/2022]
Abstract
BACKGROUND Administrative data algorithms (ADAs) to identify pneumonia cases are commonly used in the analysis of pneumonia burden, trends, etiology, processes of care, outcomes, health care utilization, cost, and response to preventative and therapeutic interventions. However, without a good understanding of the validity of ADAs for pneumonia case identification, an adequate appreciation of this literature is difficult. We systematically reviewed the quality and accuracy of published ADAs to identify adult hospitalized pneumonia cases. METHODS We reviewed the Medline, EMBase, and Cochrane Central databases through May 2020. All studies describing ADAs for adult hospitalized pneumonia and at least one accuracy statistic were included. Investigators independently extracted information about the sampling frame, reference standard, ADA composition, and ADA accuracy. RESULTS Thirteen studies involving 24 ADAs were analyzed. Compliance with a 38-item study-quality assessment tool ranged from 17 to 29 (median, 23; interquartile range [IQR], 20 to 25). Study setting, design, and ADA composition varied extensively. Inclusion criteria of most studies selected for high-risk populations and/or increased pneumonia likelihood. Reference standards with explicit criteria (clinical, laboratory, and/or radiographic) were used in only 4 ADAs. Only 2 ADAs were validated (one internally and one externally). ADA positive predictive values ranged from 35.0 to 96.5% (median, 84.8%; IQR, 65.3 to 89.1%). However, these values are exaggerated for an unselected patient population because pneumonia prevalences in the study cohorts were very high (median, 66%; IQR, 46 to 86%). ADA sensitivities ranged from 31.3 to 97.8% (median, 65.1%; IQR, 52.5 to 72.4%). DISCUSSION ADAs for identification of adult pneumonia hospitalizations are highly heterogeneous, poorly validated, and at risk for misclassification bias. Greater standardization in reporting ADA accuracy is required in studies using pneumonia ADA for case identification so that results can be properly interpreted.
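The remark that positive predictive values are exaggerated in high-prevalence cohorts follows from Bayes' theorem: PPV = Se·p / (Se·p + (1 − Sp)·(1 − p)), where p is prevalence. The sketch below works through that relationship. The sensitivity of 0.65 is close to the median reported above, but the specificity of 0.90 and the 5% comparison prevalence are hypothetical values chosen only for illustration, since the abstract does not report them.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' theorem, with all inputs given as proportions."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same hypothetical algorithm applied at two different prevalences.
ppv_selected = positive_predictive_value(0.65, 0.90, 0.66)    # study-cohort-like prevalence
ppv_unselected = positive_predictive_value(0.65, 0.90, 0.05)  # hypothetical unselected population
print(f"PPV at 66% prevalence: {ppv_selected:.3f}")
print(f"PPV at 5% prevalence:  {ppv_unselected:.3f}")
```

With identical sensitivity and specificity, the PPV falls from roughly 0.93 to roughly 0.25 as prevalence drops, which is the mechanism behind the caution in the abstract.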
Affiliation(s)
- Vicente F Corrales-Medina
- Clinical Epidemiology Program, The Ottawa Hospital Research Institute, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada; The Ottawa Hospital Civic Campus, Ottawa, Ontario, Canada
- Carl van Walraven
- Clinical Epidemiology Program, The Ottawa Hospital Research Institute, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
5
van Walraven C. A comparison of methods to correct for misclassification bias from administrative database diagnostic codes. Int J Epidemiol 2019; 47:605-616. [PMID: 29253160 DOI: 10.1093/ije/dyx253] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Accepted: 12/06/2017] [Indexed: 11/15/2022]
Abstract
Background In administrative database research, misclassification bias can result from diagnostic codes that imperfectly represent the condition being studied. It is unclear how to correct for this bias. Methods Severe renal failure and Colles' fracture status were determined in two distinct cohorts using gold standard methods. True disease prevalence and disease association with other covariables were measured and compared with results when disease status was determined using diagnostic codes. Differences ('misclassification bias') were then adjusted for using two methods: quantitative bias analysis (QBA) with bias parameters (code sensitivity and specificity) of varying accuracy; and disease status imputation using bootstrap methods and disease probability models. Results Prevalences of severe renal failure (n = 50 074) and Colles' fracture (n = 5680) were 7.5% and 37.0%, respectively. Compared with true values, important bias resulted when diagnostic codes were used to measure disease prevalence and disease-covariable associations. QBA increased bias when population-based (vs strata-specific) bias parameters were used. QBA's ability to account for misclassification bias was most dependent upon deviations in code specificity. Bootstrap imputation accounted for misclassification bias, but this depended on disease model calibration. Conclusions Extensive bias can result from using inaccurate diagnostic codes to determine disease status. This bias can be addressed with QBA using accurate bias parameter measures, or by bootstrap imputation using well-calibrated disease prediction models.
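One simple form of quantitative bias analysis for prevalence is the standard back-correction that inverts the misclassification equation: observed = Se·true + (1 − Sp)·(1 − true), so true = (observed + Sp − 1) / (Se + Sp − 1). The sketch below applies that textbook formula. It is not necessarily the procedure used in the study, and the sensitivity and specificity values are hypothetical; only the 7.5% prevalence is taken from the abstract.

```python
def observed_prevalence(true_prevalence, sensitivity, specificity):
    """Prevalence a diagnostic code would report given its accuracy."""
    return sensitivity * true_prevalence + (1 - specificity) * (1 - true_prevalence)


def qba_corrected_prevalence(observed, sensitivity, specificity):
    """Back-correct an observed prevalence using assumed bias parameters."""
    return (observed + specificity - 1) / (sensitivity + specificity - 1)


true_prev = 0.075  # prevalence of severe renal failure reported in the abstract
code_prev = observed_prevalence(true_prev, sensitivity=0.70, specificity=0.98)

# Correct bias parameters recover the true value.
recovered = qba_corrected_prevalence(code_prev, sensitivity=0.70, specificity=0.98)
# A specificity that is off by one percentage point leaves residual bias.
misjudged = qba_corrected_prevalence(code_prev, sensitivity=0.70, specificity=0.99)
print(f"code-based {code_prev:.4f}, corrected {recovered:.4f}, mis-specified {misjudged:.4f}")
```

In this example a one-point error in assumed specificity shifts the corrected prevalence from 7.5% to about 8.8%, which illustrates why the abstract singles out deviations in code specificity.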
Affiliation(s)
- Carl van Walraven
- Departments of Medicine and Epidemiology & Community Medicine, University of Ottawa, ASB1-003, 1053 Carling Ave, Ottawa, ON, K1Y 4E9, Canada
7
van Walraven C. Bootstrap imputation minimized misclassification bias when measuring Colles' fracture prevalence and its associations using health administrative data. J Clin Epidemiol 2017; 96:93-100. [PMID: 29288134 DOI: 10.1016/j.jclinepi.2017.12.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/13/2017] [Revised: 11/27/2017] [Accepted: 12/08/2017] [Indexed: 10/18/2022]
Abstract
OBJECTIVES Misclassification bias can result from the incorrect assignment of disease status using inaccurate diagnostic codes in health administrative data. This study quantified misclassification bias in the study of Colles' fracture. STUDY DESIGN AND SETTING Colles' fracture status was determined in all patients >50 years old seen in the emergency room at a single teaching hospital between 2006 and 2014 by manually reviewing all forearm radiographs. This data set was linked to population-based data capturing all emergency room visits. Reference disease prevalence and its association with covariates were measured. A multivariate model using covariates derived from administrative data was used to impute Colles' fracture status and measure its prevalence and associations using bootstrapping methods. These values were compared with reference values to measure misclassification bias. This was repeated using diagnostic codes to determine Colles' fracture status. RESULTS A total of 518,744 emergency visits were included, with 3,538 (0.7%) having a Colles' fracture. Determining disease status using the diagnostic code (sensitivity 69.4%, positive predictive value 79.9%) resulted in a significant underestimate of Colles' fracture prevalence (relative difference -13.3%) and biased associations with covariates. The Colles' fracture model accurately determined disease probability (c-statistic 98.9 [95% confidence interval {CI} 98.7-99.1], calibration slope 1.009 [95% CI 1.004-1.013], Nagelkerke's R² 0.71 [95% CI 0.70-0.72]). Using disease probability estimates from this model, bootstrap imputation (BI) resulted in minimal misclassification bias (relative difference in disease prevalence -0.01%). The statistical significance of the association between Colles' fracture and age was accurate in 32.4% and 70.4% of samples when using the code or BI, respectively. CONCLUSION Misclassification bias in estimating disease prevalence and its associations can be minimized with BI using accurate disease probability estimates.
Affiliation(s)
- Carl van Walraven
- Professor of Medicine and Epidemiology & Community Medicine, University of Ottawa; Senior Scientist, Ottawa Hospital Research Institute; Scientist, Institute for Clinical Evaluative Sciences.