1
Perets O, Stagno E, Yehuda EB, McNichol M, Anthony Celi L, Rappoport N, Dorotic M. Inherent Bias in Electronic Health Records: A Scoping Review of Sources of Bias. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.09.24305594. [PMID: 38680842 PMCID: PMC11046491 DOI: 10.1101/2024.04.09.24305594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024]
Abstract
OBJECTIVES Biases inherent in electronic health records (EHRs), and therefore in medical artificial intelligence (AI) models may significantly exacerbate health inequities and challenge the adoption of ethical and responsible AI in healthcare. Biases arise from multiple sources, some of which are not as documented in the literature. Biases are encoded in how the data has been collected and labeled, by implicit and unconscious biases of clinicians, or by the tools used for data processing. These biases and their encoding in healthcare records undermine the reliability of such data and bias clinical judgments and medical outcomes. Moreover, when healthcare records are used to build data-driven solutions, the biases are further exacerbated, resulting in systems that perpetuate biases and induce healthcare disparities. This literature scoping review aims to categorize the main sources of biases inherent in EHRs. METHODS We queried PubMed and Web of Science on January 19th, 2023, for peer-reviewed sources in English, published between 2016 and 2023, using the PRISMA approach to stepwise scoping of the literature. To select the papers that empirically analyze bias in EHR, from the initial yield of 430 papers, 27 duplicates were removed, and 403 studies were screened for eligibility. 196 articles were removed after the title and abstract screening, and 96 articles were excluded after the full-text review resulting in a final selection of 116 articles. RESULTS Systematic categorizations of diverse sources of bias are scarce in the literature, while the effects of separate studies are often convoluted and methodologically contestable. Our categorization of published empirical evidence identified the six main sources of bias: (a) bias arising from past clinical trials; (b) data-related biases arising from missing, incomplete information or poor labeling of data; human-related bias induced by (c) implicit clinician bias, (d) referral and admission bias; (e) diagnosis or risk disparities bias and finally, (f) biases in machinery and algorithms. CONCLUSIONS Machine learning and data-driven solutions can potentially transform healthcare delivery, but not without limitations. The core inputs in the systems (data and human factors) currently contain several sources of bias that are poorly documented and analyzed for remedies. The current evidence heavily focuses on data-related biases, while other sources are less often analyzed or anecdotal. However, these different sources of biases add to one another exponentially. Therefore, to understand the issues holistically we need to explore these diverse sources of bias. While racial biases in EHR have been often documented, other sources of biases have been less frequently investigated and documented (e.g. gender-related biases, sexual orientation discrimination, socially induced biases, and implicit, often unconscious, human-related cognitive biases). Moreover, some existing studies lack causal evidence, illustrating the different prevalences of disease across groups, which does not per se prove the causality. Our review shows that data-, human- and machine biases are prevalent in healthcare and they significantly impact healthcare outcomes and judgments and exacerbate disparities and differential treatment. Understanding how diverse biases affect AI systems and recommendations is critical. We suggest that researchers and medical personnel should develop safeguards and adopt data-driven solutions with a "bias-in-mind" approach. More empirical evidence is needed to tease out the effects of different sources of bias on health outcomes.
2
Her QL, Dejene SZ, Ismail S, Wang T, Jonsson-Funk M, Pate V, Min JY, Flory J. Validation of an international classification of disease, tenth revision, clinical modification (ICD-10-CM) algorithm in identifying severe hypoglycaemia events for real-world studies. Diabetes Obes Metab 2024; 26:1282-1290. [PMID: 38204417 DOI: 10.1111/dom.15428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/29/2023] [Accepted: 12/09/2023] [Indexed: 01/12/2024]
Abstract
AIM The transition to the ICD-10-CM coding system has reduced the utility of hypoglycaemia algorithms based on ICD-9-CM diagnosis codes in real-world studies of antidiabetic drugs. We mapped a validated ICD-9-CM hypoglycaemia algorithm to ICD-10-CM codes to create an ICD-10-CM hypoglycaemia algorithm and assessed its performance in identifying severe hypoglycaemia. MATERIALS AND METHODS We assembled a cohort of Medicare patients with DM and linked electronic health record (EHR) data to the University of North Carolina Health System and identified candidate severe hypoglycaemia events from their Medicare claims using the ICD-10-CM hypoglycaemia algorithm. We confirmed severe hypoglycaemia by EHR review and computed a positive predictive value (PPV) of the algorithm to assess its performance. We refined the algorithm by removing poor performing codes (PPV ≤0.5) and computed a Cohen's κ statistic to evaluate the agreement of the EHR reviews. RESULTS The algorithm identified 642 candidate severe hypoglycaemia events, and we confirmed 455 as true severe hypoglycaemia events, PPV of 0.709 (95% confidence interval: 0.672, 0.744). When we refined the algorithm, the PPV increased to 0.893 (0.862, 0.918) and missed <2.42% (<11) true severe hypoglycaemia events. Agreement between reviewers was high, κ = 0.93 (0.89, 0.97). CONCLUSIONS We translated an ICD-9-CM hypoglycaemia algorithm to an ICD-10-CM version and found its performance was modest. The performance of the algorithm improved by removing poor performing codes at the trade-off of missing very few severe hypoglycaemia events. The algorithm has the potential to be used to identify severe hypoglycaemia in real-world studies of antidiabetic drugs.
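As a rough illustration of the PPV arithmetic reported above, the sketch below recomputes the point estimate from the two counts given in the abstract (455 confirmed events out of 642 candidates). The interval method is an assumption: the abstract does not say how its confidence interval was obtained, so the Wilson score interval used here may differ slightly from the published bounds. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


confirmed = 455   # events confirmed by EHR review (from the abstract)
candidates = 642  # candidate events flagged by the algorithm (from the abstract)

ppv = confirmed / candidates
low, high = wilson_interval(confirmed, candidates)
print(f"PPV = {ppv:.3f}, approximate 95% CI = ({low:.3f}, {high:.3f})")
```

The point estimate rounds to the 0.709 stated in the abstract; the interval is only an approximation under the stated assumption.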
Affiliation(s)
- Qoua L Her
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Sara Z Dejene
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Sherin Ismail
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Tiansheng Wang
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Michele Jonsson-Funk
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Virginia Pate
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Jea Young Min
  - Department of Population Health Sciences, Weill Cornell Medical College, New York, New York, USA
- James Flory
  - Endocrinology Service, Department of Subspecialty Medicine, Memorial Sloan Kettering Cancer Center, New York, New York, USA
3
Rosen EM, Ritchey ME, Girman CJ. Can Weight of Evidence, Quantitative Bias, and Bounding Methods Evaluate Robustness of Real-world Evidence for Regulator and Health Technology Assessment Decisions on Medical Interventions? Clin Ther 2023; 45:1266-1276. [PMID: 37798219 DOI: 10.1016/j.clinthera.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 06/07/2023] [Accepted: 09/12/2023] [Indexed: 10/07/2023]
Abstract
PURPOSE High-quality evidence is crucial for health care intervention decision-making. These decisions frequently use nonrandomized data, which can be more vulnerable to biases than randomized trials. Accordingly, methods to quantify biases and weigh available evidence could elucidate the robustness of findings, giving regulators more confidence in making approval and reimbursement decisions. METHODS We conducted an integrative literature review to identify methods for determining probability of causation, evaluating weight of evidence, and conducting quantitative bias analysis as related to health care interventions. Eligible studies were published from 2012 to 2021, applicable to pharmacoepidemiology, and presented a method that met our objective. FINDINGS Twenty-two eligible studies were classified into 4 categories: (1) quantitative bias analysis; (2) weight of evidence methods; (3) Bayesian networks; and (4) miscellaneous. All of the methods have strengths, limitations, and situations in which they are more well suited than others. Some methods seem to lend themselves more to applications of health care evidence on medical interventions than others. IMPLICATIONS To provide robust evidence for and improve confidence in regulatory or reimbursement decisions, we recommend applying multiple methods to triangulate associations of medical interventions, accounting for biases in different ways. This approach could lead to well-defined robustness assessments of study findings and appropriate science-driven decisions by regulators and payers for public health.
Affiliation(s)
- Emma M Rosen
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, North Carolina, USA; CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA
- Mary E Ritchey
  - CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA; Med Tech Epi, LLC, Philadelphia, Pennsylvania, USA; Center for Pharmacoepidemiology & Treatment Science, Rutgers University, New Brunswick, New Jersey, USA
- Cynthia J Girman
  - Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, North Carolina, USA; CERobs Consulting, LLC, Wrightsville Beach, North Carolina, USA
4
Poulsen MN, Nordberg CM, Troiani V, Berrettini W, Asdell PB, Schwartz BS. Identification of opioid use disorder using electronic health records: Beyond diagnostic codes. Drug Alcohol Depend 2023; 251:110950. [PMID: 37716289 PMCID: PMC10620734 DOI: 10.1016/j.drugalcdep.2023.110950] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Revised: 08/24/2023] [Accepted: 08/29/2023] [Indexed: 09/18/2023]
Abstract
BACKGROUND We used structured and unstructured electronic health record (EHR) data to develop and validate an approach to identify moderate/severe opioid use disorder (OUD) that includes individuals without prescription opioid use or chronic pain, an underrepresented population. METHODS Using electronic diagnosis grouper text from EHRs of ~1 million patients (2012-2020), we created indicators of OUD (with "tiers" indicating OUD likelihood) combined with OUD medication (MOUD) orders. We developed six sub-algorithms with varying criteria (multiple vs single MOUD orders, multiple vs single tier 1 indicators, tier 2 indicators, tier 3 and 4 indicators). Positive predictive values (PPVs) were calculated based on chart review to determine OUD status and severity. We compared demographic and clinical characteristics of cases identified by the sub-algorithms. RESULTS In total, 14,852 patients met criteria for one of the sub-algorithms. Five sub-algorithms had PPVs ≥0.90 for any severity OUD; four had PPVs ≥0.90 for moderate/severe OUD. Demographic and clinical characteristics differed substantially between groups. Of identified OUD cases, 31.3% had no past opioid analgesic orders, 79.7% lacked evidence of chronic prescription opioid use, and 43.5% lacked a chronic pain diagnosis. DISCUSSION Incorporating unstructured data with MOUD orders yielded an approach that adequately identified moderate/severe OUD, identified unique demographic and clinical sub-groups, and included individuals without prescription opioid use or chronic pain, whose OUD may stem from illicit opioids. Findings show that incorporating unstructured data strengthens EHR algorithms for identifying OUD and suggest that approaches limited to populations with prescription opioid use or chronic pain exclude many individuals with OUD.
Affiliation(s)
- Melissa N Poulsen
  - Department of Population Health Sciences, Geisinger, Danville, PA, USA
- Cara M Nordberg
  - Department of Population Health Sciences, Geisinger, Danville, PA, USA
- Vanessa Troiani
  - Department of Autism and Developmental Medicine, Geisinger, Lewisburg, PA, USA
- Wade Berrettini
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Patrick B Asdell
  - Department of Family Medicine, Summa Health, Barberton, OH, USA
- Brian S Schwartz
  - Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
5
Penrod N, Okeh C, Velez Edwards DR, Barnhart K, Senapati S, Verma SS. Leveraging electronic health record data for endometriosis research. Front Digit Health 2023; 5:1150687. [PMID: 37342866 PMCID: PMC10278662 DOI: 10.3389/fdgth.2023.1150687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 05/10/2023] [Indexed: 06/23/2023] Open
Abstract
Endometriosis is a chronic, complex disease for which there are vast disparities in diagnosis and treatment between sociodemographic groups. Clinical presentation of endometriosis can vary from asymptomatic disease, often identified during (in)fertility consultations, to dysmenorrhea and debilitating pelvic pain. Because of this complexity, delayed diagnosis (mean time to diagnosis is 1.7-3.6 years) and misdiagnosis are common. Early and accurate diagnosis of endometriosis remains a research priority for patient advocates and healthcare providers. Electronic health records (EHRs) have been widely adopted as a data source in biomedical research. However, they remain a largely untapped source of data for endometriosis research. EHRs capture diverse, real-world patient populations and care trajectories and can be used to learn patterns of underlying risk factors for endometriosis which, in turn, can be used to inform screening guidelines to help clinicians efficiently and effectively recognize and diagnose the disease in all patient populations, reducing inequities in care. Here, we provide an overview of the advantages and limitations of using EHR data to study endometriosis. We describe the prevalence of endometriosis observed in diverse populations from multiple healthcare institutions, examples of variables that can be extracted from EHRs to enhance the accuracy of endometriosis prediction, and opportunities to leverage longitudinal EHR data to improve our understanding of long-term health consequences for all patients.
Affiliation(s)
- Nadia Penrod
  - College of Agriculture and Life Sciences, Texas A&M University, College Station, TX, United States
- Chelsea Okeh
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, Philadelphia, PA, United States
- Digna R. Velez Edwards
  - Department of Obstetrics and Gynecology, Vanderbilt University, Nashville, TN, United States
- Kurt Barnhart
  - Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Suneeta Senapati
  - Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Shefali S. Verma
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, Philadelphia, PA, United States
6
Greenberg V, Vazquez-Benitez G, Kharbanda EO, Daley MF, Fu Tseng H, Klein NP, Naleway AL, Williams JTB, Donahue J, Jackson L, Weintraub E, Lipkind H, DeSilva MB. Tdap vaccination during pregnancy and risk of chorioamnionitis and related infant outcomes. Vaccine 2023; 41:3429-3435. [PMID: 37117057 PMCID: PMC10466272 DOI: 10.1016/j.vaccine.2023.04.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 04/13/2023] [Accepted: 04/16/2023] [Indexed: 04/30/2023]
Abstract
INTRODUCTION An increased risk of chorioamnionitis in people receiving tetanus toxoid, reduced diphtheria toxoid, and acellular pertussis (Tdap) vaccine during pregnancy has been reported. The importance of this association is unclear as additional study has not demonstrated increased adverse infant outcomes associated with Tdap vaccination in pregnancy. METHODS We conducted a retrospective observational cohort study of pregnant people ages 15-49 years with singleton pregnancies ending in live birth who were members of 8 Vaccine Safety Datalink (VSD) sites during October 2016-September 2018. We used a time-dependent covariate Cox model with stabilized inverse probability weights applied to evaluate associations between Tdap vaccination during pregnancy and chorioamnionitis and preterm birth outcomes. We used Poisson regression with robust variance with stabilized inverse probability weights applied to evaluate the association of Tdap vaccination with adverse infant outcomes. We performed medical record reviews on a random sample of patients with ICD-10-CM-diagnosed chorioamnionitis to determine positive predictive values (PPV) of coded chorioamnionitis for "probable clinical chorioamnionitis," "possible clinical chorioamnionitis," or "histologic chorioamnionitis." RESULTS We included 118,211 pregnant people; 103,258 (87%) received Tdap vaccine during pregnancy; 8098 (7%) were diagnosed with chorioamnionitis. The adjusted hazard ratio for chorioamnionitis in the Tdap vaccine-exposed group compared to unexposed was 0.96 (95% CI 0.90-1.03). There was no association between Tdap vaccine and preterm birth or adverse infant outcomes associated with chorioamnionitis. Chart reviews were performed for 528 pregnant people with chorioamnionitis. The PPV for clinical (probable or possible clinical chorioamnionitis) was 48% and 59% for histologic chorioamnionitis. The PPV for the combined outcome of clinical or histologic chorioamnionitis was 81%. CONCLUSIONS AND RELEVANCE Tdap vaccine exposure during pregnancy was not associated with chorioamnionitis, preterm birth, or adverse infant outcomes. ICD-10 codes for chorioamnionitis lack specificity for clinical chorioamnionitis and should be a recognized limitation when interpreting results.
Affiliation(s)
- Matthew F Daley
  - Institute for Health Research, Kaiser Permanente Colorado, Denver, CO, United States
- Hung Fu Tseng
  - Kaiser Permanente Southern California, Pasadena, CA, United States
- Nicola P Klein
  - Kaiser Permanente Vaccine Study Center, Oakland, CA, United States
- Allison L Naleway
  - Center for Health Research, Kaiser Permanente Northwest, Portland, OR, United States
- James Donahue
  - Marshfield Clinic Research Institute, Marshfield, WI, United States
- Lisa Jackson
  - Kaiser Permanente Washington, Seattle, WA, United States
- Eric Weintraub
  - Immunization Safety Office, U.S. Centers for Disease Control and Prevention, Atlanta, GA, United States
7
He WQ, Nassar N, Schneuer FJ, Lain SJ. Examination of validity of identifying congenital heart disease from hospital discharge data without a gold standard: Using a data linkage approach. Paediatr Perinat Epidemiol 2023; 37:303-312. [PMID: 36991572 PMCID: PMC10946896 DOI: 10.1111/ppe.12976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 03/18/2023] [Accepted: 03/20/2023] [Indexed: 03/31/2023]
Abstract
BACKGROUND Administrative health data has been used extensively to examine congenital heart disease (CHD). However, the accuracy and completeness of these data must be assessed. OBJECTIVES To use data linkage of multiple administrative data sources to examine the validity of identifying CHD cases recorded in hospital discharge data. METHODS We identified all liveborn infants born 2013-2017 in New South Wales, Australia with a CHD diagnosis up to age one, recorded in hospital discharge data. Using record linkage to multiple data sources, the diagnosis of CHD was compared with five reference standards: (i) multiple hospital admissions containing CHD diagnosis; (ii) receiving a cardiac procedure; (iii) CHD diagnosis in the Register of Congenital Conditions; (iv) cardiac-related outpatient health service recorded; and/or (v) cardiac-related cause of death. Positive predictive values (PPV) comparing CHD diagnosis with the reference standards were estimated by CHD severity and for specific phenotypes. RESULTS Of 485,239 liveborn infants, there were 4043 infants with a CHD diagnosis identified in hospital discharge data (8.3 per 1000 live births). The PPV for any CHD identified in any of the five methods was 62.8% (95% confidence interval [CI] 60.9, 64.8), with PPV higher for severe CHD at 94.1% (95% CI 88.2, 100). Infant characteristics associated with higher PPVs included lower birthweight, presence of a syndrome or non-cardiac congenital anomaly, born to mothers aged <20 years and residing in disadvantaged areas. CONCLUSION Using data linkage of multiple datasets is a novel and cost-effective method to examine the validity of CHD diagnoses recorded in one dataset. These results can be incorporated into bias analyses in future studies of CHD.
Affiliation(s)
- Wen-Qiang He
  - Child Population and Translational Health Research, Children's Hospital at Westmead Clinical School, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Natasha Nassar
  - Child Population and Translational Health Research, Children's Hospital at Westmead Clinical School, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Francisco J Schneuer
  - Child Population and Translational Health Research, Children's Hospital at Westmead Clinical School, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Samantha J Lain
  - Child Population and Translational Health Research, Children's Hospital at Westmead Clinical School, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
8
Laursen ASD, Jensen BW, Strate LL, Sørensen TIA, Baker JL, Sørensen HT. Birth weight, childhood body mass index, and risk of diverticular disease in adulthood. Int J Obes (Lond) 2023; 47:207-214. [PMID: 36698028 DOI: 10.1038/s41366-023-01259-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 01/05/2023] [Accepted: 01/11/2023] [Indexed: 01/26/2023]
Abstract
OBJECTIVE Adult overweight is associated with increased risk of diverticular disease (DD). We investigated associations between birthweight and childhood body mass index (BMI) and DD. METHODS Cohort study of 346,586 persons born during 1930-1996 with records in the Copenhagen School Health Records Register. Data included birthweight, and height and weight from ages 7 through 13. We used Cox proportional hazard regression to examine associations between birthweight and BMI z-scores and DD registered in the Danish National Patient Registry. Due to non-proportionality, we followed participants from age 18-49 and from age 50. RESULTS During follow-up, 5459 (3.2%) women and 4429 (2.5%) men had DD. For low and high BMI in childhood, we observed a higher risk of DD before age 50. Among women with z-scores <0 at age 13, the hazard ratio (HR) was 1.16 [95% confidence interval (CI): 0.98-1.39] per one-point lower z-score. For z-scores ≥0 at age 13, the HR was 1.30 (95% CI: 1.11-1.51) per one-point higher z-score. Among men with z-scores <0 at age 13, the HR was 1.02 (95% CI: 0.85-1.22). For z-scores ≥0 at age 13, the HR was 1.54 (95% CI: 1.34-1.78). Z-scores ≥0 were not associated with DD after age 50. Among women only, birthweight was inversely associated with DD before age 50 [HR = 0.90 (95% CI: 0.83-0.99) per 500 g higher birthweight]. CONCLUSION BMI z-scores below and above zero in childhood were associated with higher risk of DD before age 50. In addition, we observed lower risk of DD among women, the higher their birthweight.
Affiliation(s)
- Anne Sofie D Laursen
  - Department of Clinical Medicine, Department of Clinical Epidemiology, Aarhus University and Aarhus University Hospital, Aarhus, Denmark
- Britt W Jensen
  - Center for Clinical Research and Prevention, Copenhagen University Hospital-Bispebjerg and Frederiksberg, Copenhagen, Denmark
- Lisa L Strate
  - Division of Gastroenterology, University of Washington School of Medicine, Seattle, WA, USA
- Thorkild I A Sørensen
  - The Novo Nordisk Foundation Center for Basic Metabolic Research, Genomic Physiology and Translation Program, Faculty of Health and Medical Sciences, Copenhagen, Denmark
  - Department of Public Health, Section of Epidemiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jennifer L Baker
  - Center for Clinical Research and Prevention, Copenhagen University Hospital-Bispebjerg and Frederiksberg, Copenhagen, Denmark
- Henrik T Sørensen
  - Department of Clinical Medicine, Department of Clinical Epidemiology, Aarhus University and Aarhus University Hospital, Aarhus, Denmark
9
Weinstein EJ, Ritchey ME, Lo Re V. Core concepts in pharmacoepidemiology: Validation of health outcomes of interest within real-world healthcare databases. Pharmacoepidemiol Drug Saf 2023; 32:1-8. [PMID: 36057777 PMCID: PMC9772105 DOI: 10.1002/pds.5537] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/17/2022] [Revised: 08/09/2022] [Accepted: 08/19/2022] [Indexed: 02/06/2023]
Abstract
Real-world healthcare data, including administrative and electronic medical record databases, provide a rich source of data for the conduct of pharmacoepidemiologic studies but carry the potential for misclassification of health outcomes of interest (HOIs). Validation studies are important ways to quantify the degree of error associated with case-identifying algorithms for HOIs and are crucial for interpreting study findings within real-world data. This review provides a rationale, framework, and step-by-step approach to validating case-identifying algorithms for HOIs within healthcare databases. Key steps in validating a case-identifying algorithm within a healthcare database include: (1) selecting the appropriate health outcome; (2) determining the reference standard against which to validate the algorithm; (3) developing the algorithm using diagnosis codes, diagnostic tests or their results, procedures, drug therapies, patient-reported symptoms or diagnoses, or some combinations of these parameters; (4) selection of patients and sample sizes for validation; (5) collecting data to confirm the HOI; (6) confirming the HOI; and (7) assessing the algorithm's performance. Additional strategies for algorithm refinement and methods to correct for bias due to misclassification of outcomes are discussed. The review concludes by discussing factors affecting the transportability of case-identifying algorithms and the need for ongoing validation as data elements within healthcare databases, such as diagnosis codes, change over time or new variables, such as patient-generated health data, are included in these data sources.
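Step (7) above, assessing the algorithm's performance, is often summarized with sensitivity, specificity, PPV, and NPV computed from a 2x2 table of algorithm result versus reference standard. The abstract does not specify which measures the review recommends, so the following is only a minimal sketch under that assumption. The class `ValidationCounts`, the function `performance`, and the counts are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """2x2 table: algorithm result versus reference standard."""
    tp: int  # algorithm positive, reference positive
    fp: int  # algorithm positive, reference negative
    fn: int  # algorithm negative, reference positive
    tn: int  # algorithm negative, reference negative


def performance(c):
    """Return common validation measures; NaN when a denominator is zero."""
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Hypothetical counts, not taken from any study listed here.
counts = ValidationCounts(tp=80, fp=20, fn=10, tn=890)
metrics = performance(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When only algorithm-positive records are reviewed, as in several of the validation studies listed here, only the PPV can be estimated because the false-negative and true-negative cells are unknown.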
Affiliation(s)
- Erica J Weinstein
  - Division of Infectious Diseases, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Center for Pharmacoepidemiology Research and Training, Center for Clinical Epidemiology and Biostatistics, and Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Mary Elizabeth Ritchey
  - Med Tech Epi, LLC, Philadelphia, PA, USA
  - Center for Pharmacoepidemiology and Treatment Science, Rutgers University, New Brunswick, New Jersey, USA
- Vincent Lo Re
  - Division of Infectious Diseases, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Center for Pharmacoepidemiology Research and Training, Center for Clinical Epidemiology and Biostatistics, and Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
10
McClure ES, Gartner DR, Bell RA, Cruz TH, Nocera M, Marshall SW, Richardson DB. Challenges with misclassification of American Indian/Alaska Native race and Hispanic ethnicity on death records in North Carolina occupational fatalities surveillance. FRONTIERS IN EPIDEMIOLOGY 2022; 2:878309. [PMID: 38455305 PMCID: PMC10910913 DOI: 10.3389/fepid.2022.878309] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/17/2022] [Accepted: 10/03/2022] [Indexed: 03/09/2024]
Abstract
As frequently segregated and exploitative environments, workplaces are important sites in driving health and mortality disparities by race and ethnicity. Because many worksites are federally regulated, US workplaces also offer opportunities for effectively intervening to mitigate these disparities. Development of policies for worker safety and equity should be informed by evidence, including results from research studies that use death records and other sources of administrative data. North Carolina has a long history of Black/white disparities in work-related mortality and evidence of such disparities is emerging in Hispanic and American Indian/Alaska Native (AI/AN) worker populations. The size of Hispanic and AI/AN worker populations have increased in North Carolina over the last decade, and North Carolina has the largest AI/AN population in the eastern US. Previous research indicates that misidentification of Hispanic and AI/AN identities on death records can lead to underestimation of race/ethnicity-specific mortality rates. In this commentary, we describe problems and complexities involved in determining AI/AN and Hispanic identities from North Carolina death records. We provide specific examples of misidentification that are likely introducing bias to occupational mortality disparity documentation, and offer recommendations for improved data collection, analysis, and interpretation. Our primary recommendation is to build and maintain relationships with local community leadership, so that improvements in the ascertainment of race and ethnicity are grounded in the lived experience of workers from communities of color.
Affiliation(s)
- Elizabeth S. McClure
  - NC Occupational Safety and Health Education and Research Center, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Danielle R. Gartner
  - Department of Epidemiology & Biostatistics, College of Human Medicine, Michigan State University, East Lansing, MI, United States
- Ronny A. Bell
  - Division of Public Health Sciences, Department of Social Sciences and Health Policy, Wake Forest School of Medicine, Winston-Salem, NC, United States
  - Office of Cancer Health Equity, Wake Forest Baptist Comprehensive Cancer Center, Winston-Salem, NC, United States
  - North Carolina American Indian Health Board, Winston-Salem, NC, United States
- Theresa H. Cruz
  - Department of Pediatrics, University of New Mexico, Albuquerque, NM, United States
  - UNM Prevention Research Center, Albuquerque, NM, United States
- Maryalice Nocera
  - University of North Carolina Injury Prevention Research Center, Chapel Hill, NC, United States
- Stephen W. Marshall
  - University of North Carolina Injury Prevention Research Center, Chapel Hill, NC, United States
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- David B. Richardson
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - Environmental and Occupational Health, Program in Public Health, University of California, Irvine, Irvine, CA, United States
11
Lynch KE, Viernes B, Gatsby E, DuVall SL, Jones BE, Box TL, Kreisler C, Jones M. Positive Predictive Value of COVID-19 ICD-10 Diagnosis Codes Across Calendar Time and Clinical Setting. Clin Epidemiol 2021; 13:1011-1018. [PMID: 34737645 PMCID: PMC8558427 DOI: 10.2147/clep.s335621] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/27/2021] [Accepted: 10/06/2021] [Indexed: 11/23/2022]
Abstract
Purpose: To estimate the positive predictive value (PPV) of International Classification of Diseases, Tenth Revision (ICD-10) code U07.1, COVID-19 virus identified, in the Department of Veterans Affairs (VA). Patients and Methods: Records of ICD-10 code U07.1 from inpatient, outpatient, and emergency/urgent care settings were extracted from VA medical record data from 4/01/2020 to 3/31/2021. A weighted, random sample of 1500 records from each quarter of the one-year observation period was reviewed by study personnel to confirm active COVID-19 infection at the time of diagnosis and classify reasons for false positive records. PPV was estimated overall and compared across clinical setting and quarters. Results: We identified 664,406 records of U07.1. Among the 1500 reviewed, 237 were false positives (PPV: 84.2%, 95% CI: 82.4–86.0). PPV ranged from 77.7% in outpatient settings to 93.8% in inpatient settings and was 83.3% in quarter 1, 80.5% in quarter 2, 86.1% in quarter 3, and 83.6% in quarter 4. The most common reasons for false positive records were history of COVID-19 (44.3%) and orders for laboratory tests (21.5%). Conclusion: The PPV of ICD-10 code U07.1 is low, especially in outpatient settings. Directed training may improve accuracy of coding to levels that are deemed adequate for future use in surveillance efforts.
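The overall PPV reported in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes an unweighted proportion with a normal-approximation (Wald) 95% interval, which the abstract does not state was the authors' method (their sample was weighted), and the function name `ppv_with_wald_ci` is hypothetical.

```python
import math


def ppv_with_wald_ci(true_positives, total_flagged, z=1.96):
    """Return (ppv, lower, upper) as proportions.

    Assumption: a simple unweighted proportion with a Wald interval;
    this is not necessarily the estimator used in the study.
    """
    if total_flagged <= 0:
        raise ValueError("total_flagged must be positive")
    ppv = true_positives / total_flagged
    standard_error = math.sqrt(ppv * (1 - ppv) / total_flagged)
    lower = max(0.0, ppv - z * standard_error)
    upper = min(1.0, ppv + z * standard_error)
    return ppv, lower, upper


# Counts taken from the abstract: 1500 reviewed records, 237 false positives.
reviewed = 1500
false_positives = 237
ppv, lo, hi = ppv_with_wald_ci(reviewed - false_positives, reviewed)
print(f"PPV: {ppv:.1%} (95% CI: {lo:.1%} to {hi:.1%})")
```

Under these assumptions the result agrees, to one decimal place, with the 84.2% (82.4–86.0) given in the abstract. The setting-specific and quarterly PPVs cannot be reproduced this way because the abstract does not give their denominators.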
Affiliation(s)
- Kristine E Lynch
  - VA Informatics and Computing Infrastructure (VINCI), VA Salt Lake City Health Care System, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
- Benjamin Viernes
  - VA Informatics and Computing Infrastructure (VINCI), VA Salt Lake City Health Care System, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
- Elise Gatsby
  - VA Informatics and Computing Infrastructure (VINCI), VA Salt Lake City Health Care System, Salt Lake City, UT, USA
- Scott L DuVall
  - VA Informatics and Computing Infrastructure (VINCI), VA Salt Lake City Health Care System, Salt Lake City, UT, USA
  - Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
- Barbara E Jones
  - Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
  - Informatics, Decision-Enhancement, and Analytic Sciences (IDEAS) Center of Innovation, VA Salt Lake City Health Care System, Salt Lake City, UT, USA
- Tamára L Box
  - Analytics and Performance Integration (API), Office of Quality and Patient Safety, Veterans Health Administration, Washington, DC, USA
- Craig Kreisler
  - Analytics and Performance Integration (API), Office of Quality and Patient Safety, Veterans Health Administration, Washington, DC, USA
- Makoto Jones
  - Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA
  - Informatics, Decision-Enhancement, and Analytic Sciences (IDEAS) Center of Innovation, VA Salt Lake City Health Care System, Salt Lake City, UT, USA
12
Mansi ET, Johnson ES, Thorp ML, Go AS, Lee MS, Shen AYJ, Park KJ, Budzynska K, Markin A, Sung SH, Thompson JH, Slaughter MT, Luong TQ, An J, Reynolds K, Roblin DW, Cassidy-Bushrow AE, Kuntz JL, Schlienger RG, Behr S, Smith DH. Physician adjudication of angioedema diagnosis codes in a population of patients with heart failure prescribed angiotensin-converting enzyme inhibitor therapy. Pharmacoepidemiol Drug Saf 2021; 30:1630-1634. [PMID: 34558760 DOI: 10.1002/pds.5361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Revised: 09/13/2021] [Accepted: 09/20/2021] [Indexed: 11/10/2022]
Abstract
PURPOSE: Our objective was to calculate the positive predictive value (PPV) of the ICD-9 diagnosis code for angioedema when physicians adjudicate the events by electronic health record review. Our secondary objective was to evaluate the inter-rater reliability of physician adjudication. METHODS: Patients from the Cardiovascular Research Network previously diagnosed with heart failure who were started on angiotensin-converting enzyme inhibitors (ACEI) during the study period (July 1, 2006 through September 30, 2015) were included. A team of two physicians per participating site adjudicated possible events using electronic health records for all patients coded for angioedema for a total of five sites. The PPV was calculated as the number of physician-adjudicated cases divided by all cases with the diagnosis code of angioedema (ICD-9-CM code 995.1) meeting the inclusion criteria. The inter-rater reliability of physician teams, or kappa statistic, was also calculated. RESULTS: There were 38 061 adults with heart failure initiating ACEI in the study (21 489 patient-years). Of 114 coded events that were adjudicated by physicians, 98 angioedema events were confirmed for a PPV of 86% (95% CI: 80%, 92%). The kappa statistic based on physician inter-rater reliability was 0.65 (95% CI: 0.47, 0.82). CONCLUSIONS: ICD-9 diagnosis code of 995.1 (angioneurotic edema, not elsewhere classified) is highly predictive of angioedema in adults with heart failure exposed to ACEI.
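The PPV definition in the METHODS can be written out with the counts from the RESULTS. The worked arithmetic below assumes a normal-approximation (Wald) interval; the abstract does not say which interval method the authors used, so the interval step is an illustration rather than their procedure.

```latex
\[
\mathrm{PPV} = \frac{\text{physician-confirmed events}}{\text{coded events adjudicated}}
             = \frac{98}{114} \approx 0.860
\]
\[
\mathrm{SE} = \sqrt{\frac{0.860 \times (1 - 0.860)}{114}} \approx 0.0325,
\qquad
0.860 \pm 1.96 \times 0.0325 \approx (0.796,\ 0.923)
\]
```

Rounded to whole percentages, this gives 86% (80%, 92%), consistent with the reported values.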
Affiliation(s)
- Elizabeth T Mansi
  - School of Public Health, University of Washington, Seattle, Washington, USA
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
- Eric S Johnson
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
- Micah L Thorp
  - Department of Nephrology, Kaiser Permanente Northwest, Portland, Oregon, USA
- Alan S Go
  - Division of Research, Kaiser Permanente Northern California, Oakland, California, USA
- Ming-Sum Lee
  - Department of Cardiology, Los Angeles Medical Center, Kaiser Permanente Southern California, Los Angeles, California, USA
- Albert Yuh-Jer Shen
  - Department of Cardiology, Los Angeles Medical Center, Kaiser Permanente Southern California, Los Angeles, California, USA
- Ken J Park
  - Department of Nephrology, Kaiser Permanente Northwest, Portland, Oregon, USA
- Abraham Markin
  - Department of Emergency Medicine, Henry Ford Hospital, Detroit, Michigan, USA
- Sue Hee Sung
  - Division of Research, Kaiser Permanente Northern California, Oakland, California, USA
- Jamie H Thompson
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
- Matthew T Slaughter
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
- Tiffany Q Luong
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California, USA
- Jaejin An
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California, USA
- Kristi Reynolds
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California, USA
- Douglas W Roblin
  - Mid-Atlantic Permanente Research Institute, Rockville, Maryland, USA
- Jennifer L Kuntz
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
- Sigrid Behr
  - Quantitative Safety and Epidemiology, Novartis Pharma AG, Basel, Switzerland
- David H Smith
  - Center for Health Research, Kaiser Permanente Northwest, Portland, Oregon, USA
13
Nibell O, Svanström H, Inghammar M. Oral Fluoroquinolone Use and the Risk of Acute Liver Injury: A Nationwide Cohort Study. Clin Infect Dis 2021; 74:2152-2158. [PMID: 34537834 PMCID: PMC9258930 DOI: 10.1093/cid/ciab825] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND: Antibiotics are considered to be among the most frequent causes of drug-related acute liver injury (ALI). Although many ALIs have mild and reversible clinical outcomes, there is substantial risk of severe reactions leading to acute liver failure, need for liver transplant, and death. Recent studies have raised concerns of hepatotoxic potential related to the use of fluoroquinolones. METHODS: This study examined the risk of ALI associated with oral fluoroquinolone treatment compared with amoxicillin (419 930 courses, propensity score matched 1:1). The information on drug use was collected from a national, registry-based cohort derived from all Swedish adults aged 40-85 years. RESULTS: During a follow-up period of 60 days, users of oral fluoroquinolones had a >2-fold risk of ALI compared to users of amoxicillin (hazard ratio, 2.32 [95% confidence interval {CI}, 1.01-5.35]). The adjusted absolute risk difference for use of fluoroquinolones as compared to amoxicillin was 4.94 (95% CI, 0.04-16.3) per 1 million episodes. CONCLUSIONS: In this propensity score-matched study, fluoroquinolone treatment was associated with an increased risk of ALI in the first 2 months after starting treatment.
Affiliation(s)
- Olof Nibell
  - Correspondence: O. Nibell, Section for Infection Medicine, Department of Clinical Sciences Lund, Lund University, SE-221 00, Lund, Sweden
- Henrik Svanström
  - Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
- Malin Inghammar
  - Section for Infection Medicine, Department of Clinical Sciences Lund, Lund University, Lund, Sweden
14
Schelde AB, Kornholt J. Validation studies in epidemiologic research: estimation of the positive predictive value. J Clin Epidemiol 2021; 137:262-264. [PMID: 34022395 DOI: 10.1016/j.jclinepi.2021.05.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/04/2021] [Revised: 05/03/2021] [Accepted: 05/10/2021] [Indexed: 11/29/2022]
Affiliation(s)
- Astrid Blicher Schelde
  - Department of Clinical Pharmacology, Copenhagen University Hospital, Bispebjerg and Frederiksberg, Denmark
- Jonatan Kornholt
  - Department of Clinical Pharmacology, Copenhagen University Hospital, Bispebjerg and Frederiksberg, Denmark
15
Abstract
OBJECTIVES: Clinical Research Informatics (CRI) declares its scope in its name, but its content, both in terms of the clinical research it supports (and sometimes initiates) and the methods it has developed over time, reaches much further than the name suggests. The goal of this review is to celebrate the extraordinary diversity of activity and of results, not as a prize-giving pageant, but in recognition of the field, the community that both serves and is sustained by it, and of its interdisciplinarity and its international dimension. METHODS: Beyond personal awareness of a range of work commensurate with the author's own research, it is clear that, even with a thorough literature search, a comprehensive review is impossible. Moreover, the field has grown and subdivided to an extent that makes it very hard for one individual to be familiar with every branch or with more than a few branches in any depth. A literature survey was conducted that focused on informatics-related terms in the general biomedical and healthcare literature, and specific concerns ("artificial intelligence", "data models", "analytics", etc.) in the biomedical informatics (BMI) literature. In addition to a selection from the results of these searches, suggestive references within them were also considered. RESULTS: The substantive sections of the paper (Artificial Intelligence, Machine Learning, and "Big Data" Analytics; Common Data Models, Data Quality, and Standards; Phenotyping and Cohort Discovery; Privacy: Deidentification, Distributed Computation, Blockchain; Causal Inference and Real-World Evidence) provide broad coverage of these active research areas, with, no doubt, a bias towards this reviewer's interests and preferences, landing on a number of papers that stood out in one way or another, or, alternatively, exemplified a particular line of work. CONCLUSIONS: CRI is thriving, not only in the familiar major centers of research, but more widely, throughout the world. This is not to pretend that the distribution is uniform, but to highlight the potential for this domain to play a prominent role in supporting progress in medicine, healthcare, and wellbeing everywhere. We conclude with the observation that CRI and its practitioners would make apt stewards of the new medical knowledge that their methods will bring forward.
Affiliation(s)
- Anthony Solomonides
  - Outcomes Research Network, Research Institute, NorthShore University HealthSystem, Evanston, IL, USA