Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhong VW, Obeid JS, Craig JB, Pfaff ER, Thomas J, Jaacks LM, Beavers DP, Carey TS, Lawrence JM, Dabelea D, Hamman RF, Bowlby DA, Pihoker C, Saydah SH, Mayer-Davis EJ. An efficient approach for surveillance of childhood diabetes by type derived from electronic health record data: the SEARCH for Diabetes in Youth Study. J Am Med Inform Assoc 2016;23:1060-1067. [PMID: 27107449 DOI: 10.1093/jamia/ocv207] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 10/02/2015] [Revised: 12/02/2015] [Accepted: 12/08/2015] [Indexed: 12/16/2022] Open

Cited by Other Article(s)

Li Z, Pang S, Qu H, Lian W. Logistic regression prediction models and key influencing factors analysis of diabetes based on algorithm design. Neural Comput Appl 2023. [DOI: 10.1007/s00521-023-08447-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]

Vajravelu ME, Hitt TA, Amaral S, Levitt Katz LE, Lee JM, Kelly A. Real-world treatment escalation from metformin monotherapy in youth-onset Type 2 diabetes mellitus: A retrospective cohort study. Pediatr Diabetes 2021;22:861-871. [PMID: 33978986 PMCID: PMC8373808 DOI: 10.1111/pedi.13232] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/28/2021] [Revised: 03/22/2021] [Accepted: 04/26/2021] [Indexed: 01/21/2023] Open

Abstract

BACKGROUND

Due to high rates of comorbidities and rapid progression, youth with Type 2 diabetes may benefit from early and aggressive treatment. However, until 2019, the only approved medications for this population were metformin and insulin.

OBJECTIVE

To investigate patterns and predictors of treatment escalation within 5 years of metformin monotherapy initiation for youth with Type 2 diabetes in clinical practice.

SUBJECTS

Commercially-insured patients with incident youth-onset (10-18 years) Type 2 diabetes initially treated with metformin only.

METHODS

Retrospective cohort study using a patient-level medical claims database with data from 2000 to 2020. Frequency and order of treatment escalation to insulin and non-insulin antihyperglycemics were determined and categorized by age at diagnosis. Cox proportional hazards regression was used to evaluate potential predictors of treatment escalation, including age, sex, race/ethnicity, comorbidities, complications, and metformin adherence (medication possession ratio ≥ 0.8).

RESULTS

The cohort included 829 (66% female; median age at diagnosis 15 years; 19% Hispanic, 17% Black) patients, with median 2.9 year follow-up after metformin initiation. One-quarter underwent treatment escalation (n = 207; 88 to insulin, 164 to non-insulin antihyperglycemic). Younger patients were more likely to have insulin prescribed prior to other antihyperglycemics. Age at diagnosis (HR 1.14, 95% CI 1.07-1.21), medication adherence (HR 4.10, 95% CI 2.96-5.67), Hispanic ethnicity (HR 1.83, 95% CI 1.28-2.61), and diabetes-related complications (HR 1.78, 95% CI 1.15-2.74) were positively associated with treatment escalation.

CONCLUSIONS

In clinical practice, treatment escalation for pediatric Type 2 diabetes differs with age. Off-label use of non-insulin antihyperglycemics occurs, most commonly among older adolescents.
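
The adherence threshold used in the Methods (medication possession ratio ≥ 0.8) can be illustrated with a minimal Python sketch. It assumes one common definition of the ratio (total days' supply dispensed in a period divided by the number of days in that period, capped at 1.0); the abstract does not state which variant the authors used, and the function names and fill dates below are hypothetical.

```python
from datetime import date

def medication_possession_ratio(fills, period_start, period_end):
    """Days' supply dispensed within the period divided by days in the period.

    `fills` is a list of (fill_date, days_supply) tuples. Capping at 1.0 is
    an assumed convention, not something stated in the abstract.
    """
    period_days = (period_end - period_start).days + 1  # inclusive of both ends
    if period_days <= 0:
        raise ValueError("period_end must not precede period_start")
    supplied = sum(
        days_supply
        for fill_date, days_supply in fills
        if period_start <= fill_date <= period_end
    )
    return min(supplied / period_days, 1.0)

def is_adherent(mpr, threshold=0.8):
    """Adherence as defined in the abstract: MPR at or above 0.8."""
    return mpr >= threshold

# Hypothetical example: three 30-day fills over a 100-day observation window.
example_fills = [
    (date(2019, 1, 1), 30),
    (date(2019, 2, 1), 30),
    (date(2019, 3, 15), 30),
]
example_mpr = medication_possession_ratio(
    example_fills, date(2019, 1, 1), date(2019, 4, 10)
)
```

Here 90 days of supply over a 100-day window gives a ratio of 0.9, which counts as adherent under the 0.8 threshold.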

Barrett CE, Park J, Kompaniyets L, Baggs J, Cheng YJ, Zhang P, Imperatore G, Pavkov ME. Intensive Care Unit Admission, Mechanical Ventilation, and Mortality Among Patients With Type 1 Diabetes Hospitalized for COVID-19 in the U.S. Diabetes Care 2021;44:1788-1796. [PMID: 34158365 PMCID: PMC9109617 DOI: 10.2337/dc21-0604] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/18/2021] [Accepted: 05/16/2021] [Indexed: 02/03/2023]

Abstract

OBJECTIVE

To assess whether risk of severe outcomes among patients with type 1 diabetes mellitus (T1DM) hospitalized for coronavirus disease 2019 (COVID-19) differs from that of patients without diabetes or with type 2 diabetes mellitus (T2DM).

RESEARCH DESIGN AND METHODS

Using the Premier Healthcare Database Special COVID-19 Release records of patients discharged after COVID-19 hospitalization from U.S. hospitals from March to November 2020 (N = 269,674 after exclusion), we estimated risk differences (RD) and risk ratios (RR) of intensive care unit admission or invasive mechanical ventilation (ICU/MV) and of death among patients with T1DM compared with patients without diabetes or with T2DM. Logistic models were adjusted for age, sex, and race or ethnicity. Models adjusted for additional demographic and clinical characteristics were used to examine whether other factors account for the associations between T1DM and severe COVID-19 outcomes.

RESULTS

Compared with patients without diabetes, T1DM was associated with a 21% higher absolute risk of ICU/MV (RD 0.21, 95% CI 0.19-0.24; RR 1.49, 95% CI 1.43-1.56) and a 5% higher absolute risk of mortality (RD 0.05, 95% CI 0.03-0.07; RR 1.40, 95% CI 1.24-1.57), with adjustment for age, sex, and race or ethnicity. Compared with T2DM, T1DM was associated with a 9% higher absolute risk of ICU/MV (RD 0.09, 95% CI 0.07-0.12; RR 1.17, 95% CI 1.12-1.22), but no difference in mortality (RD 0.00, 95% CI -0.02 to 0.02; RR 1.00, 95% CI 0.89-1.13). After adjustment for diabetic ketoacidosis (DKA) occurring before or at COVID-19 diagnosis, patients with T1DM no longer had increased risk of ICU/MV (RD 0.01, 95% CI -0.01 to 0.03) and had lower mortality (RD -0.03, 95% CI -0.05 to -0.01) in comparisons with patients with T2DM.

CONCLUSIONS

Patients with T1DM hospitalized for COVID-19 are at higher risk for severe outcomes than those without diabetes. Higher risk of ICU/MV in patients with T1DM than in patients with T2DM was largely accounted for by the presence of DKA. These findings might further guide recommendations related to diabetes management and the prevention of COVID-19.
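
The two effect measures reported above, risk difference (RD) and risk ratio (RR), can be illustrated with a short Python sketch. It computes crude (unadjusted) values from event counts, whereas the study reports estimates from adjusted logistic models, so this only shows what the two quantities mean. The function name and the counts are hypothetical.

```python
def risk_difference_and_ratio(events_exposed, n_exposed, events_reference, n_reference):
    """Return (RD, RR) comparing an exposed group with a reference group.

    RD = risk_exposed - risk_reference  (absolute difference in risk)
    RR = risk_exposed / risk_reference  (relative risk)
    These are crude estimates; no covariate adjustment is performed.
    """
    if n_exposed <= 0 or n_reference <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = events_exposed / n_exposed
    risk_reference = events_reference / n_reference
    if risk_reference == 0:
        raise ValueError("risk ratio is undefined when the reference risk is zero")
    return risk_exposed - risk_reference, risk_exposed / risk_reference

# Hypothetical counts: 60 of 100 exposed patients and 40 of 100 reference
# patients experience the outcome.
rd, rr = risk_difference_and_ratio(60, 100, 40, 100)
```

With these made-up counts the absolute risk is 20 percentage points higher in the exposed group (RD of about 0.20) and 1.5 times as high in relative terms (RR of about 1.5).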

Dabelea D, Sauder KA, Jensen ET, Mottl AK, Huang A, Pihoker C, Hamman RF, Lawrence J, Dolan LM, Agostino RD, Wagenknecht L, Mayer-Davis EJ, Marcovina SM. Twenty years of pediatric diabetes surveillance: what do we know and why it matters. Ann N Y Acad Sci 2021;1495:99-120. [PMID: 33543783 PMCID: PMC8282684 DOI: 10.1111/nyas.14573] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/30/2020] [Revised: 01/14/2021] [Accepted: 01/20/2021] [Indexed: 12/23/2022]

Lee S, Doktorchik C, Martin EA, D'Souza AG, Eastwood C, Shaheen AA, Naugler C, Lee J, Quan H. Electronic Medical Record-Based Case Phenotyping for the Charlson Conditions: Scoping Review. JMIR Med Inform 2021;9:e23934. [PMID: 33522976 PMCID: PMC7884219 DOI: 10.2196/23934] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/28/2020] [Revised: 11/20/2020] [Accepted: 12/05/2020] [Indexed: 12/16/2022] Open

Abstract

Background

Electronic medical records (EMRs) contain large amounts of rich clinical information. Developing EMR-based case definitions, also known as EMR phenotyping, is an active area of research that has implications for epidemiology, clinical care, and health services research.

Objective

This review aims to describe and assess the present landscape of EMR-based case phenotyping for the Charlson conditions.

Methods

A scoping review of EMR-based algorithms for defining the Charlson comorbidity index conditions was completed. This study covered articles published between January 2000 and April 2020, both inclusive. Embase (Excerpta Medica database) and MEDLINE (Medical Literature Analysis and Retrieval System Online) were searched using keywords developed in the following 3 domains: terms related to EMR, terms related to case finding, and disease-specific terms. The manuscript follows the Preferred Reporting Items for Systematic reviews and Meta-analyses extension for Scoping Reviews (PRISMA) guidelines.

Results

A total of 274 articles representing 299 algorithms were assessed and summarized. Most studies were undertaken in the United States (181/299, 60.5%), followed by the United Kingdom (42/299, 14.0%) and Canada (15/299, 5.0%). These algorithms were mostly developed either in primary care (103/299, 34.4%) or inpatient (168/299, 56.2%) settings. Diabetes, congestive heart failure, myocardial infarction, and rheumatology had the highest number of developed algorithms. Data-driven and clinical rule–based approaches have been identified. EMR-based phenotype and algorithm development reflect the data access allowed by respective health systems, and algorithms vary in their performance.

Conclusions

Recognizing similarities and differences in health systems, data collection strategies, extraction, data release protocols, and existing clinical pathways is critical to algorithm development strategies. Several strategies to assist with phenotype-based case definitions have been proposed.

Affiliation(s)

Seungwon Lee Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Chelsea Doktorchik Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Elliot Asher Martin Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Adam Giles D'Souza Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Cathy Eastwood Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Abdel Aziz Shaheen Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Christopher Naugler Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Pathology and Laboratory Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Joon Lee Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Cardiac Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Hude Quan Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Crume TL, Hamman RF, Isom S, Divers J, Mayer-Davis EJ, Liese AD, Saydah S, Lawrence JM, Pihoker C, Dabelea D. The accuracy of provider diagnosed diabetes type in youth compared to an etiologic criteria in the SEARCH for Diabetes in Youth Study. Pediatr Diabetes 2020;21:1403-1411. [PMID: 32981196 PMCID: PMC7819667 DOI: 10.1111/pedi.13126] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/29/2020] [Revised: 09/10/2020] [Accepted: 09/16/2020] [Indexed: 12/18/2022] Open

Knight GM, Spencer-Bonilla G, Maahs DM, Blum MR, Valencia A, Zuma BZ, Prahalad P, Sarraju A, Rodriguez F, Scheinker D. Multimethod, multidataset analysis reveals paradoxical relationships between sociodemographic factors, Hispanic ethnicity and diabetes. BMJ Open Diabetes Res Care 2020;8:e001725. [PMID: 33229378 PMCID: PMC7684662 DOI: 10.1136/bmjdrc-2020-001725] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/25/2020] [Revised: 10/06/2020] [Accepted: 10/21/2020] [Indexed: 12/13/2022] Open

Abstract

INTRODUCTION

Population-level and individual-level analyses have strengths and limitations as do 'blackbox' machine learning (ML) and traditional, interpretable models. Diabetes mellitus (DM) is a leading cause of morbidity and mortality with complex sociodemographic dynamics that have not been analyzed in a way that leverages population-level and individual-level data as well as traditional epidemiological and ML models. We analyzed complementary individual-level and county-level datasets with both regression and ML methods to study the association between sociodemographic factors and DM.

RESEARCH DESIGN AND METHODS

County-level DM prevalence, demographics, and socioeconomic status (SES) factors were extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data. Analogous individual-level data were extracted from 2007 to 2016 National Health and Nutrition Examination Survey studies and corrected for oversampling with survey weights. We used multivariate linear (logistic) regression and ML regression (classification) models for county (individual) data. Regression and ML models were compared using measures of explained variation (area under the receiver operating characteristic curve (AUC) and R2).

RESULTS

Among the 3138 counties assessed, the mean DM prevalence was 11.4% (range: 3.0%-21.1%). Among the 12 824 individuals assessed, 1688 met DM criteria (13.2% unweighted; 10.2% weighted). Age, gender, race/ethnicity, income, and education were associated with DM at the county and individual levels. Higher county Hispanic ethnic density was negatively associated with county DM prevalence, while Hispanic ethnicity was positively associated with individual DM. ML outperformed regression in both datasets (mean R2 of 0.679 vs 0.610, respectively (p<0.001) for county-level data; mean AUC of 0.737 vs 0.727 (p<0.0427) for individual-level data).

CONCLUSIONS

Hispanic individuals are at higher risk of DM, while counties with larger Hispanic populations have lower DM prevalence. Analyses of population-level and individual-level data with multiple methods may afford more confidence in results and identify areas for further study.

Obeid JS, Davis M, Turner M, Meystre SM, Heider PM, O'Bryan EC, Lenert LA. An artificial intelligence approach to COVID-19 infection risk assessment in virtual visits: A case report. J Am Med Inform Assoc 2020;27:1321-1325. [PMID: 32449766 PMCID: PMC7313981 DOI: 10.1093/jamia/ocaa105] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 04/28/2020] [Revised: 05/07/2020] [Accepted: 05/21/2020] [Indexed: 12/15/2022] Open

Wells BJ, Lenoir KM, Wagenknecht LE, Mayer-Davis EJ, Lawrence JM, Dabelea D, Pihoker C, Saydah S, Casanova R, Turley C, Liese AD, Standiford D, Kahn MG, Hamman R, Divers J. Detection of Diabetes Status and Type in Youth Using Electronic Health Records: The SEARCH for Diabetes in Youth Study. Diabetes Care 2020;43:2418-2425. [PMID: 32737140 PMCID: PMC7510036 DOI: 10.2337/dc20-0063] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/09/2020] [Accepted: 06/20/2020] [Indexed: 02/03/2023]

Affiliation(s)

Brian J Wells Division of Public Health Sciences, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC
Kristin M Lenoir Division of Public Health Sciences, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC
Lynne E Wagenknecht Division of Public Health Sciences, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC
Elizabeth J Mayer-Davis Departments of Nutrition and Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC
Jean M Lawrence Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA
Dana Dabelea Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver, Aurora, CO
Catherine Pihoker Department of Pediatrics, University of Washington, Seattle, WA
Sharon Saydah Division of Diabetes Translation, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA
Ramon Casanova Division of Public Health Sciences, Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC
Christine Turley Department of Pediatrics, Medical University of South Carolina, Charleston, SC
Angela D Liese Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC
Debra Standiford Cincinnati Children's Hospital Medical Center, Cincinnati, OH
Michael G Kahn Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO
Richard Hamman Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver, Aurora, CO
Jasmin Divers Division of Health Services Research, NYU Winthrop Research Institute, NYU Long Island School of Medicine, Mineola, NY

Walters CE, Nitin R, Margulis K, Boorom O, Gustavson DE, Bush CT, Davis LK, Below JE, Cox NJ, Camarata SM, Gordon RL. Automated Phenotyping Tool for Identifying Developmental Language Disorder Cases in Health Systems Data (APT-DLD): A New Research Algorithm for Deployment in Large-Scale Electronic Health Record Systems. J Speech Lang Hear Res 2020;63:3019-3035. [PMID: 32791019 PMCID: PMC7890229 DOI: 10.1044/2020_jslhr-19-00397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/13/2019] [Revised: 04/23/2020] [Accepted: 05/19/2020] [Indexed: 05/13/2023]

Abstract

Purpose

Data mining algorithms using electronic health records (EHRs) are useful in large-scale population-wide studies to classify etiology and comorbidities (Casey et al., 2016). Here, we apply this approach to developmental language disorder (DLD), a prevalent communication disorder whose risk factors and epidemiology remain largely undiscovered.

Method

We first created a reliable system for manually identifying DLD in EHRs based on speech-language pathologist (SLP) diagnostic expertise. We then developed and validated an automated algorithmic procedure, called the Automated Phenotyping Tool for identifying DLD cases in health systems data (APT-DLD), that classifies a DLD status for patients within EHRs on the basis of ICD (International Statistical Classification of Diseases and Related Health Problems) codes. APT-DLD was validated in a discovery sample (N = 973) using expert SLP manual phenotype coding as a gold-standard comparison and then applied and further validated in a replication sample of N = 13,652 EHRs.

Results

In the discovery sample, the APT-DLD algorithm correctly classified 98% of DLD cases in concordance with manually coded records in the training set, indicating that APT-DLD successfully mimics a comprehensive chart review. The output of APT-DLD was also validated in relation to independently conducted SLP clinician coding in a subset of records, with a positive predictive value of 95% of cases correctly classified as DLD. We also applied APT-DLD to the replication sample, where it achieved a positive predictive value of 90% in relation to SLP clinician classification of DLD.

Conclusions

APT-DLD is a reliable, valid, and scalable tool for identifying DLD cohorts in EHRs. This new method has promising public health implications for future large-scale epidemiological investigations of DLD and may inform EHR data mining algorithms for other communication disorders.

Supplemental Material

https://doi.org/10.23641/asha.12753578

Weisman A, Tu K, Young J, Kumar M, Austin PC, Jaakkimainen L, Lipscombe L, Aronson R, Booth GL. Validation of a type 1 diabetes algorithm using electronic medical records and administrative healthcare data to study the population incidence and prevalence of type 1 diabetes in Ontario, Canada. BMJ Open Diabetes Res Care 2020;8:e001224. [PMID: 32565422 PMCID: PMC7307536 DOI: 10.1136/bmjdrc-2020-001224] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 01/24/2020] [Revised: 05/12/2020] [Accepted: 05/19/2020] [Indexed: 12/19/2022] Open

Abstract

INTRODUCTION

We aimed to develop algorithms distinguishing type 1 diabetes (T1D) from type 2 diabetes in adults ≥18 years old using primary care electronic medical record (EMRPC) and administrative healthcare data from Ontario, Canada, and to estimate T1D prevalence and incidence.

RESEARCH DESIGN AND METHODS

The reference population was a random sample of patients with diabetes in EMRPC whose charts were manually abstracted (n=5402). Algorithms were developed using classification trees, random forests, and rule-based methods, using electronic medical record (EMR) data, administrative data, or both. Algorithm performance was assessed in EMRPC. Administrative data algorithms were additionally evaluated using a diabetes clinic registry with endocrinologist-assigned diabetes type (n=29 371). Three algorithms were applied to the Ontario population to evaluate the minimum, moderate and maximum estimates of T1D prevalence and incidence rates between 2010 and 2017, and trends were analyzed using negative binomial regressions.

RESULTS

Of 5402 individuals with diabetes in EMRPC, 195 had T1D. Sensitivity, specificity, positive predictive value and negative predictive value for the best performing algorithms were 80.6% (75.9-87.2), 99.8% (99.7-100), 94.9% (92.3-98.7), and 99.3% (99.1-99.5) for EMR, 51.3% (44.0-58.5), 99.5% (99.3-99.7), 79.4% (71.2-86.1), and 98.2% (97.8-98.5) for administrative data, and 87.2% (81.7-91.5), 99.9% (99.7-100), 96.6% (92.7-98.7) and 99.5% (99.3-99.7) for combined EMR and administrative data. Administrative data algorithms had similar sensitivity and specificity in the diabetes clinic registry. Of 11 499 711 adults in Ontario in 2017, there were 24 789 (0.22%, minimum estimate) to 102 140 (0.89%, maximum estimate) with T1D. Between 2010 and 2017, the age-standardized and sex-standardized prevalence rates per 1000 person-years increased (minimum estimate 1.7 to 2.56, maximum estimate 7.48 to 9.86, p<0.0001). In contrast, incidence rates decreased (minimum estimate 0.1 to 0.04, maximum estimate 0.47 to 0.09, p<0.0001).

CONCLUSIONS

Primary care EMR and administrative data algorithms performed well in identifying T1D and demonstrated increasing T1D prevalence in Ontario. These algorithms may permit the development of large, population-based cohort studies of T1D.
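
The four validation measures reported in the Results (sensitivity, specificity, positive predictive value and negative predictive value) all derive from a 2×2 table comparing algorithm output with the reference standard. A minimal Python sketch of the standard definitions follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 validation measures for a case-finding algorithm.

    tp: algorithm positive, reference positive (true positives)
    fp: algorithm positive, reference negative (false positives)
    fn: algorithm negative, reference positive (false negatives)
    tn: algorithm negative, reference negative (true negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm finds
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
    }

# Hypothetical counts for a rare condition: 100 true cases among 1000 patients.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=890)
```

When the condition is rare, as T1D is among all adults with diabetes, specificity and negative predictive value tend to sit close to 100% even when sensitivity is modest, so sensitivity and positive predictive value are usually the more informative pair.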

Ke C, Stukel TA, Luk A, Shah BR, Jha P, Lau E, Ma RCW, So WY, Kong AP, Chow E, Chan JCN. Development and validation of algorithms to classify type 1 and 2 diabetes according to age at diagnosis using electronic health records. BMC Med Res Methodol 2020;20:35. [PMID: 32093635 PMCID: PMC7038546 DOI: 10.1186/s12874-020-00921-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/22/2019] [Accepted: 02/10/2020] [Indexed: 12/12/2022] Open

Abstract

BACKGROUND

Validated algorithms to classify type 1 and 2 diabetes (T1D, T2D) are mostly limited to white pediatric populations. We conducted a large study in Hong Kong among children and adults with diabetes to develop and validate algorithms using electronic health records (EHRs) to classify diabetes type against clinical assessment as the reference standard, and to evaluate performance by age at diagnosis.

METHODS

We included all people with diabetes (age at diagnosis 1.5-100 years during 2002-15) in the Hong Kong Diabetes Register and randomized them to derivation and validation cohorts. We developed candidate algorithms to identify diabetes types using encounter codes, prescriptions, and combinations of these criteria ("combination algorithms"). We identified 3 algorithms with the highest sensitivity, positive predictive value (PPV), and kappa coefficient, and evaluated performance by age at diagnosis in the validation cohort.

RESULTS

There were 10,196 (T1D n = 60, T2D n = 10,136) and 5101 (T1D n = 43, T2D n = 5058) people in the derivation and validation cohorts (mean age at diagnosis 22.7, 55.9 years; 53.3, 43.9% female; for T1D and T2D respectively). Algorithms using codes or prescriptions classified T1D well for age at diagnosis < 20 years, but sensitivity and PPV dropped for older ages at diagnosis. Combination algorithms maximized sensitivity or PPV, but not both. The "high sensitivity for type 1" algorithm (ratio of type 1 to type 2 codes ≥ 4, or at least 1 insulin prescription within 90 days) had a sensitivity of 95.3% (95% confidence interval 84.2-99.4%; PPV 12.8%, 9.3-16.9%), while the "high PPV for type 1" algorithm (ratio of type 1 to type 2 codes ≥ 4, and multiple daily injections with no other glucose-lowering medication prescription) had a PPV of 100.0% (79.4-100.0%; sensitivity 37.2%, 23.0-53.3%), and the "optimized" algorithm (ratio of type 1 to type 2 codes ≥ 4, and at least 1 insulin prescription within 90 days) had a sensitivity of 65.1% (49.1-79.0%) and PPV of 75.7% (58.8-88.2%) across all ages. Accuracy of T2D classification was high for all algorithms.

CONCLUSIONS

Our validated set of algorithms accurately classifies T1D and T2D using EHRs for Hong Kong residents enrolled in a diabetes register. The choice of algorithm should be tailored to the unique requirements of each study question.
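
The three combination algorithms named in the Results differ only in how they combine the same criteria, which a short Python sketch can make concrete. This is an illustration of the rules as worded in the abstract, not the authors' implementation: the record fields, function names, and the handling of patients with no type 2 codes are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class PatientRecord:
    # Hypothetical fields; the abstract does not describe the data layout.
    type1_codes: int              # count of type 1 diabetes encounter codes
    type2_codes: int              # count of type 2 diabetes encounter codes
    insulin_within_90_days: bool  # at least 1 insulin prescription within 90 days
    mdi_only: bool                # multiple daily injections, no other glucose-lowering drug

def code_ratio_at_least(record, threshold=4.0):
    """True when the ratio of type 1 to type 2 codes is >= threshold.

    Assumption: with zero type 2 codes, any type 1 code satisfies the rule.
    """
    if record.type2_codes == 0:
        return record.type1_codes > 0
    return record.type1_codes / record.type2_codes >= threshold

def high_sensitivity_t1d(record):
    """Code ratio >= 4 OR early insulin: casts a wide net."""
    return code_ratio_at_least(record) or record.insulin_within_90_days

def high_ppv_t1d(record):
    """Code ratio >= 4 AND insulin-only multiple daily injections: strictest rule."""
    return code_ratio_at_least(record) and record.mdi_only

def optimized_t1d(record):
    """Code ratio >= 4 AND early insulin: a middle ground."""
    return code_ratio_at_least(record) and record.insulin_within_90_days

# Hypothetical patient: mostly type 1 codes, but no early insulin prescription.
example = PatientRecord(type1_codes=8, type2_codes=2,
                        insulin_within_90_days=False, mdi_only=False)
```

The hypothetical patient is classified as type 1 only by the high-sensitivity rule, which mirrors the trade-off the abstract reports: joining criteria with OR raises sensitivity at the cost of PPV, while joining them with AND does the opposite.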

Affiliation(s)

Calvin Ke Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Department of Medicine, University of Toronto, Toronto, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
Thérèse A. Stukel Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada; ICES, Toronto, Canada
Andrea Luk Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Asia Diabetes Foundation, Prince of Wales Hospital, Shatin, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
Baiju R. Shah Department of Medicine, University of Toronto, Toronto, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada; ICES, Toronto, Canada; Department of Medicine, Sunnybrook Health Sciences Centre, Toronto, Canada
Prabhat Jha Centre for Global Health Research, St. Michael’s Hospital, and Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
Eric Lau Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Asia Diabetes Foundation, Prince of Wales Hospital, Shatin, Hong Kong
Ronald C. W. Ma Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
Wing-Yee So Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
Alice P. Kong Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
Elaine Chow Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong
Juliana C. N. Chan Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Asia Diabetes Foundation, Prince of Wales Hospital, Shatin, Hong Kong; Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong; Li Ka Shing Institute of Health Science, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong

Pfaff ER, Crosskey M, Morton K, Krishnamurthy A. Clinical Annotation Research Kit (CLARK): Computable Phenotyping Using Machine Learning. JMIR Med Inform 2020;8:e16042. [PMID: 32012059 PMCID: PMC7007592 DOI: 10.2196/16042] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/28/2019] [Revised: 10/30/2019] [Accepted: 12/16/2019] [Indexed: 01/02/2023] Open

Abstract

Computable phenotypes are algorithms that translate clinical features into code that can be run against electronic health record (EHR) data to define patient cohorts. However, computable phenotypes that only make use of structured EHR data do not capture the full richness of a patient’s medical record. While natural language processing (NLP) methods have shown success in extracting clinical features from text, the use of such tools has generally been limited to research groups with substantial NLP expertise. Our goal was to develop an open-source phenotyping software, Clinical Annotation Research Kit (CLARK), that would enable clinical and translational researchers to use machine learning–based NLP for computable phenotyping without requiring deep informatics expertise. CLARK enables nonexpert users to mine text using machine learning classifiers by specifying features for the software to match in clinical notes. Once the features are defined, the user-friendly CLARK interface allows the user to choose from a variety of standard machine learning algorithms (linear support vector machine, Gaussian Naïve Bayes, decision tree, and random forest), cross-validation methods, and the number of folds (cross-validation splits) to be used in evaluation of the classifier. Example phenotypes where CLARK has been applied include pediatric diabetes (sensitivity=0.91; specificity=0.98), symptomatic uterine fibroids (positive predictive value=0.81; negative predictive value=0.54), nonalcoholic fatty liver disease (sensitivity=0.90; specificity=0.94), and primary ciliary dyskinesia (sensitivity=0.88; specificity=1.0). In each of these use cases, CLARK allowed investigators to incorporate variables into their phenotype algorithm that would not be available as structured data. Moreover, the fact that nonexpert users can get started with machine learning–based NLP with limited informatics involvement is a significant improvement over the status quo. 
We hope to disseminate CLARK to other organizations that may not have NLP or machine learning specialists available, enabling wider use of these methods.

A Review of Automatic Phenotyping Approaches using Electronic Health Records. ELECTRONICS 2019. [DOI: 10.3390/electronics8111235] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Obeid JS, Weeda ER, Matuskowitz AJ, Gagnon K, Crawford T, Carr CM, Frey LJ. Automated detection of altered mental status in emergency department clinical notes: a deep learning approach. BMC Med Inform Decis Mak 2019;19:164. [PMID: 31426779 PMCID: PMC6701023 DOI: 10.1186/s12911-019-0894-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2019] [Accepted: 08/11/2019] [Indexed: 12/18/2022] Open

Abstract

BACKGROUND

Machine learning has been used extensively in clinical text classification tasks. Deep learning approaches using word embeddings have been recently gaining momentum in biomedical applications. In an effort to automate the identification of altered mental status (AMS) in emergency department provider notes for the purpose of decision support, we compare the performance of classic bag-of-words-based machine learning classifiers and novel deep learning approaches.

METHODS

We used a case-control study design to extract an adequate number of clinical notes with AMS and non-AMS based on ICD codes. The notes were parsed to extract the history of present illness, which was used as the clinical text for the classifiers. The notes were manually labeled by clinicians. As a baseline for comparison, we tested several traditional bag-of-words based classifiers. We then tested several deep learning models using a convolutional neural network architecture with three different types of word embeddings, a pre-trained word2vec model and two models without pre-training but with different word embedding dimensions.

RESULTS

We evaluated the models on 1130 labeled notes from the emergency department. The deep learning models had the best overall performance with an area under the ROC curve of 98.5% and an accuracy of 94.5%. Pre-training word embeddings on the unlabeled corpus reduced training iterations and had performance that was statistically no different than the other deep learning models.

CONCLUSION

This supervised deep learning approach performs exceedingly well for the detection of AMS symptoms in clinical text in our environment. Further work is needed for the generalizability of these findings, including evaluation of these models in other types of clinical notes and other environments. The results seem promising for the ultimate use of these types of classifiers in combination with other information derived from the electronic health records as input for clinical decision support.

Kosowan L, Wicklow B, Queenan J, Yeung R, Amed S, Singer A. Enhancing Health Surveillance: Validation of a Novel Electronic Medical Records-Based Definition of Cases of Pediatric Type 1 and Type 2 Diabetes Mellitus. Can J Diabetes 2019;43:392-398. [PMID: 30956098 DOI: 10.1016/j.jcjd.2019.02.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/15/2018] [Revised: 12/18/2018] [Accepted: 02/13/2019] [Indexed: 12/20/2022]

Wiese AD, Roumie CL, Buse JB, Guzman H, Bradford R, Zalimeni E, Knoepp P, Morris HL, Donahoo WT, Fanous N, Epstein BF, Katalenich BL, Ayala SG, Cook MM, Worley KJ, Bachmann KN, Grijalva CG, Rothman RL, Chakkalakal RJ. Performance of a computable phenotype for identification of patients with diabetes within PCORnet: The Patient-Centered Clinical Research Network. Pharmacoepidemiol Drug Saf 2019;28:632-639. [PMID: 30680840 DOI: 10.1002/pds.4718] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2018] [Revised: 11/27/2018] [Accepted: 12/02/2018] [Indexed: 01/14/2023]

Affiliation(s)

Andrew D Wiese Department of Health Policy, Vanderbilt University Medical Center, Nashville, TN, USA
Christianne L Roumie Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA; Veterans Health Administration-Tennessee Valley Healthcare System, Geriatric Research Education Clinical Center (GRECC), Nashville, TN, USA
John B Buse Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
Herodes Guzman Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
Robert Bradford Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
Emily Zalimeni Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
Patricia Knoepp Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
Heather L Morris Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
William T Donahoo Department of Medicine, University of Florida, Gainesville, FL, USA
Nada Fanous Department of Medicine, University of Florida, Gainesville, FL, USA
Britany F Epstein Department of Medicine, University of Florida, Gainesville, FL, USA
Bonnie L Katalenich LA CaTS Clinical Translational Unit, Tulane University School of Medicine, Tulane, LA, USA
Sujata G Ayala Institute for Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN, USA
Megan M Cook Institute for Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN, USA
Katherine J Worley Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
Katherine N Bachmann Veterans Health Administration-Tennessee Valley Healthcare System, CSR&D, Nashville, TN, USA; Vanderbilt Translational and Clinical Cardiovascular Research Center, Vanderbilt University Medical Center, Nashville, TN, USA; Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
Carlos G Grijalva Department of Health Policy, Vanderbilt University Medical Center, Nashville, TN, USA; Veterans Health Administration-Tennessee Valley Healthcare System, Geriatric Research Education Clinical Center (GRECC), Nashville, TN, USA
Russell L Rothman Department of Health Policy, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
Rosette J Chakkalakal Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA

Chi GC, Li X, Tartof SY, Slezak JM, Koebnick C, Lawrence JM. Validity of ICD-10-CM codes for determination of diabetes type for persons with youth-onset type 1 and type 2 diabetes. BMJ Open Diabetes Res Care 2019;7:e000547. [PMID: 30899525 PMCID: PMC6398816 DOI: 10.1136/bmjdrc-2018-000547] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/01/2018] [Revised: 11/16/2018] [Accepted: 12/08/2018] [Indexed: 01/18/2023] Open

Abstract

OBJECTIVE

Diagnosis codes might be used for diabetes surveillance if they accurately distinguish diabetes type. We assessed the validity of International Classification of Disease, 10th Revision, Clinical Modification (ICD-10-CM) codes to discriminate between type 1 diabetes mellitus (T1DM) and type 2 diabetes mellitus (T2DM) among health plan members with youth-onset (diagnosis age <20 years) diabetes.

RESEARCH DESIGN AND METHODS

Diabetes case identification and abstraction of diabetes type was done as part of the SEARCH for Diabetes in Youth Study. The gold standard for diabetes type is the physician-assigned diabetes type documented in patients' medical records. Using all healthcare encounters with ICD-10-CM codes for diabetes, we summarized codes within each encounter and determined diabetes type using percent of encounters classified as T2DM. We chose 50% as the threshold from a receiver operating characteristic curve because this threshold yielded the largest Youden's index. Persons with ≥50% T2DM-coded encounters were classified as having T2DM. Otherwise, persons were classified as having T1DM. We calculated sensitivity, specificity, positive and negative predictive values, and accuracy overall and by demographic characteristics.
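
The classification rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes each encounter has already been summarized to a single label ("T1DM" or "T2DM"), and the function names (classify_person, youden_index, best_threshold) and the toy records are hypothetical.

```python
from typing import Dict, List, Sequence


def classify_person(encounter_labels: List[str], threshold: float = 0.5) -> str:
    """Classify as T2DM when the share of T2DM-coded encounters is >= threshold."""
    if not encounter_labels:
        raise ValueError("at least one diabetes-coded encounter is required")
    share_t2 = encounter_labels.count("T2DM") / len(encounter_labels)
    return "T2DM" if share_t2 >= threshold else "T1DM"


def youden_index(people: Dict[str, List[str]], gold: Dict[str, str],
                 threshold: float) -> float:
    """Youden's index (sensitivity + specificity - 1), with T2DM as the positive class."""
    tp = fp = tn = fn = 0
    for person_id, labels in people.items():
        predicted = classify_person(labels, threshold)
        if gold[person_id] == "T2DM":
            if predicted == "T2DM":
                tp += 1
            else:
                fn += 1
        else:
            if predicted == "T1DM":
                tn += 1
            else:
                fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both diabetes types")
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_threshold(people: Dict[str, List[str]], gold: Dict[str, str],
                   candidates: Sequence[float]) -> float:
    """Return the candidate threshold with the largest Youden's index."""
    return max(candidates, key=lambda t: youden_index(people, gold, t))


# Invented toy records (not study data): per-person encounter labels and gold standard.
people = {
    "a": ["T1DM", "T1DM", "T2DM"],
    "b": ["T2DM", "T2DM"],
    "c": ["T1DM", "T2DM"],
    "d": ["T1DM"],
}
gold = {"a": "T1DM", "b": "T2DM", "c": "T2DM", "d": "T1DM"}

print(best_threshold(people, gold, [0.25, 0.5, 0.75]))
```

On real data the candidate thresholds would be taken from the receiver operating characteristic curve rather than from a short hand-picked list.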

RESULTS

According to the gold standard, 1911 persons had T1DM and 652 persons had T2DM (mean age (SD): 19.1 (6.5) years). We obtained 90.6% (95% CI 88.4% to 92.9%) sensitivity, 96.3% (95% CI 95.4% to 97.1%) specificity, 89.3% (95% CI 86.9% to 91.6%) positive predictive value, 96.8% (95% CI 96.0% to 97.6%) negative predictive value, and 94.8% (95% CI 94.0% to 95.7%) accuracy for discriminating T2DM from T1DM.

CONCLUSIONS

ICD-10-CM codes can accurately classify diabetes type for persons with youth-onset diabetes, showing promise for rapid, cost-efficient diabetes surveillance.

Saydah S, Imperatore G. Emerging Approaches in Surveillance of Type 1 Diabetes. Curr Diab Rep 2018;18:61. [PMID: 29995215 DOI: 10.1007/s11892-018-1033-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

Hoffman SR, Vines AI, Halladay JR, Pfaff E, Schiff L, Westreich D, Sundaresan A, Johnson LS, Nicholson WK. Optimizing research in symptomatic uterine fibroids with development of a computable phenotype for use with electronic health records. Am J Obstet Gynecol 2018;218:610.e1-610.e7. [PMID: 29432754 DOI: 10.1016/j.ajog.2018.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2017] [Revised: 01/12/2018] [Accepted: 02/05/2018] [Indexed: 01/27/2023]

Abstract

BACKGROUND

Women with symptomatic uterine fibroids can report a myriad of symptoms, including pain, bleeding, infertility, and psychosocial sequelae. Optimizing fibroid research requires the ability to enroll populations of women with image-confirmed symptomatic uterine fibroids.

OBJECTIVE

Our objective was to develop an electronic health record-based algorithm to identify women with symptomatic uterine fibroids for a comparative effectiveness study of medical or surgical treatments on quality-of-life measures. Using an iterative process and text-mining techniques, an effective computable phenotype algorithm, composed of demographics, and clinical and laboratory characteristics, was developed with reasonable performance. Such algorithms provide a feasible, efficient way to identify populations of women with symptomatic uterine fibroids for the conduct of large traditional or pragmatic trials and observational comparative effectiveness studies. Symptomatic uterine fibroids, due to menorrhagia, pelvic pain, bulk symptoms, or infertility, are a source of substantial morbidity for reproductive-age women. Comparing Treatment Options for Uterine Fibroids is a multisite registry study to compare the effectiveness of hormonal or surgical fibroid treatments on women's perceptions of their quality of life. Electronic health record-based algorithms are able to identify large numbers of women with fibroids, but additional work is needed to develop electronic health record algorithms that can identify women with symptomatic fibroids to optimize fibroid research. We sought to develop an efficient electronic health record-based algorithm that can identify women with symptomatic uterine fibroids in a large health care system for recruitment into large-scale observational and interventional research in fibroid management.

STUDY DESIGN

We developed and assessed the accuracy of 3 algorithms to identify patients with symptomatic fibroids using an iterative approach. The data source was the Carolina Data Warehouse for Health, a repository for the health system's electronic health record data. In addition to International Classification of Diseases, Ninth Revision diagnosis and procedure codes and clinical characteristics, text data-mining software was used to derive information from imaging reports to confirm the presence of uterine fibroids. Results of each algorithm were compared with expert manual review to calculate the positive predictive values for each algorithm.

RESULTS

Algorithm 1 was composed of the following criteria: (1) age 18-54 years; (2) either ≥1 International Classification of Diseases, Ninth Revision diagnosis codes for uterine fibroids or mention of fibroids using text-mined key words in imaging records or documents; and (3) no International Classification of Diseases, Ninth Revision or Current Procedural Terminology codes for hysterectomy and no reported history of hysterectomy. The positive predictive value was 47% (95% confidence interval 39-56%). Algorithm 2 required ≥2 International Classification of Diseases, Ninth Revision diagnosis codes for fibroids and positive text-mined key words and had a positive predictive value of 65% (95% confidence interval 50-79%). In algorithm 3, further refinements included ≥2 International Classification of Diseases, Ninth Revision diagnosis codes for fibroids on separate outpatient visit dates, the exclusion of women who had a positive pregnancy test within 3 months of their fibroid-related visit, and exclusion of incidentally detected fibroids during prenatal or emergency department visits. Algorithm 3 achieved a positive predictive value of 76% (95% confidence interval 71-81%).

CONCLUSION

An electronic health record-based algorithm is capable of identifying cases of symptomatic uterine fibroids with moderate positive predictive value and may be an efficient approach for large-scale study recruitment.

Newcomer SR, Kulldorff M, Xu S, Daley MF, Fireman B, Lewis E, Glanz JM. Bias from outcome misclassification in immunization schedule safety research. Pharmacoepidemiol Drug Saf 2018;27:221-228. [PMID: 29292551 DOI: 10.1002/pds.4374] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2017] [Revised: 09/18/2017] [Accepted: 11/20/2017] [Indexed: 11/11/2022]

Abstract

PURPOSE

The Institute of Medicine recommended conducting observational studies of childhood immunization schedule safety. Such studies could be biased by outcome misclassification, leading to incorrect inferences. Using simulations, we evaluated (1) outcome positive predictive values (PPVs) as indicators of bias of an exposure-outcome association, and (2) quantitative bias analyses (QBA) for bias correction.

METHODS

Simulations were conducted based on proposed or ongoing Vaccine Safety Datalink studies. We simulated 4 studies of 2 exposure groups (children with no vaccines or on alternative schedules) and 2 baseline outcome levels (100 and 1000/100 000 person-years), with 3 relative risk (RR) levels (RR = 0.50, 1.00, and 2.00), across 1000 replications using probabilistic modeling. We quantified bias from non-differential and differential outcome misclassification, based on levels previously measured in database research (sensitivity > 95%; specificity > 99%). We calculated median outcome PPVs, median observed RRs, Type 1 error, and bias-corrected RRs following QBA.

RESULTS

We observed PPVs from 34% to 98%. With non-differential misclassification and true RR = 2.00, median bias was toward the null, with severe bias (median observed RR = 1.33) with PPV = 34% and modest bias (median observed RR = 1.83) with PPV = 83%. With differential misclassification, PPVs did not reflect median bias, and there was Type 1 error of 100% with PPV = 90%. QBA was generally effective in correcting misclassification bias.
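
The direction of the non-differential bias reported above follows from standard misclassification arithmetic, which can be sketched in Python. This is a simplified deterministic illustration, not the authors' probabilistic simulation: it treats each outcome rate as a simple per-person risk, and the function names and input values are hypothetical.

```python
def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Risk seen in the data: true positives plus false positives."""
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def ppv(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value: share of recorded outcomes that are real."""
    return sensitivity * true_risk / observed_risk(true_risk, sensitivity, specificity)


def observed_rr(baseline_risk: float, true_rr: float,
                sensitivity: float, specificity: float) -> float:
    """Relative risk observed when both groups share the same error rates."""
    exposed = observed_risk(baseline_risk * true_rr, sensitivity, specificity)
    unexposed = observed_risk(baseline_risk, sensitivity, specificity)
    return exposed / unexposed


# Assumed inputs for illustration: baseline risk 0.001, true RR 2.0,
# sensitivity 0.95, specificity 0.999 applied equally to both groups.
print(ppv(0.001, 0.95, 0.999))
print(observed_rr(0.001, 2.0, 0.95, 0.999))
```

Because false positives are added in equal proportion to both groups, the observed ratio is pulled toward 1 whenever specificity is below 100%, and the pull is stronger when the outcome is rare and the PPV is therefore low.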

CONCLUSIONS

In immunization schedule studies, outcome misclassification may be non-differential or differential to exposure. Overall outcome PPVs do not reflect the distribution of false positives by exposure and are poor indicators of bias in individual studies. Our results support QBA for immunization schedule safety research.

Kennell TI, Willig JH, Cimino JJ. Clinical Informatics Researcher's Desiderata for the Data Content of the Next Generation Electronic Health Record. Appl Clin Inform 2017;8:1159-1172. [PMID: 29270955 DOI: 10.4338/aci-2017-06-r-0101] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open

Abstract

OBJECTIVE

Clinical informatics researchers depend on the availability of high-quality data from the electronic health record (EHR) to design and implement new methods and systems for clinical practice and research. However, these data are frequently unavailable or present in a format that requires substantial revision. This article reports the results of a review of informatics literature published from 2010 to 2016 that addresses these issues by identifying categories of data content that might be included or revised in the EHR.

MATERIALS AND METHODS

We used an iterative review process on 1,215 biomedical informatics research articles. We placed them into generic categories, reviewed and refined the categories, and then assigned additional articles, for a total of three iterations.

RESULTS

Our process identified eight categories of data content issues: Adverse Events, Clinician Cognitive Processes, Data Standards Creation and Data Communication, Genomics, Medication List Data Capture, Patient Preferences, Patient-reported Data, and Phenotyping.

DISCUSSION

These categories summarize discussions in biomedical informatics literature that concern data content issues restricting clinical informatics research. These barriers to research result from data that are either absent from the EHR or are inadequate (e.g., in narrative text form) for the downstream applications of the data. In light of these categories, we discuss changes to EHR data storage that should be considered in the redesign of EHRs, to promote continued innovation in clinical informatics.

CONCLUSION

Based on published literature of clinical informaticians' reuse of EHR data, we characterize eight types of data content that, if included in the next generation of EHRs, would find immediate application in advanced informatics tools and techniques.
