1. Ozonze O, Scott PJ, Hopgood AA. Automating Electronic Health Record Data Quality Assessment. J Med Syst 2023; 47:23. [PMID: 36781551 PMCID: PMC9925537 DOI: 10.1007/s10916-022-01892-2] [Received: 09/10/2021] [Accepted: 11/15/2022] [Indexed: 02/15/2023]
Abstract
Information systems such as Electronic Health Record (EHR) systems are susceptible to data quality (DQ) issues. Given the growing importance of EHR data, there is an increasing demand for strategies and tools to help ensure that available data are fit for use. However, developing the reliable data quality assessment (DQA) tools needed to guide and evaluate improvement efforts has remained a fundamental challenge. This review examines the state of research on operationalising EHR DQA, mainly automated tooling, and highlights necessary considerations for future implementations. We reviewed 1841 articles from PubMed, Web of Science, and Scopus published between 2011 and 2021. We identified 23 DQA programs: 14 deployed in real-world settings to assess EHR data quality and 9 experimental prototypes. Many of these programs investigate the completeness (n = 15) and value conformance (n = 12) quality dimensions and are backed by knowledge items gathered from domain experts (n = 9) or from literature reviews and existing DQ measurements (n = 3). A few DQA programs also explore the feasibility of using data-driven techniques to assess EHR data quality automatically. Overall, the automation of EHR DQA is gaining traction, but current efforts are fragmented and not backed by relevant theory. Existing programs also vary in scope, type of data supported, and how measurements are sourced. There is a need to standardise programs for assessing EHR data quality, as current evidence suggests their quality may be unknown.
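The two dimensions most often covered by the reviewed programs, completeness and value conformance, can be illustrated with a minimal sketch. The record layout, the field name `sex`, and the allowed value set are hypothetical and are not taken from any of the reviewed programs; real DQA tools define these checks against their own data models.

```python
from typing import Any, Iterable, Mapping, Set

Record = Mapping[str, Any]


def _is_missing(value: Any) -> bool:
    # Assumption: None and the empty string both count as missing.
    return value is None or value == ""


def completeness(records: Iterable[Record], field: str) -> float:
    """Fraction of records in which `field` is present and not missing."""
    rows = list(records)
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if not _is_missing(r.get(field)))
    return filled / len(rows)


def value_conformance(records: Iterable[Record], field: str, allowed: Set[Any]) -> float:
    """Among non-missing values of `field`, fraction that belong to `allowed`."""
    values = [r.get(field) for r in records if not _is_missing(r.get(field))]
    if not values:
        return 0.0
    return sum(1 for v in values if v in allowed) / len(values)


# Hypothetical toy records, for illustration only.
RECORDS = [
    {"sex": "F"},
    {"sex": "M"},
    {"sex": "X?"},   # present but not in the allowed set
    {"sex": None},   # missing
]

SEX_COMPLETENESS = completeness(RECORDS, "sex")
SEX_CONFORMANCE = value_conformance(RECORDS, "sex", {"F", "M"})
```

Keeping the two measures separate means a missing value lowers completeness without also being counted as a conformance failure.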
Affiliation(s)
- Obinwa Ozonze
  - School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, UK
- Philip J Scott
  - Institute of Management and Health, University of Wales Trinity Saint David, Lampeter, SA48 7ED, UK
- Adrian A Hopgood
  - School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, UK
2. Souza J, Caballero I, Vasco Santos J, Fernandes Lobo M, Pinto A, Viana J, Sáez C, Lopes F, Freitas A. Multisource and temporal variability in Portuguese hospital administrative datasets: data quality implications. J Biomed Inform 2022; 136:104242. [DOI: 10.1016/j.jbi.2022.104242] [Received: 07/22/2021] [Revised: 08/18/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
3. Tute E, Scheffner I, Marschollek M. A method for interoperable knowledge-based data quality assessment. BMC Med Inform Decis Mak 2021; 21:93. [PMID: 33750371 PMCID: PMC7942002 DOI: 10.1186/s12911-021-01458-1] [Received: 10/28/2020] [Accepted: 02/26/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Assessing the quality of healthcare data is a complex task that includes selecting suitable measurement methods (MMs) and adequately assessing their results. OBJECTIVES To present an interoperable data quality (DQ) assessment method that formalizes MMs based on standardized data definitions and intends to support collaborative governance of DQ-assessment knowledge, e.g. which MMs to apply and how to assess their results in different situations. METHODS We describe and explain central concepts of our method using the example of its first real-world application in a study on predictive biomarkers for rejection and other injuries of kidney transplants. We applied our open source tool, openCQA, which implements our method using the openEHR specifications. The means to support collaborative governance of DQ-assessment knowledge are the version-control system git and openEHR clinical information models. RESULTS Applying the method to the study's dataset showed satisfactory practicability of the described concepts and produced useful results for DQ-assessment. CONCLUSIONS The main contribution of our work is to provide applicable concepts and a tested exemplary open source implementation for interoperable and knowledge-based DQ-assessment in healthcare that considers the need for flexible, task- and domain-specific requirements.
Affiliation(s)
- Erik Tute
  - Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Carl-Neuberg-Str. 1, 30625 Hannover, Germany
- Irina Scheffner
  - Department of Nephrology, Hannover Medical School, Hannover, Germany
- Michael Marschollek
  - Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Carl-Neuberg-Str. 1, 30625 Hannover, Germany
4. Bailey SR, Stevens VJ, Fortmann SP, Kurtz SE, McBurnie MA, Priest E, Puro J, Solberg LI, Schweitzer R, Masica AL, Hazlehurst B. Long-Term Outcomes From Repeated Smoking Cessation Assistance in Routine Primary Care. Am J Health Promot 2018. [PMID: 29534598 DOI: 10.1177/0890117118761886] [Indexed: 01/26/2023]
Abstract
PURPOSE To test the association between repeated clinical smoking cessation support and long-term cessation. DESIGN Retrospective, observational cohort study using structured and free-text data from electronic health records. SETTING Six diverse health systems in the United States. PARTICIPANTS Patients aged ≥18 years who were smokers in 2007 and had ≥1 primary care visit in each of the following 4 years (N = 33 691). MEASURES Primary exposure was a composite categorical variable (comprised of documentation of smoking cessation medication, counseling, or referral) classifying the proportions of visits for which patients received any cessation assistance (<25% (reference), 25%-49%, 50%-74%, and ≥75% of visits). The dependent variable was long-term quit (LTQ; yes/no), defined as no indication of being a current smoker for ≥365 days following a visit where nonsmoker or former smoker was indicated. ANALYSIS Mixed effects logistic regression analysis adjusted for age, sex, race, and comorbidities, with robust standard error estimation to account for within site correlation. RESULTS Overall, 20% of the cohort achieved LTQ status. Patients with ≥75% of visits with any assistance had almost 3 times the odds of achieving LTQ status compared to those with <25% visits with assistance (odds ratio = 2.84; 95% confidence interval: 1.50-5.37). Results were similar for specific assistance types. CONCLUSIONS These findings provide support for the importance of repeated assistance at primary care visits to increase long-term smoking cessation.
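The primary exposure described above bins each patient by the share of visits with any cessation assistance. A minimal sketch of that binning follows; the function name is hypothetical, and the handling of boundary values (half-open intervals, so exactly 25% falls in the second bin) is an assumption, since the abstract does not state it.

```python
def assistance_category(visits_with_assistance: int, total_visits: int) -> str:
    """Classify a patient by the proportion of visits with any cessation assistance.

    Bins follow the abstract: <25% (reference), 25%-49%, 50%-74%, >=75%.
    Boundary handling is an assumption of this sketch.
    """
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    if not 0 <= visits_with_assistance <= total_visits:
        raise ValueError("visits_with_assistance must be between 0 and total_visits")

    share = visits_with_assistance / total_visits
    if share < 0.25:
        return "<25%"
    if share < 0.50:
        return "25%-49%"
    if share < 0.75:
        return "50%-74%"
    return ">=75%"
```

For example, a patient assisted at 5 of 8 visits (62.5%) falls in the "50%-74%" group.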
Affiliation(s)
- Steffani R Bailey
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, USA
- Victor J Stevens
  - Kaiser Permanente Center for Health Research, Portland, OR, USA
- Stephen E Kurtz
  - Kaiser Permanente Center for Health Research, Portland, OR, USA
- Rebecca Schweitzer
  - Office of Public Health Studies, University of Hawai'i at Manoa, Honolulu, HI, USA
- Brian Hazlehurst
  - Kaiser Permanente Center for Health Research, Portland, OR, USA
5. Price M, Davies I, Rusk R, Lesperance M, Weber J. Applying STOPP Guidelines in Primary Care Through Electronic Medical Record Decision Support: Randomized Control Trial Highlighting the Importance of Data Quality. JMIR Med Inform 2017; 5:e15. [PMID: 28619704 PMCID: PMC5491896 DOI: 10.2196/medinform.6226] [Received: 06/17/2016] [Revised: 03/21/2017] [Accepted: 04/28/2017] [Indexed: 11/24/2022]
Abstract
Background Potentially Inappropriate Prescriptions (PIPs) are a common cause of morbidity, particularly in the elderly. Objective We sought to understand how the Screening Tool of Older People’s Prescriptions (STOPP) prescribing criteria, implemented in a routinely used primary care Electronic Medical Record (EMR), could impact PIP rates in community (non-academic) primary care practices. Methods We conducted a mixed-method, pragmatic, cluster, randomized control trial in research-naïve primary care practices. Phase 1: In the randomized controlled trial, 40 fully automated STOPP rules were implemented as EMR alerts during a 16-week intervention period. The control group did not receive the 40 STOPP rules (but received other alerts). Participants were recruited through the OSCAR EMR user group mailing list and in person at user group meetings. Results were assessed by querying EMR data for PIPs. EMR data quality probes were included. Phase 2: Physicians were invited to participate in 1-hour semi-structured interviews to discuss the results. Results In the EMR, 40 STOPP rules were successfully implemented. Phase 1: A total of 28 physicians from 8 practices were recruited (16 in intervention and 12 in control groups). The calculated PIP rate was 2.6% (138/5308) (control) and 4.11% (768/18,668) (intervention) at baseline. No change in PIPs was observed through the intervention (P=.80). Data quality probes generally showed low use of problem list and medication list. Phase 2: A total of 5 physicians participated. All the participants felt that they were aware of the alerts but commented on workflow and presentation challenges. Conclusions The calculated PIP rate was markedly lower than the expected rate found in the literature (2.6% and 4.1% vs 20% in the literature). Data quality probes highlighted issues related to completeness of data in areas of the EMR used for PIP reporting and by the decision support, such as problem and medication lists. Users also highlighted areas for better integration of STOPP guidelines with prescribing workflows. Many of the STOPP criteria can be implemented in EMRs using simple logic. However, data quality in EMRs continues to be a challenge and was a limiting step in the effectiveness of the decision support in this study. This is important as decision makers continue to fund implementation and adoption of EMRs with the expectation of the use of advanced tools (such as decision support) without ongoing review of data quality and improvement. Trial Registration Clinicaltrials.gov NCT02130895; https://clinicaltrials.gov/ct2/show/NCT02130895 (Archived by WebCite at http://www.webcitation.org/6qyFigSYT)
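The baseline PIP rates quoted in the abstract can be recomputed from the reported counts. The sketch below is arithmetic only; the function name is hypothetical, and the abstract does not say what unit the denominators count, so the parameter is simply called `denominator`.

```python
def pip_rate(pip_count: int, denominator: int) -> float:
    """Return the PIP rate as a percentage of the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * pip_count / denominator


# Counts as reported in the abstract (baseline).
CONTROL_RATE = pip_rate(138, 5308)
INTERVENTION_RATE = pip_rate(768, 18668)

print(f"control: {CONTROL_RATE:.2f}%, intervention: {INTERVENTION_RATE:.2f}%")
```

Both figures sit far below the roughly 20% rate the authors cite from the literature, which is the gap their data quality probes were used to explain.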
Affiliation(s)
- Morgan Price
  - LEAD Lab, Department of Family Practice, Island Medical Program, University of British Columbia, Victoria, BC, Canada
  - University of Victoria, Victoria, BC, Canada
- Iryna Davies
  - LEAD Lab, Department of Family Practice, Island Medical Program, University of British Columbia, Victoria, BC, Canada
- Raymond Rusk
  - LEAD Lab, Department of Family Practice, Island Medical Program, University of British Columbia, Victoria, BC, Canada
- Jens Weber
  - LEAD Lab, Department of Family Practice, Island Medical Program, University of British Columbia, Victoria, BC, Canada
  - University of Victoria, Victoria, BC, Canada
6. Sáez C, Zurriaga O, Pérez-Panadés J, Melchor I, Robles M, García-Gómez JM. Applying probabilistic temporal and multisite data quality control methods to a public health mortality registry in Spain: a systematic approach to quality control of repositories. J Am Med Inform Assoc 2016; 23:1085-1095. [PMID: 27107447 DOI: 10.1093/jamia/ocw010] [Received: 08/30/2015] [Revised: 12/21/2015] [Accepted: 01/17/2016] [Indexed: 11/14/2022]
Abstract
Objective To assess the variability in data distributions among data sources and over time through a case study of a large multisite repository as a systematic approach to data quality (DQ).
Materials and Methods Novel probabilistic DQ control methods based on information theory and geometry are applied to the Public Health Mortality Registry of the Region of Valencia, Spain, with 512 143 entries from 2000 to 2012, disaggregated into 24 health departments. The methods provide DQ metrics and exploratory visualizations for (1) assessing the variability among multiple sources and (2) monitoring and exploring changes with time. The methods are suited to big data and multitype, multivariate, and multimodal data.
Results The repository was partitioned into 2 probabilistically separated temporal subgroups following a change in the Spanish National Death Certificate in 2009. Isolated temporal anomalies were observed, caused by a transient increase in missing data, along with outlying and clustered health departments due to differences in populations or in practices.
Discussion Changes in protocols, differences in populations, biased practices, or other systematic DQ problems affected data variability. Even if semantic and integration aspects are addressed in data sharing infrastructures, probabilistic variability may still be present. Solutions include fixing or excluding data and analyzing different sites or time periods separately. A systematic approach to assessing temporal and multisite variability is proposed.
Conclusion Multisite and temporal variability in data distributions affects DQ, hindering data reuse, and an assessment of such variability should be a part of systematic DQ procedures.
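To make the idea of probabilistic variability among sources concrete, the sketch below compares the categorical distributions of several sources pairwise. The abstract does not name a specific metric; the Jensen-Shannon distance is used here as one common information-theoretic choice, and the department names, categories, and toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def category_distribution(values: Sequence[str], categories: Sequence[str]) -> List[float]:
    """Relative frequency of each category, in the order given by `categories`."""
    counts = Counter(values)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("no values fall in the given categories")
    return [counts[c] / total for c in categories]


def js_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon distance (square root of the base-2 JS divergence), in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))


def pairwise_distances(
    sources: Dict[str, Sequence[str]], categories: Sequence[str]
) -> Dict[Tuple[str, str], float]:
    """Distance between every pair of sources, keyed by sorted name pair."""
    dists = {name: category_distribution(vals, categories) for name, vals in sources.items()}
    return {(a, b): js_distance(dists[a], dists[b]) for a, b in combinations(sorted(dists), 2)}


# Hypothetical toy data: dept_C records only missing values.
CATEGORIES = ["circulatory", "neoplasm", "missing"]
SOURCES = {
    "dept_A": ["circulatory"] * 5 + ["neoplasm"] * 5,
    "dept_B": ["circulatory"] * 5 + ["neoplasm"] * 5,
    "dept_C": ["missing"] * 10,
}
DISTANCES = pairwise_distances(SOURCES, CATEGORIES)
```

A source whose distances to all others are large, like `dept_C` here, is a candidate outlier to inspect; the same comparison applied to consecutive time batches of one source gives a simple view of temporal change.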
Affiliation(s)
- Carlos Sáez
  - Instituto Universitario de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas. Universitat Politècnica de València. Camino de Vera s/n. 46022 Valencia, España
  - Centre for Health Technologies and Services Research, University of Porto, Porto, Portugal
- Oscar Zurriaga
  - Dirección General de Salud Pública, Conselleria de Sanidad, Valencia, Spain
  - FISABIO – Salud Pública, Consellería de Sanidad, Valencia, Spain
  - CIBERESP, Madrid, Spain
- Inma Melchor
  - Dirección General de Salud Pública, Conselleria de Sanidad, Valencia, Spain
- Montserrat Robles
  - Instituto Universitario de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas. Universitat Politècnica de València. Camino de Vera s/n. 46022 Valencia, España
- Juan M García-Gómez
  - Instituto Universitario de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas. Universitat Politècnica de València. Camino de Vera s/n. 46022 Valencia, España
  - Unidad Mixta de Investigación en TICs aplicadas a la Reingeniería de Procesos Sociosanitarios (eRPSS), Instituto de Investigación Sanitaria del Hospital Universitario y Politécnico La Fe, Valencia, Spain
7. Hazlehurst BL, Kurtz SE, Masica A, Stevens VJ, McBurnie MA, Puro JE, Vijayadeva V, Au DH, Brannon ED, Sittig DF. CER Hub: An informatics platform for conducting comparative effectiveness research using multi-institutional, heterogeneous, electronic clinical data. Int J Med Inform 2015; 84:763-73. [PMID: 26138036 DOI: 10.1016/j.ijmedinf.2015.06.002] [Received: 12/10/2014] [Revised: 02/17/2015] [Accepted: 06/02/2015] [Indexed: 02/08/2023]
Abstract
OBJECTIVES Comparative effectiveness research (CER) requires the capture and analysis of data from disparate sources, often from a variety of institutions with diverse electronic health record (EHR) implementations. In this paper we describe the CER Hub, a web-based informatics platform for developing and conducting research studies that combine comprehensive electronic clinical data from multiple health care organizations. METHODS The CER Hub platform implements a data processing pipeline that employs informatics standards for data representation and web-based tools for developing study-specific data processing applications, providing standardized access to the patient-centric electronic health record (EHR) across organizations. RESULTS The CER Hub is being used to conduct two CER studies utilizing data from six geographically distributed and demographically diverse health systems. These foundational studies address the effectiveness of medications for controlling asthma and the effectiveness of smoking cessation services delivered in primary care. DISCUSSION The CER Hub includes four key capabilities: the ability to process and analyze both free-text and coded clinical data in the EHR; a data processing environment supported by distributed data and study governance processes; a clinical data-interchange format for facilitating standardized extraction of clinical data from EHRs; and a library of shareable clinical data processing applications. CONCLUSION CER requires coordinated and scalable methods for extracting, aggregating, and analyzing complex, multi-institutional clinical data. By offering a range of informatics tools integrated into a framework for conducting studies using EHR data, the CER Hub provides a solution to the challenges of multi-institutional research using electronic medical record data.
Affiliation(s)
- Brian L Hazlehurst
  - Kaiser Permanente Northwest, Center for Health Research, Portland, OR, USA
- Stephen E Kurtz
  - Kaiser Permanente Northwest, Center for Health Research, Portland, OR, USA
- Andrew Masica
  - Baylor Scott & White Health, Center for Clinical Effectiveness, Dallas, TX, USA
- Victor J Stevens
  - Kaiser Permanente Northwest, Center for Health Research, Portland, OR, USA
- Mary Ann McBurnie
  - Kaiser Permanente Northwest, Center for Health Research, Portland, OR, USA
- David H Au
  - VA Puget Sound Health Care System, Seattle, WA, USA
- Dean F Sittig
  - University of Texas Health Science Center, School of Biomedical Informatics, Houston, TX, USA