1
Zhang Y, Callaghan-Koru JA, Koru G. The challenges and opportunities of continuous data quality improvement for healthcare administration data. JAMIA Open 2024; 7:ooae058. [PMID: 39091510 PMCID: PMC11293638 DOI: 10.1093/jamiaopen/ooae058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 05/12/2024] [Accepted: 06/18/2024] [Indexed: 08/04/2024] Open
Abstract
Background: Various data quality issues have prevented healthcare administration data from being fully utilized when dealing with problems ranging from COVID-19 contact tracing to controlling healthcare costs.
Objectives: (i) Describe the currently adopted approaches and practices for understanding and improving the quality of healthcare administration data. (ii) Explore the challenges and opportunities to achieve continuous quality improvement for such data.
Materials and Methods: We used a qualitative approach to obtain rich contextual data through semi-structured interviews conducted at a state health agency regarding Medicaid claims and reimbursement data. We interviewed all data stewards knowledgeable about the data quality issues experienced at the agency. The qualitative data were analyzed using the Framework method.
Results: Sixteen themes emerged from our analysis, collected under 4 categories: (i) Defect characteristics: Data defects showed variability, frequently remained obscure, and led to negative outcomes. Detecting and resolving them was often difficult, and the work required often exceeded the organizational boundaries. (ii) Current process and people issues: The agency adopted primarily ad-hoc, manual approaches to resolving data quality problems leading to work frustration. (iii) Challenges: Communication and lack of knowledge about legacy software systems and the data maintained in them constituted challenges, followed by different standards used by various organizations and vendors, and data verification difficulties. (iv) Opportunities: Training, tool support, and standardization of data definitions emerged as immediate opportunities to improve data quality.
Conclusions: Our results can be useful to similar agencies on their journey toward becoming learning health organizations leveraging data assets effectively and efficiently.
Affiliation(s)
- Yili Zhang: Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC 20007, United States
- Jennifer A Callaghan-Koru: Department of Internal Medicine, University of Arkansas for Medical Sciences, Fayetteville, AR 72703, United States
- Güneş Koru: Departments of Health Policy and Management & Biomedical Informatics, University of Arkansas for Medical Sciences, Fayetteville, AR 72703, United States
2
Syed R, Eden R, Makasi T, Chukwudi I, Mamudu A, Kamalpour M, Kapugama Geeganage D, Sadeghianasl S, Leemans SJJ, Goel K, Andrews R, Wynn MT, Ter Hofstede A, Myers T. Digital Health Data Quality Issues: Systematic Review. J Med Internet Res 2023; 25:e42615. [PMID: 37000497 PMCID: PMC10131725 DOI: 10.2196/42615] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/12/2022] [Revised: 12/07/2022] [Accepted: 12/31/2022] [Indexed: 04/01/2023] Open
Abstract
BACKGROUND: The promise of digital health is principally dependent on the ability to electronically capture data that can be analyzed to improve decision-making. However, the ability to effectively harness data has proven elusive, largely because of the quality of the data captured. Despite the importance of data quality (DQ), an agreed-upon DQ taxonomy evades literature. When consolidated frameworks are developed, the dimensions are often fragmented, without consideration of the interrelationships among the dimensions or their resultant impact.
OBJECTIVE: The aim of this study was to develop a consolidated digital health DQ dimension and outcome (DQ-DO) framework to provide insights into 3 research questions: What are the dimensions of digital health DQ? How are the dimensions of digital health DQ related? and What are the impacts of digital health DQ?
METHODS: Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a developmental systematic literature review was conducted of peer-reviewed literature focusing on digital health DQ in predominately hospital settings. A total of 227 relevant articles were retrieved and inductively analyzed to identify digital health DQ dimensions and outcomes. The inductive analysis was performed through open coding, constant comparison, and card sorting with subject matter experts to identify digital health DQ dimensions and digital health DQ outcomes. Subsequently, a computer-assisted analysis was performed and verified by DQ experts to identify the interrelationships among the DQ dimensions and relationships between DQ dimensions and outcomes. The analysis resulted in the development of the DQ-DO framework.
RESULTS: The digital health DQ-DO framework consists of 6 dimensions of DQ, namely accessibility, accuracy, completeness, consistency, contextual validity, and currency; interrelationships among the dimensions of digital health DQ, with consistency being the most influential dimension impacting all other digital health DQ dimensions; 5 digital health DQ outcomes, namely clinical, clinician, research-related, business process, and organizational outcomes; and relationships between the digital health DQ dimensions and DQ outcomes, with the consistency and accessibility dimensions impacting all DQ outcomes.
CONCLUSIONS: The DQ-DO framework developed in this study demonstrates the complexity of digital health DQ and the necessity for reducing digital health DQ issues. The framework further provides health care executives with holistic insights into DQ issues and resultant outcomes, which can help them prioritize which DQ-related problems to tackle first.
Affiliation(s)
- Rehan Syed: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Rebekah Eden: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Tendai Makasi: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Ignatius Chukwudi: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Azumah Mamudu: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Mostafa Kamalpour: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Dakshi Kapugama Geeganage: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Sareh Sadeghianasl: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Sander J J Leemans: Rheinisch-Westfälische Technische Hochschule, Aachen University, Aachen, Germany
- Kanika Goel: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Robert Andrews: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Moe Thandar Wynn: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Arthur Ter Hofstede: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Trina Myers: School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
3
Ahmad FS, Luo Y, Wehbe RM, Thomas JD, Shah SJ. Advances in Machine Learning Approaches to Heart Failure with Preserved Ejection Fraction. Heart Fail Clin 2022; 18:287-300. [PMID: 35341541 PMCID: PMC8983114 DOI: 10.1016/j.hfc.2021.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/14/2022]
Abstract
Heart failure with preserved ejection fraction (HFpEF) represents a prototypical cardiovascular condition in which machine learning may improve targeted therapies and mechanistic understanding of pathogenesis. Machine learning, which involves algorithms that learn from data, has the potential to guide precision medicine approaches for complex clinical syndromes such as HFpEF. It is therefore important to understand the potential utility and common pitfalls of machine learning so that it can be applied and interpreted appropriately. Although machine learning holds considerable promise for HFpEF, it is subject to several potential pitfalls, which are important factors to consider when interpreting machine learning studies.
Affiliation(s)
- Faraz S. Ahmad: Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Bluhm Cardiovascular Institute Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL
- Yuan Luo: Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Bluhm Cardiovascular Institute Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL
- Ramsey M. Wehbe: Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Bluhm Cardiovascular Institute Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL
- James D. Thomas: Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Bluhm Cardiovascular Institute Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL
- Sanjiv J. Shah: Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL; Bluhm Cardiovascular Institute Center for Artificial Intelligence, Northwestern Medicine, Chicago, IL
4
Bian J, Lyu T, Loiacono A, Viramontes TM, Lipori G, Guo Y, Wu Y, Prosperi M, George TJ, Harle CA, Shenkman EA, Hogan W. Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data. J Am Med Inform Assoc 2021; 27:1999-2010. [PMID: 33166397 PMCID: PMC7727392 DOI: 10.1093/jamia/ocaa245] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/28/2020] [Revised: 09/13/2020] [Accepted: 09/18/2020] [Indexed: 11/13/2022] Open
Abstract
Objective: To synthesize data quality (DQ) dimensions and assessment methods of real-world data, especially electronic health records, through a systematic scoping review and to assess the practice of DQ assessment in the national Patient-centered Clinical Research Network (PCORnet).
Materials and Methods: We started with 3 widely cited DQ literature—2 reviews from Chan et al (2010) and Weiskopf et al (2013a) and 1 DQ framework from Kahn et al (2016)—and expanded our review systematically to cover relevant articles published up to February 2020. We extracted DQ dimensions and assessment methods from these studies, mapped their relationships, and organized a synthesized summarization of existing DQ dimensions and assessment methods. We reviewed the data checks employed by the PCORnet and mapped them to the synthesized DQ dimensions and methods.
Results: We analyzed a total of 3 reviews, 20 DQ frameworks, and 226 DQ studies and extracted 14 DQ dimensions and 10 assessment methods. We found that completeness, concordance, and correctness/accuracy were commonly assessed. Element presence, validity check, and conformance were commonly used DQ assessment methods and were the main focuses of the PCORnet data checks.
Discussion: Definitions of DQ dimensions and methods were not consistent in the literature, and the DQ assessment practice was not evenly distributed (eg, usability and ease-of-use were rarely discussed). Challenges in DQ assessments, given the complex and heterogeneous nature of real-world data, exist.
Conclusion: The practice of DQ assessment is still limited in scope. Future work is warranted to generate understandable, executable, and reusable DQ measures.
Affiliation(s)
- Jiang Bian: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, Florida, USA
- Tianchen Lyu: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Alexander Loiacono: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Tonatiuh Mendoza Viramontes: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Gloria Lipori: Clinical and Translational Institute, University of Florida, Gainesville, Florida, USA
- Yi Guo: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Yonghui Wu: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Mattia Prosperi: Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, Florida, USA
- Thomas J George: Hematology & Oncology, Department of Medicine, College of Medicine, University of Florida, Gainesville, Florida, USA
- Christopher A Harle: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Elizabeth A Shenkman: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- William Hogan: Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
5
Horth RZ, Wagstaff S, Jeppson T, Patel V, McClellan J, Bissonette N, Friedrichs M, Dunn AC. Use of electronic health records from a statewide health information exchange to support public health surveillance of diabetes and hypertension. BMC Public Health 2019; 19:1106. [PMID: 31412826 PMCID: PMC6694493 DOI: 10.1186/s12889-019-7367-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/08/2019] [Accepted: 07/24/2019] [Indexed: 11/12/2022] Open
Abstract
Background: Electronic health record (EHR) data, collected primarily for individual patient care and billing purposes, compiled in health information exchanges (HIEs) may have a secondary use for population health surveillance of noncommunicable diseases. However, data compilation across fragmented data sources into HIEs presents potential barriers and quality of data is unknown.
Methods: We compared 2015 patient data from a mid-size health system (Database A) to data from System A patients in the Utah HIE (Database B). We calculated concordance of structured data (sex and age) and unstructured data (blood pressure reading and A1C). We estimated adjusted hypertension and diabetes prevalence in each database and compared these across age groups.
Results: Matching resulted in 72,356 unique patients. Concordance between Database A and Database B exceeded 99% for sex and age, but was 89% for A1C results and 54% for blood pressure readings. Sensitivity, using Database A as the standard, was 57% for hypertension and 55% for diabetes. Age and sex adjusted prevalence of diabetes (8.4% vs 5.8%, Database A and B, respectively) and hypertension (14.5% vs 11.6%, respectively) differed, but this difference was consistent with parallel slopes in prevalence over age groups in both databases.
Conclusions: We identified several gaps in the use of HIE data for surveillance of diabetes and hypertension. High concordance of structured data demonstrate some promise in HIEs capacity to capture patient data. Improving HIE data quality through increased use of structured variables may help make HIE data useful for population health surveillance in places with fragmented EHR systems.
Electronic supplementary material: The online version of this article (10.1186/s12889-019-7367-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Roberta Z Horth: Epidemic Intelligence Service, Division of Scientific Education and Professional Development, CDC, Atlanta, Georgia, USA; Utah Department of Health, Salt Lake City, UT, 84114, USA
- Theron Jeppson: Utah Department of Health, Salt Lake City, UT, 84114, USA
- Vishal Patel: Utah Health Information Network, Murray, UT, USA
- Angela C Dunn: Utah Department of Health, Salt Lake City, UT, 84114, USA
6
Johnson JK, Erickson JA, Miller CJ, Fritz JM, Marcus RL, Pelt CE. Short-term functional recovery after total joint arthroplasty is unaffected by bundled payment participation. Arthroplast Today 2019; 5:119-125. [PMID: 31020035 PMCID: PMC6470353 DOI: 10.1016/j.artd.2018.12.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/22/2018] [Revised: 12/22/2018] [Accepted: 12/29/2018] [Indexed: 11/18/2022] Open
Abstract
Background: Bundled payment models for lower extremity total joint arthroplasty (TJA) aim to improve value by decreasing costs via efficient care pathways. It is unclear how such models affect patient-centered outcomes such as functional recovery. We aimed to determine whether participation in bundled payment for TJA negatively affects patients’ functional recovery.
Methods: All patients, regardless of payer, undergoing elective TJA between July 2014 and December 2016 were identified retrospectively and categorized into prebundle (n = 680) and postbundle (n = 1216) cohorts. Mixed-effects linear regression and Wald postests were used to test for differences in patients’ functional recovery during the hospital period and over 12 months after TJA between cohorts. We also used multivariate regression to test for differences in hospital length of stay (LOS) and postacute care (PAC) facility use between cohorts.
Results: Compared with the prebundle cohort, patients in the postbundle cohort demonstrated a small and nonmeaningful difference in the trajectory of functional recovery in the hospital [χ²(3) = 31.3, P < .01] and no difference in the 12 months after TJA [χ²(3) = 3.9, P = .28]. They had a 0.4-day shorter hospital LOS (95% confidence interval: −0.5, −0.3) and decreased odds for PAC facility use (adjusted odds ratio = 0.3; 95% confidence interval: 0.2, 0.4).
Conclusions: Participation in bundled payment for TJA was not associated with significant changes in patients’ functional recovery, an important patient-centered outcome. For the postbundle cohort, hospital LOS and PAC facility use were decreased, consistent with previous studies describing cost-saving strategies in bundled payment. These findings support the need for an ongoing study of the long-term sustainability of these value-based payment models.
Affiliation(s)
- Joshua K. Johnson: Department of Physical Therapy and Athletic Training, University of Utah, Salt Lake City, UT, USA. Corresponding author. Department of Physical Therapy and Athletic Training, University of Utah, 520 Wakara Way, Salt Lake City, UT 84108, USA. Tel.: +1 216 903 0621.
- Jill A. Erickson: Department of Orthopaedics, University of Utah, Salt Lake City, UT, USA
- Caitlin J. Miller: Department of Physical Therapy and Athletic Training, University of Utah, Salt Lake City, UT, USA
- Julie M. Fritz: Department of Physical Therapy and Athletic Training, University of Utah, Salt Lake City, UT, USA
- Robin L. Marcus: Department of Physical Therapy and Athletic Training, University of Utah, Salt Lake City, UT, USA
8
Estiri H, Stephens K. DQe-v: A Database-Agnostic Framework for Exploring Variability in Electronic Health Record Data Across Time and Site Location. EGEMS 2017; 5:3. [PMID: 29930954 PMCID: PMC5994933 DOI: 10.13063/2327-9214.1277] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/26/2022]
Abstract
Data variability is a commonly observed phenomenon in Electronic Health Records (EHR) data networks. A common question asked in scientific investigations of EHR data is whether the cross-site and -time variability reflects an underlying data quality error at one or more contributing sites versus actual differences driven by various idiosyncrasies in the healthcare settings. Although research analysts and data scientists have commonly used various statistical methods to detect and account for variability in analytic datasets, self service tools to facilitate exploring cross-organizational variability in EHR data warehouses are lacking and could benefit from meaningful data visualizations. DQe-v, an interactive, database-agnostic tool for visually exploring variability in EHR data provides such a solution. DQe-v is built on an open source platform, R statistical software, with annotated scripts and a readme document that makes it fully reproducible. To illustrate and describe functionality of DQe-v, we describe the DQe-v’s readme document which includes a complete guide to installation, running the program, and interpretation of the outputs. We also provide annotated R scripts and an example dataset as supplemental materials. DQe-v offers a self service tool to visually explore data variability within EHR datasets irrespective of the data model. GitHub and CIELO offer hosting and distribution of the tool and can facilitate collaboration across any interested community of users as we target improving usability, efficiency, and interoperability.
9
Piccinni C, Antonazzo IC, Simonetti M, Mennuni MG, Parretti D, Cricelli C, Colombo D, Nica M, Cricelli I, Lapi F. The Burden of Chronic Heart Failure in Primary Care in Italy. High Blood Press Cardiovasc Prev 2017; 24:171-178. [DOI: 10.1007/s40292-017-0193-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/09/2017] [Accepted: 03/15/2017] [Indexed: 12/12/2022] Open