1
|
Xu Y, Zheng X, Li Y, Ye X, Cheng H, Wang H, Lyu J. Exploring patient medication adherence and data mining methods in clinical big data: A contemporary review. J Evid Based Med 2023; 16:342-375. [PMID: 37718729 DOI: 10.1111/jebm.12548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 08/30/2023] [Indexed: 09/19/2023]
Abstract
BACKGROUND Increasingly, patient medication adherence data are being consolidated from claims databases and electronic health records (EHRs). Such databases offer an indirect avenue to gauge medication adherence in our data-rich healthcare milieu. The surge in data accessibility, coupled with the pressing need for its conversion to actionable insights, has spotlighted data mining, with machine learning (ML) emerging as a pivotal technique. Nonadherence poses heightened health risks and escalates medical costs. This paper elucidates the synergistic interaction between medical database mining for medication adherence and the role of ML in fostering knowledge discovery. METHODS We conducted a comprehensive review of EHR applications in the realm of medication adherence, leveraging ML techniques. We expounded on the evolution and structure of medical databases pertinent to medication adherence and harnessed both supervised and unsupervised ML paradigms to delve into adherence and its ramifications. RESULTS Our study underscores the applications of medical databases and ML, encompassing both supervised and unsupervised learning, for medication adherence in clinical big data. Databases like SEER and NHANES, often underutilized due to their intricacies, have gained prominence. Employing ML to excavate patient medication logs from these databases facilitates adherence analysis. Such findings are pivotal for clinical decision-making, risk stratification, and scholarly pursuits, aiming to elevate healthcare quality. CONCLUSION Advanced data mining in the era of big data has revolutionized medication adherence research, thereby enhancing patient care. Emphasizing bespoke interventions and research could herald transformative shifts in therapeutic modalities.
Affiliation(s)
- Yixian Xu
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Xinkai Zheng
- Department of Dermatology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Yuanjie Li
- Planning & Discipline Construction Office, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Xinmiao Ye
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Hongtao Cheng
- School of Nursing, Jinan University, Guangzhou, China
- Hao Wang
- Department of Anesthesiology, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Jun Lyu
- Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Traditional Chinese Medicine Informatization, Guangzhou, China
|
2
|
Hossain ME, Khan A, Moni MA, Uddin S. Use of Electronic Health Data for Disease Prediction: A Comprehensive Literature Review. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:745-758. [PMID: 31478869 DOI: 10.1109/tcbb.2019.2937862] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 05/20/2023]
Abstract
Disease prediction has the potential to benefit stakeholders such as the government and health insurance companies. It can identify patients at risk of disease or health conditions. Clinicians can then take appropriate measures to avoid or minimize the risk and in turn, improve quality of care and avoid potential hospital admissions. Due to the recent advancement of tools and techniques for data analytics, disease risk prediction can leverage large amounts of semantic information, such as demographics, clinical diagnosis and measurements, health behaviours, laboratory results, prescriptions and care utilisation. In this regard, electronic health data can be a potential choice for developing disease prediction models. A significant number of such disease prediction models have been proposed in the literature over time utilizing large-scale electronic health databases, different methods, and healthcare variables. The goal of this comprehensive literature review was to discuss different risk prediction models that have been proposed based on electronic health data. Search terms were designed to find relevant research articles that utilized electronic health data to predict disease risks. Online scholarly databases were searched to retrieve results, which were then reviewed and compared in terms of the method used, disease type, and prediction accuracy. This paper provides a comprehensive review of the use of electronic health data for risk prediction models. A comparison of the results from different techniques for three frequently modelled diseases using electronic health data was also discussed in this study. In addition, the advantages and disadvantages of different risk prediction models, as well as their performance, were presented. Electronic health data have been widely used for disease prediction. A few modelling approaches show very high accuracy in predicting different diseases using such data. 
These modelling approaches have been used to inform the clinical decision process to achieve better outcomes.
|
3
|
A Systematic Review of Network Studies Based on Administrative Health Data. International Journal of Environmental Research and Public Health 2020; 17:ijerph17072568. [PMID: 32283623 PMCID: PMC7177895 DOI: 10.3390/ijerph17072568] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/10/2020] [Revised: 04/05/2020] [Accepted: 04/06/2020] [Indexed: 11/17/2022]
Abstract
Effective and efficient delivery of healthcare services requires comprehensive collaboration and coordination between healthcare entities and their complex inter-reliant activities. This inter-relation and coordination lead to different networks among diverse healthcare stakeholders. It is important to understand the varied dynamics of these networks to measure the efficiency of healthcare delivery services. To date, however, a work that systematically reviews these networks outlined in different studies is missing. This article provides a comprehensive summary of studies that have focused on networks and administrative health data. By summarizing different aspects including research objectives, key research questions, adopted methods, strengths and weaknesses, this research provides insights into the inherently complex and interlinked networks present in healthcare services. The outcome of this research is important to healthcare management and may guide further research in this area.
|
4
|
Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak 2019; 19:281. [PMID: 31864346 PMCID: PMC6925840 DOI: 10.1186/s12911-019-1004-8] [Citation(s) in RCA: 377] [Impact Index Per Article: 75.4] [Received: 01/28/2019] [Accepted: 12/11/2019] [Indexed: 12/17/2022]
Abstract
Background Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. Methods In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm on single disease prediction. Two databases (i.e., Scopus and PubMed) were searched for different types of search items. Thus, we selected 48 articles in total for the comparison among variants of supervised machine learning algorithms for disease prediction. Results We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies) followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed superior accuracy comparatively. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM, which topped in 41% of the studies in which it was considered. Conclusion This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This important information of relative performance can be used to aid researchers in the selection of an appropriate supervised machine learning algorithm for their studies.
Affiliation(s)
- Shahadat Uddin
- Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Room 524, SIT Building (J12), Darlington, NSW, 2008, Australia
- Arif Khan
- Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Room 524, SIT Building (J12), Darlington, NSW, 2008, Australia; Health Market Quality Research Stream, Capital Markets CRC, Level 3, 55 Harrington Street, Sydney, NSW, Australia
- Md Ekramul Hossain
- Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Room 524, SIT Building (J12), Darlington, NSW, 2008, Australia
- Mohammad Ali Moni
- Faculty of Medicine and Health, School of Medical Sciences, The University of Sydney, Camperdown, NSW, 2006, Australia
|
5
|
Use of Real-World Data Sources for Canadian Drug Pricing and Reimbursement Decisions: Stakeholder Views and Lessons for Other Countries. Int J Technol Assess Health Care 2019; 35:181-188. [DOI: 10.1017/s0266462319000291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]
Abstract
Background Canada has a long history of the use of clinical evidence to support healthcare decision making. Given improvements in data holdings and analytic capacity in Canada and stakeholder interest, the purpose of this study is to reflect on perceptions of the value of real-world evidence in pricing and reimbursement decisions, barriers to its optimal use in pricing and reimbursement, current initiatives that may lead to its increased use, and what role the pharmaceutical industry may play in this. Methods/Results To capture stakeholder perceptions, ninety-one key stakeholders were identified according to background roles and geography and invited to participate in four round table discussions conducted under the Chatham House Rule. Important themes emerging from these discussions included: (i) the need to understand what “real world” evidence means; (ii) barriers to using real world evidence from differences in access, governance, inter-operability, system structures, expertise, and quality across Canadian health systems; (iii) differing views on industry's role. Conclusions The use of real-world data in Canada to inform pricing and reimbursement decisions is far from routine but nascent and slowly increasing. Barriers, including interoperability concerns, may also apply to other federated health systems that need to focus on the networking of healthcare administrative data across provincial jurisdictional boundaries. There also appears to be a desire to see better use of pragmatic trials linked to these administrative data sets. Emerging initiatives are under way to use real world evidence more broadly, and include identification of common data elements and approaches to networking data.
|
6
|
Uddin S, Kelaher M, Srinivasan U. A framework for administrative claim data to explore healthcare coordination and collaboration. Aust Health Rev 2016; 40:500-510. [DOI: 10.1071/ah15058] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 03/23/2015] [Accepted: 09/25/2015] [Indexed: 11/23/2022]
Abstract
Previous studies have documented the application of electronic health insurance claim data for health services research purposes. In addition to administrative and billing details of healthcare services, insurance data reveal important information regarding professional interactions and/or links that emerge among healthcare service providers through, for example, informal knowledge sharing. By using details of such professional interactions and social network analysis methods, the aim of the present study was to develop a research framework to explore health care coordination and collaboration. The proposed framework was used to analyse a patient-centric care coordination network and a physician collaboration network. The usefulness of this framework and its applications in exploring collaborative efforts of different healthcare professionals and service providers is discussed.
What is known about the topic?
Application of methods and measures of social network analytics in exploring different health care collaboration and coordination networks is a comparatively new research direction. It is apparent that no other study in the present healthcare literature proposes a generic framework for examining health care collaboration and coordination using an administrative claim dataset.
What does this paper add?
Using methods and measures of social network analytics, this paper proposes a generic framework for analysing various health care collaboration and coordination networks extracted from an administrative claim dataset.
What are the implications for the practitioners?
Healthcare managers or administrators can use the framework proposed in the present study to evaluate organisational functioning in terms of effective collaboration and coordination of care in their respective healthcare organisations.
|
7
|
Uddin S, Hamra J, Hossain L. Mapping and modeling of physician collaboration network. Stat Med 2013; 32:3539-51. [PMID: 23468249 DOI: 10.1002/sim.5770] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/22/2012] [Revised: 01/23/2013] [Accepted: 01/30/2013] [Indexed: 11/07/2022]
Abstract
Effective provisioning of healthcare services during patient hospitalization requires collaboration involving a set of interdependent complex tasks, which needs to be carried out in a synergistic manner. Improved patients' outcome during and after hospitalization has been attributed to how effective different health services provisioning groups carry out their tasks in a coordinated manner. Previous studies have documented the underlying relationships between collaboration among physicians on the effective outcome in delivering health services for improved patient outcomes. However, there are very few systematic empirical studies with a focus on the effect of collaboration networks among healthcare professionals and patients' medical condition. On the basis of the fact that collaboration evolves among physicians when they visit a common hospitalized patient, in this study, we first propose an approach to map collaboration network among physicians from their visiting information to patients. We termed this network as physician collaboration network (PCN). Then, we use exponential random graph (ERG) models to explore the microlevel network structures of PCNs and their impact on hospitalization cost and hospital readmission rate. ERG models are probabilistic models that are presented by locally determined explanatory variables and can effectively identify structural properties of networks such as PCN. It simplifies a complex structure down to a combination of basic parameters such as 2-star, 3-star, and triangle. By applying our proposed mapping approach and ERG modeling technique to the electronic health insurance claims dataset of a very large Australian health insurance organization, we construct and model PCNs. We notice that the 2-star (subset of 3 nodes in which 1 node is connected to each of the other 2 nodes) parameter of ERG has significant impact on hospitalization cost. 
Further, we identify that triangle (subset of 3 nodes in which each node is connected to the rest 2 nodes), alternative k-star (subset of k nodes in which 1 node is connected to each of other k - 1 nodes), and alternative k - 2 path (subset of k nodes in which, between a specific pair of nodes, there exists k - 2 paths of length 2) parameters of ERG have impact on the hospital readmission rate. Our findings can have implications for healthcare administrators or managers who could potentially improve the practice cultures in their organizations by following these outcomes.
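The 2-star and triangle configurations defined in this abstract can be illustrated with a minimal sketch that counts them in a small undirected network. This is not the authors' method: it only computes the raw counts that such parameters correspond to and does not fit an ERG model. The function name and the example edge list are hypothetical.

```python
from itertools import combinations


def count_two_stars_and_triangles(edges):
    """Count 2-stars and triangles in an undirected graph.

    edges: iterable of (u, v) pairs; self-loops are ignored.
    Illustrative only: these are raw configuration counts, not
    fitted ERG model parameters.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # A 2-star is one node connected to each of two other nodes,
    # so a node of degree d is the centre of d * (d - 1) / 2 of them.
    two_stars = sum(
        len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adjacency.values()
    )

    # A triangle is three nodes in which every pair is connected.
    triangles = sum(
        1
        for a, b, c in combinations(sorted(adjacency), 3)
        if b in adjacency[a] and c in adjacency[a] and c in adjacency[b]
    )
    return two_stars, triangles


# Hypothetical physicians A-D; an edge means two physicians
# visited a common hospitalized patient.
example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
two_stars, triangles = count_two_stars_and_triangles(example_edges)
```

The brute-force triangle check is cubic in the number of nodes, which is acceptable for a small illustration but not for a network built from a full claims dataset.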
Affiliation(s)
- Shahadat Uddin
- Centre for Complex Systems Research, The University of Sydney, Room 402, Civil Engineering Building, Australia.
|
8
|
Losina E, Barrett J, Baron JA, Katz JN. Accuracy of Medicare claims data for rheumatologic diagnoses in total hip replacement recipients. J Clin Epidemiol 2003; 56:515-9. [PMID: 12873645 DOI: 10.1016/s0895-4356(03)00056-8] [Citation(s) in RCA: 81] [Impact Index Per Article: 3.9] [Indexed: 11/29/2022]
Abstract
This analysis was performed to examine whether Medicare claims accurately document underlying rheumatologic diagnoses in total hip replacement (THR) recipients. We obtained data on rheumatologic diagnoses including rheumatoid arthritis (RA), avascular necrosis (AVN), and osteoarthritis (OA) from medical records and from Medicare claims data. To examine the accuracy of claims data we calculated sensitivity and positive predictive value using medical records data as the "gold standard" and assessed bias due to misclassification of claims-based diagnoses. The sensitivities of claims-based diagnoses of RA, AVN, and OA were 0.65, 0.54, and 0.96, respectively; the positive predictive values were all in the 0.86-0.89 range. The sensitivities of RA and AVN varied substantially across hospital volume strata, but in different directions for the two diagnoses. We conclude that inaccuracies in claims coding of diagnoses are frequent, and are potential sources of bias. More studies are needed to examine the magnitude and direction of bias in health outcomes research due to inaccuracy of claims coding for specific diagnoses.
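The two accuracy measures named in this abstract can be sketched as follows, assuming each patient has a binary claims-based flag and a binary medical-record flag for a given diagnosis, with the medical record treated as the gold standard. The function name and the example flags are hypothetical and are not data from the study.

```python
def sensitivity_and_ppv(claims_flags, record_flags):
    """Return (sensitivity, positive predictive value) of claims data.

    claims_flags: per-patient booleans, diagnosis present in claims.
    record_flags: per-patient booleans, diagnosis present in the
                  medical record (treated as the gold standard).
    """
    if len(claims_flags) != len(record_flags):
        raise ValueError("flag lists must have the same length")

    pairs = list(zip(claims_flags, record_flags))
    true_pos = sum(1 for c, r in pairs if c and r)
    false_neg = sum(1 for c, r in pairs if not c and r)
    false_pos = sum(1 for c, r in pairs if c and not r)

    # Sensitivity: share of record-confirmed cases that claims capture.
    sensitivity = (
        true_pos / (true_pos + false_neg)
        if (true_pos + false_neg) else float("nan")
    )
    # PPV: share of claims-flagged cases that the record confirms.
    ppv = (
        true_pos / (true_pos + false_pos)
        if (true_pos + false_pos) else float("nan")
    )
    return sensitivity, ppv


# Hypothetical flags for five patients.
claims = [True, True, False, False, True]
records = [True, False, True, False, True]
sens, ppv = sensitivity_and_ppv(claims, records)
```

A diagnosis can have a high PPV and a low sensitivity at the same time, which is the pattern the abstract reports for RA and AVN: most claims-coded cases are real, but many real cases are never coded.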
Affiliation(s)
- Elena Losina
- Department of Epidemiology and Biostatistics, Boston University School of Public Health, 715 Albany Street, Talbot 3-E, Boston, MA 02118, USA.
|
9
|
Schiff GD, Young QD. You can't leap a chasm in two jumps: The Institute of Medicine health care quality report. Public Health Rep 2001. [DOI: 10.1016/s0033-3549(04)50067-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
|
10
|
Schiff GD, Young QD. You can't leap a chasm in two jumps: The Institute of Medicine health care quality report. Public Health Rep 2001; 116:396-403. [PMID: 12042603 PMCID: PMC1497368 DOI: 10.1093/phr/116.5.396] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/15/2022]
Affiliation(s)
- G D Schiff
- Department of Medicine, Cook County Hospital and Rush Medical College, Chicago, IL 60612, USA.
|
11
|
Parente ST, Weiner JP, Garnick DW, Richards TM, Fowles J, Lawthers AG, Chandler P, Palmer RH. Developing a quality improvement database using health insurance data: a guided tour with application to Medicare's National Claims History file. Am J Med Qual 1995; 10:162-76. [PMID: 8547795 DOI: 10.1177/0885713x9501000402] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 01/31/2023]
Abstract
Health policy researchers are increasingly turning to insurance claims to provide timely information on cost, utilization, and quality trends in health care markets. This research offers an in-depth description of how to systematically transform raw inpatient and ambulatory claims data into useful information for health care management and research using the Health Care Financing Administration's National Claims History file as an example. The topics covered include: (a) understanding the contents and architecture of claims data, (b) creating analytic files from raw claims, (c) technical innovations for health policy studies, (d) assessing data accuracy, (e) the costs of using claims data, and (f) ensuring confidentiality. In summary, claims data are found to have great potential for quality of care analysis. As in any analysis, careful development of a database is required for scientific research. The methods outlined in this study offer health data novices as well as experienced analysts a series of strategies to maximize the value of claims data for health policy analysis.
Affiliation(s)
- S T Parente
- Center for Health Affairs, Project HOPE, Bethesda, MD 20814, USA
|
12
|
Abstract
Concern about the quality, cost, and outcomes of health care has become a driving force in health policy research. The growing accessibility of large clinical and administrative health care data bases has led to an interest in using such data in health policy research. Clinical data bases are created by providers of care and contain data about episodes and outcomes of care, usually organized as patient records. Administrative data bases contain data about indirect care processes such as insurance claims processing, vital event recording, and quality assurance. Clinical and administrative data bases may contain millions of records, consist of data from multiple sites, and often have missing data issues that must be considered by researchers. These and other characteristics of large data bases require special data manipulation and analytic techniques. Large data bases have been used in epidemiological studies, risk assessment, and technology assessment and to study variations in caregiver practice patterns. Because the use of large data bases by nurse researchers has been constrained by the lack of nursing-relevant data in them, there is a need to reach consensus on useful and feasible nursing data elements and to include those data in ongoing data collection efforts by government agencies and private organizations.
Affiliation(s)
- L L Lange
- Nursing Informatics Program, University of Utah, Salt Lake City 84103
|
13
|
Abstract
Using claims data for a 5% random sample of Medicare beneficiaries, we estimated the costs of surgical treatment for benign prostatic hyperplasia (BPH), including those related to the initial prostatectomy, the treatment of postsurgical complications, and reoperation within one year. We identified 14,480 men who underwent prostatectomy for BPH during 1986-1987, including 13,730 transurethral and 750 open procedures. Mean total inpatient costs (including all hospital charges and professional service fees) for these procedures were estimated to be $6,501 and $10,223, respectively. Among patients who underwent transurethral and open prostatectomy, we identified 938 (6.8%) and 39 (5.2%) individuals who had at least one readmission for postsurgical complications or reoperation. Total expected costs of transurethral and open prostatectomy, inclusive of readmissions for complications and reoperations within one year, were estimated to be $6,823 and $10,477, respectively. Our study indicates the economic burden represented by surgical treatment of BPH.
Affiliation(s)
- K A Weis
- Center for Medical Effectiveness Research, Agency for Health Care Policy and Research, Rockville, MD 20857
|
14
|
Sherman CR, Potosky AL, Weis KA, Ferguson JH. The Consensus Development Program. Detecting changes in medical practice following a consensus conference on the treatment of prostate cancer. Int J Technol Assess Health Care 1992; 8:683-93. [PMID: 1464488 DOI: 10.1017/s0266462300002373] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Abstract
The treatment of prostate cancer was reviewed at a U.S. National Institutes of Health Consensus Development Conference in June 1987. Data from the U.S. National Cancer Institute's Surveillance, Epidemiology, and End Results tumor registries were analyzed and showed that the proportion of eligible prostate cancer patients receiving the recommended therapies did not increase at a faster rate after the conference than before.
|