1
Ding S, Zhang S, Hu X, Zou N. Identify and mitigate bias in electronic phenotyping: A comprehensive study from computational perspective. J Biomed Inform 2024; 156:104671. [PMID: 38876452 DOI: 10.1016/j.jbi.2024.104671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 05/26/2024] [Accepted: 06/05/2024] [Indexed: 06/16/2024]
Abstract
Electronic phenotyping is a fundamental task that identifies specific groups of patients and plays an important role in precision medicine in the era of digital health. Phenotyping provides real-world evidence for related biomedical research and clinical tasks, e.g., disease diagnosis, drug development, and clinical trials. With the development of electronic health records, the performance of electronic phenotyping has been significantly boosted by advanced machine learning techniques. In the healthcare domain, precision and fairness are both essential aspects that should be taken into consideration. However, most related efforts go into designing phenotyping models with higher accuracy, and little attention is paid to the fairness of phenotyping. Neglecting bias in phenotyping leads to subgroups of patients being underrepresented, which in turn affects downstream healthcare activities such as patient recruitment in clinical trials. In this work, we aim to bridge this gap through a comprehensive experimental study that identifies the bias in electronic phenotyping models and evaluates the performance of widely used debiasing methods on these models. We choose pneumonia and sepsis as our phenotyping target diseases. We benchmark 9 kinds of electronic phenotyping methods, spanning rule-based to data-driven methods. We also evaluate 5 bias mitigation strategies covering pre-processing, in-processing, and post-processing. From these extensive experiments, we summarize several findings on the bias identified in phenotyping and key points of bias mitigation strategies in phenotyping.
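The abstract does not say which fairness metrics the study uses. As one common way to make "bias in a phenotyping model" concrete, the sketch below compares the true-positive rate (sensitivity) of a phenotype classifier across patient subgroups. The function names, the group labels, and the toy data are all illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def tpr_by_group(y_true, y_pred, groups):
    """Return {group: true-positive rate}, computed over patients with y_true == 1."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / positives[g] for g in positives}


def tpr_gap(y_true, y_pred, groups):
    """Largest difference in true-positive rate between any two subgroups."""
    rates = tpr_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy, made-up labels: 1 = patient has the phenotype (hypothetical data).
y_true = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
groups = ["A"] * 5 + ["B"] * 5

rates = tpr_by_group(y_true, y_pred, groups)
gap = tpr_gap(y_true, y_pred, groups)
```

A gap near zero means the model finds true cases equally well in both subgroups. A large gap is one signal that a subgroup would be underrepresented in any cohort built from the model's output.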
Affiliation(s)
- Sirui Ding
  - Department of Computer Science & Engineering, Texas A&M University, College Station, TX, United States
- Shenghan Zhang
  - Department of Biomedical Informatics, Harvard University, Boston, MA, United States
- Xia Hu
  - Department of Computer Science, Rice University, Houston, TX, United States
- Na Zou
  - Department of Industrial Engineering, University of Houston, Houston, TX, United States.
2
Cao T, Brady V, Whisenant M, Wang X, Gu Y, Wu H. Toward Reliable Symptom Coding in Electronic Health Records for Symptom Assessment and Research: Identification and Categorization of International Classification of Diseases, Ninth Revision, Clinical Modification Symptom Codes. Comput Inform Nurs 2024:00024665-990000000-00209. [PMID: 38968447 DOI: 10.1097/cin.0000000000001146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/07/2024]
Abstract
To date, symptom documentation has mostly relied on clinical notes in electronic health records or patient-reported outcomes using disease-specific symptom inventories. To provide a common and precise language for symptom recording, assessment, and research, a comprehensive list of symptom codes is needed. The International Classification of Diseases, Ninth Revision or its clinical modification (International Classification of Diseases, Ninth Revision, Clinical Modification) has a range of codes designated for symptoms, but it does not contain codes for all possible symptoms, and not all codes in that range are symptom related. This study aimed to identify and categorize the first list of International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes for a general population and demonstrate their use to characterize symptoms of patients with type 2 diabetes mellitus in the Cerner database. A list of potential symptom codes was automatically extracted from the Unified Medical Language System Metathesaurus. Two clinical experts in symptom science and diabetes manually reviewed this list to identify and categorize codes as symptoms. A total of 1888 International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes were identified and categorized into 65 categories. The symptom characterization using the newly obtained symptom codes and categories was found to be more reasonable than that using the previous symptom codes and categories on the same Cerner diabetes cohort.
Affiliation(s)
- Tru Cao
  - Author Affiliations: UTHealth Houston School of Public Health (Drs Cao, Wang, and Wu and Mr Gu), UTHealth Houston Cizik School of Nursing (Dr Brady), and The University of Texas MD Anderson Cancer Center (Dr Whisenant)
3
Li Y, Yang AY, Marelli A, Li Y. MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. J Biomed Inform 2024; 153:104638. [PMID: 38631461 DOI: 10.1016/j.jbi.2024.104638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/07/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]
Abstract
Survival models can help medical practitioners evaluate the prognostic importance of clinical variables for patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high-dimensional, multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are threefold: (1) integrating EHR topic inference with the Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; and (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data, consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes, and MIMIC-III, consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC of 0.89 on the simulated dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge.
Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.
Affiliation(s)
- Yixuan Li
  - Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI Institute, Montreal, Canada
- Archer Y Yang
  - Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI Institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
- Ariane Marelli
  - McGill Adult Unit for Congenital Heart Disease (MAUDE Unit), McGill University Health Centre, Montreal, Canada.
- Yue Li
  - Mila - Quebec AI Institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
4
Kaufmann B, Busby D, Das CK, Tillu N, Menon M, Tewari AK, Gorin MA. Validation of a Zero-shot Learning Natural Language Processing Tool to Facilitate Data Abstraction for Urologic Research. Eur Urol Focus 2024; 10:279-287. [PMID: 38278710 DOI: 10.1016/j.euf.2024.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 12/18/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024]
Abstract
BACKGROUND Urologic research often requires data abstraction from unstructured text contained within the electronic health record. A number of natural language processing (NLP) tools have been developed to aid with this time-consuming task; however, the generalizability of these tools is typically limited by the need for task-specific training. OBJECTIVE To describe the development and validation of a zero-shot learning NLP tool to facilitate data abstraction from unstructured text for use in downstream urologic research. DESIGN, SETTING, AND PARTICIPANTS An NLP tool based on the GPT-3.5 model from OpenAI was developed and compared with three physicians for time to task completion and accuracy for abstracting 14 unique variables from a set of 199 deidentified radical prostatectomy pathology reports. The reports were processed in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction. INTERVENTION A zero-shot learning NLP tool for data abstraction. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS The tool was compared with the human abstractors in terms of superiority for data abstraction speed and noninferiority for accuracy. RESULTS AND LIMITATIONS The human abstractors required a median (interquartile range) of 93 s (72-122 s) per report for data abstraction, whereas the software required a median of 12 s (10-15 s) for the vectorized reports and 15 s (13-17 s) for the scanned reports (p < 0.001 for all paired comparisons). The accuracies of the three human abstractors were 94.7% (95% confidence interval [CI], 93.8-95.5%), 97.8% (95% CI, 97.2-98.3%), and 96.4% (95% CI, 95.6-97%) for the combined set of 2786 data points. The tool had accuracy of 94.2% (95% CI, 93.3-94.9%) for the vectorized reports and was noninferior to the human abstractors at a margin of -10% (α = 0.025). 
The tool had slightly lower accuracy of 88.7% (95% CI 87.5-89.9%) for the scanned reports, making it noninferior to two of three human abstractors. CONCLUSIONS The developed zero-shot learning NLP tool offers urologic researchers a highly generalizable and accurate method for data abstraction from unstructured text. An open access version of the tool is available for immediate use by the urologic community. PATIENT SUMMARY In this report, we describe the design and validation of an artificial intelligence tool for abstracting discrete data from unstructured notes contained within the electronic medical record. This freely available tool, which is based on the GPT-3.5 technology from OpenAI, is intended to facilitate research and scientific discovery by the urologic community.
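The reported figures can be sanity-checked with a short calculation. The sketch below reproduces the 2786 data points (14 variables × 199 reports) and applies a simple noninferiority check at the stated margin of -10% and one-sided α = 0.025. It assumes a normal approximation for the difference of two independent proportions. The abstract does not say which test the authors used (a paired analysis is plausible), so this is an illustration, not their method. The function name `noninferior` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

N_POINTS = 14 * 199  # variables per report x number of reports


def noninferior(p_tool, p_human, n=N_POINTS, margin=-0.10, alpha=0.025):
    """True if the lower confidence bound of (tool - human) accuracy exceeds the margin.

    Assumption: unpaired normal approximation with n data points per arm.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    diff = p_tool - p_human
    se = sqrt(p_tool * (1 - p_tool) / n + p_human * (1 - p_human) / n)
    return diff - z * se > margin


# Accuracies as reported in the abstract.
humans = [0.947, 0.978, 0.964]
vectorized = [noninferior(0.942, h) for h in humans]
scanned = [noninferior(0.887, h) for h in humans]
```

Under this approximation the vectorized reports pass against all three abstractors and the scanned reports pass against two of three, which agrees with the abstract.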
Affiliation(s)
- Basil Kaufmann
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Urology, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
- Dallin Busby
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Chandan Krushna Das
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Neeraja Tillu
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mani Menon
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ashutosh K Tewari
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael A Gorin
  - Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
5
Gao J, Bonzel CL, Hong C, Varghese P, Zakir K, Gronsbell J. Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms. J Am Med Inform Assoc 2024; 31:640-650. [PMID: 38128118 PMCID: PMC10873838 DOI: 10.1093/jamia/ocad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/22/2023] [Accepted: 11/20/2023] [Indexed: 12/23/2023] Open
Abstract
OBJECTIVE High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimation of the receiver operating characteristic (ROC) parameters of PAs (eg, sensitivity, specificity). MATERIALS AND METHODS ssROC uses a small labeled dataset to nonparametrically impute missing labels. The imputations are then used for ROC parameter estimation to yield more precise estimates of PA performance relative to classical supervised ROC analysis (supROC) using only labeled data. We evaluated ssROC with synthetic, semi-synthetic, and EHR data from Mass General Brigham (MGB). RESULTS ssROC produced ROC parameter estimates with minimal bias and significantly lower variance than supROC in the simulated and semi-synthetic data. For the 5 PAs from MGB, the estimates from ssROC are 30% to 60% less variable than supROC on average. DISCUSSION ssROC enables precise evaluation of PA performance without demanding large volumes of labeled data. ssROC is also easily implementable in open-source R software. CONCLUSION When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research.
Affiliation(s)
- Jianhui Gao
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Clara-Lea Bonzel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
- Chuan Hong
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States
- Paul Varghese
  - Health Informatics, Verily Life Sciences, Cambridge, MA, United States
- Karim Zakir
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
  - Department of Family and Community Medicine, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
6
Jeffery AD, Fabbri D, Reeves RM, Matheny ME. Use of Noisy Labels as Weak Learners to Identify Incompletely Ascertainable Outcomes: A Feasibility Study with Opioid-Induced Respiratory Depression. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.01.29.24301963. [PMID: 38352435 PMCID: PMC10863026 DOI: 10.1101/2024.01.29.24301963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]
Abstract
Objective Assigning outcome labels to large observational data sets in a timely and accurate manner, particularly when outcomes are rare or not directly ascertainable, remains a significant challenge within biomedical informatics. We examined whether noisy labels generated from subject matter experts' heuristics using heterogeneous data types within a data programming paradigm could provide outcome labels to a large, observational data set. We chose the clinical condition of opioid-induced respiratory depression for our use case because it is rare, has no administrative codes to easily identify the condition, and typically requires at least some unstructured text to ascertain its presence. Materials and Methods Using de-identified electronic health records of 52,861 post-operative encounters, we applied a data programming paradigm (implemented in the Snorkel software) for the development of a machine learning classifier for opioid-induced respiratory depression. Our approach included subject matter experts creating 14 labeling functions that served as noisy labels for developing a probabilistic Generative model. We used probabilistic labels from the Generative model as outcome labels for training a Discriminative model on the source data. We evaluated performance of the Discriminative model with a hold-out test set of 599 independently-reviewed patient records. Results The final Discriminative classification model achieved an accuracy of 0.977, an F1 score of 0.417, a sensitivity of 1.0, and an AUC of 0.988 in the hold-out test set with a prevalence of 0.83% (5/599). Discussion All of the confirmed Cases were identified by the classifier. For rare outcomes, this finding is encouraging because it reduces the number of manual reviews needed by excluding visits/patients with low probabilities.
Conclusion Application of a data programming paradigm with expert-informed labeling functions might have utility for phenotyping clinical phenomena that are not easily ascertainable from highly-structured data.
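The reported metrics pin down the implied confusion matrix. The sketch below searches for the number of false positives that reproduces both the rounded accuracy (0.977) and the rounded F1 score (0.417), given 599 test records, 5 true cases, and a sensitivity of 1.0. The false-positive count it finds is an inference from those figures and is not stated in the abstract.

```python
N_TEST = 599   # hold-out records (from the abstract)
CASES = 5      # confirmed cases (from the abstract)
TP, FN = 5, 0  # sensitivity of 1.0 means every case was found


def metrics(fp):
    """Accuracy and F1 for a given number of false positives."""
    tn = N_TEST - TP - FN - fp
    accuracy = (TP + tn) / N_TEST
    f1 = 2 * TP / (2 * TP + fp + FN)
    return accuracy, f1


# Keep every false-positive count that matches both reported, rounded figures.
matches = [
    fp
    for fp in range(N_TEST - CASES + 1)
    if (round(metrics(fp)[0], 3), round(metrics(fp)[1], 3)) == (0.977, 0.417)
]

prevalence_pct = round(100 * CASES / N_TEST, 2)
precision = TP / (TP + matches[0])
```

Only one count fits, which implies a precision of about 0.26. For a rare outcome, high accuracy and AUC can therefore coexist with most flagged records being false alarms. This is consistent with the abstract's framing of the classifier as a way to reduce manual review, not replace it.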
Affiliation(s)
- Alvin D Jeffery
  - School of Nursing, Vanderbilt University, Department of Biomedical Informatics, Vanderbilt University Medical Center, Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, TN, USA
- Daniel Fabbri
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Ruth M Reeves
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, TN, USA
- Michael E Matheny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, TN, USA
7
Clermont G. The Learning Electronic Health Record. Crit Care Clin 2023; 39:689-700. [PMID: 37704334 DOI: 10.1016/j.ccc.2023.03.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2023]
Abstract
Electronic medical records (EMRs) constitute the electronic version of all medical information included in a patient's paper chart. Electronic health record (EHR) technology has expanded massively in developed countries, and to a lesser extent in underresourced countries, during the last 2 decades. We review the factors leading to this expansion; how the emergence of EHRs is affecting several health-care stakeholders; some of the growing pains associated with EHRs, with a particular emphasis on the delivery of care to the critically ill; and ongoing developments on the path to improving the quality of research, health-care delivery, and stakeholder satisfaction.
Affiliation(s)
- Gilles Clermont
  - VA Pittsburgh Medical Center, 1054 Aliquippa Street, Pittsburgh, PA 15104, USA; Critical Care Medicine, University of Pittsburgh, 200 Lothrop Street, Pittsburgh, PA 15061, USA.
8
Liu P, Wang Z, Liu N, Peres MA. A scoping review of the clinical application of machine learning in data-driven population segmentation analysis. J Am Med Inform Assoc 2023; 30:1573-1582. [PMID: 37369006 PMCID: PMC10436153 DOI: 10.1093/jamia/ocad111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 06/08/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023] Open
Abstract
OBJECTIVE Data-driven population segmentation is commonly used in clinical settings to separate a heterogeneous population into multiple relatively homogeneous groups with similar healthcare features. In recent years, machine learning (ML) based segmentation algorithms have garnered interest for their potential to speed up and improve algorithm development across many phenotypes and healthcare situations. This study evaluates ML-based segmentation with respect to (1) the populations applied, (2) the segmentation details, and (3) the outcome evaluations. MATERIALS AND METHODS MEDLINE, Embase, Web of Science, and Scopus were searched following the PRISMA-ScR criteria. Peer-reviewed studies in the English language that used data-driven population segmentation analysis on structured data from January 2000 to October 2022 were included. RESULTS We identified 6077 articles and included 79 for the final analysis. Data-driven population segmentation analysis was employed in various clinical settings. K-means clustering is the most prevalent unsupervised ML paradigm. The most common settings were healthcare institutions. The most common targeted population was the general population. DISCUSSION Although all the studies did internal validation, only 11 papers (13.9%) did external validation, and 23 papers (29.1%) conducted methods comparison. The existing papers rarely discussed validating the robustness of ML modeling. CONCLUSION Existing ML applications for population segmentation need more evaluation of whether they provide tailored, efficient, integrated healthcare solutions compared to traditional segmentation analysis. Future ML applications in the field should emphasize methods comparison and external validation and investigate approaches to evaluate individual consistency using different methods.
Affiliation(s)
- Pinyan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Ziwen Wang
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Nan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Institute of Data Science, National University of Singapore, Singapore, Singapore
- Marco Aurélio Peres
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - National Dental Research Institute Singapore, National Dental Centre Singapore, Singapore, Singapore
9
Oommen C, Howlett-Prieto Q, Carrithers MD, Hier DB. Inter-rater agreement for the annotation of neurologic signs and symptoms in electronic health records. Front Digit Health 2023; 5:1075771. [PMID: 37383943 PMCID: PMC10294690 DOI: 10.3389/fdgth.2023.1075771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 05/26/2023] [Indexed: 06/30/2023] Open
Abstract
The extraction of patient signs and symptoms recorded as free text in electronic health records is critical for precision medicine. Once extracted, signs and symptoms can be made computable by mapping to signs and symptoms in an ontology. Extracting signs and symptoms from free text is tedious and time-consuming. Prior studies have suggested that inter-rater agreement for clinical concept extraction is low. We have examined inter-rater agreement for annotating neurologic concepts in clinical notes from electronic health records. After training on the annotation process, the annotation tool, and the supporting neuro-ontology, three raters annotated 15 clinical notes in three rounds. Inter-rater agreement between the three annotators was high for text span and category label. A machine annotator based on a convolutional neural network had a high level of agreement with the human annotators but one that was lower than human inter-rater agreement. We conclude that high levels of agreement between human annotators are possible with appropriate training and annotation tools. Furthermore, more training examples combined with improvements in neural networks and natural language processing should make machine annotators capable of high throughput automated clinical concept extraction with high levels of agreement with human annotators.
Affiliation(s)
- Chelsea Oommen
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Quentin Howlett-Prieto
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Michael D. Carrithers
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Daniel B. Hier
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
10
Hossain E, Rana R, Higgins N, Soar J, Barua PD, Pisani AR, Turner K. Natural Language Processing in Electronic Health Records in relation to healthcare decision-making: A systematic review. Comput Biol Med 2023; 155:106649. [PMID: 36805219 DOI: 10.1016/j.compbiomed.2023.106649] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 10/03/2022] [Revised: 01/04/2023] [Accepted: 02/07/2023] [Indexed: 02/12/2023]
Abstract
BACKGROUND Natural Language Processing (NLP) is widely used to extract clinical insights from Electronic Health Records (EHRs). However, the lack of annotated data, automated tools, and other challenges hinder the full utilisation of NLP for EHRs. Various Machine Learning (ML), Deep Learning (DL) and NLP techniques are studied and compared to understand the limitations and opportunities in this space comprehensively. METHODOLOGY After screening 261 articles from 11 databases, we included 127 papers for full-text review covering seven categories of articles: (1) medical note classification, (2) clinical entity recognition, (3) text summarisation, (4) deep learning (DL) and transfer learning architecture, (5) information extraction, (6) medical language translation and (7) other NLP applications. This study follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. RESULT AND DISCUSSION EHR was the most commonly used data type among the selected articles, and the datasets were primarily unstructured. Various ML and DL methods were used, with prediction or classification being the most common application of ML or DL. The most common use cases were: the International Classification of Diseases, Ninth Revision (ICD-9) classification, clinical note analysis, and named entity recognition (NER) for clinical descriptions and research on psychiatric disorders. CONCLUSION We find that the adopted ML models were not adequately assessed. In addition, the data imbalance problem is quite important, and techniques to address this underlying problem are still needed. Future studies should address key limitations in studies, primarily identifying Lupus Nephritis, Suicide Attempts, perinatal self-harm and ICD-9 classification.
Affiliation(s)
- Elias Hossain
  - School of Engineering & Physical Sciences, North South University, Dhaka 1229, Bangladesh.
- Rajib Rana
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield Central QLD 4300, Australia
- Niall Higgins
  - School of Management and Enterprise, University of Southern Queensland, Darling Heights QLD 4350, Australia; School of Nursing, Queensland University of Technology, Kelvin Grove, Brisbane, QLD 4000, Australia; Metro North Mental Health, Herston QLD 4029, Australia
- Jeffrey Soar
  - School of Business, University of Southern Queensland, Springfield Central QLD 4300, Australia
- Prabal Datta Barua
  - School of Business, University of Southern Queensland, Springfield Central QLD 4300, Australia
- Anthony R Pisani
  - Center for the Study and Prevention of Suicide, University of Rochester, Rochester, NY, United States
- Kathryn Turner
  - School of Nursing, Queensland University of Technology, Kelvin Grove, Brisbane, QLD 4000, Australia
11
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
12
|
Alzubi R, Alzoubi H, Katsigiannis S, West D, Ramzan N. Automated Detection of Substance-Use Status and Related Information from Clinical Text. Sensors (Basel) 2022; 22:9609. [PMID: 36559979 PMCID: PMC9783118 DOI: 10.3390/s22249609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Revised: 11/21/2022] [Accepted: 11/25/2022] [Indexed: 06/17/2023]
Abstract
This study aims to develop and evaluate an automated system for extracting information related to patient substance use (smoking, alcohol, and drugs) from unstructured clinical text (medical discharge records). The authors propose a four-stage system for the extraction of the substance-use status and related attributes (type, frequency, amount, quit-time, and period). The first stage uses a keyword search technique to detect sentences related to substance use and to exclude unrelated records. In the second stage, an extension of the NegEx negation detection algorithm is developed and employed for detecting the negated records. The third stage involves identifying the temporal status of the substance use by applying windowing and chunking methodologies. Finally, in the fourth stage, regular expressions, syntactic patterns, and keyword search techniques are used in order to extract the substance-use attributes. The proposed system achieves an F1-score of up to 0.99 for identifying substance-use-related records, 0.98 for detecting the negation status, and 0.94 for identifying temporal status. Moreover, F1-scores of up to 0.98, 0.98, 1.00, 0.92, and 0.98 are achieved for the extraction of the amount, frequency, type, quit-time, and period attributes, respectively. Natural Language Processing (NLP) and rule-based techniques are employed efficiently for extracting substance-use status and attributes, with the proposed system being able to detect substance-use status and attributes over both sentence-level and document-level data. Results show that the proposed system outperforms the compared state-of-the-art substance-use identification system on an unseen dataset, demonstrating its generalisability.
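The first two stages described in this abstract (keyword filtering, then negation detection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the keyword stems, the negation triggers, the five-token look-back window, and the function names are hypothetical, and the check is a simplified NegEx-style rule rather than the authors' extended algorithm.

```python
import re

# Hypothetical lexicons for illustration only (not the authors' lists).
SUBSTANCE_STEMS = ("smok", "tobacco", "alcohol", "drink", "drug")
NEGATION_TRIGGERS = ("denies", "no history of", "never", "does not", "no")


def tokenize(sentence):
    """Lower-case the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def keyword_positions(tokens):
    """Stage 1: indices of tokens that start with a substance-use stem."""
    return [i for i, tok in enumerate(tokens)
            if any(tok.startswith(stem) for stem in SUBSTANCE_STEMS)]


def is_negated(tokens, position, window=5):
    """Stage 2 (simplified): a trigger phrase occurs within `window`
    tokens before the keyword at `position`."""
    preceding = " " + " ".join(tokens[max(0, position - window):position]) + " "
    return any(f" {trigger} " in preceding for trigger in NEGATION_TRIGGERS)


def classify(sentence):
    """Return 'unrelated', 'negated', or 'affirmed' for one sentence."""
    tokens = tokenize(sentence)
    positions = keyword_positions(tokens)
    if not positions:
        return "unrelated"
    if all(is_negated(tokens, p) for p in positions):
        return "negated"
    return "affirmed"


for text in ("Patient denies alcohol use.",
             "He smokes one pack per day.",
             "No fever or chills."):
    print(text, "->", classify(text))
```

A fixed look-back window is the simplest way to scope a negation trigger; the temporal-status and attribute-extraction stages in the abstract would need further rules that this sketch does not attempt.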
Affiliation(s)
- Raid Alzubi
  - Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Hadeel Alzoubi
  - Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
- Stamos Katsigiannis
  - Department of Computer Science, Durham University, Upper Mountjoy Campus, Stockton Road, Durham DH1 3LE, UK
- Daune West
  - School of Computing, Engineering and Physical Sciences, University of the West of Scotland, High St., Paisley PA1 2BE, UK
- Naeem Ramzan
  - School of Computing, Engineering and Physical Sciences, University of the West of Scotland, High St., Paisley PA1 2BE, UK
|
13
|
Rodríguez-Fernández JM, Loeb JA, Hier DB. It's time to change our documentation philosophy: writing better neurology notes without the burnout. Front Digit Health 2022; 4:1063141. [PMID: 36518562 PMCID: PMC9742203 DOI: 10.3389/fdgth.2022.1063141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 11/10/2022] [Indexed: 08/23/2023]
Abstract
Succinct clinical documentation is vital to effective twenty-first-century healthcare. Recent changes in outpatient and inpatient evaluation and management (E/M) guidelines have allowed neurology practices to make changes that reduce the documentation burden and enhance clinical note usability. Despite favorable changes in E/M guidelines, some neurology practices have not moved quickly to change their documentation philosophy. We argue in favor of changes in the design, structure, and implementation of clinical notes that make them shorter yet still information-rich. A move from physician-centric to team documentation can reduce work for physicians. Changing the documentation philosophy from "bigger is better" to "short but sweet" can reduce the documentation burden, streamline the writing and reading of clinical notes, and enhance their utility for medical decision-making, patient education, medical education, and clinical research. We believe that these changes can favorably affect physician well-being without adversely affecting reimbursement.
Affiliation(s)
- Jeffrey A. Loeb
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Daniel B. Hier
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
  - Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, United States
|
14
|
Zou Y, Pesaranghader A, Song Z, Verma A, Buckeridge DL, Li Y. Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model. Sci Rep 2022; 12:17868. [PMID: 36284225 PMCID: PMC9596500 DOI: 10.1038/s41598-022-22956-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Accepted: 10/21/2022] [Indexed: 01/20/2023]
Abstract
The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic way. However, effective extraction of clinical knowledge from EHR data has been hindered by the sparse and noisy information. We present Graph ATtention-Embedded Topic Model (GAT-ETM), an end-to-end taxonomy-knowledge-graph-based multimodal embedded topic model. GAT-ETM distills latent disease topics from EHR data by learning the embedding from a constructed medical knowledge graph. We applied GAT-ETM to a large-scale EHR dataset consisting of over 1 million patients. We evaluated its performance based on topic quality, drug imputation, and disease diagnosis prediction. GAT-ETM demonstrated superior performance over the alternative methods on all tasks. Moreover, GAT-ETM learned clinically meaningful graph-informed embedding of the EHR codes and discovered interpretable and accurate patient representations for patient stratification and drug recommendations. GAT-ETM code is available at https://github.com/li-lab-mcgill/GAT-ETM.
Affiliation(s)
- Yuesong Zou
  - School of Computer Science, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
- Ahmad Pesaranghader
  - School of Computer Science, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
- Ziyang Song
  - School of Computer Science, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
- Aman Verma
  - School of Population and Global Health, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
- David L. Buckeridge
  - School of Population and Global Health, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
- Yue Li
  - School of Computer Science, McGill University, Montreal, Canada (GRID: grid.14709.3b; ISNI: 0000 0004 1936 8649)
|
15
|
Chen W, Abeyaratne A, Gorham G, George P, Karepalli V, Tran D, Brock C, Cass A. Development and validation of algorithms to identify patients with chronic kidney disease and related chronic diseases across the Northern Territory, Australia. BMC Nephrol 2022; 23:320. [PMID: 36151531 PMCID: PMC9502610 DOI: 10.1186/s12882-022-02947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 09/13/2022] [Indexed: 11/15/2022]
Abstract
BACKGROUND Electronic health records can be used for population-wide identification and monitoring of disease. The Territory Kidney Care project developed algorithms to identify individuals with chronic kidney disease (CKD) and several commonly comorbid chronic diseases. This study aims to describe the development and validation of our algorithms for CKD, diabetes, hypertension, and cardiovascular disease. A secondary aim of the study was to describe data completeness of the Territory Kidney Care database. METHODS The Territory Kidney Care database consolidates electronic health records from multiple health services including public hospitals (n = 6) and primary care health services (> 60) across the Northern Territory, Australia. Using the database (n = 48,569) we selected a stratified random sample of patients (n = 288), which included individuals with mild to end-stage CKD. Diagnostic accuracy of the algorithms was tested against blinded manual chart reviews. Data completeness of the database was also described. RESULTS For CKD defined as CKD stage 1 or higher (eGFR of any level with albuminuria or persistent eGFR < 60 ml/min/1.73 m², including renal replacement therapy) overall algorithm sensitivity was 93% (95% CI 89 to 96%) and specificity was 73% (95% CI 64 to 82%). For CKD defined as CKD stage 3a or higher (eGFR < 60 ml/min/1.73 m²) algorithm sensitivity and specificity were 93% and 97% respectively. Among the CKD 1 to 5 staging algorithms, the CKD stage 5 algorithm was most accurate with > 99% sensitivity and specificity. For related comorbidities, algorithm sensitivity and specificity results were 75% and 97% for diabetes; 85% and 88% for hypertension; and 79% and 96% for cardiovascular disease. CONCLUSIONS We developed and validated algorithms to identify CKD and related chronic diseases within electronic health records. Validation results showed that CKD algorithms have a high degree of diagnostic accuracy compared to traditional administrative codes. Our highly accurate algorithms present new opportunities in early kidney disease detection, monitoring, and epidemiological research.
Affiliation(s)
- Winnie Chen
  - Menzies School of Health Research, Charles Darwin University, PO Box 41096, Casuarina, NT 0811 Australia
- Asanga Abeyaratne
  - Menzies School of Health Research, Charles Darwin University, PO Box 41096, Casuarina, NT 0811 Australia
  - Royal Darwin Hospital, Darwin, NT Australia
- Gillian Gorham
  - Menzies School of Health Research, Charles Darwin University, PO Box 41096, Casuarina, NT 0811 Australia
- Dan Tran
  - Alice Springs Hospital, Alice Springs, NT Australia
- Alan Cass
  - Menzies School of Health Research, Charles Darwin University, PO Box 41096, Casuarina, NT 0811 Australia
|
16
|
Kandaswamy S, Orenstein E, Quincer EM, Fernandez A, Gonzalez M, Lu L, Kamaleswaran R, Banerjee I, Jaggi P. Automated Identification of Immunocompromised Status in Critically Ill Children. Methods Inf Med 2022; 61:46-54. [PMID: 35381616 DOI: 10.1055/a-1817-7208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
BACKGROUND Easy identification of immunocompromised hosts (ICH) would allow for stratification of culture results based on host type. METHODS We utilized antimicrobial stewardship (ASP) team notes written during handshake stewardship rounds in the pediatric intensive care unit as the gold standard for host status; clinical notes from the primary team, medication orders during the encounter, problem list and billing diagnoses documented prior to the ASP documentation were extracted to develop models that predict host status. We calculated performance for three models based on diagnoses/medications, with and without natural language processing from clinical notes. The susceptibility of pathogens causing bacteremia to commonly used empiric antibiotic regimens was then stratified by host status. RESULTS We identified 844 antimicrobial episodes from 666 unique patients; 160 (18.9%) were identified as an ICH. We randomly selected 675 initiations (80%) for model training and 169 initiations (20%) for testing. A rule-based model using diagnoses and medications alone yielded sensitivity of 0.87 (0.86-0.88), specificity of 0.93 (0.92-0.93), and positive predictive value (PPV) of 0.74 (0.73-0.75). Adding clinical notes in an XGBoost model led to improved specificity of 0.98 (0.98-0.98) and PPV of 0.9 (0.88-0.91), but decreased sensitivity of 0.77 (0.76-0.79). We identified 77 bacteremia episodes during the study period and created a host-specific visualization. CONCLUSIONS An EHR phenotype based on notes, diagnoses and medications identifies ICH in the PICU with high specificity.
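The three performance measures reported in this abstract follow directly from a confusion matrix of predicted versus gold-standard host status. The sketch below shows the arithmetic on made-up labels; the function name and the toy data are hypothetical and do not come from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, and PPV for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    def ratio(num, den):
        # Undefined when a denominator is empty (e.g. no positive cases).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true positives found
        "specificity": ratio(tn, tn + fp),  # share of true negatives found
        "ppv": ratio(tp, tp + fp),          # share of positive calls that are right
    }


# Toy gold-standard labels and model predictions (illustrative only).
gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(diagnostic_metrics(gold, pred))
```

The trade-off in the abstract (higher specificity and PPV, lower sensitivity after adding notes) corresponds to fewer false positives at the cost of more false negatives in these formulas.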
Affiliation(s)
- Evan Orenstein
  - Children's Healthcare of Atlanta Inc, Atlanta, United States
- Mark Gonzalez
  - Children's Healthcare of Atlanta Inc, Atlanta, United States
- Lydia Lu
  - Children's Healthcare of Atlanta Inc, Atlanta, United States
- Imon Banerjee
  - Biomedical Informatics, Emory University School of Medicine, Atlanta, United States
- Preeti Jaggi
  - Children's Healthcare of Atlanta Inc, Atlanta, United States
  - Emory University School of Medicine, Atlanta, United States
|
17
|
Applying Machine Learning in Distributed Data Networks for Pharmacoepidemiologic and Pharmacovigilance Studies: Opportunities, Challenges, and Considerations. Drug Saf 2022; 45:493-510. [PMID: 35579813 PMCID: PMC9112258 DOI: 10.1007/s40264-022-01158-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/13/2022] [Indexed: 01/28/2023]
Abstract
Increasing availability of electronic health databases capturing real-world experiences with medical products has garnered much interest in their use for pharmacoepidemiologic and pharmacovigilance studies. The traditional practice of having numerous groups use single databases to accomplish similar tasks and address common questions about medical products can be made more efficient through well-coordinated multi-database studies, greatly facilitated through distributed data network (DDN) architectures. Access to larger amounts of electronic health data within DDNs has created a growing interest in using data-adaptive machine learning (ML) techniques that can automatically model complex associations in high-dimensional data with minimal human guidance. However, the siloed storage and diverse nature of the databases in DDNs create unique challenges for using ML. In this paper, we discuss opportunities, challenges, and considerations for applying ML in DDNs for pharmacoepidemiologic and pharmacovigilance studies. We first discuss major types of activities performed by DDNs and how ML may be used. Next, we discuss practical data-related factors influencing how DDNs work in practice. We then combine these discussions and jointly consider how opportunities for ML are affected by practical data-related factors for DDNs, leading to several challenges. We present different approaches for addressing these challenges and highlight efforts that real-world DDNs have taken or are currently taking to help mitigate them. Despite these challenges, the time is ripe for the emerging interest to use ML in DDNs, and the utility of these data-adaptive modeling techniques in pharmacoepidemiologic and pharmacovigilance studies will likely continue to increase in the coming years.
|
18
|
Binkheder S, Asiri MA, Altowayan KW, Alshehri TM, Alzarie MF, Aldekhyyel RN, Almaghlouth IA, Almulhem JA. Real-World Evidence of COVID-19 Patients' Data Quality in the Electronic Health Records. Healthcare (Basel) 2021; 9:1648. [PMID: 34946374 PMCID: PMC8701465 DOI: 10.3390/healthcare9121648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 11/18/2021] [Accepted: 11/25/2021] [Indexed: 11/19/2022]
Abstract
Despite the importance of electronic health records data, less attention has been given to data quality. This study aimed to evaluate the quality of COVID-19 patients' records and their readiness for secondary use. We conducted a retrospective chart review study of all COVID-19 inpatients in an academic healthcare hospital for the year 2020, which were identified using ICD-10 codes and case definition guidelines. COVID-19 signs and symptoms were higher in unstructured clinical notes than in structured coded data. COVID-19 cases were categorized as 218 (66.46%) "confirmed cases", 10 (3.05%) "probable cases", 9 (2.74%) "suspected cases", and 91 (27.74%) "no sufficient evidence". The identification of "probable cases" and "suspected cases" was more challenging than "confirmed cases" where laboratory confirmation was sufficient. The accuracy of the COVID-19 case identification was higher in laboratory tests than in ICD-10 codes. When validating using laboratory results, we found that ICD-10 codes were inaccurately assigned to 238 (72.56%) patients' records. "No sufficient evidence" records might indicate inaccurate and incomplete EHR data. Data quality evaluation should be incorporated to ensure patient safety and data readiness for secondary use research and predictive analytics. We encourage educational and training efforts to motivate healthcare providers regarding the importance of accurate documentation at the point-of-care.
Affiliation(s)
- Samar Binkheder
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Mohammed Ahmed Asiri
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
  - Department of Medicine, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Khaled Waleed Altowayan
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
  - Department of Medicine, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Turki Mohammed Alshehri
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
  - Department of Medicine, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Mashhour Faleh Alzarie
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
  - Department of Medicine, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Raniah N. Aldekhyyel
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Ibrahim A. Almaghlouth
  - Department of Medicine, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
- Jwaher A. Almulhem
  - Medical Informatics and E-Learning Unit, Medical Education Department, College of Medicine, King Saud University, Riyadh 12372, Saudi Arabia
|
19
|
Mahajan A, Deonarine A, Bernal A, Lyons G, Norgeot B. Developing the Total Health Profile, a Generalizable Unified Set of Multimorbidity Risk Scores Derived From Machine Learning for Broad Patient Populations: Retrospective Cohort Study. J Med Internet Res 2021; 23:e32900. [PMID: 34842542 PMCID: PMC8665380 DOI: 10.2196/32900] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/13/2021] [Revised: 09/15/2021] [Accepted: 09/18/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Multimorbidity clinical risk scores allow clinicians to quickly assess their patients' health for decision making, often for recommendation to care management programs. However, these scores are limited by several issues: existing multimorbidity scores (1) are generally limited to one data group (eg, diagnoses, labs) and may be missing vital information, (2) are usually limited to specific demographic groups (eg, age), and (3) do not formally provide any granularity in the form of more nuanced multimorbidity risk scores to direct clinician attention. OBJECTIVE Using diagnosis, lab, prescription, procedure, and demographic data from electronic health records (EHRs), we developed a physiologically diverse and generalizable set of multimorbidity risk scores. METHODS Using EHR data from a nationwide cohort of patients, we developed the total health profile, a set of six integrated risk scores reflecting five distinct organ systems and overall health. We selected the occurrence of an inpatient hospital visitation over a 2-year follow-up window, attributable to specific organ systems, as our risk endpoint. Using a physician-curated set of features, we trained six machine learning models on 794,294 patients to predict the calibrated probability of the aforementioned endpoint, producing risk scores for heart, lung, neuro, kidney, and digestive functions and a sixth score for combined risk. We evaluated the scores using a held-out test cohort of 198,574 patients. RESULTS Study patients closely matched national census averages, with a median age of 41 years, a median income of $66,829, and racial averages by zip code of 73.8% White, 5.9% Asian, and 11.9% African American. All models were well calibrated and demonstrated strong performance with areas under the receiver operating characteristic curve (AUROCs) of 0.83 for the total health score (THS), 0.89 for heart, 0.86 for lung, 0.84 for neuro, 0.90 for kidney, and 0.83 for digestive functions. There was consistent performance of this scoring system across sexes, diverse patient ages, and zip code income levels. Each model learned to generate predictions by focusing on appropriate clinically relevant patient features, such as heart-related hospitalizations and chronic hypertension diagnosis for the heart model. The THS outperformed the other commonly used multimorbidity scoring systems, specifically the Charlson Comorbidity Index (CCI) and the Elixhauser Comorbidity Index (ECI) overall (AUROCs: THS=0.823, CCI=0.735, ECI=0.649) as well as for every age, sex, and income bracket. Performance improvements were most pronounced for middle-aged and lower-income subgroups. Ablation tests using only diagnosis, prescription, social determinants of health, and lab feature groups, while retaining procedure-related features, showed that the combination of feature groups has the best predictive performance, though only marginally better than the diagnosis-only model on at-risk groups. CONCLUSIONS Massive retrospective EHR data sets have made it possible to use machine learning to build practical multimorbidity risk scores that are highly predictive, personalizable, intuitive to explain, and generalizable across diverse patient populations.
|
20
|
Hara K, Kobayashi Y, Tomio J, Ito Y, Svensson T, Ikesu R, Chung UI, Svensson AK. Claims-based algorithms for common chronic conditions were efficiently constructed using machine learning methods. PLoS One 2021; 16:e0254394. [PMID: 34570785 PMCID: PMC8476042 DOI: 10.1371/journal.pone.0254394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2021] [Accepted: 06/25/2021] [Indexed: 11/29/2022]
Abstract
Identification of medical conditions using claims data is generally conducted with algorithms based on subject-matter knowledge. However, these claims-based algorithms (CBAs) are highly dependent on the knowledge level and not necessarily optimized for target conditions. We investigated whether machine learning methods can supplement researchers' knowledge of target conditions in building CBAs. We conducted a retrospective cohort study using a claims database combined with annual health check-up results of employees' health insurance programs for fiscal year 2016-17 in Japan (study population for hypertension, N = 631,289; diabetes, N = 152,368; dyslipidemia, N = 614,434). We constructed CBAs with logistic regression, k-nearest neighbor, support vector machine, penalized logistic regression, tree-based model, and neural network for identifying patients with three common chronic conditions: hypertension, diabetes, and dyslipidemia. We then compared their association measures using a completely held-out test set (25% of the study population). Among the test cohorts of 157,822, 38,092, and 153,608 enrollees for hypertension, diabetes, and dyslipidemia, 25.4%, 8.4%, and 38.7% of them had a diagnosis of the corresponding condition. The areas under the receiver operating characteristic curve (AUCs) of the logistic regression with/without subject-matter knowledge about the target condition were .923/.921 for hypertension, .957/.938 for diabetes, and .739/.747 for dyslipidemia. The logistic lasso, logistic elastic-net, and tree-based methods yielded AUCs comparable to those of the logistic regression with subject-matter knowledge: .923-.931 for hypertension; .958-.966 for diabetes; .747-.773 for dyslipidemia. We found that machine learning methods can attain AUCs comparable to the conventional knowledge-based method in building CBAs.
Affiliation(s)
- Konan Hara
  - Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Yasuki Kobayashi
  - Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Jun Tomio
  - Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Yuki Ito
  - Department of Economics, University of California, Berkeley, Berkeley, California, United States of America
- Thomas Svensson
  - Precision Health, Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
  - School of Health Innovation, Kanagawa University of Human Services, Kawasaki-shi, Kanagawa, Japan
- Ryo Ikesu
  - Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Precision Health, Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Ung-il Chung
  - Precision Health, Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - School of Health Innovation, Kanagawa University of Human Services, Kawasaki-shi, Kanagawa, Japan
  - Clinical Biotechnology, Center for Disease Biology and Integrative Medicine, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Akiko Kishi Svensson
  - Precision Health, Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Department of Clinical Sciences, Lund University, Skåne University Hospital, Malmö, Sweden
  - Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
|
21
|
Chiang C, Zhang P, Donneyong M, Chen Y, Su Y, Li L. Random control selection for conducting high-throughput adverse drug events screening using large-scale longitudinal health data. CPT Pharmacometrics Syst Pharmacol 2021; 10:1032-1042. [PMID: 34313404 PMCID: PMC8452297 DOI: 10.1002/psp4.12673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2021] [Revised: 04/07/2021] [Accepted: 05/22/2021] [Indexed: 11/12/2022]
Abstract
A case-control design-based high-throughput pharmacoinformatics study using large-scale longitudinal health data is able to detect new adverse drug event (ADE) signals. Existing control selection approaches for case-control design included the dynamic/super control selection approach. The dynamic/super control selection approach requires all individuals to be evaluated at all ADE case index dates, as the individuals' eligibilities as control depend on ADE/enrollment history. Thus, using large-scale longitudinal health data, the dynamic/super control selection approach requires extraordinarily high computational time. We proposed a random control selection approach in which ADE case index dates were matched by randomly generated control index dates. The random control selection approach does not depend on ADE/enrollment history. It is able to significantly reduce computational time to prepare case-control data sets, as it requires all individuals to be evaluated only once. We compared the performance metrics of all control selection approaches using two large-scale longitudinal health data sets and a drug-ADE gold standard including 399 drug-ADE pairs. The F-scores for the random control selection approach were between 0.586 and 0.600 compared to between 0.545 and 0.562 for dynamic/super control selection approaches. The random control selection approach was ~1000 times faster than the dynamic/super control selection approach in preparing case-control data sets. With large-scale longitudinal health data, a case-control design-based pharmacoinformatics study using random control selection is able to generate ADE signals comparable to those of the existing control selection approaches. The random control selection approach also significantly reduces computational time to prepare the case-control data sets.
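The computational argument in this abstract (each individual is evaluated once, at one randomly generated index date) can be sketched as follows. The abstract does not state how the random dates are drawn, so the uniform draw over a fixed study period, the function name, and the identifiers below are all assumptions made for illustration, not the authors' procedure.

```python
import random
from datetime import date, timedelta


def random_index_dates(person_ids, study_start, study_end, seed=0):
    """Assign each candidate control one random index date.

    Assumption: dates are drawn uniformly over the study period and do not
    use any ADE or enrollment history, so every person is touched once.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    span_days = (study_end - study_start).days
    return {
        pid: study_start + timedelta(days=rng.randint(0, span_days))
        for pid in person_ids
    }


# Hypothetical candidate controls and study window.
controls = ["p001", "p002", "p003"]
index_dates = random_index_dates(controls, date(2010, 1, 1), date(2010, 12, 31), seed=1)
for pid, d in index_dates.items():
    print(pid, d.isoformat())
```

The work here grows with the number of individuals only, whereas a scheme that re-checks every individual at every case index date grows with the product of individuals and case dates, which is consistent with the large speed-up the abstract reports.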
Affiliation(s)
- Chien-Wei Chiang
  - Department of Biomedical Informatics, Ohio State University, Columbus, Ohio, USA
- Penyue Zhang
  - Department of Biostatistics and Health Data Science, Indiana University, Bloomington, Indiana, USA
- Macarius Donneyong
  - Division of Outcomes and Translational Sciences, College of Pharmacy, Ohio State University, Columbus, Ohio, USA
- You Chen
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Yu Su
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA
- Lang Li
  - Department of Biomedical Informatics, Ohio State University, Columbus, Ohio, USA
|